Ron Kohavi's publications
My Ph.D. thesis
Wrappers for Performance Enhancement and Oblivious
Decision Graphs (or compressed
postscript).
Publications are in reverse chronological order.
- Alex Deng, Ya Xu, Ron Kohavi, Toby Walker,
Improving the Sensitivity of Online Controlled Experiments
by Utilizing Pre-Experiment Data, WSDM 2013.
- Ron Kohavi, Alex Deng, Brian Frasca, Roger Longbotham, Toby
Walker, Ya Xu,
Trustworthy Online Controlled Experiments: Five Puzzling Outcomes
Explained, KDD
2012. PowerPoint slides,
DOI.
- Ron Kohavi and Roger
Longbotham,
Unexpected Results in Online Controlled Experiments, SIGKDD 2010.
DOI.
- Ron Kohavi, David Messner, Seth Eliot, Juan Lavista Ferres, Randy
Henne, Vignesh Kannappan, and Justin
Wang, Tracking Users' Clicks
and Submits: Tradeoffs between User Experience and Data Loss,
Microsoft White Paper, Oct 2010.
Word.
- Ron Kohavi, Roger Longbotham, and Toby Walker,
Online Experiments: Practical Lessons, IEEE Computer, Vol 43,
issue 9, pp. 82-85, Sept 2010. DOI.
- Ron Kohavi, Thomas Crook, Roger Longbotham, Brian Frasca,
Randy Henne, Juan Lavista Ferres, Tamir Melamed,
Online Experimentation at Microsoft, Microsoft ThinkWeek paper
recognized as top 30, 2009. (Modified version of workshop
paper below.)
- Ron Kohavi, Thomas Crook, Roger Longbotham, Online Experimentation at Microsoft,
Third workshop on Data Mining Case
Studies and Practice Prize, 2009. The paper won 3rd place.
- Ron Kohavi, Roger Longbotham, Dan Sommerfield, and Randal
M. Henne,
Controlled Experiments on the Web: Survey and Practical Guide,
Data Mining and Knowledge Discovery journal, Vol 18(1), p. 140-181,
2009. DOI.
- Thomas Crook, Brian Frasca, Ron Kohavi, and Roger Longbotham,
Seven Pitfalls to Avoid when Running Controlled Experiments on the Web,
KDD '09: Proceedings of the 15th ACM SIGKDD international conference on
Knowledge discovery and data mining, p. 1105-1114, 2009.
DOI.
- Ron Kohavi and Roger Longbotham,
Online Experiments: Lessons Learned, IEEE Computer, Vol 40,
issue 9, p. 103-105, Sept 2007. DOI.
- Ron Kohavi, Randy Henne, and Dan Sommerfield,
Practical Guide to Controlled Experiments on the Web:
Listen to Your Customers not to the HiPPO,
KDD '07: Proceedings of the 13th ACM SIGKDD international conference on
Knowledge discovery and data mining, p. 959-967, 2007.
DOI.
- Ron Kohavi, Llew Mason, Rajesh
Parekh, Zijian Zheng, Lessons and
Challenges from Mining Retail E-Commerce Data, Machine Learning
journal, Special Issue on Data Mining Lessons Learned, Vol 57, issue
1, p. 83-113, 2004.
DOI.
- Ron Kohavi and Rajesh Parekh,
Visualizing RFM Segmentation,
Fourth SIAM International Conference on Data Mining
(SDM), 2004.
- Ron Kohavi and Rajesh Parekh,
Ten Supplementary Analyses to
Improve E-commerce Web Sites
(alt PDF),
WEBKDD'2003.
- Blue Martini Case Studies:
- Ron Kohavi, Neal Rothleder, and Evangelos Simoudis,
Emerging Trends in Business Analytics, Communications of the ACM,
Evolving data mining into solutions for insights,
Volume 45, Number 8, Aug 2002, pages 45-48.
DOI.
- Ron Kohavi and J. Ross Quinlan,
Decision-tree discovery, in Will Klosgen and Jan M. Zytkow,
editors,
Handbook of Data Mining and Knowledge Discovery,
chapter 16.1.3, pages 267-276. Oxford University Press, 2002.
- Nir Friedman and Ron Kohavi,
Bayesian classification, in Will Klosgen and Jan M. Zytkow, editors,
Handbook of Data Mining and Knowledge Discovery,
chapter 16.1.5, pages 282-288. Oxford University Press, 2002.
- Cliff Brunk and Ron Kohavi, Mineset, in Will Klosgen and Jan M. Zytkow, editors, Handbook of Data Mining and Knowledge Discovery, chapter
24.2.4, pages 584-589. Oxford University Press, 2002.
- Ron Kohavi and Dan Sommerfield,
MLC++, in Will Klosgen and Jan M. Zytkow,
editors, Handbook of Data Mining and Knowledge Discovery, chapter
24.1.2, pages 548-553. Oxford University Press, 2002.
- Ron Kohavi, Brij Masand, Myra Spiliopoulou, and
Jaideep Srivastava, WEBKDD
2001 - Mining Web Log Data Across All Customers Touch Points,
Third International Workshop, San Francisco, CA, Aug 2001.
Original papers available here.
- Llew Mason, Zijian Zheng, Ron Kohavi, Brian Frasca,
eMetrics Study, Dec 2001.
This was an extensive study to generate a set of eMetrics using Blue
Martini customers' transactional, customer, and clickstream
data.
- Zijian Zheng, Ron Kohavi, and Llew Mason,
Real World
Performance of Association Rule Algorithms, KDD 2001:
Proceedings of the seventh ACM SIGKDD international conference on
Knowledge discovery and data mining, pages 401-406, 2001,
long version, and slides.
DOI.
- Suhail Ansari, Ron Kohavi, Llew Mason, and Zijian Zheng,
Integrating E-Commerce and Data Mining: Architecture and Challenges,
IEEE International Conference on Data Mining (ICDM'01), p. 27, 2001.
DOI.
- Ron Kohavi and Foster Provost,
Applications of Data Mining to Electronic Commerce, Data
Mining and Knowledge Discovery journal 5(1/2), p. 5-10, 2001.
DOI.
This special issue is also available as a hardcover book:
Applications of Data Mining to Electronic Commerce.
- Ron Kohavi, Carla Brodley, Brian Frasca, Llew
Mason, and Zijian Zheng,
KDD-Cup 2000 Organizers' Report: Peeling the Onion,
SIGKDD Explorations Volume 2, issue 2, p. 86-93, 2000.
Also translated to Japanese in Information
Processing Society of Japan, Vol 42 No. 5.
DOI.
- Myra Spiliopoulou, Jaideep Srivastava, Ron
Kohavi, and Brij Masand,
Web Mining,
Data Mining and Knowledge Discovery journal vol 6, p 5-8, 2002.
DOI.
Initially appeared as
WEBKDD 2000 - Web Mining for E-Commerce
in SIGKDD Explorations Volume 2, issue 2, 2000.
- Suhail Ansari, Ron Kohavi, Llew Mason, and Zijian Zheng,
Integrating E-commerce and Data Mining:
Architecture and Challenges, WEBKDD'2000 workshop on Web Mining for
E-Commerce - Challenges and Opportunities, Aug 2000.
arXiv.
- Ron Kohavi and Mehran Sahami (co-chairs), Jim Bozik,
Dorian Pyle, Rob Gerritsen, Steve Belcher, Ken Ono (panelists).
Integrating Data Mining
into Vertical Solutions: Problems and Challenges (slides),
KDD-99 panel. The article
KDD-99 Panel Report: Data Mining into Vertical Solutions
appeared in SIGKDD Explorations Volume 1, issue 2.
- Eric Bauer and Ron Kohavi,
An Empirical Comparison of
Voting Classification Algorithms: Bagging, Boosting, and Variants,
Machine Learning journal, Vol 36, Nos. 1/2, pages 105-139, 1999.
DOI.
The paper is cited over 400 times according to
CiteSeerX and over 1,200 times in Google Scholar.
- Ron Kohavi and George John,
The Wrapper Approach, book
chapter in Feature
Extraction, Construction and Selection : A Data Mining Perspective,
edited by Huan Liu and Hiroshi Motoda.
- Ron Kohavi, Improving Accuracy by voting
Classification Algorithms: Boosting, Bagging, and Variants. Invited
talk at Workshop on Computation-Intensive Machine Learning Techniques.
Australia, Sept 1998. compressed
postscript slides
- Ron Kohavi and Foster Provost, Glossary of Terms.
Editorial for the Special Issue on Applications of Machine Learning and
the Knowledge Discovery Process (volume 30, Number 2/3, February/March
1998). Postscript
or HTML
- Ron Kohavi and Foster Provost, On Applied Research in
Machine Learning. Editorial for the Special Issue on Applications of
Machine Learning and the Knowledge Discovery Process (volume 30, Number
2/3, February/March 1998). Postscript
- Ron Kohavi, Crossing the Chasm: From Academic Machine
Learning to Commercial Data Mining. Invited talk at ICML-98. compressed
postscript slides or acrobat (PDF)
slides
- Afshin Goodarzi, Ron Kohavi, Richard Harmon, and Aydin
Senkut, Loan Prepayment Modeling. Appeared in KDD-98 workshop on
Data Mining in
Finance. high-res
compressed
postscript or lower-res
acrobat (PDF)
- Ron Kohavi, Data Mining with MineSet: What Worked,
What Did Not, and What Might. Appeared in KDD-98 workshop on the
Commercial Success of Data Mining. compressed
postscript or acrobat (PDF)
- Ron Kohavi and Dan Sommerfield, Targeting Business
Users with Decision Table Classifiers. Appeared in KDD-98. compressed
postscript or acrobat
(PDF)
- Ron Kohavi, Technique Selection in Machine Learning
Applications. Invited talk at the ICML-98 workshop on the Methodology
of Applying Machine Learning. compressed
postscript slides or acrobat
(PDF)
slides
- Foster Provost, Tom Fawcett, Ron Kohavi, Building the
Case Against Accuracy Estimation for Comparing Induction Algorithms.
ICML-98. compressed
postscript or (low-res)
acrobat
(PDF)
- Jeff Bradford, Clay Kunz, Ron Kohavi, Cliff Brunk, and
Carla Brodley, Pruning Decision Trees with Misclassification
Costs. ECML-98. compressed
postscript and long
version in compressed postscript
- Ron Kohavi, Dan Sommerfield, and James Dougherty,
Data Mining using MLC++, a Machine Learning Library in C++.
International Journal of Artificial Intelligence Tools, Vol. 6, No. 4,
1997, p. 537-566. This is a longer version of the TAI'96 paper that
received the IEEE Tools With Artificial Intelligence Best Paper Award. compressed
postscript (283K) or acrobat (PDF)
- Barry Becker, Ron Kohavi, Dan Sommerfield, Visualizing
the Simple Bayesian Classifier. Appears in the KDD 1997 Workshop on
Issues in the Integration of Data Mining and Data Visualization.
Lecture Notes
in Computer Science by Springer Verlag. compressed
postscript (358K).
- Cliff Brunk, James Kelly, and Ron Kohavi, MineSet:
An Integrated System for Data Mining. Appears in the Third
International Conference on Knowledge Discovery and Data Mining, 1997. compressed
postscript (276K).
- Ron Kohavi and Clayton Kunz, Option Decision
Trees with Majority Votes. Appears in the International Conference on
Machine
Learning 1997. postscript
(308K).
- Ron Kohavi and George John, Wrappers for Feature
Subset Selection. In Artificial Intelligence journal,
special issue on relevance, Vol. 97, Nos 1-2, pp. 273-324. Listed by NEC's ResearchIndex
as one of the top referenced papers in Machine Learning.
PDF, postscript
- Ron Kohavi, Barry Becker, and Dan Sommerfield,
Improving Simple Bayes compressed
postscript. ECML-97 (poster).
- Ron Kohavi, Pat Langley, Yeogirl Yun, The Utility of
Feature Weighting in Nearest-Neighbor Algorithms compressed
postscript. ECML-97 (poster).
- Ron Kohavi, MLC++ Developments: Data Mining using
MLC++. AAAI Fall Symposium on Learning Complex Behaviors in Adaptive
Intelligent Systems, Nov 1996. compressed
postscript
slides.
- Ron Kohavi, Dan Sommerfield, and James Dougherty,
Data Mining using MLC++, a Machine Learning Library in C++. TAI 96. The
paper received the IEEE Tools With Artificial Intelligence Best
Paper Award, 1996. Listed by NEC's ResearchIndex
as one of the top referenced papers in Machine Learning.
Compressed
postscript (245K) or uncompressed
postscript (3.3MB)
- Ron Kohavi and Mehran Sahami, Error-Based and
Entropy-Based Discretization of Continuous Features. KDD-96. postscript
(165K)
- Ron Kohavi, Scaling Up the Accuracy of Naive-Bayes
Classifiers: a Decision-Tree Hybrid. KDD-96. compressed
postscript (108K) or slides.
- Ron Kohavi, Book Review: Empirical Methods in
Artificial Intelligence by Paul Cohen. International Journal of Neural
Systems (IJNS), Vol 7, No 2, May 1996, p. 219-221. postscript
(50K). Note: final formatting in the journal was slightly different.
- Ron Kohavi and David Wolpert, Bias Plus Variance
Decomposition for Zero-One Loss Functions. ML96. Listed by NEC's ResearchIndex
as one of the top referenced papers in Machine Learning.
PDF or postscript
(170K) or color
slides
for 2/7/96 talk (390K, 18 slides). Ghostview doesn't work well on
these; use xpsview.
- Jerome Friedman, Ron Kohavi, and Yeogirl Yun, Lazy
Decision Trees. AAAI-96, p. 717-724. postscript (145K)
or slides.
- Ron Kohavi and Dan Sommerfield, Feature Subset
Selection Using the Wrapper Model: Overfitting and Dynamic Search Space
Topology.
KDD-95. postscript
(240K) or slides.
- Ron Kohavi and George John, Automatic Parameter
Selection by Minimizing Estimated Error. ML-95. postscript
(173K).
- James Dougherty, Ron Kohavi, and Mehran Sahami, Supervised
and unsupervised discretization of continuous features. ML-95. postscript (213K)
or slides.
- Ron Kohavi, A Study of Cross-Validation and Bootstrap
for Accuracy Estimation and Model Selection. IJCAI-95. postscript
(305K), PDF, or slides.
- Ron Kohavi and Chia-Hsin Li, Oblivious Decision
Trees, Graphs, and Top-Down Pruning. IJCAI-95 postscript (171K).
- Ron Kohavi, The Power of Decision Tables. In the European
Conference on Machine Learning, 1995. postscript
(168K) or slides
with some new results on discretization.
- Ron Kohavi and Brian Frasca, Useful feature subsets and rough
set reducts. In the International Workshop on Rough Sets and Soft
Computing (RSSC), 1994. postscript
version (161K).
- Ron Kohavi, A third dimension to rough sets. In the International
Workshop on Rough Sets and Soft Computing (RSSC), 1994. postscript
version (163K).
- Ron Kohavi, Feature Subset Selection as Search with
Probabilistic Estimates. In the AAAI Fall Symposium on Relevance,
1994. postscript
version (126K).
- Ron Kohavi, George John, Richard Long, David Manley, and
Karl Pfleger, MLC++: A Machine Learning Library in C++. In Tools
with Artificial Intelligence, 1994. postscript
version (118K).
- Ron Kohavi, Bottom-up induction of oblivious,
read-once decision graphs : Strengths and limitations. In Twelfth
National Conference on Artificial Intelligence, 1994. postscript
version (199K).
- George John, Ron Kohavi, and Karl Pfleger, Irrelevant
features and the subset selection problem. In Machine Learning:
Proceedings
of the Eleventh International Conference, 1994. Morgan Kaufmann. postscript (224K)
or slides.
- Ron Kohavi, Bottom-up induction of oblivious,
read-once decision graphs. In Proceedings of the European
Conference on Machine Learning, 1994. postscript
version (211K).
- Ron Kohavi and Scott Benson, Research note on
decision lists. Journal of Machine Learning, 13(1), 1993.
- Ron Kohavi and Yoav Shoham, Applications of datalog
theories in AI. In AAAI-92 Workshop on Tractable Reasoning,
p. 82-87.
ronnyk@ live dot com