Ronny Kohavi's publications
My Ph.D. thesis in compressed
postscript or acrobat (PDF): Wrappers for
Performance Enhancement and Oblivious Decision Graphs.
Note: publications below are in reverse chronological order,
i.e.,
the most recent ones are first.
Although I have not been in academia since 1995, my articles have been
cited over 1,000 times according to NEC's ResearchIndex,
and I am in the list of Most cited authors in
Computer Science.
- Ronny Kohavi,
Emetrics talk on Controlled Experiments, Oct 2007.
- Ronny Kohavi and Roger Longbotham,
Online Experiments: Lessons Learned, IEEE Computer, Sept 2007.
- Ronny Kohavi, Randy Henne, and Dan Sommerfield,
Practical Guide to Controlled Experiments on the Web:
Listen to Your Customers not to the HiPPO, KDD 2007.
- Ron Kohavi, ACM Data Mining SIG talk (PPT) (June 14, 2006)
- Ron Kohavi, PKDD/ECML 2005 keynote
Focus the Mining Beacon: Lessons and Challenges from the World of E-Commerce
based on the
Machine Learning journal paper
- Ron Kohavi, talk at Emetrics 2004
on Amazon's Data Mining and Personalization in PDF (June 2004)
- Ron Kohavi, Llew Mason, Rajesh
Parekh, Zijian Zheng, Lessons and
Challenges from Mining Retail E-Commerce Data. PDF.
To appear in Machine Learning
Journal, Special Issue on Data Mining Lessons Learned, 2004.
- Ron Kohavi and Rajesh Parekh, Visualizing RFM Segmentation,
SIAM International Conference on Data Mining, (SDM) 2004.
PDF
- Ron Kohavi and Rajesh Parekh, Ten Supplementary Analyses to
Improve E-commerce Web Sites, WEBKDD 2003. PDF
- Talk at CSLI's Seminar on Computational Learning and Adaptation
on Real-world Insights from Mining Retail
E-Commerce Data, May 22, 2003
- Deriving Key Insights from Blue Martini Business Intelligence:
Summary of key insights from using Business Intelligence against
Debenhams and MEC sites. Approved by Debenhams and MEC and presented at
Webinar on March 10, 2003. PPT
- Blue Martini Software, Bath Unlimited Tests Product Acceptance with Blue
Martini's Online Market Research Capabilities. PDF
- Blue Martini Software, Blue Martini Business Intelligence Delivers Unparalleled
Insight into User Behavior at the Debenhams Web Site. PDF
- Blue Martini Software, Blue Martini Business Intelligence at Work: Charting the
Terrains of MEC Website Data. PDF
- Ron Kohavi, Neal Rothleder, and Evangelos Simoudis,
Emerging Trends in Business Analytics, Communications of the ACM,
Volume 45, Number 8, Aug 2002, pages 45-48. PDF
- Ron Kohavi, Mining Customer Data, Etail CRM Summit, 2002. PDF slides. The talk was cited in ComputerWorld.
- Ron Kohavi and J. Ross Quinlan.
Decision-tree discovery. In Will Klosgen and Jan M. Zytkow,
editors, Handbook
of Data Mining and Knowledge Discovery, chapter 16.1.3, pages
267-276. Oxford University Press, 2002. Postscript, PDF.
- Nir Friedman and Ron Kohavi. Bayesian
classification. In Will Klosgen and Jan M. Zytkow, editors, Handbook
of Data Mining and Knowledge Discovery, chapter 16.1.5, pages
282-288. Oxford University Press, 2002. Postscript, PDF.
- Cliff Brunk and Ron Kohavi. Mineset.
In Will Klosgen and Jan M. Zytkow, editors, Handbook
of Data Mining and Knowledge Discovery, chapter 24.2.4, pages
584-589. Oxford University Press, 2002. Postscript, PDF.
- Ron Kohavi and Dan Sommerfield. MLC++. In Will
Klosgen and Jan M. Zytkow, editors, Handbook
of Data Mining and Knowledge Discovery, chapter 24.1.2, pages
548-553. Oxford University Press, 2002. Postscript, PDF.
- Ron Kohavi, Brij Masand, Myra Spiliopoulou, and
Jaideep Srivastava, Lecture
Notes in Artificial Intelligence (no 2356): WEBKDD 2001 - Mining
Log Data Across All Customer Touch Points, Revised papers, Third
International Workshop, San Francisco, CA, Aug 2001.
Original papers available here.
- Llew Mason, Zijian Zheng, Ron Kohavi, Brian Frasca,
eMetrics Study, Dec 2001. PDF
- Zijian Zheng, Ron Kohavi, and Llew Mason, Real World
Performance of Association Rule Algorithms, KDD 2001, short, long, slides.
One of the datasets used in this paper (BMS-WebView-1) was donated for
research use under similar terms to the KDD Cup 2000 data usage.
See link
at the bottom of http://www.ecn.purdue.edu/KDDCUP/
- Suhail Ansari, Ron Kohavi, Llew Mason, and Zijian Zheng,
Integrating E-Commerce and Data Mining: Architecture and Challenges,
ICDM 2001, PDF
- Ron Kohavi, Invited paper/talk at KDD 2001 industrial
track: Mining E-commerce Data, the Good, the Bad, and the Ugly. PDF paper, slides
- Ron Kohavi, Mining E-commerce Data, the Good,
the Bad, and the Ugly, invited talk at PAKDD 2001, April 16-18,
2001, Hong Kong.
Coverage in the South China Morning
Press, the largest English newspaper in Hong Kong.
- Ron Kohavi
and Foster Provost, Applications of Data Mining to Electronic
Commerce, Data
Mining and Knowledge Discovery journal 5(1/2), 2001. Postscript, PDF
This special issue is also available as a hardcover book from
Kluwer Academic Publishers; ISBN:
0792373030
- Ron Kohavi, Carla Brodley, Brian Frasca, Llew
Mason, and Zijian Zheng, KDD-Cup 2000 Organizers' Report:
Peeling the Onion. SIGKDD Explorations
Volume 2, issue 2, 2000. PDF
Also translated to Japanese in Information
Processing Society of Japan, Vol
42 No. 5
- Myra Spiliopoulou, Jaideep Srivastava, Ron
Kohavi, and Brij Masand, WEBKDD 2000 - Web Mining for E-Commerce, SIGKDD Explorations
Volume 2, issue 2, 2000. PDF
- Ron Kohavi, An Ideal E-Commerce Architecture for
Building Web Sites Supporting Analysis and Personalization. Invited
talk at the Information
Architecture and Web Site Design class, Berkeley, Oct 2000. PDF
- Ron Kohavi, Mining E-Commerce Data: The Good, the Bad,
and the Ugly. Invited talk at SAS's
M2000 Data Mining Technology conference, Oct 2000. PDF and Compressed
postscript
- Ron Kohavi, Data Mining and Visualization. Invited
talk at the National
Academy of Engineering US Frontiers of Engineers, Sept 2000. PDF and Compressed
postscript. Available in book form ISBN:
0-309-07319-7
- Ron Kohavi, Personalization
Panel, KDD conference, Aug 2000. Powerpoint
slides
- Ron Kohavi and Carla Brodley, KDD-Cup 2000: Peeling the
Onion. Talk at KDD
2000. Powerpoint slides
- Suhail Ansari, Ron Kohavi, Llew Mason, and Zijian Zheng,
Integrating E-commerce and Data Mining: Architecture and Challenges, WEBKDD'2000 workshop on Web Mining for
E-Commerce -- Challenges and Opportunities, Aug 2000. PDF and Compressed
postscript
- Ron Kohavi, Mining E-Commerce Data: Challenges and
Stories from the Trenches. DIMACS/IBM Workshop on Data Mining in the
Internet Age, 2 May 2000. HTML
slides and Compressed
postscript
- Ron Kohavi and Mehran Sahami (co-chairs), Jim Bozik,
Dorian Pyle, Rob Gerritsen, Steve Belcher, Ken Ono (panelists).
Integrating Data Mining into Vertical Solutions: Problems and
Challenges, KDD-99 panel ZIP'ed slides and
article in SigKDD
Explorations Volume 1, issue 2
- Ron Kohavi, Embedding Data Mining Technology in
E-Commerce Applications. Invited talk at ICML-99 industrial day. June
99, Slovenia. ZIP'ed powerpoint slides
and HTML slides.
- Eric Bauer and Ron Kohavi, An Empirical Comparison of
Voting Classification Algorithms: Bagging, Boosting, and Variants,
Journal of
Machine Learning Vol 36, Nos. 1/2, July/August 1999, pages 105-139 compressed
postscript (632K) updated 5/22/99 or acrobat (PDF).
- Ron Kohavi and George John, The Wrapper Approach,
book
chapter in Feature
Extraction, Construction and Selection: A Data Mining Perspective,
edited by Huan Liu and Hiroshi Motoda. Postscript
- Ron Kohavi, Improving Accuracy by voting
Classification Algorithms: Boosting, Bagging, and Variants. Invited
talk at Workshop on Computation-Intensive Machine Learning Techniques.
Australia, Sept 1998 compressed
postscript slides
- Ron Kohavi and Foster Provost, Glossary of Terms.
Editorial for the Special Issue on Applications of Machine Learning and
the Knowledge Discovery Process (volume 30, Number 2/3, February/March
1998). Postscript
or HTML
- Ron Kohavi and Foster Provost, On Applied Research in
Machine Learning. Editorial for the Special Issue on Applications of
Machine Learning and the Knowledge Discovery Process (volume 30, Number
2/3, February/March 1998). Postscript
- Ron Kohavi, Crossing the Chasm: From Academic Machine
Learning to Commercial Data Mining. Invited talk at ICML-98. compressed
postscript slides or acrobat (PDF)
slides
- Afshin Goodarzi, Ron Kohavi, Richard Harmon, and Aydin
Senkut, Loan Prepayment Modeling. Appeared in KDD-98 workshop on
Data Mining in
Finance. high-res
compressed
postscript or lower-res
acrobat (PDF)
- Ron Kohavi, Data Mining with MineSet: What Worked,
What Did Not, and What Might. Appeared in KDD-98 workshop on the
Commercial Success of Data Mining. compressed
postscript or acrobat (PDF)
- Ron Kohavi and Dan Sommerfield, Targeting Business
Users with Decision Table Classifiers. Appeared in KDD-98. compressed
postscript or acrobat
(PDF)
- Ron Kohavi, Technique Selection in Machine Learning
Applications. Invited talk at the ICML-98 workshop on the Methodology
of Applying Machine Learning. compressed
postscript slides or acrobat
(PDF)
slides
- Foster Provost, Tom Fawcett, Ron Kohavi, Building the
Case Against Accuracy Estimation for Comparing Induction Algorithms.
ICML-98. compressed
postscript or (low-res)
acrobat
(PDF)
- Jeff Bradford, Clay Kunz, Ron Kohavi, Cliff Brunk, and
Carla Brodley, Pruning Decision Trees with Misclassification
Costs. ECML-98. compressed
postscript and long
version in compressed postscript
- Ron Kohavi, Dan Sommerfield, and James Dougherty,
Data Mining using MLC++, a Machine Learning Library in C++.
International Journal of Artificial Intelligence Tools, Vol. 6, No. 4,
1997, p. 537-566. This is a longer version of the TAI'96 paper that
received the IEEE Tools With Artificial Intelligence Best Paper Award. compressed
postscript (283K) or acrobat (PDF)
- Barry Becker, Ron Kohavi, Dan Sommerfield, Visualizing
the Simple Bayesian Classifier. Appears in the KDD 1997 Workshop on
Issues in the Integration of Data Mining and Data Visualization.
Lecture Notes
in Computer Science by Springer Verlag. compressed
postscript (358K).
- Cliff Brunk, James Kelly, and Ron Kohavi, MineSet:
An Integrated System for Data Mining. Appears in The Third
International Conference on Knowledge Discovery and Data Mining, 1997. compressed
postscript (276K).
- Ron Kohavi and Clayton Kunz, Option Decision
Trees with Majority Votes. Appears in the International Conference on
Machine
Learning 1997. postscript
(308K).
- Ron Kohavi and George John, Wrappers for Feature
Subset Selection (late draft). In Artificial Intelligence journal,
special issue on relevance, Vol. 97, Nos 1-2, pp. 273-324. NEC's ResearchIndex:
one of the top referenced papers in Machine Learning.
Compressed
postscript (305K) uncompressed
postscript (770K)
- Ron Kohavi, Barry Becker, and Dan Sommerfield,
Improving Simple Bayes compressed
postscript. ECML-97 (poster).
- Ron Kohavi, Pat Langley, Yeogirl Yun, The Utility of
Feature Weighting in Nearest-Neighbor Algorithms compressed
postscript. ECML-97 (poster).
- Ron Kohavi, MLC++ Developments: Data Mining using
MLC++. AAAI Fall Symposium on Learning Complex Behaviors in Adaptive
Intelligent Systems, Nov 1996. compressed
postscript
slides.
- Ron Kohavi, Dan Sommerfield, and James Dougherty,
Data Mining using MLC++, a Machine Learning Library in C++. TAI 96. The
paper received the IEEE Tools With Artificial Intelligence Best
Paper Award, 1996. NEC's ResearchIndex:
one of the top referenced papers in Machine Learning.
Compressed
postscript (245K) or uncompressed
postscript (3.3MB)
- Ron Kohavi and Mehran Sahami, Error-Based and
Entropy-Based Discretization of Continuous Features. KDD-96. postscript
(165K)
- Ron Kohavi, Scaling Up the Accuracy of Naive-Bayes
Classifiers: a Decision-Tree Hybrid. KDD-96. compressed
postscript (108K) or slides.
- Ron Kohavi, Book Review: Empirical Methods in
Artificial Intelligence by Paul Cohen. International Journal of Neural
Systems (IJNS), Vol 7, No 2, May 1996, p. 219-221. postscript.
(50K) Note: final formatting in the journal was slightly different
- Ron Kohavi and David Wolpert, Bias Plus Variance
Decomposition for Zero-One Loss Functions. ML96. NEC's ResearchIndex:
one of the top referenced papers in Machine Learning.
PDF or postscript
(170K) or color
slides
for 2/7/96 talk (390K) (18 slides. ghostview doesn't work well on
these.
Use xpsview).
- Jerome Friedman, Ron Kohavi, and Yeogirl Yun, Lazy
Decision Trees. AAAI-96, p. 717-724. postscript (145K)
or slides.
- Ron Kohavi and Dan Sommerfield, Feature Subset
Selection Using the Wrapper Model: Overfitting and Dynamic Search Space
Topology.
KDD-95. postscript
(240K) or slides.
- Ron Kohavi and George John, Automatic Parameter
Selection by Minimizing Estimated Error. ML-95. postscript
(173K).
- James Dougherty, Ron Kohavi, and Mehran Sahami, Supervised
and unsupervised discretization of continuous features. ML-95. postscript (213K)
or slides.
- Ron Kohavi, A Study of Cross-Validation and Bootstrap
for Accuracy Estimation and Model Selection. IJCAI-95. postscript
(305K), PDF, or slides.
- Ron Kohavi and Chia-Hsin Li, Oblivious Decision
Trees, Graphs, and Top-Down Pruning. IJCAI-95 postscript (171K).
- Ron Kohavi, The Power of Decision Tables. In the European
Conference on Machine Learning, 1995. postscript
(168K) or slides
with some new results on discretization.
- Ron Kohavi and Brian Frasca, Useful feature subsets and rough
set reducts. In the International Workshop on Rough Sets and Soft
Computing (RSSC), 1994. postscript
version (161K).
- Ron Kohavi, A third dimension to rough sets. In the International
Workshop on Rough Sets and Soft Computing (RSSC), 1994. postscript
version (163K).
- Ron Kohavi, Feature Subset Selection as Search with
Probabilistic Estimates. In the AAAI Fall Symposium on Relevance,
1994. postscript
version (126K).
- Ron Kohavi, George John, Richard Long, David Manley, and
Karl Pfleger, MLC++: A Machine Learning Library in C++. In Tools
with Artificial Intelligence, 1994. postscript
version (118K).
- Ron Kohavi, Bottom-up induction of oblivious,
read-once decision graphs: Strengths and limitations. In Twelfth
National Conference on Artificial Intelligence, 1994. postscript
version (199K).
- George John, Ron Kohavi, and Karl Pfleger, Irrelevant
features and the subset selection problem. In Machine Learning:
Proceedings
of the Eleventh International Conference, 1994. Morgan Kaufmann. postscript (224K)
or slides.
- Ron Kohavi, Bottom-up induction of oblivious,
read-once decision graphs. In Proceedings of the European
Conference on Machine Learning, 1994. postscript
version (211K).
- Ron Kohavi and Scott Benson, Research note on
decision lists. Journal of Machine Learning. 13(1), 1993
- Ron Kohavi and Yoav Shoham, Applications of datalog
theories in AI. In AAAI-92 Workshop on Tractable Reasoning,
pages 82-87.
My home page
ronnyk@CS.Stanford.edu