Ron Kohavi, PhD

     

About Me

I am the General Manager for the Experimentation Platform at Microsoft.
If you know of people who are interested in building high-impact, scalable software, with strong analytics, please ask them to send their resumes to rkjobs at microsoft dot you know what. A partial list of job descriptions is available here.

My LinkedIn profile (View Full Profile).

Professional activities:

The papers "Irrelevant Features and the Subset Selection Problem" and "Wrappers for Feature Subset Selection" are in the top-10 most cited papers in Artificial Intelligence Expert Systems and Machine Learning according to NEC's ResearchIndex. "Bias Plus Variance Decomposition for Zero-One Loss Functions" and "Data Mining using MLC++, a Machine Learning Library in C++" are in the top-100 most cited papers in Machine Learning.

Full list of publications

My resume in HTML and in Word


Bio

Ronny Kohavi is the GM for Microsoft's Experimentation Platform, a team whose mission is to build a platform that will accelerate software innovation through trustworthy experimentation. Controlled experiments, A/B tests, and parallel flights are synonyms for a methodology of reliably evaluating ideas through randomized assignment of users to a Control group or to one of several Treatment groups. The methodology is practically the only scientific method we know to establish causal relationships between ideas and metrics of interest. More information is available at Experimentation Platform.

Prior to joining Microsoft in June 2005, Ronny was the director of data mining and personalization at Amazon.com, where he was responsible for personalization, automation, search engine marketing (SEM), consumer behavior / data mining, site experimentation, and automated e-mail. His teams introduced several features estimated to be worth several hundred million dollars in incremental revenue. Prior to Amazon, Ronny was the Vice President of Business Intelligence at Blue Martini Software, where he led the engineering group responsible for the data collection, analysis, visualization, reporting, and campaign management modules in Blue Martini's applications. Prior to joining Blue Martini, Kohavi managed the MineSet product, Silicon Graphics' award-winning product for data mining and visualization. Ronny joined Silicon Graphics after getting a Ph.D. in Machine Learning from Stanford University, where he led the MLC++ project, the Machine Learning library in C++ used in MineSet and at Blue Martini Software. Kohavi received his BA from the Technion, Israel.

He was the General Chair for KDD 2004. He co-chaired KDD 99's industrial track with Jim Gray and the KDD Cup 2000 with Carla Brodley. He was an invited speaker at the National Academy of Engineering in 2000, a keynote speaker at PAKDD 2001, and an invited speaker at KDD 2001's industrial track. He co-chaired WEBKDD 2000, WEBKDD 2001, and WEBKDD 2003, and co-taught with Jon Becher a tutorial on e-commerce and clickstream mining at the SIAM Data Mining conference in 2001. He co-edited with Foster Provost the special issue of the journal Machine Learning on Applications of Machine Learning and the special issue of the Data Mining and Knowledge Discovery journal on Applications of Data Mining to Electronic Commerce, now available as a book. He was a member of the editorial board for the Data Mining and Knowledge Discovery journal from its inception and served as a member of the editorial board for the journal of Machine Learning from 1997 to 1999.


MineSet Visualizations

MineSet was built at SGI and is now distributed by Purple Insight. It combined SGI visualizations and backend algorithms from MLC++. Here are some visualizations (QuickTime).
  1. Decision Tree. This is a great example where decision trees with many nodes can be visualized effectively. The visualizer was based on SGI's file system navigator, shown in the Jurassic Park movie.
  2. Evidence Visualizer (aka Naive Bayes). This is an example of how to make conditional probabilities easier to understand. In log space the probabilities add up, so the concept of "evidence" is easy to understand.
  3. Decision Table. The classifier is simple yet effective and is easy to visualize.
  4. Splat Visualizer. Allows visualizing large amounts of data by creating Gaussian splats.
  5. Scatter Visualizer. Visualizes scatterplots with sliders for a total of 7 dimensions: X, Y, Z, color, size, and two sliders.
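
The additive "evidence" idea behind the Evidence Visualizer item can be illustrated with a small sketch. This is not MineSet or MLC++ code; the function names and all probabilities below are made up for illustration. Under the Naive Bayes independence assumption, the log of the product of the class prior and the conditional probabilities equals the sum of their logs, so each attribute value contributes one additive term.

```python
import math

def log_evidence(prior, conditionals):
    """Return the additive log-space terms for one class.

    prior: P(class); conditionals: list of P(attribute value | class).
    All inputs are hypothetical and must be greater than zero.
    """
    return [math.log(prior)] + [math.log(p) for p in conditionals]

def posterior(scores):
    """Normalize per-class total log scores into probabilities."""
    top = max(scores.values())  # subtract the max for numerical stability
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exp.values())
    return {c: v / total for c, v in exp.items()}

# Hypothetical two-class model with two observed attribute values.
model = {
    "yes": (0.5, [0.8, 0.5]),
    "no": (0.5, [0.2, 0.5]),
}
terms = {c: log_evidence(prior, conds) for c, (prior, conds) in model.items()}
scores = {c: sum(t) for c, t in terms.items()}
probs = posterior(scores)
```

Summing each class's terms and normalizing recovers the usual Naive Bayes posterior, while the individual terms are the per-attribute contributions that can be read as evidence for or against a class.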

Some fun pictures

ronnyk@cs.stanford.edu