1
|
- Daphne Koller
- Stanford University
|
2
|
- My collaborators on some of this work:
- Nir Friedman, Lise Getoor, Avi Pfeffer, Eran Segal, Ben Taskar
- The many people whose work influenced mine
- My friends and colleagues, and especially the DAGS group, for their help and support
- ONR, DARPA, NSF, Sloan
- Stuart Russell, Joe Halpern, IJCAI Trustees
- My family for putting up with me while I prepared it
|
3
|
- Building Intelligence: 2001
- Representation
- Reasoning
- Learning
- Where Next
|
4
|
- Scheduling & planning
- Navigating in uncertain environment
- Chess
- Diagnosis & maintenance
- Speech recognition
- Parsing natural language
- Recognizing faces
- Many more …
|
7
|
- Building Intelligence: 2001
- Representation
- Reasoning
- Learning
- Where Next
|
9
|
- Possible worlds are indivisible “atoms”
- e.g., states in a search algorithm
- Algorithms must treat worlds as black boxes
- e.g., access only via goal test, successor function
- Enumeration is all you can do!
|
10
|
- More transparent representation:
- World = assignment of values to attributes / truth values to propositional symbols
- Knowledge base relates these propositions
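As a concrete illustration (symbols and rule invented here, not from the talk), a world is just one truth assignment, and the knowledge base is a set of constraints over the propositions:

```python
# A "world": one truth assignment to all propositional symbols.
world = {"smart": True, "works_hard": False, "gets_A": False}

# Tiny knowledge base relating the propositions:
# (smart AND works_hard) -> gets_A
kb = [lambda w: not (w["smart"] and w["works_hard"]) or w["gets_A"]]

print(all(rule(world) for rule in kb))  # True: this world satisfies the KB
```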
|
11
|
- Worlds are relational interpretations:
- Objects in the domain
- Properties of these objects
- Relations (links) between objects
- Relational languages allow universals:
- General patterns that apply to all objects
|
13
|
- All universals are false
- Smart students don’t always get A’s…
- True universals are rarely useful
- Smart students get either A, B, C, D, or F
|
15
|
- Probabilistic semantics:
- A set of possible worlds
- Each world associated with a probability
|
17
|
- Propositional representation:
- Avoid enumeration of worlds!
- Represent distribution compactly
- Bayesian networks use locality
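A minimal sketch of the locality idea, with invented names and numbers: each variable stores only a conditional table over its parents, and the probability of a whole world is the product of the local pieces:

```python
# Local models only: P(Difficult), P(Smart), P(A | Difficult, Smart).
# All numbers are illustrative.
p_difficult = 0.4
p_smart = 0.3
p_A = {(True, True): 0.5, (True, False): 0.1,     # keyed by (difficult, smart)
       (False, True): 0.9, (False, False): 0.5}

def joint(difficult, smart, gets_a):
    """P(d, s, a) = P(d) * P(s) * P(a | d, s): a product of local factors."""
    pd = p_difficult if difficult else 1 - p_difficult
    ps = p_smart if smart else 1 - p_smart
    pa = p_A[(difficult, smart)]
    return pd * ps * (pa if gets_a else 1 - pa)

# The 8 worlds sum to 1 even though no global 8-entry table was stored;
# with n variables the global table needs ~2^n entries, local tables stay small.
print(sum(joint(d, s, a) for d in (0, 1) for s in (0, 1) for a in (0, 1)))
```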
|
20
|
- Semantics:
- Probability distribution over possible worlds
- Worlds = first-order interpretations
- Representation (syntax):
- Probabilistic relational models (PRMs)
|
22
|
- Specifies types of objects in domain, attributes of each type of object, & types of links between objects
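A hedged reading of the schema in code (class and attribute names simply mirror the university example):

```python
from dataclasses import dataclass

@dataclass
class Professor:
    teaching_ability: str        # attribute of every Professor object

@dataclass
class Course:
    instructor: Professor        # typed link: Course -> Professor
    difficulty: str

@dataclass
class Student:
    intelligence: str

@dataclass
class Registration:              # link type connecting Student and Course
    student: Student
    course: Course
    grade: str
```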
|
24
|
- Many possible worlds for a given university
- All possible assignments of all attributes of all objects
- Infinitely many potential universities
- Each associated with a very different set of worlds
|
25
|
- Universals: Probabilistic patterns hold for all objects in class
- Locality: Represent direct probabilistic dependencies
- Links give us potential interactions!
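Continuing the schema sketch above, a probabilistic universal is one local model shared by every object of a class, with parents reached through links; the numbers here are invented:

```python
import random

# One shared CPD -- the universal pattern -- used for *every* Registration:
# P(grade = "A" | course.difficulty, student.intelligence)
P_A = {("hard", "smart"): 0.5, ("hard", "average"): 0.1,
       ("easy", "smart"): 0.9, ("easy", "average"): 0.5}

def sample_grade(reg):
    """Same local model for all registrations; inputs arrive via links."""
    p = P_A[(reg.course.difficulty, reg.student.intelligence)]
    return "A" if random.random() < p else "below A"
```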
|
27
|
- PRM can be applied to any university
- Any set of students, courses, registrations, faculty, …
|
28
|
- Objects in the resulting model are not independent
- Links between objects induce correlations between them
- The result is a “web of influence” that spans many objects of different types
|
32
|
- Objects & links now first-class citizens in the model
- ⇒ Can introduce probabilistic model over them
- Different worlds can have different objects & links!
- Example reasoning pattern:
- CS221 is easy — everyone gets an A
- No, it’s not easy — it’s a graduate class, and only smart students take it
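A minimal sketch of uncertainty over structure itself (the registration probability is invented): each draw produces a different set of links, i.e., a different world:

```python
import random

def sample_registrations(students, courses, p_take=0.2):
    """Link existence is itself random: worlds differ in their link sets."""
    return [(s, c) for s in students for c in courses
            if random.random() < p_take]
```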
|
34
|
- Also need to represent:
- Time
- Events
- Actions
- Space
- Mental states of agents
- …
- There is much work on representing these concepts
- Need to integrate these with probabilistic language
- New questions are bound to arise…
|
35
|
- Building Intelligence: 2001
- Representation
- Reasoning
- Learning
- Where Next
|
37
|
- The Complexity Monster: First Blood [Kautz, 1989]
- The Complexity Monster Strikes Back [Russell, 1995]
- The Return of the Complexity Monster [Koller, 2001]
|
38
|
- Corollary: Use less expressive representations
- Problem: “Poly(size)” is great, but not if size is exponential!
- Expressive language can capture structure, which can be exploited for faster inference
|
42
|
- Our language exploits locality:
- All information is presented in terms of local influences
- Locality also allows divide & conquer inference
- In Bayesian networks
- In constraint satisfaction
- In logical reasoning
- Divide & conquer inference extends to hierarchical structures, if there are only a few long-range interactions
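A minimal divide & conquer sketch on a chain A → B → C (numbers invented): pushing sums inward answers the query at a cost linear in the chain length rather than exponential in the number of variables:

```python
# P(c) = sum_b P(c|b) * [ sum_a P(b|a) * P(a) ] -- sums pushed inward.
P_A = {0: 0.7, 1: 0.3}
P_B_given_A = {(0, 0): 0.8, (1, 0): 0.2, (0, 1): 0.3, (1, 1): 0.7}  # (b, a)
P_C_given_B = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.4, (1, 1): 0.6}  # (c, b)

# Local step 1: eliminate A, leaving a factor over B alone.
P_B = {b: sum(P_B_given_A[(b, a)] * P_A[a] for a in (0, 1)) for b in (0, 1)}
# Local step 2: eliminate B.
P_C = {c: sum(P_C_given_B[(c, b)] * P_B[b] for b in (0, 1)) for c in (0, 1)}
print(P_C)  # {0: 0.725, 1: 0.275}
```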
|
43
|
- Luckily, much of the world is local & hierarchical
- Departments in a university
- Geographical proximity
- Interactions between objects are often limited
|
45
|
- Problem: At inference time, many algorithms
- convert objects to propositions
- recreate the repetition we wanted to avoid
- Robinson [1965]: Lifted theorem proving
- Do proof using variables x
- Only make distinctions about x that matter to proof
|
46
|
- Question: Can we use lifting in probabilistic case?
- Do probabilistic inference about “generic” objects x
- Only make distinctions that affect relevant probabilities
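A toy illustration of the hoped-for payoff (prior invented): when n students are interchangeable, one computation about a generic student answers a query that a grounded model would spread over exponentially many joint assignments:

```python
p_smart = 0.3   # prior for a generic, exchangeable student (made up)

def p_any_smart(n: int) -> float:
    """P(at least one of n students is smart), computed once at the
    generic level instead of enumerating 2**n grounded worlds."""
    return 1.0 - (1.0 - p_smart) ** n

print(p_any_smart(100))
```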
|
50
|
- Question 1: What does this mean?
- In deterministic setting, unclear
- Probabilities provide formal framework:
- Weak influence = small effect on probabilities
- Question 2: Does it really help?
- If the world is stochastic, influence decays
- Over “distance” — long chains of influence
- Over “time” — as system evolves
- Over “quantity” — as many influences combine
- Recently, many successful algorithms that exploit this idea
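A small sketch of decay over “time” (transition numbers invented): a two-state Markov chain forgets its starting state geometrically fast, so distant history has vanishing influence:

```python
def step(p: float, stay: float = 0.9) -> float:
    """One update of P(state = 1) for a chain that stays put w.p. 0.9."""
    return p * stay + (1.0 - p) * (1.0 - stay)

p_from_one, p_from_zero = 1.0, 0.0   # two runs, opposite starting states
for t in range(30):
    p_from_one, p_from_zero = step(p_from_one), step(p_from_zero)

# The two runs agree to ~0.8**30: the start state's influence has decayed.
print(abs(p_from_one - p_from_zero))
```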
|
52
|
- Building Intelligence: 2001
- Representation
- Reasoning
- Learning
- Where Next
|
53
|
- Many possible knowledge bases:
- Different dependency structures
- Different parameters
- Which knowledge base is “good”?
- Probabilistic performance metric
- How do we find a good one?
- Many learning algorithms have been developed
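As a hedged sketch of the standard metric, score a model by the (log-)probability it assigns to observed data, and fit parameters by counting; the data here are invented:

```python
import math
from collections import Counter

data = ["A", "B", "A", "A", "C", "B", "A"]                      # observed grades
params = {g: c / len(data) for g, c in Counter(data).items()}   # ML estimates

log_likelihood = sum(math.log(params[g]) for g in data)         # the "score"
print(params, log_likelihood)
```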
|
61
|
- Use reasoning to bootstrap learning
- Learn even when we never find out everything
- Substantial improvement in learned models
- “Lifelong” learning, from any data obtained
- Learn as a natural byproduct of reasoning
- Discover new concepts in data …
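One classic instance of reasoning bootstrapping learning is the EM pattern, sketched here on invented data: two coins of unknown bias, ten flips per trial, and we never observe which coin was used:

```python
from math import comb

trials = [9, 8, 2, 1, 7, 3, 8, 2]   # heads out of 10 flips (made-up data)
p1, p2 = 0.6, 0.4                   # initial guesses for the two biases

def binom(h, p):
    return comb(10, h) * p**h * (1 - p) ** (10 - h)

for _ in range(50):
    # E-step: *reason* about the hidden coin identity for each trial
    # (assumes each coin is equally likely a priori).
    w = [binom(h, p1) / (binom(h, p1) + binom(h, p2)) for h in trials]
    # M-step: *learn* new biases from the expected counts.
    p1 = sum(wi * h for wi, h in zip(w, trials)) / (10 * sum(w))
    p2 = sum((1 - wi) * h for wi, h in zip(w, trials)) / (10 * (len(w) - sum(w)))

print(p1, p2)   # settles near the heads rates of the two hidden clusters
```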
|
67
|
- Building Intelligence: 2001
- Representation
- Reasoning
- Learning
- Where Next
|
68
|
- Perception and understanding
- Planning and decision making
- Rich language for actions that represents uncertainty about the future
- Efficient algorithms for decision making under uncertainty that exploit representation structure
- Techniques for learning action models from interaction with the real world
|
74
|
- Professors often meet with students
- Jane is probably a student
- Professors like to explain
- “She” is probably Prof. Sarah
|
75
|
- Statistical NLP reveals semantic patterns in words:
- Semantics is implicit
- We want to go from words to concepts
- represent, learn, & use patterns at the semantic level
|
76
|
- Divide the problem into well-defined pieces
- Make progress on each one
- Build bridges to create a unified whole
|