1

 Daphne Koller
 Stanford University

2

 My collaborators on some of this work:
 Nir Friedman, Lise Getoor, Avi Pfeffer, Eran Segal, Ben Taskar
 The many people whose work influenced mine
 My friends and colleagues, and especially the DAGS group, for their help and support
 ONR, DARPA, NSF, Sloan
 Stuart Russell, Joe Halpern, IJCAI Trustees
 My family for putting up with me while I prepared it

3

 Building Intelligence: 2001
 Representation
 Reasoning
 Learning
 Where Next

4

 Scheduling & planning
 Navigating in uncertain environment
 Chess
 Diagnosis & maintenance
 Speech recognition
 Parsing natural language
 Recognizing faces
 Many more …

7

 Building Intelligence: 2001
 Representation
 Reasoning
 Learning
 Where Next

9

 Possible worlds are indivisible “atoms”
 e.g., states in a search algorithm
 Algorithms must treat worlds as black boxes
 e.g., access only via goal test, successor function
 Enumeration is all you can do!

10

 More transparent representation:
 World = assignment of values to attributes / truth values to propositional symbols
 Knowledge base relates these propositions

11

 Worlds are relational interpretations:
 Objects in the domain
 Properties of these objects
 Relations (links) between objects
 Relational languages allow universals:
 General patterns that apply to all objects

13

 All universals are false
 Smart students don’t always get A’s…
 True universals are rarely useful
 Smart students get either A, B, C, D, or F

15

 Probabilistic semantics:
 A set of possible worlds
 Each world associated with a probability

17

 Propositional representation:
 Avoid enumeration of worlds!
 Represent distribution compactly
 Bayesian networks use locality
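
The locality that Bayesian networks exploit means the joint distribution factors into small local conditional tables, one per variable given its parents. A minimal sketch of this idea, using the talk's student/course flavor; the network and all numbers are illustrative, not from the lecture:

```python
# Joint probability via the Bayesian-network factorization:
# P(Difficulty, Intelligence, Grade) =
#   P(Difficulty) * P(Intelligence) * P(Grade | Difficulty, Intelligence)
# All probabilities below are made up for illustration.

p_difficulty = {"easy": 0.6, "hard": 0.4}
p_intelligence = {"smart": 0.3, "average": 0.7}
p_grade = {  # P(Grade | Difficulty, Intelligence)
    ("easy", "smart"): {"A": 0.9, "B": 0.1},
    ("easy", "average"): {"A": 0.5, "B": 0.5},
    ("hard", "smart"): {"A": 0.6, "B": 0.4},
    ("hard", "average"): {"A": 0.1, "B": 0.9},
}

def joint(d, i, g):
    """P(d, i, g) as a product of the three local factors."""
    return p_difficulty[d] * p_intelligence[i] * p_grade[(d, i)][g]

# Three small local tables replace one table over all possible worlds.
print(joint("hard", "smart", "A"))  # 0.4 * 0.3 * 0.6 = 0.072
```

The compactness is the point: the full joint over n variables never has to be written down, only the local pieces.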

20

 Semantics:
 Probability distribution over possible worlds
 Worlds = first-order interpretations
 Representation (syntax):
 Probabilistic relational models (PRMs)

22

 Specifies types of objects in domain, attributes of each type of object, & types of links between objects
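
A relational schema of this kind can be sketched directly in code: classes, their descriptive attributes, and reference slots (links) between classes. The class and slot names below echo the talk's university example but are otherwise my own illustration:

```python
# A relational schema in the PRM spirit: object classes, each with
# descriptive attributes and reference slots naming the target class.

from dataclasses import dataclass, field

@dataclass
class ClassSpec:
    name: str
    attributes: list                               # descriptive attributes
    links: dict = field(default_factory=dict)      # slot name -> target class

schema = {
    "Student": ClassSpec("Student", ["Intelligence"]),
    "Course": ClassSpec("Course", ["Difficulty"]),
    "Registration": ClassSpec(
        "Registration", ["Grade"],
        links={"Of": "Student", "In": "Course"},
    ),
}

# Each Registration links one Student to one Course, and carries a Grade.
```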

24

 Many possible worlds for a given university
 All possible assignments of all attributes of all objects
 Infinitely many potential universities
 Each associated with a very different set of worlds

25

 Universals: Probabilistic patterns hold for all objects in class
 Locality: Represent direct probabilistic dependencies
 Links give us potential interactions!

27

 PRM can be applied to any university
 Any set of students, courses, registrations, faculty, …

28

 Objects in the resulting model are not independent
 Links between objects induce correlations between them
 The result is a “web of influence” that spans many objects of different types
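
One way to see the web of influence is to unroll the model over a concrete set of objects: every registration's grade gets the linked student's intelligence and the linked course's difficulty as parents, so objects that share a link become correlated. A small sketch with illustrative object names:

```python
# Unrolling a relational model over concrete objects: each
# Registration's Grade variable depends on the linked Student's
# Intelligence and the linked Course's Difficulty. Because both
# students below link to the same course, their grades share a parent.

registrations = [
    {"student": "jane", "course": "cs221"},
    {"student": "joe", "course": "cs221"},
]

ground_parents = {}
for r in registrations:
    grade_var = f"Grade({r['student']},{r['course']})"
    ground_parents[grade_var] = [
        f"Intelligence({r['student']})",
        f"Difficulty({r['course']})",
    ]

# Difficulty(cs221) is a parent of both grade variables, so evidence
# about Jane's grade changes our beliefs about Joe's.
for var, parents in ground_parents.items():
    print(var, "<-", parents)
```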

32

 Objects & links now first-class citizens in the model
 ⇒ Can introduce probabilistic model over them
 Different worlds can have different objects & links!
 Example reasoning pattern:
 CS221 is easy — everyone gets an A
 No, it’s not easy — it’s a graduate class, and only smart students take it

34

 Also need to represent:
 Time
 Events
 Actions
 Space
 Mental states of agents
 …
 There is much work on representing these concepts
 Need to integrate these with probabilistic language
 New questions are bound to arise…

35

 Building Intelligence: 2001
 Representation
 Reasoning
 Learning
 Where Next

37

 The Complexity Monster: First Blood [Kautz, 1989]
 The Complexity Monster Strikes Back [Russell, 1995]
 The Return of the Complexity Monster [Koller, 2001]

38

 Corollary: Use less expressive representations
 Problem:
 “Poly(size)” is great, but not if size is exponential!
 Expressive language can capture structure, which can be exploited for faster inference

40

 Our language exploits locality:
 All information is presented in terms of local influences
 Locality also allows divide & conquer inference
 In Bayesian networks
 In constraint satisfaction
 In logical reasoning
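
Divide & conquer in a Bayesian network means pushing sums inward over local factors (variable elimination) rather than enumerating all joint worlds. A minimal sketch on a three-variable chain A → B → C; the numbers are illustrative:

```python
# P(C) = sum_b P(C|b) * [sum_a P(b|a) * P(a)]
# Eliminating A first gives a local factor over B alone; eliminating B
# then gives P(C). Two small sums replace a sum over all 2**3 worlds.

p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # [a][b]
p_c_given_b = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.3, 1: 0.7}}  # [b][c]

# Step 1: eliminate A, producing a factor over B.
p_b = {b: sum(p_a[a] * p_b_given_a[a][b] for a in p_a) for b in (0, 1)}

# Step 2: eliminate B, producing the marginal over C.
p_c = {c: sum(p_b[b] * p_c_given_b[b][c] for b in p_b) for c in (0, 1)}

print(p_c)  # {0: 0.507, 1: 0.493}
```

On long chains this is the difference between linear and exponential work, which is exactly the payoff of locality.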

42

 Our language exploits locality:
 All information is presented in terms of local influences
 Locality also allows divide & conquer inference
 In Bayesian networks
 In constraint satisfaction
 In logical reasoning
 Divide & conquer inference extends to hierarchical structures, if only a few long-range interactions

43

 Luckily, much of the world is local & hierarchical
 Departments in a university
 Geographical proximity
 Interactions between objects are often limited

45

 Problem: At inference time, many algorithms
 convert objects to propositions
 recreate the repetition we wanted to avoid
 Robinson [1965]: Lifted theorem proving
 Do proof using variables x
 Only make distinctions about x that matter to proof

46

 Question: Can we use lifting in probabilistic case?
 Do probabilistic inference about “generic” objects x
 Only make distinctions that affect relevant probabilities
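
A simple flavor of what lifting buys in the probabilistic case: when n objects are interchangeable, we can reason once about a generic object and exploit the symmetry in closed form, instead of grounding out n separate propositions. The numbers below are illustrative:

```python
# n students are exchangeable and independent with respect to getting
# an A, so P(at least one A) needs one computation over the generic
# student, not n grounded variables.

p_a = 0.3          # P(a generic student gets an A) -- illustrative
n = 100            # number of identical students

# Lifted: a single closed-form step over the generic object.
p_some_a = 1 - (1 - p_a) ** n

# A grounded approach would enumerate and process all n objects;
# lifting exploits their symmetry to avoid that.
print(f"P(at least one A among {n} students) = {p_some_a:.6f}")
```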

50

 Question 1: What does this mean?
 In deterministic setting, unclear
 Probabilities provide formal framework:
 Weak influence = small effect on probabilities
 Question 2: Does it really help?
 If the world is stochastic, influence decays
 Over “distance” — long chains of influence
 Over “time” — as system evolves
 Over “quantity” — as many influences combine
 Recently, many successful algorithms that exploit this idea
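
Decay of influence over "time" can be seen directly in a stochastic process: run the same Markov chain from two different starting states and watch the distance between the resulting distributions shrink geometrically. The transition probabilities below are illustrative:

```python
# Two copies of a two-state Markov chain, started in opposite states.
# The total-variation distance between their state distributions
# contracts by |0.8 - 0.3| = 0.5 at every step, so the initial
# state's influence fades.

T = [[0.8, 0.2],   # transition matrix: row = current state
     [0.3, 0.7]]

def step(dist):
    """One transition: new[j] = sum_i dist[i] * T[i][j]."""
    return [sum(dist[i] * T[i][j] for i in range(2)) for j in range(2)]

p = [1.0, 0.0]     # chain started in state 0
q = [0.0, 1.0]     # chain started in state 1

tv_trace = []
for t in range(6):
    tv = 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))
    tv_trace.append(tv)
    p, q = step(p), step(q)

print(tv_trace)  # [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
```

The same geometric picture underlies decay over distance and over combined influences; this two-state chain is just the smallest case where it is visible.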

52

 Building Intelligence: 2001
 Representation
 Reasoning
 Learning
 Where Next

53

 Many possible knowledge bases:
 Different dependency structures
 Different parameters
 Which knowledge base is “good”?
 Probabilistic performance metric:
 How do we find a good one?
 Many learning algorithms have been developed
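
The probabilistic performance metric on the slide can be made concrete: score each candidate knowledge base by the log-likelihood it assigns to the data, and prefer the higher score. A toy sketch with two candidate "models" of a coin; the data and hypotheses are illustrative:

```python
# Score candidate models by the log-likelihood of observed data.

import math

data = ["H", "H", "T", "H", "T", "H", "H", "H"]  # 6 heads, 2 tails

def log_likelihood(p_heads, data):
    """log P(data | model with bias p_heads), assuming i.i.d. flips."""
    return sum(math.log(p_heads if x == "H" else 1 - p_heads)
               for x in data)

for p in (0.5, 0.75):
    print(f"P(H)={p}: log-likelihood = {log_likelihood(p, data):.3f}")

# The model matching the empirical frequency (6/8 = 0.75) scores
# higher, so the metric prefers it.
```

For structured models the same principle applies, with the search over dependency structures and parameters replacing the two-hypothesis comparison here.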

61

 Use reasoning to bootstrap learning
 Learn even when we never find out everything
 Substantial improvement in learned models
 “Lifelong” learning, from any data obtained
 Learn as a natural byproduct of reasoning
 Discover new concepts in data …

67

 Building Intelligence: 2001
 Representation
 Reasoning
 Learning
 Where Next

68

 Perception and understanding
 Planning and decision making
 Rich language for actions that represents uncertainty about future
 Efficient algorithms for decision making under uncertainty that exploit representation structure
 Techniques for learning action models from interaction with real world

74

 Professors often meet with students
 Jane is probably a student
 Professors like to explain
 “She” is probably Prof. Sarah

75

 Statistical NLP reveals semantic patterns in words:
 Semantics is implicit
 We want to go from words to concepts
 represent, learn, & use patterns at semantic level

76

 Divide the problem into well-defined pieces
 Make progress on each one
 Build bridges to create a unified whole
