Research
Summary
We inhabit a vast, uncertain, and dynamic universe. To succeed in such an environment, machine learning approaches must handle massive amounts of noisy, changing evidence. My research tackles these challenges: I devise scalable algorithms for big data, design probabilistic models that gracefully capture uncertainty, and develop techniques for streaming inference that provide theoretical guarantees.
In addition to addressing fundamental challenges in machine learning, my work is also driven by important, practical questions in artificial intelligence. How can we use the wealth of knowledge on the Web to construct structured knowledge bases? When a user rates a new item, how can we update recommendations so other similar users benefit? What can we learn about the invisible influences of organizations from the social media activity of their followers? The central thread connecting these diverse questions is the need to exploit relationships and dependencies between instances -- whether they are facts in a knowledge base, items in a product catalog, or users of a social network.
Interests
- Scalable Machine Learning
- Probabilistic Models
- Statistical Relational Learning
- Knowledge Graph Construction
- Streaming and Online Inference
- Natural Language Processing
- Social Network Analysis
Research Projects
-
The web is a vast repository of knowledge, but automatically extracting that knowledge at scale has proven to be a formidable challenge. My research identifies the common failure patterns in information extraction (IE) projects and proposes solutions that use statistical signals from NLP pipelines and semantic knowledge from ontologies to build better knowledge graphs (KGs). These techniques can improve the F1 measure over IE approaches by 25% and, in a parallel implementation, require only ten minutes on KGs with millions of facts!
Learn more: ISWC13, AIMag15, Thesis16, AKBC13, GitHub
-
A key challenge in many artificial intelligence problems is that the evidence grows and changes over time, requiring updates to inferences. Every time a user rates a new movie on Netflix, posts a status update on Twitter, or adds a connection on LinkedIn, inferences about preferences, events, or relationships must be updated. My work investigates approximate updates as new evidence arrives, providing theoretical guarantees and offering practical algorithms.
Learn more: UAI15, AKBC14, StaRAI15, GitHub
-
The world is filled with ambiguous references to entities. These ambiguities show up everywhere, from blurry photos to pronouns and titles in news articles to misspelled names. Often, relational information exists that can help resolve these references. Unfortunately, most relational entity resolution systems require painstaking effort to adapt to new problems. My work provides a unified approach to entity resolution that can be applied to any problem and is easily customized to incorporate domain knowledge.
Learn more: StaRAI16, BayLearn14
-
-
-
Learn more: BigLearn12, EMNLP15