CS Seminar “From Text to Knowledge: Advances in Information Extraction and Social Media Analysis”

The IIT Department of Computer Science Seminar “From Text to Knowledge: Advances in Information Extraction and Social Media Analysis,” presented by Aron W. Culotta, will be held on Friday, March 1, from 11:25 am to 12:25 pm in Stuart Building, Room 111.

The continued growth of online text data presents exciting opportunities for automated knowledge discovery. In this talk, Culotta will present two lines of research that develop machine learning algorithms to convert large text collections into actionable knowledge. First, he will discuss information extraction (IE), which infers a relational database from unstructured text. After giving an overview of several IE tasks, including entity extraction, coreference resolution and relation extraction, Culotta will describe a new learning algorithm, SampleRank, that efficiently models the complex statistical dependencies inherent in IE, and present state-of-the-art results on extracting information from news stories. He will then turn to the analysis of informal texts, specifically Twitter data, and ask what can be inferred about society from this data. He will outline the fundamental challenges in this line of research and present work on monitoring flu activity, alcohol consumption and anxiety about Hurricane Irene, as well as recent research on inferring the geographical origin of Twitter messages.

Aron Culotta obtained his Ph.D. in Computer Science from the University of Massachusetts Amherst in 2008, advised by Andrew McCallum. He was a Microsoft Live Labs Fellow from 2006 to 2008 and completed research internships at IBM, Google and Microsoft Research. He is currently an assistant professor of computer science at Northeastern Illinois University.