Domain experts are often interested in the extent to which a particular idea is expressed in a corpus. We explore (a) methods for efficiently identifying semantic matches to a query, and (b) the domain-specific analyses that the output of these methods enables.
Goal: understanding the factors that contribute to long-term disaster recovery (e.g., rebuilding, funding, community wellbeing).
Goal: understanding how U.S. senators, representatives, and other congressional actors use religious rhetoric in the context of different policy priorities.
Data/domain: more than 100 million web captures from house.gov and senate.gov (1997-2013), obtained by the Internet Archive.
This work was funded by the National Science Foundation (grant #1541025, graduate fellowship).