Amy X. Zhang

Assistant Professor, starting Fall 2020 (currently a 2019-20 postdoc at Stanford CS)
Allen School of Computer Science & Engineering
University of Washington

My research is in the field of human-computer interaction and social computing. I work on designing and building systems to improve discourse, collaboration, and understanding online, with applications to social media and online communities, news and civic engagement, education, and computer-supported cooperative work and collective action.

Contact Info

Twitter: @amyxzh
GitHub: amyxzhang

Latest News

Sept 2019: Started postdoc at Stanford CS working with Michael Bernstein.

Aug 2019: Turned in my thesis and graduated from MIT!

May 2019: Defended my dissertation!

Oct 2018: Received CSCW Best Paper Award for paper on making sense of group chat.

July 2018: I did an interview about Squadbox on BBC TV!

Travel Schedule

Feb 7-9: Mountain View, Social Science FooCamp
Feb 11: Boston, BostonCHI invited talk
Feb 27-28: Washington D.C., CRA Career Mentoring Workshop
Mar 17-18: Seattle, UW CSE Visit Day
Mar 20-21: Boston, Computation+Journalism
Apr 25-30: Honolulu, ACM CHI
May 28-29: Cambridge, MIT Graduation
June 18-19: Cambridge, UIST PC Meeting

Blog Posts

Themes from the Truth and Trust Online Conference 2019 - Nov 13, 2019
Exploring the Design of Credibility Tools - Dec 19, 2018
Having Trouble Keeping Up with Group Chat? - Nov 5, 2018
4 Things We Learned from Talking to People who Face Harassment: Research behind Squadbox - April 17, 2018
Considering End Users in the Design of News Credibility Annotations - June 12, 2017
Year in Reviews - Dec 30, 2016
Mailing Lists: Why Are They Still Here, What's Wrong With Them, and How Can We Fix Them? - May 11, 2015
Thoughts and Reflections from a 1st Time Reviewer - March 19, 2014
What is a Neighborhood? - March 11, 2012

Application Materials:

Research Statement, Teaching Statement, Diversity Statement




Digital Juries: A Civics-Oriented Approach to Platform Governance

Drawing inspiration from constitutional jury trials in many legal systems, this paper proposes digital juries as a civics-oriented approach for adjudicating content moderation cases. Building on existing theoretical models of jury decision-making, we outline a 5-stage model characterizing the space of design considerations in a digital jury process. We implement two examples of jury designs involving blind-voting and deliberation. From users who participate in our jury implementations, we gather informed judgments of the democratic legitimacy of a jury process for content moderation. We find that digital juries are perceived as more procedurally just than existing common platform moderation practices, but also find disagreement over whether jury decisions should be enforced or used as recommendations.

Digital Juries: A Civics-Oriented Approach to Platform Governance. Jenny Fan, Amy X. Zhang. CHI '20.


How do Data Science Workers Collaborate? Roles, Workflows, and Tools

In this work, we conducted an online survey with 183 participants who work in various aspects of data science. We focused on their reported interactions with each other (e.g., managers with engineers) and with different tools (e.g., Jupyter Notebook). We found that data science teams are extremely collaborative and work with a variety of stakeholders and tools during the six common steps of a data science workflow (e.g., clean data and train model). We also found that the collaborative practices workers employ, such as documentation, vary according to the kinds of tools they use. Based on these findings, we discuss design implications for supporting data science team collaborations and future research directions.

How do Data Science Workers Collaborate? Roles, Workflows, and Tools. Amy X. Zhang*, Michael Muller*, Dakuo Wang. CSCW '20.
*Equal Contribution.



Ph.D. Thesis: Systems for Collective Human Curation of Online Discussion

The internet was supposed to democratize discussion, allowing people from all walks of life to communicate with each other at scale. However, this vision has not been fully realized—instead online discourse seems to be getting worse, as people are increasingly drowning in discussion, with much of it unwanted or unpleasant. In this thesis, I present new systems that empower discussion participants to work collectively to bring order to discussions through a range of curation tools that superimpose richer metadata structure on top of standard discussion formats. These systems enable the following new capabilities: 1) recursive summarization of threaded forums using Wikum, 2) teamsourced tagging and summarization of group chat using Tilda, 3) fine-grained customization of email delivery within mailing lists using Murmur, and 4) friendsourced moderation of messages against online harassment using Squadbox.

In a world of abundant discussion and mass capabilities for amplification, the curation of a social space becomes as essential as content creation in defining the nature of that space. By putting more powerful techniques for curation in the hands of everyday people, I envision a future where end users are empowered to actively co-create every aspect of their online discussion environments, bringing in their nuanced and contextual insights.

Systems for Collective Human Curation of Online Discussion. Amy X. Zhang.



Opportunities for Automating Email Processing: A Need-Finding Study

Email management consumes significant effort from senders and recipients. Some of this work might be automatable. We performed a mixed-methods need-finding study to learn: (i) what sort of automatic email handling users want, and (ii) what kinds of information and computation are needed to support that automation. To further investigate our findings, we developed a platform for authoring small scripts over a user's inbox. Of the automations found in our studies, half are impossible in popular email clients, motivating new design directions.

Opportunities for Automating Email Processing: A Need-Finding Study. Soya Park, Amy X. Zhang, Luke Murray, David Karger. CHI '19.
pdf bibtex


Better Email Automation by MIT News


Considering Social Factors in New Technologies

New technologies that alter how we interact with other people come and go, creating new opportunities but also upending social norms. How should builders of new technologies consider the social implications of their systems?

Considering Social Factors in New Technologies. Amy X. Zhang. XRDS: Crossroads, The ACM Magazine for Students.



Making Sense of Group Chat through Collaborative Tagging and Summarization

Group chat systems generate long streams of unstructured back-and-forth discussion that are difficult to comprehend. We investigate ways to enrich the representation of chat conversations, using techniques such as tagging and summarization, to enable users to better make sense of chat. Through needfinding interviews with 15 active group chat users, we identified the importance of structured representations, including signals such as discourse acts. We then developed Tilda, a prototype system that enables people to collaboratively enrich their chat conversation while conversing.
Try it out! code blogpost

Making Sense of Group Chat through Collaborative Tagging and Summarization. Amy X. Zhang, Justin Cranshaw. CSCW '18.
Best Paper Award.
pdf bibtex talk slides


Squadbox: A Tool to Combat Email Harassment Using Friendsourced Moderation

Communication platforms have struggled to provide effective tools for people facing harassment online. We conducted interviews with 18 recipients of online harassment to understand their strategies for coping, finding that they often resorted to asking friends for help. Inspired by these findings, we explore the feasibility of friendsourced moderation as a technique for combating online harassment. We present Squadbox, a tool to help recipients of email harassment coordinate a "squad" of friend moderators to shield and support them during attacks.
Try it out! code blog

Squadbox: A Tool to Combat Email Harassment Using Friendsourced Moderation. Kaitlin Mahar, Amy X. Zhang, David Karger. CHI '18.
pdf bibtex talk slides

Squadbox: A Tool to Combat Online Harassment Using Friendsourced Moderation. Kaitlin Mahar, Amy X. Zhang, David Karger.
Demo Paper CHI '18.
demo pdf


[CHI talk slides and transcript] [OpenTranscripts of talk]

Inside MIT AI's Workshop (including segment on Squadbox) by BBC Click Television Program
New software by MIT, dubbed 'Squadbox,' hopes to combat cyberbullying by ABC News
MIT researchers created a new tool that lets your 'squad' combat online harassment by Business Insider
A New Tool For Fighting Online Abuse Lets You Recruit Friends For Help by Refinery29
MIT has a new tool to combat online harassment: your friends by The Verge
MIT Tool Lets Your Friends Help You Fight Email Harassment by PCMag
How you and your friends can fight back against online trolls by New Scientist
This MIT Tool Enlists Your Squad To Stop Toxic Internet Harassers by Fast Co.Design
With Squadbox, friends moderate harassing messages in your email by Engadget
Recruit Your Friends to Stop Online Harassment by LifeHacker
MIT creates tool to help curb cyber bullying by Channel 7 News Boston
MIT researchers have developed a tool to fight cyberbullying by The Daily Dot
MIT researchers aim to tackle cyberbullying with Squadbox - this is how it works by Qrius
Can 'friendsourcing' save us from online harassment? by The National Student
An Open Tool to Fight Harassment - Squadbox | An Open Project Spotlight by Mozilla Open Leaders


A Structured Response to Misinformation: Defining and Annotating Credibility Indicators in News Articles

Given the variety of approaches to combat misinformation, collective agreement on the indicators that signify credible content could allow for greater collaboration and data-sharing across initiatives. We present an initial set of indicators for article credibility defined by a diverse coalition of experts. These indicators originate from both within an article's text as well as from external sources or article metadata. As a proof-of-concept, we present a dataset of 40 articles of varying credibility annotated with our indicators by 6 trained annotators using specialized platforms.
Credibility Coalition website data

A Structured Response to Misinformation: Defining and Annotating Credibility Indicators in News Articles. Amy X. Zhang, Aditya Ranganathan, Sarah Emlen Metz, Scott Appling, Connie Moon Sehat, Norman Gilmore, Nick B. Adams, Emmanuel Vincent, Jennifer 8. Lee, Martin Robbins, Ed Bice, Sandro Hawke, David Karger, and An Xiao Mina. WWW '18 Companion.
pdf bibtex talk slides


Creating a Circle of Trust by Gates Cambridge
These academics are on the frontlines of fake news research by Poynter
Elevating quality journalism on the open web by Google News Initiative
How The Credibility Coalition Determines Trust Indicators by news:rewired
The Credibility Coalition is working to establish the common elements of trustworthy articles by


Evaluation and Refinement of Clustered Search Results with the Crowd

We investigate using crowd-based human evaluation to inspect, evaluate, and improve clusters to create high-quality clustered search results at scale. We introduce a workflow that begins by using a collection of well-known clustering algorithms to produce a set of clustered search results for a given query. Then, we use crowd workers to holistically assess the quality of each clustered search result in order to find the best one. Finally, the workflow has the crowd spot and fix problems in the best result in order to produce a final output. We evaluate this workflow on 120 top search queries from the Google Play Store.

Evaluation and Refinement of Clustered Search Results with the Crowd.
Amy X. Zhang, Wei Chai, Jinjun Xu, Lichan Hong, Ed Chi. ACM Transactions on Interactive Intelligent Systems: Special Issue on Human-Centered Machine Learning. 8, 2, Article 14 (June 2018), 28 pages. Presented at ACM IUI 2019, Los Angeles.
pdf bibtex talk slides


Deliberation and Resolution on Wikipedia: A Case Study of Requests for Comments

We present an analysis of Requests for Comments (RfCs), one of the main vehicles on Wikipedia for formally resolving a policy or content dispute. We collected an exhaustive dataset of 7,316 RfCs on English Wikipedia over the course of 7 years and conducted a qualitative and quantitative analysis into what issues affect the RfC process. We found that a major issue affecting the RfC process is the prevalence of RfCs that could have benefited from formal closure but that linger indefinitely without one. From these findings, we developed a model that predicts whether an RfC will go stale with 75.3% accuracy, a level that is approached as early as one week after dispute initiation.
Wikipedia project page data

Deliberation and Resolution on Wikipedia: A Case Study of Requests for Comments.
Jane Im, Amy X. Zhang, Christopher J. Schilling, David Karger. CSCW '18.
pdf bibtex talk slides


A Third of Wikipedia Discussions Are Stuck in Forever Beefs by Vice Motherboard
Predictive Model Identifies Wikipedia Arguments that Will Never Get Resolved by Campus Technology


Post-literate Programming: Linking Discussion and Code in Software Development Teams

The literate programming paradigm presents a program interleaved with natural language text explaining the code's rationale and logic. While this is great for program readers, the labor of creating literate programs deters most program authors from providing this text at authoring time. Instead, as we determine through interviews, developers provide their design rationales after the fact, in discussions with collaborators. We propose to capture these discussions and incorporate them into the code. We have prototyped a tool to link online discussion of code directly to the code it discusses. Incorporating these discussions incrementally creates post-literate programs that convey information to future developers.

Post-literate Programming: Linking Discussion and Code in Software Development Teams.
Soya Park, Amy X. Zhang, David R. Karger. UIST '18. Poster Paper.
pdf bibtex



Characterizing Online Discussion Using Coarse Discourse Sequences

We present a novel method for classifying comments in online discussions into a set of coarse discourse acts towards the goal of better understanding discussions at scale. We collect and release a corpus of over 9,000 threads comprising over 100,000 comments manually annotated via paid crowdsourcing with discourse acts and randomly sampled from the site Reddit. Using our corpus, we demonstrate how the analysis of discourse acts can characterize different types of discussions, including discourse sequences such as Q&A pairs and chains of disagreement, as well as different communities. Finally, we conduct experiments to predict discourse acts using our corpus, finding that structured prediction models such as conditional random fields can achieve an F1 score of 75%.
code and data

Characterizing Online Discussion Using Coarse Discourse Sequences. Amy X. Zhang, Bryan Culbertson, Praveen Paritosh. ICWSM '17.
pdf bibtex talk slides

Coarse Discourse: A Dataset for Understanding Online Discussions by Google Research Blog


Wikum: Bridging Discussion Forums and Wikis using Recursive Summarization

Large-scale discussions between many participants abound on the internet today, on topics ranging from political arguments to group coordination. But as these discussions grow to tens of thousands of posts, they become ever more difficult for a reader to digest. In this article, we describe a workflow called recursive summarization, implemented in our Wikum prototype, that enables a large population of readers or editors to work in small doses to refine out the main points of the discussion. More than just a single summary, our workflow produces a summary tree that enables a reader to explore distinct subtopics at multiple levels of detail based on their interests.
Try it out! code

Wikum: Bridging Discussion Forums and Wikis using Recursive Summarization. Amy X. Zhang, Lea Verou, David Karger. CSCW '17.
pdf bibtex talk slides
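
The summary tree described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not Wikum's actual data model: the `Node` class, the `render` function, and the `detail` parameter are hypothetical names, and the example thread is made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A comment, optionally carrying a summary of itself and its replies (hypothetical model)."""
    text: str
    summary: Optional[str] = None  # written by an editor; None if not yet summarized
    children: List["Node"] = field(default_factory=list)


def render(node: Node, detail: int) -> List[str]:
    """Return the lines a reader sees after expanding `detail` levels.

    At detail 0 a node collapses to its summary if it has one, otherwise to
    its own text. Above 0 the node's text is shown and each reply is
    rendered one level shallower.
    """
    if detail == 0:
        return [node.summary if node.summary is not None else node.text]
    lines = [node.text]
    for child in node.children:
        lines.extend(render(child, detail - 1))
    return lines


# Made-up example thread with summaries at two levels.
root = Node(
    "Should we move the meetup?",
    summary="Thread: the group agrees to move the meetup to Friday.",
    children=[
        Node("Friday works for me."),
        Node(
            "Thursday is bad for me.",
            summary="Subthread: Thursday conflicts.",
            children=[Node("Same, I have class.")],
        ),
    ],
)

for level in range(3):
    print(level, render(root, level))
```

At level 0 the reader sees only the top-level summary; each additional level swaps a summary for the comments it stands in for, so subtopics can be explored at different depths.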


Cutting down the clutter in online conversations by MIT News


Using Student Annotated Hashtags and Emojis to Collect Nuanced Affective States

Determining affective states such as confusion from students' participation in online discussion forums can be useful for instructors of a large classroom. In this work, we harness affordances prevalent in social media to allow students to self-annotate their discussion posts with a set of hashtags and emojis, a process that is fast and cheap. From a dataset of over 25,000 discussion posts from two courses containing self-annotated posts by students, we demonstrate how we can identify linguistic differences between posts expressing confusion versus curiosity, achieving 83% accuracy at distinguishing between the two affective states.

Using Student Annotated Hashtags and Emojis to Collect Nuanced Affective States. Amy X. Zhang, Michele Igo, Marc Facciotti, David Karger. Learning@Scale '17. Poster paper.
pdf bibtex



Mavo: Creating Interactive Data-Driven Web Applications by Authoring HTML

Many people can author static web pages with HTML and CSS but find it hard or impossible to program persistent, interactive web applications. We show that for a broad class of CRUD (Create, Read, Update, Delete) applications, this gap can be bridged. Mavo extends the declarative syntax of HTML to describe Web applications that manage, store and transform data. Using Mavo, authors with basic HTML knowledge define complex data schemas implicitly as they design their HTML layout. They need only add a few attributes and expressions to their HTML elements to transform their static design into a persistent, data-driven web application whose data can be edited by direct manipulation of the content in the browser.
Try it out! code

Mavo: Creating Interactive Data-Driven Web Applications by Authoring HTML. Lea Verou, Amy X. Zhang, David Karger. UIST '16.
pdf bibtex

Introducing Mavo: Create Web Apps Entirely By Writing HTML! by Lea Verou - Smashing Magazine


Exploring Social Browsing

While the web contains many social websites, people are generally left in the dark about the activities of other people traversing the web as a whole. We explore the potential benefits and privacy considerations around generating a real-time, publicly accessible stream of web activity where users can publish chosen parts of their web browsing data. We also develop a new social media system for collecting, sharing, and visualizing aspects of one's browsing history.
Try it out! code

Opportunities and Challenges Around a Tool for Social and Public Web Activity Tracking. Amy X. Zhang, Joshua Blum, David Karger. CSCW '16.
pdf bibtex talk slides

Eyebrowse: Selective and Public Web Activity Sharing. Amy X. Zhang, Joshua Blum, David Karger. Demo Paper. CSCW '16.
demo pdf

Reimagining Web Activity Tracking for Social Applications. Amy X. Zhang, Joshua Blum, David Karger. Workshop Paper. Everyday Surveillance Workshop @ CHI '16.
workshop pdf


Browsing in public by MIT News
MIT's Eyebrowse To Rank and Review Internet Sites, While Retaining Privacy by Slashdot
MIT proposes 'Eyebrowse' scheme to rank and review the entire internet by The Stack
Eyebrowse project lets users make web browsing history public and the accompanying radio segment by CBC News
Eyebrowse Aims to Socialize Your Web Surfing Experience by Inverse
MIT's Eyebrowse lets users make their browsing history public by ComputerWorld
System lets web users share aspects of their browsing history with friends, researchers by


Gender and Ideology in the Spread of Anti-Abortion Policy

In the past few years an unprecedented wave of anti-abortion policies was introduced and enacted in state governments in the U.S., affecting millions of constituents. We study this rapid spread of policy change as a function of the underlying ideology of constituents. We examine over 200,000 public messages posted on Twitter surrounding abortion in the year 2013, a year that saw 82 new anti-abortion policies enacted. From these posts, we characterize people's expressions of opinion on abortion and show how these expressions align with policy change on these issues.

Gender and Ideology in the Spread of Anti-Abortion Policy. Amy X. Zhang, Scott Counts. CHI '16.
pdf bibtex talk slides


Conference Recommendation and Meetups

Confer is a tool for conference schedule organization and session/paper recommendation. Using the interface and its data, we are exploring ways to facilitate meetings, particularly between new and established members of research communities. We piloted a meetup session at CSCW '15 called "Confer Coffee", in which people who liked similar papers on Confer were grouped together and then gathered in person at the conference. We are interested in piloting more sessions at future conferences that use Confer.
Try it out! code

Confer: A Conference Recommendation and Meetup Tool. Amy X. Zhang, David Karger, Anant Bhardwaj. Demo Paper. CSCW '16.
demo pdf bibtex



Reimagining the Mailing List

Mailing lists have existed since the early days of email and are still widely used today, even as more sophisticated online forums and social media websites proliferate. We explore why members prefer mailing lists to other group communication tools. But we also identify several tensions around mailing list usage that appear to contribute to dissatisfaction with them.
Try it out! code

Mailing Lists: Why Are They Still Here, What's Wrong With Them, and How Can We Fix Them? Amy X. Zhang, Mark Ackerman, David Karger. CHI '15.
pdf bibtex talk slides

One way to reduce email stress: Re-invent the mailing list by MIT News
Hacker News


Modeling Ideology and Predicting Policy with Social Media

We study the many voices discussing an issue within a constituency and how they reflect ideology and may signal the outcome of important policy decisions. Focusing on the issue of same-sex marriage legalization, we examine almost 2 million public Twitter posts related to same-sex marriage across U.S. states over the course of 4 years starting from 2011.

Modeling Ideology and Predicting Policy Change with Social Media: Case of Same-Sex Marriage. Amy X. Zhang, Scott Counts. CHI '15.
Best of CHI Honorable Mention.
pdf bibtex talk slides


Visualizing Text Corpora to Compare Media Frames

We develop a visualization technique and visual analytic system that enables the study of media frames across text corpora. In particular our system allows scholars or other analysts to compare media frames in a visualization called the Compare Cloud, which explicitly maps word prevalence and context information between two corpora.

Compare Clouds: Visualizing Text Corpora to Compare Media Frames. Nick Diakopoulos, Dag Elgesem, Andrew Salway, Amy X. Zhang, Knut Hofland. TextVis Workshop @ IUI '15.
pdf bibtex
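
To give a rough sense of what comparing word prevalence between two corpora can look like, here is a minimal sketch. It is an assumption-laden illustration, not the Compare Cloud's actual computation: the functions `relative_freq` and `prevalence_gap` and the toy corpora are hypothetical, and context information is ignored entirely.

```python
from collections import Counter
from typing import Dict, List


def relative_freq(tokens: List[str]) -> Dict[str, float]:
    """Share of the corpus taken up by each word."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def prevalence_gap(corpus_a: List[str], corpus_b: List[str]) -> Dict[str, float]:
    """Difference in relative frequency for every word seen in either corpus.

    Positive values lean toward corpus A, negative values toward corpus B,
    and values near zero mark words the two corpora use about equally.
    """
    freq_a = relative_freq(corpus_a)
    freq_b = relative_freq(corpus_b)
    vocabulary = set(freq_a) | set(freq_b)
    return {w: freq_a.get(w, 0.0) - freq_b.get(w, 0.0) for w in vocabulary}


# Toy corpora (made up), already tokenized.
corpus_a = "tax cut tax relief".split()
corpus_b = "tax burden".split()
gap = prevalence_gap(corpus_a, corpus_b)
for word in sorted(gap, key=gap.get):
    print(f"{word:8s} {gap[word]:+.2f}")
```

A visualization could then place shared words near the center and corpus-specific words toward either side, which is one plausible reading of mapping prevalence between two corpora.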


Spreadsheet-backed Web Apps

We present a system for creating basic web applications using spreadsheets in place of a server and using HTML to describe the client UI. Authors connect the two by placing spreadsheet references inside HTML attributes. Data computation is provided by spreadsheet formulas. The result is a reactive read-write-compute web page without a single line of JavaScript code.

Spreadsheet Driven Web Applications. Edward Benson, Amy X. Zhang, David Karger. UIST '14.
pdf bibtex


Cloudstitch: Beautiful Apps without the Programming Hassle by Rough Draft Ventures
Now a startup called Cloudstitch at

Kinetic Scrolling on Mobile and Tablets

To support navigation of long documents on touchscreen devices, we introduce content-aware kinetic scrolling, a novel scrolling technique that dynamically applies pseudo-haptic feedback in the form of friction around points of high interest within the page. This allows users to quickly find interesting content while exploring without further cluttering the limited visual space.

Content-Aware Kinetic Scrolling for Supporting Web Page Navigation. Juho Kim, Amy X. Zhang, Jihee Kim, Robert Miller, Krzysztof Gajos. UIST '14.
pdf bibtex
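
The idea of adding friction around points of high interest can be sketched as a simple fling simulation. This is a minimal sketch under stated assumptions, not the paper's model: the function names, the constant friction values, and the fixed `radius` are all hypothetical choices made for illustration.

```python
from typing import List


def friction_at(pos: float, interest_points: List[float],
                base: float = 0.02, boost: float = 0.20,
                radius: float = 100.0) -> float:
    """Fraction of velocity lost per step at scroll offset `pos` (pixels).

    Friction is raised by `boost` within `radius` pixels of any interest
    point. All constants are assumed values, chosen only for illustration.
    """
    near = any(abs(pos - p) <= radius for p in interest_points)
    return base + boost if near else base


def simulate_fling(start: float, velocity: float,
                   interest_points: List[float],
                   min_speed: float = 1.0, max_steps: int = 10_000) -> float:
    """Advance a fling step by step and return the offset where it rests."""
    pos = start
    for _ in range(max_steps):
        if abs(velocity) < min_speed:
            break
        pos += velocity
        velocity *= 1.0 - friction_at(pos, interest_points)
    return pos


plain = simulate_fling(0.0, 50.0, [])
sticky = simulate_fling(0.0, 50.0, [500.0])
print(f"rest without interest points: {plain:.0f}px")
print(f"rest with interest point at 500px: {sticky:.0f}px")
```

With the same initial velocity, the fling that passes through the high-friction zone comes to rest sooner, which is the slowing-down effect that lets a reader notice interesting content while scrolling.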


Moral Framing in Climate Change Blog Discourse

In this work we develop a novel operationalization of moral evaluation frames and study their use within a corpus of blogs discussing climate change. We develop a text visualization tool called Lingoscope that allows the user to observe and filter the contextual terms that convey moral framing across large volumes of text, as well as to drill down to specific examples.

Identifying and Analyzing Moral Evaluation Frames in Climate Change Blog Discourse. Nick Diakopoulos, Amy X. Zhang, Dag Elgesem, Andrew Salway. ICWSM '14.
pdf bibtex

Controversy and Sentiment in Online News

In this work, we take a data-driven approach to understand how controversy interplays with emotional expression and biased language in the news. We begin by introducing a new dataset of controversial and noncontroversial terms collected using crowdsourcing. Then, focusing on 15 major U.S. news outlets, we compare millions of articles discussing controversial and non-controversial issues over a span of 7 months.

Controversy and Sentiment in Online News. Yelena Mejova, Amy X. Zhang, Nicholas Diakopoulos, Carlos Castillo. Computation + Journalism '14.
pdf bibtex

Why Media Bias Has Nowhere to Run and Hide from Data Science by CrowdFlower



Hoodsquare: Defining Neighborhoods

Information garnered from activity on location-based social networks can be harnessed to characterize urban spaces and organize them into neighborhoods. We adopt a data-driven approach to the identification and modeling of urban neighborhoods using location-based social networks.

Hoodsquare: Modeling and Recommending Neighborhoods in Location-based Social Networks. Amy X. Zhang, Anastasios Noulas, Salvatore Scellato, and Cecilia Mascolo. SocialCom '13.
pdf bibtex talk slides

Good Neighbours by Gates Cambridge


Visual Analytics of Media Frames

There is interest in trying to identify new frames around issues, and to compare how types of frames vary across different news outlets, or over time. We consider these analytic needs in the context of two use-cases relating to news producers and news consumers, and describe the initial design of a visual analytics tool, the LingoScope, in terms of how it supports these use-cases.

Visual Analytics of Media Frames in Online News and Blogs. Nick Diakopoulos, Amy X. Zhang, and Andrew Salway. TextVis Workshop @ InfoVis '13.
pdf bibtex



Diurnal Urban Routines on Twitter

We study and characterize diurnal patterns in social media data for different urban areas, with the goal of providing context and framing for reasoning about such patterns at different scales. Using one of the largest datasets to date of Twitter content associated with different locations, we examine within-day variability and across-day variability of diurnal keyword patterns for different locations.

On the Study of Diurnal Urban Routines on Twitter. Mor Naaman, Amy X. Zhang, Samuel Brody, and Gilad Lotan. ICWSM '12.
pdf bibtex

This site built by © Amy X. Zhang, github code here.