Code

PolicyKit - an open-source web app for self-governing your community.
[Github repo] [website]

Wikum - an open-source web app for collectively summarizing large discussions.
[Github repo] [website]

Tilda - an open-source Slack plugin and web app for collaboratively tagging Slack conversations to produce structured summarizations.
[Slack plugin Github repo] [website Github repo] [website]

Eyebrowse - an open-source web app and Chrome extension for sharing aspects of one's browsing history for social and personal purposes.
[web server Github repo] [Chrome extension Github repo] [website]

Murmur - an open-source mailing list server and website that redesigns the mailing list to reduce noise and encourage sharing.
[mailing list Github repo] [website]

Squadbox - an open-source email system and website for people facing harassment to call on their friends for moderation support. This is a fork of the Murmur code repo.
[Github repo] [website]

Pano - an open-source web app and Chrome extension for bridging ideological divides by annotating moral framing. This is a fork of the Eyebrowse code repo.
[web server Github repo] [Chrome extension Github repo] [website]

Automatic Clustering of OPTICS - an implementation of an automatic hierarchical clustering algorithm given reachability plots output by the OPTICS algorithm. Written in Python.
[github]


Datasets

Credibility Coalition Crowdsourced Credibility - Explanation of dataset is in the paper:
[dataset]

Coarse Discourse - a large corpus of discourse annotations and relations on ~10K forum threads. Refer to our paper for an in-depth analysis and explanation of the data: Characterizing Online Discussion Using Coarse Discourse Sequences (ICWSM '17). Also make sure to check out the companion Python script for retrieving the full texts of the threads.
[Github repo and data] [blog post]

English Wikipedia RfCs - dataset of 7,316 RfCs on English Wikipedia over the course of 7 years (from 2011 to 2018). Explanation of dataset is in the paper: Deliberation and Resolution on Wikipedia: A Case Study of Requests for Comments (CSCW '18).
[dataset]

Credibility Coalition Credibility Indicators - dataset of 40 articles of varying credibility annotated with credibility indicators by 6 trained annotators. Explanation of dataset is in the paper: A Structured Response to Misinformation: Defining and Annotating Credibility Indicators in News Articles (WWW '18).
[dataset]


Side Projects

I've worked on many programming projects for fun and for classes. Here are some of them in case you ever find them useful:

BookSplice - a content-based recommender system for books based on user-defined tags taken from Goodreads. For the Codex Literary Hackathon in Feb 2017 at the MIT Media Lab. I was also on the organizing committee for this hackathon!
[github]
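
As a rough illustration of the content-based idea described above (this is not the BookSplice code), books can be compared by the overlap of their user-defined tags. The sketch below assumes Jaccard similarity over tag sets; the function names `jaccard` and `recommend` and the sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(book_tags, liked_title, k=3):
    """Return up to k titles whose tags overlap most with liked_title.

    book_tags maps a title to its set of tags (hypothetical data layout).
    Books with no tag overlap at all are left out.
    """
    target = book_tags[liked_title]
    scored = [
        (jaccard(target, tags), title)
        for title, tags in book_tags.items()
        if title != liked_title
    ]
    # Highest similarity first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in scored[:k] if score > 0]


# Made-up example data, for illustration only.
sample_tags = {
    "A": {"fantasy", "magic", "classic"},
    "B": {"fantasy", "magic"},
    "C": {"classic", "romance"},
    "D": {"sci-fi"},
}
recommendations = recommend(sample_tags, "A")
```

Here "B" ranks ahead of "C" because it shares two of three tags with "A", while "D" is dropped because it shares none.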

Down the Rabbit Hole - turning reading into an interactive exploratory environment! Read a book while moving a character around a space, where the words form platforms for the character to jump. For the Codex Literary Hackathon in Jan 2016 at the MIT Media Lab. I was also on the organizing committee for this hackathon!
[Demo of Alice in Wonderland] [Github]

Neural Public Library - a combination of a recursive neural net trained on book titles along with an API mashup that generates new book covers with hilarious results! For the Codex Literary Hackathon in summer 2015.
[Twitter] [Github]

WeCott - a prototype of a community website for boycotting and buycotting items. Initiated at the Hacking iCorruption hackathon hosted by MIT/Harvard and developed further in a Future of News class with participating journalists. We won 2nd place at the hackathon!
[github]

Lucy.js - a fully client-side search engine that extends the capabilities of the IndexedDB API to allow full-text search. We implemented inverted indexes, prefix and suffix tries, and our own hybrid approach. For MIT 6.830 Databases final project. Github contains JavaScript code, poster, paper, client-side demo, and client and server-side demo.
[github]
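
For readers unfamiliar with inverted indexes, here is a minimal sketch of the general technique in Python. It is not the Lucy.js implementation (which is JavaScript on top of IndexedDB); the names `tokenize`, `build_inverted_index`, `search`, and `prefix_search` are hypothetical, and the prefix lookup uses a plain scan over the vocabulary as a stand-in for a prefix trie.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token in the query (AND)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


def prefix_search(index, prefix):
    """Return ids of documents containing any token starting with prefix.

    A linear scan over the vocabulary; a trie would avoid visiting
    tokens that cannot match.
    """
    prefix = prefix.lower()
    hits = set()
    for token, doc_ids in index.items():
        if token.startswith(prefix):
            hits |= doc_ids
    return hits


# Made-up documents, for illustration only.
docs = {
    1: "Full-text search in the browser",
    2: "Browser storage with IndexedDB",
    3: "Search engines and tries",
}
index = build_inverted_index(docs)
```

With this data, `search(index, "browser search")` matches only document 1, while `prefix_search(index, "brow")` matches documents 1 and 2.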

RouteScout - a web application for bicyclists to exchange information such as where accidents occurred and particular tips for different routes. Created using iterative design and prototyping. For MIT 6.813 User Interface Design final project.
[github]

Baitless - an RSS feed generator that allows anyone to change the titles of articles in existing RSS feeds to rid them of clickbait. People can upvote new titles and subscribe to the new RSS feed with the most-upvoted baitless titles.
[github]

Quick script to scrape highlights off of the Amazon Kindle Top highlights page.
[Code]

A simple, nice-looking, interactive quiz in HTML and JavaScript.
[Code]

Twitter Analysis with Hadoop - a tool to analyze a large Twitter dataset using Hadoop Streaming.
[Code]



This site built by © Amy X. Zhang, github code here.