Hi. I'm Shrainik Jain.

And this is my personal homepage. I am a CS PhD student at the Paul G. Allen School of Computer Science & Engineering, University of Washington. My research focuses on database design, scientific data management, high-variety and federated databases, and, more recently, query recommendation.

I am a part of the UW DB group and am currently working with my advisors Bill Howe and Ed Lazowska on scientific data management problems.


I am one of the active developers for the SQLShare project. I fiddle with its user query logs and find interesting things.

Polystore Data Management

We are trying to build a query compiler and middleware that optimizes queries across multiple backends and figures out the best system to run each subplan on.

What query should I run on this dataset?

I am researching ways to extend SQLShare so that, when a new dataset is uploaded, it recommends meaningful queries to run on it. Kind of like what Tableau does for visualizations, but for queries.

Definition for High Variety

Despite all the attention Big Data receives, no one knows what it means. I am trying to fix that by providing a definition for the 3rd V of Big Data, Variety.

Benchmark for High Variety

How do you measure Variety for your data sources? What tool/service/DBMS is best suited for your data management needs? I am trying to answer these questions by finding quantitative measures for Variety.

Build Better Systems

We study user usage patterns and needs for High Variety Data Management. Together, the analysis of Variety and of user needs would help developers and researchers build better systems.


  1. The Myria Big Data Management and Analytics System and Cloud Service: Jingjing Wang, Tobin Baker, Magdalena Balazinska, Dan Halperin, Brandon Haynes, Bill Howe, Dylan Hutchison, Shrainik Jain, Ryan Maas, Parmita Mehta, Dominik Moritz, Brandon Myers, Jennifer Ortiz, Dan Suciu, Andrew Whitaker, and Shengliang Xu. In Proceedings of the 8th Biennial Conference on Innovative Data Systems Research (CIDR).
  2. Data Cleaning in the Wild: Reusable Curation Idioms from a Multi-Year SQL Workload: Shrainik Jain, Bill Howe. In Proceedings of the 11th International Workshop on Quality in DataBases. (draft) (bibtex)
  3. SQLShare: Results from a Multi-Year SQL-as-a-Service Experiment: Shrainik Jain, Dominik Moritz, Daniel Halperin, Bill Howe, Ed Lazowska. In Proceedings of the 2016 ACM SIGMOD International Conference on Management of Data. (pdf) (bibtex) (Reproducibility Award)
  4. High Variety Cloud Databases: Shrainik Jain, Dominik Moritz, Bill Howe. In Proceedings of the 2016 IEEE Cloud Data Management Workshop. (pdf) (bibtex)
  5. Pattern Formation for Asynchronous Robots without Agreement in Chirality: Sruti Gan Chaudhuri, Swapnil Ghike, Shrainik Jain, Krishnendu Mukhopadhyaya. (pdf) (bibtex)

Here's some stuff I have done in the past.

In no particular order of importance or chronology.

JavaScript Orleans

I introduced support for dynamic types and JavaScript development in Project Orleans during my internship at Microsoft Research, Redmond.


I was part of the team that built MOHORO, a scalable, pay-by-usage Desktop-as-a-Service on Azure. The project served as a precursor to Microsoft Azure RemoteApp. This was during my stint as an SDET at Microsoft India Development Center.

Disaster Recovery as a Service

Also during my stint at Microsoft India Development Center, I was part of the team responsible for building Hyper-V FR based Disaster Recovery as a Service, popularly known as Azure Site Recovery.


As an undergraduate at the Birla Institute of Technology and Science (BITS), Pilani, I spent most of my time being clueless. Apart from being lost, I was a TA, President, and Project Forum in-charge (mentor) for the Computer Science Association at various times during my undergrad. Read more in my CV.



This page was last updated on Oct 26, 2016