Conference papers
DataComp-LM: In search of the next generation of training sets for language models
Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, Maciej Kilian, Hanlin Zhang, Rulin Shao, Sarah Pratt, Sunny Sanyal, Gabriel Ilharco, Giannis Daras, Kalyani Marathe, Aaron Gokaslan, Jieyu Zhang, Khyathi Chandu, Thao Nguyen, Igor Vasiljevic, Sham Kakade, Shuran Song, Sujay Sanghavi, Fartash Faghri, Sewoong Oh, Luke Zettlemoyer, Kyle Lo, Alaaeldin El-Nouby, Hadi Pouransari, Alexander Toshev, Stephanie Wang, Dirk Groeneveld, Luca Soldaini, Pang Wei Koh, Jenia Jitsev, Thomas Kollar, Alexandros G Dimakis, Yair Carmon, Achal Dave, Ludwig Schmidt, Vaishaal Shankar
NeurIPS 2024, Datasets and Benchmarks Track
Insufficient Statistics Perturbation: Stable Estimators for Private Least Squares
Gavin Brown, Jonathan Hayase, Samuel Hopkins, Weihao Kong, Xiyang Liu, Sewoong Oh, Juan C. Perdomo, Adam Smith
COLT, 2024
presented at FORC 2024, Symposium on the Foundations of Responsible Computing
presented at TPDP 2024, Theory and Practice of Differential Privacy
DPZero: Private Fine-Tuning of Language Models without Backpropagation
Liang Zhang, Bingcong Li, Kiran Koshy Thekumparampil, Sewoong Oh, Niao He
ICML, 2024
presented at TPDP 2024, Theory and Practice of Differential Privacy
slides from my talk at CSL student conference at UIUC are available here
slides from my talk at IMS International Conference on Statistics and Data Science (ICSDS) 2023 are available here
a workshop version presented at NeurIPS 2023 workshop on Federated Learning in the Age of Foundation Models
Uncertainty Quantification with User-level Differential Privacy
Abhradeep Guha Thakurta, Dj Dvijotham, Georgie Evans, Peter Kairouz, Ryan McKenna, Sewoong Oh
working paper
Advancing Differential Privacy: Where We Are Now and Future Directions for Real-World Deployment
Rachel Cummings, Damien Desfontaines, David Evans, Roxana Geambasu, Matthew Jagielski, Yangsibo Huang, Peter Kairouz, Gautam Kamath, Sewoong Oh, Olga Ohrimenko, Nicolas Papernot, Ryan Rogers, Milan Shen, Shuang Song, Weijie Su, Andreas Terzis, Abhradeep Thakurta, Sergei Vassilvitskii, Yu-Xiang Wang, Li Xiong, Sergey Yekhanin, Da Yu, Huanyu Zhang, Wanrong Zhang
Harvard Data Science Review, 6(1), January 2024
In July 2022, we hosted a workshop titled “Differential Privacy (DP): Challenges towards the Next Frontier” with experts from industry, academia, and the public sector to discuss and find solutions to the challenges of differential privacy. This is a report from that workshop.
DataComp: In search of the next generation of multimodal datasets
Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander Ratner, Shuran Song, Hannaneh Hajishirzi, Ali Farhadi, Romain Beaumont, Sewoong Oh, Alex Dimakis, Jenia Jitsev, Yair Carmon, Vaishaal Shankar, Ludwig Schmidt
NeurIPS 2023, Datasets and Benchmarks Track (Oral presentation)
MAML and ANIL Provably Learn Representations
Liam Collins, Aryan Mokhtari, Sewoong Oh, Sanjay Shakkottai
ICML, 2022
video of a talk in Dec 2022 at C3 Digital Transformation Institute is available here
slides from my talk at C3 Digital Transformation Institute are available here
5-minute presentation by Liam Collins at ICML is available here
Differential privacy and robust statistics in high dimensions
Xiyang Liu, Weihao Kong, Sewoong Oh
COLT, 2022
presented at the third AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI-22); the recording of a 12-minute presentation is available here
video of a talk in Nov 2021 at SNAPP seminar series is available here
slides from my talk at SNAPP seminar are available here
slides from my talk at Google are available here
Robust and Differentially Private Mean Estimation
Xiyang Liu, Weihao Kong, Sham Kakade, Sewoong Oh
NeurIPS 2021
presented at the ICML 2021 Workshop on Federated Learning for User Privacy and Data Confidentiality (ICML-FL 2021)
presented at the CCS 2021 workshop Privacy Preserving Machine Learning (PPML’21)
video of a talk in Oct 2021 at Simons Institute is available here
slides from my talk are available here
code is available here
DeepTurbo: Deep Turbo Decoder
Yihan Jiang, Hyeji Kim, Himanshu Asnani, Sreeram Kannan, Sewoong Oh, and Pramod Viswanath
2019 IEEE 20th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), 2019
Detecting Sponsored Recommendations
Subhashini Krishnasamy, Rajat Sen, Sewoong Oh, and Sanjay Shakkottai
SIGMETRICS (short paper) 2015
What's your choice? Learning the mixed multi-nomial logit model
Ammar Ammar, Sewoong Oh, Devavrat Shah, and Luis-Filipe Voloch
SIGMETRICS (short paper) 2014
Journal papers
- Sequence-to-sequence translation from mass spectra to peptides with a transformer model
Melih Yilmaz, William E. Fondrie, Wout Bittremieux, Carlo F. Melendez, Rowan Nelson, Varun Ananth, Sewoong Oh, William Stafford Noble
Nature Communications, 2024, 15(1), pp.6427
- Accounting for digestion enzyme bias in Casanovo
Carlo Melendez, Justin Sanders, Melih Yilmaz, Wout Bittremieux, William E Fondrie, Sewoong Oh, William Stafford Noble
Journal of Proteome Research, 2024
- MAUVE scores for generative models: Theory and practice
Krishna Pillutla, Lang Liu, John Thickstun, Sean Welleck, Swabha Swayamdipta, Rowan Zellers, Sewoong Oh, Yejin Choi, Zaid Harchaoui
Journal of Machine Learning Research, 2023
- Towards a Defense Against Federated Backdoor Attacks Under Continuous Training
Shuaiqi Wang, Jonathan Hayase, Giulia Fanti, Sewoong Oh
Transactions on Machine Learning Research (TMLR), 2023
- Machine Learning-Aided Efficient Decoding of Reed-Muller Subcodes
Mohammad Vahid Jamali, Xiyang Liu, Ashok Vardhan Makkuva, Hessam Mahdavifar, Sewoong Oh, Pramod Viswanath
IEEE Transactions on Selected Areas in Information Theory (JSAIT), 2023
- Gradient flows on graphons: existence, convergence, continuity equations
Sewoong Oh, Soumik Pal, Raghav Somani, Raghavendra Tripathi
Journal of Theoretical Probability, 2023
presented at the NeurIPS 2021 workshop on Optimal Transport and Machine Learning
- Evaluating proteomics imputation methods with improved criteria
L Harris, WE Fondrie, S Oh, WS Noble
Journal of Proteome Research, 2023, 22 (11), 3427-3438
- Reducing peptide sequence bias in quantitative mass spectrometry data with machine learning
AB Dincer, Y Lu, DK Schweppe, S Oh, WS Noble
Journal of Proteome Research, 2022, 21 (7), 1771-1782
- Physical Layer Communication via Deep Learning
Hyeji Kim, Sewoong Oh, Pramod Viswanath
IEEE Transactions on Selected Areas in Information Theory (JSAIT), Vol.1, no.1, pp.5-18, 2020
- LEARN Codes: Inventing Low-latency Codes via Recurrent Neural Networks
Yihan Jiang, Hyeji Kim, Himanshu Asnani, Sreeram Kannan, Sewoong Oh, and Pramod Viswanath
IEEE Transactions on Selected Areas in Information Theory (JSAIT), Vol.1, no.1, pp.207-216, 2020
- PacGAN: The power of two samples in generative adversarial networks
Zinan Lin, Ashish Khetan, Giulia Fanti, Sewoong Oh
IEEE Transactions on Selected Areas in Information Theory (JSAIT), Vol.1, no.1, pp.324-335, 2020,
[ code ], [ project page]
- Deepcode: Feedback Codes via Deep Learning
Hyeji Kim, Yihan Jiang, Sreeram Kannan, Sewoong Oh, Pramod Viswanath
IEEE Transactions on Selected Areas in Information Theory (JSAIT), Vol.1, no.1, pp.194-206, 2020,
[ code by Hyeji Kim ],
[ code by Yihan Jiang ]
- Spectrum Estimation from a Few Entries
Ashish Khetan, Sewoong Oh
Journal of Machine Learning Research, Vol.20, Issue:21, January 2019
- Learning from Comparisons and Choices
Sahand Negahban, Sewoong Oh, Kiran Thekumparampil, and Jiaming Xu,
Journal of Machine Learning Research, Vol.19, Issue:40, pp.1-95, September 2018
- Generalized Rank-breaking: Computational and Statistical Tradeoffs
Ashish Khetan, Sewoong Oh
Journal of Machine Learning Research, Vol.19, Issue:28, pp.1-42, September 2018 [bibtex]
- Optimality of Belief Propagation for Crowdsourced Classification
Jungseul Ok, Sewoong Oh, Jinwoo Shin, Yung Yi
IEEE Transactions on Information Theory, Vol.64, Issue:9, pp.6127-6138, September 2018,
- Demystifying Fixed k-Nearest Neighbor Information Estimators
Weihao Gao, Sewoong Oh, Pramod Viswanath
IEEE Transactions on Information Theory, Vol.64, Issue:8, pp.5629-5661, February 2018, [bibtex]
- Breaking the Bandwidth Barrier: Geometrical Adaptive Entropy Estimation
Weihao Gao, Sewoong Oh, Pramod Viswanath
IEEE Transactions on Information Theory, Vol.64, Issue:5, pp.3313-3330, May 2018, [bibtex]
- Discovering Potential Correlations via Hypercontractivity
Hyeji Kim, Weihao Gao, Sreeram Kannan, Sewoong Oh, and Pramod Viswanath
Entropy, Vol.19, Issue:11, pp.586, October 2017, [ code ], [bibtex]
- Data-driven Rank Breaking for Efficient Rank Aggregation
Ashish Khetan, Sewoong Oh
Journal of Machine Learning Research, Vol.17, no.193, pp.1-54, October 2016 [bibtex]
- Hiding the Rumor Source
Giulia Fanti, Peter Kairouz, Sewoong Oh, Kannan Ramchandran, and Pramod Viswanath
IEEE Transactions on Information Theory, Vol.63, Issue:10, pp.6679-6713, October 2017 [bibtex]
- Metadata-conscious Anonymous Messaging
Giulia Fanti, Peter Kairouz, Sewoong Oh, Kannan Ramchandran, and Pramod Viswanath
IEEE Transactions on Signal and Information Processing over Networks, Volume: 2, Issue: 4, pp.582 - 594, December 2016
- Detecting Sponsored Recommendations
Subhashini Krishnasamy, Rajat Sen, Sewoong Oh, and Sanjay Shakkottai
ACM Transactions on Modeling and Performance Evaluation of Computing Systems, Volume 2, Issue 1, pp.6:1–6:29, November 2016
- The Composition Theorem for Differential Privacy
Peter Kairouz, Sewoong Oh and Pramod Viswanath
IEEE Transactions on Information Theory, Volume 63, Issue 6, pp.4037-4049, June 2017 [bibtex]
- Extremal Mechanisms for Local Differential Privacy
Peter Kairouz, Sewoong Oh, and Pramod Viswanath
Journal of Machine Learning Research, Volume 17, no.17, pp.1-51, April 2016 [bibtex]
- RankCentrality: Ranking from Pair-wise Comparisons
Sahand Negahban, Sewoong Oh, and Devavrat Shah
Operations Research, Vol.65, no.1, pp.266-287, October 2016 [bibtex]
- The Staircase Mechanisms in Differential Privacy
Q. Geng, P. Kairouz, S. Oh, and P. Viswanath
Selected Topics in Signal Processing, April 2015
- Budget-optimal Task Allocation for Reliable Crowdsourcing Systems
David R. Karger, Sewoong Oh and Devavrat Shah
Operations Research, Volume 62 Issue 1, pp.1-24, January-February 2014 [bibtex]
- Robust Localization from Incomplete Local Information
Amin Karbasi and Sewoong Oh
IEEE Transactions on Networking, Vol 21, pp.1131-1144, August 2013, [bibtex]
- Calibration using Matrix Completion with Application to Ultrasound Tomography
Reza Parhizkar, Amin Karbasi, Sewoong Oh and Martin Vetterli
IEEE Transactions on Signal Processing, Vol 61, pp.4923-4933, October 2013, [bibtex]
- Counting with the Crowd
Adam Marcus, David Karger, Samuel Madden, Robert Miller, Sewoong Oh
Journal of the VLDB Endowment, Vol. 6, issue 2, pp.109-120, December 2012, [bibtex]
- Matrix Completion from Noisy Entries
Raghunandan Keshavan, Andrea Montanari and Sewoong Oh
Journal of Machine Learning Research, vol. 11, pp.2057-2078, July 2010, [ bibtex, code ]
- Matrix Completion from a Few Entries
Raghunandan Keshavan, Andrea Montanari and Sewoong Oh
IEEE Transactions on Information Theory, vol. 56, no. 6, pp.2980-2998, June 2010, [ bibtex, code ]
Dissertation
- Matrix Completion: Fundamental Limits and Efficient Algorithms
Ph.D. Dissertation, Stanford University, December 2010