I am a postdoctoral scholar at the Paul G. Allen Center for Computer Science and Engineering at the University of Washington, advised by Yejin Choi and Yulia Tsvetkov. I received my PhD from UC San Diego where I was advised by Taylor Berg-Kirkpatrick.
My research interests are privacy, natural language processing and societal implications of ML. I explore the interplay between data, its influence on models, and the expectations of the people who regulate and use these models. My work has been recognized by the NCWIT Collegiate Award and the Rising Star in Adversarial Machine Learning Award.
During my PhD I was a part-time researcher/intern at Microsoft Research (Privacy in AI, Algorithms, and Semantic Machines teams) working on differential privacy, model compression and data synthesis.
I am currently on the academic job market for faculty positions! Please contact me if you would like to discuss potential openings or collaborations.
Explanation about my name: I used to publish under Fatemeh, which is my legal name in paperwork. But I now go by Niloofar, which is the Lily flower in Farsi!
News Highlights
I will be attending the NeurIPS conference! You can find me at:
- The Red Teaming GenAI workshop, where I'm giving an invited talk, "A False Sense of Privacy: Semantic Leakage and Non-literal Copying in LLMs"
- The WiML workshop where I'm serving as a mentor
- The PrivacyML: Meaningful Privacy-Preserving Machine Learning tutorial where I'm a panelist
- Our poster From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
I will be giving an in-person talk at the Stanford NLP Seminar on January 16th! Reach out if you want to meet there!
I will be visiting Johns Hopkins University to give a talk on the 9th! Reach out if you wanna meet up!
I appeared on a panel at the Future of Privacy Forum - Technologist Roundtable for Policymakers: Key Issues in Privacy and AI (write-up coming soon!)
I appeared on the Thesis Review podcast with Sean Welleck where I talked about my work on Auditing and Mitigating Safety Risks in Large Language Models.
I wrote a blog post on "Should I do a postdoc?" based on my experience - check out the blog post and video with Sasha Rush!
I gave an invited keynote talk at the SRI International C3E workshop hosted by SRI and NSA. View talk slides.
I was interviewed by UW News about OpenAI's O1 update and advances in math and reasoning. Read the interview.
I was interviewed by the Washington Post on Google's AI image generator controversy and disclosure of personal information in conversations with ChatGPT.
Selected Publications
For the full list, please refer to my Google Scholar page.
-
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
NeurIPS 2024
L. Jiang, K. Rao, S. Han, A. Ettinger, F. Brahman, S. Kumar, N. Mireshghallah, X. Lu, M. Sap, Y. Choi, N. Dziri
-
EMNLP 2024
T. Chen, N. Mireshghallah*, A. Asai*, S. Min, J. Grimmelmann, Y. Choi, H. Hajishirzi, L. Zettlemoyer, P. W. Koh
-
Trust No Bot: Discovering Personal Disclosures in Human-LLM Conversations in the Wild
COLM 2024
N. Mireshghallah*, M. Antoniak*, Y. More*, Y. Choi, G. Farnadi
-
Do membership inference attacks work on large language models?
COLM 2024
M. Duan, A. Suri, N. Mireshghallah, S. Min, W. Shi, L. Zettlemoyer, Y. Tsvetkov, Y. Choi, D. Evans, H. Hajishirzi
-
Machine Unlearning Doesn't Do What You Think
Extended Abstract at GenLaw 2024
K. Lee, A. F. Cooper, C. A. Choquette-Choo, K. Liu, M. Jagielski, N. Mireshghallah, L. Ahmed, J. Grimmelmann, D. Bau, C. De Sa, et al.
-
A Roadmap to Pluralistic Alignment
ICML 2024
T. Sorensen, J. Moore, J. Fisher, M. Gordon, N. Mireshghallah, C. M. Rytting, A. Ye, L. Jiang, X. Lu, N. Dziri, T. Althoff, Y. Choi
-
ICLR 2024
N. Mireshghallah*, H. Kim*, X. Zhou, Y. Tsvetkov, M. Sap, R. Shokri, Y. Choi
-
Privacy-preserving in-context learning with differentially private few-shot generation
ICLR 2024
X. Tang, R. Shin, H. A. Inan, A. Manoel, N. Mireshghallah, Z. Lin, S. Gopi, J. Kulkarni, R. Sim
-
Smaller Language Models are Better Black-box Machine-Generated Text Detectors
EACL 2024
N. Mireshghallah, J. Mattern, S. Gao, R. Shokri, T. Berg-Kirkpatrick
-
Privacy-Preserving Domain Adaptation of Semantic Parsers
ACL 2023
N. Mireshghallah, R. Shin, Y. Su, T. Hashimoto, J. Eisner
-
Non-Parametric Temporal Adaptation for Social Media Topic Classification
EMNLP 2023
N. Mireshghallah*, N. Vogler*, J. He, O. Florez, A. El-Kishky, T. Berg-Kirkpatrick
-
A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation
CoNLL 2023
J. Forristal, N. Mireshghallah, G. Durrett, T. Berg-Kirkpatrick
-
Membership Inference Attacks against Language Models via Neighbourhood Comparison
ACL 2023
J. Mattern, N. Mireshghallah, Z. Jin, B. Schölkopf, M. Sachan, T. Berg-Kirkpatrick
-
Differentially Private Model Compression
NeurIPS 2022
N. Mireshghallah, A. Backurs, H. A. Inan, L. Wutschitz, J. Kulkarni
-
Memorization in NLP Fine-tuning Methods
EMNLP 2022
N. Mireshghallah, A. Uniyal, T. Wang, D. Evans, T. Berg-Kirkpatrick
-
Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks
EMNLP 2022
N. Mireshghallah, K. Goyal, A. Uniyal, T. Berg-Kirkpatrick, R. Shokri
-
NAACL 2022
N. Mireshghallah, V. Shrivastava, M. Shokouhi, T. Berg-Kirkpatrick, R. Sim, D. Dimitriadis
-
What Does it Mean for a Language Model to Preserve Privacy?
FAccT 2022
H. Brown, K. Lee, N. Mireshghallah, R. Shokri, F. Tramèr
-
Mix and Match: Learning-free Controllable Text Generation
ACL 2022
N. Mireshghallah, K. Goyal, T. Berg-Kirkpatrick
-
Style Pooling: Automatic Text Style Obfuscation for Improved Classification Fairness
EMNLP 2021
N. Mireshghallah, T. Berg-Kirkpatrick
-
Privacy Regularization: Joint Privacy-Utility Optimization in Language Models
NAACL 2021
N. Mireshghallah, H. Inan, M. Hasegawa, V. Rühle, T. Berg-Kirkpatrick, R. Sim
-
ICML 2020
A. Elthakeb, P. Pilligundla, N. Mireshghallah, A. Cloninger, H. Esmaeilzadeh
-
Not All Features Are Equal: Discovering Essential Features for Preserving Prediction Privacy
WWW 2021
N. Mireshghallah, M. Taram, A. Jalali, A. T. Elthakeb, D. Tullsen, H. Esmaeilzadeh
-
Shredder: Learning Noise Distributions to Protect Inference Privacy
ASPLOS 2020
N. Mireshghallah, M. Taram, A. Jalali, D. Tullsen, H. Esmaeilzadeh
Invited Talks
-
Stanford University
NLP Seminar, Jan. 2025
Privacy, Copyright and Data Integrity: The Cascading Implications of Generative AI
-
University of California, Los Angeles
Guest lecture for CS 269 - Computational Ethics, LLMs and the Future of NLP, Jan. 2025
A False Sense of Privacy: Semantic Leakage and Non-literal Copying in LLMs
-
NeurIPS Conference
Red Teaming GenAI workshop, Dec. 2024
A False Sense of Privacy: Semantic Leakage and Non-literal Copying in LLMs
-
NeurIPS Conference
Panelist, Dec. 2024
PrivacyML: Meaningful Privacy-Preserving Machine Learning tutorial
-
Johns Hopkins University
CS Department Seminar, Dec. 2024
Privacy, Copyright and Data Integrity: The Cascading Implications of Generative AI
-
Future of Privacy Forum
Panelist, Nov. 2024
Technologist Roundtable for Policymakers: Key Issues in Privacy and AI
-
University of Utah
Guest lecture for the School of Computing CS 6340/5340 NLP course, Nov. 2024
Can LLMs Keep a Secret?
-
UMass Amherst
NLP Seminar, Oct. 2024
Membership Inference Attacks and Contextual Integrity for Language
-
Northeastern University
Khoury College of Computer Sciences Security Seminar, Oct. 2024
Membership Inference Attacks and Contextual Integrity for Language
-
Stanford Research Institute (SRI) International
Computational Cybersecurity in Compromised Environments (C3E) workshop, Sep. 2024
Can LLMs keep a secret? Testing privacy implications of Language Models via Contextual Integrity
-
LinkedIn Research
Privacy Tech Talk, Sep. 2024
Can LLMs keep a secret? Testing privacy implications of Language Models via Contextual Integrity
-
National Academies (NASEM)
Forum on Cyber Resilience, Aug. 2024
Oversharing with LLMs is underrated: the curious case of personal disclosures in human-LLM conversations
-
ML Collective
DLCT reading group, Aug. 2024
Privacy in LLMs: Understanding what data is imprinted in LMs and how it might surface!
-
Carnegie Mellon University
Invited Talk, Jun. 2024
Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs
-
Generative AI and Law workshop, Washington DC
Invited Talk, Apr. 2024
What is differential privacy? And what is it not?
-
Meta AI Research
Invited Talk, Apr. 2024
Membership Inference Attacks and Contextual Integrity for Language
-
Georgia Institute of Technology
Guest lecture for the School of Interactive Computing, Apr. 2024
Safety in LLMs: Privacy and Memorization
-
University of Washington
Guest lecture for CSE 484 and 582 courses on Computer Security and Ethics in AI, Apr. 2024
Safety in LLMs: Privacy and Memorization
-
Carnegie Mellon University
Guest lecture for LTI 11-830 course on Computational Ethics in NLP, Mar. 2024
Safety in LLMs: Privacy and Memorization
-
Simons Collaboration
TOC4Fairness Seminar, Mar. 2024
Membership Inference Attacks and Contextual Integrity for Language
-
University of California, Santa Barbara
NLP Seminar Invited Talk, Mar. 2024
Can LLMs Keep a Secret? Testing Privacy Implications of LLMs
-
University of California, Los Angeles
NLP Seminar Invited Talk, Mar. 2024
Can LLMs Keep a Secret? Testing Privacy Implications of LLMs
-
University of Texas at Austin
Guest lecture for LIN 393 course on Social Applications and Impact of NLP, Feb. 2024
Can LLMs Keep a Secret? Testing Privacy Implications of LLMs
-
Google Brain
Google Tech Talk, Feb. 2024
Can LLMs Keep a Secret? Testing Privacy Implications of LLMs
-
University of Washington
Allen School Colloquium, Jan. 2024
Can LLMs Keep a Secret? Testing Privacy Implications of LLMs
-
University of Washington
eScience Institute Seminars, Nov. 2023
Privacy Auditing and Protection in Large Language Models
-
CISPA Helmholtz Center for Security
Invited Talk, Sep. 2023
What does privacy-preserving NLP entail?
-
Max Planck Institute for Software Systems
Next 10 in AI Series, Sep. 2023
Auditing and Mitigating Safety Risks in LLMs
-
Mila / McGill University
Invited Talk, May 2023
Privacy Auditing and Protection in Large Language Models
-
EACL 2023
Tutorial co-instruction, May 2023
Private NLP: Federated Learning and Privacy Regularization
-
LLM Interfaces Workshop and Hackathon
Invited Talk, Apr. 2023
Learning-free Controllable Text Generation
-
University of Washington
Invited Talk, Apr. 2023
Auditing and Mitigating Safety Risks in Large Language Models
-
NDSS Conference
Keynote talk for EthiCS workshop, Feb. 2023
How much can we trust large language models?
-
Google
Federated Learning Seminar, Feb. 2023
Privacy Auditing and Protection in Large Language Models
-
University of Texas at Austin
Invited Talk, Oct. 2022
How much can we trust large language models?
-
Johns Hopkins University
Guest lecture for CS 601.670 course on Artificial Agents, Sep. 2022
Mix and Match: Learning-free Controllable Text Generation
-
KDD Conference
Adversarial ML workshop, Aug. 2022
How much can we trust large language models?
-
Microsoft Research Cambridge
Invited Talk, Mar. 2022
What Does it Mean for a Language Model to Preserve Privacy?
-
University of Maine
Guest lecture for COS435/535 course on Information Privacy Engineering, Dec. 2021
Improving Attribute Privacy and Fairness for Natural Language Processing
-
National University of Singapore
Invited Talk, Nov. 2021
Style Pooling: Automatic Text Style Obfuscation for Fairness
-
Big Science for Large Language Models
Invited Panelist, Oct. 2021
Privacy-Preserving Natural Language Processing
-
Research Society MIT Manipal
Cognizance Event Invited Talk, Jul. 2021
Privacy and Interpretability of DNN Inference
-
Alan Turing Institute
Privacy and Security in ML Seminars, Jun. 2021
Low-overhead Techniques for Privacy and Fairness of DNNs
-
Split Learning Workshop
Invited Talk, Mar. 2021
Shredder: Learning Noise Distributions to Protect Inference Privacy
-
University of Massachusetts Amherst
Machine Learning and Friends Lunch, Oct. 2020
Privacy and Fairness in DNN Inference
-
OpenMined Privacy Conference
Invited Talk, Sep. 2020
Privacy-Preserving Natural Language Processing
-
Microsoft Research AI
Breakthroughs Workshop, Sep. 2020
Private Text Generation through Regularization
Awards and Honors
Momental Foundation Mistletoe Research Fellowship (MRF) Finalist, 2023
Rising Star in Adversarial Machine Learning (AdvML) Award Winner, 2022. AdvML Workshop
Rising Stars in EECS, 2022. Event Page
UCSD CSE Excellence in Leadership and Service Award Winner, 2022
FAccT Doctoral Consortium, 2022. FAccT 2022
Qualcomm Innovation Fellowship Finalist, 2021. Fellowship Page
NCWIT (National Center for Women & IT) Collegiate Award Winner, 2020. NCWIT Awards
National University Entrance Exam in Math, 2014. Ranked 249th of 223,000
National University Entrance Exam in Foreign Languages, 2014. Ranked 57th of 119,000
National Organization for Exceptional Talents (NODET), 2008. Admitted, ~2% Acceptance Rate
Featured Press & Media
Thesis Review podcast episode about my work on Auditing and Mitigating Safety Risks in Large Language Models
"Should I do a postdoc?" guest video on Sasha Rush's channel, along with the blog post
Recent Co-organized Workshops
[for full list check my CV]
Industry Research Experience
-
Microsoft Semantic Machines
Fall 2022-Fall 2023 (Part-time), Summer 2022 (Intern)
Mentors: Richard Shin, Yu Su, Tatsunori Hashimoto, Jason Eisner
-
Microsoft Research, Algorithms Group, Redmond Lab
Winter 2022 (Intern)
Mentors: Sergey Yekhanin, Arturs Backurs
-
Microsoft Research, Language, Learning and Privacy Group, Redmond Lab
Summer 2021 (Intern), Summer 2020 (Intern)
Mentors: Dimitrios Dimitriadis, Robert Sim
-
Western Digital Co. Research and Development
Summer 2019 (Intern)
Mentor: Anand Kulkarni
Diversity, Inclusion & Mentorship
Mentor for the mentorship program at WiML event in NeurIPS 2024
D&I chair at NAACL 2025
Widening NLP (WiNLP) co-chair
Socio-cultural D&I chair at NAACL 2022
Mentor for the Graduate Women in Computing (GradWIC) at UCSD
Mentor for the UC San Diego Women Organization for Research Mentoring (WORM) in STEM
Co-leader for the "Feminist Perspectives for Machine Learning & Computer Vision" break-out session at the Women in Machine Learning (WiML) 2020 Un-workshop held at ICML 2020
Mentor for the USENIX Security 2020 Undergraduate Mentorship Program
Volunteer at the Women in Machine Learning 2019 Workshop held at NeurIPS 2019
Invited Speaker at the Women in Machine Learning and Data Science (WiMLDS) NeurIPS 2019 Meetup
Mentor for the UCSD CSE Early Research Scholars Program (CSE-ERSP) in 2018
Professional Services
[Outdated, for an updated version check my CV]
Reviewer for ICLR 2022
Reviewer for NeurIPS 2021
Reviewer for ICML 2021
Shadow PC member for IEEE Security and Privacy Conference Winter 2021
Artifact Evaluation Program Committee Member for USENIX Security 2021
Reviewer for ICLR 2021 Conference
Program Committee member for the LatinX in AI Research Workshop at ICML 2020 (LXAI)
Reviewer for the 2020 Workshop on Human Interpretability in Machine Learning (WHI) at ICML 2020
Program Committee member for the MLArchSys workshop at ISCA 2020
Security & Privacy Committee Member and Session Chair for Grace Hopper Celebration (GHC) 2020
Reviewer for ICML 2020 Conference
Artifact Evaluation Program Committee Member for ASPLOS 2020
Reviewer for IEEE TC Journal
Reviewer for ACM TACO Journal
Books I Like!
Small Is Beautiful: Economics As If People Mattered by E. F. Schumacher
Quarter-life by Satya Doyle Byock
The Body Keeps the Score by Bessel van der Kolk
36 Views of Mount Fuji by Cathy Davidson
Indistractable by Nir Eyal
Sapiens: A Brief History of Humankind by Yuval Noah Harari
The Martian by Andy Weir
The Solitaire Mystery by Jostein Gaarder
The Orange Girl by Jostein Gaarder
Life is Short: A Letter to St Augustine by Jostein Gaarder
The Alchemist by Paulo Coelho