I am an incoming first-year PhD student in Computer Science at Oregon State University, studying trustworthy and socially responsible machine learning under the supervision of Professor Sanghyun Hong as a member of the Secure AI Systems Lab (SAIL). The goal of my research is to audit and improve the robustness of machine learning systems against adversarial threats, misuse, and other undesirable behaviors. Currently, I am particularly interested in improving the trustworthiness of large language models (for example, reducing toxicity and auditing jailbreak vulnerabilities). Please do not hesitate to contact me to discuss my work or potential collaborations!
June 2, 2025
Our paper on using influence functions to reduce language model toxicity is on arXiv.
May 26, 2025
I successfully defended my undergraduate honors thesis, On the Robustness of Neural Architecture Search to Data Poisoning Attacks. That about wraps up my undergraduate studies!
December 10, 2024
Our paper on jailbreaking large language models with adversarial bit-flips is on arXiv.
May 1, 2024
Our paper on poisoning attacks against neural architecture search is on arXiv.
September 21, 2023
Our paper (my first publication!) on the vulnerability of multi-exit language models to adversarial slowdown was accepted to NeurIPS 2023.