Alan Sun

I’m a second-year MSCS student at Carnegie Mellon University. I’m grateful to be supported by an NSF Graduate Research Fellowship.

Before CMU, I was a visiting scholar at the Max Planck Institute for Software Systems, advised by Mariya Toneva. I earned my undergraduate degree in Computer Science and Mathematics at Dartmouth College with high honors. At Dartmouth, I did research with Soroush Vosoughi, where I created a formal framework to characterize the robustness of language models.

Research

I'm broadly interested in improving the reliability of language models. My work approaches this along three axes: (a) evaluation: how can we develop practical, principled measures of performance? (b) attribution: how do models encode and use structural patterns downstream? (c) intervention: how do we distill actionable insight from attributions for model improvement and control?

Most recently, I've become interested in (1) understanding how and when language models acquire foundational, atomic skills during pre-training, and how those acquisition schedules affect their ability to acquire capabilities downstream; and (2) diffusion language models as a way to achieve human-like language generation and understanding.

Publications

(Ordered chronologically)