JING-JING LI

BUILDING SAFE INTELLIGENT SYSTEMS

PhD Student at UC Berkeley

AI Safety Researcher • Cognitive Scientist


RESEARCH

My research is dedicated to improving the safety of generative AI systems. My goal is to ensure that as AI models become more powerful, they remain robust, interpretable, and aligned with human values. I have applied this focus directly through research internships at AWS Agentic AI, where I investigated the adversarial robustness of AI agents, and at the Allen Institute for AI, where I worked on interpretable and transparent safety moderation.

My approach to AI safety is grounded in cognitive science. As a final-year PhD student at UC Berkeley advised by Professor Anne Collins, I investigate the computational principles behind how humans learn, reason, and make decisions. This background provides a unique lens for analyzing complex, human-like behaviors in AI and for constructing high-quality alignment data that reflects nuanced human judgment.

NEWS

Jul 13, 2025
Presenting my work on AI safety at ICML in Vancouver!
May 16, 2025
Started my internship at AWS Agentic AI in Seattle!
Apr 04, 2025
New work on exploration under uncertainty accepted as a spotlight at RLDM and a talk at CogSci!
Oct 04, 2024
Presenting my newly accepted Cognition paper at SfN24 with the TPDA award!
Aug 06, 2024
Two projects featured at the CCN conference in Boston.

SELECTED PUBLICATIONS

  1. SafetyAnalyst: Interpretable, transparent, and steerable safety moderation for AI behavior
     Jing-Jing Li, Valentina Pyatkin, Max Kleiman-Weiner, Liwei Jiang, Nouha Dziri, Anne G. E. Collins, Jana Schaich Borg, Maarten Sap, Yejin Choi, and Sydney Levine
     ICML, 2025
  2. An algorithmic account for how humans efficiently learn, transfer, and compose hierarchically structured decision policies
     Jing-Jing Li and Anne G. E. Collins
     Cognition, 2025
  3. Dynamic noise estimation: A generalized method for modeling noise fluctuations in decision-making
     Jing-Jing Li, Chengchun Shi, Lexin Li, and Anne G. E. Collins
     Journal of Mathematical Psychology, 2024