publications
2025
- An algorithmic account for how humans efficiently learn, transfer, and compose hierarchically structured decision policies. Jing-Jing Li and Anne Collins. Cognition, 2025
Learning structures that effectively abstract decision policies is key to the flexibility of human intelligence. Previous work has shown that humans use hierarchically structured policies to efficiently navigate complex and dynamic environments. However, the computational processes that support the learning and construction of such policies remain insufficiently understood. To address this question, we tested 1,026 human participants on a decision-making task where they could learn, transfer, and recompose multiple sets of hierarchical policies. We propose a novel algorithmic account for the learning processes underlying observed human behavior. We show that humans rely on compressed policies over states in early learning, which gradually unfold into hierarchical representations via meta-learning and Bayesian inference. Our modeling evidence suggests that these hierarchical policies are structured in a temporally backward, rather than forward, fashion. Taken together, these algorithmic architectures characterize how the interplay between reinforcement learning, policy compression, meta-learning, and working memory supports structured decision-making and compositionality in a resource-rational way.
- Genetic changes linked to two different syndromic forms of autism enhance reinforcement learning in adolescent male but not female mice. Juliana Chase, Jing-Jing Li, Wan Chen Lin, Lung-Hao Tai, Anne GE Collins, and Linda Wilbrecht. bioRxiv, 2025
Autism Spectrum Disorder (ASD) is characterized by restricted and repetitive behaviors and social differences, both of which may manifest, in part, from underlying differences in corticostriatal circuits and reinforcement learning. Here, we investigated reinforcement learning in mice with mutations in either Tsc2 or Shank3, both high-confidence ASD risk genes associated with major syndromic forms of ASD. Using an odor-based two-alternative forced choice (2AFC) task, we tested adolescent mice of both sexes and found male Tsc2 and Shank3B heterozygote (Het) mice showed enhanced learning performance compared to their wild type (WT) siblings. No gain of function was observed in females. Using a novel reinforcement learning (RL) based computational model to infer learning rate as well as policy-level task engagement and disengagement, we found that the gain of function in males was driven by an enhanced positive learning rate in both Tsc2 and Shank3B Het mice. The gain of function in Het males was absent when mice were trained with a probabilistic reward schedule. These findings in two ASD mouse models reveal a convergent learning phenotype that shows similar sensitivity to sex and environmental uncertainty. These data can inform our understanding of both strengths and challenges associated with autism, while providing further evidence that sex and experience of uncertainty modulate autism-related phenotypes.
2024
- SafetyAnalyst: Interpretable, transparent, and steerable safety moderation for AI behavior. Jing-Jing Li, Valentina Pyatkin, Max Kleiman-Weiner, Liwei Jiang, Nouha Dziri, Anne G. E. Collins, Jana Schaich Borg, Maarten Sap, Yejin Choi, and Sydney Levine. 2024
The ideal AI safety moderation system would be both structurally interpretable (so its decisions can be reliably explained) and steerable (to align to safety standards and reflect a community’s values), which current systems fall short on. To address this gap, we present SAFETYANALYST, a novel AI safety moderation framework. Given an AI behavior, SAFETYANALYST uses chain-of-thought reasoning to analyze its potential consequences by creating a structured “harm-benefit tree,” which enumerates harmful and beneficial actions and effects the AI behavior may lead to, along with likelihood, severity, and immediacy labels that describe potential impact on any stakeholders. SAFETYANALYST then aggregates all harmful and beneficial effects into a harmfulness score using fully interpretable weight parameters, which can be aligned to particular safety preferences. We applied this conceptual framework to develop, test, and release an open-source LLM prompt safety classification system, distilled from 18.5 million harm-benefit features generated by frontier LLMs on 19k prompts. On a comprehensive set of prompt safety benchmarks, we show that SAFETYANALYST (average F1=0.81) outperforms existing LLM safety moderation systems (average F1<0.72) on prompt safety classification, while offering the additional advantages of interpretability, transparency, and steerability.
- Dynamic noise estimation: A generalized method for modeling noise fluctuations in decision-making. Jing-Jing Li, Chengchun Shi, Lexin Li, and Anne GE Collins. Journal of Mathematical Psychology, 2024
Computational cognitive modeling is an important tool for understanding the processes supporting human and animal decision-making. Choice data in decision-making tasks are inherently noisy, and separating noise from signal can improve the quality of computational modeling. Common approaches to model decision noise often assume constant levels of noise or exploration throughout learning (e.g., the softmax policy). However, this assumption is not guaranteed to hold – for example, a subject might disengage and lapse into an inattentive phase for a series of trials in the middle of otherwise low-noise performance. Here, we introduce a new, computationally inexpensive method to dynamically estimate the levels of noise fluctuations in choice behavior, under a model assumption that the agent can transition between two discrete latent states (e.g., fully engaged and random). Using simulations, we show that modeling noise levels dynamically instead of statically can substantially improve model fit and parameter estimation, especially in the presence of long periods of noisy behavior, such as prolonged lapses of attention. We further demonstrate the empirical benefits of dynamic noise estimation at the individual and group levels by validating it on four published datasets featuring diverse populations, tasks, and models. Based on the theoretical and empirical evaluation of the method reported in the current work, we expect that dynamic noise estimation will improve modeling in many decision-making paradigms over the static noise estimation method currently used in the modeling literature, while keeping additional model complexity and assumptions minimal.
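The two-latent-state assumption described in this abstract can be illustrated with a short forward-filtering sketch. This is not the paper's implementation: the function name, the parameter names (`p_stay_engaged`, `p_stay_random`, `p_engaged_init`), and the uniform-choice likelihood in the random state are assumptions made for illustration only.

```python
import math


def dynamic_noise_loglik(policy_probs, n_actions, p_stay_engaged,
                         p_stay_random, p_engaged_init=1.0):
    """Illustrative two-state filter (hypothetical names, not the paper's code).

    policy_probs: per-trial probability that the engaged policy assigns
        to the action the subject actually chose.
    n_actions: number of available actions; the random state is assumed
        to choose uniformly among them.
    p_stay_engaged / p_stay_random: assumed probabilities of remaining in
        the engaged / random latent state from one trial to the next.
    Returns the total log-likelihood and the per-trial posterior
    probability of being engaged, given choices up to that trial.
    """
    p_engaged = p_engaged_init
    loglik = 0.0
    posteriors = []
    for p_choice in policy_probs:
        # Joint probability of the observed choice with each latent state.
        lik_engaged = p_engaged * p_choice
        lik_random = (1.0 - p_engaged) * (1.0 / n_actions)
        total = lik_engaged + lik_random
        loglik += math.log(total)
        # Posterior over the latent state after seeing this choice.
        post_engaged = lik_engaged / total
        posteriors.append(post_engaged)
        # Propagate the belief through the transition probabilities.
        p_engaged = (post_engaged * p_stay_engaged
                     + (1.0 - post_engaged) * (1.0 - p_stay_random))
    return loglik, posteriors


# Example: two well-predicted choices followed by three poorly predicted ones.
ll, post = dynamic_noise_loglik([0.9, 0.9, 0.05, 0.05, 0.05], n_actions=2,
                                p_stay_engaged=0.9, p_stay_random=0.9,
                                p_engaged_init=0.9)
```

In this sketch, setting `p_stay_engaged=1.0` with `p_engaged_init=1.0` reduces the likelihood to that of the engaged policy alone, so the noise-free case is a special case of the dynamic one.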
- Latent Variable Sequence Identification for Cognitive Models with Neural Bayes Estimation. Ti-Fen Pan, Jing-Jing Li, Bill Thompson, and Anne Collins. arXiv preprint arXiv:2406.14742, 2024
Extracting time-varying latent variables from computational cognitive models is a key step in model-based neural analysis, which aims to understand the neural correlates of cognitive processes. However, existing methods only allow researchers to infer latent variables that explain subjects’ behavior in a relatively small class of cognitive models. For example, a broad class of relevant cognitive models with analytically intractable likelihood is currently out of reach from standard techniques, based on Maximum a Posteriori parameter estimation. Here, we present an approach that extends neural Bayes estimation to learn a direct mapping between experimental data and the targeted latent variable space using recurrent neural networks and simulated datasets. We show that our approach achieves competitive performance in inferring latent variable sequences in both tractable and intractable models. Furthermore, the approach is generalizable across different computational models and is adaptable for both continuous and discrete latent spaces. We then demonstrate its applicability in real world datasets. Our work underscores that combining recurrent neural networks and simulation-based inference to identify latent variable sequences can enable researchers to access a wider class of cognitive models for model-based neural analyses, and thus test a broader set of theories.
- Neural mechanisms of awareness of action. David S Jin, Oumayma Agdali, Taruna Yadav, Sharif I Kronemer, Sydney Kunkler, Shweta Majumder, Maya Khurana, Marie C McCusker, Ivory Fu, Emily J Siff, Aya Khalaf, Kate L Christison-Lagay, Shanae L Aerts, Qilong Xin, Jing-Jing Li, Sarah H McGill, Michael J Crowley, and Hal Blumenfeld. bioRxiv, 2024
The origins of awareness of action (AoA), the ability to report an action just performed, remain elusive. Differing theories ascribe AoA to pre-action, efferent motor/volitional mechanisms versus post-action, afferent sensory/perceptual neural mechanisms. To study these two types of mechanisms and others, we developed a paradigm where very similar aware and unaware actions occur repeatedly. Aware actions demonstrated larger neurophysiological signals both preceding and following movement. The differences included well-known volitional and perceptual event related potentials (PMP, N140, P300), as well as frontal midline theta, event-related alpha/beta desynchronization, and post-move blink rates. On longer time scales, we identified a novel event related potential preceding unaware moves, and found behavioral and pupillometric evidence for decreased attention and arousal over minutes concurrent with AoA loss. Our findings suggest that both dynamic, individual action-associated volitional and perceptual neural activity, as well as long-term attention and arousal states play a role in maintaining AoA.
2023
- A generalized method for dynamic noise inference in modeling sequential decision-making. Jing-Jing Li, Chengchun Shi, Lexin Li, and Anne GE Collins. In Proceedings of the Annual Meeting of the Cognitive Science Society, 2023
Computational cognitive modeling is an important tool for understanding the processes that support human and animal decision-making. Choice data in sequential decision-making tasks are inherently noisy, and separating noise from signal can improve the quality of computational modeling. Currently, most models assume that noise is constant, or static, typically by including a parameter (e.g., uniform ε) to estimate the noise level. However, this assumption is not guaranteed to hold – for example, an agent can lapse into an inattentive phase for a series of trials in the middle of otherwise low-noise performance. Assuming that noise is static could bias parameter and model identification. Here, we propose a new method to dynamically infer noise in choice behavior, under a model assumption that agents can transition between two discrete latent states (for example, attentive and noisy). Using four empirical datasets with diverse behavioral and modeling features, we demonstrate that our method improves model fit and that it can be easily incorporated into existing fitting procedures, including maximum likelihood estimation and hierarchical Bayesian modeling.
- Decreased but diverse activity of cortical and thalamic neurons in consciousness-impairing rodent absence seizures. Cian McCafferty, Benjamin F Gruenbaum, Renee Tung, Jing-Jing Li, Xinyuan Zheng, Peter Salvino, Peter Vincent, Zachary Kratochvil, Jun Hwan Ryu, Aya Khalaf, Kohl Swift, Rashid Akbari, Wasif Islam, Prince Antwi, Emily A Johnson, Petr Vitkovskiy, James Sampognaro, Isaac G Freedman, Adam Kundishora, Antoine Depaulis, François David, Vincenzo Crunelli, Basavaraju G Sanganahalli, Peter Herman, Fahmeed Hyder, and Hal Blumenfeld. Nature Communications, 2023
Absence seizures are brief episodes of impaired consciousness, behavioral arrest, and unresponsiveness, with yet-unknown neuronal mechanisms. Here we report that an awake female rat model recapitulates the behavioral, electroencephalographic, and cortical functional magnetic resonance imaging characteristics of human absence seizures. Neuronally, seizures feature overall decreased but rhythmic firing of neurons in cortex and thalamus. Individual cortical and thalamic neurons express one of four distinct patterns of seizure-associated activity, one of which causes a transient initial peak in overall firing at seizure onset, and another which drives sustained decreases in overall firing. 40–60 s before seizure onset there begins a decline in low frequency electroencephalographic activity, neuronal firing, and behavior, but an increase in higher frequency electroencephalography and rhythmicity of neuronal firing. Our findings demonstrate that prolonged brain state changes precede consciousness-impairing seizures, and that during seizures distinct functional groups of cortical and thalamic neurons produce an overall transient firing increase followed by a sustained firing decrease, and increased rhythmicity.
2022
- Credit assignment in hierarchical option transfer. Jing-Jing Li, Liyu Xia, Flora Dong, and Anne GE Collins. In CogSci... Annual Conference of the Cognitive Science Society. Cognitive Science Society (US). Conference, 2022
Humans have the exceptional ability to efficiently structure past knowledge during learning to enable fast generalization. Xia and Collins (2021) evaluated this ability in a hierarchically structured, sequential decision-making task, where participants could build “options” (strategy “chunks”) at multiple levels of temporal and state abstraction. A quantitative model, the Option Model, captured the transfer effects observed in human participants, suggesting that humans create and compose hierarchical options and use them to explore novel contexts. However, it is not well understood how learning in a new context is attributed to new and old options (i.e., the credit assignment problem). In a new context with new contingencies, where participants can recompose some aspects of previously learned options, do they reliably create new options or overwrite existing ones? Does the credit assignment depend on how similar the new option is to an old one? In our experiment, two groups of participants (n=124 and n=104) learned hierarchically structured options, experienced different amounts of negative transfer in a new option context, and were subsequently tested on the previously learned options. Behavioral analysis showed that old options were successfully reused without interference, and new options were appropriately created and credited. This credit assignment did not depend on how similar the new option was to the old option, showing great flexibility and precision in human hierarchical learning. These behavioral results were captured by the Option Model, providing further evidence for option learning and transfer in humans.