Over the years, ERA Fellows have made significant contributions to the AI Safety & Governance research landscape. We highlight a selection of our Fellows’ research below.

Research at ERA

  • Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards Into Open-Weight LLMs

    Kyle O’Brien, ERA Fellow 2025

  • Mapping IAEA Verification Tools to International AI Governance: A Mechanism-by-Mechanism Analysis

    Christina Krawec, ERA Fellow 2025

  • Towards Reliable Evaluation of Behavior Steering Interventions in LLMs

    Itamar Pres, ERA Fellow 2024

  • The Case for Model Access Governance

    Edward Kembery, ERA Fellow 2024

  • AI Safety Frameworks Should Include Procedures for Model Access Decisions

    Edward Kembery & Tom Reed, ERA Fellows 2024

  • Verification Methods for International AI Agreements

    Tom Reed & Jack William Miller, ERA Fellows 2024

  • Towards Safe Multilingual Frontier AI

    Arturs Kanepajs & Vladimir Ivanov, ERA Fellows 2024

  • What Features in Prompts Jailbreak LLMs? Investigating the Mechanisms Behind Attacks

    Nathalie Maria Kirch & Severin Field, ERA Fellows 2024

  • Moral Persuasion in Large Language Models: Evaluating Susceptibility and Ethical Alignment

    Allison Huang, ERA Fellow 2024

  • Towards a UN Role in Governing Foundation Artificial Intelligence Models

    Claire Dennis, ERA Fellow 2023

  • Welfare Diplomacy: Benchmarking Language Model Cooperation

    Gabriel Mukobi, ERA Fellow 2023