Over the years, ERA Fellows have made significant contributions to the AI research landscape. We highlight some of our fellows’ research below.
Research at ERA
-
Towards a UN Role in Governing Foundation Artificial Intelligence Models
Claire Dennis, ERA Fellow 2023
-
Welfare Diplomacy: Benchmarking Language Model Cooperation
Gabriel Mukobi, ERA Fellow 2023
-
Towards Reliable Evaluation of Behavior Steering Interventions in LLMs
Itamar Pres, ERA Fellow 2024
-
The Case for Model Access Governance
Edward Kembery, ERA Fellow 2024
-
AI Safety Frameworks Should Include Procedures for Model Access Decisions
Edward Kembery & Tom Reed, ERA Fellows 2024
-
Verification methods for international AI agreements
Tom Reed & Jack William Miller, ERA Fellows 2024
-
Towards Safe Multilingual Frontier AI
Arturs Kanepajs & Vladimir Ivanov, ERA Fellows 2024
-
What Features in Prompts Jailbreak LLMs? Investigating the Mechanisms Behind Attacks
Nathalie Maria Kirch & Severin Field, ERA Fellows 2024
-
Moral Persuasion in Large Language Models: Evaluating Susceptibility and Ethical Alignment
Allison Huang, ERA Fellow 2024