Understanding AI Risks
Recent rapid progress in artificial intelligence (AI) has sparked concerns among technical experts, policymakers, and leaders regarding its potential dangers. Like any powerful technology, transformative AI requires careful management to mitigate risks.
At ERA, fellows have the opportunity to spend 8 weeks conducting research to address catastrophic risks posed by advanced AI. Below, we explore the AI risks that we care about tackling, and why.
AI Progress is Accelerating
The capabilities of today’s AI systems are rapidly advancing – matching or surpassing human experts on well-defined tasks in coding (IOI Gold), mathematics (IMO Silver), and a broad range of scientific disciplines (GPQA Diamond). A recent report outlined the growing ability of AI models to complete tasks over long time horizons, noting that:
“If the trend of the past 6 years continues to the end of this decade, frontier AI systems will be capable of autonomously carrying out month-long projects”.
~METR (Model Evaluation & Threat Research)
If progress continues at its current pace, AI could become the most transformative technology in history, reshaping the world in ways we can scarcely imagine—potentially within just years.
Indeed, many companies and governments see this possibility and are pouring billions of dollars into AI development. Since the early 2010s, the amount of computing power used to train frontier AI systems has grown steadily at a rate of over 4x per year.
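To make these two trends concrete, here is a rough back-of-the-envelope sketch in Python. Only the 4x-per-year compute growth figure comes from the text above; the ~7-month doubling time for task horizons, the one-hour starting horizon, and the 160-hour working month are illustrative assumptions rather than ERA or METR figures.

```python
# A rough, illustrative sketch of the two growth trends described above.
# Only the >4x-per-year compute growth figure comes from the text; the
# ~7-month doubling time, 1-hour starting horizon, and 160-hour working
# month are assumptions for the sake of illustration.

compute_growth_per_year = 4
years = 10
print(f"Compute after {years} years at 4x/year: {compute_growth_per_year ** years:,}x")
# -> roughly a million-fold increase over a decade

doubling_time_months = 7          # assumed doubling time for task horizons
current_horizon_hours = 1         # assumed: today's models handle ~1-hour tasks
months_remaining = 5 * 12         # roughly the rest of the decade
doublings = months_remaining / doubling_time_months
projected_horizon_hours = current_horizon_hours * 2 ** doublings
print(f"Projected task horizon: ~{projected_horizon_hours:.0f} hours "
      f"(~{projected_horizon_hours / 160:.1f} working months)")
```

Under these assumptions, training compute grows roughly a million-fold over a decade and task horizons reach the multi-month range by 2030, in line with the METR projection quoted above.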
AI Progress Comes with Increased Risk
Industry leaders envision AI bringing unprecedented societal benefits and economic prosperity. At the same time, many of these same leaders, along with experts across industry and academia, are outspoken about the range of AI risks, including extreme catastrophic risks from sufficiently advanced AI. Preemptive action is needed to mitigate these harms, and more work can be done by both researchers and policymakers to prepare for this transformative technology.
A recent report by the RAND Corporation outlines “Five Hard National Security Problems” related to the emergence of highly advanced AI systems; these five areas (listed in the adjacent image) closely align with some of ERA's core research priorities.
Image from RAND. Though there is an overarching “endemic” uncertainty about the development of advanced AI, the five categories above provide a useful framing for understanding many AI risks.
ERA is Committed to Addressing These Risks
As part of the Cambridge ERA:AI Fellowship, fellows spend 8 weeks working on a research project related to addressing AI risk. Below, we have outlined some of the existing problems and potential research projects within our three areas of (1) Technical AI Safety, (2) AI Governance, and (3) Technical AI Governance. This list is far from exhaustive — instead, we hope it serves as guidance.
01
Technical AI Safety
Experts do not yet have fail-safe methods for assuring the safety of their AI systems. Current systems lack robustness, appropriate objectives are difficult to specify clearly, and models can learn the wrong goal even from a correct specification. Frontier models are even capable of scheming against their users, or of faking their values when doing so is instrumentally useful.
ERA is excited to support work that addresses these issues: researchers can develop robust control protocols for advanced AI systems, better understand how and when models become misaligned, reduce misbehavior and reward gaming in models’ reasoning processes, and interpret model internals, among other possible research avenues.
02
AI Governance
Policymakers struggle to keep pace with AI advancements, even while leading AI companies themselves advocate for regulation and for increased company involvement with national security agencies. Current regulatory frameworks are sparse and inadequate for addressing the unique challenges posed by these increasingly capable AI systems, and international coordination remains difficult and fragmented.
ERA is interested in research addressing critical governance challenges, including (but not limited to):
Developing or expanding on frameworks to prevent harmful competitive dynamics through effective international coordination mechanisms
Creating or understanding liability and insurance systems that allocate responsibility for AI harms and incentivize safety
Advancing institutional designs that enable regulators and governments to adapt quickly to emerging AI capabilities
Creating incentives for companies to enhance cybersecurity, in order to prevent model weight exfiltration and/or theft of algorithmic secrets
03
Technical AI Governance
The technical infrastructure needed to safely govern advanced AI systems is currently insufficient. As concrete examples, we lack both the technology to robustly monitor and enforce how AI hardware is used and standardized evaluation methods for dangerous capabilities. A wide range of key levers for shaping the trajectory of AI progress are inherently technical, and thus require deep engagement with the architectures, algorithms, and interfaces through which AI systems are designed, deployed, and controlled.
ERA is excited to support work across a wide variety of areas within technical AI governance. Some examples include studying compute governance, improving structured model access, devising new approaches to enhance safety cases for frontier AI models, establishing robust international standards and evaluation methods, and further exploring AI agent protocols, among other areas.