Response to OSTP on AI Action Plan
We believe it will be important to plan around the following three major factors:
1. AI systems are rapidly improving at carrying out tasks independently over long time horizons.
2. AI systems are improving at AI R&D tasks, creating a feedback loop that could be highly destabilizing.
3. AI systems will plausibly end up with unintended or malign goals, conceal this fact, and resist attempts to remove these goals.
In light of these factors, we recommend that the AI Action Plan include at least the following four priority actions:
1. Collaborate with the private sector to improve the information security and internal controls achievable by the leading U.S. AI developers.
2. Lead the creation of narrowly tailored formal standards for how and when AI developers should measure:
   - Capabilities that would directly threaten public safety or national security, specifically chemical, biological, radiological, and nuclear (CBRN) weapons development and cyber-offense.
   - Capabilities that would amplify risks from AI systems, specifically long-horizon autonomy, automated AI R&D, and control subversion.
3. Prepare, and consider implementing as future circumstances may warrant, interventions in the AI development and deployment process to:
   - Require reporting of training runs and deployments above some threshold of scale.
   - Require external oversight of plans for public safety and national security above some threshold of scale.
   - Require heightened information security and other mitigations (including internal controls) above some threshold of capability.
4. Measure the degree of R&D automation and reliance on AI within the leading U.S. AI developers and other critical industries.