AI security commitments
The White House and "The Seven" AI incumbents sketch the outlines of future regulation.
More regulation is coming for AI.
And the players in this space know it. That’s why they are trying to get ahead of it and maybe even shape the playing field before freezing it in place with a formal regulatory regime.
The latest move in this respect comes from a White House announcement that seven of the top AI companies (“The Seven”) have made voluntary commitments regarding AI safety, security, and trust. These companies were:
Amazon
Anthropic
Google
Inflection
Meta
Microsoft
OpenAI
The fact sheet and explanatory document (which is itself oddly a “nonpaper,” i.e. bearing no markings or logos of any kind) accompanying the announcement mostly represent “motherhood and apple pie.” A reasonable person would have difficulty finding fault with much of the subject matter.
But that’s because it’s incredibly vague and does nothing to address any of the inevitable tradeoffs required when developing and deploying new technologies.
Don’t get me wrong, this all makes sense from the business perspective of The Seven and the political perspective of the Biden Administration. I don’t fault private companies looking out for their interests or politicians trying to garner good press.
As others have suggested, this also looks a bit like the leading players attempting to lock in “best practices” early and create a competitive moat against new market entrants. I warned of such coronation as regulation after CISA’s push earlier this year to mandate software development security standards and the White House’s subsequent release of the National Cybersecurity Strategy.
The statement itself says the “Biden-Harris Administration is currently developing an executive order and will pursue bipartisan legislation to help America lead the way in responsible innovation.” So something binding - probably along the lines of these voluntary commitments - is in the works.
With that said, the announcement does have a small amount of interesting content and it’s good to see some general alignment on certain underlying principles. In this post, I’ll go through them and identify the most interesting considerations, call out what needs to be expanded, and propose some alternative directions.
Safety
There are some good considerations here, but like most U.S. government documents regarding AI - or any technology for that matter - it provides zero guidance regarding appropriate tradeoffs to make.
It’s easy to say “no safety risk is acceptable when using AI,” but this is a nonsensical position which no one actually uses to guide decision-making. So the key question is how much safety risk you are willing to accept to achieve a given outcome.
Additionally, the safety and security sections imply - but don’t make explicit - what I consider to be an important distinction: that between mitigating AI-powered cyber threats and securely developing AI technologies. The safety sections focuses on the former while the security section focuses on the latter.
Commitment 1: Internal and external red-teaming of models or systems in areas including misuse, societal risks, and national security concerns, such as bio, cyber, and other safety areas.
Focusing on existential risk is the right move here. Preventing the development of novel neurotoxins or malware is a reasonable goal. Self-replication of models is also something to pay close attention to.
Comprehensive red-teaming can mitigate the risk. Motherhood and apple pie all around.
The final bullet point, however, is clearly a throwaway, where The Seven commit to preventing “bias,” without elaborating on the term. This is impossible by the technical definition of the term, as otherwise there would be no “model.”
This is an obligatory statement, of course, but is so vague that it adds no value or any sort of standard of accountability.
Finally, there is an interesting caveat here that this requirement applies to “major public releases” of future models. This suggests one or more of the The Seven are not prepared to retrofit or update existing systems to meet these standards (and reserve the right to skip these steps for other than “major” releases).
Commitment 2: Work toward information sharing among companies and governments regarding trust and safety risks, dangerous or emergent capabilities, and attempts to circumvent safeguards
This commitment is relatively toothless, as the term “work toward information sharing” could be covered by merely thinking about issuing the occasional press release. More interesting was the suggestion that The Seven:
establish or join a forum or mechanism through which they can develop, advance, and adopt shared standards and best practices for frontier AI safety, such as the NIST AI Risk Management Framework.
The NIST AI Risk Management Framework (RMF) is an early mover in terms of AI governance. Despite its flaws and vagueness, getting The Seven to help NIST revise and build out this standard seems like a good idea. Rather than framework sprawl, I would be very happy to see one standard to rule them all.
Security
This section starts off on a terrible note, purporting to establish “a duty to build systems that put security first.” If you have read Deploy Securely for a while, you know I consider statements like these to be obvious doublespeak and have negative information value because it creates more uncertainty than it resolves.
If security were priority #1, The Seven would just immediately delete all of their models and accompanying infrastructure. No technology means no cyber risk - security would thus be the first priority. Obviously they aren’t doing that, so we can pretty clear that security is not the first priority.
Appropriate tradeoffs are the name of the game, and I look forward to the day when a government document gives any sort of guidance in this respect. But I am not holding my breath.
Commitment 3: Invest in cybersecurity and insider threat safeguards to protect proprietary and unreleased model weights
More motherhood and apple pie here. Protecting intellectual property (IP) is a key cybersecurity task and measures like ensuring minimum viable privileges and controlling in which environments sensitive data can be handled makes sense.
I am, however, slightly amused that The Seven were able to slip in a “commitment” that perfectly aligns with their commercial interests.
What company - AI or other - would be opposed to protecting its IP?
Commitment 4: Incent third-party discovery and reporting of issues and vulnerabilities
More good cybersecurity practice here. Having a coordinated vulnerability disclosure (CVD) policy is a cornerstone of a good security program. OpenAI and other have gone further, paying bounties to ethical hackers who identify security holes in their models or software.
Because of the vast attack surface prompt injection offers, I view AI model vulnerability analysis becoming a sub-discipline of its own. This might be a good place to specialize in for aspiring bug bounty hunters just starting out.
Trust
Commitment 5: Develop and deploy mechanisms that enable users to understand if audio or visual content is AI-generated, including robust provenance, watermarking, or both, for AI-generated audio or visual content
This seems like a pretty tall order and some expert commentary on the topic suggests it is going to be very difficult to achieve. Especially interesting is that The Seven commit to “develop[ing] tools or APIs to determine if a particular piece of content was created with their system.” This seems to me to be a use case similar to that of a software bills of material (SBOM). Except this approach is even more advanced and complex because it will require a way to evaluate the provenance of millions or likely billions of different artifacts with no standardized format. As a result I think it’s even less likely than SBOMs to see comprehensive adoption.
Taking the exact opposite approach is also better from a societal perspective. People should default to treating any image, video, sound, or text as being inauthentic. The vast majority of publicly-available content in existence right now is not AI-generated (although that may change quickly), so knowing what is tells you little. Especially concerning would be if people start thinking “oh I didn’t see an AI-generated tag on this so it must be real/true.” They should be thinking the opposite!
From a technological and logistical perspective, even if this extremely complex system of authentication for AI-generated content develops, there is no realistic way it will become even close to comprehensive:
What about open source models?
What about malicious nation-states?
What about companies aside from The Seven?
Additionally, it’s hard to see any watermarking system being able to provide perfect certainty as to the provenance of a given piece of content. Unlike digital signatures, the best such a regime could hope to do would be to give a probabilistic rating as to the likelihood something was generated by a given AI tool. This is likely to set up many fierce - and unnecessary - future controversies over whether a given document or video was fake or real.
Thus, I think taking the opposite approach (but applying the same general concepts) is the best way to go. Digitally signing media - especially pictures and videos - at the hardware level can allow for affirmative verification of authentic content, as some have suggested. And then people can treat everything lacking such a signature with a heightened level of suspicion.
Commitment 6: Publicly report model or system capabilities, limitations, and domains of appropriate and inappropriate use, including discussion of societal risks, such as effects on fairness and bias
This is an interesting one, and creates an interesting game theory scenario. From a competitive perspective, documenting the limitations of one’s own model is going to hurt. So The Seven will have a strong incentive to document as little as possible while still being able to credibly claim they are complying. Conversely, they’ll have a strong incentive to surface limitations in the models of their competitors, whether directly or by guilting or pressuring them into doing so.
While I think this is a good idea in theory, any sort of regime for driving this will need to be extremely complex and well thought-out. I don’t see this getting too much attention from The Seven as the AI gold rush continues.
Commitment 7: Prioritize research on societal risks posed by AI systems, including on avoiding harmful bias and discrimination, and protecting privacy
I’m happy to see The Seven agree to “protect children” here, but this section is incredibly light on detail. Its only clarification from previous sections is that “harmful” bias (as opposed to all, which is impossible with AI) is to be avoided and doesn’t even define what “privacy” is.
Motherhood and apple pie is just that, but like with much of this document, the devil will be in the details.
Commitment 8: Develop and deploy frontier AI systems to help address society’s greatest challenges
Again, can’t argue here. All of these things are good applications for AI. I am surprised Meta did not object, though, considering its clearest use case would be for building out the Metaverse or existing social media platforms, which don’t clearly (to me, at least) represent “society’s greatest challenges.”
Moving from press release to action
Public commitments lay the groundwork, but there is always a lot of work to be done afterward.
That’s where StackAware comes in. We help organizations manage the:
Cybersecurity
Compliance
Privacy
risk related to using AI. So if you need help: