The Artificial Intelligence Risk Scoring System (AIRSS) - Part 2
Defining business and security requirements.
Cyber risk management with AI is already a confused mess.
Should we treat generative AI tools differently than other parts of the software supply chain that process sensitive data? (Answer: yes when it comes to unintended training and similar vulnerabilities; no otherwise.)
How do you evaluate CVEs in AI infrastructure or compare them to undesired outputs of the model itself? (Answer: use the Deploy Securely Risk Assessment Model [DSRAM] for the first and the Artificial Intelligence Risk Scoring System [AIRSS] for the second).
This lack of clarity is why I decided to develop the AIRSS. And in this post, I’ll continue building out how to use it to manage AI model risk.
If you missed it, please check out part 1 in this series for an important primer.
What, exactly, are we measuring risk for?
As an already vast array of proprietary and open source models proliferates, ensuring everyone is talking about the same one unfortunately becomes challenging. The existing Common Platform Enumeration (CPE) scheme clearly won’t do, but I think the package URL (pURL) approach has some merit.
The resulting model can be represented as the following pURL:
pkg:generic/gpt-3.5-turbo@0613?ft=stackaware::80Z1hDhg
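For readers who want to work with identifiers like this programmatically, here is a minimal sketch assuming the open source packageurl-python library; the ft qualifier layout simply mirrors the example above and is not an official convention.

```python
# A minimal sketch (assuming the open source packageurl-python library, installed
# via "pip install packageurl-python") of parsing the identifier above. The
# ft=<organization>::<job ID> qualifier layout simply mirrors the example in this post.
from packageurl import PackageURL

model_purl = PackageURL.from_string(
    "pkg:generic/gpt-3.5-turbo@0613?ft=stackaware::80Z1hDhg"
)

print(model_purl.type)        # generic
print(model_purl.name)        # gpt-3.5-turbo
print(model_purl.version)     # 0613
print(model_purl.qualifiers)  # e.g. {'ft': 'stackaware::80Z1hDhg'}
```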
This standardized way of describing models will become important when we evaluate vulnerabilities in them, and most importantly, when we communicate about them using the CycloneDX Software Bill of Materials (SBOM) standard’s Vulnerability Exploitability eXchange (VEX) capability.
You can describe an entire AI application - and any vulnerabilities in it, both “traditional” and AI-specific - using the CycloneDX approach.
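As a rough illustration (not a complete or validated SBOM), here is how such a model might appear as a component in a CycloneDX 1.5 document, which added the machine-learning-model component type; the surrounding field values are assumptions for demonstration only.

```python
# An illustrative fragment (not a complete or validated SBOM) showing how the
# fine-tuned model might appear as a CycloneDX 1.5 component. Version 1.5 of the
# specification added the "machine-learning-model" component type; the field
# values here are assumptions for demonstration only.
import json

sbom_fragment = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {
            "type": "machine-learning-model",
            "name": "gpt-3.5-turbo",
            "version": "0613",
            "purl": "pkg:generic/gpt-3.5-turbo@0613?ft=stackaware::80Z1hDhg",
        }
    ],
}

print(json.dumps(sbom_fragment, indent=2))
```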
General adherence to security requirements is more important than individual instances of violations
Now that we are clear about what we are evaluating, we can talk about how to evaluate it.
Identifying discrete vulnerabilities is useful only to the extent that it allows us to fix them. In traditional information systems, patching a vulnerability for security reasons can potentially impact data integrity or availability, but this is a relatively rare (although occasionally severe) occurrence.
With AI systems, however, this appears to be quite different.
There is substantial speculation that OpenAI’s introduction of successive “safety layers” has led to clear performance degradation in GPT-4. This makes sense, as attempting surgery to fix a specific issue or set of issues is quite likely to have unintended consequences for the broader model’s performance.
Combined with the generally non-deterministic outputs of AI models, this means identifying specific vulnerabilities in a given model is less important than ensuring that, on the whole, the model meets its security requirements.
Thus, the AIRSS does not attempt to describe individual vulnerabilities, but rather their overall cumulative impact on an artificial intelligence model. From a risk management perspective, this latter metric is far more important.
In the end, what we care about is the likelihood of a model behaving differently than intended, multiplied by the severity of such an occurrence. To talk about these situations with any precision, we need to define business and security requirements first.
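To make the arithmetic concrete, here is a toy sketch with entirely made-up numbers; the actual AIRSS calculation comes in a later installment.

```python
# A toy sketch of the likelihood-times-severity framing, using entirely made-up
# numbers. The actual AIRSS calculation is covered in a later installment.
likelihood = 0.02   # assumed probability of a requirement violation over some period
severity = 50_000   # assumed dollar impact if the violation occurs

expected_loss = likelihood * severity
print(f"Expected loss: ${expected_loss:,.2f}")  # Expected loss: $1,000.00
```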
Establish business requirements
Before a developer starts putting fingers to keyboard, hopefully a product manager has drafted a set of business requirements explaining what a given AI model should do. If you haven’t documented what actions software must be capable of performing, it’s hard to tell what is a bug and what is a feature.
This becomes vitally important when evaluating data integrity considerations. For example, if a Large Language Model (LLM) is passing certain data to a function call, that data will need to meet certain requirements, e.g. be an integer, not exceed a certain number of characters, etc. Only if you have defined these things can you know when the model is not performing as intended.
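As a hedged illustration of enforcing such a requirement, the sketch below validates hypothetical function-call arguments before they are executed; the function, field names, and limits are invented for this example.

```python
# A hypothetical sketch of checking LLM-produced function-call arguments against
# documented requirements before executing them. The function, field names, and
# limits are invented for illustration; only the integer and length constraints
# mirror the examples above.
def validate_refund_args(args: dict) -> list[str]:
    """Return a list of requirement violations (an empty list means the arguments pass)."""
    violations = []

    # Requirement: amount_cents must be an integer
    if not isinstance(args.get("amount_cents"), int):
        violations.append("amount_cents must be an integer")

    # Requirement: customer_note must be a string of at most 200 characters
    note = args.get("customer_note", "")
    if not isinstance(note, str) or len(note) > 200:
        violations.append("customer_note must be a string of at most 200 characters")

    return violations


print(validate_refund_args({"amount_cents": "12.50", "customer_note": "Late delivery"}))
# ['amount_cents must be an integer']
```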
Similarly, these requirements will tell you whether - and how badly - your model’s data has been poisoned. Although I wouldn’t recommend using the output of an LLM to determine which bank account to wire money to, let’s assume you are doing so for argument’s sake.
If this is a refund-processing LLM, it would make sense for the model to kick off a workflow that sends money back to the customer’s account under certain conditions. If this is a purchase-processing LLM, however, it would not. If the model started directing money to customer accounts after they buy from you, your business would have a serious problem.
Thus, you cannot say one outcome is wrong and another correct without defining the business requirements clearly.
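To make that concrete, here is a minimal, hypothetical sketch of encoding such a requirement: outbound transfers to a customer account are only permitted within the refund workflow, never as part of purchase processing. The workflow names are assumptions.

```python
# A minimal, hypothetical sketch of encoding that business requirement: outbound
# transfers to a customer account are only permitted within the refund workflow.
ALLOWED_OUTBOUND_WORKFLOWS = {"refund"}

def outbound_transfer_permitted(workflow: str) -> bool:
    """True only for workflows the business requirements allow to send money out."""
    return workflow in ALLOWED_OUTBOUND_WORKFLOWS

print(outbound_transfer_permitted("refund"))    # True: consistent with requirements
print(outbound_transfer_permitted("purchase"))  # False: would indicate a violation
```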
Establish security requirements
Similarly, before determining what is and is not a vulnerability, we need to understand what a model should and should not do from a security perspective. In a sense, security requirements are just business requirements, and there is no firm need to distinguish them rigidly.
Business requirements usually focus on data integrity and availability, while security requirements often focus on data confidentiality. Additionally, business requirements are often stated in the affirmative (e.g. “the model shall”) and security requirements in the negative (e.g. “the model shall not”). But these categories are not mutually exclusive, and I am distinguishing them merely to help with mental bucketing.
Getting into an example, if:
you are operating a model using a provider that has signed a Health Insurance Portability and Accountability Act (HIPAA) Business Associate Agreement (BAA) with your organization,
you control access to only medical professionals with a need to know, and
there is a valid medical purpose for all queries,
it is conceivable that returning Protected Health Information (PHI) would not represent a security policy violation.
Outside of circumstances like these, however, you probably wouldn’t want the model to do this.
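One way to make conditions like these explicit is to encode them as a policy check applied before returning output that may contain PHI. The sketch below is purely illustrative, and the context fields are assumptions for this example.

```python
# A purely illustrative sketch (not a compliance tool) of encoding the conditions
# above as an explicit policy check applied before returning output that may
# contain PHI. The context fields are assumptions for this example.
from dataclasses import dataclass

@dataclass
class RequestContext:
    provider_baa_signed: bool            # HIPAA BAA in place with the model provider
    user_is_medical_professional: bool   # access limited to medical professionals
    user_has_need_to_know: bool          # need-to-know established for this user
    valid_medical_purpose: bool          # the query serves a valid medical purpose

def phi_return_permitted(ctx: RequestContext) -> bool:
    """Returning PHI is only conceivably acceptable if every condition holds."""
    return all([
        ctx.provider_baa_signed,
        ctx.user_is_medical_professional,
        ctx.user_has_need_to_know,
        ctx.valid_medical_purpose,
    ])

print(phi_return_permitted(RequestContext(True, True, True, True)))   # True
print(phi_return_permitted(RequestContext(True, True, True, False)))  # False
```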
Some additional potential security requirements (see the sketch below) include those that demand a model not:
return any information about anyone who is not the user.
associate any personal identifiers with a given medical diagnosis.
name or provide any information about a sensitive business process.
Check out this post for more examples if you need them.
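To show how “shall not” requirements like the ones above might become testable, here is an illustrative sketch that expresses each as a named check over a model response. The detection logic is deliberately simplistic placeholder logic, and every identifier here is hypothetical.

```python
# An illustrative sketch of expressing "the model shall not" requirements as named,
# testable checks over a model response. The detection logic is deliberately
# simplistic placeholder logic, and every identifier here is hypothetical; real
# checks would need far more robust detectors.
from typing import Callable

# Each check returns True if the response VIOLATES the requirement.
SecurityCheck = Callable[[str, str], bool]   # (user_id, response) -> violated?

def mentions_other_person(user_id: str, response: str) -> bool:
    # Placeholder: a real implementation might resolve entities against user records.
    return False

def links_identifier_to_diagnosis(user_id: str, response: str) -> bool:
    # Placeholder for a detector of personal identifiers co-occurring with diagnoses.
    return False

def names_sensitive_process(user_id: str, response: str) -> bool:
    # Placeholder keyword screen using a hypothetical internal codename.
    return "codename-falcon" in response.lower()

SECURITY_REQUIREMENTS: dict[str, SecurityCheck] = {
    "no information about anyone other than the user": mentions_other_person,
    "no personal identifiers tied to a diagnosis": links_identifier_to_diagnosis,
    "no sensitive business processes named": names_sensitive_process,
}

def violated_requirements(user_id: str, response: str) -> list[str]:
    """Return the names of every security requirement the response violates."""
    return [name for name, check in SECURITY_REQUIREMENTS.items() if check(user_id, response)]

print(violated_requirements("user-123", "Project codename-falcon ships next week."))
# ['no sensitive business processes named']
```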
Conclusion
We now have in place the scope of the AIRSS and a general framework for evaluating risk.
In the next installment, we’ll go through an example of how to calculate AI model risk quantitatively, and then take a look at how to report it in a standardized format.
Ready to implement the AIRSS?
A special thanks to Steve Springett for his feedback prior to publication.