The complexity of modern software supply chains is growing rapidly, and customers increasingly demand to understand how their data is used, especially for AI training.
Organizations have adapted in a variety of ways, such as:
Expanding privacy policies
Building AI-specific trust centers
Obfuscating how they process information (going in the opposite direction)
I believe transparency is generally the best policy, and that machine-readability is a key feature of any new tool.
That’s why StackAware has officially launched its Software Bill of Materials (SBOM) using the CycloneDX 1.5 (JSON) standard!
Drilling down into AI training
In addition to listing:
our first-level subprocessors
the data they handle
their locations
our SBOM also describes known AI training processes in our supply chain.
After working with Steve Springett, Co-Chair of CycloneDX (and a StackAware advisor), I learned that the best way to depict this is with the formulation field, for example:
[
  {
    "workflows": [
      {
        "bom-ref": "urn:uuid:25ce97c5-5822-4f4b-9392-e3060860e0f3#predictive-ai-training-metadata",
        "uid": "Not applicable.",
        "name": "Predictive AI training using metadata",
        "description": "Many modern applications and services conduct predictive AI training on the metadata of customer activity and information for the purposes of fraud detection, spam filtering, and service improvement.",
        "resourceReferences": [
          {
            "externalReference": {
              "url": "https://blog.stackaware.com/p/intellectual-property-artificial-intelligence",
              "comment": "This article describes a framework for evaluating AI training practices from a security and intellectual property protection perspective.",
              "type": "website"
            }
          }
        ],
        "taskTypes": ["other"]
      },
      {
        "bom-ref": "urn:uuid:25ce97c5-5822-4f4b-9392-e3060860e0f3#generative-ai-training-public-data",
        "uid": "Not applicable.",
        "name": "Generative AI training with Public Data.",
        "description": "Many large technology companies train generative AI models using publicly available data such as on internet-facing websites or social media posts.",
        "resourceReferences": [
          {
            "externalReference": {
              "url": "https://blog.stackaware.com/p/intellectual-property-artificial-intelligence",
              "comment": "This article describes a framework for evaluating AI training practices from a security and intellectual property protection perspective.",
              "type": "website"
            }
          }
        ],
        "taskTypes": ["other"]
      }
    ]
  }
]
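Because the SBOM is machine-readable, a consumer could pull these AI training disclosures out with a few lines of code. Here is a minimal Python sketch, not StackAware's own tooling. It embeds a trimmed-down version of the two workflows (descriptions and comments omitted), written with the `url` key and list-valued `taskTypes` that the CycloneDX 1.5 schema defines. The function name `list_ai_training_workflows` and the shape of the summary it returns are my own assumptions for illustration.

```python
import json

# Trimmed-down formulation array modeled on the excerpt; long text fields omitted.
FORMULATION_JSON = """
[
  {
    "workflows": [
      {
        "bom-ref": "urn:uuid:25ce97c5-5822-4f4b-9392-e3060860e0f3#predictive-ai-training-metadata",
        "uid": "Not applicable.",
        "name": "Predictive AI training using metadata",
        "resourceReferences": [
          {
            "externalReference": {
              "url": "https://blog.stackaware.com/p/intellectual-property-artificial-intelligence",
              "type": "website"
            }
          }
        ],
        "taskTypes": ["other"]
      },
      {
        "bom-ref": "urn:uuid:25ce97c5-5822-4f4b-9392-e3060860e0f3#generative-ai-training-public-data",
        "uid": "Not applicable.",
        "name": "Generative AI training with Public Data.",
        "resourceReferences": [
          {
            "externalReference": {
              "url": "https://blog.stackaware.com/p/intellectual-property-artificial-intelligence",
              "type": "website"
            }
          }
        ],
        "taskTypes": ["other"]
      }
    ]
  }
]
"""


def list_ai_training_workflows(formulation):
    """Return one summary dict per workflow in a CycloneDX formulation array.

    Hypothetical helper: the name and the returned keys are illustrative only.
    """
    summaries = []
    for formula in formulation:
        for workflow in formula.get("workflows", []):
            # Collect URLs from any external references attached to the workflow.
            urls = [
                ref["externalReference"]["url"]
                for ref in workflow.get("resourceReferences", [])
                if "url" in ref.get("externalReference", {})
            ]
            summaries.append(
                {
                    "ref": workflow.get("bom-ref"),
                    "name": workflow.get("name"),
                    "urls": urls,
                }
            )
    return summaries


workflows = list_ai_training_workflows(json.loads(FORMULATION_JSON))
for workflow in workflows:
    print(f"{workflow['name']} -> {workflow['ref']}")
```

In a full SBOM, the same function could be pointed at the document's top-level formulation field after loading the file with `json.load`.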
Over time, we will introduce additional detail into the SBOM, such as:
Known - but not exploitable - vulnerabilities
Data retention periods
n-level subprocessors
So stay tuned!
Staying on top of supply chain complexity
If building a CycloneDX SBOM doesn’t seem like your definition of fun and neither does mapping AI use among your vendors (and their vendors), you might consider getting some help.
StackAware enables organizations to manage their:
Cybersecurity
Compliance
Privacy
risk when using AI.
Ready to chat?