Amazon's $1,401,573 loss from ChatGPT data leakage
Quantifying unintended training risk with AI.
This headline caught people’s attention earlier this year:
As it should.
Essentially, the reported story was about Amazon engineers using ChatGPT for various purposes and seeking guidance on how to use it securely. An attorney responded with a stern warning to avoid submitting confidential information to the tool because its responses were beginning to reproduce internal information.
Bottom line: there is pretty good evidence Amazon was burned by the phenomenon of unintended training. This occurs whenever an AI model trains on information that the provider of such information - in retrospect - would not want the model to be trained on.
In this post I am going to quantify the damage the company suffered as a result. My approach is loosely based on the Factor Analysis of Information Risk (FAIR) methodology, and I’ll use the six categories of loss to structure this piece:
Productivity
Response
Replacement
Fines and Judgments
Competitive Advantage
Reputation Damage
Because we are looking retrospectively here, this isn’t a pure FAIR approach (which is always forward-looking). But the methodology provides some good tools and following this format will make future-looking forecasts easier. Finally, my calculations require a series of assumptions. So I state all of these explicitly and provide evidence justifying them along the way.
Setting the stage for Amazon’s data leakage
Assumption: ChatGPT was exposed to - and trained on - internal Amazon code.
I think this is fair to say based on what Amazon developers admit to doing, according to Business Insider:
Additionally, this article came out less than two months after the launch of ChatGPT. At this point there was relatively little understanding of how the model worked and what data it trained on. Thus, I doubt even a majority of these developers opted out of training.
With this established we can now look at the consequences:
Evaluating primary loss from ChatGPT’s unintended training
Assumption: ChatGPT training on Amazon’s data made it an excellent interview prep tool, which resulted in Amazon hiring at least 1 under-qualified full-time equivalent (FTE) developer.
This assumption is based on this excerpt:
Additionally, in the weeks after the launch of ChatGPT, dozens of blogs and YouTube videos appeared showing how to use the tool to prepare for coding interviews. The people following these recommendations almost certainly did not have malicious intent, but were just following sound (from their perspective) career advice.
And it seems highly likely that folks who prepared using this “fine-tuned” version of ChatGPT would have a higher chance of landing the job than they would have otherwise.
That doesn’t mean they were qualified for the job, though.
Assumption: Software engineers provide Amazon double the value in productivity that they consume in terms of “fully loaded” compensation.
I couldn’t find any clearly sourced estimates for this, but considering the cost of managing engineers on top of their fully loaded compensation, returning double that cost in productivity seems like a very low estimate. ChatGPT suggests the multiple is 3-5x, but couldn’t say why:
Assumption: “fully-loaded” employee compensation (including taxes, insurance, healthcare, other benefits, etc.) is 150% of salary. This is on the low-ish end, but it takes into account Amazon’s well-developed human resources and related infrastructure.
Because median software engineer compensation at Amazon is $240,000, fully-loaded compensation comes to $360,000, and the value delivered would be $720,000 per year.
Assumption: the under-qualified FTE engineer generated 50% of the value of a properly qualified one.
Again, this seems like a conservative estimate: it’s quite conceivable many engineers produce zero or even negative value, while productivity is heavily skewed toward the top of the distribution.
Assumption: The sub-par FTE will stay employed for 1 year before being identified and removed. The formal performance-related termination process looks to take 2-3 months, so it seems reasonable that someone would get 3-4 times that much ramp-up time after onboarding.
Thus, with these assumptions in place, Amazon has (or will have eventually) lost $360,000 due to forgone productivity.
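The productivity-loss arithmetic above can be sketched as a quick calculation (the figures come from the assumptions stated in this post; the variable names are mine):

```python
# Assumptions from the analysis above
median_salary = 240_000                     # median Amazon SWE compensation
fully_loaded = median_salary * 1.5          # salary plus taxes, benefits, etc.
value_delivered = fully_loaded * 2          # a qualified engineer returns 2x their cost
underqualified_fraction = 0.5               # the sub-par hire delivers half the value
tenure_years = 1                            # time before being identified and removed

productivity_loss = value_delivered * (1 - underqualified_fraction) * tenure_years
print(f"${productivity_loss:,.0f}")         # $360,000
```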
Dealing with the fallout of the unintended training internally cost money all by itself.
This is based on attorneys’ bills I have received in the past for similarly lawyerly (but not especially helpful) wording.
Assumption: this response required 2 person-hours from Amazon’s corporate counsel(s).
This seems conservative, especially considering another Business Insider article from later in 2023 suggesting Amazon had since developed specific policies regarding the use of ChatGPT, which had been lacking earlier. It’s difficult to tie all of this work back to this particular situation, but I think at least 2 hours is reasonable.
Assumption: Amazon employees work an average of 2000 hours per year.
This is a standard assumption used in labor cost calculations, as it assumes 50 work weeks at 40 hours per week.
Amazon corporate counsel’s median salary is $290,289 per year, which is $218/hour fully-loaded.
So the total cost of attorney labor is $436.
Assumption: Amazon corporate communications spent at least 1 hour figuring out how to deal with press inquiries about this situation.
While Amazon didn’t respond to a request for comment for this story, I am highly confident that at least one person-hour of time was spent determining whether to respond, based on my experience on the other side of these things.
A corporate communications expert’s average salary at Amazon is $182,700, which is $137/hour fully-loaded.
Total response costs were thus $436 + $137 = $573.
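As a sanity check, the response-cost figures work out as follows (the salaries, 2,000-hour work year, and 150% loading factor are the assumptions stated above; the helper function is mine):

```python
HOURS_PER_YEAR = 2000   # 50 work weeks x 40 hours
LOAD_FACTOR = 1.5       # fully-loaded multiplier on salary

def loaded_hourly_rate(salary: float) -> int:
    """Fully-loaded hourly cost, rounded to the nearest dollar."""
    return round(salary / HOURS_PER_YEAR * LOAD_FACTOR)

counsel_cost = loaded_hourly_rate(290_289) * 2  # 2 person-hours of legal time
comms_cost = loaded_hourly_rate(182_700) * 1    # 1 person-hour of communications
print(counsel_cost, comms_cost, counsel_cost + comms_cost)  # 436 137 573
```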
Amazon also needed to recruit someone new to replace the unqualified FTE engineer. Replacement costs for technical employees can easily reach 100% of their annual salary once you include severance and human resources expenses, as well as the costs of onboarding and training a new hire.
Thus, a conservative estimate for replacement costs is $240,000.
Fines and Judgments
It doesn’t look like anything regulated (personal data or protected health information) was trained on in this case. Thus Amazon the corporate entity appears to have suffered 100% of the damage, which means regulatory action, fines, or lawsuits are unlikely.
Competitive Advantage
This is a very challenging one to calculate, but it is almost certainly the greatest source of damage to Amazon here.
Assumption: rivals benefited from access to internal Amazon code.
One of the primary reasons loss of competitive advantage following a breach isn’t as bad as it could be is that many potentially competitive companies don’t want to take the reputation or legal risk of using stolen or doxxed intellectual property (IP).
With ChatGPT though, one could claim very plausible deniability while still doing so. That’s because it’s highly unlikely a competitor would even know whether and to what degree the model’s responses were leveraging Amazon’s code.
Considering 80% of the Fortune 500 were using ChatGPT as of this summer, Amazon’s proprietary code was likely spread far and wide, including to companies that compete directly with it.
I wouldn’t even be surprised if Microsoft and/or OpenAI had some secret project to extract information about competitors from ChatGPT via sensitive data generation.
The only reasonable IP legal defense Amazon could use for its code is copyright, because it likely lost trade secret protection when its developers knowingly entered their employer’s confidential data into a tool that affirmatively stated it was using that data in ways that could make versions of it available to others (I am not a lawyer and this is not legal advice).
But it looks like indemnification policies from the major generative AI players will make litigation on copyright grounds (related to training models on code) difficult.
Additionally, notwithstanding GPT-3.5/4 reported performance drift, I think it’s reasonable to assume some relative performance improvement in its capabilities as a result of training on Amazon’s code. I think it’s especially likely Amazon’s biggest competitors - mainly Microsoft which has its own GPT instances - gained as a result.
Assumption: Amazon’s software-related revenue is entirely captured by Amazon Web Services (AWS) annual revenue. This was $80.1B for approximately the trailing year before the article’s publication.
This is a simplification, but not a crazy one, because AWS emerged essentially as a way to sell the byproducts of Amazon’s own internal development and infrastructure engineering efforts.
Assumption: Amazon lost the equivalent of 0.001% of AWS revenue for one year due to the unintended training.
This is hard to estimate exactly, but if anything, it’s an underestimate based on the analysis above.
This means Amazon lost $801,000 in competitive advantage as a result.
Reputation Damage
For Amazon, this is barely a dent in the basically continuous wave of (good and bad) press they get. This incident didn’t appear to involve any loss of customer information or downtime, which are the primary drivers of cybersecurity-related reputation damage. So I’ll call this a wash.
Total reputation damage: $0.
Considering secondary loss
I’m not aware of any other stakeholders besides Amazon itself taking action as a result of this series of events. See this article for an explanation if you want more detail on how to think about primary versus secondary loss using FAIR.
Total secondary risk: $0.
Summing it all up: the financial costs of unintended training
Now it’s just a matter of adding up the six types of loss:
Productivity: $360,000
Response: $573
Replacement: $240,000
Fines and Judgments: $0
Competitive Advantage: $801,000
Reputation Damage: $0
The total bill: $1,401,573.
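Tallying the six categories confirms the headline figure:

```python
# Loss estimates derived throughout this post, by FAIR category
losses = {
    "Productivity": 360_000,
    "Response": 573,
    "Replacement": 240_000,
    "Fines and Judgments": 0,
    "Competitive Advantage": 801_000,
    "Reputation Damage": 0,
}

total = sum(losses.values())
print(f"${total:,}")  # $1,401,573
```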
For all the risk quantification skeptics out there, it’s important to be clear that this is just an estimate. Obviously Amazon is not going to get an invoice for this amount that it has to pay.
But these are real losses that could have been avoided.
The ROI of AI governance
Now, imagine you could rewind to November 2022, when ChatGPT was released, and you took some steps to reduce the likelihood of this incident occurring by 10%.
That alone would be a savings of $140,157.
Worth getting a new security tool for tens of thousands per year? Potentially.
But you know what would have reduced the risk for even less money?
Understanding how ChatGPT uses your data!
A small tweak to the warning screen described below saying “please confirm you have opted-out of training” would have likely saved many Amazon engineers from the expensive mistake they ended up making.
And when combined with a technical solution, the mitigation effect would be even greater.
So for the security leaders out there concerned about their risk posture when it comes to AI, think hard about how you are managing it.
Want to save hundreds of thousands of dollars (or more) in potential loss?
There’s good news.
StackAware’s risk assessment and governance program is a lot cheaper than that.