Figma AI: security and privacy considerations
De-identification, subprocessors, and incentives.
TL;DR
Figma will start training proprietary AI models on private customer data on August 15th, 2024. It’s good they gave advance notice of this. They also sent a customer-wide email on the topic.
Previously it used third-party models, both out-of-the-box and fine-tuned on public customer data, to power its AI features. Those third-party models will continue to complement the proprietary ones, but they will not be trained on private customer data.
Customer admins can control whether Figma trains on their content or not.
What functionality is Figma rolling out and how can I control it?
1. Usage data AI training
This is pretty standard and many companies already train models on this type of data. Examples of information trained on include:
How many times content is accessed
Technical logs / metadata
Telemetry
I suppose there is some risk usage training data itself could reveal something sensitive (e.g. a burst of activity before a product launch). But I can’t see how this would be reproduced for another customer in a way that leaks confidential information.
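For illustration, here is a rough sketch of the kind of record this category might cover. The field names are my own assumptions, not Figma's actual schema; the point is that the record contains metadata about activity rather than the design content itself.

```python
# Hypothetical usage/telemetry event (field names are assumptions, not Figma's schema).
# Nothing here is the design content itself, but access timing and volume could
# still correlate with sensitive activity, e.g. a burst of edits before a launch.
usage_event = {
    "event_type": "file_opened",          # what happened
    "file_id": "f-a1b2c3",                # opaque identifier, not the file contents
    "team_id": "t-789",                   # opaque team identifier
    "timestamp": "2024-06-30T14:22:05Z",  # when it happened
    "client": "figma-desktop/124.2",      # technical metadata
    "session_duration_ms": 5321,          # telemetry
}
```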
Here are Figma’s usage data training policies:
Completely exempted:
Figma for Education
Figma for Government
Mandatory
Everyone else
2. Content data training
This is where things get more interesting. There are certainly ways content data could contain confidential information. Especially when examples of content data provided by Figma include:
Text and images
Comments and annotations
Layer names and properties
Figma says it de-identifies this content and redacts sensitive information prior to training, but this is my biggest area of concern (see below for details).
Here are Figma’s content data training policies:
Completely exempted:
Figma for Education
Figma for Government
Opt-in (via admin controls)
Enterprise tier
Organization tier
Teams who turned off “Figma and FigJam AI” before June 26, 2024.
Opt-out (via admin controls)
Starter tier
Professional tier
3. Access to AI features
All users will have access to AI-powered features, whether or not they allow training on their content.
Access can be turned off through:
Opt-out (via admin controls)
Everyone
Three questions for Figma about their AI training and features:
1. How does de-identification for content work?
Figma states:
We take additional steps to train our models so that they learn general design patterns and Figma-specific concepts and tools—not your content, concepts, and ideas.
For example, we de-identify content and redact sensitive information including from text and images.
Redacting sensitive data from images seems especially challenging because of their unstructured nature. This is similar to DocuSign’s AI training policies and descriptions of de-identification, which still concern me.
More details on the de-identification process would be appreciated. Better yet, customer approval of the outputs would remove any doubts.
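To show why I want more detail, here is a minimal sketch of what pattern-based text redaction typically looks like. This is my own illustration under assumed patterns, not Figma's actual pipeline, and it highlights the gap: structured identifiers are easy to catch, while context-dependent secrets (a codename, an unannounced feature) slip through, and there is no equivalently simple pattern match for images.

```python
import re

# Illustrative, pattern-based text redaction (NOT Figma's actual pipeline).
# Catches well-structured identifiers but misses context-dependent secrets.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matched identifiers with placeholder tokens."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Ping jane.doe@acme.com or +1 (415) 555-0123 about Project Falcon."))
# -> "Ping [EMAIL] or [PHONE] about Project Falcon."
# The codename "Project Falcon" sails straight through, which is exactly the concern.
```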
2. What are your subprocessor retention periods?
Figma says they:
Do not permit 3rd party model providers to use data that customers upload or create on the Figma platform for training their own models
and
Limit how long vendors can store data. OpenAI and our other 3rd party model providers store data temporarily, or in some cases not at all, in order to process requests and enable AI features
Figma points you to its subprocessor list to determine the retention periods, but no retention timelines are listed there.
From their subprocessor list, it looks like Figma is using:
AWS
Azure
OpenAI
Jasper
for its “AI-enabled” features.
Based on previous research, I know the generative AI services of the first three retain inputs for between 0 and 30 days.
Jasper’s privacy policy, however, states that the:
Company will retain Your Personal Data only for as long as is necessary for the purposes set out in this Privacy Policy.
I suppose you could call this a “limit,” but in practice it often means indefinitely.
Exact details on which subprocessors are powering the AI features, and how long they retain data, would be helpful here.
3. Why would anyone submit to AI content training?
This is more of a business question than a security one, but it seems like Figma is taking the following approach:
Defaulting individuals and smaller companies into training (opt-out) because they have less leverage and expertise to scrutinize AI training policies.
Defaulting larger organizations out of training (opt-in) because they are more likely to complain about AI training on their data.
People have a strong status quo bias and default options are incredibly powerful. So Figma may get all the training it needs from those who are automatically opted-in.
Ultimately, though, I would love to see an arrangement where people and companies are financially incentivized to opt in. Those with fewer concerns about having their work incorporated into AI models can get paid, and those with concerns can choose a higher price and not contribute to product improvement.
Building AI products? Need help with your training policies?
There is no AI without training data, so it needs to come from somewhere.
With that said, customers (especially enterprise ones) are incredibly sensitive about how their content is used.
Having a clear policy and communications strategy is key to avoiding fumbles like those Zoom and DocuSign experienced.
So if you need help: