Zoom's AI misstep
Cutting through the noise about recent revelations in their terms of service.
TL;DR
Earlier this year, Zoom updated its terms and conditions regarding the training of its AI models.
A tech blogger recently reviewed and criticized these changes. Zoom's response was fast, but confusing.
Zoom is almost certainly not using chat, video, or audio for training without explicit opt-in (Updated August 17, 2023).
It is, however, using this content for training if you opt into the [Zoom has] generative AI features called IQ Meeting Summary and IQ Team Chat Compose (Updated August 17, 2023).
Do not opt in to these features if you are concerned about your content becoming part of Zoom's AI models (Updated August 17, 2023).
Zoom is also (and has been for a while) using telemetry data, usage statistics, customer actions, etc. (which it calls “Service Generated Data”) for AI training. It considers this data to be its own property and you cannot opt out of such training.
[On August 17, 2023, I both [added] and removed content based on Zoom’s changed terms of service.]
Timeline
March 2023: Zoom modifies its terms and conditions with references to training AI models.
August 7, 2023: Alex Ivanovs at Stackdiary posts a blog criticizing the changes.
August 7, 2023: Zoom again modifies its terms and conditions to clarify rules about AI training, posts tweets, and releases a blog post. It mentions that it will train on customer data such as meeting invitations for webinars, but it won’t use audio, video, or in-meeting chat content.
August 11, 2023: Zoom yet again modifies its terms and conditions, streamlining them substantially and making clear they do not train their AI models on large swaths of Customer Content at all. (Updated August 17, 2023)
Questions
Question: Is Zoom training AI models using images of my face, my voice, etc. without my knowledge?
Answer: Probably not. While they could be lying, that seems unlikely due to the reputation risk. And there is no evidence that they are continuously recording all meetings irrespective of user decisions (which would be a requirement for such an AI training scheme and also probably illegal).
Question: Can Zoom use my audio, video, or chat content for marketing purposes (without AI training)?
Answer: It looks like the answer is technically “yes,” based on section 10.4 of their terms and conditions. This is odd to me, and I certainly wouldn’t want my recorded meeting to appear in a Zoom ad. But I will note that Gmail had a similar provision for a long time, and I interpret this as meaning that Zoom reserves the right to target ads (theirs or third parties’) based on your content. This would seem tough to do using audio, video, or chat content without AI, so this provision remains a little strange to me. (Updated August 17, 2023)
Question: Is Zoom using any of the below data for AI training (directly taken from Rob Black’s post)? Answers are provided below each point, but note that these are assessments based on the terms and conditions, Zoom’s other statements, and industry practices.
Is closed captioning content / the transcript of the call covered?
Answer: probably not, unless using generative AI features (because that’s what they seem designed to do) [as this seems to be “communications-like Customer Content.”]
Is screen sharing covered?
Answer: No, according to a LinkedIn post from Zoom’s CEO.
Is your photo that is being used in Zoom?
Answer: maybe, as it doesn’t seem covered explicitly. But probably low value to Zoom from a business perspective.
How about your name?
Answer: maybe, but same as above.
Who you talk to?
Answer: Probably, as this seems like it is covered under “Service Generated Data.”
How long?
Answer: Almost certainly. Same as above.
Your attention to the meeting or if you are clicking around?
Answer: Almost certainly.
Title of the Zoom meeting?
Answer: Yes. Zoom explicitly mentions that it trains its AI on meeting invitations for fraud detection purposes.
What can the Zoom app on your desktop collect about you that could then be used?
Answer: Unclear, but certainly limited to what your operating system (e.g. macOS) gives it access to.
Does it train on attachments, polls, or whiteboards?
Answer: No, according to this UI prompt [and updated terms of service] (Updated August 17, 2023).
Question: What should I do about these revelations?
Answer:
Understand Zoom is using product usage patterns, telemetry, event invitations, and similar data to train its AI models.
Know that using IQ Meeting Summary and Team Chat Compose means Zoom will train on your chat, image, and voice data. Do not use these features if you are concerned about data leakage or retention (I am concerned and do not plan to enable these). (Updated August 17, 2023)
Enable end-to-end encryption (E2EE). This addresses not just AI training but many other security concerns (e.g. if Zoom is itself breached). Assuming you don’t think Zoom is lying (they admittedly don’t have a good track record, but there isn’t much else you can do), this will store the decryption keys for any meetings locally and prevent the content from being used for training. The full details are here but below is the quick guide (h/t to Michael Breese for flagging this).
Lessons learned and discussion
Zoom probably thought they had everything wrapped up from a product, security, governance, and legal perspective when they changed their terms and conditions back in March of this year and then rolled out their new AI features.
Boy, were they wrong.
Although it took almost half a year for someone to notice, the internet fury was swift.
Admittedly, this first iteration of the terms and conditions was quite vague as to what Zoom would do with things like customer audio, video, and chat data (clearly the most sensitive). The subsequent addition was helpful, but probably should have been there in the first place (especially since Zoom took the trouble to create a separate opt-in process for its generative AI features: Meeting Summary and Team Chat Compose).
[The final update to Zoom’s terms of service, on August 11th, was the most reassuring and clear, but still leaves a few details out with respect to the definition of “Service Generated Data”] (Updated August 17, 2023).
There also seemed to be substantial concern over the fact that Zoom asserted ownership of “Service Generated Data.” As someone who has worked in the software business for a while, this didn’t seem especially odd to me, as product usage patterns and similar telemetry are a bedrock of engineering and design decisions. Although it isn’t clear from the terms and conditions that this is necessarily anonymized data, I am not especially concerned with anyone knowing how many times I clicked a certain button or at what times I started a meeting. Nonetheless, Zoom should have been more proactive in communicating its position (explaining how this is a common position in the industry and why it shouldn’t be especially worrying).
Probably the biggest concern for me is that if you do opt-in to the generative AI features, your likeness and voice might become part of Zoom’s model indefinitely! This is certainly something I can see drawing regulatory attention, so I would caution organizations against turning this feature on unless there is a clearly overwhelming business case for doing so. You might need to unwind - or participate in Zoom’s unwinding - of the training after the fact, so be wary (Updated August 17, 2023).
Finally, it was expected but nonetheless interesting to note that Zoom made clear [and then removed that clarification on August 11th] they weren’t using customer data for training any third-party models (e.g. OpenAI’s GPT-4 or Anthropic’s Claude 2). As the AI arms race rages, it makes perfect sense that they wouldn’t want to leverage their customers’ data to help out anyone else (Updated August 17, 2023).
Want to avoid these types of mistakes?
StackAware helps organizations manage their cybersecurity, compliance, and privacy risk when it comes to using AI. A key piece of this is communicating clearly to stakeholders, especially customers.
Disclosure: StackAware is a Zoom customer. Rob Black’s company, Fractional CISO, is a business partner of StackAware.