Hungry hungry (data) harvester
The security implications of hooking up Google Drive and Microsoft OneDrive to ChatGPT.
OpenAI is on a feature release streak with the launch of GPT-4o and several other new capabilities. While I didn’t notice anything major security- or privacy-related with the launch of the new “omni” model, I did with the new integrations.
It is now possible to directly connect Google Drive or Microsoft OneDrive with the AI tool. And I think there are some immediate concerns to address when it comes to data integrity, training, and retention.
What do the integrations look like?
You can make the connection from the user interface:
Depending on which cloud solution you use, you then get different prompts. Each one has different security implications.
Google Drive
The good news here is that this only allows ChatGPT access to specific files used with the AI tool.
The bad news is that it requests edit, create, and delete permissions. These aren’t necessary for the stated functionality, so they are a little concerning.
A breach of OpenAI - or even a non-malicious incident - could result in the corruption or deletion of files you have exposed to ChatGPT. So I would recommend carefully managing which files you authorize it to access.
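If you administer a Google Workspace domain, you can audit exactly which scopes ChatGPT holds for a given user rather than trusting the consent screen. Here is a minimal sketch using the Admin SDK Directory API; it assumes admin credentials carrying the admin.directory.user.security scope, the google-api-python-client library, and that the grant appears under a display name containing “ChatGPT” (an assumption on my part):

```python
# Minimal sketch: audit a user's third-party OAuth grants via the
# Google Admin SDK Directory API and flag anything ChatGPT holds.
# Assumes admin credentials and google-api-python-client; matching on
# "chatgpt" in the display name is an assumption.
from googleapiclient.discovery import build

def audit_chatgpt_grants(credentials, user_email):
    service = build("admin", "directory_v1", credentials=credentials)
    tokens = service.tokens().list(userKey=user_email).execute()
    for token in tokens.get("items", []):
        if "chatgpt" in token.get("displayText", "").lower():
            print(f"App: {token['displayText']}")
            for scope in token.get("scopes", []):
                # drive.file allows read *and* write on authorized files,
                # which is where the edit/create/delete concern comes from.
                flag = " <-- includes write access" if scope.endswith("drive.file") else ""
                print(f"  {scope}{flag}")
```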
Microsoft OneDrive
OneDrive’s permission requests basically invert Google Drive’s problems. ChatGPT only requests read access (which is good) but gives you no fine-grained control over which files it can read.
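If your organization uses Microsoft Entra ID, you can confirm what ChatGPT was actually granted by querying Microsoft Graph for its service principal and delegated permission grants. A hedged sketch using the requests library; it assumes you already have an access token with Directory.Read.All, and that the app registers under the display name “ChatGPT”:

```python
# Minimal sketch: list the delegated permissions the ChatGPT service
# principal holds in a tenant via Microsoft Graph. Assumes an access
# token with Directory.Read.All; the display name is an assumption.
import requests

GRAPH = "https://graph.microsoft.com/v1.0"

def list_chatgpt_grants(access_token):
    headers = {"Authorization": f"Bearer {access_token}"}
    # Find the service principal created when a user consented to ChatGPT.
    resp = requests.get(
        f"{GRAPH}/servicePrincipals",
        headers=headers,
        params={"$filter": "displayName eq 'ChatGPT'"},
    )
    resp.raise_for_status()
    for sp in resp.json().get("value", []):
        grants = requests.get(
            f"{GRAPH}/servicePrincipals/{sp['id']}/oauth2PermissionGrants",
            headers=headers,
        )
        grants.raise_for_status()
        for grant in grants.json().get("value", []):
            # "scope" is space-separated, e.g. "Files.Read offline_access";
            # read-only Files scopes are what you would hope to see here.
            print(f"{sp['displayName']}: {grant.get('scope')}")
```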
This distinction is incredibly important when it comes to addressing two security issues.
There are AI training and data retention concerns with both options
It is possible to use temporary chats when processing files (whether from cloud storage or directly uploaded), which I would recommend. Additionally, switching from normal to temporary chat mode removes any files attached to the prompt. This is kind of a pain, but the reverse is also true (switching from temporary back to normal mode detaches files as well), which can prevent you from accidentally processing sensitive data in normal mode that you didn’t want to expose.
ChatGPT Plus and below users are opted into training by default, which means OpenAI can more easily train models on large volumes of sensitive information
Because directly integrating your cloud storage with ChatGPT makes it easier to share data (a key feature) with the tool, the risk of unintended training also goes up. There are certainly mitigation steps that non-commercial users can take, such as opting out of training. But all other things being equal, this ease of use comes with drawbacks.
Additionally, while at a technical level ChatGPT should only have access to the Google Drive files you specifically authorize, this is not necessarily true for OneDrive. I asked OpenAI whether there is any internal logic that limits which files are exposed to AI training and will update this article if I get a response.
Data retention is indefinite by default with ChatGPT Team and below
This is potentially an even bigger issue for businesses.
While OpenAI doesn’t train on the data from Team users, it does retain it indefinitely except in temporary chat mode. Unless your users are very conscientious about what they are doing, these new features will increase the amount of sensitive data forever residing on OpenAI’s servers.
Enterprise users can specify their default retention period, reducing the risk there. But ChatGPT Enterprise is expensive.
How do I disconnect my cloud storage from ChatGPT?
Out of an abundance of caution, I tested the functionality by connecting my personal Google Drive and my company OneDrive (which holds very little data, all of it publicly available).
Being a security-conscious person, I decided to undo the connection after my experiment.
You can do this for both on the ChatGPT side:
And you can do this on the Google Drive side as well:
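Workspace admins can also revoke the grant for a user programmatically instead of clicking through the UI. A sketch under the same assumptions as the audit example earlier (admin credentials, google-api-python-client, and a display-name match on “ChatGPT”):

```python
# Minimal sketch: revoke ChatGPT's OAuth grant for one user via the
# Google Admin SDK. Same assumptions as the audit sketch above.
from googleapiclient.discovery import build

def revoke_chatgpt_grant(credentials, user_email):
    service = build("admin", "directory_v1", credentials=credentials)
    tokens = service.tokens().list(userKey=user_email).execute()
    for token in tokens.get("items", []):
        if "chatgpt" in token.get("displayText", "").lower():
            # Deleting the token revokes the app's access for this user.
            service.tokens().delete(
                userKey=user_email, clientId=token["clientId"]
            ).execute()
            print(f"Revoked grant for {token['displayText']}")
```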
For OneDrive, figuring out how to revoke permissions for ChatGPT took me 15 minutes. I followed the instructions on the original authorization screen and went to https://myapps.microsoft.com/.
But here is what I found:
This makes no sense because I am an admin.
After reviewing this help article and navigating to the login form it recommends, I got a message saying my account did not exist.
There appears to be a difference between Microsoft 365 accounts and Microsoft Live accounts, which caused this problem.
Finally, after doing some intense Googling, I was able to figure out that I needed to log in to Azure as an administrator and take the following steps (a scripted equivalent appears after the list):
1. Go to “Enterprise Applications.”
2. Click on “ChatGPT.”
3. Click “Properties.”
4. Click “Delete.”
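If you’d rather script this than hunt through the Azure portal, deleting the enterprise application’s service principal via Microsoft Graph accomplishes the same thing. A hedged sketch; it assumes an access token with Application.ReadWrite.All, and (again) that the app’s display name is “ChatGPT”:

```python
# Minimal sketch: delete the ChatGPT enterprise application (service
# principal) via Microsoft Graph, revoking its access tenant-wide.
# Assumes an access token with Application.ReadWrite.All.
import requests

GRAPH = "https://graph.microsoft.com/v1.0"

def delete_chatgpt_service_principal(access_token):
    headers = {"Authorization": f"Bearer {access_token}"}
    resp = requests.get(
        f"{GRAPH}/servicePrincipals",
        headers=headers,
        params={"$filter": "displayName eq 'ChatGPT'"},
    )
    resp.raise_for_status()
    for sp in resp.json().get("value", []):
        # Equivalent to Enterprise Applications -> ChatGPT -> Delete.
        delete = requests.delete(
            f"{GRAPH}/servicePrincipals/{sp['id']}", headers=headers
        )
        delete.raise_for_status()
        print(f"Deleted service principal {sp['id']} ({sp['displayName']})")
```

Note that this removes consent and tokens for every user in the tenant, not just your own.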
Usability can certainly be a security feature, and Microsoft fails big time here.
AI apps are getting hungrier for data
There is certainly utility in being able to directly connect your file storage system to powerful tools like ChatGPT. At the same time, OpenAI benefits by having more data - all other things being equal - on which to train its models.
Even information it is not training on, however, ChatGPT retains indefinitely by default for non-Enterprise customers. This is a strange decision from my perspective - in security, privacy, and business terms - but it is nonetheless a reality.
AI-powered applications and companies appear to think “more is better” when it comes to data, gobbling up as much as they can. This approach can be double-edged. Customers who want to stay on top of these challenges have a lot of work to do.
Or they can just have StackAware do it for them.
Does this sound like you?