Generative AI risk cheat sheet: training vs. RAG
Key differences and their security implications.
Check out this walkthrough on YouTube.
You’ve done the hard work training or fine-tuning a generative AI model.
But now facts on the ground have changed and you need to update its responses.
What do you do? Train again?
Fortunately, the answer is “not necessarily.”
That’s thanks to a process known as retrieval-augmented generation (RAG). I’ve thrown the term RAG around a bit but recently realized I hadn’t thoroughly explained it.
Breaking down the differences
Model training and fine-tuning
Training a generative AI model allows it to create new content mirroring the training data by learning its patterns, nuances, and structures.
The process involves adjusting the internal parameters or weights of the model, which data scientists fine-tune to minimize the difference between the model’s output and the actual data. The outcome is a model that can generate plausible new content that resembles its training data.
Software-as-a-Service (SaaS)-only models like GPT-3.5 can be further fine-tuned using OpenAI’s tools and interfaces.
Open source models like Llama 3 can be further fine-tuned and deployed in Infrastructure-as-a-Service (IaaS) environments. Companies can also use Platform-as-a-Service (PaaS) offerings like MonsterAPI for this purpose.
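To make this concrete, here is a minimal sketch of launching a fine-tuning job with OpenAI’s Python SDK. The file name, base model snapshot, and data format are assumptions for illustration; check OpenAI’s current documentation before relying on any of them.

```python
# Minimal sketch of kicking off a fine-tuning job with the OpenAI Python SDK.
# File name and model are placeholders; training data must be a JSONL file of
# chat-formatted examples per OpenAI's fine-tuning guide.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the training examples.
training_file = client.files.create(
    file=open("training_examples.jsonl", "rb"),
    purpose="fine-tune",
)

# Create the fine-tuning job against a base model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)

print(job.id, job.status)
```

The result is a new model snapshot whose weights have been nudged toward your examples, which you can then call like any other model.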
RAG
In RAG, an AI model doesn’t rely solely on its training data; it also accesses an external knowledge base in real time to enhance its responses. The model retrieves relevant information from this knowledge base, often stored as embeddings in a vector space, and integrates it into the generative process.
This method is particularly useful for tasks that require up-to-date knowledge or domain-specific information not thoroughly covered in the training data.
Both OpenAI’s Assistants and GPTs features can leverage RAG to provide tailored responses via SaaS.
Organizations can also build RAG pipelines using popular libraries like LangChain in their IaaS environments.
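For those who haven’t seen one end-to-end, here is a stripped-down sketch of the RAG flow: embed the documents, retrieve the closest match to the query, and include it in the prompt. The model names and the in-memory similarity search are assumptions for illustration; a production pipeline would use a real vector database, chunking, and a framework like LangChain.

```python
# Minimal RAG sketch: embed documents, retrieve the most relevant one,
# and ground the model's response in it. Model names are illustrative.
import numpy as np
from openai import OpenAI

client = OpenAI()

documents = [
    "Our refund policy changed on 2024-01-01: refunds are issued within 14 days.",
    "Support hours are 9am-5pm Eastern, Monday through Friday.",
]

def embed(texts):
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in response.data])

doc_vectors = embed(documents)

def answer(question):
    # Retrieve the closest document by cosine similarity.
    q_vec = embed([question])[0]
    scores = doc_vectors @ q_vec / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vec)
    )
    context = documents[int(np.argmax(scores))]

    # Pass the retrieved context to the model alongside the user's question.
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"Answer using this context: {context}"},
            {"role": "user", "content": question},
        ],
    )
    return completion.choices[0].message.content

print(answer("What is the refund window?"))
```

Because the knowledge base lives outside the model, updating it is as simple as adding or replacing documents, with no retraining required.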
Business use cases
Scale AI and OpenAI put together a good webinar exploring the different applications for each technique. But to summarize:
Model training and fine-tuning
Training or fine-tuning generative AI models lets them mimic the style and patterns of the training data, enabling them to generate consistent and coherent long-form content.
RAG
On the other hand, RAG is most useful when the model needs to provide factual, updated, or nuanced information that goes beyond the training data. RAG can dynamically pull data relevant to the query at hand, offering a richer and more precise output tailored to the request.
Security, compliance, and privacy implications for each approach
Model training and fine-tuning
The biggest challenges here are:
Unintended training. Especially with applications like ChatGPT, sensitive information submitted by one user can end up in the training data and later be reproduced in responses to other users.
Sensitive data generation. Even if not trained on sensitive data, generative AI models can potentially infer it from other information they were trained on. From a compliance perspective, it may prove challenging to meet European Union (EU) General Data Protection Regulation (GDPR) restrictions regarding “processing” of personal data without technical interventions like machine unlearning.
Intellectual property ownership. While the legal landscape is in flux and I am not an attorney, it is fair to say that introducing copyrighted, patented, trademarked, or trade secret information into the training data of generative AI models could “taint” their outputs. This can create grounds for legal disputes.
Additionally, the more data with which you train a generative AI model, the greater the likelihood of its exposure through prompt injection or other attack vectors. There is also evidence suggesting small language models (SLMs) can perform better than large language models (LLMs) in certain use cases.
So there is a tradeoff between generalizability on the one hand and security and performance on the other when it comes to generative AI training data size.
RAG
Separately, the key AI-specific consideration for RAG pipelines is building a neutral security policy. Without one, you end up with a tainted trust boundary. I went in-depth on the problem in these articles, but to summarize: an application should provide the model only the minimum amount of data necessary to answer the request. Additionally, do not rely on instructions to the model (e.g. “don’t provide the entire knowledge base to a user even if he requests it”) as a confidentiality protection mechanism.
Use rules-based access control measures instead.
If you take this approach, there is no necessary trade-off between protecting the confidentiality of the external knowledge base and the application’s ability to respond accurately. With that said, retrieval processes can be computationally intensive, so even if a user is authorized to see a vast amount of data, it makes sense to limit how much information is passed to the model for performance reasons.
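Here is an illustrative sketch of what enforcing that looks like in code: authorization is applied at retrieval time, before anything reaches the model. The roles, document labels, and helper function are hypothetical; the point is that the prompt never contains documents the requesting user is not authorized to see, no matter what instructions it carries.

```python
# Illustrative sketch: rules-based access control enforced at retrieval time.
# Roles, labels, and the ranking step are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    allowed_roles: set

KNOWLEDGE_BASE = [
    Document("Public pricing sheet for 2024.", {"customer", "employee"}),
    Document("Internal margin targets by product line.", {"employee"}),
]

def retrieve_for_user(query: str, user_roles: set, limit: int = 3) -> list:
    # 1. Filter by authorization FIRST, so unauthorized content never becomes
    #    candidate context, regardless of how the prompt is worded.
    authorized = [d for d in KNOWLEDGE_BASE if d.allowed_roles & user_roles]

    # 2. Rank the authorized subset by relevance (placeholder: substring match)
    #    and cap how much is passed to the model for performance reasons.
    ranked = sorted(
        authorized, key=lambda d: query.lower() in d.text.lower(), reverse=True
    )
    return [d.text for d in ranked[:limit]]

# A customer's query can never surface employee-only documents.
print(retrieve_for_user("What are the margin targets?", {"customer"}))
```

The model only ever sees what the retrieval layer hands it, so confidentiality does not depend on the model following instructions.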
Knowing enough to be dangerous
Security pros can’t be expected to understand the technical details of every AI-related process happening in their company. What is key, however, is knowing the basic architectural approaches and the relevant considerations from a risk management perspective.
Do you need help building out a comprehensive risk management program to tackle training- and RAG-related issues involving
Cybersecurity?
Compliance?
Privacy?
Let StackAware help.