Cisco AI Defense

Secure your RAG applications

Enable AI teams to supercharge LLM applications with your data.

What is retrieval-augmented generation (RAG)?

LLMs typically excel at providing rapid responses to general knowledge inquiries because they are trained on vast arrays of base data. To adapt these pre-trained models for more specific business applications, organizations will often leverage additional data sources to provide deeper knowledge and important supporting context.

RAG is the most common technique used to enrich AI applications with relevant information by connecting LLMs to additional data sources. Vector databases enable RAG applications to leverage both structured and unstructured data such as text files, spreadsheets, and code. Depending on the context necessary, these datasets may include external resources or internal repositories with sensitive customer and business information.
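To make the retrieval step concrete, here is a minimal sketch of the idea, not a production design. It assumes a toy bag-of-words "embedding" and an in-memory document list in place of a learned embedding model and a real vector database; the names `DOCUMENTS`, `embed`, `retrieve`, and `build_prompt` are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge base; a real deployment would use a vector database.
DOCUMENTS = [
    "Refunds are processed within 5 business days of approval.",
    "Support is available by phone on weekdays from 9am to 5pm.",
    "Enterprise plans include single sign-on and audit logging.",
]

def embed(text: str) -> Counter:
    """Toy embedding: word counts (an assumed stand-in for a learned embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Prepend the retrieved context to the user's question before calling an LLM."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Calling `build_prompt("How long do refunds take?")` would place the refund policy sentence ahead of the question, which is the "augmentation" in retrieval-augmented generation.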

Compared to fine-tuning, RAG is often a faster, more flexible, and more cost-effective option for adapting an LLM for specific purposes.

What are the risks of RAG applications?

Applications that leverage RAG remain susceptible to a number of safety and security vulnerabilities at various points in their lifecycle.

Even before development begins, vulnerabilities can exist within components of the AI supply chain. Open-source models and training datasets can be compromised from the start, creating opportunities for adversaries to manipulate model outputs, execute arbitrary code, distribute malware, and more. These risks extend to the additional datasets connected to a RAG application, too.

Once a RAG application is deployed, new safety and security risks will be introduced predominantly through harmful inputs and model outputs. Adversaries can craft prompts aimed at maximizing computational resource consumption to drive up costs and impact model performance for other users. They can also attempt prompt injection to glean sensitive information from the connected vector database.
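One common way to blunt resource-consumption attacks is to cap prompt size and request rate before a prompt reaches the model. The sketch below is an illustration under stated assumptions, not a description of any product: the limits `MAX_PROMPT_CHARS`, `MAX_REQUESTS`, and `WINDOW_SECONDS` and the function `admit` are hypothetical, and the caller supplies the current time so the logic is easy to test.

```python
from collections import defaultdict, deque

MAX_PROMPT_CHARS = 4000   # hypothetical per-request size cap
MAX_REQUESTS = 3          # hypothetical number of requests allowed per window
WINDOW_SECONDS = 60.0     # hypothetical sliding-window length

# Timestamps of recently admitted requests, per user.
_history: dict[str, deque] = defaultdict(deque)

def admit(user_id: str, prompt: str, now: float) -> bool:
    """Return True if the request fits the size cap and the user's rate limit."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return False
    recent = _history[user_id]
    # Drop timestamps that have aged out of the sliding window.
    while recent and now - recent[0] >= WINDOW_SECONDS:
        recent.popleft()
    if len(recent) >= MAX_REQUESTS:
        return False
    recent.append(now)
    return True
```

A character cap is only a rough proxy for model cost; a real guard would more likely count tokens with the model's own tokenizer.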

While RAG can help improve model accuracy and relevance, AI applications are still susceptible to producing results that are incorrect or harmful, or that violate data security and privacy requirements. These outputs can be driven by user prompting or can arise entirely inadvertently.

Each of these vulnerabilities has the potential to impact the availability of your services, the privacy of your employees and customers, the security of your organization, and the trustworthiness of your business. They can also slow, or even block, an organization's efforts to extract value from generative AI.

Learn about the critical points where your RAG applications require security measures. Our vendor-agnostic AI Security Reference Architectures provide secure design patterns and practices for teams developing such GenAI applications.

Mitigate RAG risk with Cisco

Cisco AI Defense resolves the safety and security concerns that hold back RAG applications in minutes and enables the enterprise to meet safety and security standards. Our solution addresses safety and security risks at every point in the AI lifecycle, from supply chain and development through production. Vector database scanning ensures that connected knowledge bases aren't compromised, while our guardrails provide real-time input and output validation to intercept harmful content and sensitive data leakage on either side.

Stop leakage of sensitive information

Leakage of sensitive data caused by prompt injection attacks, or even inadvertent model behavior, is a major concern for RAG applications. The addition of company and customer information provides the rich context that makes these applications so powerful, but new measures are needed to prevent leakage of data stored in the context, model, or application interface. AI Defense Runtime Protection solves this problem by detecting and blocking both data exfiltration attacks in LLM prompts and personally identifiable information (PII) fields in LLM responses.
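As a generic illustration of output-side filtering, the sketch below redacts two kinds of PII from a model response using regular expressions. It is not how AI Defense Runtime Protection works internally, which this page does not describe; the pattern set `PII_PATTERNS` and the function `redact_pii` are hypothetical, and real PII detection covers far more types and formats than these two.

```python
import re

# Illustrative patterns only (assumed for this example).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_pii(response: str) -> tuple[str, list[str]]:
    """Replace matched PII with a placeholder and report which types were found."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(response):
            found.append(label)
            response = pattern.sub(f"[{label} REDACTED]", response)
    return response, found
```

For example, `redact_pii("Contact jane.doe@example.com, SSN 123-45-6789.")` returns the text with both values replaced, along with the list `["EMAIL", "US_SSN"]`, which an application could log or use to block the response outright.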

Use public data with confidence

Indirect prompt injection attacks can be covertly planted in a document, website, or other resource for an LLM to ingest. Such an attack can direct the model to expose data or take malicious action such as the distribution of a phishing link. RAG applications that have access to external content must deploy measures to prevent indirect prompt injection attacks. AI Defense Runtime Protection detects and blocks indirect prompt injection attacks before they can reach the model, enabling you to safely use public data to enrich your RAG applications.
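To show where such a check sits in a RAG pipeline, the sketch below screens retrieved chunks for instruction-like text before they are added to the model's context. It is a deliberately simple heuristic and an assumption of this example: the phrase list `SUSPICIOUS` and the function `filter_retrieved` are hypothetical, and production detectors generally rely on trained classifiers rather than a fixed list of phrases.

```python
import re

# Hypothetical phrase list, for illustration only.
SUSPICIOUS = [
    r"ignore (?:all |any |the )?(?:previous|prior|above) instructions",
    r"disregard (?:the )?(?:system|previous) prompt",
    r"send .* to https?://",
]
PATTERN = re.compile("|".join(SUSPICIOUS), re.IGNORECASE)

def filter_retrieved(chunks: list[str]) -> tuple[list[str], list[str]]:
    """Split retrieved chunks into those safe to pass on and those blocked."""
    safe, blocked = [], []
    for chunk in chunks:
        (blocked if PATTERN.search(chunk) else safe).append(chunk)
    return safe, blocked
```

Only the `safe` list would be placed in the prompt; the `blocked` list can be logged for review.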

Ensure responses are consistently accurate

Factual inconsistency is an issue common to LLMs in which the model's generated output does not match the information provided in the context. This can present a significant safety risk for RAG applications, which deliberately add context to create relevant responses. To solve this problem, AI Defense Runtime Protection checks whether the model output is consistent with the user's query and the content in the connected vector database, thereby ensuring the accuracy and relevance of the response.
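The general idea of a consistency check can be sketched with a crude lexical measure: what fraction of the response's content words also appear in the retrieved context. This is an assumed stand-in for illustration, not the method AI Defense uses; real checks typically use entailment or similar models, and the names `STOPWORDS`, `grounding_score`, and `is_grounded` and the 0.6 threshold are hypothetical.

```python
import re

# Small, assumed stopword list for this example.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "and", "within", "for", "on"}

def content_words(text: str) -> set[str]:
    """Lowercased word set with common function words removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def grounding_score(response: str, context: str) -> float:
    """Fraction of the response's content words that also occur in the context."""
    resp = content_words(response)
    if not resp:
        return 1.0
    return len(resp & content_words(context)) / len(resp)

def is_grounded(response: str, context: str, threshold: float = 0.6) -> bool:
    """Flag responses whose overlap with the context falls below the threshold."""
    return grounding_score(response, context) >= threshold
```

A word-overlap score can miss a response that changes a single number or name, which is why it should be read as a placement example rather than a reliable detector.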

The enterprise choice for AI security

Close the AI security gap and unblock your AI transformation with comprehensive protection across your environment.

Request a demo
