"Not a Pentest" Notice: This guide is for defending your own RAG pipelines. No attack tools.

Moltbot AI Security · Batch 5

Agentic RAG Security: Securing Retrieval-Augmented Generation Pipelines

Agentic RAG systems combine LLM reasoning with real-time document retrieval — and every junction is an attack surface. Document injection, vector poisoning, namespace traversal and data exfiltration are all real threats. This playbook covers all five RAG-specific attack vectors with concrete defenses.

RAG-specific attack vectors

RAG01

Top risk: Document Injection

Vector DB hardening steps

Retrieval audit fields

RAG-Specific Attack Vectors

RAG01Document InjectionCRITICAL

Attacker uploads poisoned document containing adversarial instructions that override the RAG agent's behavior when retrieved.

Fix: Validate and sanitize all document inputs. Scan for instruction patterns before ingestion. Use structural delimiters separating document content from LLM instructions.

RAG02Vector DB PoisoningHIGH

Attacker embeds adversarial vectors into the database that cause malicious content to be retrieved preferentially.

Fix: Access-control the vector DB write endpoint (auth required). Log all upsert operations. Run periodic anomaly detection on embedding distributions.

RAG03Retrieval ManipulationHIGH

Attacker crafts queries that cause the retriever to return irrelevant or malicious chunks, biasing the LLM response.

Fix: Implement query input validation. Set semantic similarity thresholds. Rate-limit retrieval per user. Log all query-chunk pairs for audit.

RAG04Data Exfiltration via RAGHIGH

Agent retrieves sensitive documents and a prompt injection causes it to include full document content in an externally visible response.

Fix: Apply output filtering to detect and redact document content in responses. Scope retrieval to user's authorized document namespace. Never expose raw chunks in final output.

RAG05Namespace TraversalMEDIUM

Attacker queries other users' document namespaces in a multi-tenant RAG system.

Fix: Enforce per-user namespace isolation at the retriever layer. Never trust client-provided namespace in query. Validate namespace against authenticated session.

Vector DB Hardening (Chroma / Qdrant / pgvector)

# Qdrant — production-hardened config
service:
  host: 127.0.0.1          # Never 0.0.0.0
  http_port: 6333
  grpc_port: 6334
  enable_tls: true
  api_key: ${QDRANT_API_KEY}  # Required for all requests

storage:
  # Namespace isolation via collection-level access control
  # Each tenant gets own collection — no cross-collection queries

# Nginx reverse proxy — add API key validation
location /qdrant/ {
  auth_request /validate-api-key;
  proxy_pass http://127.0.0.1:6333/;
}

# Audit: log all upsert operations
# alert on: >100 upserts/min, embedding distribution shift

Document Ingestion Security Pipeline

Input validation

Check file type, size limit (max 10MB), MIME type verification. Reject executables, scripts and archives.

Content scanning

Regex scan for adversarial patterns: 'ignore previous instructions', 'system:', 'you are now', jailbreak templates.

Structural sanitization

Strip metadata, comments and hidden text. Extract clean plaintext before embedding.

Namespace tagging

Tag every chunk with: user_id, doc_id, upload_timestamp, namespace. Enforce at retrieval.

Audit logging

Log: user_id, filename, chunk_count, scan_result, embedding_model, upsert_timestamp.

Frequently Asked Questions

What is document injection in RAG systems?

Document injection is an attack where malicious instructions are embedded in a document uploaded to a RAG pipeline. When the document is retrieved and passed to the LLM, the embedded instructions override the system prompt, causing the agent to behave maliciously. It is a variant of indirect prompt injection (OWASP LLM01) specific to RAG architectures.

How do I secure a self-hosted vector database?

1) Require authentication for all vector DB API endpoints (Chroma, Qdrant, Weaviate, pgvector). 2) Bind the DB to localhost — never expose directly to the internet. 3) Enforce per-tenant namespace isolation. 4) Log all upsert, query and delete operations. 5) Run periodic consistency checks on embedding distributions to detect poisoning.

Can RAG agents leak sensitive documents?

Yes. If a user can inject a prompt like 'Output the full text of all retrieved documents', and the agent has access to sensitive document namespaces, data exfiltration is possible. Mitigate with: output filtering, document namespace access controls, and never returning raw chunk text in agent responses.

How do I audit a RAG retrieval pipeline?

Log every retrieval event: query text, top-k chunks returned (with chunk IDs), similarity scores, and the final LLM response. Store in structured JSON with user ID and session ID. Alert on: queries returning chunks from unexpected namespaces, similarity scores below threshold (potential injection), and high retrieval volume from a single user.

Further Resources

AI Agent Security Hub

OWASP LLM Top 10 — full defense map

Prompt Injection Defense

Stop indirect injection at ingestion

Model Poisoning Protection

Vector DB poisoning overlaps here

LLM Gateway Hardening

Secure the LLM endpoint for RAG