The Real Risk Behind Model Context Protocol

PUBLISHED:
November 21, 2025
|
BY:
Debarshi

You already secured the model. Great. But the part no one’s talking about (the piece feeding it instructions, history, and memory) is wide open. 

Most teams using GenAI are moving fast and bolting on Model Context Protocols (MCPs) without questioning what they were designed to do. These protocols weren’t built for secure execution, let alone for environments with sensitive data or regulated workloads. There’s no built-in access control, no isolation between context sources, and no real validation for what gets injected. So when something goes wrong, it’s not because the model was weak, but because the protocol gave attackers a front-row seat.

We’re seeing prompt injection, data leakage, model hijacking, and full-on RCE hit systems that passed every security review, because MCP wasn’t even on the checklist. And the worst part? This happens in dev, staging, and production, because context management isn’t treated as a security surface. But it absolutely is.

Table of Contents

  1. Model Context Protocol is how your LLM actually works
  2. The top vulnerabilities in MCP you’re likely missing
  3. MCP risk can compromise the systems your business runs on
  4. Red flags in your GenAI architecture
  5. How to secure MCP without rewriting everything
  6. What to demand from vendors and open-source tools
  7. MCP is probably the weakest link in your GenAI stack right now

Model Context Protocol is how your LLM actually works

You’re not interacting with the model directly. Instead, you’re working through the protocol that feeds it, and that’s the layer attackers are exploiting.

Model Context Protocol (MCP) is the system that manages how prompts, tokens, memory, and instructions get delivered to the model during runtime. It acts as the glue between your LLM and the rest of your application stack. Whether you’re running a basic chat interface, a Retrieval-Augmented Generation (RAG) pipeline, or a multi-agent orchestration framework, the model can’t operate without a tightly managed stream of context. That stream is controlled by MCP.

Here’s what MCP typically handles under the hood:

  • Tokenized inputs, including system prompts, user instructions, prior messages, memory objects, and functions.
  • Context windows that define how much the model can “see” at once, including how information is truncated, summarized, or rolled forward between calls.
  • State management across turns in a conversation, so the model behaves like it remembers context, even across stateless API calls.

In RAG systems, MCP is the layer that injects the retrieved documents into the prompt. In agent frameworks, it handles the step-by-step outputs that get routed back into planning and execution. In fine-tuned enterprise interfaces, it governs how user roles, data access levels, and workflows are dynamically translated into model interactions.

This protocol layer sits between the raw model and the application logic. It parses signals, shapes memory, and decides what the model pays attention to. And it rarely has any security controls in place.

Attackers are going straight for this layer, because once they get into the context stream, they can:

  • Inject hostile prompts that get re-used across turns or agents.
  • Manipulate memory so the model behaves based on attacker-controlled state.
  • Slip sensitive data into shared buffers and extract it later through crafted inputs.
  • Exploit dynamic tool execution or plugin routing to trigger actions, run code, or access internal systems, especially when those outputs are passed without validation.

All of this happens outside the model. The base LLM might pass every safety test, and the app might have basic sanitization in place, but once you let external input flow into MCP-managed context, the threat surface shifts completely.

We’ve seen teams roll out RAG pipelines where document fetchers drop raw text directly into the prompt. We’ve seen agent loops reuse outputs from unauthenticated users across parallel sessions. We’ve seen devs wrap open-source orchestration layers around LLMs and assume that because the model is locked down, the system is safe.

It’s not.

MCP is the part making real-time decisions about what the model sees and how it behaves. And in most orgs, it’s treated like plumbing instead of a critical control surface that it is.

Securing this layer means applying the same rigor you’d use anywhere else in your stack:

  • Enforce strict validation and access control for every context input, including memory, history, and function results.
  • Separate user inputs from system prompts with hardened boundaries. Never blend these streams in the same buffer.
  • Trace how context is built, modified, and reused across turns or agents, and log every interaction that modifies it.
  • Lock down toolchains that can be triggered from model outputs, especially where model responses are mapped to real-world actions.

You don’t need to guess where the risk is coming from. You can see it in every place context flows without boundaries or audit.

The top vulnerabilities in MCP you’re likely missing

Your model gets compromised through the protocol feeding it instructions, memory, and user inputs in real time. And most security teams aren’t reviewing that layer at all.

These are the most critical MCP flaws we’re seeing in live deployments, across internal tools, customer-facing apps, and production RAG systems. They’re common, high-impact, and almost always missed in standard LLM security reviews.

Context injection through unsanitized input

Attackers don’t need to jailbreak the model when they can just poison the input stream. The risk starts when user inputs are appended directly to prompt templates without structured encoding, isolation, or boundaries. Once included, these inputs influence how the model interprets the entire context, especially in systems that reuse history, support multi-turn sessions, or preserve memory across user actions.

This gets worse in agent or orchestration frameworks where model outputs are looped back into the system as inputs for the next step. When previous context isn’t properly filtered, a crafted instruction can persist across multiple inference steps or alter downstream execution logic. Teams that treat prompts as strings instead of structured objects are especially vulnerable.

Memory leakage through overloaded or persistent context buffers

LLMs operate within strict context limits, and most implementations rely on sliding windows or summarization to manage token budgets. But systems often over-prioritize convenience by preserving full histories, cached memory, and function logs in a single context payload.

In practice, this results in sensitive or stale data leaking into prompts:

  • Session memory that includes internal configuration or admin-only instructions.
  • Prior tool outputs that were meant for back-end use only.
  • Summarized content that fails to redact or scope properly.

In multi-user or multi-tenant deployments, a lack of isolation between sessions increases the chance of accidental exposure. Without scoped memory boundaries, context can persist far longer than intended, across users, features, or environments.

Untrusted retrieval in RAG pipelines

Retrieval-Augmented Generation often pulls documents from vector stores, external APIs, or internal KBs. The issue is that these documents are rarely validated or sanitized before prompt assembly. It’s common to see RAG components:

  • Insert raw HTML or Markdown directly into prompts without escaping tags or scripts.
  • Use relevance scoring without any access control or classification enforcement.
  • Pull from user-submitted or publicly editable sources (e.g. Confluence, Slack, wikis).

This gives attackers the ability to inject arbitrary content into prompts simply by influencing what gets indexed or retrieved. And because retrieval is often detached from inference logs, those injections are hard to trace after the fact.

Missing authentication and scoping on context contributors

Most prompt assembly logic accepts inputs from multiple services: user interfaces, agents, tools, plugins, and workflow engines. In real deployments, there’s rarely any verification of who is allowed to inject context, or under what conditions.

We’ve seen systems where:

  • Backend services override system prompts without being authenticated.
  • Plugins manipulate instructions with elevated privileges.
  • Tool outputs are inserted into the prompt without session binding or trust-level tagging.

This creates a path for lateral movement, where one compromised input path allows broader influence over model behavior. It also makes it impossible to apply controls like role-based prompt construction or environment-based access restrictions.

No forensic visibility into full inference context

When something breaks (or worse, when an attacker manipulates behavior), teams need to know exactly what was passed to the model at the time of inference. But most systems only log the top-level prompt or user input, ignoring the full assembly. That includes:

  • System instructions that were active at the time.
  • Retrieved documents that contributed to the response.
  • Memory or state data injected from previous turns or agents.
  • Function results and tool outputs that were appended during orchestration.

Without structured and versioned logging of these components, it becomes impossible to reproduce outputs, investigate incidents, or demonstrate compliance. This is a gap in visibility that attackers can exploit to move through the system undetected.

These vulnerabilities persist because the protocol layer looks like an implementation detail. But in a GenAI system, that layer is the actual runtime environment, and it’s exposed by design. When you let models act on real data, integrate with business logic, and drive workflows, MCP becomes a full-blown attack surface. You either review it like you would any other execution layer, or you leave it open to abuse.

MCP risk can compromise the systems your business runs on

When you leave the context layer unprotected, you're not just exposing the model, but also the systems, tools, and environments that trust that model to behave safely. And when that trust gets broken, the consequences hit availability, integrity, and confidentiality in ways that most security reviews aren’t even looking for.

The risk actually begins with how you manage what the model sees, remembers, and acts on.

Context hijacking leads to unauthorized execution

In most LLM agent setups, like those built with LangChain or similar orchestration frameworks, the model doesn’t just generate text. It makes decisions. It picks tools, triggers actions, and moves workflows forward. Those steps rely on trusted context to operate correctly. When that context is compromised, so is every downstream decision.

Here’s how this plays out: an attacker injects prompts that alter the agent’s reasoning chain, redirecting the model to execute a tool it wasn’t supposed to touch, or sending manipulated parameters into an external system. If your tools are wired to deploy, update configurations, or trigger workflows automatically, those actions go through, because the context said they were valid. And because everything is piped through what looks like an expected model response, there’s no alert until something goes wrong.

Data spills happen through context reuse and memory bleed

MCP implementations that preserve conversation history or memory across turns often do so without tightly binding that context to a single user or session. That creates scenarios where sensitive data from one interaction shows up in another, especially in environments where users share infrastructure or where agents interact across multiple projects. Common causes include:

  • Memory buffers reused across threads or sessions without scoping.
  • Logging systems that capture full prompt history without redaction.
  • Prompt replays during retries, fine-tuning, or feedback loops that include stale inputs.

This creates exposure for customer data, internal configurations, and even security controls. And it happens quietly, because the model output still looks coherent, even when the context was corrupted.

Tool integration becomes a path to RCE

It’s increasingly common for GenAI systems to interface with internal tools, from CLI commands to API calls to code deployment systems. And many of those pipelines are designed to interpret model outputs as instructions. When you skip input validation or fail to constrain tool invocation, you open up execution risk. We’ve seen models:

  • Generate shell commands that are passed directly into CI environments.
  • Output structured data (like YAML or JSON) that gets parsed into configuration changes.
  • Call functions with elevated permissions in production environments.

An attacker who gains influence over context (even briefly) can use that position to generate payloads that execute downstream, especially in orchestrated agent setups where each output feeds the next step without enforced guardrails.

A CI assistant compromise shows how quickly this breaks containment

Picture a dev team that wires up a GenAI assistant to help manage CI workflows. The assistant reviews pull requests, manages build configurations, and recommends deployment actions. It uses context memory to track past builds, flags, and approvals, and it interfaces directly with the pipeline controller through authenticated APIs.

Now introduce a prompt crafted by a malicious contributor. The assistant parses the PR comment, pulls it into its reasoning loop, and due to insufficient filtering, updates its internal decision logic. The next time it runs, it pushes that PR forward, skips tests, and modifies deployment variables, all because the model was tricked into treating malicious context as a valid command path.

At this point, you’re not dealing with a failed code review, but with a live system that was reprogrammed through prompt manipulation, and because there’s no inference trace tied to the actual commands issued, you can’t prove where the compromise happened.

This is what happens when MCP vulnerabilities are ignored. The system continues to function, but the control plane underneath is no longer secure. Once attackers influence context, they gain access to decisions, data, and tools, often without triggering a single traditional security control. That’s the exposure. That’s why this isn’t just about LLM safety, but securing your real infrastructure.

Red flags in your GenAI architecture

A full audit is not always needed to spot the weak point in your GenAI deployment. Just look closely at how context is assembled, how tools are invoked, and how memory is scoped. These red flags show where protocol-layer exposure turns into real operational risk.

Using orchestration frameworks with default security behavior still active

LangChain, Semantic Kernel, and LlamaIndex make it easy to compose tools, memory, and prompts. But ease of use often means insecure defaults. In real deployments, we’ve seen:

  • Tool invocation left open, allowing any action to be called without verification.
  • Memory modules that persist across sessions without scoping or TTL.
  • Prompt templates shared between agents, users, and tasks without origin control.
  • Context builders that pull from multiple services but don’t isolate roles or sources.

These defaults aren’t built to withstand adversarial input. Without review and customization, they turn the framework into a wide-open surface for context injection, leakage, or command chaining.

Injecting raw user input into prompts without structured filtering

This is the most common and most damaging mistake. Raw input from UI forms, chat interfaces, or API requests gets passed directly into the prompt buffer. When that happens:

  • Special characters or formatting tokens alter prompt structure.
  • Malicious content persists across sessions when memory is enabled.
  • Prompt injection payloads overwrite instructions or reroute agent logic.

Filtering is all about enforcing structure, like escaping user-controlled text, validating schema compliance, and separating untrusted data from model-level directives.

No logging of full inference context

If your logs don’t capture everything the model saw (not just the user input),  you’re missing critical audit data. That includes:

  • System prompts and instructions in use at the time of inference.
  • Retrieved documents or RAG inputs included in the context.
  • Tool outputs or memory values appended to the prompt chain.
  • Session state or user metadata influencing context construction.

Without this, your incident response is broken. You can’t investigate behavior, reproduce outputs, or verify which part of the context triggered a fault or leak.

Secrets or static credentials embedded in prompts

This happens more often than teams realize. Developers embed internal tokens, API keys, or auth headers into system prompts or tool logic that gets included in the prompt window. We’ve seen:

  • Function-calling prompts with embedded auth tokens for downstream APIs.
  • YAML or JSON blobs injected into context with sensitive config fields.
  • Hardcoded credentials wrapped into tool instructions for LLM routing logic.

Once those secrets are in the prompt, they can leak through model outputs, logs, or prompt injection. And since context is dynamic, you may not even know when they were exposed.

No access controls on tool execution or prompt assembly

In many agent-based pipelines, the model decides which tool to call, and how. Without guardrails, this gives attackers control over execution. Red flags include:

  • Tool chains that don’t enforce authentication between steps.
  • No separation between read-only tools and write or deployment tools.
  • Prompt builders that pull from multiple microservices without tagging source or trust level.

When context is treated as a shared buffer, any component can insert instructions, override priorities, or trigger behavior, including ones the model was never meant to access.

Prompt logic treated like static config instead of executable code

Too many teams version prompts in random YAML files or string templates. There’s no review, no change tracking, and no test coverage. This leads to:

  • Prompt drift across environments with inconsistent behavior.
  • Silent failures when prompt updates break tool compatibility.
  • Ad hoc prompt edits that introduce risky behavior without formal code review.

Prompts define behavior. That makes them code, and they need to be treated with the same level of rigor as any other logic in your stack.

Memory used without expiration or isolation

Persistent memory can introduce serious cross-session and cross-user risks when it isn’t scoped. Issues include:

  • Memory that survives indefinitely, causing stale or poisoned context to persist.
  • Shared memory across users or agents, leading to data leakage.
  • No tagging or isolation across projects, environments, or session types.

Once something lands in memory, especially something malicious or sensitive, it’s difficult to detect and harder to clean up. Context doesn’t forget unless you force it to.

Each of these red flags points to an architectural decision that either wasn’t reviewed or wasn’t threat modeled. These aren’t niche risks or theoretical edge cases. They show up in live deployments, and they create paths for context manipulation, data exposure, and system compromise. You need to start asking who controls context, how it’s assembled, and where the boundaries are supposed to be. If those answers aren’t clear, then you’re already exposed.

How to secure MCP without rewriting everything

What you need is control. Most MCP risks come from how context is assembled, stored, and executed. And you can fix a lot of that with targeted changes. These steps that we will talk about are something that you can apply right now to reduce exposure across staging and production.

Step 1: Validate and scope all prompt inputs before assembly

Before anything touches the model, it needs to be sanitized and structured. That applies to user messages, retrieved documents, tool outputs, and memory inserts.

What to implement:

  1. Define schemas for every context element (user input, tool output, memory value) so inputs can be validated structurally and semantically.
  2. Escape formatting tokens, quotation marks, brackets, and delimiters in user-controlled text to prevent injection into prompt templates.
  3. Attach origin metadata to each input, like user ID, source type, and timestamp, and ensure that metadata is part of the prompt or logged for traceability.
  4. Block inputs that exceed defined structure, length, or character set boundaries.

All context builders should enforce this as part of pipeline logic.

Step 2: Set strict token budgets per context source and enforce them at runtime

Rather than relying on a single max-token limit, define granular budgets across all context contributors.

What to implement:

  1. Fixed caps for user input (e.g., max 500 tokens), retrieved content (e.g., 1000 tokens total), system instructions (e.g., 800 tokens), and memory (e.g., 300 tokens).
  2. Trimming policies that prioritize recent or higher-trust inputs when budgets are exceeded.
  3. Rejection paths for prompts that attempt to pack too many low-trust inputs into the window.
  4. Model output limits (e.g., max 512 tokens) with truncation or post-processing for safety.

This ensures that the model always receives a predictable and controlled context window, even under attack conditions.

Step 3: Log and version the full context at every inference call

Every model call should produce a reproducible snapshot of the exact context that was passed to the LLM.

What to implement:

  1. Centralized logging that captures the assembled prompt, system instructions, user input, memory state, tool outputs, and RAG inserts.
  2. Hashing or signing of the context snapshot so it can be verified later for integrity.
  3. Version tags for system prompt templates and orchestration logic to correlate behavior across environments.
  4. Audit trail fields that record inference ID, model version, and environment (e.g., staging, prod).

This gives you forensic visibility and operational clarity, and avoids surprises when things go sideways.

Step 4: Sign and lock your system prompts and instruction logic

System prompts define how the model behaves. They must be immutable at runtime unless explicitly updated through a secure pipeline.

What to implement:

  1. Sign prompt templates using cryptographic signatures (HMAC, RSA, etc.) to ensure they haven’t been altered before inference.
  2. Separate prompt templates into secured files or configuration objects, stored in read-only locations.
  3. Enforce a signing check before loading any prompt or instruction set into the model pipeline.
  4. Use prompt version IDs and change logs to track who made changes and when, with rollback capability.

This prevents attackers (or internal devs) from modifying critical behavior paths without detection.

Step 5: Deploy enforcement controls at the protocol layer, before and after inference

Most teams only validate user input. You need to validate the entire context chain and the model output before any action is taken.

What to implement:

  1. Pre-inference guards: reject prompts with unauthorized token patterns, malformed structure, or unexpected memory content.
  2. Post-inference guards: validate model responses before passing them to tools or downstream systems. Check for command structure, embedded secrets, or unsafe tokens.
  3. Execution policy enforcement: tool calls must match allowed schemas, have proper authentication context, and be scoped to the correct environment and role.
  4. Isolation enforcement: agents and toolchains should only be able to access the minimum context they need, scoped to the session and user they belong to.

This is where you stop inference from turning into compromise.

You don’t need to build this all in one sprint, but each of these steps closes a real attack path. MCP is your application’s control plane. And the longer it stays unreviewed, the more exposed your systems become. Secure the context. Secure the behavior. Secure the outcomes.

What to demand from vendors and open-source tools

It’s not enough to ask whether a platform uses GenAI securely. You need to know how it manages prompt context, memory, and tool execution, because that’s where the control plane lives. Vendors and open-source tools that rely on MCPs should be ready to answer detailed questions. If they can’t, that’s your signal the stack wasn’t built with security in mind.

These are the questions you should be asking, and the features they should be able to prove.

Ask your vendors and maintainers the right questions

Don't settle for general claims like we sanitize inputs or we use standard LLM guardrails. Push into the protocol layer and make them show their work.

Start with:

  • How is prompt history stored and isolated across sessions and users?
  • What happens when a user injects a function call or malformed input?
  • Can I view and audit every change made to the context, including tool output, memory, and system prompts?
  • How do you tag and track context origin at inference time?
  • What protections are in place to prevent tool misuse or execution abuse from inside a prompt?
  • Do your context assembly and prompt templates support signing and version control?

These are baselines for any platform that claims it handles sensitive GenAI workflows.

Don’t deploy anything that lacks the following security features

Any system that constructs or manages prompts on your behalf should be held to the same standard you’d apply to code execution or API integration. These are the features that matter most:

Full audit logging of prompt assembly

The platform must log the complete prompt window passed to the model, including system prompts, user input, tool output, memory, and retrieved content. That log must be versioned, immutable, and tied to a traceable inference ID.

Input and output filtering with structural enforcement

Inputs from external sources (users, tools, documents) must be validated against a schema. Output that feeds into execution paths must be parsed and inspected before triggering tools or updates.

Role-based access controls for context contributors

Every component that contributes to prompt assembly, such as user input, memory, tool output, should be scoped to a role or permission boundary. There should be no anonymous or global context inserts.

Prompt lineage and version tracking

You need visibility into which prompt template or routing logic was used at every step. This includes template version IDs, changes made, and who approved the update.

Control over memory scoping and retention

The platform should allow you to enforce memory boundaries, set TTLs, and isolate memory between users, sessions, and environments.

Context override protections and sandboxing

There should be safeguards that detect when a context element tries to override or impersonate a system-level instruction. The system must isolate user input from trusted logic and flag deviations before inference.

These are the capabilities that separate secure platforms from those still in early prototyping mode. If a vendor can’t show you how context is stored, filtered, and traced, they’re not ready to run GenAI in environments that handle sensitive data, execute actions, or operate at scale.

Context is infrastructure. Start vetting it like you would any privileged system, and don’t move forward until the protocol layer is just as hardened as the model itself.

MCP is probably the weakest link in your GenAI stack right now

Security leaders tend to focus on model safety, but that's not where most real-world compromises begin. The underlying protocol stack, such as how context is handled, how tools are called, how memory persists, is often treated like glue code when it should be treated like the control layer it is.

That’s the change that needs to happen. MCP is part of your infrastructure. The sooner your security reviews treat it that way, the less cleanup you'll be doing later.

What’s coming next is more interconnectivity. Agent stacks are becoming more autonomous. Toolchains are getting more powerful. And decisions that used to be supervised will soon be entirely model-driven. That makes securing the protocol path the only sustainable way to scale GenAI without inviting compromise.

Don’t wait until these systems are embedded into CI/CD, customer workflows, or production deployments. By then, the blast radius is already too big.

we45’s AI Security Services give your team the expertise and testing depth to evaluate how your GenAI systems handle context. From prompt injection exposure to agent-based attack paths, our model context protocol assessments help you secure the logic that actually runs your AI workflows. Learn more here: we45 Model Context Protocol Security Assessment.

FAQ

What is Model Context Protocol (MCP) in GenAI systems?

Model Context Protocol (MCP) is the system that manages how prompts, tokens, memory, and instructions are delivered to a Large Language Model (LLM) during runtime. It acts as the "glue" between the LLM and the rest of the application stack, controlling the stream of context that the model sees, remembers, and acts on.

Why is Model Context Protocol a security risk?

MCP is a critical attack surface because it rarely has built-in security controls like access control or isolation. Attackers can exploit this layer to inject hostile prompts, manipulate memory, leak sensitive data, and exploit dynamic tool execution, even if the base LLM itself is secure.

What are the top vulnerabilities found in Model Context Protocol implementations?

The most critical vulnerabilities include: Context injection through unsanitized input: Raw user inputs are appended directly to prompt templates without structured encoding or isolation, allowing attackers to poison the input stream. Memory leakage: Persistent context buffers or shared memory between users/sessions can cause sensitive or stale data to leak into new prompts. Untrusted retrieval in RAG pipelines: Documents pulled from external or user-submitted sources are often inserted into the prompt without validation or sanitization, allowing arbitrary content injection. Missing authentication and scoping on context contributors: A lack of verification on who is allowed to inject context (e.g., backend services, plugins) creates paths for lateral movement. No forensic visibility: Systems fail to log the full assembled prompt (including system instructions, memory, tool outputs), making incident investigation impossible.

How can an MCP compromise lead to unauthorized system execution?

In LLM agent setups, a compromised context can hijack the agent’s reasoning chain. This redirects the model to execute tools it shouldn't touch or sends manipulated parameters into external systems. If these tools control deployment, configuration, or workflows, unauthorized actions like modifying deployment variables or triggering malicious code can occur. This is a path to a system compromise like Remote Code Execution (RCE).

What are the key architectural "red flags" that indicate an MCP security problem?

Common red flags include: Using orchestration frameworks (like LangChain, Semantic Kernel) with insecure default security behavior still active. Injecting raw user input into prompts without structured filtering or escaping. No logging of the full inference context (system prompts, retrieved documents, memory, tool outputs). Embedding secrets or static credentials directly into system prompts or tool logic. Lack of access controls on tool execution or the prompt assembly process. Treating prompt logic as static configuration instead of version-controlled, executable code. Using persistent memory without expiration (TTL) or isolation between users and sessions.

What are the immediate steps to secure Model Context Protocol without a full system rewrite?

You can reduce exposure with targeted changes focused on control: Validate and scope all prompt inputs: Enforce schemas, escape formatting tokens in user input, and attach origin metadata before assembly. Set strict token budgets: Define granular token limits for each context source (user input, RAG, memory) and enforce trimming/rejection policies at runtime. Log and version the full context: Capture a reproducible, auditable snapshot of the entire assembled prompt (all components) at every inference call. Sign and lock system prompts: Use cryptographic signatures and version IDs to ensure system prompts are immutable at runtime unless securely updated. Deploy enforcement controls: Implement Pre-inference guards to reject malicious prompts and Post-inference guards to validate model outputs before they trigger tools or downstream actions.

What should I ask vendors about their GenAI security practices?

Do not accept general claims. Demand proof of features and ask specific questions about the protocol layer: How is prompt history stored and isolated across sessions and users? How can I view and audit every change made to the context (including memory and tool output)? What protections prevent tool misuse or execution abuse from inside a prompt? Do context assembly and prompt templates support cryptographic signing and version control? How do you enforce memory boundaries, TTLs, and isolation between environments?

Debarshi

I’m Debarshi vulnerability researcher, reverse engineer, and part-time digital detective. I hunt bugs, break binaries, and dig into systems until they spill their secrets. When I’m not decoding code, I’m exploring human psychology or plotting the perfect football pass. Fueled by caffeine and curiosity, I believe every system has a weakness you just have to be smart enough to find it.
View all blogs
X