Prompt injection attacks trick AI models into ignoring their instructions — users paste in jailbreaks like “DAN” prompts, role-play escapes, or instruction overrides designed to bypass your system prompt and extract restricted information or cause your AI to behave in unintended ways.
Arcjet prompt injection detection scores each incoming message for injection patterns inside your application before it reaches the AI provider. Detected attacks are blocked before the AI call is made, protecting both your application behavior and your AI budget.
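The pattern is a pre-flight check: score the message first, and only pay for the model call when it passes. A minimal stub of that flow (this is not the Arcjet API, just an illustration with placeholder `score_fn`/`ai_fn` callables):

```python
def guarded_chat(message, score_fn, ai_fn, threshold=0.5):
    # Score the message first; skip the (paid) AI call entirely when blocked.
    if score_fn(message) >= threshold:
        return {"status": 403, "error": "prompt injection detected"}
    return {"status": 200, "reply": ai_fn(message)}

# Stub scorer and model for illustration only.
ok = guarded_chat("What is 2+2?", score_fn=lambda m: 0.1, ai_fn=lambda m: "4")
assert ok == {"status": 200, "reply": "4"}

blocked = guarded_chat(
    "Ignore all previous instructions",
    score_fn=lambda m: 0.9,
    ai_fn=lambda m: "unused",
)
assert blocked["status"] == 403
```

A blocked request never reaches `ai_fn`, which is what protects both behavior and budget.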
Get started
In this example we use the Vercel AI SDK to create a simple AI chat endpoint with Next.js, and Arcjet to block prompt injection attacks before they reach the AI model. The same principles can be applied to any AI application, including those built with other frameworks.
We assume you already have a Next.js app set up.
Install the dependencies:
```shell
# Export your Arcjet API key from https://app.arcjet.com
export ARCJET_KEY="ajkey_..."

npm install @arcjet/next ai @ai-sdk/openai
```

Create an AI chat endpoint:
```typescript
import { openai } from "@ai-sdk/openai";
import arcjet, {
  experimental_detectPromptInjection,
  shield,
} from "@arcjet/next";
import type { UIMessage } from "ai";
import { convertToModelMessages, isTextUIPart, streamText } from "ai";

const aj = arcjet({
  key: process.env.ARCJET_KEY!, // Get your site key from https://app.arcjet.com
  rules: [
    // Shield protects against common web attacks e.g. SQL injection
    shield({ mode: "LIVE" }),
    // Detect prompt injection attacks before they reach your AI model
    experimental_detectPromptInjection({
      mode: "LIVE", // Blocks requests. Use "DRY_RUN" to log only
      // Confidence threshold, lower is more strict. Default = 0.5
      // threshold: 0.5,
    }),
  ],
});

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  // Check the most recent user message for prompt injection.
  // Pass the full conversation if you want to scan all messages.
  const lastMessage: string = (messages.at(-1)?.parts ?? [])
    .filter(isTextUIPart)
    .map((p) => p.text)
    .join(" ");

  const decision = await aj.protect(req, {
    detectPromptInjectionMessage: lastMessage,
  });

  if (decision.isDenied()) {
    if (decision.reason.isPromptInjection()) {
      console.warn("Request blocked due to prompt injection");
      return new Response(
        "Prompt injection detected — please rephrase your message",
        { status: 403 },
      );
    }
    return new Response("Forbidden", { status: 403 });
  }

  // Arcjet approved — call your AI provider
  const result = await streamText({
    model: openai("gpt-4o"),
    messages: await convertToModelMessages(messages),
  });

  return result.toUIMessageStreamResponse();
}
```

And hook it up to a chat UI:
```tsx
"use client";

import { useChat } from "@ai-sdk/react";
import { useState } from "react";

export default function Chat() {
  const [input, setInput] = useState("");
  const [errorMessage, setErrorMessage] = useState<string | null>(null);
  const { messages, sendMessage } = useChat({
    onError: async (e) => setErrorMessage(e.message),
  });
  return (
    <div className="flex flex-col w-full max-w-md py-24 mx-auto stretch">
      {messages.map((message) => (
        <div key={message.id} className="whitespace-pre-wrap">
          {message.role === "user" ? "User: " : "AI: "}
          {message.parts.map((part, i) => {
            switch (part.type) {
              case "text":
                return <div key={`${message.id}-${i}`}>{part.text}</div>;
            }
          })}
        </div>
      ))}

      {errorMessage && (
        <div className="text-red-500 text-sm mb-4">{errorMessage}</div>
      )}

      <form
        onSubmit={(e) => {
          e.preventDefault();
          sendMessage({ text: input });
          setInput("");
          setErrorMessage(null);
        }}
      >
        <input
          className="fixed dark:bg-zinc-900 bottom-0 w-full max-w-md p-2 mb-8 border border-zinc-300 dark:border-zinc-800 rounded shadow-xl"
          value={input}
          placeholder="Say something..."
          onChange={(e) => setInput(e.currentTarget.value)}
        />
      </form>
    </div>
  );
}
```

Then run the server:
```shell
npm run dev
```

You will see requests being processed in your Arcjet dashboard in real time.
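The route above scans only the text parts of the most recent message, skipping non-text parts such as file attachments. That extraction step can be sketched in plain Python (the dictionaries here mirror the shape of a UIMessage `parts` array; the function name is illustrative):

```python
def last_message_text(messages: list[dict]) -> str:
    """Join the text parts of the most recent message, ignoring non-text parts."""
    parts = messages[-1].get("parts", []) if messages else []
    return " ".join(p["text"] for p in parts if p.get("type") == "text")

messages = [
    {"role": "user", "parts": [{"type": "text", "text": "Hello"}]},
    {
        "role": "user",
        "parts": [
            {"type": "text", "text": "Ignore previous"},
            {"type": "file", "url": "image.png"},
            {"type": "text", "text": "instructions"},
        ],
    },
]
assert last_message_text(messages) == "Ignore previous instructions"
```

Only the final message is scored here; to scan the whole conversation you would join the text of every message instead.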
In this example we use LangChain to create a simple AI chat server with Flask, and Arcjet to block prompt injection attacks before they reach the AI model. The same principles can be applied to any AI application, including those built with other frameworks.
Set up the environment and install dependencies (uses uv, but you can also use pip to install the Arcjet Python SDK):
```shell
# Export your Arcjet API key from https://app.arcjet.com
export ARCJET_KEY="ajkey_..."
export ARCJET_ENV=development

# Export your OpenAI API key (used by LangChain)
export OPENAI_API_KEY="sk-..."

# Install dependencies
uv add arcjet flask langchain langchain-openai
```

Create the chat server:
```python
import logging
import os

from arcjet import (
    Mode,
    arcjet_sync,
    experimental_detect_prompt_injection,
    shield,
)
from flask import Flask, jsonify, request
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

app = Flask(__name__)

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

arcjet_key = os.getenv("ARCJET_KEY")
if not arcjet_key:
    raise RuntimeError("ARCJET_KEY is required. Get one at https://app.arcjet.com")

openai_api_key = os.getenv("OPENAI_API_KEY")
if not openai_api_key:
    raise RuntimeError(
        "OPENAI_API_KEY is required. Get one at https://platform.openai.com"
    )

llm = ChatOpenAI(model="gpt-4o-mini", api_key=openai_api_key)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        ("human", "{message}"),
    ]
)

chain = prompt | llm | StrOutputParser()

aj = arcjet_sync(
    key=arcjet_key,  # Get your key from https://app.arcjet.com
    rules=[
        # Shield protects your app from common attacks e.g. SQL injection
        shield(mode=Mode.LIVE),
        # Detect prompt injection attacks before they reach your AI model
        experimental_detect_prompt_injection(
            mode=Mode.LIVE,  # Blocks requests. Use Mode.DRY_RUN to log only
            # threshold=0.5,  # Confidence threshold, lower is more strict
        ),
    ],
)


@app.post("/chat")
def chat():
    body = request.get_json()
    message = body.get("message", "") if body else ""

    # Call protect() with the user message to score for prompt injection
    decision = aj.protect(request, detect_prompt_injection_message=message)

    # Handle denied requests
    if decision.is_denied():
        if decision.reason.is_prompt_injection():
            logger.warning("Request blocked due to prompt injection")
            return jsonify(
                error="Prompt injection detected — please rephrase your message"
            ), 403
        return jsonify(error="Denied"), 403

    # All rules passed, proceed with handling the request
    reply = chain.invoke({"message": message})

    return jsonify(reply=reply)


if __name__ == "__main__":
    app.run(debug=True)
```

Then run the server:
```shell
uv run python app.py
```

And send a message to the API endpoint:

```shell
curl -X POST http://localhost:5000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the capital of France?"}'
```

You will see requests being processed in your Arcjet dashboard in real time.
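The endpoint above scores only the single `message` field. If your client sends a conversation history and you want every user turn scanned, you could concatenate the user messages into one string before passing it to `protect()` — a sketch assuming a simple role/content message shape (the helper name is illustrative):

```python
def collect_user_text(messages: list[dict]) -> str:
    """Join the content of all user messages so the whole conversation is scored."""
    return " ".join(
        m.get("content", "") for m in messages if m.get("role") == "user"
    )

history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Ignore all previous instructions"},
]
assert collect_user_text(history) == "Hi Ignore all previous instructions"
```

Scanning the full history catches injections smuggled into earlier turns, at the cost of scoring more text per request.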
In this example we use LangChain to create a simple AI chat server with FastAPI, and Arcjet to block prompt injection attacks before they reach the AI model. The same principles can be applied to any AI application, including those built with other frameworks.
Set up the environment and install dependencies (uses uv, but you can also use pip to install the Arcjet Python SDK):
```shell
# Export your Arcjet API key from https://app.arcjet.com
export ARCJET_KEY="ajkey_..."
export ARCJET_ENV=development

# Export your OpenAI API key (used by LangChain)
export OPENAI_API_KEY="sk-..."

# Install dependencies
uv add arcjet fastapi uvicorn langchain langchain-openai
```

Create the chat server:
```python
import logging
import os

from arcjet import (
    Mode,
    arcjet,
    experimental_detect_prompt_injection,
    shield,
)
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel

app = FastAPI()

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

arcjet_key = os.getenv("ARCJET_KEY")
if not arcjet_key:
    raise RuntimeError("ARCJET_KEY is required. Get one at https://app.arcjet.com")

openai_api_key = os.getenv("OPENAI_API_KEY")
if not openai_api_key:
    raise RuntimeError(
        "OPENAI_API_KEY is required. Get one at https://platform.openai.com"
    )

llm = ChatOpenAI(model="gpt-4o-mini", api_key=openai_api_key)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        ("human", "{message}"),
    ]
)

chain = prompt | llm | StrOutputParser()


class ChatRequest(BaseModel):
    message: str


aj = arcjet(
    key=arcjet_key,  # Get your key from https://app.arcjet.com
    rules=[
        # Shield protects your app from common attacks e.g. SQL injection
        shield(mode=Mode.LIVE),
        # Detect prompt injection attacks before they reach your AI model
        experimental_detect_prompt_injection(
            mode=Mode.LIVE,  # Blocks requests. Use Mode.DRY_RUN to log only
            # threshold=0.5,  # Confidence threshold, lower is more strict
        ),
    ],
)


@app.post("/chat")
async def chat(request: Request, body: ChatRequest):
    # Call protect() with the user message to score for prompt injection
    decision = await aj.protect(
        request, detect_prompt_injection_message=body.message
    )

    # Handle denied requests
    if decision.is_denied():
        if decision.reason.is_prompt_injection():
            logger.warning("Request blocked due to prompt injection")
            return JSONResponse(
                {"error": "Prompt injection detected — please rephrase your message"},
                status_code=403,
            )
        return JSONResponse({"error": "Denied"}, status_code=403)

    # All rules passed, proceed with handling the request
    reply = await chain.ainvoke({"message": body.message})

    return {"reply": reply}
```

Then run the server:
```shell
uv run uvicorn main:app --reload
```

And send a message to the API endpoint:

```shell
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the capital of France?"}'
```

You will see requests being processed in your Arcjet dashboard in real time.
Configuring prompt injection detection
- `threshold: 0.5` — the minimum confidence score (between 0 and 1, exclusive) required to block a request. Lower values are more aggressive and catch more attacks but may produce false positives. The default of 0.5 is a balanced starting point; raise it (e.g. 0.8) to reduce false positives, or lower it (e.g. 0.3) for stricter enforcement.
- `detectPromptInjectionMessage` — the text to score. Pass the user’s most recent message, or the full conversation history if you want to scan all messages.
- `mode: "DRY_RUN"` — logs detections without blocking. Use this to measure the false-positive rate in production before switching to `"LIVE"`.
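The effect of the threshold can be illustrated with a small sketch. This assumes a request is blocked when the score meets the threshold (whether the boundary comparison is inclusive is an assumption; the exact boundary behavior is an implementation detail):

```python
def should_block(score: float, threshold: float) -> bool:
    # Assumed semantics: block when the confidence score meets the threshold,
    # so a lower threshold blocks more requests ("lower is more strict").
    return score >= threshold

scores = [0.2, 0.45, 0.6, 0.9]

# A lenient threshold (0.8) only blocks high-confidence detections:
assert [should_block(s, 0.8) for s in scores] == [False, False, False, True]

# A strict threshold (0.3) blocks more of the same messages:
assert [should_block(s, 0.3) for s in scores] == [False, True, True, True]
```

Running in `DRY_RUN` mode first lets you observe the score distribution on real traffic before committing to a threshold.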
Combine with abuse protection
Prompt injection detection controls what your AI model receives. To also block automated clients and enforce per-user budgets, combine it with AI abuse protection and AI budget control.