# Protect your OpenAI application with Arcjet
If you are building an AI application with OpenAI, you will want to protect it from abuse. Arcjet's rate limiting and bot protection can help you manage your OpenAI token budget.
## What is Arcjet?

Arcjet helps developers protect their apps in just a few lines of code. Bot detection. Rate limiting. Email validation. Attack protection. Data redaction. A developer-first approach to security.

## Example use case
- You have a chat interface that uses OpenAI to generate responses.
- You want to prevent automated bots from accessing your application.
- You want to implement a rate limit for each user logged in to your application.
- The rate limit should be based on the OpenAI tokens, which is how you are billed for your usage of the OpenAI API.
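OpenAI bills per token, so a token-based limit maps directly onto your costs. As a rough illustration of what "tokens" means here, a common rule of thumb is about 4 characters per token for English text. The sketch below uses that heuristic; it is only an approximation for intuition, and the `openai-chat-tokens` package used later gives a much closer count:

```typescript
// Rough token estimate for a chat request: ~4 characters per token for
// English text, plus a few tokens of formatting overhead per message.
// This is an illustrative approximation, not the real tokenizer.
type Message = { role: "system" | "user" | "assistant"; content: string };

function roughTokenEstimate(messages: Message[]): number {
  const PER_MESSAGE_OVERHEAD = 4; // chat formatting tokens per message
  return messages.reduce(
    (sum, m) => sum + PER_MESSAGE_OVERHEAD + Math.ceil(m.content.length / 4),
    0,
  );
}

const estimate = roughTokenEstimate([
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Hello!" },
]);
console.log(estimate); // 17
```

Even a short exchange costs tokens on every request because the full message history is resent each time, which is why the rate limit below allows a buffer above the per-hour refill.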
## How it works
- Arcjet rate limits allow custom characteristics to identify the client and apply the limit. We provide the user ID to identify the user. This can use any authentication system you have in place, such as Clerk.
- Define a rate limit of 2,000 tokens per hour with a maximum of 5,000 tokens in the bucket. This allows for a reasonable conversation length without consuming too many tokens.
- Also apply a bot rule to block clients we are sure are automated.
- Use the `openai-chat-tokens` package to count the number of tokens in each chat API request.
- Pass the token estimate to the Arcjet `protect` call to deduct the tokens from the user’s rate limit.
The example below shows the API route for a Next.js application with a `gpt-3.5-turbo` AI chatbot. See the full example Next.js implementation on GitHub.
```typescript
// Adapted from https://sdk.vercel.ai/docs/getting-started/nextjs-app-router
import { openai } from "@ai-sdk/openai";
import arcjet, { detectBot, shield, tokenBucket } from "@arcjet/next";
import { streamText } from "ai";
import { promptTokensEstimate } from "openai-chat-tokens";

const aj = arcjet({
  // Get your site key from https://app.arcjet.com
  // and set it as an environment variable rather than hard coding.
  // See: https://nextjs.org/docs/app/building-your-application/configuring/environment-variables
  key: process.env.ARCJET_KEY!,
  characteristics: ["userId"], // track requests by user ID
  rules: [
    shield({
      mode: "LIVE", // will block requests. Use "DRY_RUN" to log only
    }),
    detectBot({
      mode: "LIVE", // will block requests. Use "DRY_RUN" to log only
      allow: [], // block all detected bots
    }),
    tokenBucket({
      mode: "LIVE", // will block requests. Use "DRY_RUN" to log only
      refillRate: 2_000, // fill the bucket up by 2,000 tokens
      interval: "1h", // every hour
      capacity: 5_000, // up to 5,000 tokens
    }),
  ],
});

// Allow streaming responses up to 30 seconds
export const maxDuration = 30;

// Edge runtime allows for streaming responses
export const runtime = "edge";

export async function POST(req: Request) {
  // This userId is hard coded for the example, but this is where you would do a
  // session lookup and get the user ID.
  const userId = "totoro";

  const { messages } = await req.json();

  // Estimate the number of tokens required to process the request
  const estimate = promptTokensEstimate({
    messages,
  });

  console.log("Token estimate", estimate);

  // Withdraw tokens from the token bucket
  const decision = await aj.protect(req, { requested: estimate, userId });
  console.log("Arcjet decision", decision.conclusion);

  for (const { reason } of decision.results) {
    if (reason.isRateLimit()) {
      console.log("Requests remaining", reason.remaining);
    }
  }

  // If the request is denied, return a 429 for rate limits and a 403 otherwise
  if (decision.isDenied()) {
    if (decision.reason.isRateLimit()) {
      return new Response("Too Many Requests", {
        status: 429,
      });
    } else {
      return new Response("Forbidden", {
        status: 403,
      });
    }
  }

  // If the request is allowed, continue to use OpenAI
  const result = await streamText({
    model: openai("gpt-3.5-turbo"),
    messages,
  });

  return result.toDataStreamResponse();
}
```
The Next.js pages router does not support streaming responses, so you should use the app router for this example. You can still use the `pages/` directory for the rest of your application. See the Next.js AI docs for details.