Arcjet / LangChain Integration

Arcjet and LangChain work together to redact sensitive information from prompts locally before it is sent to a third-party LLM or Chat Model.

What is Arcjet? Arcjet helps developers protect their apps in just a few lines of code. Bot detection. Rate limiting. Email validation. Attack protection. Data redaction. A developer-first approach to security.

The Arcjet LangChain integration wraps your LLM and Chat Model calls. It redacts sensitive information before the prompt is sent to a third-party AI service and un-redacts it in the response that comes back.

This currently works with the LangChain JS SDK.

Installation

To use Arcjet with LangChain you need to install both the @arcjet/redact and @langchain/community packages.

npm i @arcjet/redact @langchain/community

Requirements

  • @langchain/community 0.2.33 or later

Configuration

The configuration definition is:

type RedactOptions = {
  chatModel: ChatModel;
  entities?: Array<SensitiveInfoType>;
  contextWindowSize?: number;
  detect?: (tokens: string[]) => Array<SensitiveInfoType | undefined>;
  replace?: (detectedEntity: SensitiveInfoType) => string | undefined;
};
  • chatModel or llm: The chat model or LLM that you are wrapping (e.g. OpenAIChat).
  • entities: The list of entities that you wish to redact. If undefined, all entities are redacted. Valid values are: email, phone-number, ip-address, credit-card, or any string returned from detect.
  • contextWindowSize: How many tokens to pass to the detect function at a time. A higher value gives detect more context when determining whether a token is sensitive.
  • detect: An optional function that allows you to detect custom entities. It is passed a list of tokens of length contextWindowSize and should return a list of detected entities of the same length.
  • replace: An optional function that allows you to define your own replacements for detected entities. It is passed a string with the type of entity detected and should return either a replacement for that entity type or undefined.
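As a rough illustration of the detect contract described above, the sketch below uses the neighbouring token as context, which is what a contextWindowSize larger than 1 makes possible. The pin-code entity name and the detectPinCodes helper are hypothetical and not part of Arcjet, and the sketch assumes each word arrives as its own token; how Arcjet tokenizes the input and moves the window is not covered here. The only thing taken from the documentation is the shape of the function: a list of tokens in, a list of the same length out.

```typescript
// Hypothetical custom entity: a 4-digit number that directly follows the
// word "pin". The entity name "pin-code" is made up for this sketch.
function detectPinCodes(tokens: string[]): Array<string | undefined> {
  return tokens.map((token, i) => {
    // Look at the previous token in the same window, if there is one.
    const previous = i > 0 ? tokens[i - 1] : undefined;
    const looksLikePin = /^\d{4}$/.test(token);
    const followsPinWord = previous?.toLowerCase() === "pin";
    // One entry per input token: an entity name, or undefined.
    return looksLikePin && followsPinWord ? "pin-code" : undefined;
  });
}

// A window as it might look with contextWindowSize set to 3.
const pinWindow = detectPinCodes(["my", "pin", "1234"]);
// The same number without the preceding keyword is left alone.
const plainWindow = detectPinCodes(["call", "1234"]);
console.log(pinWindow, plainWindow);
```

If entities is specified, a custom name such as pin-code would also need to appear in that list for it to be redacted.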

Example

import { ArcjetRedact } from "@langchain/community/chat_models/arcjet";
import { ChatOpenAI } from "@langchain/openai";

// Create an instance of another chat model for Arcjet to wrap
const openai = new ChatOpenAI({
  temperature: 0.8,
  model: "gpt-3.5-turbo-0125",
});

const arcjetRedact = new ArcjetRedact({
  // Specify an LLM that Arcjet Redact will call once it has redacted the input.
  chatModel: openai,

  // Specify the list of entities that should be redacted.
  // If this isn't specified then all entities will be redacted.
  entities: ["email", "phone-number", "ip-address", "custom-entity"],

  // You can provide a custom detect function to detect entities that we don't support yet.
  // It takes a list of tokens and you return a list of identified types or undefined.
  // Any custom types that you return should be added to the entities list if it is used.
  detect: (tokens: string[]) => {
    return tokens.map((t) =>
      t === "some-sensitive-info" ? "custom-entity" : undefined,
    );
  },

  // The number of tokens to provide to the custom detect function. This defaults to 1.
  // It can be used to provide additional context when detecting custom entity types.
  contextWindowSize: 1,

  // This allows you to provide custom replacements when redacting. Please ensure
  // that the replacements are unique so that unredaction works as expected.
  replace: (identifiedType: string) => {
    return identifiedType === "email" ? "redacted@example.com" : undefined;
  },
});

const response = await arcjetRedact.invoke(
  "My email address is test@example.com, here is some-sensitive-info",
);
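
The example above asks for replacements to be unique so that unredaction works, yet a fixed string such as redacted@example.com would repeat if a prompt contained two email addresses. One possible way to keep replacements distinct is to number them, as in the minimal sketch below. The makeUniqueReplace helper is hypothetical and not part of Arcjet, and the sketch assumes replace is called once per detected entity.

```typescript
// Hypothetical helper: builds a `replace` callback whose email placeholders
// are numbered, so two different addresses never share one replacement.
// Assumption: the callback is invoked once for each detected entity.
function makeUniqueReplace(): (entityType: string) => string | undefined {
  let emailCount = 0;
  return (entityType) => {
    if (entityType === "email") {
      emailCount += 1;
      return `redacted-${emailCount}@example.com`;
    }
    // No custom replacement for any other entity type.
    return undefined;
  };
}

const uniqueReplace = makeUniqueReplace();
const firstEmail = uniqueReplace("email");
const secondEmail = uniqueReplace("email");
const phoneNumber = uniqueReplace("phone-number");
console.log(firstEmail, secondEmail, phoneNumber);
```

The returned function has the same shape as the replace option, so under these assumptions it could be passed as replace: makeUniqueReplace() when constructing ArcjetRedact.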