Node.js bot protection reference

Arcjet bot detection allows you to manage traffic by automated clients and bots.

Configuration

Bot detection is configured by allowing or denying a subset of bots. The allow and deny lists are mutually exclusive. With allow, any detected bot that is not in the allow list results in a DENY decision. With deny, any detected bot that is not in the deny list results in an ALLOW decision.

You can use only one of the following configuration definitions:

type BotOptionsAllow = {
  mode?: "LIVE" | "DRY_RUN";
  allow: ArcjetWellKnownBot[];
};

type BotOptionsDeny = {
  mode?: "LIVE" | "DRY_RUN";
  deny: ArcjetWellKnownBot[];
};

The arcjet client is configured with one or more detectBot rules which take one or many BotOptions.
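
The allow/deny semantics described above can be sketched as a small, self-contained model. This is not the SDK's implementation: `concludeForBot` and the `BotName` alias are hypothetical names used only for illustration, and the model considers a single detected bot at a time.

```typescript
// Illustrative model only: not part of @arcjet/node.
type BotName = string; // stands in for ArcjetWellKnownBot

type BotOptionsAllow = { mode?: "LIVE" | "DRY_RUN"; allow: BotName[] };
type BotOptionsDeny = { mode?: "LIVE" | "DRY_RUN"; deny: BotName[] };
type BotOptions = BotOptionsAllow | BotOptionsDeny;

// Hypothetical helper: the conclusion for one detected bot.
function concludeForBot(options: BotOptions, bot: BotName): "ALLOW" | "DENY" {
  if ("allow" in options) {
    // Allow list: any detected bot that is not listed is denied.
    return options.allow.includes(bot) ? "ALLOW" : "DENY";
  }
  // Deny list: any detected bot that is not listed is allowed.
  return options.deny.includes(bot) ? "DENY" : "ALLOW";
}

console.log(concludeForBot({ allow: ["CURL"] }, "GOOGLE_CRAWLER")); // prints DENY
console.log(concludeForBot({ deny: ["CURL"] }, "GOOGLE_CRAWLER")); // prints ALLOW
```

Note that an empty allow list denies every detected bot in this model, which matches the "allow none" comment used in the examples on this page.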

Only allowing specific bots

Most applications want to block almost all bots. However, it is common to allow some bots to access your system, such as bots for search indexing or API access from the command line.

This behavior is configured with an allow list from our full list of bots.

import arcjet, { detectBot } from "@arcjet/node";

const aj = arcjet({
  key: process.env.ARCJET_KEY!,
  rules: [
    detectBot({
      mode: "LIVE",
      // configured with a list of bots to allow from
      // https://arcjet.com/bot-list - all other detected bots will be blocked
      allow: [
        // Google has multiple crawlers, each with a different user-agent. Check
        // the full list for more options
        "GOOGLE_CRAWLER", // allows Google's main crawler
        "GOOGLE_ADSBOT", // allows Google Adsbot
        "GOOGLE_CRAWLER_NEWS", // allows Google News crawler
        "CURL", // allows the default user-agent of the `curl` tool
        "DISCORD_CRAWLER", // allows Discordbot
      ],
    }),
  ],
});

Only denying specific bots

Some applications may only want to block a small subset of bots, while allowing the majority continued access. This may be due to many reasons, such as misconfigured or high-traffic bots.

This behavior is configured with a deny list from our full list of bots.

import arcjet, { detectBot } from "@arcjet/node";

const aj = arcjet({
  key: process.env.ARCJET_KEY!,
  rules: [
    detectBot({
      mode: "LIVE",
      // configured with a list of bots to deny from
      // https://arcjet.com/bot-list - all other detected bots will be allowed
      deny: [
        "PERPLEXITY_CRAWLER", // denies PerplexityBot
        "CURL", // denies the default user-agent of the `curl` tool
        "ANTHROPIC_CRAWLER", // denies Claudebot
      ],
    }),
  ],
});

Decision

Arcjet provides the protect function which is used to execute your protection rules. This requires a request argument which is the request context as passed to the request handler.

This function returns a Promise that resolves to an ArcjetDecision object. This contains the following properties:

  • id (string) - The unique ID for the request. This can be used to look up the request in the Arcjet dashboard. It is prefixed with req_ for decisions involving the Arcjet cloud API. For decisions taken locally, the prefix is lreq_.
  • conclusion (ArcjetConclusion) - The final conclusion based on evaluating each of the configured rules. If you wish to accept Arcjet’s recommended action based on the configured rules then you can use this property.
  • reason (ArcjetReason) - An object containing more detailed information about the conclusion.
  • results (ArcjetRuleResult[]) - An array of ArcjetRuleResult objects containing the results of each rule that was executed.
  • ip (ArcjetIpDetails) - An object containing Arcjet’s analysis of the client IP address. See IP analysis in the SDK reference for more information.

See the SDK reference for more details about the rule results.

To check whether a bot protection rule returned a deny conclusion, use decision.isDenied() together with decision.reason.isBot().
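
As a minimal sketch, the two checks can be combined into one helper. `DecisionLike` and `isDeniedByBotRule` are hypothetical names introduced here. The interface covers only the two methods named above, and it is an assumption that a real ArcjetDecision satisfies this shape.

```typescript
// Minimal structural type covering only the methods named in the text.
// Assumption: a real ArcjetDecision is compatible with this shape.
interface DecisionLike {
  isDenied(): boolean;
  reason: { isBot(): boolean };
}

// Hypothetical helper: true only when the request was denied and the
// reason for the conclusion is a bot protection rule.
function isDeniedByBotRule(decision: DecisionLike): boolean {
  return decision.isDenied() && decision.reason.isBot();
}

// Stand-in decision for demonstration. In a request handler this value
// would come from `await aj.protect(req)`.
const fakeDecision: DecisionLike = {
  isDenied: () => true,
  reason: { isBot: () => true },
};

console.log(isDeniedByBotRule(fakeDecision)); // prints true
```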

You can iterate through the results and check whether a bot protection rule was applied:

for (const result of decision.results) {
  console.log("Rule Result", result);
}

This example logs every rule result and separately logs the rate limit and bot protection results:

import arcjet, { fixedWindow, detectBot } from "@arcjet/node";
import http from "node:http";

const aj = arcjet({
  key: process.env.ARCJET_KEY!, // Get your site key from https://app.arcjet.com
  characteristics: ["ip.src"],
  rules: [
    fixedWindow({
      mode: "LIVE",
      window: "1h",
      max: 60,
    }),
    detectBot({
      mode: "LIVE",
      allow: [], // "allow none" will block all detected bots
    }),
  ],
});

const server = http.createServer(async function (
  req: http.IncomingMessage,
  res: http.ServerResponse,
) {
  const decision = await aj.protect(req);

  for (const result of decision.results) {
    console.log("Rule Result", result);
    if (result.reason.isRateLimit()) {
      console.log("Rate limit rule", result);
    }
    if (result.reason.isBot()) {
      console.log("Bot protection rule", result);
    }
  }

  if (decision.isDenied()) {
    if (decision.reason.isRateLimit()) {
      // Respond exactly once: a second res.end() call would be an error
      res.writeHead(429, { "Content-Type": "application/json" });
      res.end(
        JSON.stringify({ error: "Too Many Requests", reason: decision.reason }),
      );
    } else {
      res.writeHead(403, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ error: "Forbidden" }));
    }
  } else {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ message: "Hello world" }));
  }
});

server.listen(8000);

Identified bots

The decision also contains all of the identified bots detected from the request. A request may be identified as zero, one, or more bots, all of which are available on the decision.reason.allowed and decision.reason.denied properties when decision.reason.isBot() is true.

import arcjet, { detectBot } from "@arcjet/node";
import http from "node:http";

const aj = arcjet({
  key: process.env.ARCJET_KEY!, // Get your site key from https://app.arcjet.com
  rules: [
    detectBot({
      mode: "LIVE", // will block requests. Use "DRY_RUN" to log only
      allow: [], // "allow none" will block all detected bots
    }),
  ],
});

const server = http.createServer(async function (
  req: http.IncomingMessage,
  res: http.ServerResponse,
) {
  const decision = await aj.protect(req);

  if (decision.reason.isBot()) {
    console.log("detected + allowed bots", decision.reason.allowed);
    console.log("detected + denied bots", decision.reason.denied);
  }

  if (decision.isDenied()) {
    res.writeHead(403, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "Forbidden" }));
  } else {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ message: "Hello world" }));
  }
});

server.listen(8000);

Error handling

Arcjet is designed to fail open so that a service issue or misconfiguration does not block all requests. The SDK will also time out and fail open after 1000ms when NODE_ENV or ARCJET_ENV is development and 500ms otherwise. However, in most cases, the response time will be less than 20-30ms.
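
The timeout rule above can be modeled roughly as follows. This is an illustrative sketch rather than SDK code: `failOpenTimeoutMs` and `failOpen` are hypothetical helpers, and the wrapper only shows the general fail-open pattern of falling back to a default when a check is slow or fails.

```typescript
// Illustrative model of the timeout rule described above; not SDK code.
type Env = { NODE_ENV?: string; ARCJET_ENV?: string };

// Hypothetical helper: 1000ms in development, 500ms otherwise.
function failOpenTimeoutMs(env: Env): number {
  const isDevelopment =
    env.NODE_ENV === "development" || env.ARCJET_ENV === "development";
  return isDevelopment ? 1000 : 500;
}

// Hypothetical wrapper: resolve with `fallback` if `work` takes longer
// than `ms` or rejects, so a slow or failing check never blocks a request.
async function failOpen<T>(
  work: Promise<T>,
  ms: number,
  fallback: T,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } catch {
    return fallback;
  } finally {
    // Always clear the timer so it does not keep the process alive.
    clearTimeout(timer);
  }
}

console.log(failOpenTimeoutMs({ NODE_ENV: "development" })); // prints 1000
console.log(failOpenTimeoutMs({ NODE_ENV: "production" })); // prints 500
```

The SDK applies its own timeout internally, so application code does not need a wrapper like this when calling protect.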

If there is an error condition, Arcjet will return an ERROR type and you can check the reason property for more information, like accessing decision.reason.message.

import arcjet, { detectBot } from "@arcjet/node";
import http from "node:http";

const aj = arcjet({
  key: process.env.ARCJET_KEY!, // Get your site key from https://app.arcjet.com
  rules: [
    detectBot({
      mode: "LIVE", // will block requests. Use "DRY_RUN" to log only
      allow: [], // "allow none" will block all detected bots
    }),
  ],
});

const server = http.createServer(async function (
  req: http.IncomingMessage,
  res: http.ServerResponse,
) {
  const decision = await aj.protect(req);

  // If the request is missing a User-Agent header, the decision will be
  // marked as an error! You should check for this and make a decision about
  // the request since requests without a User-Agent could indicate a crafted
  // request from an automated client.
  if (decision.isErrored()) {
    // Fail open by logging the error and continuing
    console.warn("Arcjet error", decision.reason.message);
    // You could also fail closed here if the request is missing a User-Agent
    //res.writeHead(503, { "Content-Type": "application/json" });
    //res.end(JSON.stringify({ error: "Service unavailable" }));
  }

  if (decision.isDenied()) {
    res.writeHead(403, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ error: "Forbidden" }));
  } else {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ message: "Hello world" }));
  }
});

server.listen(8000);

Testing

Arcjet runs the same in any environment, including locally and in CI. Set mode to DRY_RUN to log the results of rule execution without blocking any requests.
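
One way to switch between the two modes is to derive the mode from configuration, so that rules only block requests once you opt in. The sketch below is an assumption-laden example: `botModeFor` is a hypothetical helper, and `ENFORCE_BOT_RULES` is an invented variable name that the SDK does not read.

```typescript
type BotMode = "LIVE" | "DRY_RUN";

// Hypothetical helper: block requests only when explicitly enabled.
// ENFORCE_BOT_RULES is an invented name, not a variable the SDK reads.
function botModeFor(env: { ENFORCE_BOT_RULES?: string }): BotMode {
  return env.ENFORCE_BOT_RULES === "true" ? "LIVE" : "DRY_RUN";
}

console.log(botModeFor({})); // prints DRY_RUN
console.log(botModeFor({ ENFORCE_BOT_RULES: "true" })); // prints LIVE
```

The returned value could then be passed as the mode option of detectBot, for example with `process.env` as the argument.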

We have an example test framework you can use to automatically test your rules. Arcjet can also be triggered using a sample of your traffic.

See the Testing section of the docs for details.
