Skip to content

Arcjet bot protection

Arcjet bot detection allows you to manage traffic by automated clients and bots.

What is Arcjet? Arcjet helps developers protect their apps in just a few lines of code. Bot detection. Rate limiting. Email validation. Attack protection. Data redaction. A developer-first approach to security.

Why configure bot protection?

Bots can be good (such as search engine crawlers or monitoring agents) or bad (such as scrapers or automated scripts). You might also want to allow automated clients access to your API (even though they might seem like bots), but deny access to a signup form. In an ideal world, all bots would respect the robots.txt file, but unfortunately this is not always the case. Arcjet allows you to configure bot protection so you can decide which bots are allowed.

Bot detection realities

It’s impossible to create a system that can block all bots. Good bots identify themselves in a consistent manner so it is straightforward to identify them, but those are often not the bots you wish to block! Bad bots pretend to be real clients and use various mechanisms to evade bot detection.

The Arcjet Pro or Enterprise plans provides additional bot verification using IP analysis, which can help if you are under attack from bots pretending to be good bots e.g. clients pretending to be Google (who you usually want to allow).

Arcjet’s bot protect works best when combined with rate limiting. This allows you to block clients that are causing problems whilst minimizing the impact on real users. See an example of this in the Node.js SDK reference.

Arcjet will help you reduce bot traffic and give you more control over which requests reach your application, but it’s important to understand that it’s not possible to achieve 100% accuracy.

User-Agent header

Requests without User-Agent headers can not be identified as any particular bot and will be marked as an errored decision. Check decision.isErrored() and decide if you want to allow or deny the request.

Our recommendation is to block requests without User-Agent headers before calling Arcjet because most legitimate clients always send this header:

if (!request.headers.get("User-Agent")) {
log.warn("request missing required user-agent header");
// Return a 400 Bad request error here
// Next.js example:
// return NextResponse.json({ error: "Bad request" }, { status: 400 });
// Node.js example:
// res.writeHead(400, { "Content-Type": "application/json" });
// res.end(JSON.stringify({ error: "Bad request" }));
}

An alternative approach is to check the error message after the decision is made:

if (decision.isErrored()) {
if (decision.reason.message.includes("requires user-agent header")) {
log.warn(
{ error: decision.reason.message },
"request missing required user-agent header",
);
// You could return a 400 Bad request error here
// Next.js example:
// return NextResponse.json({ error: "Bad request" }, { status: 400 });
// Node.js example:
// res.writeHead(400, { "Content-Type": "application/json" });
// res.end(JSON.stringify({ error: "Bad request" }));
} else {
// Just log the error and continue
log.error({ error: decision.reason.message }, "Arcjet error");
}
}

Blocking based on fingerprint

If Arcjet detects bot traffic, it will block further traffic based on a fingerprint of the client, which includes the IP address. This could temporarily block traffic from legitimate clients using the same IP address. Although this trade-off might affect a small number users on shared networks, we are constantly improving our systems to minimize false-positives while ensuring overall security.

Discussion