Bot protection reference

Arcjet bot detection allows you to manage traffic by automated clients and bots.

Configuration

Bot detection is configured by allowing or denying a subset of bots. The allow and deny lists are mutually exclusive: using allow results in a DENY decision for any detected bot that is not in the allow list, and using deny results in an ALLOW decision for any detected bot that is not in the deny list.

You can use only one of the following configuration definitions:

type BotOptionsAllow = {
  mode?: "LIVE" | "DRY_RUN";
  allow: Array<ArcjetWellKnownBot | ArcjetBotCategory>;
};

type BotOptionsDeny = {
  mode?: "LIVE" | "DRY_RUN";
  deny: Array<ArcjetWellKnownBot | ArcjetBotCategory>;
};

The arcjet client is configured with one or more detectBot rules which take one or many BotOptions.
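The allow and deny semantics described above can be sketched as a small local model. This is not the SDK's implementation: `BotId`, `DetectedBot`, `listMatches`, and `concludeBots` are hypothetical names used only for illustration, and plain strings stand in for ArcjetWellKnownBot and ArcjetBotCategory.

```typescript
// Hypothetical model types; the real SDK uses ArcjetWellKnownBot and ArcjetBotCategory.
type BotId = string;

type BotOptions =
  | { mode?: "LIVE" | "DRY_RUN"; allow: BotId[] }
  | { mode?: "LIVE" | "DRY_RUN"; deny: BotId[] };

// A detected bot together with the categories it belongs to (illustrative shape).
interface DetectedBot {
  id: BotId;
  categories: BotId[];
}

// True when the list names the bot itself or one of its categories.
function listMatches(list: BotId[], bot: DetectedBot): boolean {
  return list.includes(bot.id) || bot.categories.some((c) => list.includes(c));
}

function concludeBots(options: BotOptions, detected: DetectedBot[]): "ALLOW" | "DENY" {
  if ("allow" in options) {
    // Allow list: every detected bot must be listed, otherwise DENY.
    return detected.every((bot) => listMatches(options.allow, bot)) ? "ALLOW" : "DENY";
  }
  // Deny list: DENY only when some detected bot is listed.
  return detected.some((bot) => listMatches(options.deny, bot)) ? "DENY" : "ALLOW";
}
```

Under this model, an empty allow list denies every detected bot, while a request with no detected bots is allowed in either configuration.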

Allowing specific bots

Most applications want to block almost all bots. However, it is common to allow some bots to access your system, such as bots for search indexing or API access from the command line.

When allowing specific bots, we recommend that you also check the verification status after an allow decision is returned, to ensure that the bots are who they say they are.

This behavior is configured with an allow list from our full list of bots and/or bot categories.

Denying specific bots

Some applications may only want to block a small subset of bots, while allowing the majority continued access. This may be due to many reasons, such as misconfigured or high-traffic bots.

This behavior is configured with a deny list from our full list of bots and/or bot categories.

Decision

The quick start example will deny requests that match the bot detection rules, immediately returning a response to the client.

Arcjet also provides a single protect function that is used to execute your protection rules. This requires a request argument which is the request context as passed to the request handler.

This function returns a Promise that resolves to an ArcjetDecision object. This contains the following properties:

  • id (string) - The unique ID for the request. This can be used to look up the request in the Arcjet dashboard. It is prefixed with req_ for decisions involving the Arcjet cloud API. For decisions taken locally, the prefix is lreq_.
  • conclusion (ArcjetConclusion) - The final conclusion based on evaluating each of the configured rules. If you wish to accept Arcjet’s recommended action based on the configured rules then you can use this property.
  • reason (ArcjetReason) - An object containing more detailed information about the conclusion.
  • results (ArcjetRuleResult[]) - An array of ArcjetRuleResult objects containing the results of each rule that was executed.
  • ip (ArcjetIpDetails) - An object containing Arcjet’s analysis of the client IP address. See the SDK reference for more information.

To check whether a bot protection rule returned a deny conclusion, use decision.isDenied() to test for the denial and decision.reason.isBot() to confirm that a bot rule caused it.
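A minimal sketch of that check follows. It assumes only the two methods named above; `BotDecisionLike` and `responseFor` are hypothetical names, and the status codes and bodies are illustrative choices rather than anything the SDK prescribes.

```typescript
// Minimal structural view of the decision methods named in this section.
// The real ArcjetDecision carries more properties than this.
interface BotDecisionLike {
  isDenied(): boolean;
  reason: { isBot(): boolean };
}

interface SimpleResponse {
  status: number;
  body: string;
}

// Map a decision to a response: a bot denial gets a bot-specific message,
// any other denial gets a generic 403, and everything else passes through.
function responseFor(decision: BotDecisionLike): SimpleResponse {
  if (decision.isDenied()) {
    if (decision.reason.isBot()) {
      return { status: 403, body: "No bots allowed" };
    }
    return { status: 403, body: "Forbidden" };
  }
  return { status: 200, body: "OK" };
}
```

In a real handler, the decision would come from awaiting the protect function with the incoming request.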

You can iterate through the results and check whether a bot protection rule was applied:

for (const result of decision.results) {
  console.log("Rule Result", result);
}

Identified bots

The decision also contains all of the identified bots and matched categories detected from the request. A request may be identified as zero, one, or more bots/categories—all of which will be available on the decision.allowed and decision.denied properties.

Error handling

Arcjet is designed to fail open so that a service issue or misconfiguration does not block all requests. The SDK also times out and fails open after 1000ms when NODE_ENV or ARCJET_ENV is development, and after 500ms otherwise. In most cases, however, the response time will be less than 20-30ms.

If there is an error condition, Arcjet will return an ERROR type and you can check the reason property for more information, like accessing decision.reason.message.
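One way to handle the error case while still failing open is sketched below. It assumes the error surfaces as an "ERROR" conclusion with a message on the reason, as described above; `ErrorAwareDecision`, `Logger`, and `shouldBlock` are hypothetical names, and the `Conclusion` type lists only the values named in this document.

```typescript
// Subset of conclusion values named in this document; the SDK may define more.
type Conclusion = "ALLOW" | "DENY" | "ERROR";

interface ErrorAwareDecision {
  conclusion: Conclusion;
  reason: { message?: string };
}

type Logger = (message: string, detail?: string) => void;

// Returns true only when the request should be blocked.
// An ERROR conclusion is logged and then treated as allowed (fail open).
function shouldBlock(decision: ErrorAwareDecision, log: Logger = console.warn): boolean {
  if (decision.conclusion === "ERROR") {
    log("Arcjet error", decision.reason.message);
    return false;
  }
  return decision.conclusion === "DENY";
}
```

Logging the message keeps the failure visible in your own monitoring even though the request is allowed through.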

Filtering categories

All categories are also provided as enumerations, which allows for programmatic access. For example, you may want to allow most of CATEGORY:GOOGLE except their “advertising quality” bot.
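The filtering idea can be sketched as follows. The `exampleCategories` object is a hypothetical stand-in for the SDK's category enumeration, and the bot names in it are illustrative and incomplete; consult the full list of bots for the real identifiers. `categoryWithout` is likewise a hypothetical helper.

```typescript
// Hypothetical stand-in for the SDK's category enumeration.
// The bot names are illustrative, not the complete category.
const exampleCategories = {
  "CATEGORY:GOOGLE": ["GOOGLE_CRAWLER", "GOOGLE_ADSBOT", "GOOGLE_ADSBOT_MOBILE"],
};

// Return the bots of a category minus the ones you want to exclude.
function categoryWithout(categoryBots: string[], excluded: string[]): string[] {
  return categoryBots.filter((bot) => !excluded.includes(bot));
}

// Allow the Google bots except the (assumed) advertising-quality crawlers.
const allowedGoogleBots = categoryWithout(exampleCategories["CATEGORY:GOOGLE"], [
  "GOOGLE_ADSBOT",
  "GOOGLE_ADSBOT_MOBILE",
]);

// Options in the shape of BotOptionsAllow, ready to pass to a detectBot rule.
const filteredBotOptions = { mode: "LIVE" as const, allow: allowedGoogleBots };
```

Because the result is an explicit list of bots rather than a category, any bot added to the category later is not allowed automatically.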

Bot verification

Requests analyzed by Arcjet for users on the Pro plan or above automatically get additional bot verification. Behind the scenes, Arcjet will verify the authenticity of requests from known bots using IP data and reverse DNS queries.

This helps protect against spoofed bots where clients pretend to be someone else. For example, we can detect if a client is really Googlebot by checking if the request IP is within Google’s published IP ranges.

If we detect a spoofed bot (or successfully verify a bot), additional metadata will be added to the response decision. This does not currently mark a request as denied, so we recommend checking this in your code.

Check for spoofed bots

This will check if the bot is spoofed. You would usually return a 403 or similar response to block the request.

if (decision.reason.isBot() && decision.reason.isSpoofed()) {
  console.log("Detected spoofed bot", decision.reason.spoofed);
  // Return a 403 or similar response
}

Check bot verification

This will check if the bot is verified.

if (decision.reason.isBot() && decision.reason.isVerified()) {
  console.log("Verified bot", decision.reason.verified);
  // Allow the request
}
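Because a spoofed bot is not automatically denied, the checks need an explicit order in your handler. The sketch below is one possible ordering, not a prescribed one; `VerifiableDecision`, `BotAction`, and `actionFor` are hypothetical names, and the interface models only the methods used in this section.

```typescript
// Structural view of the methods used in this section (not the full SDK type).
interface VerifiableDecision {
  isDenied(): boolean;
  reason: {
    isBot(): boolean;
    isSpoofed(): boolean;
    isVerified(): boolean;
  };
}

type BotAction = "block" | "allow-verified" | "allow";

function actionFor(decision: VerifiableDecision): BotAction {
  // 1. Respect an explicit denial from the configured rules.
  if (decision.isDenied()) {
    return "block";
  }
  // 2. A spoofed bot is not denied automatically, so block it here.
  if (decision.reason.isBot() && decision.reason.isSpoofed()) {
    return "block";
  }
  // 3. A verified bot is who it claims to be.
  if (decision.reason.isBot() && decision.reason.isVerified()) {
    return "allow-verified";
  }
  // 4. Anything else is allowed without verification metadata.
  return "allow";
}
```

Checking for spoofing before verification means a request is never treated as verified once it has been flagged as spoofed.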

Testing

Arcjet runs the same in any environment, including locally and in CI. Set mode to DRY_RUN to log the results of rule execution without blocking any requests.

We have an example test framework you can use to automatically test your rules. Arcjet rules can also be run against a sample of your traffic.

See the Testing section of the docs for details.
