# Domain Filter

The `allowedDomains` and `blockedDomains` config options let you specify which domains may be accessed and which are blocked during scraping.

```javascript
export const options = {
    url: "http://example.com/",
    allowedDomains: ["subdomain.example.com"],
    // ...
};
```

### `allowedDomains`

This config option controls which additional domains may be visited during scraping. The domain of the initial URL is always allowed.

To make all domains accessible, set `allowedDomains` to `["*"]`. You can then restrict access further with `blockedDomains`.

Example:

```javascript
export const options = {
    url: "http://example.com/",
    allowedDomains: ["*"],
    // ...
};
```
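The rule described in this section can be read as a small predicate. The following is a minimal sketch and not the scraper's actual implementation: `isAllowedHost` is a hypothetical helper, and comparing domains as exact hostnames (with no subdomain matching) is an assumption.

```javascript
// Hypothetical helper, not part of the scraper's API.
// Assumption: domains are compared as exact hostnames.
function isAllowedHost(targetUrl, options) {
    const targetHost = new URL(targetUrl).hostname;
    const initialHost = new URL(options.url).hostname;
    const allowed = options.allowedDomains ?? [];

    // The domain of the initial URL is always allowed.
    if (targetHost === initialHost) return true;
    // "*" makes every domain accessible.
    if (allowed.includes("*")) return true;
    // Otherwise the host has to be listed explicitly.
    return allowed.includes(targetHost);
}

const options = {
    url: "http://example.com/",
    allowedDomains: ["subdomain.example.com"],
};

console.log(isAllowedHost("http://example.com/about", options));
console.log(isAllowedHost("http://subdomain.example.com/", options));
console.log(isAllowedHost("http://other.org/", options));
```

Under these assumptions the first two URLs pass the check and the third does not, because `other.org` is neither the initial domain nor listed in `allowedDomains`.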

### `blockedDomains`

This config option controls which additional domains are blocked from being accessed. By default, all domains other than the domain of the initial URL and those specified in `allowedDomains` are blocked.

`blockedDomains` is most useful in conjunction with `allowedDomains: ["*"]`, which lets the scraping process access all domains except those listed in `blockedDomains`.

Example:

```javascript
export const options = {
    url: "http://example.com/",
    allowedDomains: ["*"],
    blockedDomains: ["google.com", "bing.com"],
    // ...
};
```
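To illustrate how the two options might interact, here is a rough sketch that filters a list of discovered links. `canVisit` is a hypothetical helper, not the scraper's API. It assumes exact hostname matching, that the initial domain stays reachable regardless of `blockedDomains`, and that `blockedDomains` takes precedence over `allowedDomains`; the real precedence rules may differ.

```javascript
// Hypothetical helper; names and precedence are assumptions for illustration.
function canVisit(targetUrl, { url, allowedDomains = [], blockedDomains = [] }) {
    const host = new URL(targetUrl).hostname;

    // Assumption: the domain of the initial URL is always allowed.
    if (host === new URL(url).hostname) return true;
    // Assumption: a blocked domain wins over any allow rule, including "*".
    if (blockedDomains.includes(host)) return false;
    // Everything else needs the wildcard or an explicit entry.
    return allowedDomains.includes("*") || allowedDomains.includes(host);
}

const crawlOptions = {
    url: "http://example.com/",
    allowedDomains: ["*"],
    blockedDomains: ["google.com", "bing.com"],
};

const links = [
    "http://example.com/docs",
    "https://google.com/search?q=example",
    "https://github.com/",
];

// Keep only the links the sketch would let the scraper follow.
const visitable = links.filter((link) => canVisit(link, crawlOptions));
console.log(visitable);
```

With exact matching as assumed here, a host such as `www.google.com` would not be caught by the `google.com` entry, so check how the scraper actually treats subdomains before relying on this behavior.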