author Philipp Tanlak <philipp.tanlak@gmail.com> 2025-11-24 20:54:57 +0100
committer Philipp Tanlak <philipp.tanlak@gmail.com> 2025-11-24 20:57:48 +0100
commit b1e2c8fd5cb5dfa46bc440a12eafaf56cd844b1c
tree 49d360fd6cbc6a2754efe93524ac47ff0fbe0f7d /content/docs/configuration/url-filter.md
Docs
Diffstat (limited to 'content/docs/configuration/url-filter.md')
-rw-r--r-- content/docs/configuration/url-filter.md | 42
1 files changed, 42 insertions, 0 deletions
diff --git a/content/docs/configuration/url-filter.md b/content/docs/configuration/url-filter.md
new file mode 100644
index 0000000..80d3544
--- /dev/null
+++ b/content/docs/configuration/url-filter.md
@@ -0,0 +1,42 @@
+---
+title: 'URL Filter'
+weight: 4
+prev: /docs/getting-started
+---
+
+The `allowedURLs` and `blockedURLs` config options let you specify lists of URL patterns (written as regular expressions) that are allowed or blocked during scraping.
+
+```javascript {filename="Configuration"}
+export const options = {
+ url: "http://example.com/",
+ allowedURLs: ["/articles/.*", "/authors/.*"],
+ blockedURLs: ["/authors/admin"],
+ // ...
+};
+```
+
+## Allowed URLs
+
+This config option controls which URLs are allowed to be visited during scraping. When no value is provided, every URL may be visited unless it is otherwise blocked.
+
+When a list of URL patterns is provided, only URLs matching one or more of these patterns are allowed to be visited.
+
+```javascript {filename="Configuration"}
+export const options = {
+ url: "http://example.com/",
+ allowedURLs: ["/products/"],
+};
+```
+
+## Blocked URLs
+
+This config option controls which URLs are blocked from being visited during scraping.
+
+When a list of URL patterns is provided, URLs matching one or more of these patterns are blocked from being visited.
+
+```javascript {filename="Configuration"}
+export const options = {
+ url: "http://example.com/",
+ blockedURLs: ["/restricted"],
+};
+```
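
Note on the patch: the page describes the two options separately but does not spell out how they combine. The sketch below is one plausible reading, not the scraper's actual implementation. `isVisitable` is a hypothetical helper name, and two behaviours are assumptions inferred from the examples in the patch: a blocked match takes precedence over an allowed match (suggested by `/authors/.*` being allowed while `/authors/admin` is blocked), and patterns are matched unanchored against the full URL (suggested by patterns such as `/products/`).

```javascript
// Hypothetical helper, not part of the documented tool: decides whether a URL
// would be visited given the two pattern lists from the configuration.
function isVisitable(url, allowedURLs = [], blockedURLs = []) {
  // Assumption: each pattern is an unanchored regular expression tested
  // against the full URL string.
  const matchesAny = (patterns) =>
    patterns.some((pattern) => new RegExp(pattern).test(url));

  // Assumption: a blocked match always wins over an allowed match.
  if (matchesAny(blockedURLs)) return false;

  // No allow-list provided: everything that is not blocked may be visited.
  if (allowedURLs.length === 0) return true;

  // Allow-list provided: the URL must match at least one allowed pattern.
  return matchesAny(allowedURLs);
}
```

Under these assumptions, the first configuration in the patch would visit `http://example.com/articles/1`, skip `http://example.com/authors/admin`, and skip `http://example.com/about` because it matches no allowed pattern. If the real precedence differs, it would be worth stating it explicitly in the page.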