path: root/content/docs/configuration/link-following.md
author Philipp Tanlak <philipp.tanlak@gmail.com> 2025-11-24 20:54:57 +0100
committer Philipp Tanlak <philipp.tanlak@gmail.com> 2025-11-24 20:57:48 +0100
commit b1e2c8fd5cb5dfa46bc440a12eafaf56cd844b1c
tree 49d360fd6cbc6a2754efe93524ac47ff0fbe0f7d /content/docs/configuration/link-following.md
Docs
Diffstat (limited to 'content/docs/configuration/link-following.md')
-rw-r--r-- content/docs/configuration/link-following.md 33
1 files changed, 33 insertions, 0 deletions
diff --git a/content/docs/configuration/link-following.md b/content/docs/configuration/link-following.md
new file mode 100644
index 0000000..b9755f7
--- /dev/null
+++ b/content/docs/configuration/link-following.md
@@ -0,0 +1,33 @@
+---
+title: 'Link Following'
+weight: 5
+---
+
+The `follow` config option allows you to specify a list of CSS selectors that determine which links the scraper should follow.
+
+When no value is provided, the scraper follows all links matched by the `a[href]` selector.
+
+```javascript {filename="Configuration"}
+export const config = {
+  url: "http://example.com/",
+  follow: [
+    ".pagination > a[href]",
+    ".nav a[href]",
+  ],
+  // ...
+};
+```
+
+## Following non-`href` attributes
+
+For special cases where the link is not in the `href` attribute, specify a selector that ends with a different attribute, such as `[data-url]`.
+
+```javascript {filename="Configuration"}
+export const config = {
+  url: "http://example.com/",
+  follow: [
+    ".articles > div[data-url]",
+  ],
+  // ...
+};
+```
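
The added page says a `follow` selector may end with an attribute other than `href`. As a rough illustration of what "ending attribute" could mean, the sketch below pulls the attribute name out of the final `[...]` group of a selector. The helper `trailingAttribute` is hypothetical and not part of the scraper's API, and the fallback to `href` when a selector has no trailing attribute is an assumption. The diff does not show how the scraper actually parses selectors.

```javascript
// Hypothetical helper (an assumption, not the scraper's real implementation):
// returns the attribute name from the final [...] group of a CSS selector,
// or "href" when the selector does not end with an attribute group.
function trailingAttribute(selector) {
  const match = selector.trim().match(/\[\s*([^\s~|^$*=\]]+)[^\]]*\]$/);
  return match ? match[1] : "href";
}

// Selectors taken from the two configuration examples in the page.
const follow = [".pagination > a[href]", ".articles > div[data-url]"];

// attributes holds "href" for the first selector and "data-url" for the second.
const attributes = follow.map(trailingAttribute);
console.log(attributes);
```

Under this reading, `.articles > div[data-url]` would match each `div` that is a direct child of `.articles` and carries a `data-url` attribute, and the value of that attribute would be treated as the link to follow.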