Configuration

Feedstock uses two configuration objects: BrowserConfig for browser-level settings, and CrawlerRunConfig for per-crawl behavior.

BrowserConfig

Controls the browser instance. Set once when creating the WebCrawler.

import { createBrowserConfig } from "feedstock";

const config = createBrowserConfig({
  browserType: "chromium",   // "chromium" | "firefox" | "webkit"
  headless: true,
  viewport: { width: 1920, height: 1080 },
  userAgent: "my-bot/1.0",
  proxy: { server: "http://proxy:8080" },
  backend: { kind: "playwright" },
});

Option	Type	Default	Description
`browserType`	`"chromium" \| "firefox" \| "webkit"`	`"chromium"`	Browser engine
`headless`	`boolean`	`true`	Run without UI
`viewport`	`{ width, height }`	`1920x1080`	Page viewport size
`userAgent`	`string \| null`	`null`	Custom user agent
`proxy`	`ProxyConfig \| null`	`null`	Proxy server
`ignoreHttpsErrors`	`boolean`	`true`	Ignore SSL errors
`javaScriptEnabled`	`boolean`	`true`	Enable JavaScript
`extraArgs`	`string[]`	`[]`	Extra browser launch args
`textMode`	`boolean`	`false`	Text-only mode
`backend`	`BrowserBackend`	`{ kind: "playwright" }`	Browser backend
`verbose`	`boolean`	`false`	Enable verbose logging

CrawlerRunConfig

Controls per-crawl behavior. Pass to crawl() or crawlMany().

import { createCrawlerRunConfig, CacheMode } from "feedstock";

const config = createCrawlerRunConfig({
  cacheMode: CacheMode.Bypass,
  waitFor: { kind: "selector", value: "#loaded" },
  screenshot: true,
  excludeTags: ["nav", "footer", "aside"],
  cssSelector: "article",
});

Content Options

Option	Type	Default	Description
`wordCountThreshold`	`number`	`10`	Min words for content
`excludeTags`	`string[]`	`[]`	HTML tags to strip
`includeTags`	`string[]`	`[]`	Only keep these tags
`removeOverlayElements`	`boolean`	`false`	Remove modals/popups
`cssSelector`	`string \| null`	`null`	Extract only matching elements
`generateMarkdown`	`boolean`	`true`	Generate markdown output

Browser Behavior

Option	Type	Default	Description
`jsCode`	`string \| string[] \| null`	`null`	JavaScript to execute
`waitFor`	`WaitForType \| null`	`null`	Wait condition
`waitAfterLoad`	`number`	`0`	Additional wait (ms)
`pageTimeout`	`number`	`60000`	Navigation timeout (ms)

Wait Conditions

// Wait for a CSS selector
{ kind: "selector", value: "#content", timeout: 5000 }

// Wait for network idle
{ kind: "networkIdle" }

// Wait a fixed delay
{ kind: "delay", ms: 2000 }

// Wait for a JS function to return truthy
{ kind: "function", fn: "() => document.readyState === 'complete'" }
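The four shapes above share a `kind` field, so they can be modeled as a discriminated union. The following is a minimal sketch under that assumption: the `WaitCondition` type is inferred from the examples only (the library's actual `WaitForType` may differ, for instance in which variants accept `timeout`), and `describeWait` is a hypothetical helper that is not part of feedstock.

```typescript
// Shape inferred from the four examples above; the real WaitForType may differ.
type WaitCondition =
  | { kind: "selector"; value: string; timeout?: number }
  | { kind: "networkIdle" }
  | { kind: "delay"; ms: number }
  | { kind: "function"; fn: string };

// Hypothetical helper: turns a wait condition into a log-friendly description.
// Switching on `kind` lets the compiler narrow each branch to the right variant.
function describeWait(wait: WaitCondition): string {
  switch (wait.kind) {
    case "selector": {
      const limit = wait.timeout !== undefined ? ` (up to ${wait.timeout} ms)` : "";
      return `selector ${wait.value}${limit}`;
    }
    case "networkIdle":
      return "network idle";
    case "delay":
      return `fixed delay of ${wait.ms} ms`;
    case "function":
      return `predicate ${wait.fn}`;
  }
}

const waits: WaitCondition[] = [
  { kind: "selector", value: "#content", timeout: 5000 },
  { kind: "networkIdle" },
  { kind: "delay", ms: 2000 },
];
for (const wait of waits) console.log(describeWait(wait));
```

Because the union is closed, adding a fifth `kind` later would make the compiler flag any `switch` that does not handle it.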

Capture Options

Option	Type	Default	Description
`screenshot`	`boolean`	`false`	Capture full-page screenshot
`pdf`	`boolean`	`false`	Capture page as PDF
`captureNetworkRequests`	`boolean`	`false`	Log network requests
`captureConsoleMessages`	`boolean`	`false`	Log console output

Layered Configuration

Feedstock resolves each setting from three configuration layers applied on top of its built-in defaults. The sources below are listed from highest to lowest precedence, and each one overrides those beneath it:

1. Programmatic overrides: values passed to createBrowserConfig() / createCrawlerRunConfig()
2. Environment variables: FEEDSTOCK_* vars
3. Project config file: feedstock.json in your project root
4. Built-in defaults
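The precedence order can be illustrated with plain object spreading, where later sources win. This is only a sketch of the idea, not feedstock's implementation: `BrowserSettings` is a trimmed-down hypothetical type and `resolveLayers` is a hypothetical helper.

```typescript
// Hypothetical, trimmed-down settings shape used only for this illustration.
interface BrowserSettings {
  browserType: "chromium" | "firefox" | "webkit";
  headless: boolean;
  verbose: boolean;
}

// Later spreads win: defaults < project file < environment < programmatic.
function resolveLayers(
  defaults: BrowserSettings,
  projectFile: Partial<BrowserSettings>,
  environment: Partial<BrowserSettings>,
  overrides: Partial<BrowserSettings>,
): BrowserSettings {
  return { ...defaults, ...projectFile, ...environment, ...overrides };
}

const resolved = resolveLayers(
  { browserType: "chromium", headless: true, verbose: false }, // built-in defaults
  { browserType: "firefox", verbose: true }, // from feedstock.json
  { browserType: "webkit" }, // from FEEDSTOCK_BROWSER_TYPE=webkit
  { headless: false }, // passed in code
);
console.log(resolved);
// browserType: "webkit", headless: false, verbose: true
```

Here `browserType` comes from the environment (it beats the project file), `verbose` comes from the project file (nothing above it sets the key), and `headless` comes from the programmatic override.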

Project Config File

Create a feedstock.json in your project root (or any ancestor directory). Feedstock walks up from cwd to find it.
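The upward search can be pictured as follows. This is a hedged sketch, not feedstock's code: `findProjectConfig` is a hypothetical helper, it assumes absolute POSIX-style paths, and it takes the existence check as a parameter so the example runs without touching the disk (in real code that check would typically be something like `fs.existsSync`).

```typescript
// Hypothetical helper, not part of the feedstock API.
// Walks from startDir toward the filesystem root and returns the first
// path where `exists` reports a config file, or null if none is found.
// Assumes absolute POSIX-style paths such as "/a/b/c".
function findProjectConfig(
  startDir: string,
  exists: (path: string) => boolean,
  fileName = "feedstock.json",
): string | null {
  // Drop trailing slashes; an empty result means the start was the root.
  let dir = startDir.replace(/\/+$/, "") || "/";
  while (true) {
    const candidate = dir === "/" ? `/${fileName}` : `${dir}/${fileName}`;
    if (exists(candidate)) return candidate;
    if (dir === "/") return null; // reached the root without a match
    const cut = dir.lastIndexOf("/");
    dir = cut <= 0 ? "/" : dir.slice(0, cut); // move to the parent directory
  }
}

// An in-memory set of paths stands in for the real filesystem.
const files = new Set(["/work/repo/feedstock.json"]);
const found = findProjectConfig("/work/repo/packages/app", (p) => files.has(p));
console.log(found); // "/work/repo/feedstock.json"
```

The nearest match wins, so a feedstock.json in a package directory would shadow one at the repository root under this scheme.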

{
  "browser": {
    "browserType": "chromium",
    "headless": true,
    "stealth": true,
    "verbose": false
  },
  "crawl": {
    "pageTimeout": 30000,
    "screenshot": false,
    "generateMarkdown": true
  }
}

The file accepts two top-level keys: browser (partial BrowserConfig) and crawl (partial CrawlerRunConfig).
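As a rough illustration of that two-key shape, the sketch below parses a config string and keeps only `browser` and `crawl`. Everything here is an assumption for illustration: `parseProjectConfig` and `isPlainObject` are hypothetical helpers, and how feedstock itself treats unknown keys or malformed values is not stated in this page.

```typescript
// Hypothetical shape: both sections are loose key/value bags in this sketch.
interface ProjectConfig {
  browser: Record<string, unknown>;
  crawl: Record<string, unknown>;
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Keeps only the two documented top-level keys; anything else is dropped.
// Missing or non-object sections fall back to an empty object.
function parseProjectConfig(jsonText: string): ProjectConfig {
  const raw: unknown = JSON.parse(jsonText);
  const root: Record<string, unknown> = isPlainObject(raw) ? raw : {};
  return {
    browser: isPlainObject(root.browser) ? root.browser : {},
    crawl: isPlainObject(root.crawl) ? root.crawl : {},
  };
}

const parsed = parseProjectConfig(
  '{ "browser": { "headless": true }, "crawl": { "pageTimeout": 30000 }, "other": 1 }',
);
console.log(parsed.browser.headless, parsed.crawl.pageTimeout); // true 30000
```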

Environment Variables

Set FEEDSTOCK_* environment variables to override project file settings. Useful for CI/CD and Docker deployments.

Variable	Type	Maps to
`FEEDSTOCK_BROWSER_TYPE`	`string`	`browser.browserType`
`FEEDSTOCK_HEADLESS`	`"true" \| "false"`	`browser.headless`
`FEEDSTOCK_USER_AGENT`	`string`	`browser.userAgent`
`FEEDSTOCK_STEALTH`	`"true" \| "false"`	`browser.stealth`
`FEEDSTOCK_VERBOSE`	`"true" \| "false"`	`browser.verbose`
`FEEDSTOCK_TEXT_MODE`	`"true" \| "false"`	`browser.textMode`
`FEEDSTOCK_CDP_URL`	`string`	`browser.backend` (sets `{ kind: "cdp", wsUrl }`)
`FEEDSTOCK_PROXY`	`string`	`browser.proxy.server`
`FEEDSTOCK_PROXY_USERNAME`	`string`	`browser.proxy.username`
`FEEDSTOCK_PROXY_PASSWORD`	`string`	`browser.proxy.password`
`FEEDSTOCK_PAGE_TIMEOUT`	`number`	`crawl.pageTimeout`
`FEEDSTOCK_SCREENSHOT`	`"true" \| "false"`	`crawl.screenshot`
`FEEDSTOCK_BLOCK_RESOURCES`	`"true" \| "false" \| profile name`	`crawl.blockResources`
`FEEDSTOCK_GENERATE_MARKDOWN`	`"true" \| "false"`	`crawl.generateMarkdown`

Set FEEDSTOCK_CDP_URL in your environment and your code doesn't need to change between local development and CI: code that builds its config from loadConfig() picks the value up automatically.
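To make the mapping concrete, here is a minimal sketch of turning a few of these variables into typed config fields. The variable names and target fields follow the table above, but the parsing rules (ignoring unrecognized booleans, skipping non-numeric timeouts) are assumptions, and `readEnvLayer`, `parseBool`, and the `ws://localhost:9222` URL are hypothetical.

```typescript
// Hypothetical sketch; only a subset of the documented variables is handled.
type Env = Record<string, string | undefined>;

interface EnvLayer {
  browser: {
    browserType?: string;
    headless?: boolean;
    backend?: { kind: "cdp"; wsUrl: string };
  };
  crawl: {
    pageTimeout?: number;
    screenshot?: boolean;
  };
}

// Returns undefined for unset or unrecognized values so that they
// never override a lower layer by accident.
function parseBool(value: string | undefined): boolean | undefined {
  if (value === "true") return true;
  if (value === "false") return false;
  return undefined;
}

function readEnvLayer(env: Env): EnvLayer {
  const layer: EnvLayer = { browser: {}, crawl: {} };

  if (env.FEEDSTOCK_BROWSER_TYPE) layer.browser.browserType = env.FEEDSTOCK_BROWSER_TYPE;

  const headless = parseBool(env.FEEDSTOCK_HEADLESS);
  if (headless !== undefined) layer.browser.headless = headless;

  // FEEDSTOCK_CDP_URL replaces the whole backend with a CDP connection.
  if (env.FEEDSTOCK_CDP_URL) {
    layer.browser.backend = { kind: "cdp", wsUrl: env.FEEDSTOCK_CDP_URL };
  }

  const timeout = Number(env.FEEDSTOCK_PAGE_TIMEOUT);
  if (env.FEEDSTOCK_PAGE_TIMEOUT && Number.isFinite(timeout)) {
    layer.crawl.pageTimeout = timeout;
  }

  const screenshot = parseBool(env.FEEDSTOCK_SCREENSHOT);
  if (screenshot !== undefined) layer.crawl.screenshot = screenshot;

  return layer;
}

const layer = readEnvLayer({
  FEEDSTOCK_HEADLESS: "false",
  FEEDSTOCK_CDP_URL: "ws://localhost:9222", // hypothetical endpoint
  FEEDSTOCK_PAGE_TIMEOUT: "30000",
});
console.log(layer);
```

In a Node process the argument would usually be `process.env`. Keys that are unset stay absent from the result, which matters when the layer is later spread over the project file values.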

Using `loadConfig()`

The loadConfig() function merges the project file and environment variable layers. Spread the result into your config creators to apply all layers:

import {
  loadConfig,
  createBrowserConfig,
  createCrawlerRunConfig,
  WebCrawler,
} from "feedstock";

const layered = loadConfig();

const browserConfig = createBrowserConfig({
  ...layered.browser,
  // Programmatic overrides (highest precedence)
  headless: false,
});

const crawlConfig = createCrawlerRunConfig({
  ...layered.crawl,
  screenshot: true,
});

const crawler = new WebCrawler(browserConfig);
const result = await crawler.crawl("https://example.com", crawlConfig);

loadConfig() accepts an optional { startDir } to control where the project file search begins (defaults to process.cwd()).
