Getting Started

Installation

Install the package

bun add feedstock

Install Playwright browsers

bunx playwright install chromium

Your First Crawl

import { WebCrawler } from "feedstock";

const crawler = new WebCrawler();

const result = await crawler.crawl("https://example.com");

if (result.success) {
  console.log("Title:", result.metadata?.title);
  console.log("Markdown:", result.markdown?.rawMarkdown);
  console.log("Links found:", result.links.internal.length);
}

await crawler.close();
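
In the example above, close() is skipped if crawl() throws. One way to guard against that is a small try/finally wrapper. The following is a minimal sketch, not part of feedstock: the `Closable` interface and the `withClosable` helper are hypothetical names introduced here, and the only assumption about the crawler is that it exposes an awaitable close(), as the example shows.

```typescript
// Minimal structural type: anything with an awaitable close() method.
// Assumption: WebCrawler fits this shape, since the example awaits crawler.close().
interface Closable {
  close(): Promise<unknown>;
}

// Hypothetical helper (not a feedstock API): runs `work` and always closes
// the resource afterwards, whether `work` resolves or throws.
async function withClosable<R extends Closable, T>(
  resource: R,
  work: (resource: R) => Promise<T>,
): Promise<T> {
  try {
    return await work(resource);
  } finally {
    await resource.close();
  }
}
```

Under these assumptions, the first crawl could be written as `await withClosable(new WebCrawler(), (c) => c.crawl("https://example.com"))`, which releases the browser even when the crawl fails.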

Crawl Multiple URLs

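// Assumption: CacheMode is exported by the same package as WebCrawler.
import { CacheMode } from "feedstock";

// Reuses the `crawler` instance created in the first example.
const results = await crawler.crawlMany(
  [
    "https://example.com",
    "https://example.com/about",
    "https://example.com/products",
  ],
  { cacheMode: CacheMode.Bypass },
  { concurrency: 3 },
);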

for (const result of results) {
  console.log(`${result.url}: ${result.success}`);
}
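
For larger batches, a per-URL log line can be hard to scan. The sketch below groups results into succeeded and failed URLs. It relies only on the `url` and `success` fields used in the loop above. The `CrawlOutcome` interface and the `partitionByOutcome` helper are hypothetical names introduced for illustration, and the sample data is made up.

```typescript
// Only the two fields the loop above reads; the real result type has more.
interface CrawlOutcome {
  url: string;
  success: boolean;
}

// Hypothetical helper (not a feedstock API): split a batch by outcome.
function partitionByOutcome(
  outcomes: CrawlOutcome[],
): { succeeded: string[]; failed: string[] } {
  const succeeded: string[] = [];
  const failed: string[] = [];
  for (const { url, success } of outcomes) {
    if (success) {
      succeeded.push(url);
    } else {
      failed.push(url);
    }
  }
  return { succeeded, failed };
}

// Illustrative data standing in for the array returned by crawlMany().
const sample: CrawlOutcome[] = [
  { url: "https://example.com", success: true },
  { url: "https://example.com/about", success: false },
  { url: "https://example.com/products", success: true },
];

const summary = partitionByOutcome(sample);
console.log(`${summary.succeeded.length} ok, ${summary.failed.length} failed`);
// prints "2 ok, 1 failed"
```

The failed list can then be retried or reported separately from the successful pages.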

Process Raw HTML

You can process HTML without launching a browser:

const html = "<html><body><h1>Hello</h1><p>World</p></body></html>";
const result = await crawler.processHtml(html);

console.log(result.markdown?.rawMarkdown);
// # Hello\n\nWorld

Project Structure

index.ts

crawler.ts

config.ts

models.ts
