Token-efficient web browser for LLM agents
A typical web page is 50,000+ tokens. The useful content? 2,000–5,000. BotBrowser strips the bloat and returns clean markdown — saving 90–95% of tokens.
No API key. No server. No config. Just install and extract.
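The savings figure above is simply the share of tokens removed. A minimal sketch of that arithmetic, using the page sizes quoted above; `token_savings_percent` here is a hypothetical helper written for illustration, not a function exported by BotBrowser:

```python
def token_savings_percent(raw_tokens: int, clean_tokens: int) -> int:
    """Percentage of tokens removed, rounded to the nearest whole number.

    Hypothetical helper for illustration; not part of the BotBrowser API.
    """
    if raw_tokens <= 0:
        raise ValueError("raw_tokens must be positive")
    return round(100 * (raw_tokens - clean_tokens) / raw_tokens)


# A 50,000-token page reduced to 5,000 tokens.
print(token_savings_percent(50_000, 5_000))  # 90

# The same page reduced to 3,000 tokens.
print(token_savings_percent(50_000, 3_000))  # 94
```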
Three lines to clean markdown
import { extract } from 'botbrowser';
const result = await extract('https://example.com');
console.log(result.content);
// # Article Title
// Clean markdown content...
console.log(result.metadata.tokenSavingsPercent);
// 94

from botbrowser import extract
result = extract("https://example.com")
print(result.content)
# # Article Title
# Clean markdown content...
print(result.metadata.token_savings_percent)
# 94
How it works
URL → Fetch → Extract → Clean → Convert → Markdown
Fetch
Smart HTTP with user-agent rotation, redirect handling, timeouts
Extract
Identifies main content using Readability (JS) / Trafilatura (Python)
Clean
Strips scripts, styles, ads, nav, footers, cookie banners, tracking
Convert
Clean Markdown preserving headings, lists, links, tables, code blocks
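To give a feel for the Clean and Convert steps, here is a deliberately small sketch built only on Python's standard-library `html.parser`. It is not BotBrowser's implementation: the real pipeline relies on Readability or Trafilatura to find the main content, while this toy version just drops a fixed set of tags and handles headings, paragraphs, list items, and links. The names `MarkdownSketch`, `html_to_markdown`, and `SKIP_TAGS` are assumptions made up for this example, and fetching is omitted entirely.

```python
from html.parser import HTMLParser

# Elements whose entire contents are discarded (the "Clean" step).
SKIP_TAGS = {"script", "style", "nav", "footer", "aside", "noscript"}
# Markdown prefixes for the block elements this sketch understands.
HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
BLOCKS = {"p", "li", *HEADINGS}


class MarkdownSketch(HTMLParser):
    """Toy HTML-to-Markdown converter; illustrative only."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a stripped element
        self.blocks = []      # finished Markdown blocks
        self.prefix = ""      # Markdown prefix of the current block
        self.buffer = None    # text pieces of the current block, or None
        self.href = None      # target of the link being read, if any

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth:
            return
        elif tag in BLOCKS:
            self.prefix = HEADINGS.get(tag, "- " if tag == "li" else "")
            self.buffer = []
        elif tag == "a" and self.buffer is not None:
            self.href = dict(attrs).get("href")
            if self.href:
                self.buffer.append("[")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth:
            return
        elif tag == "a" and self.buffer is not None and self.href:
            self.buffer.append(f"]({self.href})")
            self.href = None
        elif tag in BLOCKS and self.buffer is not None:
            # Collapse runs of whitespace left over from the HTML source.
            text = " ".join("".join(self.buffer).split())
            if text:
                self.blocks.append(self.prefix + text)
            self.buffer = None

    def handle_data(self, data):
        if not self.skip_depth and self.buffer is not None:
            self.buffer.append(data)


def html_to_markdown(html: str) -> str:
    """Strip boilerplate tags and return the remaining blocks as Markdown."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.blocks)


sample_html = """
<html><head><style>body{}</style><script>track()</script></head>
<body>
<nav><a href="/">Home</a></nav>
<h1>Article Title</h1>
<p>Read the <a href="https://example.com/docs">docs</a> first.</p>
<ul><li>One</li><li>Two</li></ul>
<footer><p>Cookie notice</p></footer>
</body></html>
"""
print(html_to_markdown(sample_html))
```

On the sample page, the navigation link, script, style, and footer text disappear, leaving the heading, the paragraph with its link, and the two list items. A production extractor needs far more than this (tables, code blocks, nested lists, content scoring), which is why the project builds on established engines rather than a hand-rolled parser.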
Why BotBrowser?
Purpose-built for LLM agents that need to read the web.
Token-first
Built specifically to minimize LLM token usage. Every design decision optimizes for fewer tokens while preserving meaning.
Dual native SDKs
Real implementations in both JS and Python, not thin wrappers. Use whichever fits your stack.
Zero setup
npm install or pip install. No API key, no account, no server to run. Extraction runs locally and works offline; only fetching a page needs a network connection.
Battle-tested extraction
Mozilla Readability and Trafilatura — the same engines powering Firefox Reader View and academic web research.
Open source
MIT licensed. Self-host, fork, embed, do what you want. No vendor lock-in.
MCP ready
Hosted MCP server for AI agents at scale. JS rendering, batch processing, search + extract.
Pricing
Start free with open source. Scale with the hosted service.
Open Source
Self-hosted extraction engine
- ✓ npm install / pip install
- ✓ 100% local, no server
- ✓ Markdown + text output
- ✓ Link extraction
- ✓ Works offline
- ✓ MIT licensed
Pro
Hosted MCP server for agents at scale
- ✓ 10,000 requests/day
- ✓ JS rendering (Playwright)
- ✓ Token budget summarization
- ✓ CSS selector targeting
- ✓ Batch extraction (10 URLs)
- ✓ Search + extract
- ✓ 24-hour caching
- ✓ Email support
Enterprise
Custom infrastructure + SLA
- ✓ Custom request limits
- ✓ All Pro features
- ✓ Priority queue
- ✓ Custom caching policy
- ✓ Webhook callbacks
- ✓ Dedicated support
- ✓ 99.9% SLA
- ✓ Audit logs