TinyFish, the Palo Alto agent infrastructure company, shipped Bigset as an open-source release on June 2. The system accepts a natural-language request and delivers a structured, export-ready dataset by dispatching agents against the live web.
The architecture splits the work across two roles. An orchestrator agent handles breadth-first discovery and delegates row-filling to sub-agents, each capped at six tool calls. Sub-agents are instructed to leave fields blank rather than guess, and duplicate primary keys are rejected automatically. Datasets build in two to five minutes and can be set to refresh on a cadence as short as thirty minutes.
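The row-filling rules described above can be illustrated with a minimal Python sketch. This is not Bigset's code: `Dataset`, `fill_row`, `build_dataset`, and the `lookup` callback are hypothetical names, and treating each field lookup as one tool call is an assumption made for illustration. Only the six-call cap, the blank-over-guess rule, and the duplicate-key rejection come from the report.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

MAX_TOOL_CALLS = 6  # per-sub-agent cap reported in the article

# Hypothetical tool: returns a value for (key, column), or None if nothing was found.
Lookup = Callable[[str, str], Optional[str]]


@dataclass
class Dataset:
    primary_key: str
    columns: list[str]
    rows: dict[str, dict[str, Optional[str]]] = field(default_factory=dict)

    def add_row(self, row: dict[str, Optional[str]]) -> bool:
        """Insert a row; return False if its primary key is missing or already present."""
        key = row.get(self.primary_key)
        if key is None or key in self.rows:
            return False  # duplicate (or keyless) rows are rejected
        self.rows[key] = row
        return True


def fill_row(key: str, columns: list[str], primary_key: str, lookup: Lookup) -> dict[str, Optional[str]]:
    """Sub-agent role: fill one row, spending at most MAX_TOOL_CALLS lookups."""
    row: dict[str, Optional[str]] = {col: None for col in columns}
    row[primary_key] = key
    calls = 0
    for col in columns:
        if col == primary_key:
            continue
        if calls >= MAX_TOOL_CALLS:
            break  # budget spent: remaining fields stay blank
        calls += 1
        row[col] = lookup(key, col)  # None stays None; no guessing
    return row


def build_dataset(keys: list[str], columns: list[str], primary_key: str, lookup: Lookup) -> Dataset:
    """Orchestrator role: take discovered keys and delegate each row to fill_row."""
    dataset = Dataset(primary_key=primary_key, columns=columns)
    for key in keys:
        dataset.add_row(fill_row(key, columns, primary_key, lookup))
    return dataset
```

In this sketch a field the lookup cannot resolve stays `None`, and a second row with the same key never overwrites the first. A real orchestrator would presumably avoid dispatching a sub-agent for a key it has already seen; the sketch rejects at insert time only to keep the rule visible.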
Bigset runs self-hosted via Docker. Schema inference defaults to Claude Sonnet 4.6 and agent roles default to Qwen3.7-max; both are routed through OpenRouter and can be swapped per role. The free tier covers 2,500 row operations per month. TinyFish notes that the project is experimental and performs best on topics with broad public web coverage.
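Per-role model selection of this kind could look roughly like the Python sketch below. The role keys, `DEFAULT_MODELS`, and `resolve_models` are hypothetical and do not reflect Bigset's actual configuration format. The model strings are the display names given in the report, not verified OpenRouter model identifiers.

```python
from typing import Optional

# Defaults as reported; role names are assumptions made for this illustration.
DEFAULT_MODELS = {
    "schema_inference": "Claude Sonnet 4.6",
    "orchestrator": "Qwen3.7-max",
    "sub_agent": "Qwen3.7-max",
}


def resolve_models(overrides: Optional[dict[str, str]] = None) -> dict[str, str]:
    """Return the model for each role, applying per-role overrides to the defaults."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_MODELS)
    if unknown:
        raise ValueError(f"unknown role(s): {sorted(unknown)}")
    # Build a new dict so the defaults are never mutated.
    return {**DEFAULT_MODELS, **overrides}
```

Calling `resolve_models({"sub_agent": "another-model"})` would swap only the sub-agent model and leave the other roles on their defaults.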
For analysts and operators who have been stitching together scrapers and parsers to answer structured research questions, Bigset is worth a quick evaluation run before committing to a paid data vendor.
Testing Catalog (testingcatalog.com) reporting on TinyFish, 2026-06-02.