A zero-dependency CLI for picking the right local model

Ollama Model Tester runs any prompt against every model on your machine and saves responses, latency, and token counts for direct comparison.

Alessandro Benigni

PUBLISHED JUN 6, 2026



The hard part of running local models is no longer getting them running. Developers who use Ollama now routinely have a dozen weights pulled and no fast way to know which one handles a specific task best.

Ollama Model Tester, an MIT-licensed CLI released by Ulysses Tenn on GitHub, solves that directly. Give it a prompt and it fires that prompt against whichever local models you select, repeats each run N times at a chosen temperature, and writes responses plus Ollama metadata (token counts, timing) to a structured folder keyed on the prompt. Same prompt, different model, same folder: comparison is the default output shape, not an afterthought.
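The loop described above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the tool's actual code: the function names (`prompt_key`, `post_json`, `compare`), the folder layout, and the record fields are assumptions made for the example. The only external pieces relied on are Ollama's documented `/api/generate` endpoint on its default port and the `response`, `prompt_eval_count`, `eval_count`, and `total_duration` fields it returns.

```python
import hashlib
import json
import re
import urllib.request
from pathlib import Path

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def prompt_key(prompt: str) -> str:
    """Folder name derived from the prompt: a readable slug plus a short hash."""
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")[:40]
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:8]
    return f"{slug}-{digest}" if slug else digest


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON reply (stdlib only)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))


def compare(prompt, models, runs=3, temperature=0.7, out_root="results", send=post_json):
    """Run `prompt` against each model `runs` times; save one JSON file per run."""
    # Hypothetical layout: <out_root>/<prompt key>/<model>_run<N>.json
    folder = Path(out_root) / prompt_key(prompt)
    folder.mkdir(parents=True, exist_ok=True)
    written = []
    for model in models:
        for run in range(1, runs + 1):
            reply = send(OLLAMA_URL, {
                "model": model,
                "prompt": prompt,
                "stream": False,  # ask for a single JSON object, not a stream
                "options": {"temperature": temperature},
            })
            record = {
                "model": model,
                "run": run,
                "temperature": temperature,
                "response": reply.get("response"),
                "prompt_tokens": reply.get("prompt_eval_count"),
                "output_tokens": reply.get("eval_count"),
                "total_seconds": (reply.get("total_duration") or 0) / 1e9,  # reported in ns
            }
            safe_model = re.sub(r"[^A-Za-z0-9._-]+", "_", model)  # "llama3:8b" -> "llama3_8b"
            path = folder / f"{safe_model}_run{run}.json"
            path.write_text(json.dumps(record, indent=2), encoding="utf-8")
            written.append(path)
    return written
```

With Ollama running, a call such as `compare("Summarize this changelog in one line.", ["llama3:8b", "mistral"], runs=2, temperature=0.2)` would leave four JSON files in a single prompt-keyed folder (the model names here are placeholders for whatever is pulled locally). Keying the folder on the prompt rather than the model is what makes side-by-side comparison the default layout.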

The tool requires only Python 3.7 and a running Ollama instance. No pip install. It is fully scriptable via flags once you drop the interactive setup.

This is the unglamorous infrastructure the local-models movement actually needs. Model releases are outpacing the tooling to evaluate them for real workloads. If your team has a half-dozen fine-tunes or quantized weights in rotation, a structured empirical test beats reading benchmark leaderboards that were not run on your hardware or your prompts.

Ulysses Tenn on GitHub (github.com/ulyssestenn/omt), 2026-06-04.
