Poor Paul's Benchmark MCP Server

Ask an LLM to recommend the right quantization for your hardware. The ppb-mcp server connects any MCP-aware client to 30,000+ real GPU benchmark rows: throughput, time to first token (TTFT), inter-token latency (ITL), VRAM, and power data across hundreds of model × GPU × quantization combinations.

Connect in 60 Seconds

Hosted (recommended)

Add to your MCP client config. Claude Desktop example (~/Library/Application Support/Claude/claude_desktop_config.json):

claude_desktop_config.json
{
  "mcpServers": {
    "ppb": {
      "transport": { "type": "http", "url": "https://mcp.poorpaul.dev/mcp" }
    }
  }
}

No API key. No sign-up. Restart Claude Desktop and the tools are available.
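If the config file already lists other MCP servers, the "ppb" entry has to be merged in rather than pasted over them. The following is a minimal Python sketch of that merge, assuming the hosted entry shown above; the helper names `add_ppb_server` and `update_config_file` are hypothetical and not part of ppb-mcp.

```python
import json
from pathlib import Path

# Entry copied from the hosted example above.
PPB_HOSTED = {"transport": {"type": "http", "url": "https://mcp.poorpaul.dev/mcp"}}


def add_ppb_server(config: dict, entry: dict = PPB_HOSTED) -> dict:
    """Return a copy of `config` with a "ppb" entry under "mcpServers".

    Servers and settings already present in the config are left untouched.
    (Hypothetical helper, for illustration only.)
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["ppb"] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config at `path` if it exists, add ppb, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_ppb_server(existing), indent=2) + "\n")
```

On macOS this could be called as `update_config_file(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json")`, followed by a restart of Claude Desktop.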

Run locally

bash
pip install ppb-mcp
MCP_TRANSPORT=stdio ppb-mcp

Claude Desktop config for local stdio:

claude_desktop_config.json
{
  "mcpServers": {
    "ppb": {
      "command": "ppb-mcp",
      "env": { "MCP_TRANSPORT": "stdio" }
    }
  }
}
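Desktop applications do not always inherit your shell's PATH, so a bare "ppb-mcp" command may not resolve even though it works in a terminal. As a hedged sketch, the Python snippet below builds the same stdio entry but substitutes the absolute path of the executable when it can be found; `local_entry` is a hypothetical helper, not something ppb-mcp ships.

```python
import shutil


def local_entry(command: str = "ppb-mcp") -> dict:
    """Build the local stdio entry shown above.

    If `command` is found on the current PATH, its absolute path is used;
    otherwise the bare name is kept, exactly as in the example config.
    (Hypothetical helper, for illustration only.)
    """
    resolved = shutil.which(command)
    return {
        "command": resolved or command,
        "env": {"MCP_TRANSPORT": "stdio"},
    }
```

Running `print(local_entry())` in the same environment where you ran `pip install ppb-mcp` shows the value to place under "ppb" in the config.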

What You Can Ask

prompts
> Which quantization should I run on my RTX 4090 with 24 GB VRAM for 2 concurrent users?
> Show me every Q4_K_M result on the RTX 5090.
> Will Llama-13B at Q5_K_M fit on a 16 GB GPU at 4 concurrent users?
> What's the most power-efficient quantization for Qwen3.5-7B?
> Compare throughput for Q4_K_M vs Q8_0 on 24 GB GPUs.

The Four Tools

| Tool | What it does |
| --- | --- |
| recommend_quantization | Empirical-first recommendation engine: returns the best quant for your VRAM and user count, with a confidence level (high / medium / low) |
| query_ppb_results | Filters raw benchmark rows by GPU, model, quantization, VRAM range, or concurrency |
| get_gpu_headroom | Sanity-checks a (GPU, model, quant, users) config and tells you VRAM required vs. available |
| list_tested_configs | Lists every tested GPU, model, and quantization; call this first to orient yourself |

Call list_tested_configs first when exploring — it gives you the exact GPU names, model names, and quantization strings used in the dataset, so your follow-up queries match.
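For clients that talk to the server directly, a tool invocation is a JSON-RPC 2.0 request with the method "tools/call". The Python sketch below builds such a request and shows one way to map loosely typed user input onto the exact strings returned by list_tested_configs. It is illustrative only: `tool_call` and `closest_name` are hypothetical helpers, and the argument names each tool accepts are not documented on this page, so check the tool schemas your client reports.

```python
import difflib
from itertools import count
from typing import Optional, Sequence

_request_ids = count(1)  # JSON-RPC requests each need a unique id


def tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Build an MCP "tools/call" JSON-RPC 2.0 request body."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


def closest_name(user_text: str, known: Sequence[str]) -> Optional[str]:
    """Map user input to the closest exact dataset string, ignoring case.

    `known` is assumed to come from list_tested_configs; returns None
    when nothing is similar enough.
    """
    by_lower = {k.lower(): k for k in known}
    hits = difflib.get_close_matches(
        user_text.lower(), list(by_lower), n=1, cutoff=0.6
    )
    return by_lower[hits[0]] if hits else None


# Step 1: discover the exact names used in the dataset.
first = tool_call("list_tested_configs")

# Step 2: normalize user input before a follow-up query.
quant = closest_name("q4_k_m", ["Q4_K_M", "Q8_0"])
# "quantization" is an assumed argument name, not confirmed by this page.
follow_up = tool_call("query_ppb_results", {"quantization": quant})
```

Matching against the listed names first avoids empty results caused by a spelling that differs from the dataset's.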

Server Details

Endpoint: https://mcp.poorpaul.dev/mcp
Transport: Streamable HTTP (MCP 2025-03-26 spec)
Auth: None (public, read-only)
Health: https://mcp.poorpaul.dev/health
Dataset: paulplee/ppb-results
Source: github.com/paulplee/ppb-mcp
Install: pip install ppb-mcp
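To illustrate what the Streamable HTTP transport means in practice, here is a rough Python sketch of the first message a client sends: an "initialize" request POSTed to the endpoint, with an Accept header that allows either a JSON or an event-stream reply. The client name and version are placeholders, and `build_initialize_request` is a hypothetical helper; a real MCP client library normally handles this handshake for you.

```python
import json
import urllib.request

ENDPOINT = "https://mcp.poorpaul.dev/mcp"


def build_initialize_request(url: str = ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) an MCP initialize request for Streamable HTTP."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Placeholder client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The server may answer with plain JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the request; the body that comes back may be plain JSON or a server-sent event stream, so inspect the response's Content-Type before parsing.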

The server is open source and self-hostable. If you want to run your own instance against a custom dataset, see the ppb-mcp README for configuration. To contribute benchmark results that make the recommendations better, see poor-pauls-benchmark.
