Poor Paul's Benchmark MCP Server

Ask an LLM to recommend the right quantization for your hardware. The ppb-mcp server connects any MCP-aware client to 30,000+ real GPU benchmark rows: throughput, time to first token (TTFT), inter-token latency (ITL), VRAM, and power data across hundreds of model × GPU × quantization combinations.

Connect in 60 Seconds

Hosted (recommended)

Add to your MCP client config. Claude Desktop example (~/Library/Application Support/Claude/claude_desktop_config.json):

claude_desktop_config.json

{
  "mcpServers": {
    "ppb": {
      "transport": { "type": "http", "url": "https://mcp.poorpaul.dev/mcp" }
    }
  }
}

No API key. No sign-up. Restart Claude Desktop and the tools are available.
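If the config file already lists other servers, the `ppb` entry has to be merged into the existing `mcpServers` object rather than pasted over it. Below is a minimal Python sketch of that merge. The helper names `add_ppb_server` and `update_config_file` are illustrative and not part of ppb-mcp; the entry itself is copied from the example above.

```python
import json
from pathlib import Path

# Entry copied verbatim from the hosted example above.
PPB_HOSTED = {"transport": {"type": "http", "url": "https://mcp.poorpaul.dev/mcp"}}


def add_ppb_server(config: dict, entry: dict = PPB_HOSTED) -> dict:
    """Return a copy of config with a "ppb" server added, keeping other servers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["ppb"] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config at path (if any), merge the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_ppb_server(existing), indent=2) + "\n")
```

Calling `update_config_file(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json")` would apply it to the macOS path mentioned above; keeping a backup of the file first is a sensible precaution.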

Run locally

pip install ppb-mcp
MCP_TRANSPORT=stdio ppb-mcp

Claude Desktop config for local stdio:

claude_desktop_config.json

{
  "mcpServers": {
    "ppb": {
      "command": "ppb-mcp",
      "env": { "MCP_TRANSPORT": "stdio" }
    }
  }
}
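A desktop client may launch the server with a different PATH than your shell, in which case the bare `ppb-mcp` command might not resolve. One possible workaround, sketched here under that assumption, is to write the absolute path into the config. The function name `local_stdio_entry` is hypothetical; it only builds the same entry shown above.

```python
import shutil
from typing import Optional


def local_stdio_entry(command: Optional[str] = None) -> dict:
    """Build the "ppb" entry for local stdio use.

    If command is omitted, look up ppb-mcp on PATH and use its absolute path,
    falling back to the bare name when it is not found.
    """
    resolved = command or shutil.which("ppb-mcp") or "ppb-mcp"
    return {"command": resolved, "env": {"MCP_TRANSPORT": "stdio"}}
```

The returned dict goes under `"mcpServers" -> "ppb"` exactly like the hand-written version.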

What You Can Ask

> Which quantization should I run on my RTX 4090 with 24 GB VRAM for 2 concurrent users?

> Show me every Q4_K_M result on the RTX 5090.

> Will Llama-13B at Q5_K_M fit on a 16 GB GPU at 4 concurrent users?

> What's the most power-efficient quantization for Qwen3.5-7B?

> Compare throughput for Q4_K_M vs Q8_0 on 24 GB GPUs.

The Four Tools

Tool	What it does
`recommend_quantization`	Empirical-first recommendation engine: returns the best quantization for your VRAM and user count, with a confidence level (high / medium / low)
`query_ppb_results`	Filter raw benchmark rows by GPU, model, quantization, VRAM range, or concurrency
`get_gpu_headroom`	Sanity-checks a (GPU, model, quant, users) config — tells you VRAM required vs available
`list_tested_configs`	Lists every tested GPU, model, and quantization — call this first to orient yourself

Call `list_tested_configs` first when exploring: it returns the exact GPU names, model names, and quantization strings used in the dataset, so your follow-up queries match.
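Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds that message for `list_tested_configs`, assuming the tool takes no arguments (the source does not document argument names for any of the four tools, so read them from a `tools/list` response before calling the others). `tool_call` and `HEADERS` are illustrative names. A real session also starts with an `initialize` handshake, which an MCP client library normally handles for you.

```python
import json
from itertools import count
from typing import Optional

# Headers a Streamable HTTP client sends with each POST, per the MCP transport spec.
HEADERS = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
}

_ids = count(1)  # simple source of unique request ids


def tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request that invokes one MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Assumption: list_tested_configs needs no arguments.
request_body = json.dumps(tool_call("list_tested_configs"))
```

The resulting `request_body` is what would be POSTed to the server endpoint after initialization.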

Server Details

Endpoint	`https://mcp.poorpaul.dev/mcp`
Transport	Streamable HTTP (MCP 2025-03-26 spec)
Auth	None — public, read-only
Health	`https://mcp.poorpaul.dev/health`
Dataset	paulplee/ppb-results
Source	github.com/paulplee/ppb-mcp
Install	`pip install ppb-mcp`
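Before debugging a client config, it can help to confirm the hosted server is reachable. Here is a small sketch using only the standard library. The response body of the health endpoint is not documented here, so the check looks at the HTTP status alone; `is_healthy` and its `opener` parameter are hypothetical names added for illustration.

```python
import urllib.request

HEALTH_URL = "https://mcp.poorpaul.dev/health"


def is_healthy(url: str = HEALTH_URL, timeout: float = 5.0,
               opener=urllib.request.urlopen) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        return False
```

`opener` exists so the function can be exercised without network access; in normal use, `is_healthy()` with no arguments is enough.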

The server is open source and self-hostable. If you want to run your own instance against a custom dataset, see the ppb-mcp README for configuration. To contribute benchmark results that make the recommendations better, see poor-pauls-benchmark.
