Repogen

The private inference layer for AI agents

Point your agent at one endpoint and reach every open models. It pays per call in USDC and no logs. Repogen routes each call to the right provider and caps what the agent can spend.

Arena

Read the quickstart

Repogen

Get started

Docs

BUILT FOR AGENTS

Built for agents you can't watch every second.

Built for agents you can’t watch every second.

Private by default.

Your agent makes hundreds of calls carrying real data, repogen keeps nothing but the token count it needs to settle the bill. Never a prompt, never a reply. Route to a sealed enclave when even we should not see it

Spend control built in.

Set a hard limit per agent, per task, and per day. When an agent hits its cap, repogen stops the call instead of letting it run. No runaway loops, no surprise bills.

Pay per call.

Your agent pays for each request on Base via USDC and authorized up to a ceiling. Your wallet balance and transaction history are always on-chain

DROP-IN

Point your agent at repogen.

Keep your framework. Change the base URL and key, or add one MCP server, and every call your agent makes runs through repogen.

Open AI

MCP

from openai import OpenAI

client = OpenAI(
base_url=”https://api.repogen.xyz/v1”,
api_key=”rg_live_…”,
)

response = client.chat.completions.create(
model=”meta-llama/llama-3.1-70b-instruct”,
messages=[{role: user, content: Hello}],
)

Open Ai

MCP

from openai import OpenAI

client = OpenAI(

base_url=“https://api.repogen.xyz/v1”, # the only change

api_key=“rg_live_…,”,

)

response = client.chat.completions.create(

model=“meta-llama/llama-3.1-70b-instruct”,

messages=[{“role”: “user”, “content”: “Hello”}],

}

Same format your agent already speaks. Tools, streaming, and model calls keep working.