Peak AI API
An OpenAI-compatible chat completions API. Point any OpenAI SDK at https://ai.pearlfibers.com/v1 and bring a Peak API key.
Base URL: https://ai.pearlfibers.com/v1
Authentication header: Authorization: Bearer sk-peak-…
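For callers that build requests without an SDK, the header above can be assembled with a small helper. This is a minimal sketch: `auth_headers` and `KEY_PATTERN` are hypothetical names, and the pattern is inferred from the `sk-peak-<64 hex>` format listed under Errors. Accepting both upper- and lowercase hex digits is an assumption.

```python
import re

# Assumed from the documented key shape sk-peak-<64 hex>; case handling is a guess.
KEY_PATTERN = re.compile(r"sk-peak-[0-9a-fA-F]{64}")


def auth_headers(api_key: str) -> dict:
    """Return request headers for the API, rejecting obviously malformed keys early."""
    if not KEY_PATTERN.fullmatch(api_key):
        raise ValueError("key does not look like sk-peak-<64 hex>")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

A local check like this only catches typos and truncated keys. The server remains the authority on whether a key is valid or revoked.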
Models & pricing
Rates are in USD per million tokens. Costs are computed per request and frozen at write time — rate changes do not affect historical bills.
| Model ID | Display name | Context | Capabilities | Input | Output |
|---|---|---|---|---|---|
| peak-v4.3 | Peak AI V4.3 | 65,536 tokens | text | $1.25/MTok | $10/MTok |
| peak-v4.3-light | Peak AI V4.3 Light | 65,536 tokens | text | $0.30/MTok | $2.50/MTok |
| peak-v4.3-vision | Peak AI V4.3 Vision | 32,768 tokens | vision | $1.25/MTok | $10/MTok |
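To make the per-million-token rates concrete, the sketch below estimates the cost of a single request from its token counts. `RATES` and `estimate_cost_usd` are hypothetical names, and the rates are copied from the table above. How Peak rounds the billed amount is not documented here, so treat the result as an estimate rather than the invoiced figure.

```python
from decimal import Decimal

# USD per million tokens, copied from the pricing table: (input, output).
RATES = {
    "peak-v4.3": (Decimal("1.25"), Decimal("10")),
    "peak-v4.3-light": (Decimal("0.30"), Decimal("2.50")),
    "peak-v4.3-vision": (Decimal("1.25"), Decimal("10")),
}


def estimate_cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate one request's cost in USD (hypothetical helper, no rounding applied)."""
    input_rate, output_rate = RATES[model]
    million = Decimal(1_000_000)
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / million


# 2,000 input tokens and 500 output tokens on peak-v4.3:
# 2000 * 1.25 / 1e6 + 500 * 10 / 1e6 = 0.0025 + 0.0050 = 0.0075 USD
print(estimate_cost_usd("peak-v4.3", 2000, 500))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift.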
Quickstart
curl
curl https://ai.pearlfibers.com/v1/chat/completions \
  -H "Authorization: Bearer $PEAK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "peak-v4.3-light",
    "messages": [
      {"role": "system", "content": "You are concise."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
Python (openai SDK)
from openai import OpenAI

client = OpenAI(
    api_key="sk-peak-...",
    base_url="https://ai.pearlfibers.com/v1",
)

resp = client.chat.completions.create(
    model="peak-v4.3",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
Node.js (openai SDK)
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.PEAK_API_KEY,
  baseURL: "https://ai.pearlfibers.com/v1",
});

const resp = await client.chat.completions.create({
  model: "peak-v4.3",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(resp.choices[0].message.content);
LangChain
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="peak-v4.3",
    api_key="sk-peak-...",
    base_url="https://ai.pearlfibers.com/v1",
)
print(llm.invoke("Hello!").content)
Streaming
Set stream: true to receive Server-Sent Events. Each event is a chat.completion.chunk, ending with data: [DONE]. Token usage arrives as a final chunk before [DONE].
# Reuses the client created in the Python quickstart.
resp = client.chat.completions.create(
    model="peak-v4.3-light",
    messages=[{"role": "user", "content": "Stream me a haiku."}],
    stream=True,
)
for chunk in resp:
    # The final usage chunk may carry no choices, so guard before indexing.
    delta = chunk.choices[0].delta.content if chunk.choices else None
    if delta:
        print(delta, end="", flush=True)
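Since token usage arrives in a final chunk, a caller that wants both the text and the usage needs to read every chunk rather than stopping at the last content delta. The sketch below shows one way to do that. `collect_stream` is a hypothetical helper, and it assumes the usage chunk exposes a `usage` attribute and an empty `choices` list, as chunks do in the OpenAI SDK. The stand-in chunks are built with `SimpleNamespace` so the example runs without a network call.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join streamed text and return (text, usage); usage is None if no chunk carried it."""
    parts = []
    usage = None
    for chunk in chunks:
        # Assumption: the usage chunk has a non-None `usage` attribute.
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
        if chunk.choices:
            delta = chunk.choices[0].delta.content
            if delta:
                parts.append(delta)
    return "".join(parts), usage


def _text_chunk(text):
    # Stand-in for a chat.completion.chunk that carries a content delta.
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)], usage=None)


fake_stream = [
    _text_chunk("Hi"),
    _text_chunk(" there"),
    # Stand-in for the final usage chunk sent before [DONE].
    SimpleNamespace(
        choices=[],
        usage=SimpleNamespace(prompt_tokens=5, completion_tokens=2, total_tokens=7),
    ),
]

text, usage = collect_stream(fake_stream)
print(text, usage.total_tokens)
```

With a real client, passing the object returned by `client.chat.completions.create(..., stream=True)` in place of `fake_stream` should work the same way, provided the chunk shape matches the assumption above.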
Rate limits
Limits are enforced per API key on a tumbling 60-second window. The defaults are 60 requests/min and 200,000 tokens/min — your workspace admin can raise or lower these per key.
Every response carries these headers:
- X-RateLimit-Limit-Requests / X-RateLimit-Remaining-Requests
- X-RateLimit-Limit-Tokens / X-RateLimit-Remaining-Tokens
- X-RateLimit-Reset: seconds until the bucket resets
When you hit a limit, the API returns 429 with a Retry-After header. A monthly usage cap (if set by your admin) returns 402 usage_cap_exceeded until the cap is raised or the month rolls over.
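A client can use these headers to decide how long to pause after a 429. The sketch below is one possible approach, not a prescribed one: `seconds_to_wait` is a hypothetical helper, it assumes Retry-After is expressed in seconds (as X-RateLimit-Reset is documented to be), and it falls back to a default delay when neither header can be parsed.

```python
def seconds_to_wait(status: int, headers: dict, default: float = 1.0) -> float:
    """Return how long to pause before retrying; 0.0 if the response was not a 429."""
    if status != 429:
        return 0.0
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Prefer Retry-After, then fall back to the bucket reset time.
    for name in ("retry-after", "x-ratelimit-reset"):
        try:
            return max(0.0, float(lowered[name]))
        except (KeyError, ValueError):
            continue  # header missing or not a plain number of seconds
    return default


print(seconds_to_wait(429, {"Retry-After": "12"}))
```

A 402 usage_cap_exceeded response is deliberately not treated as retryable here, since waiting a few seconds will not lift a monthly cap.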
Errors
All errors use OpenAI's shape so existing SDK error handlers work unchanged:
{
"error": {
"message": "Human-readable explanation.",
"type": "invalid_request_error",
"code": "missing_model"
}
}
| HTTP | Code | Meaning |
|---|---|---|
| 400 | missing_model | The request didn't include a model field. |
| 400 | missing_messages | The request didn't include a messages array. |
| 400 | invalid_messages / invalid_role | A message was malformed or used an unsupported role. |
| 401 | missing_authorization | No Authorization: Bearer header. |
| 401 | invalid_key_format | The key doesn't match sk-peak-<64 hex>. |
| 401 | invalid_key / revoked_key | Unknown or revoked key. |
| 402 | usage_cap_exceeded | This account hit its monthly cap. |
| 403 | access_revoked | API access has been revoked for the workspace. |
| 404 | model_not_found | That model isn't active on Peak AI. |
| 429 | requests_per_minute_exceeded | RPM bucket full — see Retry-After. |
| 429 | tokens_per_minute_exceeded | TPM bucket full — see Retry-After. |
| 502 | upstream_error | The upstream Peak provider returned an error. |
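For clients that call the HTTP API directly, the error body can be parsed and sorted into "retry later" and "fix the request" cases. The following is a sketch under assumptions: `parse_error` and `RETRYABLE_CODES` are hypothetical names, and treating the two 429 codes and `upstream_error` as retryable is a design choice made here, not something the API specifies.

```python
import json

# Assumption: only rate-limit and upstream failures are worth retrying.
RETRYABLE_CODES = {
    "requests_per_minute_exceeded",
    "tokens_per_minute_exceeded",
    "upstream_error",
}


def parse_error(status: int, body: str) -> dict:
    """Extract the error fields from a response body and flag whether a retry may help."""
    error = json.loads(body).get("error", {})
    code = error.get("code")
    return {
        "status": status,
        "type": error.get("type"),
        "code": code,
        "message": error.get("message"),
        "retryable": code in RETRYABLE_CODES,
    }


sample = '{"error": {"message": "Human-readable explanation.", "type": "invalid_request_error", "code": "missing_model"}}'
print(parse_error(400, sample))
```

Branching on `code` rather than on the HTTP status alone keeps cases like `invalid_key` and `revoked_key` distinguishable, since both arrive as 401.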