A step-by-step guide to cutting your LLM costs, with verified savings at each step.
### Find where you're spending the most tokens

```javascript
// Analyze your token usage
const usage = await client.estimateTokens(prompt, { mode: "medium" });
console.log(`Current cost: ${usage.original_tokens} tokens`);
```

### Use balanced mode for optimal savings
```javascript
const result = await client.optimize(prompt, {
  mode: "balanced", // Best balance of savings and quality
  format: "auto"
});
```

### Reduce chat history token usage
```javascript
const compressed = await client.compressHistory(
  messages,
  currentInput,
  { mode: "balanced" }
);
```

### Track your cost reduction
```javascript
console.log(`Saved ${result.tokensSaved} tokens`);
console.log(`Cost reduction: ${result.compression}%`);
```
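If you want a rough offline sanity check before calling `estimateTokens`, a common rule of thumb for English text is about four characters per token. The helper below is a sketch built on that assumption; it is not part of the SDK, and real tokenizers vary by model:

```javascript
// Rough offline token estimate: ~4 characters per token is a common
// heuristic for English text (actual tokenizers vary by model).
function roughTokenEstimate(text) {
  return Math.ceil(text.length / 4);
}

const samplePrompt = "Summarize the quarterly report in three bullet points.";
console.log(`~${roughTokenEstimate(samplePrompt)} tokens`);
```

This is only good enough for ballpark budgeting; use the API's estimate for anything you bill against.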
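To see what the `compressHistory` step is saving you from, here is a minimal client-side sketch of the naive alternative: dropping the oldest messages until the history fits a token budget. The function name, message shape, and four-characters-per-token heuristic are illustrative assumptions, not SDK behavior:

```javascript
// Naive history trimming: keep only the newest messages that fit the budget.
// A semantic compressor can preserve far more context than this.
function trimHistory(messages, maxTokens) {
  const estimate = (m) => Math.ceil(m.content.length / 4);
  const kept = [];
  let used = 0;
  // Walk from newest to oldest so the most recent context survives.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimate(messages[i]);
    if (used + cost > maxTokens) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}

const history = [
  { role: "user", content: "x".repeat(400) },      // ~100 tokens
  { role: "assistant", content: "y".repeat(400) }, // ~100 tokens
  { role: "user", content: "z".repeat(40) },       // ~10 tokens
];
console.log(trimHistory(history, 120).length); // oldest message dropped
```

Truncation like this discards information outright; compression instead rewrites the history to fit the same budget while keeping its meaning.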
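As a sanity check on the percentage printed in the last step, cost reduction is just one minus the ratio of optimized to original tokens. A small helper (hypothetical names, not part of the SDK):

```javascript
// Percent reduction from original to optimized token counts.
function costReductionPercent(originalTokens, optimizedTokens) {
  return (1 - optimizedTokens / originalTokens) * 100;
}

// e.g. 1,000 tokens compressed to 600 is a 40% reduction
console.log(`${costReductionPercent(1000, 600)}%`);
```

Because output tokens are typically priced higher than input tokens, the same percentage reduction can translate to different dollar savings depending on where the tokens were cut.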