Blog
AI engineering, product updates, best practices, and deep dives — straight from the team building Ofinis.
How we built real-time web grounding into every model call
A deep dive into the retrieval pipeline that fetches, ranks, and summarises live web results before generation — and why it changes everything for accuracy.
Migrating from OpenAI to Ofinis in under 10 minutes
Our API is schema-compatible with OpenAI Chat Completions. Here's a step-by-step walkthrough with before/after code.
Introducing ofinis-2-reason: 128K context, deep chain-of-thought
Our most capable model yet — built for entire codebases, regulatory documents, and multi-step analytical tasks.
Semantic memory vs. RAG: when to use each
Ofinis offers both persistent semantic memory and context injection. We break down the tradeoffs and when each pattern wins.
Streaming SSE responses at scale: lessons from production
Building robust Server-Sent Events delivery across thousands of concurrent connections — what broke and how we fixed it.
Building a document Q&A bot with the Ofinis Files API
Upload PDFs, ask questions, and get cited answers — all with fewer than 50 lines of Python.
More articles coming soon. Subscribe to the newsletter to get notified.