Meet Ofinis
Models
Four models tuned for different latency, context, and reasoning tradeoffs. All support real-time web search. Pick the right one for your use case.
ofinis-1-fast (Free)
The fastest model in the Ofinis family. Optimised for minimal latency on straightforward queries, classification, short summaries, and high-frequency completions.
Context
2 000 tokens
Latency
~0.4 s (P50)
Throughput
~120 tok/s
Strengths
- Ultra-low latency
- High throughput
- Cost-efficient at scale
- Q&A and classification
Best for
- Live chat autocomplete
- Content moderation
- Real-time classification
- Simple chatbots
ofinis-1 (Starter+)
Balanced performance with full web search grounding. Great for research assistants, summarisation, multi-turn conversations, and content generation tasks that benefit from current information.
Context
8 000 tokens
Latency
~1.2 s (P50)
Throughput
~80 tok/s
Strengths
- Web-grounded answers
- 8 K context for documents
- Strong general reasoning
- Cited responses
Best for
- Research assistants
- News summarisation
- Customer support bots
- Email drafting
ofinis-2 (Pro+)
High-capability reasoning model with a 32 K context window. Excels at code generation, complex analysis, technical writing, and tasks requiring multi-step logical inference.
Context
32 000 tokens
Latency
~2.5 s (P50)
Throughput
~50 tok/s
Strengths
- 32 K context for whole codebases
- Complex multi-step reasoning
- Code generation & review
- Web-grounded + document RAG
Best for
- Code assistants
- Legal & financial analysis
- Long-form content
- Technical documentation
ofinis-2-reason (Enterprise)
Deep multi-step reasoning model with a 128 K context window. Designed for the most demanding tasks: entire codebases, regulatory documents, academic research, and chain-of-thought analysis that requires sustained coherence across vast input.
Context
128 000 tokens
Latency
~6 s (P50)
Throughput
~25 tok/s
Strengths
- 128 K context for entire documents
- Deep chain-of-thought reasoning
- Enterprise accuracy benchmarks
- Custom fine-tuning available
Best for
- Due diligence automation
- Codebase-wide refactors
- Scientific literature review
- Regulatory compliance analysis
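To compare the four models on a concrete request, one rough approach is to add the P50 latency to the time needed to generate the output at the listed throughput. The sketch below does that with the figures quoted above. It assumes the P50 latency is the time to the first token and that throughput stays constant for the whole response, neither of which this page states. `ModelSpec` and `estimated_response_seconds` are illustrative names, not part of any Ofinis SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    context_tokens: int
    p50_latency_s: float      # assumed here to mean time to first token
    throughput_tok_s: float   # assumed constant over the whole response


# Figures copied from the model descriptions on this page.
MODELS = [
    ModelSpec("ofinis-1-fast", 2_000, 0.4, 120.0),
    ModelSpec("ofinis-1", 8_000, 1.2, 80.0),
    ModelSpec("ofinis-2", 32_000, 2.5, 50.0),
    ModelSpec("ofinis-2-reason", 128_000, 6.0, 25.0),
]


def estimated_response_seconds(spec: ModelSpec, output_tokens: int) -> float:
    """Rough estimate: latency plus generation time at the listed throughput."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return spec.p50_latency_s + output_tokens / spec.throughput_tok_s


if __name__ == "__main__":
    for spec in MODELS:
        seconds = estimated_response_seconds(spec, 500)
        print(f"{spec.name}: ~{seconds:.1f} s for a 500-token reply")
```

Under these assumptions, a 500-token reply takes roughly 4.6 s on ofinis-1-fast and about 26 s on ofinis-2-reason, which is why the smaller models suit high-frequency, short-output workloads.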
Side-by-side comparison
| Feature | ofinis-1-fast | ofinis-1 | ofinis-2 | ofinis-2-reason |
|---|---|---|---|---|
| Context window | 2 K | 8 K | 32 K | 128 K |
| Latency (P50) | ~0.4 s | ~1.2 s | ~2.5 s | ~6 s |
| Web search | Yes | Yes | Yes | Yes |
| Streaming | ||||
| File / doc RAG | ||||
| Semantic memory | ||||
| Priority queue | ||||
| Min plan | Free | Starter | Pro | Enterprise |
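One way to read the table is to pick the smallest model whose context window fits the request, since smaller models are faster and available on lower plans. The sketch below illustrates that rule with the context sizes and minimum plans from the table. It assumes the context window has to hold both the prompt and the generated output, which this page does not confirm, and `smallest_fitting_model` is a hypothetical helper rather than an Ofinis API.

```python
from typing import Optional, Tuple

# (model name, context window in tokens, minimum plan), from the comparison table,
# ordered from smallest to largest context window.
MODELS = [
    ("ofinis-1-fast", 2_000, "Free"),
    ("ofinis-1", 8_000, "Starter"),
    ("ofinis-2", 32_000, "Pro"),
    ("ofinis-2-reason", 128_000, "Enterprise"),
]


def smallest_fitting_model(
    prompt_tokens: int, max_output_tokens: int = 0
) -> Optional[Tuple[str, str]]:
    """Return (model, minimum plan) for the smallest window that fits, or None."""
    # Assumption: prompt and output share the same context window.
    needed = prompt_tokens + max_output_tokens
    for name, context_tokens, plan in MODELS:
        if needed <= context_tokens:
            return name, plan
    return None  # the request is larger than every listed context window


if __name__ == "__main__":
    print(smallest_fitting_model(1_500, 300))
    print(smallest_fitting_model(20_000, 2_000))
    print(smallest_fitting_model(200_000))
```

Context size is only one axis. A task that fits in 2 K tokens but needs multi-step reasoning may still be better served by ofinis-2 or ofinis-2-reason.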
Try any model for free
50 free API calls on signup. No credit card required.