Meet Ofinis

Models

Four models tuned for different latency, context, and reasoning tradeoffs. All support real-time web search. Pick the right one for your use case.

ofinis-1-fast (Free)

The fastest model in the Ofinis family. Optimised for minimal latency on straightforward queries, classification, short summaries, and high-frequency completions.

Context

2 000 tokens

Latency

~0.4 s (P50)

Throughput

~120 tok/s

Strengths

· Ultra-low latency
· High throughput
· Cost-efficient at scale
· Q&A and classification

Best for

· Live chat autocomplete
· Content moderation
· Real-time classification
· Simple chatbots

ofinis-1 (Starter+)

Balanced performance with full web search grounding. Great for research assistants, summarisation, multi-turn conversations, and content generation tasks that benefit from current information.

Context

8 000 tokens

Latency

~1.2 s (P50)

Throughput

~80 tok/s

Strengths

· Web-grounded answers
· 8 K context for documents
· Strong general reasoning
· Cited responses

Best for

· Research assistants
· News summarisation
· Customer support bots
· Email drafting

Most capable

ofinis-2 (Pro+)

High-capability reasoning model with a 32 K context window. Excels at code generation, complex analysis, technical writing, and tasks requiring multi-step logical inference.

Context

32 000 tokens

Latency

~2.5 s (P50)

Throughput

~50 tok/s

Strengths

· 32 K context for whole codebases
· Complex multi-step reasoning
· Code generation & review
· Web-grounded + document RAG

Best for

· Code assistants
· Legal & financial analysis
· Long-form content
· Technical documentation

ofinis-2-reason (Enterprise)

Deep multi-step reasoning model with a 128 K context window. Designed for the most demanding tasks — entire codebases, regulatory documents, academic research, and chain-of-thought analysis that requires sustained coherence across vast input.

Context

128 000 tokens

Latency

~6 s (P50)

Throughput

~25 tok/s

Strengths

· 128 K context for entire documents
· Deep chain-of-thought reasoning
· Enterprise accuracy benchmarks
· Custom fine-tuning available

Best for

· Due diligence automation
· Codebase-wide refactors
· Scientific literature review
· Regulatory compliance analysis
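
The latency and throughput figures on the four model cards can be combined into a rough estimate of end-to-end response time. The Python sketch below assumes that the P50 latency is the time until the first token arrives and that tokens then stream at the quoted throughput. Neither assumption is stated on this page. `MODELS` and `estimate_seconds` are illustrative names, not part of any Ofinis SDK.

```python
# Figures copied from the model cards: (P50 latency in seconds, throughput in tok/s).
MODELS = {
    "ofinis-1-fast": (0.4, 120),
    "ofinis-1": (1.2, 80),
    "ofinis-2": (2.5, 50),
    "ofinis-2-reason": (6.0, 25),
}


def estimate_seconds(model: str, output_tokens: int) -> float:
    """Rough response time under the assumed model: latency + tokens / throughput."""
    latency_s, tokens_per_s = MODELS[model]
    return latency_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    for name in MODELS:
        print(f"{name}: ~{estimate_seconds(name, 500):.1f} s for 500 output tokens")
```

Under these assumptions, a 500-token answer takes about 12.5 s on ofinis-2 and about 26 s on ofinis-2-reason, so the slower models cost more in generation time than the latency figures alone suggest.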

Side-by-side comparison

Feature	ofinis-1-fast	ofinis-1	ofinis-2	ofinis-2-reason
Context window	2 K	8 K	32 K	128 K
Latency (P50)	~0.4 s	~1.2 s	~2.5 s	~6 s
Web search
Streaming
File / doc RAG
Semantic memory
Priority queue
Min plan	Free	Starter	Pro	Enterprise
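
One way to read the comparison table is as a selection rule: take the fastest model whose context window fits the request and whose P50 latency is within budget. The Python sketch below encodes that rule. It assumes the context window must hold the prompt and the reply together, which this page does not specify, and it ignores plan availability. `CATALOG` and `pick_model` are hypothetical names used only for illustration.

```python
from typing import Optional

# (model, context window in tokens, P50 latency in seconds), fastest first,
# copied from the comparison table.
CATALOG = [
    ("ofinis-1-fast", 2_000, 0.4),
    ("ofinis-1", 8_000, 1.2),
    ("ofinis-2", 32_000, 2.5),
    ("ofinis-2-reason", 128_000, 6.0),
]


def pick_model(prompt_tokens: int, reply_tokens: int, max_latency_s: float) -> Optional[str]:
    """Return the fastest model that fits prompt + reply within the latency budget."""
    needed = prompt_tokens + reply_tokens  # assumption: the window covers both
    for name, context_tokens, latency_s in CATALOG:
        if needed <= context_tokens and latency_s <= max_latency_s:
            return name
    return None  # no model satisfies both constraints


if __name__ == "__main__":
    print(pick_model(1_500, 300, 3.0))     # ofinis-1-fast
    print(pick_model(20_000, 1_000, 3.0))  # ofinis-2
    print(pick_model(60_000, 1_000, 3.0))  # None: only the 128 K model fits, at ~6 s
```

Returning `None` rather than falling back to a larger model keeps the latency budget a hard constraint; a caller that prefers an answer over a deadline could relax `max_latency_s` and retry.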

Try any model for free

50 free API calls on signup. No credit card required.
