Ofinis

Meet Ofinis

Models

Four models tuned for different latency, context, and reasoning tradeoffs. All support real-time web search. Pick the right one for your use case.

ofinis-1-fast (Free)

The fastest model in the Ofinis family. Optimised for minimal latency on straightforward queries, classification, short summaries, and high-frequency completions.

Context

2 000 tokens

Latency

~0.4 s (P50)

Throughput

~120 tok/s

Strengths

  • Ultra-low latency
  • High throughput
  • Cost-efficient at scale
  • Q&A and classification

Best for

  • Live chat autocomplete
  • Content moderation
  • Real-time classification
  • Simple chatbots

ofinis-1 (Starter+)

Balanced performance with full web search grounding. Great for research assistants, summarisation, multi-turn conversations, and content generation tasks that benefit from current information.

Context

8 000 tokens

Latency

~1.2 s (P50)

Throughput

~80 tok/s

Strengths

  • Web-grounded answers
  • 8 K context for documents
  • Strong general reasoning
  • Cited responses

Best for

  • Research assistants
  • News summarisation
  • Customer support bots
  • Email drafting

ofinis-2 (Pro+, most capable)

High-capability reasoning model with a 32 K context window. Excels at code generation, complex analysis, technical writing, and tasks requiring multi-step logical inference.

Context

32 000 tokens

Latency

~2.5 s (P50)

Throughput

~50 tok/s

Strengths

  • 32 K context for whole codebases
  • Complex multi-step reasoning
  • Code generation & review
  • Web-grounded + document RAG

Best for

  • Code assistants
  • Legal & financial analysis
  • Long-form content
  • Technical documentation

ofinis-2-reason (Enterprise)

Deep multi-step reasoning model with a 128 K context window. Designed for the most demanding tasks: entire codebases, regulatory documents, academic research, and chain-of-thought analysis that must stay coherent across very large inputs.

Context

128 000 tokens

Latency

~6 s (P50)

Throughput

~25 tok/s

Strengths

  • 128 K context for entire documents
  • Deep chain-of-thought reasoning
  • Enterprise accuracy benchmarks
  • Custom fine-tuning available

Best for

  • Due diligence automation
  • Codebase-wide refactors
  • Scientific literature review
  • Regulatory compliance analysis
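
The latency and throughput figures on the model cards can be combined into a rough end-to-end estimate. The sketch below is illustrative only: it assumes the P50 latency is the wait before the first token and that the quoted throughput applies to the output tokens that follow, neither of which the cards state. The names MODEL_SPEED and estimate_seconds are hypothetical and are not part of any Ofinis SDK.

```python
# Figures copied from the model cards:
# model name -> (P50 latency in seconds, throughput in tokens per second).
MODEL_SPEED = {
    "ofinis-1-fast": (0.4, 120),
    "ofinis-1": (1.2, 80),
    "ofinis-2": (2.5, 50),
    "ofinis-2-reason": (6.0, 25),
}


def estimate_seconds(model: str, output_tokens: int) -> float:
    """Rough response time: P50 latency plus time to generate the output.

    Assumption: latency is time to first token, and throughput is the
    steady generation rate after that. Real timings will vary.
    """
    latency_s, tokens_per_s = MODEL_SPEED[model]
    return latency_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    # Compare all four models on the same hypothetical 600-token reply.
    for name in MODEL_SPEED:
        print(f"{name}: ~{estimate_seconds(name, 600):.1f} s")
```

Under these assumptions a 600-token reply takes roughly 5.4 s on ofinis-1-fast and about 30 s on ofinis-2-reason, so throughput matters more than first-token latency once replies get long.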

Side-by-side comparison

Feature          ofinis-1-fast  ofinis-1  ofinis-2  ofinis-2-reason
Context window   2 K            8 K       32 K      128 K
Latency (P50)    ~0.4 s         ~1.2 s    ~2.5 s    ~6 s
Web search
Streaming
File / doc RAG
Semantic memory
Priority queue
Min plan         Free           Starter   Pro       Enterprise
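
One way to act on the comparison is to pick the smallest model whose context window fits the request and whose minimum plan you already have. The sketch below is a hedged illustration, not an official selection rule. It assumes that the context window must hold prompt and output tokens together, and that plans are ordered Free, Starter, Pro, Enterprise with higher plans including lower-tier models. The names MODELS, PLAN_ORDER and pick_model are hypothetical.

```python
from typing import Optional

# (model name, context window in tokens, minimum plan), from the comparison table.
MODELS = [
    ("ofinis-1-fast", 2_000, "Free"),
    ("ofinis-1", 8_000, "Starter"),
    ("ofinis-2", 32_000, "Pro"),
    ("ofinis-2-reason", 128_000, "Enterprise"),
]

# Assumption: plans are ordered, and a higher plan unlocks lower-plan models.
PLAN_ORDER = ["Free", "Starter", "Pro", "Enterprise"]


def pick_model(prompt_tokens: int, output_tokens: int, plan: str) -> Optional[str]:
    """Return the smallest model that fits the request, or None if none does.

    Assumption: prompt and output tokens share one context window.
    """
    needed = prompt_tokens + output_tokens
    plan_rank = PLAN_ORDER.index(plan)
    # MODELS is sorted from smallest to largest, so the first match is the smallest fit.
    for name, context, min_plan in MODELS:
        if context >= needed and PLAN_ORDER.index(min_plan) <= plan_rank:
            return name
    return None


if __name__ == "__main__":
    print(pick_model(1_500, 300, "Free"))       # fits the 2 K window
    print(pick_model(5_000, 500, "Pro"))        # needs the 8 K window
    print(pick_model(20_000, 1_000, "Starter")) # needs 32 K, which requires Pro
```

Choosing the smallest model that fits keeps latency low, since the larger models trade speed for context and reasoning depth.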

Try any model for free

50 free API calls on signup. No credit card required.