ModelRegistry

Choose an open-source model and deploy it in seconds.

Models: 42

Mistral-7B

Fast, efficient general-purpose language model for chat, summarization, and reasoning.

Llama-3.1-8B

Strong open-source model with excellent instruction following and reasoning quality.

Qwen 2.5-7B

Multilingual language model optimized for general reasoning and conversational tasks.

Stable Diffusion v1.5

Widely used text-to-image model for fast and flexible image generation.

Whisper-Medium

Reliable speech recognition model for transcription and audio analysis.

DeepSeek Coder - 33B

Specialized model for code generation, refactoring, and programming assistance.

Llama-3.3-70B

Large-scale model designed for complex reasoning and production-grade workloads.

YOLOv9

Real-time object detection model for images and video streams.

Whisper Large

High-accuracy transcription model suitable for production use cases.

Whisper Large v3

State-of-the-art multilingual speech-to-text model with improved accuracy, robustness, and performance for production transcription.

Qwen-3 32B

Large multilingual model built for strong reasoning and generation tasks.

GPT-OSS-20B

Open-source GPT-style model supporting both natural language and coding tasks.

GPT-OSS-120B

Premium open-source GPT-style large language model with top-tier reasoning, coding, and natural language capabilities for high-performance AI applications.

DeepSeek R1 70B

Flagship reasoning model offering superior accuracy and deep problem-solving capabilities for demanding AI workloads.

DeepSeek R1 8B

Efficient reasoning-focused language model optimized for fast, cost-effective problem solving and structured AI tasks.

Kimi-K2-Thinking

Premium reasoning-optimized language model designed for deep thinking, complex problem solving, and advanced multi-step AI tasks.

BGE-Large

High-quality embedding model optimized for semantic search and RAG pipelines.

E5-Large

Embedding model optimized for similarity search and information retrieval.

Llama-3.2-3B

Lightweight model optimized for low-latency and cost-efficient workloads.

YOLOv11

Latest YOLO model offering improved object detection accuracy and performance.

Flux.1 Schnell

High-speed diffusion model optimized for rapid image generation.

Oxlo Image Pro

Premium image generation model delivering exceptional visual quality, precise prompt adherence, and reliable production-grade performance.

Kokoro-82M

Lightweight speech synthesis model for generating natural-sounding audio.

DeepSeek V3.2

Powerful general-purpose language model delivering strong reasoning, coding, and natural language performance.

Llama 4 Maverick 17B

Efficient large language model designed for strong reasoning, conversational AI, and scalable production deployments.

DeepSeek V3 0324

Advanced language model optimized for complex reasoning, structured responses, and high-performance AI assistants.

Kimi K2.5

High-capacity reasoning model built for long-context understanding, research workflows, and multi-step problem solving.

DeepSeek R1 0528

Reasoning-focused language model specialized for analytical tasks, coding workflows, and complex multi-step reasoning.

Ministral 3 14B Instruct

Efficient instruction-tuned language model designed for conversational AI, structured responses, and scalable production applications.

Qwen-3 Coder 30B

Code-focused language model for software development and technical reasoning.

Gemma-3-27B

Large language model focused on high-quality text generation and reasoning.

Gemma-3-4B

Compact model designed for efficient inference with solid generation quality.

Minimax M2.5

Mixture-of-Experts model optimized for coding, agentic tool use, complex workflows, and office productivity tasks.

GLM 5

744B-parameter MoE model built for complex systems engineering, long-horizon agentic tasks, and advanced reasoning.

Kimi K2.6

Latest high-capacity reasoning model built for long-context understanding, research workflows, and complex problem solving.

DeepSeek V4 Flash

Efficient MoE model with a 1M-token context window and near state-of-the-art open-source reasoning performance.

Stable Diffusion 3.5 Large

High-resolution image generation model focused on output realism and detail.

Oxlo Image Ultra

Flagship image generation model optimized for ultra-realistic visuals and exceptional photorealism.

Oxlo Coder Fast

High-speed code generation model designed for rapid completions, responsive coding workflows, and efficient developer productivity.

Falcon 7B

Efficient language model optimized for fast, reliable text generation and general-purpose AI tasks.

Qwen 2.5 Coder 7B

Specialized code generation model designed for accurate coding assistance, debugging, and developer-focused workflows.

Falcon 11B

Advanced language model delivering enhanced reasoning, stronger text generation, and improved performance across diverse AI tasks.
