Your compute sanctuary

Rest easy knowing you can get the compute you need for any AI workload.

Customers and partners

Trusted by AI-native startups, research labs, public companies, and academic institutions.

Developers of EVO, a genomic foundation model

“Mithril’s omnicloud platform has accelerated science at Arc. Our machine learning work brings demanding infrastructure performance needs, and Mithril delivers. With Mithril, we can guarantee that our researchers have exactly the compute they need, when they need it, without procurement friction.”

Patrick Hsu

Co-founder and CEO, Arc Institute

Omnicloud AI compute

Mithril aggregates and orchestrates multi-cloud GPUs, CPUs, and storage so you don't have to. Access the infrastructure you need through a single platform with transparent pricing.

1. Burst the compute you need

Prevent overprovisioning by spinning up GPUs via extendable short-term reservations or spot bids.

2. Pay the market rate

Get the best compute prices without talking to 10 sales teams. Mithril sets compute prices across all capacity algorithmically, based on supply and demand; a sketch of querying these prices follows this list.

3. Run batch workloads

Use the batch SDK and batch inference API to run workloads asynchronously on spot capacity at the best price.
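To make the market-pricing step concrete, here is a minimal sketch of checking live prices before bursting. The host, route, query parameter, and response fields are illustrative assumptions, not Mithril's published API.

```python
# Hypothetical sketch: read live market prices before placing a bid.
# The endpoint, parameters, and response fields are assumptions for
# illustration only, not Mithril's documented API.
import os
import requests

resp = requests.get(
    "https://api.mithril.example/v1/prices",  # placeholder host and route
    params={"instance_type": "h100.8x"},      # placeholder instance name
    headers={"Authorization": f"Bearer {os.environ['MITHRIL_API_KEY']}"},
    timeout=10,
)
resp.raise_for_status()

# Prices are set algorithmically from supply and demand, so they move;
# poll before deciding between a reservation and a spot bid.
for quote in resp.json()["prices"]:
    print(quote["region"], quote["price_per_gpu_hour"])
```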


[Diagram: GPU capacity aggregated across multiple clouds and datacenters]

Compute instances

Compute the way you need it

Spin up GPUs through a simple, transparent platform

Training

Fine-tuning

Inference

Batch jobs

Spin up GPUs for hours, days, or months

Reserve what you need in minutes

Instantly access GPU VMs with extendable short-term reservations (see the sketch after this list).

Don't overpay

See pricing for all compute in real-time. Get the best prices without negotiating.

Scale to thousands of GPUs

Optimize your economics for large-scale clusters without shopping around every cloud.

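As a rough illustration of the reservation flow, the sketch below requests a short-term, extendable reservation. Every endpoint and payload field name here is an assumption, not taken from Mithril's documentation.

```python
# Hypothetical sketch: reserve GPU VMs for a short, extendable window.
# Endpoint and payload field names are illustrative assumptions.
import os
import requests

reservation = requests.post(
    "https://api.mithril.example/v1/reservations",  # placeholder route
    json={
        "instance_type": "h100.8x",  # placeholder instance name
        "quantity": 4,               # number of VMs
        "duration_hours": 72,        # short-term window, extendable later
    },
    headers={"Authorization": f"Bearer {os.environ['MITHRIL_API_KEY']}"},
    timeout=10,
).json()

print(reservation["id"], reservation["status"])
```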

Burst spot instances for async workloads

Burst when the price is right

Limit spend by creating spot bids that provision GPUs when they're available at your desired price (see the sketch after this list).

Run batch jobs for less

Queue up async jobs to run on spot compute programmatically.

Pause and resume

Pause your bids without losing your environment whenever you don't have workloads to run.

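Here is a sketch of the spot-bid lifecycle under the same assumed API: create a bid with a limit price, pause it between workloads, and resume it later. All routes and field names are hypothetical.

```python
# Hypothetical sketch: a spot bid with a limit price, paused when idle.
# Routes and field names are assumptions for illustration only.
import os
import requests

BASE = "https://api.mithril.example/v1"  # placeholder host
HEADERS = {"Authorization": f"Bearer {os.environ['MITHRIL_API_KEY']}"}

# Provision GPUs only when capacity clears at or below the limit price.
bid = requests.post(
    f"{BASE}/spot-bids",
    json={
        "instance_type": "a100.8x",        # placeholder instance name
        "quantity": 2,
        "limit_price_per_gpu_hour": 1.50,  # your desired price
    },
    headers=HEADERS,
    timeout=10,
).json()

# Pause between workloads; the environment is retained.
requests.post(f"{BASE}/spot-bids/{bid['id']}/pause", headers=HEADERS, timeout=10)

# Resume when the next job is ready.
requests.post(f"{BASE}/spot-bids/{bid['id']}/resume", headers=HEADERS, timeout=10)
```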

Purpose-built infrastructure for machine learning workloads

NVIDIA GPUs with InfiniBand

Run distributed workloads on NVIDIA A100s, H100s, and H200s with pre-configured interconnect.

High-performance storage

Maximize workload performance with co-located WEKA block and fileshare storage.

API and batch job SDK

Provision compute via API and launch ML batch jobs on spot instances with the Flow SDK (sketched after this list).

Streamlined access and security controls

Project-level permissions, global SSH keys for admin monitoring, and SSO for secure access—all backed by SOC 2 compliant infrastructure.
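For a feel of the SDK path, the sketch below submits a batch job to spot capacity. The import path, client class, and method names are assumptions rather than the Flow SDK's actual surface; consult the SDK documentation for the real interface.

```python
# Hypothetical sketch: launch a batch job on spot instances.
# The module, class, and method names below are assumptions,
# not the Flow SDK's actual surface.
from flow import FlowClient  # assumed import path and class name

client = FlowClient()  # assumes credentials are read from the environment

job = client.batch.submit(          # assumed method name
    name="finetune-llama-8b",
    command="python train.py --epochs 3",
    instance_type="h100.8x",        # placeholder instance name
    limit_price_per_gpu_hour=2.00,  # runs only at or below this price
)
print(job.id, job.status)
```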

Batch inference

Trillions of tokens, effortlessly

Asynchronously process millions of inference requests with a simple API

Content generation

Data classification

Embeddings

Prompt testing

Inference, orchestrated for efficiency

More tokens for less

Get the best economics for processing massive multimodal datasets.

Start in minutes

Use the OpenAI SDK to easily transfer existing workloads or create new batch jobs (see the sketch after this list).

Any use case

Experiment with text models in minutes. Deploy any open or custom model you need.

Predictable delivery

Simple, token-based pricing for 24-hour job completion. If your job takes longer, it's free.
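Since jobs are created with the OpenAI SDK, a minimal submission might look like the sketch below. The base URL and credential variable are placeholders; the file-upload and batch-creation calls follow the standard OpenAI Batch API shape.

```python
# Sketch: submit a batch inference job with the OpenAI SDK pointed at a
# placeholder endpoint. The base URL and credential are assumptions; the
# files/batches calls follow the standard OpenAI Batch API.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.mithril.example/v1",  # placeholder endpoint
    api_key=os.environ["MITHRIL_API_KEY"],      # placeholder credential
)

# requests.jsonl holds one chat-completion request per line.
batch_file = client.files.create(
    file=open("requests.jsonl", "rb"),
    purpose="batch",
)

batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",  # matches the 24-hour completion pricing
)
print(batch.id, batch.status)
```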

Models

Text: gpt-oss-120b (MoE · 117B parameters)

Text: gpt-oss-20b (MoE · 21B parameters)

Text: Llama 4 Maverick 17Bx128E Instruct FP8 (MoE · 400B parameters)

Text: Llama 4 Scout 17Bx16E Instruct (MoE · 109B parameters)

Text: Llama 3.1 8B Instruct (Dense model · 8B parameters)

Text: Qwen 2.5 72B Instruct (Dense model · 72B parameters)

Custom: [Your model] (Hosted open and custom models)