MiniMax-M3

MiniMax M3 is a frontier Mixture-of-Experts model featuring a 1M token context window and native multimodal support. Built for high-fidelity reasoning, MiniMax M3 excels in coding, bilingual tasks, and long-document analysis.

MiniMax M3 API

MiniMax M3 is an open-weight coding and agent model with a 1M-token context. Call it through the GPTProto API at $0.48 per 1M input tokens and $0.96 per 1M output tokens, with one key across 200+ models and no regional sign-up.

MSA Sparse Attention

MiniMax M3 runs on MiniMax Sparse Attention (MSA), cutting per-token compute at 1M context to roughly 1/20 of the M2 generation — over 9x faster prefill and 15x faster decode.

Long-Horizon Agent Runs

MiniMax M3 is built for sustained agent and coding work — autonomous task decomposition, tool calls, and multi-step reasoning held in one 1M-token session, tuned on multi-turn developer workflows.

Coding & Agentic Performance

In MiniMax's own tests, M3 scores 59.0% on SWE-Bench Pro and 83.5 on BrowseComp — ahead of GPT-5.5 and Gemini 3.1 Pro on coding, and above Opus 4.7 on web browsing.

1M Token Long Context

MiniMax M3 handles up to 1,048,576 tokens with a 512K guaranteed minimum. MSA keeps retrieval coherent across the full window, so whole-repo and long-document runs fit in one prompt.

What Is MiniMax M3?

MiniMax M3 is an open-weight large language model from MiniMax (MiniMaxAI), released June 1, 2026. It targets long-horizon coding and agent workloads: autonomous task decomposition, tool use, and multi-step reasoning across a 1M-token context. Its defining change is MiniMax Sparse Attention (MSA), which selects the key–value blocks that matter instead of attending to every token — the reason a 1-million-token window is practical to run rather than just a spec-sheet number. On GPTProto you call the MiniMax M3 API through one account balance shared with 200+ other models, no separate MiniMax sign-up required.
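To make the block-selection idea concrete, here is a toy sketch of generic top-k block-sparse attention in plain Python. It is not MiniMax's MSA implementation, whose scoring and selection rules are not described on this page. The function name `block_sparse_attention`, the mean-key block score, and the `block_size` and `top_k` parameters are all assumptions chosen for illustration.

```python
import math


def block_sparse_attention(query, keys, values, block_size=2, top_k=1):
    """Toy single-query attention that only reads the top_k key/value blocks.

    Hypothetical illustration of block selection, not MiniMax's MSA.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # 1. Split the key/value sequence into contiguous blocks of token indices.
    starts = range(0, len(keys), block_size)
    blocks = [(s, min(s + block_size, len(keys))) for s in starts]

    # 2. Score each block cheaply: the query dotted with the block's mean key.
    def block_score(span):
        lo, hi = span
        mean_key = [
            sum(k[d] for k in keys[lo:hi]) / (hi - lo) for d in range(len(query))
        ]
        return dot(query, mean_key)

    selected = sorted(blocks, key=block_score, reverse=True)[:top_k]

    # 3. Run ordinary softmax attention over tokens in the selected blocks only.
    token_ids = sorted(i for lo, hi in selected for i in range(lo, hi))
    scale = math.sqrt(len(query))
    logits = [dot(query, keys[i]) / scale for i in token_ids]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    output = [
        sum(w * values[i][d] for w, i in zip(weights, token_ids)) / total
        for d in range(len(values[0]))
    ]
    return output, token_ids
```

With `top_k` equal to the number of blocks, this reduces to ordinary full attention over every token; the compute saving comes from choosing a `top_k` that covers only a small part of a long sequence.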

Spec table:

Field	MiniMax M3
Developer	MiniMax (MiniMaxAI), Shanghai
Released	June 1, 2026
Type	Open-weight LLM
Architecture	Mixture-of-Experts · 229.9B total / 9.8B active · 256 experts
Attention	MiniMax Sparse Attention (MSA)
Context window	1,048,576 tokens (512K guaranteed minimum)
Max output	up to ~512K tokens
Input modality	text (on this page) · image / file via the image-to-text subpage
Output modality	text
Thinking mode	toggleable per request
Tool use / function calling	yes
Endpoint	`https://gptproto.com/v1/chat/completions` (OpenAI-compatible)
GPTProto price	$0.48 / 1M input · $0.96 / 1M output
GPTProto model string	`MiniMax-M3`
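
As a quick check on what the listed rates mean per request, the sketch below converts token counts into dollars using the GPTProto prices from the table. The helper name `estimate_cost_usd` is illustrative and not part of any GPTProto SDK, and the calculation assumes billing is strictly proportional to token counts, which this page does not spell out.

```python
from decimal import Decimal

# GPTProto list prices for MiniMax-M3 in USD per 1M tokens (from the spec table).
INPUT_USD_PER_M = Decimal("0.48")
OUTPUT_USD_PER_M = Decimal("0.96")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_M / TOKENS_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_M / TOKENS_PER_M
    return input_cost + output_cost
```

For example, `estimate_cost_usd(1_048_576, 4_000)` prices a prompt that fills the whole context window plus a 4,000-token reply at roughly $0.51, almost all of it input.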

MiniMax M3 vs MiniMax M2.5

Both models run on GPTProto under the same key and balance. M2.5 is the earlier, full-attention text model; M3 moves to sparse attention (MSA) and a practical 1M-token window, and adds image input through its image-to-text subpage.

Field	MiniMax M3	MiniMax M2.5
Attention	MSA (sparse)	Full attention
Input (this page)	text	text
Image input	via image-to-text subpage	—
Context window	1,048,576 tokens	204,800 tokens
GPTProto price (in / out per 1M)	$0.48 / $0.96	$0.24 / $0.96
Best for	Long-horizon coding & agent runs, 1M context	Lower-cost text reasoning at shorter context
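
Because both models share a key, the choice between them can be made per request. The sketch below picks the cheaper model whose context window fits the job, using the figures from the table. The M2.5 model string `MiniMax-M2.5` is an assumption (this page only confirms `MiniMax-M3`), `pick_model` is a hypothetical helper, and the code assumes that the prompt plus the requested output must fit inside the context window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    context_tokens: int
    input_usd_per_m: float
    output_usd_per_m: float


# Figures from the comparison table. "MiniMax-M2.5" is an assumed model string.
M3 = ModelSpec("MiniMax-M3", 1_048_576, 0.48, 0.96)
M2_5 = ModelSpec("MiniMax-M2.5", 204_800, 0.24, 0.96)


def pick_model(prompt_tokens: int, max_output_tokens: int) -> ModelSpec:
    """Return the cheapest model whose window fits prompt + output."""
    needed = prompt_tokens + max_output_tokens
    candidates = [m for m in (M2_5, M3) if needed <= m.context_tokens]
    if not candidates:
        raise ValueError(f"{needed} tokens exceeds every context window")

    def cost(m: ModelSpec) -> float:
        return (
            prompt_tokens * m.input_usd_per_m
            + max_output_tokens * m.output_usd_per_m
        ) / 1_000_000

    return min(candidates, key=cost)
```

A 50,000-token prompt lands on M2.5 because its input rate is half of M3's, while anything past 204,800 tokens can only go to M3.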

Switching from the official MiniMax API

If you already call MiniMax directly, moving to GPTProto is a drop-in change: point your client at the GPTProto endpoint, pass your GPTProto key, and set the model to MiniMax-M3. The request and response shape follow the OpenAI chat format, so existing code paths stay the same. You keep one balance across 200+ models, skip a separate MiniMax platform sign-up, and avoid the regional payment friction Western developers hit on the official Shanghai platform.

One migration gotcha: GPTProto expects the API key directly in the Authorization header — no Bearer prefix. If your OpenAI SDK auto-adds Bearer, set the header manually.

bash

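# Export your key first: export GPTPROTO_API_KEY="your-key"
# The key goes into the Authorization header as-is, with no Bearer prefix.
curl --location 'https://gptproto.com/v1/chat/completions' \
  --header "Authorization: $GPTPROTO_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "MiniMax-M3",
    "messages": [{ "role": "user", "content": "Who are you?" }],
    "stream": false
  }'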
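
The same request in Python, as a minimal sketch that uses only the standard library so that nothing adds a `Bearer` prefix automatically. The environment variable name `GPTPROTO_API_KEY` and the helper names `build_request` and `ask` are assumptions, and the response parsing assumes the OpenAI chat-completions shape that this page says the endpoint follows.

```python
import json
import os
import urllib.request

ENDPOINT = "https://gptproto.com/v1/chat/completions"


def build_request(api_key: str, prompt: str, model: str = "MiniMax-M3") -> urllib.request.Request:
    """Build the POST request with the raw key in Authorization (no Bearer prefix)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_key,  # raw key, deliberately without "Bearer "
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the prompt and return the first choice's text (OpenAI-style response assumed)."""
    api_key = os.environ["GPTPROTO_API_KEY"]  # assumed variable name
    with urllib.request.urlopen(build_request(api_key, prompt), timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the key exported, `ask("Who are you?")` sends the same body as the curl command. Keeping `build_request` separate from the network call makes the header easy to inspect in a unit test before anything is sent.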

Is MiniMax M3 open source?

Yes. MiniMax released M3 as an open-weight model, with weights and a technical report published to Hugging Face and GitHub. On GPTProto you can call the hosted MiniMax M3 API without self-hosting — useful when you want the model's long-context and agent behaviour but not the GPU footprint of running 229.9B parameters yourself.

How to Get a MiniMax-M3 API Key

Getting a MiniMax-M3 API key takes four steps and a few minutes: create a free GPTProto account, add credits, generate your key, and make your first call. The key works across every model on the platform, and at $0.96 per 1M output tokens MiniMax-M3 costs less on output than MiniMax's $2.40 standard list rate. Full MiniMax-M3 documentation is in the docs.

Create an account

Create your free GPTProto account to begin. You can set up an organization for your team at any time.

Top up

Your balance can be used across all models on the platform, including MiniMax-M3, giving you the flexibility to experiment and scale as needed.

Generate your API key

In your dashboard, create an API key — you'll need it to authenticate when making requests to MiniMax-M3.

Make your first API call

Use your API key with the sample code above to send a request to MiniMax-M3 through GPTProto and inspect the response.

MiniMax M3 Frequently Asked Questions

What is the context limit of MiniMax M3?

MiniMax M3 supports up to 1,048,576 tokens, with a guaranteed minimum of 512K. Maximum output is around 512K tokens per request.

Does MiniMax M3 support image input?

The MiniMax M3 model is natively multimodal. On GPTProto, this text-to-text page covers text input and output; image and file input run through the dedicated MiniMax M3 image-to-text API, which uses the same key and balance.

Is MiniMax M3 cheaper than calling MiniMax directly?

On output, yes: GPTProto's $0.96 per 1M tokens is below MiniMax's $2.40 standard list rate. Input on GPTProto is $0.48 per 1M tokens. You also get one balance across 200+ models and no separate regional sign-up.
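
The output-side gap can be worked out directly. This small calculation uses only the two output rates quoted in this answer; MiniMax's direct input rate is not given on this page, so input is left out. The function names are illustrative.

```python
from fractions import Fraction

# Output rates in USD per 1M tokens, as quoted in this answer.
GPTPROTO_OUTPUT_USD_PER_M = Fraction("0.96")
MINIMAX_LIST_OUTPUT_USD_PER_M = Fraction("2.40")


def output_saving_fraction() -> Fraction:
    """Share of the list output price saved by paying the GPTProto rate."""
    return 1 - GPTPROTO_OUTPUT_USD_PER_M / MINIMAX_LIST_OUTPUT_USD_PER_M


def output_saving_usd(output_tokens: int) -> Fraction:
    """Dollar difference on output tokens only, for a given volume."""
    per_m_gap = MINIMAX_LIST_OUTPUT_USD_PER_M - GPTPROTO_OUTPUT_USD_PER_M
    return per_m_gap * output_tokens / 1_000_000
```

That is a 60% lower output rate, or $1.44 saved per 1M output tokens.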

Is my data used to train the MiniMax M3 model?

No. Data sent to MiniMax M3 through GPTProto is processed under a zero-retention policy, and your inputs and outputs are not used by MiniMax for model training or refinement. Sensitive material such as proprietary code or private financial data stays confidential while you use the model.

What is the pricing for MiniMax M3 tokens?

On GPTProto, MiniMax M3 is $0.48 per 1M input tokens and $0.96 per 1M output tokens. The output rate sits below MiniMax's $2.40 standard list price, and everything bills from one shared balance.

How do I migrate to MiniMax M3 from GPT-4o?

Point your client at https://gptproto.com/v1/chat/completions, pass your GPTProto key in the Authorization header (no Bearer prefix), and set "model": "MiniMax-M3". It follows the OpenAI chat format, so the rest of your code is unchanged. One balance covers GPT-class and MiniMax models, so you can A/B both without a second account.
