The global AI race just hit another gear. In a single week, China unleashed not one but two trillion-parameter AI models, shaking up the leaderboard and putting pressure on American labs to respond.
Alibaba’s Qwen-3 Max: A Trillion-Parameter Preview
The biggest headline comes from Alibaba’s Qwen team, which unveiled Qwen-3 Max Preview — a model weighing in at over 1 trillion parameters.
For context, many have speculated that OpenAI’s GPT-4o and its successors sit in a similar range, but most labs lately have leaned toward smaller, more efficient models. Qwen going bigger bucks that trend.
Benchmarks show why: on tests like SuperGPQA, LiveCodeBench v6, Arena-Hard v2, and LiveBench 2024, Qwen-3 Max outperformed rivals including Claude Opus 4, Kimi K2, and DeepSeek V3.1.
That’s no small feat — these are some of the toughest models to beat right now.
Availability and Pricing
Qwen-3 Max is already live:
- Available via Qwen Chat (Alibaba’s ChatGPT competitor)
- Accessible through Alibaba Cloud’s API
- Integrated into OpenRouter and AnyCoder (a Hugging Face coding tool), where it’s now the default model
But unlike some of Qwen’s earlier releases, this one isn’t open source. Access comes via Alibaba Cloud or its partners, with tiered pricing depending on context length:
- Up to 32k tokens: $0.86 per million input tokens, $3.44 per million output
- 32k–128k tokens: $1.43 input, $5.73 output
- 128k–252k tokens: $2.15 input, $8.60 output
Short prompts? Affordable. Heavy, high-context workloads? Pricey.
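To make the tiers concrete, here is a minimal cost-estimate sketch in Python using the rates listed above. It assumes (this is our reading, not confirmed by Alibaba's billing docs) that the tier is selected by the input length and that both input and output tokens are billed at that tier's rates:

```python
def qwen3_max_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate a Qwen-3 Max request cost in USD from the tiered rates above.

    Assumptions: the tier is chosen by input length alone, both sides are
    billed at that tier's rates, and "32k" etc. mean 32,000 tokens (the
    docs may use 1,024-based units).
    """
    # (input-token ceiling, $ per 1M input tokens, $ per 1M output tokens)
    tiers = [
        (32_000, 0.86, 3.44),
        (128_000, 1.43, 5.73),
        (252_000, 2.15, 8.60),
    ]
    for ceiling, in_rate, out_rate in tiers:
        if input_tokens <= ceiling:
            return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate
    raise ValueError("input exceeds the 252k-token pricing tier")

# A 100k-token prompt with a 5k-token reply lands in the middle tier:
# 0.1 * $1.43 + 0.005 * $5.73 ≈ $0.17
```

Under these assumptions, a short 10k-token prompt costs about a penny, while maxed-out contexts run into dollars per request — which is the short-vs-long trade-off noted above.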
Context Window and Features
- Max context: 262,144 tokens
- Input: up to 258,048 tokens
- Output: up to 32,768 tokens (a trade-off between input and output length)
- Context caching: keeps long conversations alive without reprocessing
- Use cases: complex reasoning, coding, JSON/data handling, and creative work
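One plausible reading of these limits — input and output sharing the 262,144-token window, each with its own cap — can be sketched as a simple pre-flight check. The exact accounting is Alibaba's to define, so treat this as an assumption, not the documented behavior:

```python
MAX_CONTEXT = 262_144  # total window shared by input and output
MAX_INPUT = 258_048    # per-request input cap
MAX_OUTPUT = 32_768    # per-request output cap

def fits_window(input_tokens: int, max_new_tokens: int) -> bool:
    """Check a request against the limits above.

    Assumption: input + output must also fit in the shared window,
    which is why you cannot use both caps at once.
    """
    return (input_tokens <= MAX_INPUT
            and max_new_tokens <= MAX_OUTPUT
            and input_tokens + max_new_tokens <= MAX_CONTEXT)

# A maximal 258,048-token input leaves only 4,096 tokens of output budget.
```

This would explain the stated trade-off: pushing input to its 258,048-token cap leaves just 262,144 − 258,048 = 4,096 tokens for the reply.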
Early testers (including VentureBeat) report that it’s blazing fast — even quicker than ChatGPT in side-by-side trials — while avoiding common “big model” pitfalls like miscounting letters or botching arithmetic.
Moonshot AI: The Kimi Upgrade
While Qwen stole headlines, Moonshot AI, a Beijing startup valued at $3.3 billion, also made waves with an update to its Kimi series.
- The new release (internally dubbed Kimi K2-0905) doubles the context window from 128k to 256k tokens
- Focuses on improved coding skills and reduced hallucination
- Keeps the creative writing strengths that made the first Kimi popular
Moonshot’s first trillion-parameter model, Kimi K2, was open source and climbed the LM Arena leaderboard (tied for 8th overall, 4th in coding). The company remains committed to open-sourcing future models, unlike Alibaba’s more closed approach.
Founder Yang Zhilin has been outspoken:
- Believes millions of tokens are needed for AI to truly solve hard problems
- Argues that scaling laws are alive and well, with efficiency gains driving faster progress than ever
- Revealed that K2 is already being used to train K3, their next-generation base model
What It Means for the AI Race
With Alibaba and Moonshot both flexing trillion-parameter models in the same week, it’s clear that China is serious about AI supremacy.
- Enterprises now have access to longer context windows and more powerful reasoning engines, but they’ll need to weigh costs and risks.
- Developers are already running into Qwen-3 Max inside tools like AnyCoder, often without realizing it.
- The open-source vs. closed-source divide between Qwen and Moonshot could shape the global AI ecosystem just as much as raw performance.
The bigger question: does this mark the start of China overtaking the US in AI?
For now, what’s certain is that the competition just got fiercer — and trillion-parameter models are no longer the exception, but the new benchmark.