Moonshot AI has launched Kimi K2.7 Code, a new open-source coding model that promises stronger software development performance while using significantly fewer reasoning tokens.
Quick Summary – TLDR:
- Kimi K2.7 Code is Moonshot AI’s latest coding-focused AI model, designed for long and complex software engineering tasks.
- The company claims the model delivers notable gains in coding and agentic benchmarks compared to K2.6.
- K2.7 Code reportedly reduces reasoning token usage by around 30%, helping lower costs and improve efficiency.
- The model is open source, supports a 256K context window, and is available through Kimi Code, Kimi API, and Hugging Face.
What Happened?
Moonshot AI announced Kimi K2.7 Code on June 12, 2026, introducing a new open-source AI model built specifically for coding and agent-based software development tasks. The release is available through Kimi Code, the Kimi API, and Hugging Face under a Modified MIT license.
The company says the model improves both coding ability and autonomous task execution while reducing unnecessary reasoning overhead, making it better suited for real-world software engineering workflows.
Three days later, on June 15, 2026, Kimi.ai (@Kimi_Moonshot) announced a faster variant on X:
🌘 Meet Kimi K2.7 Code HighSpeed!
A high-speed mode of our latest open-source multimodal coding model, Kimi K2.7 Code.
⚡️ Up to 6× faster: Around 180 tok/s on coding tasks with median-length inputs, and up to 260 tok/s on shorter-context tasks.
🔷 Rolling out to Kimi Code Beta…
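To put the quoted throughput figures in perspective, the short sketch below converts tokens per second into wall-clock generation time. The 9,000-token response size is a hypothetical example, not a figure from Moonshot AI, and the calculation ignores queueing and time to first token.

```python
# Throughput figures quoted in the announcement (tokens per second).
MEDIAN_INPUT_TOK_PER_S = 180.0
SHORT_CONTEXT_TOK_PER_S = 260.0


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Return the time needed to stream `output_tokens` at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    tokens = 9_000  # hypothetical response size, chosen for illustration
    for label, rate in [
        ("median-length inputs", MEDIAN_INPUT_TOK_PER_S),
        ("shorter-context tasks", SHORT_CONTEXT_TOK_PER_S),
    ]:
        seconds = generation_seconds(tokens, rate)
        print(f"{label}: {seconds:.1f} s for {tokens} tokens")
```

At the quoted 180 tok/s, a 9,000-token response would take about 50 seconds to generate, assuming the rate holds for the whole response.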
Built for Long Horizon Software Development
Modern software projects often require much more than generating a few lines of code. Developers frequently need AI systems that can work across multiple files, understand large codebases, perform debugging, and complete tasks over extended sessions.
According to Moonshot AI, Kimi K2.7 Code was optimized specifically for these long-horizon scenarios. The company says the model follows instructions more reliably in lengthy contexts and achieves higher end-to-end task completion rates than its predecessor, K2.6.
This focus on sustained task execution is also reflected in the model’s agentic benchmark performance, where it showed improvements across several internal evaluations designed to measure autonomous software engineering capabilities.
Benchmark Scores Show Noticeable Gains
Moonshot AI reported higher scores than K2.6 on every coding-focused benchmark it published.
Some of the reported gains include (K2.7 Code score first, K2.6 score second):
- Kimi Code Bench v2: 62.0 versus 50.9
- Program Bench: 53.6 versus 48.3
- MLS Bench Lite: 35.1 versus 26.7
The company also reported stronger results in agent-based evaluations, with the K2.7 Code score listed first and the K2.6 score second:
- Kimi Claw 24/7 Bench: 46.9 versus 42.9
- MCP Atlas: 76.0 versus 69.4
- MCP Mark Verified: 81.1 versus 72.8
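The raw scores are easier to compare once they are converted into absolute and relative gains. The sketch below does that arithmetic using only the vendor-reported numbers listed above; the function and variable names are illustrative.

```python
# Scores as reported by Moonshot AI: (K2.7 Code, K2.6).
SCORES = {
    "Kimi Code Bench v2": (62.0, 50.9),
    "Program Bench": (53.6, 48.3),
    "MLS Bench Lite": (35.1, 26.7),
    "Kimi Claw 24/7 Bench": (46.9, 42.9),
    "MCP Atlas": (76.0, 69.4),
    "MCP Mark Verified": (81.1, 72.8),
}


def gains(new: float, old: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = new - old
    relative = absolute / old * 100
    return round(absolute, 1), round(relative, 1)


if __name__ == "__main__":
    for name, (new, old) in SCORES.items():
        absolute, relative = gains(new, old)
        print(f"{name}: +{absolute} points ({relative}% relative)")
```

On these figures, the relative improvement over K2.6 ranges from about 9.3% on Kimi Claw 24/7 Bench to about 31.5% on MLS Bench Lite.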
Moonshot compared the new model against GPT 5.5 and Claude Opus 4.8 on several benchmarks. K2.7 Code remains competitive, but the competing models still score higher in some categories.
An important detail for developers is that the published benchmark results currently come from Moonshot AI itself. Independent evaluations on widely followed public benchmarks have not yet been released, meaning organizations will likely want to validate performance against their own workloads before making deployment decisions.
Efficiency Is the Main Selling Point
One of the biggest claims surrounding Kimi K2.7 Code is its improved reasoning efficiency.
Moonshot says the model reduces thinking token consumption by approximately 30% compared with K2.6 while still delivering better benchmark results. For development teams running coding agents at scale, lower token usage can translate into reduced infrastructure costs and faster response times.
The model operates with thinking mode permanently enabled. Unlike some AI systems that allow reasoning to be switched off, K2.7 Code always performs internal reasoning before generating a response.
For developers building AI-powered workflows, this makes the model's behavior more predictable, although it also means every request carries some thinking tokens, so token usage has to be budgeted differently than with models that offer optional reasoning modes.
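How much a 30% cut in thinking tokens shrinks a whole response depends on how large the thinking portion was to begin with. The sketch below illustrates that relationship. The 8,000 and 2,000 token counts are hypothetical, and the 30% figure is Moonshot's own claim rather than an independent measurement.

```python
def output_tokens_after_reduction(
    thinking_tokens: int,
    answer_tokens: int,
    reduction: float = 0.30,  # vendor-claimed cut in thinking tokens
) -> int:
    """Estimate total generated tokens after thinking tokens shrink by `reduction`."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    reduced_thinking = round(thinking_tokens * (1.0 - reduction))
    # Only the thinking portion shrinks; the visible answer is unchanged.
    return reduced_thinking + answer_tokens


if __name__ == "__main__":
    thinking, answer = 8_000, 2_000  # hypothetical per-request token counts
    before = thinking + answer
    after = output_tokens_after_reduction(thinking, answer)
    saved_pct = (before - after) / before * 100
    print(f"before: {before}, after: {after}, saved: {saved_pct:.0f}%")
```

In this hypothetical case the request drops from 10,000 to 7,600 generated tokens, a 24% overall saving, because only the thinking portion is reduced.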
Massive Architecture Designed for Scale
Under the hood, Kimi K2.7 Code uses a Mixture of Experts architecture featuring 1 trillion total parameters with 32 billion activated parameters per token.
The model supports a 256,000 token context window, allowing it to process large projects and extended conversations. It also includes MoonViT, a 400 million parameter vision encoder that enables multimodal capabilities.
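Two quick calculations help make these numbers concrete: the share of parameters active for each token, and a rough check of whether a codebase fits in the context window. This is a sketch under stated assumptions: the 4 characters per token ratio is a common rule of thumb rather than a property of Kimi's tokenizer, and the 16,000-token reserve for the model's output is a hypothetical choice.

```python
TOTAL_PARAMS = 1_000_000_000_000  # 1 trillion total parameters, as reported
ACTIVE_PARAMS = 32_000_000_000    # 32 billion activated per token, as reported
CONTEXT_WINDOW = 256_000          # context window in tokens, as reported


def active_fraction(active: int = ACTIVE_PARAMS, total: int = TOTAL_PARAMS) -> float:
    """Fraction of the model's parameters used for any single token."""
    return active / total


def fits_in_context(
    total_chars: int,
    chars_per_token: float = 4.0,  # assumption: rough heuristic, not measured
    reserve_tokens: int = 16_000,  # assumption: room left for the response
) -> bool:
    """Rough check of whether `total_chars` of source text fits in the window."""
    estimated_tokens = total_chars / chars_per_token
    return estimated_tokens + reserve_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    print(f"active share per token: {active_fraction():.1%}")
    print("800,000 chars fit:", fits_in_context(800_000))
    print("1,000,000 chars fit:", fits_in_context(1_000_000))
```

Only about 3.2% of the parameters are active for a given token, which is the property that keeps per-token compute far below what the 1 trillion total would suggest. For a real fit check, counting tokens with the model's own tokenizer would be more reliable than the character heuristic.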
Moonshot says the model can work with image and video inputs, which could prove useful for UI debugging, design reviews, visual inspection tasks, and full-stack development workflows.
Availability and Pricing
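Developers can access Kimi K2.7 Code as a hosted service through Kimi Code and the Kimi API, while the release is also available on Hugging Face for teams that prefer to run it themselves.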
Kimi Code membership plans start at $19 per month, with higher tiers offering larger usage limits and increased concurrency.
For API users, pricing is set at:
- $0.19 per million input tokens for cache hits
- $0.95 per million input tokens for cache misses
- $4.00 per million output tokens
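These three rates can be combined into a simple per-workload cost estimate. The sketch below is illustrative only: the cache hit rate and token counts are hypothetical inputs, and the article does not say how thinking tokens are billed, so `output_tokens` should include whatever the API counts as output.

```python
# Published API prices in US dollars per million tokens.
PRICE_INPUT_CACHE_HIT = 0.19
PRICE_INPUT_CACHE_MISS = 0.95
PRICE_OUTPUT = 4.00


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    cache_hit_rate: float = 0.0,  # hypothetical share of input served from cache
) -> float:
    """Estimate the API cost in dollars for a given token volume."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    hit_tokens = input_tokens * cache_hit_rate
    miss_tokens = input_tokens - hit_tokens
    total = (
        hit_tokens * PRICE_INPUT_CACHE_HIT
        + miss_tokens * PRICE_INPUT_CACHE_MISS
        + output_tokens * PRICE_OUTPUT
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens, half cached, 100K output tokens.
    cost = estimate_cost(1_000_000, 100_000, cache_hit_rate=0.5)
    print(f"estimated cost: ${cost:.2f}")
```

For this hypothetical workload the estimate comes to about $0.97, with output tokens accounting for $0.40 of it. Because output is priced at more than four times the uncached input rate, long agent sessions that generate heavily will be dominated by the output line.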
Moonshot also says organizations already running K2.6 deployments can migrate relatively easily using existing infrastructure built on frameworks such as vLLM, SGLang, and KTransformers.
SQ Magazine Takeaway
I think the most interesting part of this launch is not the benchmark numbers. Every AI company publishes impressive benchmark charts. What stands out here is the focus on efficiency and long-running coding workflows. A 30% reduction in reasoning tokens could have a real impact for teams operating coding agents every day.
At the same time, developers should be careful about relying solely on vendor-reported results. The true test for Kimi K2.7 Code will come when independent benchmarks and real production deployments start revealing how it performs against GPT 5.5, Claude, and other leading coding models. Still, Moonshot AI is clearly moving aggressively in the developer tools market, and K2.7 Code looks like one of its most ambitious releases yet.