Arm is launching Lumex CSS, a platform that enables faster, smarter and more personal AI delivered on consumer devices.
Arm’s new Lumex CSS platform drives double-digit performance gains for AI, with SME2-enabled Arm CPUs delivering up to five times faster AI performance, said Chris Bergey, SVP and GM of the client line of business at Arm, in a blog post.
The Arm Lumex CSS platform, a new compute platform for mobile devices, unlocks real-time on-device AI use cases like assistants, voice translation and personalization.
“Now [AI] is not a feature, it’s an expectation. In just over a year, real-time intelligence has become essential, and we’re only seeing the tip of the iceberg,” said Bergey in a press briefing. “Since last year, AI hasn’t just become more powerful, it’s become more personal. It now understands, adapts and reacts in real time, all without ever needing to leave your device, whether it’s streamlining your workflow, helping you communicate across languages, or anticipating what you need before you ask. AI is shifting from a tool to a companion, and expectations are growing, defining consumer choices.”
Developers can access SME2 performance with KleidiAI, now integrated into all major mobile OSes and AI frameworks, including PyTorch ExecuTorch, Google LiteRT, Alibaba MNN and Microsoft ONNX Runtime.
For flagship devices, the Arm Lumex CSS platform achieves an unprecedented six years of double-digit IPC performance gains. And the new Mali G1-Ultra, built for gamers, delivers a two-times uplift in ray tracing performance.
Arm’s AI stance

AI is no longer a feature; it’s the foundation of next-generation mobile and consumer technology, Bergey wrote. Users now expect real-time assistance, seamless communication and personalized content that is instant, private and available on device, without compromise. Meeting these expectations requires more than incremental upgrades; it demands a step change that brings performance, privacy and efficiency together in a scalable way.
“That’s why we’re introducing Arm Lumex, our most advanced compute subsystem (CSS) platform, purpose-built to accelerate AI experiences on flagship smartphones and next-gen PCs,” Bergey said. “Lumex unites our highest performing CPUs with Scalable Matrix Extension version 2 (SME2), GPUs and system IP with an optimized software stack, enabling the ecosystem to bring AI devices to market faster and deliver experiences from desktop-class mobile gaming to real-time translation, smarter assistants, and personalized applications.”
He said Arm is enabling SME2 across every CPU platform and that, by 2030, SME and SME2 will add over 10 billion TOPS of compute across more than 3 billion devices, delivering an exponential leap in on-device AI capability.
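For a rough sense of scale, the 2030 projection can be turned into a per-device average. This is only a back-of-the-envelope sketch: it treats “over 10 billion TOPS” and “more than 3 billion devices” as exact figures, which they are not, and the function name is illustrative rather than anything Arm provides.

```python
def average_tops_per_device(total_tops: float, devices: float) -> float:
    """Average compute per device, given a fleet-wide total."""
    return total_tops / devices


# Assumption: both figures are taken at their stated lower bounds.
TOTAL_TOPS = 10e9   # "over 10 billion TOPS"
DEVICES = 3e9       # "more than 3 billion devices"

avg = average_tops_per_device(TOTAL_TOPS, DEVICES)
print(f"Roughly {avg:.1f} TOPS per device on average")
```

Under those assumptions the projection works out to a little over 3 TOPS of SME and SME2 compute per device.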
“This shift is being driven by major advances in large language models and agentic AI. These aren’t static models anymore,” Bergey said. “They’re dynamic systems that reason, plan and take action on your behalf. The result is interactions that feel less like commands and more like collaboration. We have moved from AI being a parlor trick to influencing how things get done. People of all ages are using these experiences every day, embedded seamlessly into apps, devices and systems they rely on, but we have only started to see how AI will shape our future expectations…. What feels magical right now will be the bare minimum of tomorrow. People will soon expect every device to understand the natural voice, anticipate their needs and respond with context and intelligence. And if it doesn’t, they’ll be frustrated instantly.”
He said AI has to move to the device because relying on the cloud isn’t sustainable: it’s too expensive for developers, too slow for users and too concerning for privacy.
For Arm, this means the company has to evolve its computing to keep pace with the rapid growth of AI.
“The heart of modern AI is powerful with non-stop learning algorithms and models to support this. We need robust AI compute platforms, and our job is to deliver high-performing CPUs and GPUs. With that, AI-enabled consumer devices like smartphones and PCs are able to provide these experiences seamlessly and instantly,” Bergey said.
Arm’s advantage is power efficiency, which has helped the company succeed in mobile devices for decades. It has also helped Arm build an ecosystem of more than 22 million software developers.
“Now it’s time to take the next step to prepare for the AI era. We’ve continually challenged ourselves to go further, because the demands of AI are fundamentally reshaping how the compute gets built, deployed and scaled. We’ve evolved our offering from soft IP to delivering full subsystems, and now we are proud to say we offer a complete AI-first platform, one that unifies computing performance, power efficiency and scalable system design,” he said.
Partners can choose exactly how they build Lumex into their SoC – they can take the platform as delivered and leverage cutting-edge physical implementations tailored to their needs, reaping time to market and time to performance benefits. Alternatively, partners can configure the platform RTL for their targeted tiers and harden the cores themselves.
Lumex and Arm’s simplified naming conventions across its portfolio were announced earlier this year. The platform combines a next-generation SME2-enabled Armv9.3 CPU cluster, including the C1-Ultra and C1-Pro, powering flagship devices.
It also has the new C1-Premium, purpose-built for the sub-flagship market and providing best-in-class area efficiency, as well as the new Mali G1-Ultra GPU, whose next-generation ray tracing enables advanced graphics and gaming, plus a boost to AI performance.
And it has the C1 DSU, the most flexible and power-aware DynamIQ Shared Unit (DSU) Arm has delivered to date; optimized physical implementations for 3nm manufacturing nodes; and deep integration across the software stack, delivering seamless AI acceleration for developers using KleidiAI libraries.
Accelerated AI everywhere with SME2-enabled CPUs

The SME2-enabled Arm C1 CPU cluster provides dramatic AI performance gains for real-world, AI-driven tasks. It provides up to a five-times uplift in AI performance, 4.7 times lower latency for speech-based workloads and 2.8 times faster audio generation.
This leap in CPU AI compute enables real-time, on-device AI inference capabilities, providing users with smoother, faster experiences across interactions like audio generation, computer vision, and contextual assistants.
So what does this mean in real-world use cases? SME2 can deliver a whole new level of responsiveness and efficiency. For example, the Smart Yoga Tutor demo app saw a 2.4-times boost in text-to-speech, meaning users get instant feedback on their poses, all without draining battery life.
Together with Alipay and Vivo, Arm achieved a 40% reduction in LLM response time during user interactions, showing that SME2 delivers faster real-time generative AI on-device.
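The figures in this section mix two conventions: some are quoted as “N times” lower latency or faster generation, while the Alipay and Vivo result is quoted as a percentage reduction in time. A short sketch of the conversion puts them on one scale. It assumes “N times lower latency” and “N times faster” both mean the task takes 1/N of the original time, which is the usual reading but is not spelled out in the announcement; the helper names are illustrative.

```python
def speedup_to_time_reduction(speedup: float) -> float:
    """Fraction of time saved when a task runs `speedup` times faster."""
    return 1.0 - 1.0 / speedup


def time_reduction_to_speedup(reduction: float) -> float:
    """Speedup factor implied by cutting time by `reduction` (0 to 1)."""
    return 1.0 / (1.0 - reduction)


# Figures as quoted in the article.
speech_saving = speedup_to_time_reduction(4.7)   # speech latency
audio_saving = speedup_to_time_reduction(2.8)    # audio generation
llm_speedup = time_reduction_to_speedup(0.40)    # Alipay/Vivo LLM response

print(f"4.7x lower latency  = {speech_saving:.1%} less time")
print(f"2.8x faster         = {audio_saving:.1%} less time")
print(f"40% less time       = {llm_speedup:.2f}x faster")
```

On that reading, the 4.7-times latency figure corresponds to about 79% less waiting, and the 40% reduction in LLM response time corresponds to a speedup of roughly 1.67 times.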
SME2 isn’t just about speed; it’s also unlocking AI-powered capabilities that traditional CPUs can’t match.
For example, neural camera denoising now runs at over 120fps in 1080p or 30fps in 4K, all on a single core. That enables smartphone users to capture sharper, crystal-clear images even in the darkest scenes, allowing for smoother interactions and richer experiences on everyday devices.
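The two denoising figures describe the same processing rate seen at different resolutions. A quick check of pixel throughput shows why. The sketch assumes 1080p means 1920x1080 and 4K means 3840x2160 (UHD), and it treats “over 120fps” as exactly 120; none of those details are stated in the announcement.

```python
# Assumed frame sizes; the article does not give exact dimensions.
RESOLUTIONS = {
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}


def pixels_per_second(resolution: str, fps: int) -> int:
    """Pixels processed per second at a given resolution and frame rate."""
    width, height = RESOLUTIONS[resolution]
    return width * height * fps


hd_rate = pixels_per_second("1080p", 120)
uhd_rate = pixels_per_second("4K", 30)

print(f"1080p @ 120fps: {hd_rate:,} pixels/s")
print(f"4K    @  30fps: {uhd_rate:,} pixels/s")
```

A 4K frame holds four times as many pixels as a 1080p frame, so a quarter of the frame rate gives the same throughput: about 249 million pixels per second in both cases.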
Unlike cloud-first AI, which is constrained by latency, cost, and privacy concerns, Lumex brings intelligence directly to the device where it’s faster, safer, and always available. SME2 is being embraced by leading ecosystem players including Alibaba, Alipay, Honor, Samsung LSI and Tencent.
Architectural freedom for every product tier

Lumex offers partners the freedom to balance peak performance, sustained efficiency, and silicon area in products ranging from high-end smartphones and PCs to emerging AI-first form factors.
Enabling desktop-class gaming and faster AI inference on Mali GPU

With over 12 billion Arm GPUs shipped to date, Arm is at the center of mobile gaming experiences. The new Arm Mali G1-Ultra GPU continues to push the boundaries of mobile gaming, delivering high-fidelity, console-class graphics. This is made possible by a brand-new Ray Tracing Unit v2 (RTUv2), powering advanced lighting, shadows and reflections, leading to a 2x uplift in ray tracing performance compared to its predecessor. For AI workloads, the G1-Ultra enables up to 20% faster inference performance, enhancing responsiveness across real-time applications.
The Mali G1-Ultra delivers 20% better performance across graphics benchmarks compared to the previous generation, with across-the-board improvements for leading titles, including Arena Breakout, Fortnite, Genshin Impact, and Honkai: Star Rail. The G1-Premium and G1-Pro GPUs deliver superior performance and power efficiency for constrained devices.
Finally, developer-friendly AI for mobile

For developers, AI experiences just work on the Lumex platform. Through the KleidiAI integration across major frameworks including PyTorch ExecuTorch, Google LiteRT, Alibaba MNN and Microsoft ONNX Runtime, apps automatically benefit from SME2 acceleration with no code changes required.
For developers building cross-platform apps, Lumex brings new portability: Google apps like Gmail, YouTube and Google Photos are already SME2-ready, ensuring seamless integration as Lumex-based devices hit the market.
Cross-platform portability means optimizations built for Android can seamlessly extend to Windows on Arm and other platforms. Partners like Alipay are already showcasing on-device LLMs running efficiently with SME2.
Technology leaders, including Apple, Samsung and MediaTek, are integrating AI acceleration capabilities for faster, more efficient on-device AI. Apple is powering Apple Intelligence; Samsung and MediaTek are improving the responsiveness and efficiency of real-time AI applications such as translation, summarization and personal assistants using Google Gemini.
Arm Lumex: Platform-level intelligence for the AI era

Arm positions Lumex as more than its most advanced CSS platform for the consumer computing market; the company calls it the foundation for the next era of intelligent, AI-enabled experiences. For OEMs and developers alike, Arm says, Lumex provides the tools to deliver personal, private and high-performance AI at the edge.
“Through deep integration with SME2, MNN enables low-latency, quantized inference for billion-parameter models like Qwen on smartphones — showcasing Arm and Alibaba’s joint innovation in scalable, next-gen mobile AI,” said Xiaotang Jiang, Head of MNN, Taobao and Tmall Group at Alibaba, in a statement.
“With Arm’s SME2 instruction set, Alipay is accelerating the deployment of on-device large language models to enable smarter, more responsive mobile experiences. SME2’s matrix acceleration and efficient low-precision computing allow us to deliver real-time AI with low latency, enhanced privacy, and seamless integration into everyday scenarios,” said Xindan Weng, Head of Client Engineering at Alipay, in a statement.
“SME2-enhanced hardware enables more advanced AI models, like Gemma 3, to run directly on a wide range of devices. As SME2 continues to scale, it will enable mobile developers to seamlessly deploy the next generation of AI features across ecosystems. This will ultimately benefit end-users with low-latency experiences that are widely available on their smartphones,” said Iliyan Malchev, distinguished software engineer, Android at Google, in a statement.
Tencent has demoed a smart coach that runs alongside the game, offering comments on your play style. That’s a feature that will keep people in the game, and that translates into real gains in microtransactions and monetization. Krafton’s InZoi also has a variety of uses for LLMs in the game.