- Mu is Microsoft's new small language model, optimized to run locally on Windows 11 devices with NPUs.
- Its first integration is in the Windows 11 Settings agent, which lets users change settings using natural language.
- Mu stands out for its efficiency and speed, reaching more than 100 tokens per second with just 330 million parameters.
- It includes innovations such as Dual LayerNorm, RoPE, and GQA, and has been trained using advanced processes and high-quality educational data.

The arrival of Mu, the latest small language model presented by Microsoft, marks a significant step in the current trend of placing artificial intelligence directly on users' devices. With the aim of reducing cloud dependence and harnessing the potential of Neural Processing Units (NPUs), Mu is integrated into Copilot+ PCs running Windows 11, initially focusing on the Settings app to make it easier to find and change system settings using plain natural language.
This advance means that, instead of sending queries to external servers, processing happens and responses are generated on the device itself, which improves privacy, agility, and efficiency. For the moment, the rollout is targeted at Windows Insider Program participants with Copilot+ PCs, although the expectation is that this technology will be extended to more users and functions in future updates.
What is Mu really and what makes it stand out?

Mu is a small language model (SLM) with 330 million parameters. Its compact size does not mean a sacrifice in performance: according to Microsoft, it achieves results very close to those of much larger models such as Phi-3.5-mini. This balance has been achieved thanks to a rigorous training process and to techniques such as Dual LayerNorm, Rotary Positional Embeddings (RoPE), and Grouped-Query Attention (GQA), which provide efficiency and precision, especially on devices with limited resources.
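To make two of those techniques more concrete, here is a minimal sketch in plain Python. It is not Mu's implementation: the function names, the base of 10000, and the head counts are illustrative assumptions. `rope` rotates each pair of vector components by an angle that depends on the token position, and `kv_head_for` shows how GQA lets several query heads share one key/value head.

```python
import math


def rope(vec, position, base=10000.0):
    """Apply rotary positional embedding to one vector (illustrative sketch).

    Each consecutive pair (vec[i], vec[i + 1]) is rotated by an angle that
    grows with the token position and shrinks with the pair index.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product, used here as the attention score."""
    return sum(x * y for x, y in zip(a, b))


def kv_head_for(query_head, n_query_heads, n_kv_heads):
    """Grouped-Query Attention: map a query head to the K/V head it shares."""
    if n_query_heads % n_kv_heads != 0:
        raise ValueError("query heads must divide evenly into K/V heads")
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size
```

With RoPE, the score between a query and a key depends only on how far apart the two tokens are, not on their absolute positions. With GQA, fewer key/value heads have to be stored, which is one reason the technique is attractive on memory-constrained devices.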
The model uses a transformer encoder-decoder architecture, capable of processing user input and transforming it into actions within the system. With this structure, Mu separates input processing from output generation, which reduces latency and memory consumption, two key points for a smooth, wait-free user experience.
In official tests and figures, Mu has proven capable of generating more than 100 tokens per second and providing responses in less than 500 milliseconds. These numbers allow for virtually instantaneous interactions, even when it comes to modifying settings or interpreting long and varied queries in everyday language. If you'd like to dig deeper into how these models work, you can check out comparisons between language models on PC.
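As a rough back-of-the-envelope check of what those two figures imply together, the sketch below converts between throughput and response time. The helper names are hypothetical, and the calculation assumes the whole time budget is spent generating output tokens, ignoring the time needed to read the input.

```python
import math


def tokens_within_budget(tokens_per_second, budget_ms):
    """How many output tokens fit into a time budget (decoding time only)."""
    return math.floor(tokens_per_second * budget_ms / 1000)


def generation_time_ms(num_tokens, tokens_per_second):
    """Time needed to generate a given number of tokens, in milliseconds."""
    return 1000.0 * num_tokens / tokens_per_second


# At 100 tokens per second, a 500 ms budget leaves room for about 50 tokens.
print(tokens_within_budget(100, 500))
# A 20-token answer would take roughly 200 ms at the same speed.
print(generation_time_ms(20, 100))
```

Under these assumptions, a short command such as a settings change fits comfortably inside the half-second window.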
Integration into the Settings agent and practical functions
Mu's debut is centered on the Windows 11 Settings agent, a feature that allows users to adjust system settings by simply typing or saying what they need. For example, just ask "How do I activate dark mode?" or say "I want to increase the brightness", and Mu translates that instruction into the corresponding technical action within the system.
Microsoft has stressed that the AI adapts to tens of thousands of different contexts and queries. In fact, more than 3.6 million training samples have been used to cover everything from the most common requests, such as changing the language or managing Wi-Fi networks, to more complex tasks. For questions that are too short or ambiguous, the system falls back on traditional search, but when the instruction is clear and detailed, Mu acts automatically or guides the user step by step.
Technology and optimization adapted to new generations of hardware

Optimizing Mu has been one of the most carefully considered aspects of its development. Microsoft has worked with silicon partners such as AMD, Intel, and Qualcomm to adapt it to the specifics of the new NPUs in Copilot+ PCs. This joint work has made it possible to apply post-training quantization, which converts the model's weights and activations to 8- and 16-bit integers, reducing memory consumption without the need to retrain the model.
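The idea behind post-training quantization can be sketched in a few lines of Python. This is a simplified illustration, not Microsoft's pipeline: it assumes symmetric 8-bit quantization with a single scale per list of weights, and all function names are hypothetical.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


def weight_storage_mb(num_params, bits_per_weight):
    """Approximate storage for the weights alone, in megabytes."""
    return num_params * bits_per_weight / 8 / 1e6


weights = [0.5, -1.27, 0.0, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
# Each restored value differs from the original by at most half a step.
print(quantized, restored)
```

No retraining is involved: the weights are only rescaled and rounded. For a model with 330 million parameters, `weight_storage_mb(330_000_000, 8)` gives about 330 MB for the weights, compared with about 1,320 MB at 32 bits per weight.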
Mu's training process was carried out in high-performance environments, using NVIDIA A100 GPUs within Azure Machine Learning. The dataset included hundreds of billions of educational tokens, and techniques such as distillation from Phi models and Low-Rank Adaptation (LoRA) were used to transfer knowledge and fine-tune the model for specific tasks. The end result is a small, agile model that's well suited to the resources and limitations of modern portable hardware. You can also explore how to turn your PC into a local AI hub to expand the capabilities of your system.
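To illustrate why Low-Rank Adaptation is a cheap way to fine-tune, the following sketch shows the general LoRA idea in plain Python: the original weight matrix stays frozen and only two small matrices are trained, whose product is added on top. The matrix sizes, the rank, and the `alpha` scaling value are illustrative assumptions, not details of Mu's training.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_merge(w, b, a, alpha):
    """Return W + (alpha / r) * B @ A, where r is the rank of the update."""
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


def lora_trainable_params(d, k, rank):
    """Parameters trained by LoRA for a d x k weight matrix."""
    return rank * (d + k)


# A 1024 x 1024 layer has 1,048,576 weights; a rank-8 update trains 16,384.
print(lora_trainable_params(1024, 1024, 8))
```

Because only the small matrices change, a task-specific adaptation can be trained and stored at a fraction of the cost of updating every weight.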
Current challenges, availability and future prospects
One of the biggest challenges facing Mu is the interpretation of ambiguous or very brief queries, a common problem in natural language-based systems. To address this, Microsoft has implemented a hybrid logic: short queries trigger traditional search results, while more detailed instructions trigger AI intervention, either to guide the user or to perform automated actions.
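A toy version of that routing decision might look like the following. The word-count threshold and the function name are hypothetical, chosen only to illustrate the split between search and the AI agent; the actual criteria used in Windows are not described here.

```python
def route_query(query, min_words=3):
    """Send short queries to classic search and longer ones to the agent.

    `min_words` is an assumed threshold for illustration only.
    """
    words = query.split()
    if len(words) < min_words:
        return "search"
    return "agent"


print(route_query("brightness"))
print(route_query("I want to increase the brightness"))
```

A single keyword such as "brightness" would go to regular search, while a full sentence would be handed to the model.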
For now, Mu is only available in English and on Copilot+ devices through the Insider channel, although it is expected to be expanded to more languages and other devices in the coming months, including those with AMD and Intel processors. Privacy and security also play a fundamental role, given the local nature of the processing.
The deployment of Mu is just the beginning of a broader strategy by Microsoft to incorporate local AI and efficient language models into even more applications and aspects of the operating system, improving the experience and accessibility without sacrificing performance or privacy.

