Meta brings its Llama models to smartphones, and the result is impressive: the AI remains powerful without draining RAM or monopolizing the processor. How did Meta pull off this feat, which leaves Apple and Google in the lurch? Here are the explanations.
Running AI directly on smartphones with low power consumption is the dream of every company specializing in AI development. Meta now announces that it has made this dream a reality: its Llama 3.2 1B and 3B models run up to four times faster on a phone. Even better, the models use less than half the memory of previous versions.
How did Meta achieve this result? With a compression technique based on quantization, which simplifies the mathematical calculations that drive AI models. This is, in essence, the principle explained to Futura by Mehdi Bennis, a researcher working on this type of artificial intelligence at the University of Oulu in Finland. To maintain response accuracy despite a significantly smaller model, Meta relied on two methods: QLoRA adapters and SpinQuant.
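To see the principle at work, here is a minimal sketch of the core idea behind quantization. It is illustrative only, not Meta's actual pipeline (which uses finer-grained 4-bit schemes): 32-bit floating-point weights are mapped to 8-bit integers, shrinking memory use and allowing cheaper integer arithmetic.

```python
# Illustrative sketch of symmetric int8 quantization (not Meta's pipeline).
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float32 weights to int8 with a single per-tensor scale."""
    scale = np.abs(weights).max() / 127.0              # largest weight maps to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Approximately recover the original weights for computation."""
    return q.astype(np.float32) * scale

weights = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(weights)
print("max reconstruction error:", np.abs(weights - dequantize(q, scale)).max())
print("storage: 4 bytes per weight -> 1 byte per weight")
```

The trade-off is exactly the one the article describes: a small loss of numerical precision in exchange for a model that is several times smaller and faster to run.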
The first, QLoRA, is based on quantization-aware training with LoRA adapters. The idea is to freeze the weights of the pretrained model and add small trainable matrices: the number of trainable parameters is drastically reduced, making the adaptation process far more efficient. The second, SpinQuant, is dedicated to portability. Ultimately, with this combination, colossal computing power is no longer needed to get a result. In tests on an Android phone, the OnePlus 12, the models were 56% smaller and used 41% less memory while processing text more than twice as fast. The only limitation concerns text generation, where the context is capped at 8,000 tokens.
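The LoRA idea described above can be sketched in a few lines of PyTorch. This is a simplified illustration under our own assumptions (names, sizes and the rank are ours, not Meta's code): the pretrained weight matrix stays frozen, and only two small low-rank matrices A and B are trained.

```python
# Simplified LoRA adapter: frozen base weight + trainable low-rank correction.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, rank: int = 8):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)   # frozen pretrained weight
        self.base.bias.requires_grad_(False)
        # Only these two small matrices are trained.
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output = frozen path + low-rank correction (B @ A) applied to x.
        return self.base(x) + x @ self.lora_a.T @ self.lora_b.T

layer = LoRALinear(4096, 4096, rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable:,} of {total:,} parameters")  # roughly 0.4%
```

With a rank of 8 on a 4096-by-4096 layer, fewer than half a percent of the parameters need training, which is why adaptation becomes so cheap.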
Running AI directly on the smartphone
But that’s not all. While Google and Apple embed their mobile AI directly into their operating systems, Meta is open-sourcing its compressed models and taking the opportunity to partner with chipmakers Qualcomm and MediaTek. With open source, developers no longer need to wait for Android or iOS updates to build AI apps. And through its partnerships with Qualcomm and MediaTek, Meta optimizes its models for these widespread processors, ensuring that its AI will work effectively on phones across price ranges, not just on high-end devices.
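In practice, this openness means a developer can experiment with the models right away. Here is a hedged sketch of what that might look like with the Hugging Face transformers library (the repository shown is the standard Llama 3.2 1B release and is gated behind Meta's license; actual on-device deployment goes through Meta's mobile tooling rather than this desktop-style API).

```python
# Sketch: trying an open Llama 3.2 model via Hugging Face transformers.
# Assumes access to the gated meta-llama repository has been granted.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Explain quantization in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```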
We can see what Meta is doing as similar to what happened in the world of computers: before the advent of the PC, processing power resided in central mainframes. It eventually ended up on PCs.
On the same principle, AI is now making the same move, shifting from servers to running directly on mobile devices. This will still require phones powerful enough to do the work, but the benefit will be greater data privacy on the mobile, since data no longer has to pass through the cloud to be processed. It is an approach that runs counter to Apple and Google’s vision for artificial intelligence on smartphones.