Apple AI Makes Technological Breakthrough: Large Language Model Now Runnable on iPhone

2023-12-22

Apple AI researchers have developed a technique that uses flash memory to store AI model weights, enabling large language models (LLMs) to run on memory-constrained devices like the iPhone. The work brings faster on-device AI to iPhones and lays the foundation for Apple's generative AI model, "Ajax".

LLMs and Memory Constraints

LLMs, such as the models behind the ChatGPT and Claude chatbots, are data- and memory-intensive AI applications. They typically require a significant amount of memory to run, which poses a challenge for devices with limited memory capacity like iPhones. Apple's researchers have developed a novel technique that uses flash memory (the same storage used for apps and photos) to hold AI model data.

Storing AI on Flash Memory

In a recent research paper, the authors point out that flash memory is far more abundant in mobile devices than the random-access memory (RAM) traditionally used to run LLMs. Their approach combines two techniques that reduce data transfers and increase flash memory throughput:

1. Windowing: a recycling method in which the model reuses weights already loaded for recently processed tokens instead of fetching fresh data from flash at every step. This cuts down on memory retrieval, making inference faster and smoother.

2. Row-Column Bundling: a grouping method that stores related rows and columns of the model's weight matrices together, so larger contiguous chunks can be read from flash in a single access, accelerating the model's language understanding and generation.

According to the paper, these methods allow iPhones to run AI models up to twice the size of the available memory. This translates into a 4-5x speed improvement on standard processors (CPUs) and a 20-25x speed improvement on graphics processors (GPUs). The authors state, "This is crucial for deploying advanced LLMs in resource-constrained environments, expanding their applicability and accessibility."
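To make the windowing idea above concrete, the sketch below simulates a sliding-window cache: weights for neurons activated by the last few tokens stay resident in RAM, and only uncached neurons trigger a (simulated) flash read. The class name, window size, and eviction policy here are illustrative assumptions for the sketch, not Apple's actual implementation.

```python
class WindowedNeuronCache:
    """Sliding-window cache: keep weights of recently active neurons in RAM.

    Illustrative sketch only; not Apple's actual implementation.
    """

    def __init__(self, window):
        self.window = window   # how many recent tokens' neurons to keep warm
        self.recent = []       # active-neuron sets for the last `window` tokens
        self.cached = set()    # neuron ids currently resident in RAM
        self.flash_loads = 0   # count of simulated flash -> RAM transfers

    def step(self, active_neurons):
        # Only neurons not already warmed by recent tokens hit flash.
        for neuron in set(active_neurons) - self.cached:
            self.flash_loads += 1           # stand-in for a real flash read
        self.recent.append(set(active_neurons))
        if len(self.recent) > self.window:  # slide the window forward
            self.recent.pop(0)
        # Evict anything no token in the current window still needs.
        self.cached = set().union(*self.recent)


# Example: with a window of 2 tokens, overlapping neurons are served from RAM.
cache = WindowedNeuronCache(window=2)
for active in [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]:
    cache.step(active)
# 5 flash loads instead of 9, since overlapping neurons were reused.
```

Because consecutive tokens tend to activate overlapping sets of neurons, most lookups hit the RAM cache and the expensive flash transfers happen only for the small set of newly needed weights.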
Faster AI on iPhones

The efficiency gains open up new possibilities for future iPhones, such as more advanced Siri features, real-time language translation, and sophisticated AI-driven features in photography and augmented reality. The technology also lays the groundwork for running complex AI assistants and chatbots on-device, something Apple is reportedly already researching.

Apple's generative AI work may eventually be integrated into the Siri voice assistant. In February 2023, Apple held an AI summit where it introduced its work on large language models. According to Bloomberg, Apple aims to make Siri smarter and more deeply integrated with AI: it plans to update how Siri interacts with messaging apps so users can handle complex queries more efficiently and auto-complete sentences, and it also plans to bring AI to more of its applications.

Apple GPT

Reports suggest that Apple is developing its own generative AI model, code-named "Ajax", to compete with OpenAI's GPT-3 and GPT-4. Ajax reportedly runs on 200 billion parameters, suggesting a high level of complexity and capability in language understanding and generation. Dubbed "Apple GPT", Ajax aims to unify machine learning development across Apple, hinting at a broader strategy to integrate AI more deeply into the company's ecosystem. Ajax is said to be more capable than the earlier GPT-3.5, though some suggest that OpenAI's newer models may have surpassed it as of September 2023.

Analyst Jeff Pu predicts that Apple will introduce some form of generative AI on iPhones and iPads around late 2024, coinciding with the release of iOS 18. Pu said in October that Apple built hundreds of AI servers in 2023 and will add more in 2024. Apple is expected to offer a combination of cloud-based AI and on-device AI.