Apple AI Research: Advances in 3D Avatar Creation and Language Model Optimization for Mobile Devices

2023-12-21

Apple, a company that has almost become synonymous with technological innovation, has once again positioned itself at the forefront of the artificial intelligence revolution.

Apple recently announced significant advances in artificial intelligence research through two new papers, one introducing a technique for animated 3D avatars and the other a method for efficient language model inference. These advances could lead to more immersive visual experiences and enable complex AI systems to run on consumer devices such as iPhones and iPads.

In the first paper, Apple researchers propose Human Gaussian Splats (HUGS), a technique for generating animated 3D avatars from short monocular videos (videos captured with a single camera). "Our method automatically learns to disentangle the static scene from a fully animatable human avatar using just a monocular video of 50-100 frames, within 30 minutes," said lead author Muhammed Kocabas.

HUGS uses 3D Gaussian splatting, an efficient rendering technique, to represent both the human and the background scene. The human model is initialized from SMPL, a statistical body shape model, but HUGS allows the Gaussians to deviate from it in order to capture details such as clothing and hair.
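To make the setup concrete, here is a minimal sketch (not Apple's code) of how a set of 3D Gaussians might be initialized from SMPL vertices with a learnable offset per vertex; the parameter names and default values are illustrative assumptions.

```python
# Minimal sketch: one 3D Gaussian per SMPL body vertex, with a learnable offset
# so the representation can drift away from the bare body surface to capture
# clothing and hair. Illustrative only; not the HUGS implementation.
import torch

def init_human_gaussians(smpl_vertices: torch.Tensor) -> dict:
    """smpl_vertices: (N, 3) rest-pose vertex positions from an SMPL body model."""
    n = smpl_vertices.shape[0]  # SMPL has 6890 vertices
    return {
        "base": smpl_vertices,                                       # fixed body surface
        "offsets": torch.zeros(n, 3, requires_grad=True),            # learned deviation (clothing, hair)
        "log_scales": torch.full((n, 3), -4.0, requires_grad=True),  # per-axis Gaussian extent
        "rotations": torch.cat([torch.ones(n, 1), torch.zeros(n, 3)], dim=1).requires_grad_(True),  # identity quaternions
        "opacities": torch.full((n, 1), 0.1, requires_grad=True),
        "colors": torch.rand(n, 3, requires_grad=True),
    }

def gaussian_centers(params: dict) -> torch.Tensor:
    # Optimized Gaussian centers = body surface + learned deviation.
    return params["base"] + params["offsets"]
```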

A novel neural deformation module animates the Gaussians realistically using linear blend skinning, avoiding artifacts when the avatar is reposed. Kocabas stated that HUGS "enables novel pose synthesis for humans and novel viewpoint synthesis for both humans and scenes."
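As a rough illustration of linear blend skinning itself (the blending step, not the paper's neural deformation network), each Gaussian center can be moved by a weighted combination of per-joint rigid transforms; the function below is a generic sketch with assumed tensor shapes.

```python
# Generic linear blend skinning (LBS) sketch: every point is transformed by each
# joint's rotation and translation, then the results are blended with per-point
# skinning weights. In a HUGS-style pipeline the weights would be predicted by
# the neural deformation module; here they are simply passed in.
import torch

def lbs_deform(points: torch.Tensor,              # (N, 3) Gaussian centers in rest pose
               weights: torch.Tensor,             # (N, J) skinning weights, rows summing to 1
               joint_rotations: torch.Tensor,     # (J, 3, 3) per-joint rotation matrices
               joint_translations: torch.Tensor,  # (J, 3) per-joint translations
               ) -> torch.Tensor:
    # Apply every joint transform to every point: shape (J, N, 3).
    per_joint = torch.einsum("jab,nb->jna", joint_rotations, points) + joint_translations[:, None, :]
    # Blend across joints with the skinning weights: shape (N, 3).
    return torch.einsum("nj,jna->na", weights, per_joint)
```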

Compared with previous avatar generation methods, HUGS is up to 100 times faster to train and render: the researchers obtained realistic results after only about 30 minutes of optimization on a typical gaming GPU. HUGS also surpasses state-of-the-art techniques such as Vid2Avatar and NeuMan in 3D reconstruction quality.

Apple's 3D modeling capability is genuinely impressive. Real-time rendering and the ability to create avatars from real-world videos could soon unlock new possibilities for virtual try-ons, telepresence, and synthetic media. Imagine capturing a video with an iPhone camera and turning it into a fully animatable 3D scene.

Bridging the Memory Gap in AI Inference

In the second paper, Apple researchers address a key challenge in deploying large language models (LLMs) on memory-constrained devices. Modern language models such as GPT-4 reportedly contain hundreds of billions or even trillions of parameters, making inference on consumer-grade hardware prohibitively expensive.

The proposed system minimizes data transfers from flash memory to scarce DRAM during inference. "Our approach involves building an inference cost model that is consistent with flash memory behavior, guiding optimizations in two critical areas: reducing the amount of data transferred from flash and reading data in larger, more contiguous blocks," explained lead author Keivan Alizadeh.
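As a back-of-the-envelope illustration of why such a cost model pushes toward minimizing flash reads, consider a toy latency estimate; the throughput and compute figures below are illustrative assumptions, not numbers from the paper.

```python
# Toy per-token latency model: compute time plus time spent reading weights from
# flash and DRAM. The throughput and compute figures are placeholder assumptions,
# chosen only to show that flash traffic dominates the total.

def token_latency_s(bytes_from_flash: float,
                    bytes_from_dram: float,
                    flash_gbps: float = 1.0,    # assumed flash read throughput (GB/s)
                    dram_gbps: float = 100.0,   # assumed DRAM read throughput (GB/s)
                    compute_s: float = 0.005,   # assumed compute time per token (s)
                    ) -> float:
    flash_s = bytes_from_flash / (flash_gbps * 1e9)
    dram_s = bytes_from_dram / (dram_gbps * 1e9)
    return compute_s + flash_s + dram_s

# Reading only ~200 MB of the weights actually needed for this token from flash
# is far cheaper than pulling a full 2 GB block:
print(token_latency_s(200e6, 50e6))  # ~0.21 s
print(token_latency_s(2e9, 50e6))    # ~2.01 s
```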

Two main techniques are introduced. "Windowing" reuses the activations of recently inferred tokens so that only newly needed parameters are fetched, while "row-column bundling" stores associated rows and columns together so data can be read from flash in larger contiguous chunks. Together these methods improve inference latency by 4-5 times over naive loading on an Apple M1 Max CPU, and by 20-25 times on the GPU.
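To give a flavour of the windowing idea (row-column bundling is a storage-layout change and is not shown), the sketch below keeps the weight rows of recently active neurons cached in DRAM and fetches only the missing rows from flash; the class name and the read_rows_from_flash helper are hypothetical stand-ins, not Apple's implementation.

```python
# Illustrative "windowing" cache: weight rows for neurons active in the last few
# tokens stay resident in DRAM, so only rows not seen recently trigger a flash read.
from collections import deque

class WindowedRowCache:
    def __init__(self, window_size: int = 5):
        self.window = deque(maxlen=window_size)  # per-token sets of active neuron ids
        self.cached = {}                         # neuron id -> weight row kept in DRAM

    def load_for_token(self, active_ids, read_rows_from_flash):
        # Only rows not already resident need a flash read.
        missing = [i for i in active_ids if i not in self.cached]
        if missing:
            self.cached.update(read_rows_from_flash(missing))
        # Slide the window and evict rows no longer referenced by any recent token.
        self.window.append(set(active_ids))
        live = set().union(*self.window)
        self.cached = {i: row for i, row in self.cached.items() if i in live}
        return {i: self.cached[i] for i in active_ids}
```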

"This breakthrough is particularly important for deploying advanced LLMs in resource-constrained environments, expanding their applicability and accessibility," said co-author Mehrdad Farajtabar. These optimizations could soon enable smooth running of complex AI assistants and chatbots on iPhones, iPads, and other mobile devices.

Apple's Strategic Vision

Both papers demonstrate Apple's growing leadership in AI research and applications. While the prospects are promising, experts caution that Apple will need to integrate these technologies into consumer products carefully and responsibly, giving due consideration to privacy protection, the risk of misuse, and other social impacts.

Apple may integrate these innovations into its product lineup, a sign that the company is not only enhancing its devices but also anticipating future demand for AI-powered services. By enabling more complex AI models to run on memory-constrained devices, Apple may be paving the way for a new class of applications and services that leverage the capabilities of LLMs in previously unachievable ways.

If applied with caution, Apple's latest innovations could take artificial intelligence to a new level. Realistic digital avatars and powerful AI assistants on portable devices once seemed out of reach—but thanks to Apple's scientists, the future is rapidly becoming a reality.