Google Proposes "Project Ellmann": Using AI to Tell Personal Life Stories

2023-12-11

A team at Google has proposed using artificial intelligence to create a "bird's-eye view" of users' lives from mobile phone data such as photos and searches.

The project, named "Project Ellmann" after biographer and literary critic Richard David Ellmann, aims to use large language models (LLMs) like Gemini to ingest search results, spot patterns in users' photos, create chatbots, and "answer previously impossible questions." The project's stated goal is to be "your life's storyteller."

It is currently unclear whether the company plans to implement these features in Google Photos or any other product. According to a blog post by Google, Google Photos has over 1 billion users and 4 trillion photos and videos.

Project Ellmann is just one of many ways Google is proposing to use artificial intelligence to create or improve its products. On Wednesday, Google launched Gemini, its most advanced AI model to date, which the company says outperforms OpenAI's GPT-4 in some cases. Google plans to license Gemini to a wide range of customers through Google Cloud for use in their own applications. One notable feature of Gemini is that it is multimodal, meaning it can process and understand information beyond text, including images, video, and audio.

According to the documents, a Google Photos product manager presented Project Ellmann to the Gemini team at a recent internal summit, writing that over the past few months the team had determined that large language models are the ideal technology for realizing this bird's-eye approach to a person's life story.

By drawing on biographical details, previous moments, and subsequent photos, Ellmann can describe a user's photos with more depth than "just pixels with tags and metadata," the presentation states. It can identify a range of moments, such as college years, Bay Area years, and years as a parent.

"Without a bird's-eye view, we can't answer difficult questions or tell good stories," reads one description alongside a photo.

"We will search in your photos, identify a meaningful moment by looking at their tags and locations," one slide reads. "When we step back and understand your life as a whole, your overall story becomes clear."

The presentation states that the large language model can infer moments like the birth of a user's child. "This LLM can infer that this is Jack's birth and that he is James and Gemma's first and only child," it says.

"One reason why LLM is so powerful for this bird's-eye approach is that it can acquire unstructured context and use it to enhance understanding of other areas," reads a slide with an illustration showing different "moments" and "chapters" of a user's life.

The presentation offered other examples, such as determining that a user had recently attended a class reunion. "It's been exactly 10 years since his graduation, and there are many people he hasn't seen in 10 years, so it's likely a reunion," the team inferred in its presentation.

The team also demonstrated "Ellmann Chat," introduced with the pitch: "Imagine opening ChatGPT, but it already knows everything about your life. What would you ask it?"

In one example conversation, a user asked, "Do I have a pet?" The chatbot replied that yes, the user has a dog that wears a red raincoat, then offered the dog's name and the names of the two family members it most often appears with.

In another example, a user asked the chatbot when their siblings had last visited, and Ellmann gave an answer for each sibling.

Ellmann could also summarize a user's eating habits, other slides showed: "You seem to enjoy Italian food. There are several photos of pasta and one of pizza." It added that the user seemed to enjoy trying new foods, since one of their photos showed a menu with a dish it didn't recognize.

The technology could also determine which products a user was considering buying, along with their interests, work, and travel plans, based on their screenshots, the presentation stated. It also suggested it could identify the user's favorite websites and apps, such as Google Docs, Reddit, and Instagram.

A Google spokesperson told CNBC, "Google Photos has been using artificial intelligence to help people search their photos and videos, and we're excited about the potential of LLMs to unlock even more useful experiences. This is an early-stage internal exploration, and as always, if we decide to launch new features, we'll take the time to ensure they're helpful to people and prioritize user privacy and safety in their design."

Big tech companies race to build AI-driven 'memories'

The proposed Project Ellmann could help Google gain an advantage in the race among tech giants to create more personalized life memories.

Google Photos and Apple Photos have been offering "memories" and albums generated based on photo trends for years.

In November, Google announced that, with the help of artificial intelligence, Google Photos can now group similar photos together and organize screenshots into easily searchable albums.

Apple announced in June that its latest software update would include the ability for its Photos app to recognize people, dogs, and cats in users' photos. It already has the ability to categorize faces and allows users to search for them by name.

Apple also announced an upcoming journaling app that will use on-device artificial intelligence to create personalized suggestions, based on recent photos, locations, music, and workouts, prompting users to write passages about their memories and experiences.

However, Apple, Google, and other tech giants are still grappling with the complexities of displaying and identifying images appropriately.

For example, Google and Apple stopped labeling gorillas after a 2015 report revealed that their software had mislabeled Black people as gorillas. A New York Times investigation this year found that Apple, along with Google's Android software, which underpins most of the world's smartphones, had turned off the ability to visually search for primates to avoid labeling a person as an animal.

Google, Apple, and other companies have added controls over time to reduce unwanted memories, but users report that such memories still occasionally surface, and suppressing them can require toggling multiple settings.