Sesame, the Startup Behind Popular Virtual Assistant Maya, Releases Its Foundation AI Model

2025-03-14

Sesame, an artificial intelligence company, has launched a foundational model that powers Maya, a strikingly realistic voice assistant.

The model has 1 billion parameters (the learned numerical weights that determine a model's behavior) and is released under the Apache 2.0 license, which permits commercial use with few restrictions. According to Sesame's description on the AI development platform Hugging Face, the model, named CSM-1B, generates "RVQ audio codes" from text and audio inputs.

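RVQ stands for "residual vector quantization," a technique that encodes audio into discrete tokens called codes. It works in stages: each stage picks the closest entry from its own codebook, and the next stage encodes whatever error is left over. RVQ is used in a number of recent AI audio technologies, including Google's SoundStream and Meta's EnCodec.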
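
To make the idea concrete, here is a minimal sketch of residual vector quantization in plain Python. It is an illustration only, not Sesame's code: the function names, the codebook sizes, and the random toy codebooks are all assumptions made for this example. Real systems learn their codebooks from audio data and quantize neural encoder outputs rather than raw hand-picked vectors.

```python
import random


def nearest(codebook, vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    def dist(codeword):
        return sum((a - b) ** 2 for a, b in zip(codeword, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rvq_encode(vec, codebooks):
    """Turn one vector into a list of codes, one code per codebook stage."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # The next stage only sees what this stage failed to capture.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def rvq_decode(codes, codebooks):
    """Rebuild an approximation by summing the chosen codewords."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


def make_toy_codebooks(num_stages=4, size=16, dim=4, seed=0):
    """Hypothetical random codebooks; real ones are learned from data.

    Each stage is scaled down so later stages refine finer detail. Entry 0
    is the zero vector, so an extra stage can never increase the error.
    """
    rng = random.Random(seed)
    codebooks = []
    for stage in range(num_stages):
        scale = 0.5 ** stage
        codebook = [[0.0] * dim]
        for _ in range(size - 1):
            codebook.append([rng.uniform(-1, 1) * scale for _ in range(dim)])
        codebooks.append(codebook)
    return codebooks


if __name__ == "__main__":
    books = make_toy_codebooks()
    frame = [0.3, -0.7, 0.1, 0.9]  # stand-in for one frame of audio features
    codes = rvq_encode(frame, books)
    print("codes:", codes)
    print("reconstruction:", rvq_decode(codes, books))
```

Each frame ends up as a short list of integers, one per stage, and those integers are the "codes" a model like CSM-1B is described as generating. Decoding with only the first few codes gives a coarser reconstruction, which is why the approach suits audio at varying levels of quality.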

CSM-1B uses a model from Meta’s Llama family as its backbone, paired with an audio "decoder" component. Sesame says a fine-tuned variant of CSM powers Maya.

"The model open-sourced here is a base generative model," Sesame writes in the Hugging Face and GitHub repositories for CSM-1B. "It is capable of generating multiple voices but hasn't been fine-tuned for any specific voice[…] Due to data contamination in the training dataset, the model has some capability to handle non-English languages, though potentially with reduced effectiveness."

It remains unclear what data Sesame used to train CSM-1B. The company has not provided specifics.

Notably, the model lacks robust safety measures. Sesame relies on an honor system, merely urging developers and users not to use the model to mimic others' voices without consent, create misleading content like fake news, or engage in "harmful" or "malicious" activities.

Sesame, co-founded by Oculus co-founder Brendan Iribe, gained attention at the end of February for its assistant technology, which comes close to escaping the uncanny valley. Maya and Sesame's other assistant, Miles, take audible breaths and speak with disfluencies, and they can be interrupted mid-sentence, much like OpenAI's voice models.

Sesame has raised an undisclosed amount of funding from Andreessen Horowitz, Spark Capital, and Matrix Partners. In addition to developing voice assistant technology, the company says it is prototyping "all-day wearable" AI glasses equipped with its custom models.