Sesame Unveils Base AI Model Behind Viral Virtual Assistant Maya

April 23, 2025

Sesame, the AI company behind the strikingly lifelike voice assistant Maya, has released the base model that powers it. Called CSM-1B, the model has 1 billion parameters, the learned numerical weights that determine how a model behaves. It is available under an Apache 2.0 license, which permits commercial use with few restrictions, according to Sesame's announcement on the AI development platform Hugging Face.

CSM-1B generates "RVQ audio codes" from text and audio inputs. RVQ stands for "residual vector quantization," a technique for encoding audio into discrete tokens, or codes. The same technique is used in other recent AI audio technologies, including Google's SoundStream and Meta's EnCodec. CSM-1B uses a model from Meta's Llama family as its backbone, paired with an audio "decoder" component. According to Sesame, a fine-tuned variant of CSM-1B powers Maya.
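
To make the idea of residual vector quantization concrete, here is a minimal toy sketch in Python. It is not Sesame's code and does not reflect how CSM-1B is implemented. The function names (`rvq_encode`, `rvq_decode`), the two-dimensional vectors, and the hand-written codebooks are all assumptions made for illustration; real audio codecs learn their codebooks from data and operate on much larger vectors.

```python
def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared distance)."""
    best_idx, best_dist = 0, float("inf")
    for idx, entry in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(entry, vec))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage; each stage encodes what the previous one missed."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # Subtract the chosen entry so the next stage only sees the leftover error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def rvq_decode(codes, codebooks):
    """Rebuild an approximation by summing the chosen entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical hand-written codebooks: a coarse stage, then a finer one.
codebooks = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]],
    [[0.25, 0.25], [0.25, -0.25], [-0.25, 0.25], [-0.25, -0.25]],
]

frame = [0.8, 0.3]                     # stand-in for one frame of audio features
codes = rvq_encode(frame, codebooks)   # one discrete code per stage
approx = rvq_decode(codes, codebooks)  # coarse entry plus fine correction
print(codes, approx)
```

In this sketch, a frame of continuous values becomes a short list of integers, one per codebook, and summing the selected entries recovers an approximation of the original. Those integers play the role of the discrete tokens, or codes, described above.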

Sesame describes CSM-1B as a "base generation model" on its Hugging Face and GitHub repositories, noting that it can produce a variety of voices but has not been fine-tuned on any specific voice. The model has some capacity for non-English languages because of "data contamination" in its training set, but it is unlikely to perform well in them. Sesame has not disclosed what data it used to train the model.

One aspect that raises eyebrows is the lack of robust safeguards. Sesame operates on an honor system, simply encouraging users and developers to avoid using the model to replicate someone's voice without permission, produce misleading content like fake news, or partake in any "harmful" or "malicious" activities. I personally tested the demo on Hugging Face, and within a minute, I had cloned my voice. It was a breeze to generate speech on any topic, even sensitive ones like the election and Russian propaganda.

Consumer Reports recently warned that many AI-powered voice cloning tools lack "meaningful" safeguards against fraud or abuse.

Sesame, co-founded by Oculus co-creator Brendan Iribe, caught the public's eye in late February with assistant technology that comes close to escaping the uncanny valley. Both Maya and Sesame's other assistant, Miles, take breaths, speak with disfluencies, and can be interrupted mid-speech, much like OpenAI's Voice Mode.

Sesame has raised an undisclosed amount of funding from Andreessen Horowitz, Spark Capital, and Matrix Partners. Beyond voice assistants, the company is prototyping AI glasses intended for all-day wear that will run its custom models, a sign of its ambition to bring the technology further into daily life.
