Multiverse Computing propels compressed AI models into the mainstream
With private company defaults reaching as high as 9.2% — the highest in years — venture capital firm Lux Capital recently advised AI-reliant companies to secure written commitments for their compute capacity. As financial instability spreads through the AI supply chain, Lux warned that a handshake deal is no longer sufficient.
But there is a completely different option: abandoning reliance on external compute infrastructure altogether. Smaller AI models that run directly on a user's device — with no data center, no cloud provider, and no counterparty risk — are becoming capable enough to merit serious consideration. And Multiverse Computing is stepping forward.
The Spanish startup has kept a lower profile than some competitors, but that is changing as demand for AI efficiency surges. After compressing models from major AI labs such as OpenAI, Meta, DeepSeek, and Mistral AI, it has released two products to make them more widely available: an app that demonstrates what its compressed models can do, and an API portal through which developers can access and build on those models.
The CompactifAI app, which takes its name from Multiverse's quantum-inspired compression technology, is an AI chat tool similar to ChatGPT or Mistral's Le Chat. You ask a question, and the model responds. The difference is that Multiverse has embedded Gilda, a model so compact that it can run locally and offline, according to the company.

For end users, this offers a taste of edge AI, where data never leaves their devices and no internet connection is required. However, there is a catch: their mobile devices must have sufficient RAM and storage. If they don't — and many older iPhones won't — the app falls back to cloud-based models via API. The routing between local and cloud processing is handled automatically by a system Multiverse calls Ash Nazg, a name that Tolkien fans will recognize as a reference to the One Ring inscription in 'The Lord of the Rings.' But when the app routes to the cloud, it loses its primary privacy advantage.
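The fallback described above can be pictured as a simple capability check. The sketch below is purely illustrative: Multiverse has not published how Ash Nazg works, and every name and threshold here (`DeviceProfile`, `ModelRequirements`, `choose_route`, the memory figures) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceProfile:
    """Resources currently available on the user's device (hypothetical fields)."""
    free_ram_mb: int
    free_storage_mb: int


@dataclass(frozen=True)
class ModelRequirements:
    """What a local model needs in order to run (hypothetical fields)."""
    min_ram_mb: int
    min_storage_mb: int


@dataclass(frozen=True)
class Route:
    backend: str              # "local" or "cloud"
    data_leaves_device: bool  # True whenever the request goes to the cloud


def choose_route(device: DeviceProfile, model: ModelRequirements) -> Route:
    """Run locally if the device meets both requirements, otherwise fall back to the cloud."""
    fits = (
        device.free_ram_mb >= model.min_ram_mb
        and device.free_storage_mb >= model.min_storage_mb
    )
    if fits:
        return Route(backend="local", data_leaves_device=False)
    return Route(backend="cloud", data_leaves_device=True)


# Made-up requirements for an on-device model; not real figures for any actual model.
LOCAL_MODEL = ModelRequirements(min_ram_mb=4096, min_storage_mb=2048)

newer_phone = DeviceProfile(free_ram_mb=6144, free_storage_mb=32000)
older_phone = DeviceProfile(free_ram_mb=2048, free_storage_mb=32000)

print(choose_route(newer_phone, LOCAL_MODEL))
print(choose_route(older_phone, LOCAL_MODEL))
```

The `data_leaves_device` flag makes the trade-off explicit: only the local route keeps the request on the device, so any cloud fallback gives up the privacy benefit.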
These limitations suggest CompactifAI is not yet ready for widespread consumer adoption, though that may never have been the intention. According to Sensor Tower, the app has seen fewer than 5,000 downloads in the past month.
The real focus is businesses. Today, Multiverse is launching a self-service API portal that provides developers and enterprises with direct access to its compressed models — no need for AWS Marketplace.
“The CompactifAI API portal [now] gives developers direct access to compressed models with the transparency and control needed to run them in production,” CEO Enrique Lizaso said in a statement.
Real-time usage monitoring is a key feature of the API, and that's no coincidence. Along with the potential benefits of edge deployment, lower compute costs are a major reason why enterprises are exploring smaller models as an alternative to large language models (LLMs).
It also helps that small models are far less constrained than they once were. Earlier this week, Mistral updated its small model lineup with the release of Mistral Small 4, which it says is optimized for general chat, coding, agentic tasks, and reasoning. The French company also launched Forge, a system that allows enterprises to build custom models, including small models where they can choose the trade-offs their use cases can best accommodate.
Multiverse's recent results also indicate that the gap with LLMs is closing. Its latest compressed model, HyperNova 60B 2602, is built on gpt-oss-120b, an OpenAI model whose weights are publicly available. The company claims it delivers faster responses at lower cost than the original it was derived from, an advantage that is especially important for agentic coding workflows, where AI autonomously handles complex, multi-step programming tasks.
Making models small enough to run on mobile devices while remaining useful is a significant challenge. Apple Intelligence sidestepped this by combining an on-device model with a cloud model. Multiverse's CompactifAI app can also route requests to gpt-oss-120b via API, but its primary goal is to demonstrate that local models like Gilda and its future successors offer advantages beyond cost savings.
For workers in critical fields, a model that runs locally without cloud connectivity offers greater privacy and resilience. But the larger value lies in the business use cases this enables — for example, embedding AI in drones, satellites, and other environments where reliable connectivity cannot be assumed.
The company already serves over 100 global customers, including the Bank of Canada, Bosch, and Iberdrola, but expanding its customer base could help it secure additional funding. After raising a $215 million Series B last year, it is now reportedly raising a new €500 million funding round at a valuation exceeding €1.5 billion.