Osaurus Integrates Local and Cloud AI Models on Mac
As AI models become increasingly commoditized, startups are competing to develop the software layer that operates on top of them. A notable player in this field is Osaurus, an open-source, Mac-exclusive LLM server. It enables users to switch between different local or cloud-based AI models while keeping their files and tools securely on their own hardware.
Osaurus originated from the concept of a desktop AI companion called Dinoki, which co-founder Terence Pae likened to an "AI-powered Clippy." Potential customers questioned why they should purchase the app if they still had to pay for tokens—the units AI companies charge for processing prompts and generating responses.
This prompted Pae to explore the possibilities of running AI locally in greater depth.
"That's how Osaurus began," Pae, a former software engineer at Tesla and Netflix, explained to TechCrunch. The initial idea was to attempt running an AI assistant directly on a user's device. "You can accomplish nearly everything locally on your Mac, such as browsing files, accessing your web browser, and checking system configurations. I realized this positioned Osaurus perfectly as a personal AI for individual users."
Pae started developing the tool publicly as an open-source project, continuously adding features and resolving bugs throughout the process.

Image Credits: Osaurus, Inc.
Today, Osaurus offers flexible connections to locally hosted AI models or cloud providers like OpenAI and Anthropic. Users can freely select their preferred AI model while maintaining other aspects of the AI experience—such as the model's memory, personal files, and tools—on their own hardware.
Because different AI models have different strengths, users can switch to whichever model best suits the task at hand.
This architecture makes Osaurus a "harness," a control layer that connects various AI models, tools, and workflows through a single interface, similar to platforms like OpenClaw or Hermes. However, such tools are often designed for developers comfortable with terminal commands. In some cases, as with OpenClaw, they may also introduce security vulnerabilities.
In contrast, Osaurus offers a consumer-friendly interface and addresses security by running in a hardware-isolated virtual sandbox. This confines the AI's access to a defined scope, protecting the user's computer and data.
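The harness idea can be illustrated with a minimal Python sketch. It is not Osaurus's actual code or API: the `Harness` class, the `local_model` and `cloud_model` functions, and the `switch` and `ask` methods are all hypothetical names chosen for illustration. The point it shows is that memory stays in one place on the user's side while the model behind it is swapped.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-ins for model backends. A real harness would call a
# local runtime or a cloud API here instead of returning a canned string.
def local_model(prompt: str, memory: List[str]) -> str:
    return f"[local] {prompt} (context items: {len(memory)})"

def cloud_model(prompt: str, memory: List[str]) -> str:
    return f"[cloud] {prompt} (context items: {len(memory)})"

@dataclass
class Harness:
    """Keeps memory in one place and swaps only the model behind it."""
    backends: Dict[str, Callable[[str, List[str]], str]]
    active: str
    memory: List[str] = field(default_factory=list)

    def switch(self, name: str) -> None:
        # Change the active model without touching stored memory.
        if name not in self.backends:
            raise KeyError(f"unknown backend: {name}")
        self.active = name

    def ask(self, prompt: str) -> str:
        reply = self.backends[self.active](prompt, self.memory)
        self.memory.append(prompt)  # memory persists across model switches
        return reply

harness = Harness(
    backends={"local": local_model, "cloud": cloud_model},
    active="local",
)
first = harness.ask("Summarize my notes")
harness.switch("cloud")
second = harness.ask("Draft a reply")
print(first)
print(second)
```

In this sketch the second request goes to a different backend but still sees the earlier prompt in memory, which is the behavior the article attributes to a harness.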

Running AI models locally is still an emerging practice because it is resource-intensive and depends heavily on hardware. Operating local models requires a system with at least 64 GB of RAM. For larger models like DeepSeek V4, Pae recommends systems with approximately 128 GB of RAM.
Nevertheless, Pae is confident that the hardware demands for local AI will decrease over time.
"I recognize its potential because the intelligence per watt—a key metric for local AI—has been rising dramatically. It's on its own innovation trajectory. Last year, local AI struggled to complete sentences, but today it can execute tools, write code, access your browser, and even place orders on Amazon. It's continuously improving," he noted.
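A back-of-the-envelope calculation shows why local models are so memory-hungry. The sketch below estimates the memory needed for model weights alone as parameter count times bytes per weight. The parameter counts and bit widths are illustrative assumptions, not figures for any model named in this article, and the estimate ignores working memory such as the context cache and the operating system's own needs, so real requirements are higher.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# Illustrative sizes only; these are not the sizes of any specific model.
for params in (8, 70, 200):
    for bits in (16, 4):
        gb = weight_memory_gb(params, bits)
        print(f"{params}B parameters at {bits}-bit: ~{gb:.0f} GB")
```

Under these assumptions, lowering the bits stored per weight cuts the memory footprint proportionally, which is one reason the same hardware can host larger models over time.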

Currently, Osaurus supports models including MiniMax M2.5, Gemma 4, Qwen3.6, GPT-OSS, Llama, and DeepSeek V4. It also integrates Apple's on-device foundation models and Liquid AI's LFM family of on-device models, and it connects to OpenAI, Anthropic, Gemini, xAI/Grok, Venice AI, OpenRouter, Ollama, and LM Studio.
As a full Model Context Protocol (MCP) server, it grants any MCP-compatible client access to your tools. Additionally, it comes pre-equipped with over 20 native plugins for Mail, Calendar, Vision, macOS utilities, XLSX, PPTX, Browser, Music, Git, Filesystem, Search, Fetch, and more.
A recent update added voice capability to Osaurus.
Since its launch nearly a year ago, the project has been downloaded over 112,000 times, according to its website.
The founders, including co-founder Sam Yoo, are currently participating in the New York-based startup accelerator Alliance. They are also planning future steps, which may involve offering Osaurus to businesses in sectors like legal or healthcare, where local LLMs could help address data privacy concerns.
As local AI models grow more powerful, the team believes they could reduce reliance on large-scale AI data centers.
"We're witnessing explosive growth in AI, where cloud providers must scale using massive data centers and infrastructure. Yet, we feel the value of local AI remains underappreciated," Pae stated. "Instead of depending on the cloud, organizations could deploy a Mac Studio on-premises, consuming significantly less power. You retain cloud-like capabilities without being dependent on a remote data center to run your AI," he added.