AI Health Tech War Escalates with New Launches from OpenAI, Google and Anthropic
Within days of each other this month, OpenAI, Google, and Anthropic all revealed specialized medical AI capabilities. This clustering of announcements points to competitive pressure rather than mere coincidence. However, none of these releases are cleared as medical devices, approved for clinical use, or available for direct patient diagnosis—despite marketing language that emphasizes transforming healthcare.
On January 7, OpenAI introduced ChatGPT Health, allowing US users to connect medical records via partnerships with b.well, Apple Health, Function, and MyFitnessPal. Google released MedGemma 1.5 on January 13, expanding its open medical AI model to interpret 3D CT and MRI scans as well as whole-slide histopathology images.
In between, on January 11, Anthropic announced Claude for Healthcare, offering HIPAA-compliant connectors to CMS coverage databases, ICD-10 coding systems, and the National Provider Identifier Registry.
All three companies are targeting the same administrative pain points—prior authorization reviews, claims processing, and clinical documentation—with similar technical approaches but different go-to-market strategies.
Developer platforms, not diagnostic products
The architectural similarities are striking. Each system uses multimodal large language models fine-tuned on medical literature and clinical datasets. Each emphasizes privacy protections and regulatory disclaimers. Each positions itself as supporting, not replacing, clinical judgment.

The differences lie in deployment and access models. OpenAI’s ChatGPT Health operates as a consumer-facing service with a waitlist for ChatGPT Free, Plus, and Pro subscribers outside the EEA, Switzerland, and the UK. Google’s MedGemma 1.5 is released as an open model through its Health AI Developer Foundations program, available for download via Hugging Face or deployment through Google Cloud’s Vertex AI.
Anthropic’s Claude for Healthcare integrates into existing enterprise workflows via Claude for Enterprise, targeting institutional buyers rather than individual consumers. All three share a consistent regulatory stance.
OpenAI explicitly states that Health “is not intended for diagnosis or treatment.” Google positions MedGemma as “starting points for developers to evaluate and adapt to their medical use cases.” Anthropic emphasizes that outputs “are not intended to directly inform clinical diagnosis, patient management decisions, treatment recommendations, or any other direct clinical practice applications.”

Benchmark performance vs clinical validation
Medical AI benchmark scores improved substantially in the Google and Anthropic releases, though the gap between test performance and real-world clinical deployment remains wide. Google reports that, in internal testing, MedGemma 1.5 improved by 14 percentage points on MRI disease classification and 3 percentage points on CT findings.
Anthropic’s Claude Opus 4.5 scored 61.3% on MedCalc medical calculation accuracy tests with Python code execution enabled, and 92.3% on Stanford's MedAgentBench, a medical agent task-completion benchmark, compared to 69.6% for the earlier Claude Sonnet 3.5 baseline.
The company also claims improvements in “honesty evaluations” regarding factual hallucinations, though specific metrics were not shared.
OpenAI has not published benchmark comparisons specifically for ChatGPT Health, instead noting that “over 230 million people globally ask health and wellness-related questions on ChatGPT every week,” based on de-identified analysis of existing usage.
These benchmarks measure performance on curated test datasets, not clinical outcomes. Since medical errors can have life-threatening consequences, translating benchmark accuracy into real clinical utility is far more complex than in other AI domains.
Regulatory pathway remains unclear
The regulatory landscape for these medical AI tools is still ambiguous. In the US, FDA oversight hinges on intended use. Software that “supports or provides recommendations to a healthcare professional about prevention, diagnosis, or treatment” may require premarket review as a medical device. None of the announced tools currently have FDA clearance.
Liability questions are similarly unresolved. When Banner Health’s CTO Mike Reagin says the health system was “drawn to Anthropic’s focus on AI safety,” he is describing technology selection, not a legal liability framework.
If a clinician relies on Claude’s prior authorization analysis and a patient suffers harm from delayed care, existing case law offers little guidance on assigning responsibility.
Regulatory approaches vary significantly by region. While the FDA and Europe’s Medical Device Regulation offer established frameworks for software as a medical device, many APAC regulators have yet to issue specific guidance on generative AI diagnostic tools.
This ambiguity affects adoption timelines in markets where healthcare infrastructure gaps might otherwise accelerate implementation, creating tension between clinical need and regulatory caution.
Administrative workflows, not clinical decisions
Actual deployments remain narrowly scoped. Novo Nordisk’s Director of Content Digitalisation, Louise Lind Skov, described using Claude for “document and content automation in pharma development,” focusing on regulatory submissions rather than patient diagnosis.
Taiwan’s National Health Insurance Administration applied MedGemma to extract data from 30,000 pathology reports for policy analysis, not treatment decisions.
This pattern shows institutional adoption is concentrating on administrative workflows where errors are less immediately dangerous—such as billing, documentation, and protocol drafting—rather than direct clinical decision support where AI could most dramatically impact patient outcomes.
Medical AI capabilities are advancing faster than institutions can navigate the regulatory, liability, and workflow integration complexities. The technology is here. Sophisticated medical reasoning tools are accessible for a monthly fee.
Whether this translates to transformed healthcare delivery depends on critical questions these closely timed announcements have yet to address.