How to Ensure Your Data is Trustworthy for AI Integration

Trust in artificial intelligence is a delicate matter, hinging entirely on the quality of the data it's built upon. The issue of data integrity, a longstanding challenge for even the most sophisticated organizations, has resurfaced with a vengeance. Industry experts are raising red flags, warning that users of generative AI could be at the mercy of incomplete, repetitive, or outright incorrect data due to the fragmented or weak data foundations of these systems.
According to a recent analysis by Ashish Verma, the chief data and analytics officer at Deloitte US, along with his co-authors, "AI and gen AI are setting new standards for data quality." They emphasize that without a robust data architecture that spans various types and modalities, and accounts for data diversity and bias, generative AI strategies are bound to falter. They also stress the need for data transformation suitable for probabilistic systems.
The Unique Demands of AI-Ready Data Architectures
AI systems, which rely on probabilistic models, introduce unique challenges. Because output varies with the model's learned probabilities and with the underlying data at the moment of a query, designing data systems for AI is harder than for deterministic software. Verma and his team highlight that traditional data systems might not be up to the task, potentially inflating the costs of training and retraining AI models. They advocate for data transformations that include ontologies, governance, trust-building initiatives, and the development of queries that mirror real-world scenarios.
Adding to these complexities are issues like AI hallucinations and model drift, underscoring the need for human oversight and for ongoing work to keep data aligned and consistent.
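One practical way to watch for the drift described above is to compare the distribution of incoming data against the data a model was trained on. The sketch below uses a population stability index (PSI), a common drift statistic; the thresholds and sample data are illustrative, not part of the article.

```python
import math
from collections import Counter

def psi(expected, actual, bins=10):
    """Population Stability Index between two numeric samples.
    Values above roughly 0.2 are commonly treated as significant drift."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def proportions(xs):
        counts = Counter(min(int((x - lo) / width), bins - 1) for x in xs)
        total = len(xs)
        # Small floor avoids log(0) for empty buckets.
        return [max(counts.get(i, 0) / total, 1e-6) for i in range(bins)]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

train = [float(i % 50) for i in range(1000)]          # training sample
live_same = [float(i % 50) for i in range(500)]       # same distribution
live_shifted = [float(i % 50) + 30 for i in range(500)]  # shifted distribution

print(psi(train, live_same) < 0.1)     # True: no drift
print(psi(train, live_shifted) > 0.2)  # True: drift flagged for review
```

A check like this can run on a schedule and page a human reviewer when it trips, which is one concrete form the oversight discussed above can take.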
The Crucial Role of Trust in AI
Ian Clayton, the chief product officer at Redpoint Global, told ZDNET that trust might be the most valuable asset in the AI landscape. He stressed the importance of a data environment fortified with strong data governance, clear data lineage, and transparent privacy policies. Such a foundation not only fosters ethical AI use but also prevents AI from veering off course, which could result in inconsistent customer experiences.
Industry Concerns Over Data Readiness for AI
Gordon Robinson, senior director of data management at SAS, echoed the sentiment that data quality has been a persistent challenge for businesses. Before embarking on an AI journey, he advises companies to ask two critical questions: "Do you understand what data you have, its quality, and its trustworthiness?" and "Do you have the necessary skills and tools to prepare your data for AI?"
Clayton also highlighted the pressing need for enhanced data consolidation and quality measures to tackle AI challenges, advocating for the integration of data from silos and rigorous quality checks like deduplication and consistency assurance.
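The deduplication and consistency checks Clayton describes can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the field names (`email`, `age`, `updated`) and validity rules are assumptions chosen for the example.

```python
def consolidate(records):
    """Merge records from multiple silos: deduplicate on a normalized
    email key and set aside rows that fail basic consistency checks."""
    seen, rejected = {}, []
    for rec in records:
        email = rec.get("email", "").strip().lower()
        # Consistency checks: well-formed key, age within a plausible range.
        if not email or "@" not in email:
            rejected.append(rec)
            continue
        if not (0 <= rec.get("age", -1) <= 120):
            rejected.append(rec)
            continue
        # Deduplication: on collision, keep the most recently updated record.
        if email not in seen or rec.get("updated", 0) > seen[email].get("updated", 0):
            seen[email] = rec
    return list(seen.values()), rejected

crm = [{"email": "Ana@example.com", "age": 34, "updated": 2},
       {"email": "ana@example.com ", "age": 35, "updated": 3},  # newer duplicate
       {"email": "bob-example.com", "age": 41, "updated": 1},   # malformed email
       {"email": "cy@example.com", "age": 250, "updated": 1}]   # impossible age

clean, rejected = consolidate(crm)
print(len(clean), len(rejected))  # 1 2
```

Note that normalizing the key before comparing (trimming whitespace, lowercasing) is what lets records from different silos match at all; without it, the two "Ana" rows would survive as duplicates.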
New Dimensions of Data Security with AI
The introduction of AI also brings new security considerations to the forefront. Omar Khawaja, field chief information security officer at Databricks, warned against bypassing security measures in the rush to deploy AI solutions, as this could lead to inadequate oversight.
Essential Elements for Trustworthy AI Data
- Agile Data Pipelines: Clayton noted that the fast-paced evolution of AI necessitates agile and scalable data pipelines. These are crucial for adapting to new AI applications, particularly during the training phase.
- Visualization: Clayton also pointed out that if data scientists struggle to access and visualize their data, it significantly hampers their efficiency in developing AI.
- Robust Governance Programs: Robinson emphasized the importance of strong data governance to prevent data quality issues that could lead to flawed insights and poor decision-making. Such governance also helps in understanding the organization's data landscape and ensuring compliance with regulations.
- Thorough and Ongoing Measurements: Khawaja stressed that the performance of AI models depends directly on the quality of their training data. He recommended tracking regular metrics, such as monthly adoption rates, to gauge whether AI tools and processes are actually meeting user needs.
Clayton advocated for an AI-ready data architecture that allows IT and data teams to measure outcomes such as data quality, accuracy, completeness, consistency, and AI model performance. He urged organizations to ensure that their AI initiatives deliver tangible benefits, rather than deploying AI just for the sake of it.
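The outcome measurements Clayton calls for can start small. The sketch below scores a dataset on two of the dimensions he names, completeness and consistency; the sample rows, required fields, and domain rule are hypothetical.

```python
def quality_report(rows, required, checks):
    """Score a dataset on completeness (required fields populated)
    and consistency (rows passing all domain-rule checks)."""
    n = len(rows)
    completeness = sum(
        all(r.get(f) not in (None, "") for f in required) for r in rows) / n
    consistency = sum(
        all(check(r) for check in checks) for r in rows) / n
    return {"completeness": round(completeness, 2),
            "consistency": round(consistency, 2)}

rows = [{"id": 1, "country": "US", "revenue": 100.0},
        {"id": 2, "country": "", "revenue": 50.0},    # missing field
        {"id": 3, "country": "DE", "revenue": -5.0}]  # breaks domain rule

report = quality_report(
    rows,
    required=["id", "country", "revenue"],
    checks=[lambda r: r["revenue"] >= 0])
print(report)  # {'completeness': 0.67, 'consistency': 0.67}
```

Tracked over time, scores like these give IT and data teams the measurable outcomes the article describes, and a way to show whether an AI initiative's data foundation is improving or eroding.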