Why Data Governance Sits at the Heart of ISO 42001
If you are building an AI management system under ISO 42001, data governance is not a side topic you can address with a quick policy document. It is woven into the core of what the standard requires. The reason is straightforward: AI systems are only as trustworthy as the data that trains and drives them. Poor data quality, uncontrolled data sources, and undocumented data lineage are among the leading causes of AI failures, biased outputs, and regulatory breaches.
ISO 42001 (formally ISO/IEC 42001) is the international standard for AI management systems, published jointly by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). It provides a structured framework for organisations that develop, deploy, or use AI systems to manage risks, demonstrate accountability, and operate responsibly. Data governance sits within that framework as a foundational requirement rather than an optional add-on.
This article walks you through what ISO 42001 actually requires in relation to data governance, what it looks like in practice, and how to build controls that will hold up under audit. If you want a broader introduction to the standard first, our guide to understanding ISO/IEC 42001 for AI management systems is a good starting point.
What ISO 42001 Says About Data
ISO 42001 does not have a single clause titled “data governance.” Instead, data-related requirements are distributed across several clauses and the standard’s normative Annex A controls. Understanding where these requirements live helps you build a complete picture rather than missing critical elements.
Clause 6: Planning and Data Risk
Clause 6 requires you to identify AI-related risks and opportunities. Data risks must be included in this assessment. That means you need to consider risks such as training data that is unrepresentative of the real population, data that was collected without appropriate consent, data that has become stale or outdated, and data pipelines that introduce errors or biases.
These are not theoretical concerns. An AI system trained on historical hiring data that reflects past discrimination will perpetuate that discrimination unless the data risk is identified and treated. Clause 6 requires you to document these risks and plan how you will address them, which feeds directly into the controls you establish around data.
Clause 8: Operational Controls and Data Management
Clause 8 is where the practical data governance work happens. It requires organisations to establish, implement, and control the processes needed to meet AI system requirements. For data, this translates into documented processes for how data is sourced, validated, prepared, stored, and used within AI systems.
Specifically, you need to be able to demonstrate that your data handling processes are planned and controlled, that criteria for data acceptance and rejection are defined, and that records are kept to show the processes were followed. This is where many organisations fall short in early implementation. They have informal practices but no documented process, which means an auditor cannot verify that the right controls are actually in place.
Annex A Controls Related to Data
Annex A of ISO 42001 contains a set of AI-specific controls that organisations should consider implementing. Several of these controls relate directly to data governance. The key ones include:
- A.7 Data for AI systems: This is the most directly relevant control area. It covers data used for the development and enhancement of AI systems, data acquisition, data quality, data provenance, and data preparation.
- A.4 Resources for AI systems: Requires that the data resources used by AI systems are documented sufficiently to allow for review and accountability.
- A.6 AI system life cycle: Includes controls on technical documentation and on operation and monitoring, which means data inputs and outputs are monitored during operation, not just during development.
Security and access controls for data used in AI systems do not have a dedicated Annex A control area. They are typically addressed through risk treatment and through alignment with an information security standard such as ISO 27001.
Annex A is normative in that organisations are expected to consider these controls, but the standard uses a Statement of Applicability approach similar to ISO 27001. You document which controls apply to your context, implement the applicable ones, and justify any exclusions. For most organisations deploying AI systems, the data-related controls in A.7 will almost always be applicable.
The Five Core Data Governance Requirements Under ISO 42001
When you map the standard’s requirements to practical obligations, five core data governance areas emerge. Each one needs a documented approach, implemented controls, and evidence that the controls are working.
1. Data Governance Policy
You need a documented policy that establishes your organisation’s approach to managing data in AI systems. This is not the same as a general data management policy or a privacy policy, though it can reference and build on those documents. The AI data governance policy needs to address the specific risks and responsibilities associated with data used to train, validate, test, and operate AI systems.
The policy should define roles and responsibilities, state the organisation’s commitment to data quality and ethical data use, and set out the principles that govern data selection and handling. It does not need to be lengthy. A clear, two-page policy that your team actually understands and follows is far more valuable than a thirty-page document that sits in a shared drive untouched.
2. Data Quality Requirements
ISO 42001 requires that you define and apply data quality criteria for the data used in your AI systems. Quality in this context covers accuracy, completeness, consistency, timeliness, and relevance to the intended use case. You need to document what “good enough” data looks like for each AI system and have a process for checking incoming data against those criteria before it is used.
In practice, this means building data validation steps into your AI development and deployment pipeline. If you are using a third-party dataset, you need to assess it against your quality criteria before incorporating it. If you are collecting data from operational systems, you need controls to detect and handle anomalies, missing values, and inconsistencies.
This requirement has direct implications for organisations that purchase or license datasets from external providers. You cannot simply assume that a commercially available dataset meets your quality requirements. You need to verify it, document your assessment, and keep records of that verification.
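A check of this kind can be sketched in a few lines of Python. The example below is only an illustration: the `QualityCriteria` class, the field names, and the 2% missing-value threshold are assumptions made for the example, not values taken from the standard, and completeness is only one of the quality dimensions you would normally assess.

```python
from dataclasses import dataclass


@dataclass
class QualityCriteria:
    """Hypothetical acceptance criteria for one AI system's training data."""
    key_fields: tuple[str, ...]
    max_missing_ratio: float  # 0.02 means at most 2% missing per key field


def missing_ratios(records: list[dict], key_fields: tuple[str, ...]) -> dict[str, float]:
    """Return the share of records where each key field is absent, None, or empty."""
    total = len(records)
    if total == 0:
        raise ValueError("cannot assess an empty dataset")
    ratios = {}
    for name in key_fields:
        missing = sum(1 for r in records if r.get(name) in (None, ""))
        ratios[name] = missing / total
    return ratios


def assess_dataset(records: list[dict], criteria: QualityCriteria) -> dict:
    """Produce an accept/reject decision together with the evidence behind it."""
    ratios = missing_ratios(records, criteria.key_fields)
    failed = sorted(f for f, r in ratios.items() if r > criteria.max_missing_ratio)
    return {"accepted": not failed, "missing_ratios": ratios, "failed_fields": failed}


# Illustrative data only: one of four records is missing "income".
sample = [
    {"age": 34, "income": 52000},
    {"age": 41, "income": None},
    {"age": 29, "income": 48000},
    {"age": 57, "income": 61000},
]
criteria = QualityCriteria(key_fields=("age", "income"), max_missing_ratio=0.02)
report = assess_dataset(sample, criteria)
```

Storing the returned report alongside the dataset gives you both the decision and the record of how it was reached, which is the evidence an auditor will ask for.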
3. Data Provenance and Lineage
Provenance refers to where your data came from. Lineage refers to how it was transformed, processed, and used on its way to your AI system. ISO 42001 requires that you can account for both. This is one of the areas where organisations with informal data practices find themselves most exposed during an audit.
Being able to demonstrate data provenance means you can answer questions like: Where did this training dataset originate? Was it collected with appropriate consent? Does it comply with the relevant data protection laws in the jurisdictions where your AI system operates? Was it obtained from a reliable source, and how do you know?
Data lineage documentation means you can trace what happened to the data after it was collected. Was it anonymised? Was it augmented with synthetic data? Were any records removed or modified, and if so, why and by whom? These questions matter because they determine whether your AI system’s outputs can be trusted and whether you can defend your data practices to a regulator or a client.
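One way to keep these answers in a consistent shape is a small structured record per dataset. The sketch below is a hypothetical structure, not a format prescribed by ISO 42001: the class names, field names, and sample values are all assumptions chosen to mirror the questions above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class LineageStep:
    """One transformation applied to a dataset after collection."""
    performed_on: date
    performed_by: str
    action: str   # what was done, e.g. "pseudonymised customer IDs"
    reason: str   # why it was done


@dataclass
class DatasetRecord:
    """Hypothetical provenance and lineage record for a single dataset."""
    name: str
    source: str
    licence_terms: str
    consent_basis: str
    lineage: list[LineageStep] = field(default_factory=list)

    def add_step(self, step: LineageStep) -> None:
        self.lineage.append(step)

    def gaps(self) -> list[str]:
        """List the provenance and lineage questions this record cannot answer yet."""
        checks = (
            ("source", self.source),
            ("licence_terms", self.licence_terms),
            ("consent_basis", self.consent_basis),
        )
        missing = [label for label, value in checks if not value.strip()]
        for i, step in enumerate(self.lineage):
            if not step.reason.strip() or not step.performed_by.strip():
                missing.append(f"lineage[{i}] lacks reason or owner")
        return missing


# Illustrative values only.
record = DatasetRecord(
    name="loan-applications-2019",
    source="internal CRM export",
    licence_terms="",
    consent_basis="customer agreement, analytics clause",
)
record.add_step(LineageStep(date(2024, 3, 1), "data-eng", "pseudonymised customer IDs", "privacy"))
record.add_step(LineageStep(date(2024, 3, 8), "", "dropped 120 records", ""))
open_questions = record.gaps()
```

In this example `open_questions` lists the missing licence terms and the undocumented record removal, which are exactly the items to resolve before an audit.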
4. Sensitive Data Handling
ISO 42001 pays specific attention to sensitive data, which includes personal information, health data, financial data, data relating to vulnerable populations, and any data that could cause harm if mishandled. If your AI system uses or produces sensitive data, you need additional controls.
These controls typically include access restrictions, anonymisation or pseudonymisation where feasible, retention limits, and documented approval processes for using sensitive data in AI training or testing. You also need to consider the intersection with privacy law. In Australia, the Privacy Act 1988 and the Australian Privacy Principles impose specific obligations on how personal information is handled. Your ISO 42001 data governance controls need to be consistent with those legal requirements.
For organisations that are also certified to ISO 27701 for privacy information management, there is significant overlap here. Aligning your ISO 42001 sensitive data controls with your existing ISO 27701 framework is an efficient approach that avoids duplication and makes both systems easier to maintain.
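As an illustration of the pseudonymisation control mentioned above, the sketch below replaces direct identifiers with a keyed hash (HMAC-SHA256) using only the Python standard library. The function names, field names, and placeholder key are assumptions for the example. Anyone holding the key can reproduce the mapping, so pseudonymised data usually still needs access restrictions, and the key itself needs to be managed as a secret.

```python
import hashlib
import hmac


def pseudonymise(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash so the raw value is not stored."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise_records(records: list[dict], fields: list[str], secret_key: bytes) -> list[dict]:
    """Return copies of the records with the named fields pseudonymised."""
    cleaned = []
    for record in records:
        copy = dict(record)  # leave the caller's data untouched
        for name in fields:
            if copy.get(name) is not None:
                copy[name] = pseudonymise(str(copy[name]), secret_key)
        cleaned.append(copy)
    return cleaned


# Placeholder key for illustration; in practice load it from a secrets store.
KEY = b"replace-with-a-managed-secret"
raw = [{"customer_id": "C-1001", "age": 34}, {"customer_id": "C-1002", "age": 41}]
training_ready = pseudonymise_records(raw, ["customer_id"], KEY)
```

Because the same input and key always produce the same token, records belonging to one person can still be joined across datasets without exposing the original identifier.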
5. Data Monitoring During Operation
Data governance under ISO 42001 does not stop when your AI system goes live. The standard requires that you monitor data inputs and outputs during operation to detect issues such as data drift, where the statistical properties of incoming data change over time and cause the AI system’s performance to degrade.
This is a requirement that many organisations overlook during implementation because it requires ongoing operational effort rather than a one-time setup task. You need to define what you are monitoring, how often, what thresholds trigger a review, and what actions you take when issues are detected. These monitoring activities need to be documented and the results recorded.
Data drift is a real problem in production AI systems. A model trained on pre-pandemic consumer behaviour data will perform poorly if deployed without adjustment in a post-pandemic market. Monitoring for drift and having a documented response process is what separates organisations that manage their AI systems responsibly from those that simply deploy and hope for the best.
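A drift check with defined thresholds might look like the following sketch. It uses the Population Stability Index (PSI), which is one common drift measure among several and is not named by the standard. The bin count and the 0.1 and 0.25 thresholds are widely quoted rules of thumb used here as assumptions; you would set your own values per system.

```python
import math


def psi(baseline: list[float], current: list[float], bins: int = 10) -> float:
    """Population Stability Index between a baseline sample and a current sample."""
    lo, hi = min(baseline), max(baseline)
    if lo == hi:
        raise ValueError("baseline has no spread; cannot build bins")
    width = (hi - lo) / bins

    def shares(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range are clamped into the end bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    base, cur = shares(baseline), shares(current)
    return sum((c - b) * math.log(c / b) for b, c in zip(base, cur))


def drift_status(score: float, review_at: float = 0.1, act_at: float = 0.25) -> str:
    """Map a PSI score to the response defined in the monitoring plan."""
    if score >= act_at:
        return "action"
    if score >= review_at:
        return "review"
    return "ok"


# Illustrative data: the current sample is concentrated in the upper half.
baseline_sample = [i / 100 for i in range(100)]
current_sample = [0.5 + i / 200 for i in range(100)]
status = drift_status(psi(baseline_sample, current_sample))
```

Recording each score, the threshold in force, and the resulting status on a schedule gives you the documented monitoring results described above.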
Documenting Your Data Governance Framework
ISO 42001 follows the same High Level Structure as other ISO management system standards, which means documentation requirements are consistent with what you would find in ISO 9001 or ISO 27001. You need documented information that demonstrates your data governance controls are defined, implemented, and maintained.
At a minimum, your documentation should include a data governance policy for AI systems, a register of the data assets used in each AI system within scope, data quality criteria for each AI system, data provenance records for training and validation datasets, a sensitive data register with associated controls, and monitoring records showing that operational data is being reviewed.

If you are already familiar with how to manage controlled documents under other ISO standards, the same principles apply here. Version control, review cycles, and access controls for your data governance documents are all part of demonstrating a functioning management system.
How Data Governance Connects to AI Risk Management
Data governance and AI risk management are not separate workstreams under ISO 42001. They are deeply connected. The risks you identify in Clause 6 around data directly inform the controls you establish, and the effectiveness of those controls determines whether your risk treatment is actually working.
Consider a practical example. A financial services company deploys an AI system to assess loan applications. The data governance risk assessment identifies that the training data contains historical loan decisions made by human officers who had implicit biases against certain demographic groups. The risk is that the AI system will replicate and potentially amplify those biases.
The data governance controls in response might include reprocessing the training data to remove or rebalance the biased historical decisions, establishing ongoing monitoring of approval rates across demographic groups, and setting a threshold that triggers a human review when the AI system’s decisions deviate from expected patterns. Each of these controls needs to be documented, implemented, and evidenced.
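The monitoring control in this example can be sketched as a simple comparison of approval rates between groups. The group labels, the sample decisions, and the 0.8 ratio threshold below are illustrative assumptions only; the appropriate fairness measure and threshold depend on your risk assessment and legal context.

```python
from collections import defaultdict


def approval_rates(decisions: list[tuple[str, bool]]) -> dict[str, float]:
    """Compute the approval rate per group from (group, approved) pairs."""
    totals: dict[str, int] = defaultdict(int)
    approved: dict[str, int] = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {group: approved[group] / totals[group] for group in totals}


def needs_human_review(rates: dict[str, float], min_ratio: float = 0.8) -> bool:
    """Flag when the lowest group rate falls below min_ratio of the highest."""
    highest = max(rates.values())
    if highest == 0:
        return False  # nothing approved for anyone; no disparity to measure
    return min(rates.values()) / highest < min_ratio


# Illustrative decisions: group A approved 8 of 10, group B approved 5 of 10.
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)
rates = approval_rates(decisions)
flagged = needs_human_review(rates)
```

Here group B's rate is well under 80% of group A's, so the batch is flagged. Logging the rates and the flag for each monitoring period provides the evidence that the control is operating.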
This kind of connected thinking between risk and control is what auditors are looking for. It also reflects a key difference from the NIST AI Risk Management Framework, which is voluntary guidance with no certification scheme. Because ISO 42001 is certifiable, you have to demonstrate that your controls are working, not just that you have thought about the risks.
Common Gaps Found During ISO 42001 Audits
Across AI management system implementations, certain data governance gaps come up repeatedly during Stage 1 and Stage 2 audits. Being aware of these helps you address them before the auditor finds them.
No Documented Data Quality Criteria
Organisations often have informal standards for what constitutes acceptable data, but nothing written down. If an auditor asks how you determined that a particular dataset was suitable for training your AI system, “we reviewed it and it looked fine” is not an acceptable answer. You need documented criteria and evidence that the data was assessed against them.
Missing Provenance Records for External Datasets
If you downloaded a public dataset or licensed data from a third party, you need records showing where it came from, what terms govern its use, and whether it is permitted to be used for AI training under those terms. Many organisations have not thought carefully about whether their data licensing agreements actually permit AI training use, which creates both a compliance gap and a legal risk.
No Operational Monitoring Plan
The absence of any operational data monitoring is a common finding. Organisations implement controls during development but have no plan for monitoring data quality or drift once the system is in production. This is a straightforward gap to address with a monitoring schedule and defined responsibilities, but it needs to be done before certification.
Sensitive Data in Test Environments
Using real personal data in AI testing and development environments without the same controls that apply in production is a frequent issue. ISO 42001 requires that sensitive data handling controls apply wherever the data is used, not just in the production system.
Practical Steps to Build Your ISO 42001 Data Governance Framework
If you are starting from scratch or assessing your current state against the standard’s requirements, here is a practical sequence to follow.
- Inventory your AI systems and their data inputs: You cannot govern what you have not mapped. Start by documenting every AI system within your certification scope and identifying all data sources that feed into each one.
- Assess data risks for each system: For each data source, assess the risks around quality, provenance, sensitivity, and bias. Document your assessment and link it to your Clause 6 risk register.
- Define data quality criteria: For each AI system, document what acceptable data looks like. Be specific. Vague criteria like “data must be accurate” are not auditable. Specific criteria like “training data must have less than 2% missing values in key fields” are.
- Document provenance for existing datasets: Go back to your current datasets and document where they came from, how they were obtained, and what terms apply. Fill any gaps by contacting data providers or sourcing replacement datasets where provenance cannot be established.
- Implement sensitive data controls: Apply appropriate controls to any sensitive data used in your AI systems and document those controls in a register.
- Build an operational monitoring plan: Define what you will monitor, how often, who is responsible, and what actions follow when issues are detected. Implement the plan and keep records of monitoring activities.
- Write your data governance policy: Once you have worked through the above steps, you have the substance to write a meaningful policy rather than a generic one.
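The first few steps above lend themselves to a simple structured inventory that can report its own gaps. The sketch below is one hypothetical way to model it; the class names, flags, and sample systems are assumptions for illustration, and a spreadsheet with the same columns would serve equally well.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    """One data input to an AI system, with flags for the documented controls."""
    name: str
    provenance_documented: bool = False
    quality_criteria_defined: bool = False
    contains_sensitive_data: bool = False
    sensitive_controls_documented: bool = False


@dataclass
class AISystem:
    """An AI system within the certification scope."""
    name: str
    sources: list[DataSource] = field(default_factory=list)
    monitoring_plan: bool = False


def find_gaps(systems: list[AISystem]) -> list[tuple[str, str, str]]:
    """Return (system, source, gap) tuples; source is "-" for system-level gaps."""
    gaps = []
    for system in systems:
        if not system.sources:
            gaps.append((system.name, "-", "no data sources mapped"))
        if not system.monitoring_plan:
            gaps.append((system.name, "-", "no operational monitoring plan"))
        for src in system.sources:
            if not src.quality_criteria_defined:
                gaps.append((system.name, src.name, "no data quality criteria"))
            if not src.provenance_documented:
                gaps.append((system.name, src.name, "provenance not documented"))
            if src.contains_sensitive_data and not src.sensitive_controls_documented:
                gaps.append((system.name, src.name, "sensitive data controls not documented"))
    return gaps


# Illustrative inventory only.
scoring = AISystem(
    "loan-scoring",
    sources=[
        DataSource("crm-export", provenance_documented=True, contains_sensitive_data=True),
        DataSource("bureau-feed", provenance_documented=True, quality_criteria_defined=True),
    ],
    monitoring_plan=True,
)
chatbot = AISystem("support-chatbot")
gaps = find_gaps([scoring, chatbot])
```

Running the check periodically turns the inventory into a working gap list, which is a useful input to the policy you write in the final step.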
If you are preparing for your first ISO 42001 audit, our guide on how to prepare for an ISO 42001 Stage 1 audit covers what auditors will look for across all clauses, including data governance.
Getting Help With ISO 42001 Data Governance
ISO 42001 is still a relatively new standard, and finding consultants with genuine hands-on experience implementing its data governance requirements is not straightforward. The standard requires a combination of AI system knowledge, data management expertise, and ISO management system experience that not every consultant brings to the table.
If you are looking for qualified help, CertBetter connects businesses with verified ISO 42001 consultants and accredited certification bodies. You submit one form and receive up to three competing quotes from vetted providers, at no cost to your business. It is a practical way to find consultants who have actually implemented ISO 42001 data governance frameworks rather than those who are learning on your time and budget.




