This document provides practical guidance for World Bank Group staff, country teams, and other organizations involved in Digital-in-Health activities on engaging with clients on data governance in health. It discusses key terms and considerations and provides broad guidance on the topic. The document aims to help stakeholders understand the importance of data governance, key definitions, and steps for implementing effective data governance frameworks. This brief offers insights for policymakers, program managers, and practitioners working to improve health data governance in various contexts.
Key Insights
What is Meant by Data?
Data may be quantitative or qualitative, may be stored on analog or digital media, and may be collected through observation, through digital transactions, or as by-products of digital lives. Data are about people, things, and systems, and they are processed, structured, and analyzed to be converted into information; data are not synonymous with information. Key characteristics of digital data (OECD 2022b) include:
- Intangible: digital data do not have a physical or financial embodiment
- Nonrival: digital data are theoretically infinitely usable, simultaneously by different actors, and without depleting them
- Potentially non-excludable: digital data that are made available or circulated on the Internet are not easily controlled and their use is not easily restricted
- Externalities: digital data sharing and use can generate wider benefits and costs for those who may not be directly involved in the process of sharing and using data
- Can exhibit increasing returns to scale: when used as a factor of production, the output from digital data use may increase by a larger proportion than the increase in data volume
- Very heterogeneous: digital data are treated differently in policy frameworks depending on the setting(s) and context(s) in which they were collected and are used
- Often co-produced: digital data are often the product of interactions between many actors, which may complicate conventional notions of ownership and lead to externalities
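The increasing-returns characteristic above can be made concrete with a stylized production function. The functional form below, with output Y, data volume D, and the constants A and θ, is an illustrative assumption and is not taken from OECD (2022b).

```latex
Y(D) = A\,D^{\theta}, \qquad A > 0,\ \theta > 1
\quad\Longrightarrow\quad
\frac{Y(\lambda D)}{Y(D)} = \lambda^{\theta} > \lambda
\quad \text{for all } \lambda > 1 .
```

Under this assumed form with θ = 2, doubling the data volume (λ = 2) quadruples output, which is a larger proportional increase than the increase in data volume.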
Data Concerning Health
One of the characteristics of data is that they are highly heterogeneous. There are many sources and types of data that concern health (Vayena et al. 2018):
- Disease surveillance
- Immunization records
- Public health reporting
- Vital statistics registries
- Clinical records (EHR/EMR)
- Prescribing
- Diagnostics
- Laboratory
- Insurance
- Omics, genomes
- Registries
- Clinical trials
- Biobanks
- Climate (Meteorological, Transport, Pollution, Energy, Geospatial)
- Lifestyle and socioeconomic
- Behavioral and social
- Loyalty card
- Store transactions
- Location tracking
- Financial
- Education
- Mobile apps
- Wellness
- Fitness
- Internet (World Wide Web, Social media, Self monitoring)
- Wearables and sensors
What is Data Governance?
According to the 2021 World Development Report (WDR), data governance entails creating an environment of implementing norms, infrastructure policies and technical mechanisms, laws and regulations for data, related economic policies, and institutions that can effectively enable the safe, trustworthy use of public intent and private intent data to achieve development outcomes. Public intent data are data collected for public purposes, while private intent data are data collected for private purposes. The Pan American Health Organization (PAHO) defines data governance as a set of practices for making decisions about data and for managing data throughout its life cycle to optimize an organization’s capability to use data to generate insights that inform policy, strategy, and operational management (PAHO 2021a).
Data management and data governance along the data life cycle
(Source: Based on World Bank 2021.)
- Stage of life cycle: Create/receive
  - Establish lawful use (such as obtaining consent for data collection and sharing)
  - Determine and collect identifiers that allow data to be merged with other datasets
- Stage of life cycle: Process
  - Establish/adopt standards for units and categories (such as industry classifications)
  - Determine/implement data formats that are widely compatible and accessible
  - Set processes for validating the quality (accuracy), relevance, and integrity of data
- Stage of life cycle: Store
  - Establish standard rules and procedures to encrypt data; use secure servers; back up and archive data
- Stage of life cycle: Transfer/share
  - Establish verification processes to determine whether consent allows for data to be shared
  - Determine when it is appropriate to deidentify data
  - Sign confidentiality agreements for use of identified data
  - Set rules for publishing data via bulk downloads or application programming interfaces (APIs)
- Stage of life cycle: Analyze and use
  - Establish ways to promote reproducibility; publish code or algorithms
  - Set constraints on publishing identifiable data
  - Visualize and communicate insights from data
- Stage of life cycle: Archive and preserve
  - Set mechanisms to classify and catalog data systematically so they can be found easily
  - Include data dictionaries and notes on how data were created
  - Establish rules to maintain access to data and their security and integrity over time
- Stage of life cycle: Destroy or reuse
  - Establish when and how to keep records of destruction processes
  - Determine how to verify that consent for use is still valid
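Two of the transfer/share tasks above, verifying that consent covers the intended use and deidentifying data before release, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed implementation: the `HealthRecord` layout, the purpose labels, and the function names are hypothetical and do not come from the 2021 WDR. The keyed hash shown is a form of pseudonymization that keeps records linkable across datasets. On its own it may not meet a given legal standard for deidentification.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


@dataclass
class HealthRecord:
    """Hypothetical record layout, invented for this illustration."""
    national_id: str      # direct identifier
    name: str             # direct identifier
    diagnosis_code: str   # content the recipient needs
    consented_purposes: set[str] = field(default_factory=set)


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).

    The same identifier and key always yield the same token, so datasets
    prepared with one key can still be merged on the token.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def prepare_for_sharing(record: HealthRecord, purpose: str, secret_key: bytes) -> dict:
    """Return a pseudonymized copy, or raise if consent does not cover the purpose."""
    if purpose not in record.consented_purposes:
        raise PermissionError(f"consent does not cover purpose: {purpose!r}")
    # Direct identifiers (name, national_id) are deliberately left out.
    return {
        "person_token": pseudonymize(record.national_id, secret_key),
        "diagnosis_code": record.diagnosis_code,
    }


if __name__ == "__main__":
    key = b"placeholder-key-held-by-the-data-custodian"  # not a real key
    rec = HealthRecord("ID-0001", "Jane Doe", "J45", {"research"})
    shared = prepare_for_sharing(rec, "research", key)
    print(sorted(shared))  # ['diagnosis_code', 'person_token']
```

Checking consent at the point of transfer rather than only at collection reflects the table's separation of the create/receive and transfer/share stages. A real system would also need key management, audit logging, and an assessment of re-identification risk.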
The 2021 WDR Data governance framework
The 2021 WDR data governance framework (illustrated in figure 3) is made up of four distinct layers that build on and support one another.
- The foundational layer is the policy framework for data infrastructure to collect, exchange, store, process, and distribute data.
- The next layer consists of the legal and regulatory environment for data, which creates rules to enable the reuse and sharing of data while safeguarding against their potential abuse and misuse.
- The legal and regulatory environment for data interacts with wider economic policy issues represented in the third layer, which affect a country’s ability to harness the economic value of data through competition, trade, and taxation.
- The fourth and final layer is the institutional ecosystem that ensures that data can deliver on their potential and that laws, regulations, and policies are effectively enforced.
The 2021 WDR functions of health data governance frameworks
The 2021 WDR groups the functions of data governance into four thematic clusters: strategic planning; rule making and implementation; compliance; and learning and evidence (as illustrated in figure 4).
- Strategic Planning: Develop strategies; Establish institutional arrangements.
- Rule Making and Implementation: Legislate/Regulate; Set standards; Provide clarification and guidance.
- Compliance: Enforce; Audit; Arbitrate; Remedy.
- Learning and Evidence: Monitor and evaluate; Anticipate and manage risks.
The Health Data Governance Principles
The Health Data Governance Principles are clustered around three interconnected objectives: protecting people, as individuals, as groups, and as communities; promoting health value through data sharing and innovative uses of data; and prioritizing equity by ensuring equitable distribution of the benefits that arise from the use of data in health systems.
Key Statistics & Data
- Data are a double-edged sword (World Bank 2021): they have massive potential both for creating social and economic value and for concentrating economic and political power to the detriment of citizens.
- Data are key to improving health system performance (OECD 2022c), including the quality, safety, and patient-centeredness of health care services; the discovery and evaluation of new treatments through scientific innovation; and the redesign and evaluation of new models of service delivery.
Methodology
This document synthesizes information from various sources, including World Bank reports, OECD guidelines, and other relevant literature, to provide a practical overview of data governance in the health sector. It employs a framework-based approach, drawing on established data governance frameworks and principles to guide the discussion. The document also incorporates case examples and best practices to illustrate key concepts and implementation strategies.
Implications and Conclusions
This document highlights the critical role of data governance in achieving universal health coverage and improving health system performance. It emphasizes the need for a holistic and rights-based approach to data governance, with a focus on equity, privacy, and security. The document also underscores the importance of international cooperation and dialogue in addressing cross-border data governance challenges. By providing practical guidance and highlighting key considerations, this brief aims to support World Bank staff, country teams, and other organizations in implementing effective data governance frameworks in the health sector.
Key Points
- Data governance frameworks foster trust, create robust processes for data sharing, and promote ethical and responsible data management.
- Digital data are intangible, nonrival, and potentially non-excludable; they can exhibit increasing returns to scale and are often co-produced.
- A data governance framework is the tangible expression of a country's social contract around data, allowing data governance to go from theory to practice.
- The 2021 WDR data governance framework is made up of four distinct layers: policy framework for data infrastructure, legal and regulatory environment, economic policy issues, and the institutional ecosystem.
- The 2016 OECD Council Recommendation on Health Data Governance provides a roadmap towards more harmonized approaches to health data governance across countries.
- Key steps in implementing health data governance frameworks include an initial assessment, a planning and design stage, and execution through budgeting, implementation, monitoring, evaluation, and continuous learning.
- Health data governance must protect individuals, groups, and communities against harm and violations at every stage of the data life cycle.