This paper explores how novel data sources, such as satellite imagery and mobile phone data, combined with Artificial Intelligence (AI) and Machine Learning (ML), were used to scale up social assistance during the COVID-19 pandemic in countries lacking traditional data infrastructure. It examines experiences primarily from the Democratic Republic of Congo (DRC), Togo, and Nigeria, distilling lessons on the opportunities and limitations of these approaches for delivering support, particularly cash transfers, in data-scarce environments, both during emergencies and for regular programs. The practical value lies in providing guidance for policymakers considering these innovative tools to improve social protection system responsiveness and reach.
Core Arguments & Findings
The adoption of novel data and AI for social assistance targeting was driven by the urgent need to respond to the COVID-19 crisis in contexts lacking adequate traditional data (e.g., up-to-date social registries, household surveys) and infrastructure (pp. 7, 12). These methods offered potential substitutes or complements to conventional approaches, particularly valuable in emergencies and fragile settings (p. 13).
Requirements for Implementation
Successful adoption requires several key conditions (pp. 14-15):
- Political leadership: Commitment is crucial to overcome inertia and potential political economy challenges.
- Mobile penetration: Sufficiently high mobile phone ownership is necessary for phone-based outreach, registration, and payments.
- MNO partnerships: Collaboration with Mobile Network Operators is essential for accessing mobile data (like Call Detail Records - CDR) and leveraging their infrastructure.
- Legal & regulatory safeguards: Robust data protection and privacy frameworks must be in place, along with agreements governing data use between government, MNOs, and other partners.
Geographic Targeting using Novel Data
Countries used satellite imagery and ML to create poverty maps where traditional small area estimation wasn’t feasible due to outdated data (pp. 16-17).
- DRC (STEP-KIN): Developed a hotspot and vulnerability map for Kinshasa using satellite imagery to assess crowding, precarious construction, flood risk, and access to jobs (p. 17). Selected the 20% most vulnerable areas within hotspots.
- Togo (Novissi Model 2) & Nigeria (NASSP SU): Used ML models trained on existing (though potentially limited) household survey data combined with various geospatial features (satellite imagery, nightlights, connectivity data, population density) to predict poverty/consumption levels at a granular level (pp. 18-19). Togo ranked cantons; Nigeria identified poorer wards.
Challenges: These maps are often based on pre-crisis data and may not reflect pandemic impacts. The static nature of satellite imagery limits its ability to capture dynamic changes. Statistical robustness can be questioned due to limited ground truth data points per geographic unit. High-resolution imagery can be costly, and geographic targeting can face political opposition (pp. 23-24).
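The selection rule described for DRC (keeping the 20% most vulnerable areas) can be sketched as a simple rank-and-cut over area-level scores. This is a minimal illustration, not the program's actual procedure: the function name, the area IDs, and the scores are all hypothetical, and it assumes each area already has a single composite vulnerability score where higher means more vulnerable.

```python
import math


def select_most_vulnerable(scores, share=0.20):
    """Return the area IDs falling in the top `share` of vulnerability scores.

    `scores` maps a hypothetical area ID to a composite score
    (assumption: higher = more vulnerable).
    """
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    # Round up so a non-empty input always yields at least one area;
    # the inner round() guards against floating-point noise in the product.
    n_selected = math.ceil(round(len(scores) * share, 9))
    # Rank areas from most to least vulnerable, then keep the top slice.
    ranked = sorted(scores, key=lambda area: scores[area], reverse=True)
    return ranked[:n_selected]


# Hypothetical scores for ten areas; these values are not from the paper.
example_scores = {
    "area_01": 0.82, "area_02": 0.35, "area_03": 0.91, "area_04": 0.12,
    "area_05": 0.58, "area_06": 0.44, "area_07": 0.67, "area_08": 0.29,
    "area_09": 0.73, "area_10": 0.50,
}
selected_areas = select_most_vulnerable(example_scores)
```

With ten areas and a 20% share, two areas are kept. Because the cut is purely relative, the result is only as good as the underlying scores, which is where the ground truth and pre-crisis data concerns noted above come in.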
Individual-Level Targeting using Novel Data
Mobile phones enabled contactless, paperless, and cashless delivery systems, often targeting individuals rather than households (p. 20).
- DRC (STEP-KIN): Used a simple affluence test based on anonymized MNO data, excluding subscribers with smartphones, high monthly spending (>$5), prepaid data plans, or international calls. Operated on a first-come, first-served basis among eligible individuals within target zones (pp. 20-21).
- Togo (Novissi Model 2): Implemented a proxy-means test (PMT) using ML algorithms trained on phone survey data linked with anonymized CDR data to predict individual consumption levels (p. 21).
- Nigeria (NASSP SU - RRR): Used mobile phones primarily for outreach and registration (via USSD) in urban areas, followed by in-person visits for verification and data collection. Mobile data was not used for welfare scoring beyond initial contact (p. 20).
Challenges: Targeting accuracy compared to traditional PMTs is uncertain and likely lower based on simulations (p. 53). Identifying the true ‘targeting unit’ is difficult due to SIM card sharing within households and individuals using multiple SIMs (pp. 24, 53). Partnering with multiple MNOs is complex (p. 24). Most significantly, these approaches inherently exclude those without mobile phones or sufficient digital literacy, often the poorest segments of the population (pp. 20, 24, 49).
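The DRC affluence test is rule-based, so it can be sketched as a plain filter followed by first-come, first-served enrollment. The sketch below is an assumption-laden illustration rather than the program's implementation: the `Subscriber` fields, the `enroll` helper, and the `slots` cap are hypothetical names, and only the four exclusion criteria and the $5 threshold come from the description above.

```python
from dataclasses import dataclass

SPENDING_CAP_USD = 5.0  # monthly spending threshold reported for STEP-KIN


@dataclass(frozen=True)
class Subscriber:
    """Hypothetical anonymized MNO record; field names are illustrative."""
    subscriber_id: str
    has_smartphone: bool
    monthly_spend_usd: float
    has_data_plan: bool
    made_international_calls: bool
    in_target_zone: bool


def passes_affluence_test(s):
    """A subscriber passes only if none of the four exclusion criteria applies."""
    return not (
        s.has_smartphone
        or s.monthly_spend_usd > SPENDING_CAP_USD
        or s.has_data_plan
        or s.made_international_calls
    )


def enroll(applicants, slots):
    """Enroll eligible applicants first-come, first-served.

    Assumes `applicants` is already in arrival order and that `slots`
    is the number of available places (a hypothetical budget cap).
    """
    enrolled = []
    for s in applicants:
        if len(enrolled) >= slots:
            break
        if s.in_target_zone and passes_affluence_test(s):
            enrolled.append(s.subscriber_id)
    return enrolled


# Hypothetical applicants in arrival order; not real data.
applicants = [
    Subscriber("s1", False, 2.0, False, False, True),   # eligible
    Subscriber("s2", True, 1.0, False, False, True),    # smartphone
    Subscriber("s3", False, 7.5, False, False, True),   # spends over $5
    Subscriber("s4", False, 3.0, False, False, False),  # outside target zone
    Subscriber("s5", False, 4.0, False, False, True),   # eligible
    Subscriber("s6", False, 1.5, False, False, True),   # eligible, no slot left
]
enrolled_ids = enroll(applicants, slots=2)
```

Note that each record here is a SIM, not a person or household, so the sketch inherits the targeting-unit problem described above: a shared SIM or a second SIM would change the outcome without any change in welfare.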
Innovations Across the Delivery Chain
Novel data and digital tools spurred innovation beyond targeting (pp. 25-33):
- Outreach: Targeted SMS/IVR messages based on location or pre-screened CDR lists; satellite data guiding social workers to underserved areas.
- Intake & Registration: User-friendly USSD platforms for self-registration; automated profile completion using linked data (e.g., voter ID in Togo).
- Payments: Automatic creation of mobile money accounts; simplified Know Your Customer (KYC) requirements for basic accounts in emergencies (DRC).
- Monitoring & Management: Chatbots for beneficiary support; explainable AI for PMT results; algorithms for fraud detection; dynamic eligibility checks using linked data (e.g., bank data in South Africa); linking with climate early warning systems.
Key Statistics & Data
- Implementation Speed: Programs in DRC and Togo launched within 8-11 months of the political decision, significantly faster than typical traditional PMT-based programs (p. 29). Nigeria’s RRR payments started ~1.5 years after the crisis began (p. 35).
- Administrative Costs: Estimated at 5.7% of total budget for DRC STEP-KIN and ~10% for Togo Novissi Model 2, comparable to traditional programs (p. 36). Digital processes, simplified procedures, and fee waivers helped contain these costs (p. 36). Malawi’s pilot showed higher initial fixed costs for tele-targeting due to ground truth data collection, but potentially lower variable costs at scale (p. 44).
- Targeting Accuracy (Simulations):
  - Togo: CDR-based ML targeting reached ~54% of the true poor, compared to ~63% for a simulated traditional PMT (p. 37, Fig 4).
  - Nigeria: ML-based poverty maps (satellite) showed higher resolution and coverage than survey-based maps and correlated well with ground truth wealth index (p. 38, Fig 5). NSR and RRR registries showed similar effectiveness in identifying the poor (p. 40, Fig 6).
  - Malawi: Tele-targeting (CDR+ML) had a 17% error rate vs. 13% for PMT when targeting the bottom 80% (p. 45, Table 5). Targeting performance diminished rapidly as the target group size increased for both methods in this high-poverty context (p. 46).
- Digital Exclusion: Mobile phone ownership is significantly lower in LICs (64%) and FCV countries (56%) than in MICs (83-90%) (p. 49). In the Malawi pilot, only ~36% of households owned a phone, and phone-owning beneficiaries were ~20% wealthier on average than non-owners (pp. 44, 45, Fig 7).
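The accuracy figures above are shares such as "reached ~54% of the true poor". One common way to compute measures of this kind in a simulation is sketched below. The exact error definitions used in the paper are not stated in this summary, so the three measures here (coverage of the poor, exclusion error, inclusion error) are standard conventions adopted as an assumption, and the household data is hypothetical.

```python
def targeting_metrics(is_poor, is_selected):
    """Compare a targeting rule's selections against ground-truth poverty status.

    Both arguments are equal-length lists of booleans, one entry per
    household (or individual). Definitions are assumed, not the paper's:
      coverage        = share of the truly poor who are selected
      exclusion_error = share of the truly poor who are missed
      inclusion_error = share of those selected who are not poor
    """
    if len(is_poor) != len(is_selected):
        raise ValueError("inputs must have the same length")
    n_poor = sum(is_poor)
    n_selected = sum(is_selected)
    if n_poor == 0 or n_selected == 0:
        raise ValueError("need at least one poor and one selected unit")
    # Units that are both truly poor and selected by the rule.
    n_correct = sum(p and s for p, s in zip(is_poor, is_selected))
    return {
        "coverage": n_correct / n_poor,
        "exclusion_error": (n_poor - n_correct) / n_poor,
        "inclusion_error": (n_selected - n_correct) / n_selected,
    }


# Ten hypothetical households: five truly poor, five selected by the rule.
truly_poor = [True, True, True, True, True, False, False, False, False, False]
selected = [True, True, True, False, False, True, True, False, False, False]
metrics = targeting_metrics(truly_poor, selected)
```

In this toy case three of the five poor households are selected, so coverage is 0.6 and both error rates are 0.4. When the number selected equals the number of truly poor, as here, the inclusion and exclusion error rates coincide, which is one reason a single "error rate" is sometimes reported.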
Methodology
This paper employs a comparative case study approach. It reviews and analyzes the design, implementation, and preliminary outcomes of social assistance programs that utilized novel data sources (geospatial, mobile phone CDR) and computational methods (AI/ML) in response to the COVID-19 pandemic in DRC, Togo, and Nigeria (pp. 9, 10). It further examines two post-pandemic pilot projects in Togo and Malawi that integrated these innovations into more traditional delivery frameworks (Section 4, p. 41).
The analysis draws on program documentation, World Bank project information, academic research linked to the programs (e.g., Aiken et al., Smythe & Blumenstock), and practitioner surveys. Key methodologies assessed within the case studies include:
- Geographic Targeting: Using satellite imagery, ML, and composite indices.
- Individual Targeting: Using mobile phone CDR, ML, phone surveys, and affluence tests.
- Hybrid Approaches: Combining novel digital methods with traditional elements like in-person registration or community validation.
Data sources utilized by the programs studied include satellite imagery, mobile phone CDR, household surveys (DHS, LSMS, bespoke), census data, voter registries, and crowd-sourced data (pp. 17-19, 21, 41-43). The paper acknowledges limitations, particularly the lack of rigorous ex-post evaluations for the emergency programs and the reliance on simulations for accuracy assessments (pp. 36, 53).
Key Conclusions & Recommendations
Novel data sources and AI offer promising tools but are not a panacea for social assistance delivery in data-scarce contexts. Their effectiveness is context-dependent and hinges on addressing significant limitations, particularly digital exclusion and the need for robust validation (pp. 48, 54).
Key Conclusions
- Potential: Enables rapid deployment of welfare-targeted aid in emergencies without pre-existing systems; digital delivery lowers costs and increases convenience for some; potential for more dynamic and adaptive systems (pp. 48, 50).
- Limitations: Digital exclusion is a primary concern, potentially worsening inequality (p. 50). Targeting accuracy is uncertain and appears lower than traditional PMTs, especially for phone-based methods (p. 53). Methods often rely on pre-crisis data, limiting responsiveness to shocks (p. 53). The mismatch between SIM card data and household units complicates targeting (p. 53). Data access (cost, MNO willingness) and data quality are constraints (p. 53). Robust ex-post evaluations are lacking (p. 53).
- Complementarity: Novel approaches are best viewed as complements to, rather than replacements for, traditional methods and infrastructure. Hybrid models combining digital tools with physical outreach and verification are likely necessary (p. 54).
Recommendations
- Invest in Adaptive Social Protection: Strengthen foundational systems (digital ID, social registries, payment systems, interoperability) and invest in state capacity (personnel, legal/regulatory frameworks, institutional partnerships with private sector/civil society) to leverage technology effectively and ethically (p. 55).
- Mitigate Digital Exclusion: Employ combined approaches using multiple, diversified channels (digital and traditional) to reach excluded populations. Avoid purely digital systems where exclusion is high (p. 55).
- Improve Data Ecosystems: Invest in more frequent, geo-tagged traditional household surveys to serve as better ground truth for calibrating novel models. Explore ethical ways to link survey data with novel data sources (p. 56).
- Establish Strong Data Protection Safeguards: Embed principles of fairness, transparency, accuracy, consent, and security into the design of social protection systems using novel data (p. 56).
- Enhance Evidence-Based Decision Making: Invest heavily in rigorous Monitoring and Evaluation (M&E), including ex-post evaluations and RCTs, to understand the comparative accuracy, cost-effectiveness, and impacts of different targeting methods (traditional vs. novel vs. hybrid) and address key knowledge gaps (p. 57).
Key Questions Addressed or Raised
Addressed:
- How can novel data (satellite, mobile) and AI/ML be used for social assistance targeting in data-scarce environments?
- What were the specific approaches used in DRC, Togo, and Nigeria during the COVID-19 crisis?
- What are the potential benefits (speed, cost, reach) and limitations (accuracy, exclusion, data issues) of these approaches?
- How are these innovations being integrated into post-pandemic programs (e.g., Togo, Malawi)?
Raised:
- What is the actual targeting accuracy of novel methods compared to traditional ones in real-world, scaled-up implementations? (p. 53)
- How large is the digital exclusion effect, and who is most affected? (p. 50)
- How can hybrid models effectively combine digital reach with inclusive traditional support? (p. 55)
- How do targeting errors accumulate across multi-stage (e.g., geographic + individual) processes? (p. 52)
- Can remote data collection for PMTs maintain accuracy despite lack of verification? (p. 52)
- How can welfare changes during crises be reliably measured and predicted using novel data? (pp. 53, 57)
- What are the best practices for ethical data sharing, robust data protection, and sustainable partnerships with data holders (e.g., MNOs)? (pp. 54, 56)
Key Points
- Novel data (satellite, mobile) and AI/ML can enable welfare-targeted social assistance delivery in data-scarce, emergency contexts.
- In DRC and Togo, these approaches allowed programs to be set up in under a year during the COVID-19 crisis, faster than typical traditional PMT-based programs.
- Administrative costs were comparable to traditional programs (5.7% DRC, ~10% Togo), facilitated by digital processes.
- Geographic targeting using satellite imagery combined with ML can identify poor areas, but accuracy depends on ground truth data quality and may not capture crisis impacts.
- Individual targeting using mobile phone data (CDR) combined with ML shows promise but appears less accurate than traditional PMTs and faces challenges (SIM sharing, data privacy).
- Digital exclusion is a major limitation; phone-based approaches exclude those without phones or digital literacy, often the poorest.
- Combining novel digital methods with traditional approaches (e.g., in-person registration) is crucial for inclusivity and accuracy.
- Robust ex-post evaluations are needed to understand the true targeting accuracy and cost-effectiveness of these novel methods.
- Strong political will, partnerships with MNOs, and robust data protection frameworks are essential prerequisites.