WorldPop Book of Methods
Contributing Authors
Click an author's name to find out more about them.
Ortis Yankey
Andrew Tatem
Chris Jochem
Chibuzor Christopher Nnanatu
Douglas Leasure
Claire Dooley
Gianluca Boo
Edith Darin
Alessandro Sorichetta
Attila N. Lazar
Heather R. Chamberlain
Tom Mckeen
Maksym Bondarenko
Assane Gadiaga