Welcome
Preface
Acknowledgements
Contributing Authors
Introduction
1
Gridded Population Estimates
1.1
Top-down
1.2
Bottom-up
1.3
Peanut Butter
1.4
Comparison of Data
Contribution
2
Data Access
2.1
Websites
2.2
Web Applications
2.3
FTP Server
2.4
GIS Plugins
2.5
R Packages
2.6
Python Packages
2.7
REST API
Contribution
3
Research Applications
3.1
Introduction
3.2
Population modelling to support census
3.3
Population modelling to support vaccination programmes
Contribution
4
Settlement Data and Geospatial Covariates
4.1
Settlement Data
4.2
Key Settlement Data Sources
4.3
Use of Settlement Data in Population Modelling
4.4
Geospatial Covariates
Contribution
Bottom-Up Population Mapping Method
5
Bottom-up Models
5.1
Input Data
5.1.1
Population Data
5.1.2
Settlement Map
5.1.3
Geospatial Covariates
5.1.4
Administrative Boundaries
5.2
Statistical Models
5.2.1
Software
5.2.2
Simple Model to Start
5.2.3
Bayesian Priors
5.2.4
Hierarchical Core Model
5.2.5
Age-sex Structure
5.2.6
Random Intercept
5.2.7
Hierarchical Variance
5.2.8
Weighted-likelihood
5.2.9
Geostatistical Models
5.3
Conclusion
Contribution
6
Hierarchical Bottom-Up Modelling Tutorial
6.1
Introduction
6.2
Part I: How to think about population as a Bayesian
6.2.1
Set-up
6.2.2
From a frequentist to a Bayesian mindset
6.2.3
How to choose priors
6.2.4
Simulating data
6.2.5
Modelling the data
6.2.6
Implementing the model in stan
6.2.7
Preparing the data for stan
6.2.8
Running the model
6.2.9
Checking the MCMC simulations
6.2.10
Evaluating the estimated parameters
6.2.11
Let’s try with real data
6.2.12
The data
6.2.13
Response variable: the population count
6.2.14
Modelling Population count: a Poisson distribution
6.2.15
Implementing the model
6.2.16
Estimating the model
6.2.17
Evaluating the model goodness-of-fit
6.2.18
Estimated parameters
6.2.19
Predicted population count
6.2.20
Modelling Population count: a Poisson Lognormal model
6.2.21
Writing the model
6.2.22
Defining the priors
6.2.23
Implementing the model
7
Set-up
7.1
Part II: How to model large-scale spatial variation
7.1.1
Goals
7.1.2
Supporting readings
7.1.3
Extra packages
7.1.4
Hierarchical structure in the data
7.1.5
Full picture of the data grouping
7.1.6
Hierarchical structure in the model
7.1.7
A hierarchical intercept by settlement type
7.1.8
No-pooling
7.1.9
Partial pooling
7.2
Part III: How to model small-scale spatial variation
7.2.1
Goals
7.2.2
Supporting readings
7.2.3
Formal modelling
7.2.4
Review of the covariates used in WorldPop
7.2.5
Covariates engineering
7.2.5.1
A note on covariate selection
7.2.6
Including covariates in the model
7.2.7
Overview of the covariates in Nigeria, v1.2
7.2.8
Preparing the data
7.2.9
Implementing the model
7.2.10
A note on initialisation
7.2.11
Comparing prediction with previous model
7.2.12
Grouping and covariates effect: a random slope model
7.3
Part IV: Model diagnostics and predictions
7.3.1
Goals
7.3.2
Supporting readings
7.3.3
Extra packages
7.3.4
Advanced model diagnostics
7.3.5
Assessing convergence
7.3.6
Checking predicted posterior
7.3.7
Cross-validation of model predictions
7.3.8
Gridded population prediction
7.3.9
Preparing data for prediction
7.3.10
Extracting estimated parameters
7.3.11
Predicting population count for every grid cell
7.3.12
Predicting distributions
7.3.13
Gridded population
7.3.14
Gridded uncertainty
7.3.15
Aggregating prediction
Contribution
Suggested citation
8
Hierarchical Bottom-Up Modelling Extension: the Weighted Likelihood
Introduction
Methods
Simulated Populations
Simulated Survey Data
Random Sample
Population-weighted Sample
Combined Sample
Statistical Models
Unweighted Log-normal
Weighted-likelihood
Weighted-precision (Stan)
Weighted-precision (JAGS)
Population Totals
Results
Discussion
8.1
Contributing
8.1.1
Suggested Citation
8.1.2
License
References
Tables
Figures
Appendix A: Supplementary Files
Hierarchical Bottom-Up Modelling Use Cases
9
Nigeria (v1)
9.1
Introduction
9.2
Materials and Methods
9.3
Results
9.4
Discussion
9.5
Conclusions
Suggested Citation
10
Democratic Republic of Congo - DRC (v2)
10.1
Introduction
10.2
Methods
10.3
Results
10.4
Discussion
Suggested Citation
11
Ghana (v2)
11.1
Introduction
11.2
Methods
11.2.1
Data
11.2.2
Statistical Model
Indexing
Population Estimates
People per Household (pph)
Demographic Groups
Households per Building (hpb)
Measurement Error in Census Projections
11.2.3
Model Implementation and Diagnostics
11.3
Results
11.3.1
People per household
11.3.2
Age-sex structure
11.3.3
Census projections
11.4
Discussion
Contribution
12
Burkina Faso (v1)
12.1
Introduction
12.2
Estimating Missed Communes
12.2.1
Model Method
12.2.2
Model Implementation
12.2.2.1
Input data
12.3
Model results
12.3.1
Implementing the model
12.3.2
Assessing the model goodness-of-fit
12.4
Estimating Gridded Population for the Entire Country
12.4.1
Model Method
12.4.2
Model Implementation
12.4.2.1
Input data
12.5
Model Results
12.5.1
Implementing the model
12.5.2
Assessing the model goodness-of-fit
12.6
Discussion
Suggested Citation
13
Zambia (v1)
13.1
Introduction
13.1.1
Background
13.1.2
Zambia Context
13.2
Methods
13.2.1
Data
13.2.1.1
Geospatial Covariates
13.2.2
Statistical Modelling
13.2.3
Predictions and Final Dataset
13.3
Results
13.3.1
Assessment of Building Footprints Data
13.3.2
Model Fit
13.3.3
Summary of Population Estimates
13.4
Discussion
Contribution
14
Bottom-Up Geostatistical Population Modelling
15
Geostatistical Bottom-Up Modelling
15.1
Introduction
15.2
Model Structure
15.3
Implementation within R-INLA
15.4
Posterior Simulation and grid cell prediction
15.5
Model Fit Checks and Cross-Validation
15.6
Contribution
16
Geostatistical Use Cases
16.1
Introduction
16.2
Materials and Methods
16.3
Simulation Study
16.3.1
Model parameter values and sample sizes
16.3.2
Geospatial Covariates and Random Effects
16.3.3
Estimating the mean of the population density
16.3.4
Data Simulation
16.3.4.1
Data Simulation Steps
16.3.5
Model fit checks and cross-validation
16.3.6
Simulation Study Results
16.4
Application to Cameroon Household Listing Datasets
16.4.1
Input Datasets
16.4.2
Bayesian Hierarchical Modelling
16.4.2.1
Covariate Selection
16.4.2.2
Model Implementation Within INLA-SPDE Framework
16.4.3
Results
16.4.3.1
Model fit metrics
16.4.3.2
Proportion of variance explained
16.4.3.3
Cross-Validation
16.4.3.4
Fixed Effects Coefficients
16.4.3.5
Comparison Between INLA-SPDE Estimates and NIS Projections at Admin 1
16.4.4
Discussion
16.5
Contributions
Top-Down Population Mapping Method
17
Top-Down Population Modelling
17.1
Introduction
17.2
Approach to Top-Down Disaggregation
17.2.1
Random Forest
17.3
Global 1
17.4
Top-Down Unconstrained
17.5
Top-Down Constrained
17.6
Contributions
18
Top-Down Disaggregation
18.1
Introduction
18.1.1
Pre-requisites
18.2
Background
18.3
R Environment
18.3.1
Source Data
18.4
Random Forest
18.4.1
Response Variable
18.4.2
Predictor Variables
18.4.3
Model Fitting
18.4.3.1
Settings
18.4.3.2
Run the Model
18.4.4
Weighting Layer
18.4.5
Redistribution to EA-level
18.4.6
Diagnostics
18.4.6.1
Summing Enumeration Areas
18.4.6.2
Goodness-of-Fit
18.4.6.3
EA-level Assessment
18.4.6.4
Covariate Importance
18.5
Limitations
18.6
Tips and Tricks
18.6.1
Map Results
18.6.2
Zonal Statistics
18.6.3
Gridded Population Estimates
18.6.4
Parallel Processing
Contributions
Suggested Citation
Bespoke Top-Down Population Models
19
South Sudan (v2)
19.1
Introduction
19.2
Methods
19.2.1
Boundaries, settled area and unadjusted (baseline) county population totals
19.2.2
Mapping of internally displaced persons
19.2.3
Mapping of unadjusted population (population distribution in the absence of displacement)
19.2.4
Mapping locations where people have been displaced from
19.2.5
Final population distribution accounting for displacement
19.3
Results
19.3.1
Mapping of internally displaced persons
19.3.2
Mapping of unadjusted population (population distribution in the absence of displacement)
19.3.3
Mapping of where people have been displaced from
19.3.4
Final population distribution accounting for displacement
19.4
Discussion
Contribution
20
High Resolution Gridded Population Datasets for Latin America and the Caribbean using Official Statistics
20.1
Background & Summary
20.2
Methods
20.2.1
Random forest-based dasymetric population mapping approach
20.2.2
Data Collection
20.2.3
Data Processing
20.2.4
Random Forest Modelling Scenarios
20.3
Data Records
20.4
Technical Validation
20.5
WSF3D Quantitative Assessment
20.6
Usage Notes
20.7
Code Availability
20.8
Contributions
WorldPop Book of Methods
Contributing Authors
Click an author's name to find out more about them.
Ortis Yankey
Andrew Tatem
Chris Jochem
Chibuzor Christopher Nnanatu
Douglas Leasure
Claire Dooley
Gianluca Boo
Edith Darin
Alessandro Sorichetta
Attila N. Lazar
Heather R. Chamberlain
Tom Mckeen
Maksym Bondarenko
Assane Gadiaga