WorldPop Book of Methods

Contributing Authors

Click an author's name to find out more about them.

Ortis Yankey

Andrew Tatem

Chris Jochem

Chibuzor Christopher Nnanatu

Douglas Leasure

Claire Dooley

Gianluca Boo

Edith Darin

Alessandro Sorichetta

Attila N. Lazar

Heather R. Chamberlain

Tom Mckeen

Maksym Bondarenko

Assane Gadiaga