Predicting Multidimensional Poverty with Machine Learning Algorithms: An Open Data Source Approach Using Spatial Data

Muñetón-Santa, Guberney; Manrique-Ruiz, Luis Carlos

Item Links

URI: http://hdl.handle.net/10818/62548
Visit link: https://www.scopus.com/inward/ ...

ISSN: 2076-0760

DOI: 10.3390/socsci12050296

Date

2023

Abstract

This paper presents a methodology for estimating the multidimensional poverty index (MPI) from spatial data at the street-block level. The data come from OpenStreetMap and ESA's land use cover, both freely available sources of spatial information. The study employs five machine-learning algorithms, including CatBoost, Lightboost, and Random Forest, to estimate the index at fine spatial granularity. The models achieve promising performance in predicting poverty levels in Medellín, Colombia; Random Forest performed best, with a mean absolute error (MAE) of 0.07504. Furthermore, the spatial distribution of the estimated index was highly correlated with that of the true values. This work contributes to the prediction of multidimensional poverty by demonstrating that machine learning algorithms can exploit accessible spatial data. By providing evidence that poverty levels can be estimated at a granular spatial level, the methodology offers policymakers a tool for designing anti-poverty social interventions based on low-cost evidence. The study also has important implications for poverty eradication efforts in developing countries, where access to reliable data remains challenging. © 2023 by the authors.
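The two evaluation ideas mentioned in the abstract, the mean absolute error and the correlation between estimated and true index values, can be illustrated with a minimal sketch. The function names and the four street-block values below are hypothetical and chosen only for illustration; they are not taken from the paper, and the authors' actual evaluation code is not shown in this record.

```python
from math import sqrt


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical MPI values for four street blocks (illustrative, not from the paper).
observed_mpi = [0.10, 0.25, 0.40, 0.55]
predicted_mpi = [0.15, 0.20, 0.45, 0.50]

mae = mean_absolute_error(observed_mpi, predicted_mpi)
r = pearson_correlation(observed_mpi, predicted_mpi)
print(f"MAE: {mae:.5f}, Pearson r: {r:.3f}")
```

A lower MAE means the estimates sit closer to the observed index on average, while a correlation near 1 indicates that blocks ranked as poorer by the model also tend to be poorer in the reference data.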

Keywords

Artificial intelligence

Census

Colombia

Geospatial data

Machine learning

Location

Social Sciences (Basel), Vol. 12 (5), p. 296, Article 296

Collections to which it belongs

Facultad de Filosofía y Ciencias Humanas [108]

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International
