Sparse Cells, Inflated Odds: Bias Reduction Methods for Complex Survey Data with an Application to Hygiene Availability
DOI:
https://doi.org/10.29020/nybg.ejpam.v18i4.7150
Keywords:
Complex survey design, Sparse data, Firth penalized logistic regression (FPLR), Survey-weighted logistic regression (SLR), Multiple indicator cluster survey (MICS)
Abstract
Complex survey designs, such as UNICEF’s Multiple Indicator Cluster Surveys (MICS), often generate sparse cells across strata, leading to inflated odds ratios (ORs), wide confidence intervals (CIs), and convergence failures in survey-weighted logistic regression (SLR). Using Sudan MICS 2014 data on 16,679 households, this study demonstrates a practical workflow to mitigate sparse-data bias: (1) SLR for population-representative estimates, (2) Firth penalized logistic regression (FPLR) to reduce bias from sparse cells, and (3) geographic collapsing of states into broader regions to improve stability. Implausibly large state-level ORs under SLR (e.g., OR = 153.19 for Central Darfur) were substantially reduced after FPLR and regional collapsing. Substantively, strong gradients in handwashing facility access emerged by education, wealth, and geography, while sex of household head showed no effect. The study illustrates that combining design-based estimation, penalized likelihood, and collapsing strategies yields more stable inference in complex surveys with sparse data, offering practical guidance for applied researchers.
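To make the sparse-cell problem concrete, the following is a minimal sketch of Firth's penalized logistic regression in Python. It is not the authors' implementation, and it makes several assumptions: it is unweighted (it ignores survey weights, strata, and clusters), the function name `firth_logit` is hypothetical, and the data are a toy 2x2 table with one empty cell rather than MICS data. In this toy table ordinary maximum likelihood would push the odds ratio toward infinity, while the Firth-modified score keeps it finite.

```python
import numpy as np

def firth_logit(X, y, max_iter=100, tol=1e-8, max_step=5.0):
    """Firth-penalized logistic regression via Fisher scoring (illustrative sketch).

    Solves the modified score equations
        sum_i (y_i - pi_i + h_i * (0.5 - pi_i)) * x_i = 0,
    where h_i are the diagonal elements of the weighted hat matrix.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = pi * (1.0 - pi)
        info = X.T @ (X * w[:, None])          # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # h_i = w_i * x_i' (X'WX)^{-1} x_i
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        score = X.T @ (y - pi + h * (0.5 - pi))  # Firth-modified score
        step = info_inv @ score
        largest = np.max(np.abs(step))
        if largest > max_step:                  # cap very large steps
            step = step * (max_step / largest)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Toy 2x2 table (assumed data, for illustration only):
#   unexposed: 10 events, 10 non-events
#   exposed:    8 events,  0 non-events  (the sparse, empty cell)
x = np.concatenate([np.zeros(20), np.ones(8)])
y = np.concatenate([np.ones(10), np.zeros(10), np.ones(8)])
X = np.column_stack([np.ones_like(x), x])      # intercept + exposure

beta = firth_logit(X, y)
odds_ratio = float(np.exp(beta[1]))
print(f"Firth odds ratio: {odds_ratio:.2f}")
```

For a saturated model such as this one, the Firth estimate coincides with adding 0.5 to every cell of the table, so the odds ratio here comes out at about 17 rather than diverging. A design-based analysis would additionally need survey weights and variance estimation that accounts for strata and clusters, which this sketch omits.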
License
Copyright (c) 2025 Zakariya M. S. Mohammed, Sanaa A. Mohammed, Mohyaldein Salih, Myada A Ibrahim, Omer M. A. Hamed, Ola A. I. Osman, Ali Satty

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Upon acceptance of an article by the European Journal of Pure and Applied Mathematics, the author(s) retain the copyright to the article. By submitting your work, however, you agree that the article will be published under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). This license allows others to copy, distribute, and adapt your work, provided proper attribution is given to the original author(s) and source and the work is not used for commercial purposes.
By agreeing to this statement, you acknowledge that:
- You retain full copyright over your work.
- The European Journal of Pure and Applied Mathematics will publish your work under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
- This license allows others to use and share your work for non-commercial purposes, provided they give appropriate credit to the original author(s) and source.