Gambela's Malaria Mapping: Spatial Pattern by Woreda

Using Spatial Models to Map Disease Risk and Identify Clusters in Ethiopia's Most Suitable Region

By Natnael G. Mengistu et al. in Spatial Analysis Public Health

December 31, 2025

Full Report (Google Doc)

Introduction and Background

Malaria remains a significant public health challenge in Ethiopia, with its transmission heavily influenced by environmental and geopolitical factors. For this project, we focused on the Gambela region, a unique area characterized by its low altitude, high temperature suitability for malaria, and complex geopolitical dynamics.

Why Gambela?

Altitude: Most of Gambela lies below 2,000 meters above sea level (masl), with floodplains below 500 masl. Epidemiologically, areas below 1,750m are of primary interest for malaria transmission.
Temperature Suitability: Using data from the Malaria Atlas Project, we found that Gambela ranks as the second most suitable region in Ethiopia for Plasmodium falciparum transmission, following only the Somali region.

# sorting region by mean temprature suitability
region |>
  select(c(ADM1_EN, temp_mean, temp_median)) |>
  arrange(desc(temp_mean))

Region Suitability Rankings Ranking Ethiopia’s regions according to temperature suitability for Plasmodium falciparum.

Geopolitical Context: As of November 2025, UNHCR puts 447,095 refugees and asylum seekers in Gambela, which is 39.7% of the total in Ethiopia. Proximity to conflicts in neighboring Sudan makes malaria transmission in this region a critical priority.

Refugee Status in Ethiopia Total refugees and asylum seekers in Ethiopia (Source: UNHCR Operational Data Portal).

Data and Methodology

We leveraged the malariaAtlas R library to extract spatial data. Our primary health indicator is the Malaria Incidence Rate, as it directly measures transmission risk, unlike mortality rates which often reflect healthcare access or behavior.

Data Extraction Process

We extracted case counts, mortality rates, and population estimates using woreda-level geometries.

# fetching malaria data through the malariaAtlas library
library(sf)
library(malariaAtlas)

# incidence rate per 1000 population from 2000 - 2024
incidence_id <- "Malaria__202508_Global_Pf_Incidence_Rate"

incidence_raster <- getRaster(
  dataset_id = incidence_id,
  shp = ethiopia,
  year = 2024
)

# only taking the mean estimates for the rates
incidence_raster <- incidence_raster[[1]]

# merging count and rate data with woreda shapes
gambela_malaria <- st_transform(gambela_woreda, crs(incidence_raster))
incidence_rate <- extract(incidence_raster, vect(gambela_malaria),
                          fun = mean, na.rm = TRUE)[, 2]
gambela_malaria$incidence <- incidence_rate

Crude Rate Analysis

Our initial step was to map the crude incidence rates. The choropleth map below shows a clear trend: the risk of malaria increases significantly as we move from the North-East toward the Southern woredas.

# choropleth (crude rates)
library(ggplot2)

ggplot(gambela_malaria) +
  geom_sf(aes(fill=incidence)) +
  scale_fill_viridis_c(option = "viridis", name = "Crude Incidence Rate") +
  labs(
    title = "Crude Malria Incidence by Woreda (Gambela Region)",
    subtitle = "Data Source: Malaria Atlas Project",
  )+
  theme_minimal()

Crude Incidence Map Map showing crude malaria incidence rates across Gambela woredas.

Analysis of Small Population Effects

In disease mapping, small population denominators can lead to unstable rate estimates. To check this, we plotted the crude incidence rate against population size with variability bands (funnel plot).

# calculating variability bands at 95% confidence
overall_rate <- sum(gambela_malaria$cases) / sum(gambela_malaria$pop)
gambela_malaria$incidence_se <- sqrt(overall_rate / gambela_malaria$pop)
gambela_malaria$incidence_lower <- overall_rate - 1.96 * gambela_malaria$incidence_se
gambela_malaria$incidence_upper <- overall_rate + 1.96 * gambela_malaria$incidence_se

# plotting population vs crude incidence rate
plot(
  x = gambela_malaria$pop, y = gambela_malaria$incidence,
  xlab = "Population ", ylab = "Crude Incidence Rate",
  main = "Crude Incidence Rate vs Population"
)
lines(gambela_malaria$pop, gambela_malaria$incidence_upper, col = "lightblue")
lines(gambela_malaria$pop, gambela_malaria$incidence_lower, col = "lightblue")
abline(h = overall_rate, col = "red")

Funnel Plot of Incidence vs Population Funnel plot checking for small population effects on crude incidence rates.

Interpretation: Most of our data points lie outside the funnel, even for small populations. Quantitatively, the ratio of expected variance to observed variance was only 0.0105. This means only 1% of the variability is explained by small denominators—the rest is likely due to spatial dependence or other overdispersion factors.

Spatial Smoothing with BYM2

To distinguish between structured spatial patterns and random noise, we implemented the Besag-Yor-Molie 2 (BYM2) model using INLA.

# implementing INLA and BYM2
library(INLA)
library(spdep)

gambela_malaria$ID <- 1:nrow(gambela_malaria)
nb <- poly2nb(gambela_malaria)
nb2INLA("gambela.graph", nb)

# Fitting BYM2 Model
bym2_model <- inla(round(cases) ~ 1 + f(ID, model = "bym2", graph = "gambela.graph"),
              family = "poisson", data = as.data.frame(gambela_malaria),
              E = expected, control.predictor = list(compute = TRUE, link = 1))

gambela_malaria$smoothed_incidence <- bym2_model$summary.fitted.values$mean * overall_rate

# crude and smoothed maps comparison
plot1 <- ggplot(gambela_malaria) + geom_sf(aes(fill = incidence)) +
  scale_fill_viridis_c(option = "viridis", limits = c(min_limit, max_limit)) +
  labs(title = "Crude incidence", fill = "Rate") + theme_minimal()

plot2 <- ggplot(gambela_malaria) + geom_sf(aes(fill = smoothed_incidence)) +
  scale_fill_viridis_c(option = "viridis", limits = c(min_limit, max_limit)) +
  labs(title = "BYM2 smoothed", fill = "Rate") + theme_minimal()

plot1 / plot2

Crude vs Smoothed Comparison Comparison between crude incidence and BYM2 smoothed incidence.

Results: Interestingly, the smoothing had a minimal effect. The percentage change between crude and smoothed rates ranged only from -0.68% to 0.68%. This suggests that the Malaria Atlas data might already be smoothed or imputed, and the strong spatial signal heavily outweighs random noise.

Spatial Autocorrelation (Moran’s I)

We ran a Global Moran’s I test to confirm spatial dependence.

# checking the presense of spatial signal
nb <- poly2nb(gambela_malaria, queen = TRUE)
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)

moran_result <- moran.test(gambela_malaria$incidence, lw)
print(moran_result)

Both show strong positive spatial autocorrelation. The slight increase in Moran’s I for the smoothed rate indicates that BYM2 successfully filtered out minor local noise.

Sample Estimates	Crude Rate	Smoothed Rate
Moran’s I Statistic	0.58653260	0.59493873
Expectation	-0.07142857	-0.07142857
Variance	0.03449463	0.03447319
p-value	0.0001981	0.000166

Table 4.1: Moran test value (crude incidence rate vs BYM2 incidence smoothed rate).

Modeling with Covariates: Accessibility

We investigated whether travel time to the nearest healthcare facility (Accessibility) explains malaria incidence.

# fitting BYM2 model with INLA, offsetting by the population
model_bym2 <- inla(round_cases ~  accessibility + 
                     f(ID, model = "bym2", graph = "gambela.graph"), 
                   family = "poisson", data = gambela_malaria, E = pop,
                   control.predictor = list(compute = TRUE, link = 1),
                   control.compute = list(dic = TRUE, waic = TRUE))

summary(model_bym2)

Variable	Non-Spatial (Poisson GLM)	BYM2 (Spatial Model)
Intercept	-2.490 (8.893e-03)***	-2.451 (-2.572, -2.344)
Accessibility	1.831 (6.901e-05)***	0.001 (0.000, 0.002)
Precision	-	91.435
Phi	-	0.677
AIC / WAIC	1445.1	172.90

Table 6.2: Comparison of Non-Spatial Poisson Log-Linear model vs spatial BYM2.

Key Finding: In the non-spatial model, accessibility appeared significant. However, this significance disappeared in the spatial model. With a Phi value of 0.68, it became clear that location itself is a much stronger predictor than travel time.

Sub-Woreda Insights

Aggregating data at the woreda level can sometimes hide local “hotspots.” By looking at the raw raster data, we observed that the risk follows a North-East to South-West gradient.

# cropping Ethiopia's incidence raster to Gambela only
library(terra)
library(tidyterra)

gambela_incidence <- crop(incidence_raster, vect(gambela_malaria), mask = TRUE)

ggplot() +
  geom_spatraster(data = gambela_incidence) +
  scale_fill_viridis_c(option = "viridis", na.value = "transparent", name = "Incidence Rate") +
  geom_spatvector(data = vect(gambela_malaria), fill = NA, color = "grey", linewidth = 0.3) +
  labs(title = "Malaria Incidence in Gambela (2000 - 2024)", caption = "source: Malaria Atlas Project") +
  theme_minimal()

Raster Incidence Map Raw raster-level malaria incidence in Gambela (Source: Malaria Atlas Project).

Public Health Implications

Based on our mapping and models, we recommend the following:

Prioritize Southern Woredas: Dima, Mengesh, and Godere are the highest-risk woredas needing focus.
Strategic Refugee Placement: Locate camps in the North-East where possible, as risk is lower.
Border Sensitivity: Border areas show higher risk, likely due to migration and environmental factors.
Beyond Woreda Borders: Public health efforts should look at sub-woreda variations.

This project was completed as part of an undergraduate study at the Department of Statistics, Addis Ababa University.

Acknowledgements

I would like to express my sincere gratitude to our instructor Bedilu A. (PhD) for his guidance and support throughout this project.

Special thanks also go to my co-authors: Remedan A. Musa, Yonas D. Melese, Hikma H. Eshetu, Hayat E. Shimels, Kibruyisfa E. Zebene, Arsema F. Tegafaw, Fitsum A. Zewude, and Michael T. Tadesse for their hard work and contributions to this research.

Posted on:: December 31, 2025

Length:: 6 minute read, 1250 words

Categories:: Spatial Analysis Public Health

Tags:: R INLA Malaria Disease Mapping

See Also: