Do Rainy Months Increase Public Concern About Dengue and Malaria?

In this notebook, we explore the relationship between total precipitation in India and public awareness of dengue and malaria from 2004 to 2024.

We use varunayan to extract processed monthly precipitation data for India.

By combining this climate data with Google Trends search activity, we investigate how rainfall patterns relate to public awareness of mosquito-borne diseases, and whether increased precipitation precedes spikes in concern.

Downloading Total Precipitation Data

We use varunayan.era5ify_geojson to retrieve monthly total precipitation over India using a GeoJSON boundary and the ERA5 dataset.

import varunayan

df = varunayan.era5ify_geojson(
    request_id="prec_india_2004_2024",
    variables=["total_precipitation"],
    start_date="2004-1-1",
    end_date="2024-12-31",
    json_file="https://gist.githubusercontent.com/JaggeryArray/bf296307132e7d6127e28864c7bea5bf/raw/4bce03beea35d61a93007f54e52ba81f575a7feb/india.json",
    frequency="monthly"
)

============================================================
STARTING ERA5 SINGLE LEVEL PROCESSING
============================================================
Request ID: prec_india_2004_2024
Variables: ['total_precipitation']
Date Range: 2004-01-01 to 2024-12-31
Frequency: monthly
Resolution: 0.25°
GeoJSON File: C:\Users\ATHARV~1\AppData\Local\Temp\prec_india_2004_2024_temp_geojson.json


--- GeoJSON Mini Map ---

MINI MAP (68.18°W to 97.40°E, 7.97°S to 35.49°N):
┌─────────────────────────────────────────┐
│·········································│
│········■■■■■■■··························│
│··········■■■■■··························│
│·········■■■■■■■·························│
│·······■■■■■■■■■■························│
│····■■■■■■■■■■■■■■■■············■■■■■■■·│
│···■■■■■■■■■■■■■■■■■■■■■■■■■·■■■■■■■■····│
│····■■■■■■■■■■■■■■■■■■■■■■■■·····■■■■····│
│·■■■■■■■■■■■■■■■■■■■■■■■■■■■■···■■■······│
│··■■■■■■■■■■■■■■■■■■■■■■■■■■■············│
│·······■■■■■■■■■■■■■■■■■■■···············│
│·······■■■■■■■■■■■■■■■■··················│
│·······■■■■■■■■■■■■■·····················│
│········■■■■■■■■■························│
│·········■■■■■■■■························│
│··········■■■■■■■························│
│···········■■■■■·························│
│············■■■··························│
│·········································│
└─────────────────────────────────────────┘
 ■ = Inside the shape
 · = Outside the shape

Saving files to output directory: prec_india_2004_2024_output
  Saved final data to: prec_india_2004_2024_output\prec_india_2004_2024_monthly_data.csv
  Saved unique coordinates to: prec_india_2004_2024_output\prec_india_2004_2024_unique_latlongs.csv
  Saved raw data to: prec_india_2004_2024_output\prec_india_2004_2024_raw_data.csv

============================================================
PROCESSING COMPLETE
============================================================

RESULTS SUMMARY:
----------------------------------------
Variables processed: 1
Time period:         2004-01-01 to 2024-12-31
Final output shape:  (252, 3)
Total complete processing time: 94.03 seconds

First 5 rows of aggregated data:
         tp  year  month
0  0.029095  2004      1
1  0.010954  2004      2
2  0.024185  2004      3
3  0.062215  2004      4
4  0.082469  2004      5

============================================================
ERA5 SINGLE LEVEL PROCESSING COMPLETED SUCCESSFULLY
============================================================
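
The plotting cells that follow use df_dengue_merged and df_malaria_merged, which this page does not show being built. Here is a minimal sketch of one way to build such a frame. It assumes the Google Trends export is already loaded as a DataFrame with a monthly `date` column and one column named after the search term. The helper name `merge_precip_with_trends` and the toy trend values are hypothetical, not part of varunayan or of the original notebook.

```python
import pandas as pd


def merge_precip_with_trends(precip_df, trends_df, term):
    """Join monthly precipitation with a monthly Google Trends series.

    Hypothetical helper: assumes precip_df has 'tp', 'year' and 'month'
    columns (as in the aggregated output above) and trends_df has a
    'date' column plus one column named after the search term.
    """
    precip = precip_df.copy()
    # Build a month-start timestamp from the year and month columns
    precip["date"] = pd.to_datetime(precip[["year", "month"]].assign(day=1))

    trends = trends_df[["date", term]].copy()
    # Snap trend dates to month start so both frames share the same key
    trends["date"] = (
        pd.to_datetime(trends["date"]).dt.to_period("M").dt.to_timestamp()
    )

    merged = precip.merge(trends, on="date", how="inner")
    return merged.sort_values("date").reset_index(drop=True)


# Toy data: tp values copied from the first rows above; trend values are made up
precip_demo = pd.DataFrame(
    {"tp": [0.029095, 0.010954, 0.024185], "year": [2004] * 3, "month": [1, 2, 3]}
)
trends_demo = pd.DataFrame(
    {"date": ["2004-01-01", "2004-02-01", "2004-03-01"], "dengue symptoms": [0, 5, 3]}
)
df_demo = merge_precip_with_trends(precip_demo, trends_demo, "dengue symptoms")
print(df_demo)
```

An inner join keeps only the months present in both sources, so a gap in either series shortens the merged frame instead of introducing missing values.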

Visualizing Precipitation vs Search Interest in Dengue Symptoms

This plot compares monthly rainfall with Google search interest in “dengue symptoms”. The search-interest series is log-transformed to compress large spikes, so that seasonal rises and falls stay readable.

The goal is to visually inspect if increases in rainfall correspond with spikes in public concern.
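
To illustrate what the log transform does, here is a minimal sketch using made-up values on Google Trends' 0 to 100 scale (the array is illustrative, not notebook data). `np.log1p(x)` computes log(1 + x), the same quantity as the `np.log(x + 1)` used in the plotting cell.

```python
import numpy as np

# Illustrative values only (not taken from the notebook's data)
interest = np.array([0, 1, 10, 50, 100])

# log(1 + x): defined at x = 0, and compresses large values
log_interest = np.log1p(interest)

# Raw values span 0 to 100; transformed values span 0 to about 4.6
print(log_interest.round(2))
```

Because the transform is monotonic, the timing of peaks is unchanged; only their relative height shrinks.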

import numpy as np
import matplotlib.pyplot as plt

setup_matplotlib()

fig, ax1 = plt.subplots(figsize=(12, 5))

# Plot tp on left y-axis
ax1.plot(df_dengue_merged['date'], df_dengue_merged['tp'], color='tab:blue', marker='o', label='Total Precipitation (tp)')
ax1.set_ylabel('Total Precipitation (m)', color='tab:blue')  # ERA5 total_precipitation is in metres
ax1.tick_params(axis='y', labelcolor='tab:blue')

# Apply log transform to dengue column
dengue_log = np.log(df_dengue_merged['dengue symptoms'] + 1)  # Add 1 to avoid log(0)

# Plot on right y-axis
ax2 = ax1.twinx()
ax2.plot(df_dengue_merged['date'], dengue_log, color='tab:orange', marker='s', label='searches for \'dengue symptoms\' (log-scaled)')
ax2.set_ylabel('Log of \'dengue symptoms\' Trend', color='tab:orange')
ax2.tick_params(axis='y', labelcolor='tab:orange')

# Title and grid
plt.title('Total Precipitation vs Google Trends for \'dengue symptoms\' (Log-Transformed)')
ax1.set_xlabel('Date')
ax1.grid(True)
fig.tight_layout()

plt.show()
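[Figure: total precipitation and log-scaled Google Trends interest in 'dengue symptoms' over time (../_images/a25bac7738469535cc3a6d9133e7228552b2bfcf7122b882c5c7851ed682d3ab.png)]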

Search interest in dengue symptoms appears to rise after peak rainfall months, which is consistent with known transmission cycles for mosquito-borne diseases.

Visualizing Precipitation vs Search Interest in Malaria Symptoms

We repeat the same visualization approach for malaria symptoms to examine whether precipitation patterns precede or coincide with rising public awareness of malaria risk.

import numpy as np
import matplotlib.pyplot as plt

setup_matplotlib()

fig, ax1 = plt.subplots(figsize=(12, 5))

# Plot tp on left y-axis
ax1.plot(df_malaria_merged['date'], df_malaria_merged['tp'], color='tab:blue', marker='o', label='Total Precipitation (tp)')
ax1.set_ylabel('Total Precipitation (m)', color='tab:blue')  # ERA5 total_precipitation is in metres
ax1.tick_params(axis='y', labelcolor='tab:blue')

# Apply log transform to malaria column
malaria_log = np.log(df_malaria_merged['malaria symptoms'] + 1)  # Add 1 to avoid log(0)

# Plot on right y-axis
ax2 = ax1.twinx()
ax2.plot(df_malaria_merged['date'], malaria_log, color='tab:orange', marker='s', label='searches for \'malaria symptoms\' (log-scaled)')
ax2.set_ylabel('Log of \'malaria symptoms\' Trend', color='tab:orange')
ax2.tick_params(axis='y', labelcolor='tab:orange')

# Title and grid
plt.title('Total Precipitation vs Google Trends for \'malaria symptoms\' (Log-Transformed)')
ax1.set_xlabel('Date')
ax1.grid(True)
fig.tight_layout()

plt.show()
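[Figure: total precipitation and log-scaled Google Trends interest in 'malaria symptoms' over time (../_images/6746fd213f7ee914cffb1e65c35fc3813b34ebe2e0bd0b70da9203552e013e98.png)]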

As with dengue, malaria-related search interest often increases shortly after heavy rainfall, suggesting a seasonal awareness cycle.

Cross-Correlation: Dengue Symptoms vs Precipitation

We use cross-correlation to check whether dengue search interest lags or leads precipitation.
With the argument order used here (search interest first, precipitation second), a positive lag means rainfall leads search interest.
This helps assess whether people become concerned after rainy periods.
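
The sign convention is easy to get backwards, so here is a small synthetic check. It is a sketch, not the notebook's own code: the function name `cross_correlation` and the random series are hypothetical. It uses the same `np.correlate` argument order as the notebook's cross-correlation cells, and builds a `searches` series that is simply `rain` delayed by two steps, so the peak should land at lag +2.

```python
import numpy as np


def cross_correlation(follower, leader, max_lag):
    """Normalized cross-correlation for lags in [-max_lag, max_lag].

    Hypothetical helper using the np.correlate argument order
    (follower first, leader second): a peak at a positive lag
    means `leader` moves first.
    """
    f = (follower - follower.mean()) / follower.std()
    g = (leader - leader.mean()) / leader.std()
    corr = np.correlate(f, g, mode="full") / len(f)
    lags = np.arange(-len(f) + 1, len(f))
    mask = np.abs(lags) <= max_lag
    return lags[mask], corr[mask]


# Synthetic check: `searches` is `rain` delayed by two steps
rng = np.random.default_rng(0)
rain = rng.standard_normal(240)
searches = np.concatenate([np.zeros(2), rain[:-2]])

lags, corr = cross_correlation(searches, rain, max_lag=12)
peak_lag = lags[np.argmax(corr)]
print(peak_lag)
```

Swapping the two arguments would mirror the result around zero, which is why the argument order matters when reading the CCF plots.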

import numpy as np
import matplotlib.pyplot as plt

tp = (df_dengue_merged['tp'] - df_dengue_merged['tp'].mean()) / df_dengue_merged['tp'].std()
dengue = (df_dengue_merged['dengue symptoms'] - df_dengue_merged['dengue symptoms'].mean()) / df_dengue_merged['dengue symptoms'].std()

# Full cross-correlation
corr = np.correlate(dengue - dengue.mean(), tp - tp.mean(), mode='full')
lags = np.arange(-len(dengue)+1, len(dengue))
corr = corr / (len(dengue) * dengue.std() * tp.std())  # Normalize

lag_limit = 12
mask = (lags >= -lag_limit) & (lags <= lag_limit)
lags_limited = lags[mask]
corr_limited = corr[mask]

# Plot
plt.figure(figsize=(10, 5))
plt.stem(lags_limited, corr_limited)
plt.xlabel('Lag (months)')
plt.ylabel('Cross-correlation')
plt.title('CCF: Searches for \'dengue symptoms\' vs Total Precipitation')
plt.axvline(0, color='gray', linestyle='--')
plt.grid(True)
plt.tight_layout()
plt.show()
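[Figure: cross-correlation of searches for 'dengue symptoms' with total precipitation, lags from -12 to +12 months (../_images/151c31d92c49d11835e572e4afb93c5b0417596264d385edaa693ec27afe10d6.png)]

The CCF plot shows that search interest in dengue symptoms tends to follow rainfall by about 2 months, suggesting that public awareness rises after mosquito breeding conditions improve.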
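
One way to cross-check a lag read off a CCF plot is to shift the rainfall series forward by k months and compute the ordinary Pearson correlation with search interest at each k. The sketch below does this on synthetic seasonal data, not on the notebook's data. The series, the noise level, the built-in 2-month delay, and the helper name `lagged_correlations` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def lagged_correlations(leader, follower, max_lag):
    """Pearson correlation of `follower` with `leader` shifted by 0..max_lag.

    Hypothetical helper: shift(k) moves `leader` k steps later in time,
    so a high value at k means `follower` tracks `leader` k steps behind.
    """
    return pd.Series(
        {k: leader.shift(k).corr(follower) for k in range(max_lag + 1)}
    )


# Synthetic monthly data: a 12-month cycle plus a little noise
rng = np.random.default_rng(1)
months = np.arange(120)
rain = pd.Series(
    1 + np.sin(2 * np.pi * months / 12) + 0.1 * rng.standard_normal(120)
)
# Searches follow rain with a built-in 2-month delay (assumption for the demo)
searches = pd.Series(np.roll(rain.to_numpy(), 2) + 0.1 * rng.standard_normal(120))

by_lag = lagged_correlations(rain, searches, max_lag=6)
best_lag = by_lag.idxmax()
print(by_lag.round(2))
print(best_lag)
```

With strongly seasonal series, neighbouring lags also correlate highly, so the peak is best read as an approximate delay, not an exact one.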

Cross-Correlation: Malaria Symptoms vs Precipitation

We repeat the same analysis for malaria, testing how far search trends lag or lead relative to precipitation.

import numpy as np
import matplotlib.pyplot as plt

tp = (df_malaria_merged['tp'] - df_malaria_merged['tp'].mean()) / df_malaria_merged['tp'].std()
malaria = (df_malaria_merged['malaria symptoms'] - df_malaria_merged['malaria symptoms'].mean()) / df_malaria_merged['malaria symptoms'].std()

# Full cross-correlation
corr = np.correlate(malaria - malaria.mean(), tp - tp.mean(), mode='full')
lags = np.arange(-len(malaria)+1, len(malaria))
corr = corr / (len(malaria) * malaria.std() * tp.std())  # Normalize

lag_limit = 12
mask = (lags >= -lag_limit) & (lags <= lag_limit)
lags_limited = lags[mask]
corr_limited = corr[mask]

# Plot
plt.figure(figsize=(10, 5))
plt.stem(lags_limited, corr_limited)
plt.xlabel('Lag (months)')
plt.ylabel('Cross-correlation')
plt.title('CCF: Searches for \'malaria symptoms\' vs Total Precipitation')
plt.axvline(0, color='gray', linestyle='--')
plt.grid(True)
plt.tight_layout()
plt.show()
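[Figure: cross-correlation of searches for 'malaria symptoms' with total precipitation, lags from -12 to +12 months (../_images/346165bbd9d5e089b4699834c5a7b79c43e145ee5fde371da6f7047631572c39.png)]

As with dengue, malaria search trends lag behind rainfall, in this case by about 1 month, suggesting that increased rainfall is followed by heightened public awareness.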

Conclusion

This analysis suggests that total precipitation in India is positively correlated with public search interest in mosquito-borne diseases such as dengue and malaria.

  • Peak rainfall often precedes rising public concern by 1–2 months

  • Cross-correlation indicates that search interest lags rainfall, which is consistent with seasonal mosquito activity and disease outbreaks

By combining climate data extracted using varunayan with search behavior data, we can better understand environmentally driven public health patterns.