My Faults My Own

Any human’s death diminishes me,

because I am involved in humankind.

Age and Covid-19 IFR in India

epistemic status: Invoking Cunningham's Law; not fully confident, but showing my work in the hopes of being told where I'm wrong.


India's population is younger on average than that of the US. Because the Covid-19 infection fatality rate (IFR) varies strongly with age, this could produce a lower population-average IFR in India than in an older nation such as the US, all other factors being equal (spoiler: they're not).

I estimate that the age effect alone gives India a population-average IFR that is 39% of the US rate (i.e., a US rate \(2.58\times\) as high), assuming age-uniform infection rates and no difference in medical care. This effect is driven by the smaller population share of age>70 in India (just 35% of the US share).

I do not attempt to model age-varying infection rates (which I expect would slightly decrease India fatality rates relative to the US), do not attempt to model selection pressure on patients' immune systems (which I expect would make India fatality rates modestly lower), and do not model environmental factors such as air pollution (which I expect would make India fatality rates higher).

Finally, this predicted effect is extremely sensitive to the IFR-by-age curves; if shortages in medical capacity cause higher IFR in the 45-69 age groups (representing 22% of population), India population-average IFR could be many times greater. I believe this sensitivity (which will also be present in developing economies experiencing Covid-19 outbreaks in the future) should urgently motivate research into IFR by age under triage and shortage conditions.

I recently read a preliminary analysis of estimated Covid-19 infection and fatality numbers in India which suggested an observed IFR in India significantly lower than estimates of US IFR. My correspondent presented this lower IFR as a puzzle. I wanted to determine to what extent a lower India IFR could be explained by differences in population demographics.

My work is here, using for US and India age demographics, and O'Driscoll et al. (Nature, submitted August 2020) for age-specific IFR. Here's the primary chart:

Note the much higher Covid-19 IFR in the 70+ age groups and the significantly lower India population share in those groups (India is 3.8% age>70, US 10.9%). If we assume age-uniform infection rates, then the population-average IFR for the US demographic distribution is 0.63% and the population-average IFR for the India demographic distribution is 0.25%.
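As a minimal sketch of the arithmetic behind those two numbers: under age-uniform infection rates, the population-average IFR is just the age-specific IFR weighted by population share. The three age groups, shares, and IFR values below are placeholders chosen for illustration, not the data used in this post, and `population_average_ifr` is a hypothetical helper name.

```python
def population_average_ifr(pop_share, ifr_by_age):
    """Weighted average of age-specific IFR, weighted by population share.

    Assumes infections are spread uniformly across ages, so each age
    group's share of infections equals its share of the population.
    """
    total_share = sum(pop_share)
    if total_share <= 0:
        raise ValueError("population shares must sum to a positive number")
    weighted = sum(s * f for s, f in zip(pop_share, ifr_by_age))
    return weighted / total_share


# Placeholder inputs (NOT the post's data): groups are 0-44, 45-69, 70+.
ifr = [0.0002, 0.005, 0.05]
older_country = [0.60, 0.29, 0.11]
younger_country = [0.74, 0.22, 0.04]

ifr_older = population_average_ifr(older_country, ifr)
ifr_younger = population_average_ifr(younger_country, ifr)
print(f"older: {ifr_older:.3%}  younger: {ifr_younger:.3%}")
```

Even with identical IFR-by-age curves, the country with the smaller 70+ share comes out with the lower average, which is the whole effect being estimated here.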

This model does not attempt to capture any difference in medical care between India and global baselines -- I'm applying global studies of age-specific IFR directly to India age/sex demographics. The effect of medical care differences on IFR-by-age curves is of first-order importance to this analysis; as an example, if lack of care raised each patient's fatality risk to that of a patient 12 years older, it would triple India population-average IFR to 0.75%. Thus, I believe further research on IFR under triage and shortage conditions is urgently needed.
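One way to run that kind of sensitivity check is to shift the IFR curve along the age axis and recompute the average. The sketch below is illustrative only: the exponential IFR curve and the geometric population pyramid are placeholders I made up for the example (not the O'Driscoll et al. estimates or real demographics), and `shifted_ifr` and `average_ifr` are hypothetical helper names.

```python
import math


def shifted_ifr(ifr_by_year, shift_years):
    """Give each age a the IFR of age a + shift_years.

    ASSUMPTION: ages past either end of the curve reuse its end value.
    """
    last = len(ifr_by_year) - 1
    return [ifr_by_year[max(0, min(a + shift_years, last))]
            for a in range(len(ifr_by_year))]


def average_ifr(pop_share, ifr_by_year):
    """Population-average IFR under age-uniform infection rates."""
    return sum(s * f for s, f in zip(pop_share, ifr_by_year))


# Placeholder inputs (NOT the post's data).
ages = range(101)
# IFR roughly doubling every 7 years of age.
base_ifr = [min(1.0, 1e-5 * math.exp(0.1 * a)) for a in ages]
# Young population: each single year of age is 3% smaller than the last.
raw = [0.97 ** a for a in ages]
total = sum(raw)
pop_share = [r / total for r in raw]

baseline = average_ifr(pop_share, base_ifr)
stressed = average_ifr(pop_share, shifted_ifr(base_ifr, 12))
ratio = stressed / baseline
print(f"12-year shift multiplies average IFR by {ratio:.2f}")
```

The multiplier depends heavily on how steep the IFR curve is in the age groups that hold most of the population, which is why the shape of the curve under shortage conditions matters so much.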

The data sources I use give 5-year buckets for population and for IFR by age. Rather than using them directly, I attempt to allocate both population share and IFR at the per-year level. (This smoothing has the effect of slightly decreasing US population-average IFR from 0.65% to 0.63%, and a negligible effect on India rate estimates.)

The methodologies here are relatively naive and linear:

  • For population by age, I use the average of [population change from prior bucket] and [population change to next bucket] to infer a within-bucket annual survival rate, then use that rate to allocate population to years within the bucket.
  • The age buckets go up to a "100+" bucket. I use this number as a 100-105 number, inject a 105-110 bucket at the geometric midpoint of that and 1, and inject 1 person in the 110-115 bucket. (This doesn't really matter -- US "100+" population share is less than 0.03% and India's is even lower.)
  • I fit IFR to a linear-spline model with breakpoints at the midpoints of each bucket, which involved imputing a rate at the bucket center by backing out some of the imputed contribution from neighboring buckets.
  • The O'Driscoll et al. IFR estimates only go up to an "80+" bucket. I use 80% of the 80+ imputed bucket center for age 82, and assume IFR multiplies by \(1.3\times\) every 5-year bucket.
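Two of the steps above can be sketched in a few lines. This is one possible reading, not the actual spreadsheet logic: I am assuming the two neighboring bucket-to-bucket ratios are averaged arithmetically and turned into an annual rate with a fifth root, and that the \(1.3\times\) growth is applied at bucket centers 82, 87, 92, and so on. The function names and the input numbers are hypothetical.

```python
def split_bucket_into_years(prev_pop, pop, next_pop, width=5):
    """Spread one bucket's population over its single years of age.

    ASSUMPTION: the ratios to the prior and next bucket are averaged
    arithmetically, then converted to an annual rate with a width-th root.
    Pass None for a missing neighbor (first or last bucket).
    """
    ratios = []
    if prev_pop:
        ratios.append(pop / prev_pop)
    if next_pop is not None and pop:
        ratios.append(next_pop / pop)
    bucket_ratio = sum(ratios) / len(ratios) if ratios else 1.0
    annual_rate = bucket_ratio ** (1.0 / width)
    weights = [annual_rate ** k for k in range(width)]
    total = sum(weights)
    # Rescale so the per-year values add back up to the bucket total.
    return [pop * w / total for w in weights]


def extrapolate_old_age_ifr(center_80_plus, max_age=112, growth=1.3, step=5):
    """IFR at bucket centers 82, 87, ... from the imputed 80+ center.

    Age 82 gets 80% of the 80+ value; each later center is `growth`
    times the previous one, capped at 1.0 (the cap is my assumption).
    """
    breakpoints = {}
    age, value = 82, 0.8 * center_80_plus
    while age <= max_age:
        breakpoints[age] = min(value, 1.0)
        age += step
        value *= growth
    return breakpoints


# Placeholder inputs (NOT the post's data).
years = split_bucket_into_years(1000.0, 900.0, 800.0)
old_age = extrapolate_old_age_ifr(0.10)
print([round(y, 1) for y in years])
print({age: round(rate, 4) for age, rate in old_age.items()})
```

The per-year split always sums back to the bucket total, so this smoothing only moves people within a bucket; it changes the average IFR only because IFR also varies within the bucket.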

In both cases, I model male and female statistics separately, interact IFR with population share at the year*sex level, and combine on sex for the final numbers and graphs.
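To illustrate why the interaction is done at the year*sex level before combining, here is a small sketch comparing that approach with a sex-blind shortcut that averages male and female IFR first. All numbers are placeholders (two age groups only, not the post's data), and `average_ifr_by_age_and_sex` is a hypothetical helper name.

```python
def average_ifr_by_age_and_sex(pop, ifr):
    """Population-average IFR from per-sex, per-age inputs.

    pop[sex][i] is a headcount and ifr[sex][i] the matching fatality
    rate; infections are assumed uniform across age and sex.
    """
    total_pop = sum(sum(counts) for counts in pop.values())
    deaths = sum(p * f for sex in pop for p, f in zip(pop[sex], ifr[sex]))
    return deaths / total_pop


# Placeholder inputs (NOT the post's data): a younger and an older group.
pop = {"male": [500, 40], "female": [500, 60]}
ifr = {"male": [0.001, 0.06], "female": [0.0005, 0.04]}

by_sex = average_ifr_by_age_and_sex(pop, ifr)

# Sex-blind shortcut: average the two IFR curves, pool the headcounts.
pooled_ifr = [(m + f) / 2 for m, f in zip(ifr["male"], ifr["female"])]
pooled_pop = [m + f for m, f in zip(pop["male"], pop["female"])]
sex_blind = sum(p * f for p, f in zip(pooled_pop, pooled_ifr)) / sum(pooled_pop)

print(f"by sex: {by_sex:.4%}  sex-blind: {sex_blind:.4%}")
```

In this made-up example the shortcut overstates the average, because the older group skews female while the higher IFR belongs to men; the size and direction of the gap in real data depends on the actual sex ratios by age.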

Limitations of this model:

  • I ignore differences in medical care capacity and availability. Lower supply or quality of medical care in India would create higher Covid-19 fatality rates, and if effects are significant in age<70 groups, population-average fatality rates could be many times greater.
  • I assume age-uniform infection rates in calculating the average fatality rate per infection. An infection rate increasing with age would increase US IFR relative to India -- doubling the relative infection risk of age>77 would increase US population-average IFR by 53% but India population-average IFR by only 40%.
  • I'm ignoring environmental factors (e.g., air pollution) that could (likely will) make courses of Covid-19 worse.
  • I don't model any mechanism by which older Indian patients may have stronger immune systems than the global average for their age. Greater selection pressure on the strength of Indian patients' immune systems would create lower Covid-19 fatality rates.
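The age-uniform infection assumption in the second limitation can be relaxed by weighting each age group by a relative infection risk before averaging. The sketch below doubles the risk for the oldest group, loosely mirroring the age>77 example; the groups, shares, and IFR values are placeholders rather than the post's data, and `average_ifr_with_risk` is a hypothetical helper name.

```python
def average_ifr_with_risk(pop_share, ifr, relative_risk):
    """Average IFR per infection when infection risk varies by age.

    Each group's infections are proportional to its population share
    times its relative infection risk.
    """
    infections = [s * r for s, r in zip(pop_share, relative_risk)]
    deaths = sum(i * f for i, f in zip(infections, ifr))
    return deaths / sum(infections)


# Placeholder inputs (NOT the post's data): groups are 0-44, 45-77, 78+.
ifr = [0.0002, 0.005, 0.08]
older_country = [0.60, 0.34, 0.06]
younger_country = [0.74, 0.24, 0.02]
uniform = [1.0, 1.0, 1.0]
doubled_oldest = [1.0, 1.0, 2.0]


def relative_increase(pop_share):
    """Fractional rise in average IFR when the oldest group's risk doubles."""
    before = average_ifr_with_risk(pop_share, ifr, uniform)
    after = average_ifr_with_risk(pop_share, ifr, doubled_oldest)
    return after / before - 1.0


increase_older = relative_increase(older_country)
increase_younger = relative_increase(younger_country)
print(f"older: +{increase_older:.0%}  younger: +{increase_younger:.0%}")
```

With these placeholder inputs the older population sees the larger proportional increase, matching the direction of the US versus India comparison above.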

2021-05-12: I'm currently corresponding with a foundation that aims to fund rapid research on Covid-19 outcomes in India. Please reach out if you're interested in contributing funding.