Historical patterns and trends
We identify patterns and trends in rabi cereal production with an updated annual time series from 1966 to 2017 of district-level crop area and production data28. The data separates rabi sorghum from kharif sorghum. Wheat is only a rabi crop. To harmonize analysis of the time series, we use the apportioned dataset with historical district boundaries available for 196129 for all analyses. District boundaries have shifted and divided considerably over the time period. Currently, there are 741 districts, increasing from 324 in the dataset for 1961 boundaries.
Sensitivity to historical and future temperature
To assess the historical temperature sensitivity of rabi cereal yields, we use a mixed-modeling framework following the approach of20,30,31. The analysis focuses on the 101 districts (historical boundaries) where both cereals were grown to maintain consistency in the range of climate conditions and other factors affecting yields. These common districts exclude the main wheat-growing region in northwest India and include the main area of sorghum cultivation in central India (Supplemental Fig. 1).
The two cereals differ in the timing of their growing seasons, which also vary in different locations (Supplemental Table 3)32. Excluding crop calendars from stations at Punjab and Himachal Pradesh, which are outside the 101 common districts, the first week of the wheat growing season varies among other stations from November 7 to Nov 21 and the last week from March 14 to April 11. For rabi sorghum, the first week varies from September 19 to October 3 and the last week from January 1 to February 14. Actual planting and harvesting dates can vary from these crop calendars. To include as much of the growing season as possible, for this analysis we consider the wheat growing season in the 101 common districts to be from November 7 to April 11 and the sorghum season to be September 19 to February 14. Sowing and emergence occur for sorghum in the first stage and for wheat in the second overlapping stage, while grain filling occurs in the overlapping stage for sorghum and the third stage for wheat.
From gridded maximum daily temperature at 1 × 1 degree resolution33 and precipitation data at 0.25 by 0.25 resolution34, we extract mean maximum temperature and precipitation for each district and each day between September 19 and April 11 for 1966–1967 to 2016–2017 (corresponding to harvest years 1966 and 2017 in the crop data). We obtain the median of the maximum daily temperature and the total precipitation for each district for three stages for each year: sorghum only from September 19 to November 6 (stage 1); overlapping sorghum and wheat from November 7 to February 13 (stage 2); and wheat only from February 14 to April 11 (stage 3). Statistics to obtain spatial means for districts were carried out with the R package “exactextractr”. We chose maximum daily temperature rather than mean or minimum based on wheat’s sensitivity to high temperatures.
Using only years in which both sorghum and wheat were produced in the district, sorghum yields were modeled as:
$$f\left({Y}_{s}\right)={\alpha sT}_{stage1}+ {\beta sT}_{stage2 }+ {\gamma sP}_{stage2} + {\delta sS}_{1}+ {\varepsilon sS}_{2}+\left(1|\text{d}\right)+\left(1|\text{t}\right),$$ (1)
and wheat yields were modeled as:
$$f\left({Y}_{w}\right)={\beta wT}_{stage2}+{\gamma wP}_{stage2 }+{\mu wT}_{stage3}+{\theta wP}_{stage3} +{\delta wS}_{1}+{\varepsilon wS}_{2}+\left(1|\text{d}\right)+\left(1|\text{t}\right),$$ (2)
where \({Y}_{s} \text{ and } {Y}_{w}=\text{sorghum and wheat yield respectively}\); \({T}_{stage1 , }{T}_{stage2 , }\text{ and } {T}_{stage2 }=\text{median of maximum daily temperature in stages }1, 2,\text{ and }3;\) \({P}_{stage2 }\text{ and } {T}_{stage2 }=\text{total precipitation in stages }2\text{ and }3;\) \({S}_{1}\text{ and } {S}_{2}=\text{proportion of sand and silt},\text{ respectively};\) \(\alpha s, \beta s, \gamma s, \delta s \text{ and } \varepsilon s \text{ are coefficients for the sorghum model};\) \(\beta w, \gamma w, \mu w, \theta w, \delta w, \text{ and } \varepsilon w \text{ are coefficients for the wheat model}\); and \(\left(1|\text{d}\right) \text{ and }\left(1|\text{t}\right) \text{represent random effects for district and time }\left(\text{year}\right)\text{ respectively}\).
The random effect for year accounts for increases in yield over time due to management, inputs and cultivars, which is clearly evident in the data especially for wheat. We use the random effect for year rather than detrending the data due to incomplete time series for all districts and because the trends in yields are not linear. Inclusion of quadratic terms for precipitation and temperature did not improve the models so we included only the linear terms (Supplemental Table 4).
We do not include precipitation for stage 1 in the sorghum model because precipitation and temperature were co-linear for that stage (Supplemental Table 5). We also excluded proportion of clay soil due to co-linearity with proportions of sand and silt. Variance Inflation Factors for all variables are less than 2 for the models in Eqs. (1) and (2). Models were run with R package “lme4” using the “glmer” function. To determine p-value for the model coefficients, we used the “p_value” function in the “parameters” package.
To compare predicted and observed yields using alternative sources of climate data, we obtained historical climate data for 2010–2015 (the most recent available) from two models from the Climate Model Intercomparison Project (CMIP6) multi-ensemble model collection of models35,36. We selected these models because they are the only models from CMIP6 that provide all of the variables required for calculating crop water requirements. The models are the lower-complexity CNRM-CM6-1-HR at 0.25 by 0.25 degree resolution37 and higher-complexity, second generation CNRM-ESM2-1 at 1 by 1 degree resolution38. These models fall in the mid-range of climate sensitivity compared with other CMIP6 models39 and represent the Indian monsoon with reasonable accuracy (approximately within two standard deviations of historical rainfall)40.
Using the predict () function in R from the models and daily maximum temperatures and precipitation processed the same way as the climate data used to construct the models, we compared predicted and actual yields for 2010–2015. Adjusted R2 values for correlations between district-level actual and predicted yield values are between 0.55 and 0.87 from both sorghum and wheat models and root mean square errors (rmse) are between 0.084 and 0.182 (Supplemental Table 6). Because rmse values were generally lower using climate data from CNRM-ESM2-1 model, we show these results in the main text and results from the CNRM-CM6-1-HR model in Supplemental Material.
To explore the sensitivity of rabi sorghum and wheat yields to future climate, we again predicted yields based on the yield models from Eqs. (1) and (2) and maximum daily temperatures and precipitation obtained from the highest emissions Shared Socioeconomic Pathway (SSP5-8.5) from the climate models. The predicted yields were for the base time period (average for 2010–2015) and three future time periods: 5 year averages for 2028–2032, 2038–2042, and 2048–2052. We used the historical climate data provided with the climate models for the base time period rather than the data used to generate the model to ensure consistency in resolution and other factors that might lead to spurious results. The yields predicted from the models and climate data are intended to test sensitivity to possible future climate rather than to predict actual yield, which depends on management, varieties, pests, and many other factors in addition to climate.
Data sources are all publically available and are listed in Supplemental Table 7.
Water requirements
Reference evapotranspiration for each of the stages (Sept 19–Nov 6; Nov 7–Feb 13; and Feb 14–April 11) was calculated for each of the time periods (2010–2015, 2028–2032, 2032–2042, 2048–2052) for each district. First we calculate monthly means for reference evapotranspiration using the United Nation’s Penman–Monteith equation41 with parameter values obtained from the CMIP-6 climate models:
$${ET}_{0}=\frac{0.408\Delta \left({R}_{n}-G\right)+\gamma \frac{900}{T+273}{u}_{2}\left({e}_{s}-{e}_{a}\right)}{\Delta +\gamma (1+0.34{u}_{2})},$$ (3)
where \({ET}_{0}\) = reference evapotranspiration (mm/day); \({R}_{n}\) = net radiation at the crop surface (MJ/m2/day); G = soil heat flux density (MJ/m2/day); T = air temperature at 2 m height (℃); \({u}_{2}\) = wind speed at 2 m height (m/s); \({e}_{s}\) = saturation vapour pressure (kPa); \({e}_{a}\) = actual vapour pressure (kPa); \({e}_{s}\)-\({e}_{a}\) = saturation vapour pressure deficit (kPa); \(\Delta\) = slope vapour pressure curve (kPa/℃); and \(\gamma\) = psychrometric constant (kPa/℃) (see Supplemental Table 8 for derivation of parameters).
Water requirements for wheat and sorghum in each stage were calculated from the reference evapotranspiration and crop coefficients and curves according to:
$${ET}_{c,j}=\sum_{i}^{Days}{ET}_{0,i}\times {k}_{c,i},$$ (4)
where \({ET}_{c,j}\)= water requirement for crop c and crop development stage j and \({k}_{c,i}\) = crop coefficient for crop c and day i from Table 1 and Fig. 1 in Ref.42.
The crop water requirement (ETc) represents mm of water required to keep the plant from experiencing water stress. To estimate water footprints (volume of water required per unit of production in m3/tonne) we divided the crop water requirement for each district by yield obtained from the yield model in Eqs. (1) and (2).