## Abstract

In this study, a maximum entropy copula-based frequency analysis (MECFA) method is developed through integrating maximum entropy, copulas and frequency analysis into a general framework. The advantages of MECFA are that the marginal modeling requires no assumption and joint distribution preserves the dependence structure of drought variables. MECFA is applied to assessing bivariate drought frequency in the Kaidu River Basin, China. Results indicate that the Kaidu River Basin experienced 28 drought events during 1958–2011, and drought inter-arrival time is 10.8 months. The average duration is 6.2 months (severity 4.6), and the most severe drought event lasts for 35 months (severity 41.2) that occurred from June 1977 to March 1980. Results also disclose that hydrological drought index (HDI) 1 is suitable for drought frequency analysis in the target year of return periods of 5 and 10, HDI 3, HDI 6 and HDI 12 are fit for the target year of return periods of 20, 50 and 100. The joint return period can be used as the upper bound of the target return period, and the joint return period that either duration or severity reaches the drought threshold can be used as the lower bound of the target return period.

## HIGHLIGHTS

An MECFA is developed for bivariate drought risk analysis.

Hydrological drought index is developed for drought identification.

MECFA provides a new way to build marginal distribution.

MECFA is applied to the Kaidu River Basin (KRB), China.

KRB suffered 28 drought events during 1958–2011.

## INTRODUCTION

Drought is a natural hazard with negative impacts on human lives and economic activities such as crop yield reduction, water resource shortage and ecological degradation (Bong & Richard 2020; Hassan *et al.* 2020). Over the past decades, the percentage of the world's land area affected by severe drought has doubled, and drought occurrence shows an upward trend by 2.69% per 50 years on Asian arid land since 1900 (Sternberg 2011; Stocker *et al.* 2013). In China, meteorological disaster losses account for more than 70% of all natural disaster losses, half of which are caused by drought; droughts lead to a loss of 39.2 billion kilograms of grains every year, accounting for 1.47% of the national GDP (Zheng *et al.* 2013). Occurrence frequency is the main criterion of drought risk in practice. Therefore, frequency analysis is conducive to strengthening drought prevention, which is also a basic work of drought risk management (Kim & Jehanzaib 2020).

Drought occurrence is approached from different characteristics including duration, severity, intensity and inter-arrival time. Duration and severity, as primary variables in estimating other characteristics, are two basic indexes for drought management (Jehanzaib *et al.* 2019). Many researchers have analyzed drought frequency in terms of single variables, which disclose that different drought characteristics (duration and severity in particular) indicate a significant dependence on each other (Montaseri & Amirataee 2018; Fan *et al.* 2020; Wu *et al.* 2021). Xu *et al.* (2015) concluded that univariate analysis cannot reveal the interaction between variables, which is inadequate to characterize the probabilistic nature of a drought. Compared with univariate analysis, drought frequency analysis based on the joint probability can provide more valuable information for drought risk management. Copula theory, as a multivariate descriptive tool for probability analysis, has been widely applied to hydrology and meteorology (Jew *et al.* 2020; Li *et al.* 2020a, 2020b; Chebana & Ouarda 2021). Copula can model two or more variables based on whatever distribution to get the joint distribution, which can reveal more interaction information between variables than univariate analysis (Huang *et al.* 2017; Li *et al.* 2020a, 2020b; Zhang *et al.* 2021). Foo *et al.* (2019) described the correlation and dependency between drought variables through the trivariate copula model and further analyzed the joint return periods to describe the drought properties. Zhang *et al.* (2019) proposed a trivariate standardized drought index for characterizing droughts from an integrated meteorological, hydrological and agricultural point of view. Soumia & Doudja (2020) employed three Archimedean copula families to disclose the relationship between drought duration and severity, and built three-dimensional surfaces of drought severity, drought duration and cumulated percentage of the affected area for each return period.

Due to the significant interaction between drought variables, joint probability analysis requires more complex computing processes than univariate probability analysis (Zuo *et al.* 2017). Different variables usually follow different distributions in statistics, and the traditional marginal fittings do not have the accuracy of dependence structure modeling. Goodness-of-fit tests have to be conducted for the marginal distribution, and in some cases, more than one distribution performs, which makes it hard to select (Liu *et al.* 2019; He *et al.* 2020). Maximum entropy (ME) is a criterion to select the statistical characteristics of random variables that best match the objective situation (Jaynes 1982). The advantage of coupling ME with copula is that it provides a unified distribution formulation that could reduce the complex contrast work. Some scholars have applied the concept of ME to hydrological simulation, and the results obtained have shown a good fit in the practical applications (Huang *et al.* 2017; Ishikawa *et al.* 2018; Shrestha & Wang 2020). To improve the traditional marginal distribution fitting, it is necessary to introduce ME and copula into drought frequency analysis.

Therefore, this study aims to develop a maximum entropy copula-based frequency analysis (MECFA) method for bivariate drought risk, through integrating ME, copula and frequency analysis into a general framework. MECFA has the advantage that it can provide an effective way to model the marginal distribution of drought variables without a curve fitting hypothesis. The main contributions of this study are (i) this is the first attempt to develop a novel method (abbreviated as MECFA) that integrates techniques of ME, copula and frequency analysis for assessing bivariate drought risk of the Kaidu River Basin and (ii) based on different confidence levels, the results of drought frequency are capable of providing decision supports for drought management in the arid regions.

## METHODOLOGY

In this study, a novel MECFA method is developed by combining the ME principle, copula and frequency analysis into a general framework (Figure 1). In detail, drought variables are identified based on the drought index and runs-theory. The drought index includes four time scales: 1-month, 3-month, 6-month and 12-month. The correlation between drought duration and severity is tested by using Kendall and Spearman coefficients. If the variables are significantly correlated, ME is used to model marginal distribution, and then the fitting degree is verified by comparing with the other four parametric distributions. Three copula functions are employed to model the dependence structures of duration and severity, and the optimal copula is selected based on the goodness-of-fit test. Bivariate drought frequency is analyzed based on joint return periods and joint probabilities.

### Copula-based joint distribution

*F*is a two-dimensional joint distribution of dependent random variables of

_{xy}*X*and

*Y*, and

*F*and

_{x}*F*are marginal distributions of

_{y}*X*and

*Y*, then a copula

*C*exists as follows (Nelsen 2006):

A copula is constituted from different families, including elliptical (normal and Student-t), Archimedean (Clayton, Gumbel–Hougaard, Frank and Joe), extreme value (Husler–Reiss, Galambos, Tawn and t-EV) and others (Plackett and Farlie–Gumbel–Morgenstern) (Jehanzaib *et al.* 2019). Among these copula families, Archimedean is the most attractive one for multivariate hydrological risk analysis, because an Archimedean copula is easy to model and it can capture the dependence structure with several desirable properties. In this study, three Archimedean copulas are employed, including Frank, Clayton and Gumbel. Drought variables that satisfy the copula function should show a significant correlation. Thus, two correlation tests are utilized to check the correlation coefficient of duration and severity, including Kendall's *τ* and Spearman's *ρ*.

### ME-based marginal distribution

*X*by maximizing the Shannon entropy

*H*(

*x*). Four statistics can be regarded as constraints to derive the marginal distribution of drought variables, including mean, standard deviation, skewness and kurtosis. In Equations (2) and (3),

*x*is a drought variable;

*h*(

_{i}*x*) is a known function of variable

*X*and

*h*(

_{i}*x*) can be specified as

*h*

_{1}=

*x*,

*h*

_{2}=

*x*

^{2},

*h*

_{3}=

*x*

^{3}and

*h*

_{4}=

*x*

^{4}which are corresponding to the constraints of mean, standard deviation, skewness and kurtosis, respectively; is the expectation of

*h*(

_{i}*x*), which can be specified as . Equation (2) presents that PDF needs to satisfy the cumulative probability theorem.

*H*(x). Subject to the constraints, the Lagrange function can be expressed aswhere (

*λ*= 1, 2, …,

*m*) are the Lagrange multipliers;

*f*(

*x*) is the PDF of random drought variable

*X*;

*h*(

_{i}*x*) are known functions of

*X*and

*C*are constraints of

_{i}*f*(

*x*).

*f*(

*x*) can be obtained through maximizing

*L*, thus, the derivative of

*L*respect to

*f*(

*x*) should be zero. Then, the ME-based PDF can be expressed aswhere

*λ*

_{0}is the 0th Lagrange multiplier expressed as Equation (6).

*λ*

_{0}was proved to be a strictly convex function of (

*λ*

_{0},

*λ*

_{1}, …,

*λ*

_{m}_{)}. The PDF of

*X*can be obtained by substituting Equation (6) into Equation (5), which preserves the most important statistical moments, as shown in Equation (7). Finally, the cumulative distribution function (CDF) can be obtained as follows:

### Joint probability distribution of duration and severity

*F*(

_{D}*d*) and

*F*(

_{S}*s*) represent the exceedance probability of duration and severity, respectively, and

*C*represents the copula function. Bivariate drought conditional probabilities can be calculated by using copula. When duration (severity) exceeds a specified threshold

*d*′(

*s*′), the joint return period of severity (duration) can be obtained based on the following equations:where

*E*(

*L*) represents the average drought inter-arrival time;

*F*is the CDF of duration and

_{D}*F*is the CDF of severity. is the joint return period when

_{S}*D*>

*d*and

*S*>

*s*and is the joint return period when

*D*>

*d*or

*S*>

*s*.

In this study, the parameter of the marginal distribution was estimated by using maximum likelihood estimation (MLE). MLE is the preferred full-parametric method for multidimensional parameter estimation because its approach is close to the true value and it is easier to implement. The copula dependence parameter (*θ*) was determined through the maximization of the likelihood function. Finally, the Akaike information criterion (AIC) and root mean-squared error (RMSE) were applied to test the goodness of fit of the copula models (Akaike 1987). The model with the lowest AIC, Bayesian Information Criterion (BIC) and RMSE values represents the optimal copula.

### Drought index and runs-theory

*et al.*2011). SPI is used to monitor drought conditions under different time scales. The formulation of the SPI (based on the Gamma function) is expressed aswhere

*α*is the formal parameter (

*α*> 0),

*β*is the scale parameter (

*β*> 0) and

*x*is the amount of precipitation, with

*x*varying according to

*α*and

*β*. By referring to the SPI, SRI is used to measure the relative loss of surface water (Shukla & Wood 2008). The HDI that takes the weighted sum of SPI and SRI is expressed as (Zhou

*et al.*2014)where

*a*and

*b*take the ratio of the average of SPI/SRI corresponding to the slight drought and the historical minimum, respectively. HDI is employed to analyze the drought frequency under different time scales. For example, HDIs 1–6 can reﬂect variation in precipitation and runoff in a short time, such as the dynamic change of soil moisture that has an important impact on agricultural production. HDIs 9–24 can reﬂect the evolution of long-term water resources, such as groundwater supply and surface runoff.

Yevjevich (1969) proposed runs-theory to identify drought characteristics. A run is defined as a portion of time series of variable *X _{t}*, in which all values are either below (a negative run) or above (a positive run) a selected truncation level of

*X*

_{0}. Consequently, two main components of a drought event are identified: Drought duration (

*D*) (defined as a continuous period when the HDI was below

_{i}*X*

_{0}or a single period when the HDI was below

*X*

_{2}) and drought severity (

*S*) (indicated as the cumulative values of the HDI within the drought). In this study, a drought event in one period can be identified as two cases: (1) HDI value is less than −1.0 or (2) continuous values (HDI value of all consecutive months without interruption) are between −0.5 and −1.0.

_{i}## CASE STUDY

The Kaidu River Basin is located in the central southern slope of the Tianshan Mountains, Xinjiang province of China, with a geographic position between 42°43′ and 43°21′N and from 82°58′ to 86°05′E (Figure 2). The basin is characterized by low and irregular precipitation, high temperature and evaporation in summer, and notable drought period (Wang & Huang 2015). The average annual temperature is −4.3°C and the extreme minimum temperature is −48.1 °C. The Kaidu River Basin has an average annual precipitation of less than 500 mm and a mean annual evaporation of 1,157 mm, which is a typical arid and semi-arid area with frequent drought disasters. The climate change in Xinjiang is basically in line with the global trend, with the average temperature rising 0.02°C per year in the Kaidu River Basin. Since the 1980s, drought in the Kaidu River Basin has intensified due to climate change and human activities, and many areas have experienced continuous severe drought time since 2005. The water shortage caused by drought has become an important factor restricting the local social and economic development. Therefore, drought frequency analysis can be helpful to regional drought resistance and disaster mitigation. In this study, precipitation and runoff data, which were collected from Bayanbulak and Dashankou stations, were used (1958–2011). Bayanbulak station is located upstream of the Kaidu River Basin, and Dashankou station is located in downstream. Data can be downloaded from the National Centers for Environment Information (https://gis.ncdc.noaa.gov/maps/) and Global Runoff Data Centre (https://www.bafg.de/GRDC/EN/01GRDC/13dtbse/databasenode.html) (Figure 4).

## RESULTS AND DISCUSSION

### Identification of drought variables

Figure 3 illustrates the monthly frequency distribution of HDI under four time scales. There are no obvious regularity alternations of dry (negative value) and wet (positive value) periods during 1958–2011. The amount of drought events and the mean of drought-arrival time represent the general drought conditions of the Kaidu River Basin, and the statistical parameters of HDI under four time scales, including minimum value, maximum value, mean, standard deviation, skewness and kurtosis of drought duration and severity are depicted in Table 1. There are more drought events and shorter average drought inter-arrival time when a short time scale is used. Under HDI 1, the two most severe drought events occurred from August 1984 to January 1985 (*S* = 5.42), July 1986 to April 1987 (*S* = 5.88). Under HDI 3, the three most severe drought events occurred from February to October 1983 (*S* = 6.56), May 1985 to February 1986 (*S* = 8.10) and April to November 1986 (*S* = 7.58). Under HDI 6, the four most severe drought events occurred from December 1978 to October 1979 (*S* = 12.81), March to December 1983 (*S* = 8.61), August 1984 to February 1985 (*S* = 7.64) and July 1985 to May 1987 (*S* = 24.43). Under HDI 12, the four most severe drought events occurred from June 1967 to May 1968 (*S* = 10.51), May 1974 to April 1976 (*S* = 21.41), June 1977 to July 1978 (*S* = 16.79) and August 1984 to May 1987 (*S* = 41.18). Severe drought events have been happening since 1967, and the most severe drought event occurred in 1983. The initial year of severe drought under the long time scale was earlier than the short time scale, because several continuous slight drought events were combined into one severe drought. Based on the statistical results of four time scales, drought began to worsen after the 1970s.

Drought characteristics . | Statistical parameters . | HDI 1 . | HDI 3 . | HDI 6 . | HDI 12 . |
---|---|---|---|---|---|

Frequency of drought | 32 | 35 | 28 | 16 | |

Inter-arrival time (month) | 10.0 | 9.4 | 10.8 | 12.9 | |

Duration (month) | Min | 1.0 | 1.0 | 1.0 | 2.0 |

Max | 10.0 | 10.0 | 23.0 | 35.0 | |

Mean | 3.5 | 4.3 | 5.4 | 11.6 | |

St.dev. | 2.2 | 2.2 | 4.7 | 8.7 | |

Skew | 1.1 | 0.9 | 2.6 | 1.3 | |

Kurt | 1.2 | 0.3 | 8.6 | 1.9 | |

Severity (dimensionless) | Min | 0.9 | 0.8 | 0.9 | 0.9 |

Max | 5.9 | 8.1 | 24.4 | 41.2 | |

Mean | 2.2 | 3.3 | 3.6 | 9.3 | |

St.dev. | 1.2 | 1.9 | 3.7 | 10.1 | |

Skew | 1.3 | 0.9 | 2.9 | 2.2 | |

Kurt | 1.7 | 0.1 | 10.5 | 5.6 |

Drought characteristics . | Statistical parameters . | HDI 1 . | HDI 3 . | HDI 6 . | HDI 12 . |
---|---|---|---|---|---|

Frequency of drought | 32 | 35 | 28 | 16 | |

Inter-arrival time (month) | 10.0 | 9.4 | 10.8 | 12.9 | |

Duration (month) | Min | 1.0 | 1.0 | 1.0 | 2.0 |

Max | 10.0 | 10.0 | 23.0 | 35.0 | |

Mean | 3.5 | 4.3 | 5.4 | 11.6 | |

St.dev. | 2.2 | 2.2 | 4.7 | 8.7 | |

Skew | 1.1 | 0.9 | 2.6 | 1.3 | |

Kurt | 1.2 | 0.3 | 8.6 | 1.9 | |

Severity (dimensionless) | Min | 0.9 | 0.8 | 0.9 | 0.9 |

Max | 5.9 | 8.1 | 24.4 | 41.2 | |

Mean | 2.2 | 3.3 | 3.6 | 9.3 | |

St.dev. | 1.2 | 1.9 | 3.7 | 10.1 | |

Skew | 1.3 | 0.9 | 2.9 | 2.2 | |

Kurt | 1.7 | 0.1 | 10.5 | 5.6 |

### Marginal distribution of drought variables

The correlation of two drought duration and severity needs to be validated before modeling the marginal distribution and the joint probability distribution. Kendall's *τ* and Spearman's *ρ* variables are 0.621 and 0.779 in HDI 1, 0.889 and 0.962 in HDI 3, 0.852 and 0.949 in HDI 6 and 0.799 and 0.917 in HDI 12, respectively. Results show that the duration and severity are significantly correlated under four time scales. The identification and determination of the optimal marginal distribution of each variable are the precondition of modeling joint probability distribution. In this study, the ME principle was used to fit the distribution of drought duration and severity. Table 2 presents the parameter estimation of marginal distribution, which was determined by the conjugate gradient method. The entropy values of duration and severity are 2.09 and 1.13 in HDI 1, 4.72 and 1.97 in HDI 3, 4.32 and 3.70 in HDI 6 and 5.70 and 2.28 in HDI 12. In order to prove the reliability of ME-based distribution, another four commonly-used probability distributions were also applied to fit the marginal distribution, including exponential, gamma, logistic and lognormal, and the shape and scale parameters of commonly-used distributions were determined by the MLE method. The goodness-of-fit test (AIC) was applied to pick out the optimal marginal distribution.

HDI . | λ_{0}. | λ_{1}. | λ_{2}. | λ_{3}. | λ_{4}. | Entropy . | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|

D
. | S
. | D
. | S
. | D
. | S
. | D
. | S
. | D
. | S
. | D
. | S
. | |

1 | 4.16 | 3.75 | −2.98 | −4.61 | 1.03 | 2.42 | −0.13 | −0.46 | 0.01 | 0.03 | 2.09 | 1.13 |

3 | 5.37 | 3.88 | −4.86 | −2.61 | 4.88 | 0.78 | −1.38 | −1.03 | 0.74 | 0.95 | 4.72 | 1.97 |

6 | 7.52 | 4.06 | −4.75 | −2.83 | 2.91 | 2.14 | −1.49 | −0.05 | 0.13 | 0.38 | 4.32 | 3.70 |

12 | 8.38 | 5.21 | −4.29 | −3.86 | 0.71 | 4.06 | −0.05 | −4.75 | 0.95 | 1.62 | 5.70 | 2.28 |

HDI . | λ_{0}. | λ_{1}. | λ_{2}. | λ_{3}. | λ_{4}. | Entropy . | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|

D
. | S
. | D
. | S
. | D
. | S
. | D
. | S
. | D
. | S
. | D
. | S
. | |

1 | 4.16 | 3.75 | −2.98 | −4.61 | 1.03 | 2.42 | −0.13 | −0.46 | 0.01 | 0.03 | 2.09 | 1.13 |

3 | 5.37 | 3.88 | −4.86 | −2.61 | 4.88 | 0.78 | −1.38 | −1.03 | 0.74 | 0.95 | 4.72 | 1.97 |

6 | 7.52 | 4.06 | −4.75 | −2.83 | 2.91 | 2.14 | −1.49 | −0.05 | 0.13 | 0.38 | 4.32 | 3.70 |

12 | 8.38 | 5.21 | −4.29 | −3.86 | 0.71 | 4.06 | −0.05 | −4.75 | 0.95 | 1.62 | 5.70 | 2.28 |

According to Table 3, by comparing four HDIs, ME-based distribution shows good performance on duration and severity marginal fitting due to the stability in goodness of fit. An AIC value of the ME-based distribution is always the minimum under any time scale (bold font). For other distributions, the optimal solutions of different variables under different time scales would not be identical. If ME-based distribution was not used, more computational work would be needed. Therefore, the ME principle can help derive a unified distribution function without a hypothesis, which makes the fitting process more efficient. The probability distribution and cumulative distribution were compared with empirical histograms, and the generated cumulative distribution was compared with the cumulative distribution, which is estimated by the Gringorten plotting position formula. By comparing cumulative distribution with the empirical CDF, we can find that the marginal cumulative distribution is able to capture the shape of empirical CDF, which validates the reliability of fitting processes (Figure 4).

Drought variables . | Distribution . | HDI 1 . | HDI 3 . | HDI 6 . | HDI 12 . | ||||
---|---|---|---|---|---|---|---|---|---|

Parameters . | AIC . | Parameters . | AIC . | Parameters . | AIC . | Parameter . | AIC . | ||

Duration | ME | / | 151.12 | / | 147.66 | / | 153.38 | / | 168.72 |

Exponential | 2.82 | 175.07 | 3.87 | 187.21 | 5.32 | 163.75 | 6.12 | 180.13 | |

Gamma | 1.63, 1.38 | 167.32 | 2.07, 1.50 | 174.55 | 1.46, 2.92 | 160.30 | 1.68, 3.36 | 176.33 | |

Logistic | 2.38, 1.11 | 161.93 | 1.06, 0.28 | 168.61 | 4.44, 2.18 | 175.68 | 5.11, 2.51 | 193.25 | |

Lognormal | 0.79, 0.56 | 161.02 | 1.10, 0.49 | 167.76 | 1.28, 0.58 | 154.10 | 1.47, 0.67 | 169.51 | |

Severity | ME | / | 124.79 | / | 156.39 | / | 141.14 | / | 155.25 |

Exponential | 2.50 | 207.08 | 4.22 | 221.59 | 5.08 | 195.16 | 5.84 | 214.68 | |

Gamma | 1.31, 1.52 | 162.36 | 1.36, 1.98 | 173.66 | 1.03, 3.59 | 156.66 | 1.18, 4.13 | 172.33 | |

Logistic | 1.87, 1.10 | 191.40 | 0.83, 0.33 | 163.69 | 3.66, 2.27 | 178.64 | 4.21, 2.61 | 196.50 | |

Lognormal | 0.64, 0.58 | 147.42 | 0.90, 0.59 | 163.55 | 1.06, 0.71 | 149.54 | 1.22, 0.82 | 164.49 |

Drought variables . | Distribution . | HDI 1 . | HDI 3 . | HDI 6 . | HDI 12 . | ||||
---|---|---|---|---|---|---|---|---|---|

Parameters . | AIC . | Parameters . | AIC . | Parameters . | AIC . | Parameter . | AIC . | ||

Duration | ME | / | 151.12 | / | 147.66 | / | 153.38 | / | 168.72 |

Exponential | 2.82 | 175.07 | 3.87 | 187.21 | 5.32 | 163.75 | 6.12 | 180.13 | |

Gamma | 1.63, 1.38 | 167.32 | 2.07, 1.50 | 174.55 | 1.46, 2.92 | 160.30 | 1.68, 3.36 | 176.33 | |

Logistic | 2.38, 1.11 | 161.93 | 1.06, 0.28 | 168.61 | 4.44, 2.18 | 175.68 | 5.11, 2.51 | 193.25 | |

Lognormal | 0.79, 0.56 | 161.02 | 1.10, 0.49 | 167.76 | 1.28, 0.58 | 154.10 | 1.47, 0.67 | 169.51 | |

Severity | ME | / | 124.79 | / | 156.39 | / | 141.14 | / | 155.25 |

Exponential | 2.50 | 207.08 | 4.22 | 221.59 | 5.08 | 195.16 | 5.84 | 214.68 | |

Gamma | 1.31, 1.52 | 162.36 | 1.36, 1.98 | 173.66 | 1.03, 3.59 | 156.66 | 1.18, 4.13 | 172.33 | |

Logistic | 1.87, 1.10 | 191.40 | 0.83, 0.33 | 163.69 | 3.66, 2.27 | 178.64 | 4.21, 2.61 | 196.50 | |

Lognormal | 0.64, 0.58 | 147.42 | 0.90, 0.59 | 163.55 | 1.06, 0.71 | 149.54 | 1.22, 0.82 | 164.49 |

### Joint probability distribution of drought variables

Three commonly-used Archimedean copulas were applied to modeling the joint probability distribution of duration and severity, including Clayton, Frank and Gumbel. The parameters of copula functions were estimated by using a local optimization algorithm and max-likelihood estimation. The optimal parameter would be determined by referring to the minimum value of RMSE. Finally, the optimal copula function would be determined based on the minimum value of AIC and the maximum value of max-likelihood. Table 4 shows the quantitative information about the estimated deviations.

HDI . | Goodness of fit . | Clayton . | Frank . | Gumbel . |
---|---|---|---|---|

1 | Par-local | 2.68 | 8.04 | 2.39 |

RMSE-Local | 0.12 | 0.10 | 0.10 | |

M-likelihood | 107.46 | 136.15 | 130.82 | |

AIC | 1.43 | 0.90 | 1.07 | |

3 | Par-local | 5.45 | 14.94 | 3.56 |

RMSE-Local | 0.14 | 0.14 | 0.14 | |

M-likelihood | 67.44 | 76.32 | 76.33 | |

AIC | 1.84 | 1.85 | 1.83 | |

6 | Par-local | 7.56 | 20.91 | 5.27 |

RMSE-Local | 0.26 | 0.16 | 0.21 | |

M-likelihood | 78.03 | 91.52 | 90.33 | |

AIC | 6.99 | 2.71 | 4.23 | |

12 | Par-local | 10.29 | 25.18 | 6.83 |

RMSE-Local | 0.34 | 0.18 | 0.26 | |

M-likelihood | 74.13 | 88.94 | 85.36 | |

AIC | 11.91 | 3.34 | 6.53 |

HDI . | Goodness of fit . | Clayton . | Frank . | Gumbel . |
---|---|---|---|---|

1 | Par-local | 2.68 | 8.04 | 2.39 |

RMSE-Local | 0.12 | 0.10 | 0.10 | |

M-likelihood | 107.46 | 136.15 | 130.82 | |

AIC | 1.43 | 0.90 | 1.07 | |

3 | Par-local | 5.45 | 14.94 | 3.56 |

RMSE-Local | 0.14 | 0.14 | 0.14 | |

M-likelihood | 67.44 | 76.32 | 76.33 | |

AIC | 1.84 | 1.85 | 1.83 | |

6 | Par-local | 7.56 | 20.91 | 5.27 |

RMSE-Local | 0.26 | 0.16 | 0.21 | |

M-likelihood | 78.03 | 91.52 | 90.33 | |

AIC | 6.99 | 2.71 | 4.23 | |

12 | Par-local | 10.29 | 25.18 | 6.83 |

RMSE-Local | 0.34 | 0.18 | 0.26 | |

M-likelihood | 74.13 | 88.94 | 85.36 | |

AIC | 11.91 | 3.34 | 6.53 |

Bold values present the parameters of the optimal copula.

Frank, Gumbel and Clayton copulas are applicable for modeling the dependence of drought variable pairs, because all the RMSE values are less than 0.5. Results reveal that Frank is the optimal copula in 1-month, 6-month and 12-month time scales, with minimum AIC values of 0.90, 0.16 and 0.18, and maximum max-likelihood values of 136.15, 91.52 and 88.94, respectively. Under the 3-month time scale, Gumbel would be the optimal copula with minimum AIC (1.83) and maximum max-likelihood (76.33) values. Besides, the values of RMSE, max-likelihood and AIC between Frank and Gumbel copulas are very close, which indicated that both Frank and Gumbel copulas would be applied to model the joint probability distribution of duration and severity in a 3-month time scale. Consequently, Frank would be the optimal copula for all time scales, and both Frank and Gumbel perform better fitting than Clayton copula. Furthermore, Figure 5 shows the joint plot of duration against severity for observed data under different time scales, which indicate that 1-month and 3-month time scales, with more drought events (more scatters), show better correlation than other time scales.

Figure 6 shows the joint cumulative distribution of drought duration and severity under four time scales, and the contours of three copula functions are illustrated in Figure 7, which represents the events with the same probability of occurrence. The tail effect usually exists in the copula function with different tail coefficients, mainly including the upper tail dependence coefficient (UTC) and the lower tail dependence coefficient (LTC). The UTC has a greater effect on dependence structure than the LTC. In bivariate drought frequency analysis, if tail dependence structure among drought variables is not well retained by a selected copula, it would provide a high uncertainty in the estimation of extreme value, which would cause big errors in joint probability calculation. Frank and Clayton copulas do not have UTC, and only the UTC of Gumbel copula needs to be estimated. In this study, UTC between drought duration and severity is estimated by using both parametrical and nonparametric estimation. For example, under a 3-month time scale, UTC and empirical UTC were compared to verify the adequacy of a copula. UTCs of 1-, 3-, 6- and 12-month time scales are 0.87, 0.75, 0.72 and 0.54, and the corresponding empirical UTCs are 0.89, 0.74, 0.73 and 0.65, respectively. There is no significant difference between UTC and empirical UTC, which indicates that the applied copulas have extreme value characteristics similar to the measured samples. Figures 6 and 7 also show that there exists UTC between drought duration and severity, which denotes the probability that one drought variable exceeds a certain threshold under some conditions while the other drought variable exceeds that threshold. The verification of the tail dependence coefficient can provide dependable information for joint drought frequency analysis.

### Joint frequency analysis

Table 5 presents the univariate and bivariate return periods of drought variables under four time scales. Generally, duration and severity would increase when the target year of return periods (*T*) increases, and the corresponding joint return periods would also increase. is longer than the corresponding target year of the return period, while is less. For example, under HDI 3 (*T* = 50 years), the univariate drought duration is 9.7 months and severity is 8.8. The joint response of duration and severity results in the return period of = 76.1 (*D* > 9.7 and *S* > 8.8) years and = 13.7 years (*D* > 9.7 or *S* > 8.8). By comparing with the univariate return period, increases 52.2% and decreases 72.6%. Similar observations are also found in HDI 1, 6 and 12 time scales.

HDI . | T (years)
. | Duration (months) . | Severity . | . | . |
---|---|---|---|---|---|

1 | 5 | 5.3 | 3.2 | 6.2 | 2.3 |

10 | 6.8 | 4.0 | 22.9 | 3.9 | |

20 | 8.2 | 4.4 | 55.5 | 7.1 | |

50 | 8.9 | 5.2 | 124.1 | 17.4 | |

100 | 9.9 | 5.8 | 279.2 | 35.8 | |

3 | 5 | 6.1 | 5.1 | 4.3 | 2.7 |

10 | 7.2 | 6.3 | 8.6 | 5.5 | |

20 | 8.9 | 7.4 | 25.0 | 9.2 | |

50 | 9.7 | 8.8 | 76.1 | 13.7 | |

100 | 10.7 | 9.7 | 152.3 | 27.5 | |

6 | 5 | 7.5 | 5.8 | 7.1 | 2.2 |

10 | 9.8 | 6.4 | 23.2 | 3.9 | |

20 | 18.2 | 13.3 | 25.6 | 5.3 | |

50 | 22.6 | 17.6 | 43.2 | 9.1 | |

100 | 25.0 | 19.9 | 150.4 | 27.3 | |

12 | 5 | 18.5 | 15.0 | 6.1 | 2.3 |

10 | 22.1 | 22.1 | 17.3 | 4.2 | |

20 | 32.2 | 36.8 | 27.3 | 6.9 | |

50 | 43.9 | 42.5 | 86.6 | 10.4 | |

100 | 50.3 | 45.1 | 207.4 | 54.6 |

HDI . | T (years)
. | Duration (months) . | Severity . | . | . |
---|---|---|---|---|---|

1 | 5 | 5.3 | 3.2 | 6.2 | 2.3 |

10 | 6.8 | 4.0 | 22.9 | 3.9 | |

20 | 8.2 | 4.4 | 55.5 | 7.1 | |

50 | 8.9 | 5.2 | 124.1 | 17.4 | |

100 | 9.9 | 5.8 | 279.2 | 35.8 | |

3 | 5 | 6.1 | 5.1 | 4.3 | 2.7 |

10 | 7.2 | 6.3 | 8.6 | 5.5 | |

20 | 8.9 | 7.4 | 25.0 | 9.2 | |

50 | 9.7 | 8.8 | 76.1 | 13.7 | |

100 | 10.7 | 9.7 | 152.3 | 27.5 | |

6 | 5 | 7.5 | 5.8 | 7.1 | 2.2 |

10 | 9.8 | 6.4 | 23.2 | 3.9 | |

20 | 18.2 | 13.3 | 25.6 | 5.3 | |

50 | 22.6 | 17.6 | 43.2 | 9.1 | |

100 | 25.0 | 19.9 | 150.4 | 27.3 | |

12 | 5 | 18.5 | 15.0 | 6.1 | 2.3 |

10 | 22.1 | 22.1 | 17.3 | 4.2 | |

20 | 32.2 | 36.8 | 27.3 | 6.9 | |

50 | 43.9 | 42.5 | 86.6 | 10.4 | |

100 | 50.3 | 45.1 | 207.4 | 54.6 |

From a short to a long time scale, drought duration and severity would increase when the target year of the return period remains unchanged. For example, when the target year of return period is 50 years, drought duration would be 8.9, 9.7, 22.6 and 43.9 months, and severity would be 5.2, 8.8, 17.6 and 42.5, respectively. The corresponding joint return period would also increase except for HDI 1. From HDI 1 to 12, the values of are 124.1, 76.1, 43.2 and 86.6 years, and the values of are 17.4, 13.7, 9.1 and 10.4, which show that the differences among the results of HDI 3, 6 and 12 are not significant. Because the 1-month time scale is more subdivided than 3-, 6- and 12-month time scales, a drought event with a longer duration and larger severity would be identified to be several slight drought events. The amount of identified drought events is more than other time scales, while drought duration is shorter and severity is less, and the joint return period becomes longer. Therefore, in this study, HDI 1 would be suitable for drought frequency analysis in the short-term period, and HDI 3, HDI 6 and HDI 12 would be suitable in the long-term period. Overall, univariate analysis does not provide adequate information about the drought frequency associated with two or more drought variables, while bivariate analysis can reveal the drought frequency with different durations and severities.

By considering five target years of return periods (*T* = 5, 10, 20, 50 and 100), the mean values of drought frequency under four time scales are summarized. The variation of joint drought frequency with different return periods is shown clearly, which can be used as a reference for drought planning and management. For example, drought duration is 9.35 months and severity is 7.28 when the target year of the return period is 5 years. In this scenario, (*D* > 9.35 and *S* > 7.28) would be 5.93 years and (*D* > 9.35 or *S* > 7.28) would be 2.38 years. Drought duration is 21.28 months and severity is 18.53 when the target year of return period is 100 years. In this scenario, (*D* > 21.28 and *S* > 18.53) would be 82.50 years, and (*D* > 21.28 or *S* > 18.53) would be 12.65 years. Generally, target years of 5 and 10 would be suitable for short-term drought planning, and target years of 20, 50 and 100 would be suitable for long-term drought planning.

Therefore, the return period obtained from univariate frequency analysis is longer than the joint return period in scenario , and less than the joint return period in scenario . This indicates that HDI 1 would be suitable for drought frequency analysis in 5 and 10 target years of return periods, HDI 3, HDI 6 and HDI 9 would be suitable in 20, 50 and 100 target years of return periods. In practical application, can be considered as the upper bound of the target year of the return period and can be considered as the lower bound. In the Kaidu River Basin, drought duration and severity are always at high levels. If the drought risk is judged only by the absolute value of a single variable and the interactions between different drought variables are ignored, the results will probably be inaccurate. Drought risk would be underestimated (the return period is overestimated) if only joint return period is considered, whereas drought risk would be overestimated (the return period is underestimated) if only the joint return period is considered.

## CONCLUSIONS

In this study, the MECFA method has been developed for drought risk analysis. MECFA is capable of identifying drought variables, modeling marginal and joint distribution of drought variables. The main advantages of MECFA are that the marginal modeling requires no distribution assumption and joint probability distribution preserves the dependence structure between drought duration and severity. MECFA is applied to analyze bivariate drought frequency in the Kaidu River Basin (during 1958–2011). Compared with the parametric probabilistic distributions, ME distribution is proven to be suitable for the marginal fitting of drought duration and severity. Compared with three Archimedean copulas, Frank is the optimal copula function for modeling the dependence structure of duration and severity.

Some findings can be summarized: (1) the Kaidu River Basin experienced about 28 drought events in 1958–2011, and drought inter-arrival time is 10.8 months; (2) the mean of drought duration is 6.2 months (severity 4.6), and the most severe drought event lasts for 34 months (severity 41.2) which occurred from August 1984 to May 1987; (3) HDI 1 is suitable for drought frequency analysis in the target year of return periods of 5 and 10, and HDI 3, HDI 6 and HDI 12 are effective for the target year of return periods of 20, 50 and 100, respectively; (4) the decrease in precipitation and runoff is the main cause of drought in the Kaidu River Basin; (5) the joint return period that both duration and severity reach the drought threshold can be used as the upper bound of the target return period, and the joint return period that either duration or severity reaches the drought threshold can be used as the lower bound of the target return period.

This study is the first attempt to develop a MECFA method for drought frequency analysis in the Kaidu River Basin, and there are still some potential works of the developed method. The development of a drought event usually evolves both static and dynamic features. Analyzing dynamic features, such as drought centroid and displacement direction, will help to reflect the drought situation comprehensively. Therefore, it is desired to integrate spatial–temporal analysis into MECFA to explore the impact of spatiotemporal variation on drought risk.

## ACKNOWLEDGEMENTS

This research was supported by the National Key Research & Development Program of China (2016YFA0601502) and the National Natural Science Foundation of China (51779008). The authors are grateful to the editors and the anonymous reviewers for their insightful comments and suggestions.

## DATA AVAILABILITY STATEMENT

All relevant data are available from http://dx.doi.org/10.3390/w12082293.