Predicting Climate Variability over the Indian Region Using Data Mining Strategies
This work addresses temperature prediction for the Indian region, offering an incremental improvement over a k-means and linear regression baseline.
The paper tackles climate variability prediction in India by using expectation maximization clustering to identify climate regions and a support vector machine to predict temperature in each region, achieving RMSEs of 1.19 in the Montane region and 0.89 in the Humid Subtropical region, compared with 2.9 and 0.95 from a k-means and linear regression baseline.
This paper proposes an approach that uses expectation maximization (EM) clustering to identify climate regions and a support vector machine to build a predictive model for each region. To minimize bias in the estimates, ten-fold cross-validation is adopted both for obtaining the clusters and for building the predictive models. The EM clustering identified all the zones of the Köppen classification over the Indian region. When used to predict temperature, the proposed strategy yielded an RMSE of $1.19$ in the Montane climate region and $0.89$ in the Humid Subtropical region, compared with $2.9$ and $0.95$, respectively, obtained using k-means and linear regression.
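The two-stage pipeline described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is available. This is not the authors' implementation: the synthetic data, the two-feature input, the number of regions, the RBF kernel, and the value of `C` are all assumptions made for illustration, and the helper names (`make_synthetic_data`, `cluster_regions`, `per_region_cv_rmse`) are hypothetical. A Gaussian mixture fitted by EM stands in for the EM clustering step, and only the regression step is cross-validated here, whereas the abstract states that cross-validation was also applied when obtaining the clusters.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import KFold
from sklearn.svm import SVR


def make_synthetic_data(n_per_region=100, seed=0):
    """Synthetic stand-in for climate records (NOT the paper's data)."""
    rng = np.random.default_rng(seed)
    # Two hypothetical regions with distinct feature means and responses.
    X_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(n_per_region, 2))
    X_b = rng.normal(loc=[4.0, 4.0], scale=0.5, size=(n_per_region, 2))
    y_a = 10.0 + 2.0 * X_a[:, 0] + rng.normal(scale=0.3, size=n_per_region)
    y_b = 25.0 - 1.5 * X_b[:, 1] + rng.normal(scale=0.3, size=n_per_region)
    return np.vstack([X_a, X_b]), np.concatenate([y_a, y_b])


def rmse(y_true, y_pred):
    """Root mean squared error between two 1-D arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def cluster_regions(X, n_regions=2, seed=0):
    """EM clustering via a Gaussian mixture; returns one label per sample."""
    gmm = GaussianMixture(n_components=n_regions, random_state=seed)
    return gmm.fit_predict(X)


def per_region_cv_rmse(X, y, labels, n_splits=10, seed=0):
    """Mean ten-fold cross-validated RMSE of an SVR fitted per region."""
    scores = {}
    for region in np.unique(labels):
        mask = labels == region
        X_r, y_r = X[mask], y[mask]
        kfold = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
        fold_errors = []
        for train_idx, test_idx in kfold.split(X_r):
            # Kernel and C are illustrative choices, not taken from the paper.
            model = SVR(kernel="rbf", C=10.0)
            model.fit(X_r[train_idx], y_r[train_idx])
            predictions = model.predict(X_r[test_idx])
            fold_errors.append(rmse(y_r[test_idx], predictions))
        scores[int(region)] = float(np.mean(fold_errors))
    return scores


X, y = make_synthetic_data()
labels = cluster_regions(X)
scores = per_region_cv_rmse(X, y, labels)
for region, score in sorted(scores.items()):
    print(f"region {region}: mean 10-fold RMSE = {score:.3f}")
```

Fitting one regressor per cluster lets each model specialise to the temperature behaviour of its own region, which is the motivation the abstract gives for clustering before prediction. The printed values depend on the synthetic data and are not comparable to the RMSEs reported in the paper.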