Hybrid Anomaly Detection to Extract Extreme Events


The Earth is a complex, dynamic, and networked system. Over the past few decades, a large number of extreme weather and climate events have occurred in Europe and worldwide, causing infrastructure damage and many casualties. Agricultural drought and wildfires are among the most critical hazards in terms of frequency, severity, and impact on livelihoods. Detecting such extremes and anomalies is essential for mitigating impacts and putting prevention measures in place. However, these phenomena are difficult to predict because they are complex and depend on many factors. A vast suite of approaches has been developed to monitor and characterize, for example, agricultural drought, based on ground-based climatic data, soil moisture data, or a variety of remote-sensing drought proxies. The recently proposed Soil Moisture Agricultural Drought Index (SMADI) is a simple and intuitive index that determines agricultural drought events based on key remote-sensing indicators: land surface temperature (LST), vegetation indices (e.g., the NDVI), and surface soil moisture (SSM). While SMADI and other hand-crafted indices have been widely adopted in practice with good results, they often ignore the complex nonlinear and multidimensional relations among the variables. In recent years, statistical machine learning (ML) has been applied to Earth observation data, with positive results in problems such as classification and anomaly detection. Machine learning can cope with multivariate, multi-source data and allows anomalies to be detected automatically. There is a plethora of ML algorithms for anomaly detection (AD), ranging from simple histogram-based models to more advanced hierarchical density-based clustering algorithms. Each has its advantages and disadvantages, but they often need user expertise and process understanding to improve results.
In this work, we introduce a hybrid approach to drought detection that combines state-of-the-art ML anomaly detection methods with standard drought indices such as SMADI. We will illustrate its performance in the Earth System Data Lab (ESDL), a lightweight platform for Earth observation data analysis in the cloud. The ESDL allows algorithms to be evaluated on a wide range of harmonized products, including more than 40 variables spanning more than 10 years. The included variables account for atmospheric conditions, climate states, and the terrestrial biosphere. Advanced hybrid ML modeling makes it possible to extract anomalous events from the multivariate spatio-temporal information contained in the Earth datacubes. Algorithms will be compared in terms of accuracy, robustness, and computational efficiency on selected examples of droughts in Europe during the last decade.
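
To make the hybrid idea concrete, the following is a minimal sketch, not the method presented in the talk. It pairs a simple histogram-based anomaly score (one of the simple model families mentioned above) with a threshold on a drought index. The function names, the AND-combination rule, the quantile cutoff, and the synthetic data are all assumptions made for illustration. The index values are a hypothetical placeholder and are not computed with the SMADI formula.

```python
import numpy as np


def histogram_anomaly_score(X, bins=10):
    """Histogram-based anomaly score: sum over features of the negative
    log density of the bin each value falls in (higher = more anomalous)."""
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    scores = np.zeros(n_samples)
    for j in range(n_features):
        density, edges = np.histogram(X[:, j], bins=bins, density=True)
        # Locate each value's bin using the interior edges; clip as a guard
        # so the maximum value stays in the last bin.
        idx = np.clip(np.digitize(X[:, j], edges[1:-1]), 0, bins - 1)
        # Small constant avoids log(0) in case of an empty bin.
        scores += -np.log(density[idx] + 1e-12)
    return scores


def hybrid_flags(X, index_values, index_threshold, score_quantile=0.95, bins=10):
    """Hypothetical hybrid rule: flag a sample only if the ML score is in the
    upper tail AND the drought index exceeds its threshold."""
    scores = histogram_anomaly_score(X, bins=bins)
    ml_flag = scores >= np.quantile(scores, score_quantile)
    index_flag = np.asarray(index_values, dtype=float) >= index_threshold
    return ml_flag & index_flag, scores


# Synthetic stand-ins for three standardized indicators (e.g. LST, NDVI, SSM).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[-1] = [8.0, 8.0, 8.0]  # one injected extreme sample

# Placeholder index values (NOT SMADI): only the injected sample is elevated.
index_values = np.zeros(200)
index_values[-1] = 1.0

flags, scores = hybrid_flags(X, index_values, index_threshold=0.5)
print("flagged samples:", np.flatnonzero(flags))
```

Requiring agreement between the data-driven score and the index is only one possible way to combine them. Other combinations, such as using the index as an additional input feature, would fit the same description.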

Aug 9, 2019 12:00 AM
Frascati, Italy