DOI: 10.1029/2023wr036334 ISSN: 0043-1397

Causality Analysis and Prediction of Riverine Algal Blooms by Combining Empirical Dynamic Modeling and Machine Learning Techniques

Jing Tian, Gangsheng Wang, Daifeng Xiang, Sheng Huang, Wanyu Li

Abstract

River algal blooms have become a global environmental problem due to their large impact range and environmental hazards. However, the complex mechanisms underlying these blooms make prediction and prevention challenging. Here, we employed empirical dynamic modeling (EDM) and machine learning to reveal the causes of, and to predict, diatom blooms from 2003 to 2017 in the Han River of China. The diatom cell density ranged from 0.1 to 5.1 × 10⁷ cells L⁻¹, and algal blooms often lasted for 10 days with density exceeding 10⁷ cells L⁻¹. The EDM results indicated that, under consistently high nutrient concentrations, algal blooms were primarily regulated by eight environmental factors: water temperature in the Han River; water levels, flow velocities, and streamflow discharges in the Han River and the Yangtze River; and water level variation in the Han River. The poor performance (coefficient of determination R² < 0) of the multiple linear regression, EDM, and random forest models indicated the challenge of predicting algae density. Therefore, we used machine learning classification models to predict algal bloom occurrences. With resampling techniques to account for imbalanced data, machine learning models achieved perfect classification prediction (Kappa value = 1) of the 13 algae‐bloom events with a 10‐day lead time during a 15‐year period, providing an important reference for preemptive warning of riverine algal blooms.
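The two metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions of the coefficient of determination and Cohen's kappa; the function names and the toy numbers are hypothetical and are not taken from the study. It shows why R² < 0 means a model does worse than simply predicting the observed mean, and why Kappa = 1 means perfect agreement beyond chance.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def cohen_kappa(actual, predicted):
    """Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    n = len(actual)
    labels = set(actual) | set(predicted)
    # Fraction of samples where the two label sequences agree.
    p_observed = sum(a == p for a, p in zip(actual, predicted)) / n
    # Agreement expected by chance from the marginal label frequencies.
    p_expected = sum(
        (actual.count(k) / n) * (predicted.count(k) / n) for k in labels
    )
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one label occurs")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy data (hypothetical, not from the paper).
density_obs = [1.0, 2.0, 3.0]
density_pred = [3.0, 1.0, 2.0]           # worse than predicting the mean
bloom_actual = [0, 0, 0, 1, 0, 1]        # 1 = bloom, 0 = no bloom (imbalanced)
bloom_perfect = [0, 0, 0, 1, 0, 1]       # every period classified correctly
bloom_never = [0, 0, 0, 0, 0, 0]         # always predicts "no bloom"

print(r_squared(density_obs, density_pred))
print(cohen_kappa(bloom_actual, bloom_perfect))
print(cohen_kappa(bloom_actual, bloom_never))
```

In this toy case the "never bloom" classifier is correct in four of six periods yet scores a kappa of zero, which is why kappa is a more informative metric than raw accuracy when bloom events are rare.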
