Shitao Yin, Xiaochun Lin, Zhifeng Zhang, Xiang Li

A class-rebalancing self-training semisupervised learning for imbalanced data lithology identification

  • Geochemistry and Petrology
  • Geophysics

Lithologic identification plays a crucial role in petroleum geologic exploration, and machine learning (ML) has become increasingly prevalent in intelligent lithology identification in recent years. However, identifying lithologies is challenging because lithologic labels are scarce and the distribution of lithologies is imbalanced. To address both problems and obtain satisfactory lithologic identification results, this study investigates a class-rebalancing self-training (CReST) lithology identification framework. The framework takes logging data and a limited set of lithologic labels as input and classifies lithologies through the CReST approach. Four ML algorithms with high overall performance are selected from 25 common algorithms to build the CReST models, namely the bagging classifier, extra trees classifier, random forest classifier, and support vector classifier. The classification results of the models are compared and analyzed under three conditions. The experimental findings indicate that (1) under label scarcity, recognition quality varies greatly across categories depending on their sample counts; (2) under self-training (ST), overall performance improves, but the performance gap caused by category imbalance also widens; and (3) under the CReST framework, the model effectively resolves the identification problems caused by a lack of labels and an imbalanced category distribution. Specifically, the precision of identifying categories with fewer samples is improved by more than 20%.
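
The class-rebalancing step can be illustrated with a minimal sketch. The abstract does not give the sampling rule used in this study, so the code below assumes the rule from the general CReST method: with classes ranked by descending labeled count N_1 >= ... >= N_L, a pseudo-label predicted as class l is kept with probability (N_{L+1-l} / N_1) ** alpha, so rare classes keep most of their pseudo-labels and frequent classes keep few. The function names, the `alpha` default, and the lithology names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def crest_sampling_rates(labels, alpha=1.0):
    """Return a per-class keep rate for pseudo-labels.

    Assumed rule (general CReST, not confirmed for this study):
    rate of the class at rank l = (N_{L+1-l} / N_1) ** alpha,
    where classes are ranked from most to least frequent.
    """
    counts = Counter(labels)
    # Rank classes from most to least frequent; ties broken by name.
    ranked = sorted(counts, key=lambda c: (-counts[c], str(c)))
    n_max = counts[ranked[0]]
    num_classes = len(ranked)
    rates = {}
    for rank, cls in enumerate(ranked):
        # The class at the mirrored rank supplies the numerator.
        mirrored = ranked[num_classes - 1 - rank]
        rates[cls] = (counts[mirrored] / n_max) ** alpha
    return rates


def select_pseudo_labels(pseudo_labels, rates, seed=0):
    """Return indices of pseudo-labeled samples kept for the next round."""
    rng = random.Random(seed)
    return [i for i, y in enumerate(pseudo_labels) if rng.random() < rates[y]]


if __name__ == "__main__":
    # Hypothetical labeled set: 100 / 50 / 10 samples per lithology.
    labeled = ["sandstone"] * 100 + ["mudstone"] * 50 + ["limestone"] * 10
    rates = crest_sampling_rates(labeled)
    print(rates)
    print(select_pseudo_labels(["sandstone", "limestone", "mudstone"], rates))
```

With these hypothetical counts, the most frequent class keeps about one in ten of its pseudo-labels while the rarest class keeps all of them. In a full self-training loop, the kept samples would be added to the labeled set before the classifier is retrained.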
