
Collapsed column identification model based on K-means SMOTE and random forest algorithm

HAO Shuai, WANG Huaixiu, LIU Zuiliang

Citation: HAO Shuai, WANG Huaixiu, LIU Zuiliang. Collapsed column identification model based on K-means SMOTE and random forest algorithm[J]. Safety in Coal Mines, 2023, 54(2): 174-180.


  • Abstract: A single seismic attribute gives ambiguous and uncertain results when used to identify collapsed columns, and imbalanced sample data biases the identification accuracy. To address both problems, a binary classification model for collapsed column identification was built on K-means SMOTE and random forest; it identifies collapsed columns through joint analysis of multiple seismic attributes. The study area is the southern mining area of the east wing of the first mining district of Shanxi Xinyuan Coal Company. Twelve seismic attributes that interpreters had extracted with 3D seismic exploration technology serve as sample features, and the collapsed columns that have actually been exposed serve as sample labels; together they form a multi-attribute seismic dataset. Attribute selection by correlation analysis, cluster analysis evaluation, and random forest importance analysis retained 6 relatively independent seismic attributes as the final sample features. The K-means SMOTE algorithm was then used to balance the dataset, giving 8 992 samples after oversampling, of which 6 294 form the training set and 2 698 form the test set. The random forest binary classification model, built in Python, can reach an accuracy of 87% in predicting collapsed columns. In a comparison with 3 common machine learning classification algorithms, the model identified collapsed columns with higher accuracy.
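    The balancing step described in the abstract can be illustrated with a minimal, standard-library sketch of the K-means SMOTE idea: cluster the samples, keep clusters dominated by the minority class, and interpolate new minority samples inside them. This is not the authors' code. The function names `kmeans` and `kmeans_smote`, the toy two-feature data, and all parameter values are hypothetical, and the sketch simplifies the published algorithm by using a fixed "at least half minority" cluster filter and by serving clusters round-robin instead of weighting them by sparsity.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous centre
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


def kmeans_smote(X, y, minority=1, k=2, k_neighbors=3, seed=0):
    """Add synthetic minority samples until the two classes are balanced."""
    rng = random.Random(seed)
    labels = kmeans(X, k, rng)

    # Keep clusters that are at least half minority and hold >= 2 minority
    # points (assumed filter; the published method uses a tunable threshold).
    pools = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        mino = [X[i] for i in idx if y[i] == minority]
        if len(mino) >= 2 and 2 * len(mino) >= len(idx):
            pools.append(mino)

    n_min = sum(1 for v in y if v == minority)
    n_needed = (len(y) - n_min) - n_min
    if not pools or n_needed <= 0:
        return list(X), list(y)

    X_new, y_new = list(X), list(y)
    for j in range(n_needed):
        pool = pools[j % len(pools)]  # round-robin over clusters (simplified)
        a = rng.choice(pool)
        # SMOTE step: pick one of the nearest minority neighbours of `a`
        # inside the same cluster and interpolate between the two points.
        neighbours = sorted((p for p in pool if p is not a),
                            key=lambda p: math.dist(a, p))[:k_neighbors]
        b = rng.choice(neighbours)
        t = rng.random()
        X_new.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
        y_new.append(minority)
    return X_new, y_new


# Toy data (not from the paper): 21 majority samples near the origin and
# 5 minority samples far away, each with two made-up "attributes".
X = [(i / 6, j / 2) for i in range(7) for j in range(3)]
X += [(1000.0, 1000.0), (1001.0, 1000.0), (1000.0, 1001.0),
      (1001.0, 1001.0), (1000.5, 1000.5)]
y = [0] * 21 + [1] * 5

X_bal, y_bal = kmeans_smote(X, y)
print(y_bal.count(0), y_bal.count(1))
```

    Because synthetic points are interpolated only between minority samples of the same cluster, they stay inside regions the minority class already occupies. In practice a library implementation, for example `KMeansSMOTE` from imbalanced-learn followed by scikit-learn's `RandomForestClassifier`, would be a more natural choice; the abstract only states that the model was built in Python and does not name the libraries used.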
Publication history
  • Published: 2023-02-19
