Leveraging Hand-Crafted Features and Continuous DAS Data Stream for Automated Event Classification

Camille Huynh, Clément Hibert, Camille Jestin, Jean-Philippe Malet, Vincent Lanticq

crossref(2024)

Abstract
Distributed Acoustic Sensing (DAS), renowned for its high spatio-temporal resolution, offers a way to detect and identify a large variety of event sources that are hard to measure with conventional seismometers, such as anthropogenic or environmental sources. Given the large amount of data generated by DAS networks, detecting and cataloging such events is a challenge. Machine learning offers a way to automate this task. In this work, we implement a processing chain capable of quasi-real-time classification of anthropogenic and natural seismic event sources. Our approach works on continuous data streams to avoid dependency on a third-party detection algorithm. The preprocessing step consists of computing hand-crafted features that encapsulate the observed seismic signal characteristics into quantities suitable for source classification. These features include, among others, temporal and spatial standard deviation, kurtosis, skewness, the temporal power spectral density of the seismic signals, and cross-correlation and dynamic time warping of multiple seismic traces. The processing step performs classification using the XGBoost machine learning algorithm. XGBoost quantifies the contribution of each feature and the certainty of the produced classification, which helps to speed up the processing chain by using only discriminative features and to reduce the false alarm rate. The post-processing step, based on a Markov Random Field, accounts for spatial and temporal information redundancy. We tested the proposed processing chain at two scales: locally, with tests in a controlled environment, and regionally, for real-field event detection. Both datasets were recorded with a FEBUS A1-R DAS. The first dataset was obtained on the FEBUS Optics test bench for simulated anthropogenic seismic sources along a 600 m long optical fiber. The catalog includes six anthropogenic seismic sources denoted as footsteps, impacts, backhoe, compactor, and leaks.
The second dataset contains cataloged earthquakes and quarry blasts that were recorded in the Pyrenees along a 91 km long fiber optic cable between August 30 and September 20, 2022, with the support of TotalEnergies. The tests show that features related to the temporal content of the signal are sufficient to perform classification on the test bench, reaching an F1-score of 84% for streamed data and 88% after application of the post-processing algorithm. The processing chain also proves useful for real-field data analysis: 12 of the 13 earthquakes of magnitude above 0.4 were correctly detected despite natural and anthropogenic noise. The promising outcomes on both datasets indicate that the method is likely applicable to newly acquired data, but more data are needed to improve the robustness of the algorithm. Working on these two datasets highlights the difficulty of moving from events measured in controlled conditions to events acquired in the field. In particular, building a catalog from a continuous dataset is time-consuming and requires tools to identify events of interest. In future work, we aim to explore the potential of self-supervised learning to speed up the exploration of newly acquired DAS datasets.
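
As an illustration of the kind of hand-crafted temporal features mentioned in the abstract (standard deviation, skewness, kurtosis), the following is a minimal sketch, not the authors' implementation. It assumes the DAS record is available as a NumPy array of shape (channels, samples); the function name `window_features`, the window length, and the step size are hypothetical choices made for this example.

```python
import numpy as np


def window_features(das, win, step):
    """Sliding-window temporal features for a DAS record (illustrative sketch).

    das  : 2D array of shape (n_channels, n_samples), an assumed layout
    win  : window length in samples (hypothetical parameter)
    step : hop between consecutive windows in samples (hypothetical parameter)

    Returns an array of shape (n_channels, n_windows, 3) holding, per channel
    and window, the standard deviation, skewness, and excess kurtosis.
    """
    n_channels, n_samples = das.shape
    starts = range(0, n_samples - win + 1, step)
    feats = np.empty((n_channels, len(starts), 3))

    for j, s in enumerate(starts):
        seg = das[:, s:s + win]
        mean = seg.mean(axis=1, keepdims=True)
        std = seg.std(axis=1)
        # Avoid division by zero on flat (constant) windows.
        safe_std = np.where(std > 0, std, 1.0)
        z = (seg - mean) / safe_std[:, None]

        feats[:, j, 0] = std                          # temporal standard deviation
        feats[:, j, 1] = (z ** 3).mean(axis=1)        # skewness
        feats[:, j, 2] = (z ** 4).mean(axis=1) - 3.0  # excess kurtosis

    return feats
```

In a chain like the one described, such per-window feature vectors would be passed to the classifier; impulsive events tend to produce high kurtosis, which is one reason this statistic is a common choice for seismic source discrimination.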