An innovative SAM-ViT based tool for the automatic detection of litter items on sandy beaches

crossref(2024)

Abstract
Machine learning (ML) techniques in the field of Computer Vision have proven to be effective tools for the automatic detection of beach litter (BL) items in high-resolution UAV images. This study, carried out within the framework of the RiPARTI project (funded by the Apulia Region), proposes an innovative approach that combines aero-photogrammetric surveys with a newly proposed ML tool. A series of experiments was conducted with a Mask Region-based Convolutional Neural Network (Mask-RCNN) algorithm, using an image dataset acquired during UAV flights over different coastal sites in Italy, Portugal, and Spain. Preliminary detection experiments were conducted using three BL item categories: “Bottles”, “Worked Wood” and “Nets”. Subsequently, a comparison with algorithms available in QGIS software confirmed the great potential of Computer Vision techniques. Indeed, in previous studies (Sozio et al., 2023), the Mask-RCNN-based algorithm outperformed the algorithms available in QGIS, but its performance was still not sufficient to yield a definitive ML tool for automatic BL detection. The novel ML tool proposed here uses Segment Anything (SAM) (Kirillov et al., 2023), a model developed by Meta AI and trained on a large-scale dataset, as the segmentation algorithm, and a Vision Transformer (ViT) for the classification task. A first experiment was conducted with a dataset derived from UAV images acquired at five different sites, i.e., Capitolo and Torre Guaceto beaches (Italy), Leirosa beach (Portugal), Valdelagrana (Spain), and Cala del Cefalo beach (Italy). Aero-photogrammetric surveys were carried out at a different flight height for each site, so the final image resolution ranges from 0.3 cm/pixel to 0.7 cm/pixel. Moreover, the color of the sand (background) differs among sites, which could affect the performance of the segmentation process. Orthomosaics in .tiff format were split into 1000-pixel square tiles and segmented by SAM.
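The tiling step can be illustrated with a minimal sketch. It is not the authors' code: the orthomosaic size used here is hypothetical, and keeping smaller partial tiles at the right and bottom edges is an assumption, since the abstract does not say how edges were handled. The sketch also converts the stated resolutions into the ground footprint of one tile.

```python
from typing import Iterator, Tuple

TILE_SIZE_PX = 1000  # tile edge length stated in the abstract


def tile_windows(
    width: int, height: int, tile: int = TILE_SIZE_PX
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (col_off, row_off, tile_width, tile_height) windows covering a raster.

    Assumption: edge tiles that do not fill a full square are kept as
    smaller windows rather than padded or discarded.
    """
    for row_off in range(0, height, tile):
        for col_off in range(0, width, tile):
            yield (
                col_off,
                row_off,
                min(tile, width - col_off),
                min(tile, height - row_off),
            )


def tile_footprint_m(tile_px: int, gsd_cm_per_px: float) -> float:
    """Ground length in metres covered by one tile edge at a given resolution."""
    return tile_px * gsd_cm_per_px / 100.0


# Hypothetical orthomosaic of 2500 x 1800 pixels (not a figure from the study).
windows = list(tile_windows(2500, 1800))
print(len(windows), "tiles; last window:", windows[-1])

# Resolutions quoted in the abstract: 0.3 and 0.7 cm/pixel.
for gsd in (0.3, 0.7):
    print(f"{gsd} cm/pixel: tile edge of about {tile_footprint_m(TILE_SIZE_PX, gsd):.1f} m")
```

Under these assumptions, a 1000-pixel tile spans roughly 3 m of beach at the finest resolution and roughly 7 m at the coarsest, so the same litter item occupies a noticeably different number of pixels from site to site.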
SAM performed a panoptic segmentation that produced 450 masks, both concave and convex, corresponding to objects identified in the images. These masks were catalogued according to 11 labels (Bottles, Nets, Polystyrene, Worked wood, Vials, Buckets, Building waste, Ethernit, Sand, Vegetation, and Water), covering both the most common litter categories and natural features. Subsequently, the masks thus gathered were used to train ViT, the classification algorithm, and to perform the test phase, which was carried out on 450 masks with a training/validation/test split of 7/10, 1/10 and 2/10, respectively. A preliminary experiment produced output images classified by ViT with an accuracy of 0.93 and an F1 score of 0.6. The data considered for this last experiment are more complex in terms of number of classes and amount of data, so the performance is better in perspective, also considering the different image resolutions and background textures. Finally, the identified items are georeferenced in a projected reference system. The method shows very reliable performance on the BL detection task and could represent a useful and definitive approach for assessing BL distribution, as well as for identifying the main accumulation zones, thereby enabling the development of tailored coastal management actions.
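The georeferencing step can be sketched as a mapping from mask pixels to projected coordinates. This is an illustration under stated assumptions, not the authors' implementation: it assumes a north-up orthomosaic described by a six-parameter affine geotransform in the GDAL ordering, and the geotransform values, tile offset, and mask pixels below are all hypothetical.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int]  # (col, row) index inside a tile


def mask_centroid(pixels: Sequence[Pixel]) -> Tuple[float, float]:
    """Centroid of a mask in tile pixel coordinates, measured at pixel centres."""
    n = len(pixels)
    mean_col = sum(col for col, _ in pixels) / n
    mean_row = sum(row for _, row in pixels) / n
    return mean_col + 0.5, mean_row + 0.5  # shift from pixel index to pixel centre


def pixel_to_projected(
    col: float, row: float, gt: Sequence[float]
) -> Tuple[float, float]:
    """Apply an affine geotransform (GDAL ordering) to orthomosaic pixel coordinates."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


# Hypothetical values: upper-left corner in a projected system (metres),
# 0.5 cm/pixel resolution, no rotation, negative pixel height for north-up.
geotransform = (700000.0, 0.005, 0.0, 4500000.0, 0.0, -0.005)
tile_offset = (2000, 1000)  # (col_off, row_off) of the tile in the orthomosaic
mask_pixels = [(10, 20), (11, 20), (10, 21), (11, 21)]  # toy 2 x 2 mask

local_col, local_row = mask_centroid(mask_pixels)
easting, northing = pixel_to_projected(
    tile_offset[0] + local_col, tile_offset[1] + local_row, geotransform
)
print(f"item centroid: E={easting:.3f} m, N={northing:.3f} m")
```

Because the masks come from tiles rather than the full orthomosaic, the tile offset has to be added back before applying the transform. Otherwise every item would be placed relative to the corner of its own tile.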