Theory-plus-code documentation of the DEPAM workflow for soundscape description
This work addresses standardization and scalability challenges for the PAM community in oceanography, though it is incremental as it builds on existing workflows.
The authors tackle the lack of standardized and scalable tools in Passive Acoustic Monitoring (PAM) by proposing a theory-plus-code document for a classical analysis workflow and by implementing that workflow in Scala within the Spark/Hadoop frameworks, so that processing can scale out on cluster systems.
In the Big Data era, the PAM community faces two major challenges: the need for more standardized processing tools across its different applications in oceanography, and the need for more scalable, high-performance computing systems to process its ever-growing datasets efficiently. In this work we address both issues jointly. First, we propose a detailed theory-plus-code document of a classical analysis workflow for describing the content of PAM data, which we hope will be reviewed and adopted by as many PAM experts as possible so that it can become a standard. Second, we transpose this workflow into the Scala language within the Spark/Hadoop frameworks so that it can be directly scaled out on a multi-node cluster.
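To illustrate why such a workflow lends itself to scaling out, the following is a minimal sketch in plain Scala, not the DEPAM implementation. It assumes the recording is cut into fixed-size segments and that one feature is computed per segment. The object name `SegmentLevels`, the function names, and the choice of an RMS level in dB re full scale as the feature are all hypothetical and chosen only for illustration.

```scala
// Hypothetical sketch: names and the feature choice are illustrative assumptions,
// not taken from the DEPAM code base.
object SegmentLevels {

  // Root-mean-square level of one segment, in dB relative to an amplitude of 1.0.
  // An all-zero segment yields negative infinity.
  def rmsLevelDb(segment: Seq[Double]): Double = {
    val meanSquare = segment.map(x => x * x).sum / segment.length
    10.0 * math.log10(meanSquare)
  }

  // Split the signal into non-overlapping segments of `segmentSize` samples,
  // drop a trailing incomplete segment, and compute one level per segment.
  def levelsPerSegment(signal: Seq[Double], segmentSize: Int): Vector[Double] =
    signal
      .grouped(segmentSize)
      .filter(_.length == segmentSize)
      .map(rmsLevelDb)
      .toVector
}
```

Because each segment is processed independently of the others, the same per-segment function could in principle be applied through a `map` over a distributed collection of segments in Spark, which is the property that allows the computation to be spread across cluster nodes.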