SoundCam: A Dataset for Finding Humans Using Room Acoustics
SoundCam provides a resource for researchers in acoustics and machine learning to study human presence and movement using room acoustics. The contribution is incremental: it builds on existing RIR datasets by adding real-world measurements.
The authors address the lack of real-world datasets of room impulse responses (RIRs) with systematic variation in human positions. They introduce SoundCam, a dataset of 5,000 RIRs and 2,000 music recordings from three rooms, which enables tasks such as human detection and tracking.
A room's acoustic properties are a product of the room's geometry, the objects within the room, and their specific positions. A room's acoustic properties can be characterized by its impulse response (RIR) between a source and listener location, or roughly inferred from recordings of natural signals present in the room. Variations in the positions of objects in a room can effect measurable changes in the room's acoustic properties, as characterized by the RIR. Existing datasets of RIRs either do not systematically vary positions of objects in an environment, or they consist of only simulated RIRs. We present SoundCam, the largest dataset of unique RIRs from in-the-wild rooms publicly released to date. It includes 5,000 10-channel real-world measurements of room impulse responses and 2,000 10-channel recordings of music in three different rooms, including a controlled acoustic lab, an in-the-wild living room, and a conference room, with different humans in positions throughout each room. We show that these measurements can be used for interesting tasks, such as detecting and identifying humans, and tracking their positions.
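To make the role of the RIR concrete, the following is a minimal sketch in Python with NumPy. It is a toy model, not the authors' method and not drawn from the SoundCam data. It assumes a synthetic RIR built from a direct-path impulse plus exponentially decaying noise, and it models a person as a single added reflection. The function names (synthetic_rir, add_reflection, rir_distance) and all parameter values are hypothetical, chosen only for illustration. The sketch shows two ideas from the abstract: a recording can be modeled as a source signal convolved with the RIR, and a change in the room produces a measurable change in the RIR.

```python
import numpy as np

def synthetic_rir(n_samples, decay, seed):
    """Toy RIR: a direct-path impulse followed by exponentially decaying noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples)
    rir = 0.1 * rng.standard_normal(n_samples) * np.exp(-t / decay)
    rir[0] = 1.0  # direct path from source to listener
    return rir

def add_reflection(rir, delay, gain):
    """Hypothetical perturbation: one extra reflection standing in for a person."""
    out = rir.copy()
    out[delay] += gain
    return out

def rir_distance(a, b):
    """Relative L2 distance between two RIRs of equal length."""
    return np.linalg.norm(a - b) / np.linalg.norm(a)

# Two room states: empty, and with one extra reflecting object (assumed values).
empty = synthetic_rir(2048, decay=300.0, seed=0)
occupied = add_reflection(empty, delay=120, gain=0.3)

# A recording is modeled as the dry source signal convolved with the RIR.
dry = np.random.default_rng(1).standard_normal(4096)
rec_empty = np.convolve(dry, empty)
rec_occupied = np.convolve(dry, occupied)

# The perturbation shows up as a nonzero distance between the two RIRs.
print(rir_distance(empty, occupied))
```

Because convolution is linear, the difference between the two simulated recordings is a delayed, scaled copy of the dry signal. Real rooms and real human bodies change the RIR in far more complex ways, which is why measured data such as SoundCam is needed.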