-
Hu Marilyne authoredHu Marilyne authored
Urban Sound Tagging Project

Introduction
In this project, you will develop a urban sound tagging system. Given a ten-second audio recording from some urban environment, it will return whether each of the following eight predefined audio sources is audible or not:
- engine
- machinery-impact
- non-machinery-impact
- powered-saw
- alert-signal
- music
- human-voice
- dog
This is a multi-label classification problem.
Context
The city of New York, like many others, has a "noise code". For reasons of comfort and public health, jackhammers can only operate on weekdays; pet owners are held accountable for their animals' noises; ice cream trucks may play their jingles while in motion, but should remain quiet once they've parked; blasting a car horn is restricted to situations of imminent danger. The noise code presents a plan of legal enforcement and thus mitigation of harmful and disruptive types of sounds.
In an effort towards reducing urban noise pollution, the engagement of citizens is crucial, yet by no means sufficient on its own. Indeed, the rate of complaints that are transmitted, in a given neighborhood, through a municipal service such as 3-1-1, is not necessarily proportional to the level of noise pollution in that neighborhood. In the case of New York City, the Department of Environmental Protection is in charge of attending to the subset of noise complaints which are caused by static sources, including construction and traffic. Unfortunately, statistical evidence demonstrates that, although harmful levels of noise predominantly affect low-income and unemployed New Yorkers, these residents are the least likely to take the initiative of filing a complaint to the city officials. Such a gap between reported exposure and actual exposure raises the challenge of improving fairness, accountability, and transparency in public policies against noise pollution.
Source: DCASE 2019 Task 5
Motivation
Noise pollution is one of the topmost quality of life issues for urban residents in the United States. It has been estimated that 9 out of 10 adults in New York City are exposed to excessive noise levels, i.e. beyond the limit of what the EPA considers to be harmful. When applied to U.S. cities of more than 4 million inhabitants, such estimates extend to over 72 million urban residents.
The objectives of SONYC (Sounds of New York City) are to create technological solutions for: (1) the systematic, constant monitoring of noise pollution at city scale; (2) the accurate description of acoustic environments in terms of its composing sources; (3) broadening citizen participation in noise reporting and mitigation; and (4) enabling city agencies to take effective, information-driven action for noise mitigation.
SONYC is an independent research project for mitigating urban noise pollution. One of its aims is to map the spatiotemporal distribution of noise at the scale of a megacity like New York, in real time, and throughout multiple years. To this end, SONYC has designed an acoustic sensor for noise pollution monitoring. This sensor combines a relatively high accuracy in sound acquisition with a relatively low production cost. Between 2015 and 2019, over 50 different sensors have been assembled and deployed in various areas of New York City. Collectively, these sensors have gathered the equivalent of 37 years of audio data.
Every year, the SONYC acoustic sensor network records millions of such audio snippets. This automated procedure of data acquisition, in its own right, gives some insight into the overall rumble of New York City through time and space. However, as of today, each SONYC sensor merely returns an overall sound pressure level (SPL) in its immediate vicinity, without breaking it down into specific components. From a perceptual standpoint, not all sources of outdoor noise are equally unpleasant. For this reason, determining whether a given acoustic scene comes in violation of the noise code requires, more than an SPL estimate in decibels, a list of all active sources in the scene. In other words, in the context of automated noise pollution monitoring, the resort to computational methods for detection and classification of acoustic scenes and events (DCASE) appears as necessary.
Source: SONYC project and DCASE 2019 Task 5.
Project overview
You will work in teams. Each team will compete with others to develop the best-performing urban sound tagging system. At the end of the project, all proposed systems will be ranked based on their performance.