Amruta Vidwans

Jun 21, 2019

Understanding and Classifying Cultural Music Using Melodic Features: Case of Hindustani, Carnatic and Turkish Music

Amruta Vidwans, Prateek Verma, Preeti Rao

We present a melody-based classification of musical styles that exploits pitch- and energy-based characteristics derived from the audio signal. We chose three prominent musical styles, Hindustani, Carnatic and Turkish music, which have improvisation as an integral part and share similar melodic principles, themes, and concert structure. Listeners of one or more of these genres can discriminate between them based on the melodic contour alone. To validate our hypothesis that style distinction is evident in the melody, listening tests were carried out using melodic attributes alone, on melodic pieces that are similar with respect to raga/makam and with any instrumentation cue removed. Our method is based on finding a set of highly discriminatory features, derived from musicology, that capture distinct characteristics of the melodic contour. It exploits the behavior of pitch contour transitions, the presence of micro-tonal notes, and the nature of variations in the vocal energy. The automatically classified style labels are found to correlate well with subjective listening judgments, as verified by statistical tests comparing the labels from subjective and objective judgments. The melody-based features, when combined with timbre-based features, were seen to improve the classification performance.
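As a rough illustration of what a feature for "the presence of micro-tonal notes" might look like, the sketch below measures how often a pitch contour sits away from the equal-tempered semitone grid. This is not the authors' feature set: the function names, the 25-cent tolerance, the convention that unvoiced frames are marked with 0 Hz, and the assumption that the tonic is already known are all hypothetical choices made for the example.

```python
import math


def cents_above_tonic(f0_hz, tonic_hz):
    """Convert a frequency in Hz to cents relative to the tonic."""
    return 1200.0 * math.log2(f0_hz / tonic_hz)


def microtonal_fraction(f0_track, tonic_hz, tolerance_cents=25.0):
    """Fraction of voiced frames lying off the equal-tempered grid.

    f0_track: per-frame pitch estimates in Hz; frames with a value
    <= 0 are treated as unvoiced (an assumption of this sketch).
    tolerance_cents: hypothetical threshold; frames farther than this
    from the nearest semitone are counted as micro-tonal.
    """
    voiced = [f for f in f0_track if f > 0]
    if not voiced:
        return 0.0
    off_grid = 0
    for f in voiced:
        cents = cents_above_tonic(f, tonic_hz)
        # Distance to the nearest multiple of 100 cents (one semitone).
        deviation = abs(cents - 100.0 * round(cents / 100.0))
        if deviation > tolerance_cents:
            off_grid += 1
    return off_grid / len(voiced)
```

A real system would first need a pitch tracker and a tonic estimate to produce `f0_track` and `tonic_hz`; both are taken as given here.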

Jul 30, 2018

Audio segmentation based on melodic style with hand-crafted features and with convolutional neural networks

Amruta Vidwans, Nachiket Deo, Preeti Rao

We investigate methods for the automatic labeling of the taan section, a prominent structural component of the Hindustani Khayal vocal concert. The taan contains improvised raga-based melody rendered in a highly distinctive style of rapid pitch and energy modulations of the voice. We propose computational features that capture these specific high-level characteristics of the singing voice in the polyphonic context. The extracted local features are used to achieve classification at the frame level via a trained multilayer perceptron (MLP) network, followed by grouping and segmentation based on novelty detection. We report high accuracies with reference to musician-annotated taan sections across artists and concerts. We also compare the performance obtained by the compact specialized features with frame-level classification via a convolutional neural network (CNN) operating directly on audio spectrogram patches for the same task. While the relatively simple architecture we experiment with does not quite attain the classification accuracy of the hand-crafted features, it performs well above chance and offers interesting insights into the ability of the network to learn discriminative features effectively from labeled data.
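To make the step from frame-level decisions to labeled sections concrete, here is a minimal sketch that turns a sequence of binary frame labels (1 for taan, 0 otherwise) into time segments. It deliberately swaps in a simpler technique than the one named in the abstract: majority-vote smoothing followed by run grouping, rather than novelty detection. The function names, window size, and hop duration are assumptions made for illustration only.

```python
def majority_smooth(labels, half_window=2):
    """Smooth binary frame labels with a sliding majority vote.

    Each frame takes the majority label of the frames within
    half_window on either side; ties resolve to 0.
    """
    smoothed = []
    n = len(labels)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = labels[lo:hi]
        smoothed.append(1 if 2 * sum(window) > len(window) else 0)
    return smoothed


def frames_to_segments(labels, hop_seconds, min_duration=0.0):
    """Group consecutive positive frames into (start, end) times in seconds.

    hop_seconds: time between successive frames (assumed constant).
    min_duration: segments shorter than this are discarded.
    """
    segments = []
    start = None
    # A trailing 0 acts as a sentinel so a final open segment is closed.
    for i, lab in enumerate(list(labels) + [0]):
        if lab == 1 and start is None:
            start = i
        elif lab != 1 and start is not None:
            t0, t1 = start * hop_seconds, i * hop_seconds
            if t1 - t0 >= min_duration:
                segments.append((t0, t1))
            start = None
    return segments
```

For example, smoothing `[0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0]` with `half_window=1` fills the one-frame gap and drops the isolated positive frame, so that with a 0.5 s hop the result is the single segment `(0.5, 4.0)`.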
