Ranjan Sengupta

SD
15 papers
79 citations
Novelty 21%
AI Score 17

15 Papers

SD · Feb 11, 2021
A Fractal Approach to Characterize Emotions in Audio and Visual Domain: A Study on Cross-Modal Interaction

Sayan Nag, Uddalok Sarkar, Shankha Sanyal et al.

It is already known that both auditory and visual stimuli can convey emotions to the human mind to different extents. The strength or intensity of the emotional arousal varies depending on the type of stimulus chosen. In this study, we investigate emotional arousal in a cross-modal scenario involving both auditory and visual stimuli while studying their source characteristics. A robust fractal analytic technique called Detrended Fluctuation Analysis (DFA) and its 2D analogue have been used to characterize three (3) standardized audio and video signals, quantifying their scaling exponents corresponding to positive and negative valence. It was found that there is a significant difference in the scaling exponents corresponding to the two modalities. Detrended Cross Correlation Analysis (DCCA) has also been applied to decipher the degree of cross-correlation among the individual audio and visual stimuli. This is the first study of its kind, proposing a novel algorithm with which emotional arousal can be classified in a cross-modal scenario using only the source audio and visual signals, while also attempting a correlation between them.
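As a rough illustration of the cross-correlation step, the sketch below computes a detrended cross-correlation coefficient between two one-dimensional signals. The abstract does not say which DCCA-based measure the authors used, so this is an assumption: a simplified coefficient with non-overlapping windows and linear detrending. The function names, window size and synthetic test signals are all hypothetical.

```python
import random

def profile(signal):
    """Cumulative sum of the mean-subtracted signal (the 'profile')."""
    mean = sum(signal) / len(signal)
    out, total = [], 0.0
    for value in signal:
        total += value - mean
        out.append(total)
    return out

def residuals(segment):
    """Residuals of a segment after removing its least-squares straight line."""
    n = len(segment)
    mx = (n - 1) / 2
    my = sum(segment) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    slope = sum((x - mx) * (y - my) for x, y in enumerate(segment)) / sxx
    return [y - (my + slope * (x - mx)) for x, y in enumerate(segment)]

def dcca_coefficient(x, y, scale):
    """Detrended covariance of x and y, normalized by their detrended variances."""
    px, py = profile(x), profile(y)
    n_windows = min(len(px), len(py)) // scale
    cov_xy = var_x = var_y = 0.0
    for w in range(n_windows):
        rx = residuals(px[w * scale:(w + 1) * scale])
        ry = residuals(py[w * scale:(w + 1) * scale])
        cov_xy += sum(a * b for a, b in zip(rx, ry))
        var_x += sum(a * a for a in rx)
        var_y += sum(b * b for b in ry)
    return cov_xy / (var_x * var_y) ** 0.5

# Synthetic stand-ins for two stimuli (not data from the study)
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(1024)]
y = [rng.gauss(0, 1) for _ in range(1024)]
rho_same = dcca_coefficient(x, x, 32)
rho_opposite = dcca_coefficient(x, [-v for v in x], 32)
rho_independent = dcca_coefficient(x, y, 32)
```

The coefficient lies between -1 and 1: identical signals give 1, a sign-flipped copy gives -1, and unrelated signals are expected to fall near 0.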

SD · Feb 11, 2021
Language Independent Emotion Quantification using Non linear Modelling of Speech

Uddalok Sarkar, Sayan Nag, Chirayata Bhattacharya et al.

At present, emotion extraction from speech is a very important issue due to its diverse applications. Hence, it becomes necessary to obtain models that take into consideration a person's speaking style, vocal tract information, timbral qualities and other congenital information regarding their voice. Our speech production system is a nonlinear system, like most other real-world systems. Hence the need arises for modelling speech information using nonlinear techniques. In this work we have modelled our articulation system using nonlinear multifractal analysis. The multifractal spectral width and scaling exponents essentially reveal the complexity associated with the speech signals taken. The multifractal spectra are well distinguishable in the low fluctuation region for different emotions. The source characteristics have been quantified with the help of different nonlinear models such as Multifractal Detrended Fluctuation Analysis and Wavelet Transform Modulus Maxima. The results obtained from this study show very good emotion clustering.

SD · Feb 1, 2021
Neural Network architectures to classify emotions in Indian Classical Music

Uddalok Sarkar, Sayan Nag, Medha Basu et al.

Music is often considered the language of emotions. It has long been known to elicit emotions in human beings, and thus categorizing music based on the type of emotions it induces is a very intriguing topic of research. When the task is to classify emotions elicited by Indian Classical Music (ICM), it becomes much more challenging because of the inherent ambiguity associated with ICM. The fact that a single musical performance can evoke a variety of emotional responses in the audience is implicit in the nature of ICM renditions. With the rapid advancements in the field of Deep Learning, the Music Emotion Recognition (MER) task is becoming more and more relevant and robust, and hence can be applied to one of the most challenging test cases, i.e. classifying emotions elicited by ICM. In this paper we present a new dataset called JUMusEmoDB, which presently has 400 audio clips (30 seconds each), where 200 clips correspond to happy emotions and the remaining 200 clips correspond to sad emotions. For supervised classification purposes, we have used 4 existing deep Convolutional Neural Network (CNN) based architectures (resnet18, mobilenet v2.0, squeezenet v1.0 and vgg16) on the corresponding music spectrograms of the 2000 sub-clips (where every clip was segmented into 5 sub-clips of about 5 seconds each), which contain both time and frequency domain information. The initial results are quite inspiring, and we look forward to setting the baseline values for the dataset using these architectures. This type of CNN-based classification algorithm using a rich corpus of Indian Classical Music is unique even from a global perspective and can be replicated in other modalities of music as well. This dataset is still under development and we plan to include more data containing other emotional features as well. We plan to make the dataset publicly available soon.
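The sub-clip arithmetic in this abstract (400 clips, 5 sub-clips each, 2000 sub-clips in total) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the equal, non-overlapping split, the function name and the 100 Hz sample rate are all assumptions. An equal split of a 30-second clip gives 6-second parts, whereas the abstract reports sub-clips of about 5 seconds, so the actual segmentation may trim or space the sub-clips differently.

```python
def split_into_subclips(samples, n_subclips=5):
    """Split a sequence of samples into equal, non-overlapping sub-clips."""
    length = len(samples) // n_subclips  # any remainder samples at the end are dropped
    return [samples[i * length:(i + 1) * length] for i in range(n_subclips)]

# Hypothetical stand-in for one 30-second clip at an assumed 100 Hz sample rate
sample_rate = 100
clip = [0.0] * (30 * sample_rate)

subclips = split_into_subclips(clip)
subclip_seconds = len(subclips[0]) / sample_rate

# 400 clips in the dataset, each split the same way
total_subclips = 400 * len(subclips)
```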

SD · Apr 15, 2020
Speaker Recognition in Bengali Language from Nonlinear Features

Uddalok Sarkar, Soumyadeep Pal, Sayan Nag et al.

At present, Automatic Speaker Recognition is a very important issue due to its diverse applications. Hence, it becomes necessary to obtain models that take into consideration a person's speaking style, vocal tract information, the timbral qualities of their voice and other congenital information regarding their voice. Studies of Bengali speech recognition and speaker identification are scarce in the literature. Hence the need arises for involving Bengali subjects in modelling our speaker identification engine. In this work, we have extracted some acoustic features of speech using nonlinear multifractal analysis. The Multifractal Detrended Fluctuation Analysis (MFDFA) essentially reveals the complexity associated with the speech signals taken. The source characteristics have been quantified with the help of different techniques such as the correlation matrix and the skewness of the MFDFA spectrum. The results obtained from this study show a good recognition rate for Bengali speakers.

AS · Apr 15, 2020
Acoustical classification of different speech acts using nonlinear methods

Chirayata Bhattacharyya, Sourya Sengupta, Sayan Nag et al.

A recitation is a way of combining words so that they have a sense of rhythm, and thus emotional content is imbued within them. In this study we set out to answer these questions in a scientific manner, taking into consideration 5 (five) well-known Bengali recitations by different poets conveying a variety of moods ranging from joy to sorrow. The clips were recited as well as read (in the form of flat speech without any rhythm) by the same person to avoid any perceptual difference arising out of timbre variation. Next, the emotional content of the 5 recitations was standardized with the help of a listening test conducted on a pool of 50 participants. The recitations as well as the speech were analyzed with the help of a nonlinear technique called Detrended Fluctuation Analysis (DFA), which gives a scaling exponent α that is essentially a measure of the long-range correlations present in the signal. Similar pieces (the parts which have the exact same lyrical content in speech as in the recital) were extracted from the complete signal and analyzed with the DFA technique. Our analysis shows that the scaling exponents for all parts of the recitations were in general much higher than those of their counterparts in speech. We have also established a critical value from our analysis, above which mere speech may become a recitation. The case may be similar to a conventional phase transition, wherein the measure of the external condition (generally temperature) at which the transformation occurs is called the phase transition. Further, we have also categorized the 5 recitations on the basis of their emotional content with the help of the same DFA technique. Analysis with a greater variety of recitations is being carried out to yield more interesting results.
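To make the scaling exponent α concrete, here is a minimal DFA sketch. It assumes first-order (linear) detrending and non-overlapping windows; the abstract does not state the detrending order or the window sizes used, and the function names, scales and synthetic signals are hypothetical. The signal is integrated into a profile, each window is detrended, and α is the slope of log F(s) against log s.

```python
import math
import random

def linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def dfa_exponent(signal, scales):
    """Estimate the DFA scaling exponent alpha of a 1D signal."""
    mean = sum(signal) / len(signal)
    profile, total = [], 0.0
    for value in signal:  # integrate the mean-subtracted signal
        total += value - mean
        profile.append(total)
    log_s, log_f = [], []
    for s in scales:
        xs = list(range(s))
        n_windows = len(profile) // s
        squared_error = 0.0
        for w in range(n_windows):
            segment = profile[w * s:(w + 1) * s]
            slope, intercept = linear_fit(xs, segment)
            squared_error += sum(
                (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, segment)
            )
        fluctuation = math.sqrt(squared_error / (n_windows * s))  # F(s)
        log_s.append(math.log(s))
        log_f.append(math.log(fluctuation))
    alpha, _ = linear_fit(log_s, log_f)
    return alpha

# Synthetic examples (not data from the study)
rng = random.Random(0)
white_noise = [rng.gauss(0, 1) for _ in range(4096)]
random_walk, position = [], 0.0
for step in white_noise:
    position += step
    random_walk.append(position)

scales = [16, 32, 64, 128, 256]
alpha_white = dfa_exponent(white_noise, scales)
alpha_walk = dfa_exponent(random_walk, scales)
```

Uncorrelated white noise is expected to give α near 0.5 and its running sum (a random walk) α near 1.5, so a higher α indicates stronger long-range correlation in the signal.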

NC · Dec 22, 2017
Music of Brain and Music on Brain: A Novel EEG Sonification approach

Sayan Nag, Shankha Sanyal, Archi Banerjee et al.

Can we hear the sound of our brain? Is there any technique which can enable us to hear the neuro-electrical impulses originating from the different lobes of the brain? The answer to these questions is YES. In this paper we present a novel method with which we can sonify Electroencephalogram (EEG) data recorded in the rest state as well as under the influence of one of the simplest acoustical stimuli - a tanpura drone. The tanpura drone, which is generally used to create an ambiance during a musical performance, sounds simple yet has very complex acoustic features. Hence, for this pilot project we chose to study the correlation between a simple acoustic stimulus (tanpura drone) and sonified EEG data. To date, there has been no study which deals with the direct correlation between a bio-signal and its acoustic counterpart and how that correlation varies under the influence of different types of stimuli. This is the first study of its kind, bridging this gap and looking for a direct correlation between the music signal and EEG data using a robust mathematical microscope called Multifractal Detrended Cross Correlation Analysis (MFDXA). For this, we took EEG data of 10 participants in a 2 min 'rest state' (i.e. with white noise) and in a 2 min 'tanpura drone' (musical stimulus) listening condition. Next, the EEG signals from different electrodes were sonified and the MFDXA technique was used to assess the degree of correlation (the cross-correlation coefficient γx) between the tanpura signal and the EEG signals. The variation of γx for different lobes during the course of the experiment also provides interesting new information. Only musical stimuli have the ability to engage several areas of the brain significantly, unlike other stimuli (which engage specific domains only).

NC · Apr 29, 2017
Can Musical Emotion Be Quantified With Neural Jitter Or Shimmer? A Novel EEG Based Study With Hindustani Classical Music

Sayan Nag, Sayan Biswas, Sourya Sengupta et al.

The terms jitter and shimmer have long been used in the domain of speech and acoustic signal analysis as parameters for speaker identification and other prosodic features. In this study, we aim to use the same parameters in the neural domain to identify and categorize emotional cues in different musical clips. For this, we chose two ragas of Hindustani music which are conventionally known to portray contrasting emotions, and an EEG study was conducted on 5 participants who were made to listen to 3 min clips of these two ragas with a sufficient resting period in between. The neural jitter and shimmer components were evaluated for each experimental condition. The results reveal interesting information regarding domain-specific arousal of the human brain in response to musical stimuli and also regarding trait characteristics of an individual. This novel study can have far-reaching implications when it comes to modeling emotional appraisal. The results and implications are discussed in detail.
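The abstract does not say how jitter and shimmer are computed on EEG data. As a point of reference only, the sketch below uses the common speech-analysis definitions of local jitter and local shimmer: the mean absolute difference between consecutive cycle periods (or peak amplitudes), divided by the mean period (or amplitude). Treating EEG cycles this way is an assumption, and the function name and numbers are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values, divided by the mean value."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))

# Hypothetical cycle-to-cycle measurements (not data from the study)
periods_ms = [10.0, 10.2, 9.8, 10.1, 9.9]    # durations of consecutive cycles
peak_amplitudes = [1.0, 1.1, 0.9, 1.0, 1.0]  # peak amplitude of each cycle

jitter = relative_perturbation(periods_ms)        # period instability
shimmer = relative_perturbation(peak_amplitudes)  # amplitude instability
```

For these numbers the jitter is about 0.0275 (2.75%) and the shimmer about 0.1 (10%).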

SD · Mar 19, 2017
Gestalt Phenomenon in Music? A Neurocognitive Physics Study with EEG

Shankha Sanyal, Archi Banerjee, Souparno Roy et al.

The term gestalt has been widely used in the field of psychology, where it describes the tendency of the human mind to perceive an object not in parts but as a unified whole. Music in general is polytonic, i.e. a combination of a number of pure tones (frequencies) mixed together in a manner that sounds harmonious. The study of the human brain's response to different frequency groups of an acoustic signal can give us excellent insight into the neural and functional architecture of brain functions. In this work we have tried to analyze the effect of different frequency bands of music on the various frequency rhythms of the human brain obtained from EEG data of 5 participants. Four (4) widely popular Rabindrasangeet clips were subjected to the Wavelet Transform method to extract five resonant frequency bands from the original music signal. These resonant frequency bands were presented to the subjects as auditory stimuli, and EEG signals were recorded simultaneously at 19 different locations of the brain. The recorded EEG signals were cleaned of noise and subjected to the Multifractal Detrended Fluctuation Analysis (MFDFA) technique in the alpha, theta and gamma frequency ranges. Thus, we obtained the complexity values (in the form of multifractal spectral width) in the alpha, theta and gamma EEG rhythms corresponding to different frequency bands of music. We obtain a frequency-specific, arousal-based response in different lobes of the brain as well as in specific EEG bands corresponding to musical stimuli. This finding can be of immense importance in the field of cognitive music therapy.

SD · Dec 1, 2016
A Non Linear Approach towards Automated Emotion Analysis in Hindustani Music

Shankha Sanyal, Archi Banerjee, Tarit Guhathakurata et al.

In North Indian Classical Music, the raga forms the basic structure over which individual improvisations are performed by an artist based on his/her creativity. The Alap is the opening section of a typical Hindustani Music (HM) performance, where the raga is introduced and the paths of its development are revealed using all the notes of that particular raga and the allowed transitions between them, with proper distribution over time. In India, several emotional flavors are listed for each raga, namely erotic love, pathetic, devotional, comic, horrific, repugnant, heroic, fantastic, furious and peaceful. The detection of emotional cues from Hindustani classical music is a demanding task due to the inherent ambiguity present in the different ragas, which makes it difficult to identify any particular emotion from a certain raga. In this study we took the help of a high-resolution mathematical microscope (MFDFA, or Multifractal Detrended Fluctuation Analysis) to procure information about the inherent complexities and time series fluctuations that constitute an acoustic signal. With the help of this technique, the 3 min alap portions of six conventional ragas of Hindustani classical music, namely Darbari Kanada, Yaman, Mian ki Malhar, Durga, Jay Jayanti and Hamsadhwani, played on three different musical instruments, were analyzed. The results are discussed in detail.

SD · Dec 1, 2016
A Non Linear Multifractal Study to Illustrate the Evolution of Tagore Songs Over a Century

Shankha Sanyal, Archi Banerjee, Tarit Guhathakurata et al.

The works of Rabindranath Tagore have been sung by various artistes over generations spanning almost 100 years. There are a few songs which were popular in the early years and have been able to retain their popularity over the years, while some others have faded away. In this study we look for cues in the singing style of these songs which have kept them alive for all these years. For this we took 3 min clips of four Tagore songs which have been sung by five generations of artistes over 100 years and analyzed them with the help of a nonlinear technique, Multifractal Detrended Fluctuation Analysis (MFDFA). The multifractal spectral width is a manifestation of the inherent complexity of the signal and may prove to be an important parameter for identifying the singing style of a particular generation of singers and how this style varies over different generations. The results are discussed in detail.

SD · Apr 8, 2016
Variation of singing styles within a particular Gharana of Hindustani classical music A nonlinear multifractal study

Archi Banerjee, Shankha Sanyal, Ranjan Sengupta et al.

Hindustani classical music is entirely based on "Raga" structures. In Hindustani music, a "Gharana" or school refers to the adherence of a group of musicians to a particular musical style. Gharanas have their basis in the traditional mode of musical training and education. Every Gharana has its own distinct features, though within a particular Gharana significant differences in singing styles are observed between generations of performers, which can be ascribed to the individual creativity of each singer. This work aims to study the evolution of singing style among four artists of four consecutive generations from the Patiala Gharana. For this, the alap and bandish parts of two different Ragas sung by the four artists were analyzed with the help of the nonlinear multifractal analysis (MFDFA) technique. The multifractal spectral width obtained from the MFDFA method gives an estimate of the complexity of the signal. The observations from the variation of spectral width give a cue towards the scientific recognition of Guru-Shisya Parampara (teacher-student tradition) - a hitherto much-heard philosophical term. Through a quantitative approach, this study succeeds in analyzing the evolution of singing styles within a particular Gharana over generations of artists as well as the effect of globalization in the field of classical music.
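The multifractal spectral width mentioned here follows from the generalized Hurst exponents h(q) that MFDFA produces, through the standard relations τ(q) = q·h(q) − 1, α = dτ/dq and f(α) = q·α − τ(q); the width is the spread of α. The sketch below assumes h(q) has already been estimated and approximates the derivative with finite differences. The function name, the q grid and the test values (analytic h(q) for a binomial cascade with p = 0.3) are illustrative assumptions, not results from the paper.

```python
import math

def spectral_width(qs, hs):
    """Width of the multifractal spectrum from generalized Hurst exponents h(q)."""
    taus = [q * h - 1.0 for q, h in zip(qs, hs)]  # tau(q) = q*h(q) - 1
    alphas = []
    for i in range(len(qs)):
        lo = max(i - 1, 0)             # one-sided difference at the ends,
        hi = min(i + 1, len(qs) - 1)   # central difference in the interior
        alphas.append((taus[hi] - taus[lo]) / (qs[hi] - qs[lo]))  # alpha = dtau/dq
    spectrum = [q * a - t for q, a, t in zip(qs, alphas, taus)]   # f(alpha)
    return max(alphas) - min(alphas), alphas, spectrum

qs = [q for q in range(-5, 6) if q != 0]  # skip q = 0, where h(q) needs a limit

# Monofractal case: h(q) is the same for every q, so the width collapses to zero
width_mono, _, _ = spectral_width(qs, [0.5] * len(qs))

# Multifractal case: analytic h(q) of a binomial cascade with weights 0.3 and 0.7
a, b = 0.3, 0.7
hs_cascade = [(1.0 - math.log2(a ** q + b ** q)) / q for q in qs]
width_multi, alphas, spectrum = spectral_width(qs, hs_cascade)
```

A constant h(q) gives zero width, while the cascade gives a width of roughly 1.2 on this q grid, which is why a wider spectrum is read as a more complex signal.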

SD · Apr 8, 2016
Ragas in Bollywood music A microscopic view through multrifractal cross-correlation method

Shankha Sanyal, Archi Banerjee, Souparno Roy et al.

Since the start of Indian cinema, a number of films have been made in which a particular song is based on a certain raga. These songs have played a major role in spreading the essence of classical music to common people who have no formal exposure to classical music. In this paper, we explore the particular features of a raga which make it understandable to common people and enrich the song to a great extent. For this, we chose two common ragas of Hindustani classical music, namely "Bhairav" and "Mian ki Malhar", which are known to have widespread application in popular film music. We took 3 min clips of these two ragas from the renderings of two eminent maestros of Hindustani classical music. 3 min clips of ten (10) widely popular songs from Bollywood films were selected for analysis. These were analyzed with the help of a nonlinear analysis technique called Multifractal Detrended Cross Correlation Analysis (MFDXA). With this technique, all parts of the film music and the renderings from the eminent maestros are analyzed to find a cross-correlation coefficient (γx) which gives the degree of correlation between the two signals. We hypothesize that the parts which have the highest degree of cross-correlation are the parts in which that particular raga is established in the song. Also, the variation of the cross-correlation coefficient in the different parts of the two samples gives a measure of the modulation that is executed by the singer. Thus, in a nutshell, we try to study scientifically the amount of correlation that exists between the raga and the same raga as utilized in film music. This will help in generating an automated algorithm through which a naïve listener can relish the flavor of a particular raga in a popular film song. The results are discussed in detail.

SD · Jan 28, 2016
Categorization of Stringed Instruments with Multifractal Detrended Fluctuation Analysis

Archi Banerjee, Shankha Sanyal, Tarit Guhathakurata et al.

Categorization is crucial for content description in the archiving of music signals. On many occasions, the human brain fails to classify instruments properly just by listening to their sounds, which is evident from the human response data collected during our experiment. Some previous attempts to categorize musical instruments using various linear analysis methods required a number of parameters to be determined. In this work, we attempted to categorize a number of string instruments according to their mode of playing using state-of-the-art robust nonlinear methods. For this, 30-second sound signals of 26 different string instruments from all over the world were analyzed with the help of the nonlinear multifractal analysis (MFDFA) technique. The spectral width obtained from the MFDFA method gives an estimate of the complexity of the signal. From the variation of spectral width, we observed distinct clustering among the string instruments according to their mode of playing. There is also an indication that similarity in the structural configuration of the instruments plays a major role in the clustering of their spectral widths. The observations and implications are discussed in detail.

SD · Jan 3, 2016
Categorization of Tablas by Wavelet Analysis

Anirban Patranabis, Kaushik Banerjee, Vishal Midya et al.

Tabla, a percussion instrument, is used mainly for keeping rhythm while accompanying vocalists, instrumentalists and dancers in every style of music in India, from classical to light. This percussion instrument consists of two drums played with two hands; the drums are structurally different and produce different harmonic sounds. Earlier work labeled tabla strokes from real-time performances by testing neural networks and tree-based classification methods. The current work extends previous work by C. V. Raman and S. Kumar in 1920 on spectrum modeling of tabla strokes. In this paper we have studied the spectral characteristics of nine strokes from each of five tablas using the wavelet transform (wavelet analysis by the sub-band coding method, using the Torrence wavelet tool). Wavelet analysis is now a common tool for analyzing localized variations of power within a time series and for finding the frequency distribution in time-frequency space. Statistically, we look into the patterns depicted by the harmonics of the different sub-bands and the tablas. The distribution of dominant frequencies in the different sub-bands of the stroke signals, the distribution of power and the behavior of harmonics are the important features that lead to the categorization of tablas.

SD · Oct 15, 2015
Harmonic and Timbre Analysis of Tabla Strokes

Anirban Patranabis, Kaushik Banerjee, Vishal Midya et al.

The Indian twin drums, bayan and dayan (together, the tabla), are the most important percussion instruments in India and are popularly used for keeping rhythm. The tabla is a twin percussion/drum instrument in which the right-hand drum is called the dayan and the left-hand drum is called the bayan. Tabla strokes, commonly called 'bol', constitute a series of syllables. In this study we have studied the timbre characteristics of nine strokes from each of five different tablas. Timbre parameters were calculated from the long-term average spectrum (LTAS) of each stroke signal. The study of timbre characteristics is one of the most important deterministic approaches for analyzing the tabla and its stroke characteristics. Statistical correlations among the timbre parameters were measured, and factor analysis was used to identify which timbre parameters are closely related. Tabla strokes have unique harmonic and timbral characteristics in the mid-frequency range and no uniqueness in the low-frequency range.