Vasanth Kalingeri

LGJun 20, 2022

Latent Variable Modelling Using Variational Autoencoders: A survey

Vasanth Kalingeri

A probability distribution allows practitioners to uncover hidden structure in the data and build models to solve supervised learning problems using limited data. The focus of this report is on Variational autoencoders, a method to learn the probability distribution of large complex datasets. The report provides a theoretical understanding of variational autoencoders and consolidates the current research in the field. The report is divided into multiple chapters, the first chapter introduces the problem, describes variational autoencoders and identifies key research directions in the field. Chapters 2, 3, 4 and 5 dive into the details of each of the key research areas. Chapter 6 concludes the report and suggests directions for future work. A reader who has a basic idea of machine learning but wants to learn about general themes in machine learning research can benefit from the report. The report explains central ideas on learning probability distributions, what people did to make this tractable and goes into details around how deep learning is currently applied. The report also serves a gentle introduction for someone looking to contribute to this sub-field.

SDDec 15, 2016

Music Generation with Deep Learning

Vasanth Kalingeri, Srikanth Grandhe

The use of deep learning to solve problems in literary arts has been a recent trend that has gained a lot of attention and automated generation of music has been an active area. This project deals with the generation of music using raw audio files in the frequency domain relying on various LSTM architectures. Fully connected and convolutional layers are used along with LSTM's to capture rich features in the frequency domain and increase the quality of music generated. The work is focused on unconstrained music generation and uses no information about musical structure(notes or chords) to aid learning.The music generated from various architectures are compared using blind fold tests. Using the raw audio to train models is the direction to tapping the enormous amount of mp3 files that exist over the internet without requiring the manual effort to make structured MIDI files. Moreover, not all audio files can be represented with MIDI files making the study of these models an interesting prospect to the future of such models.

Vasanth Kalingeri

2 Papers