CV
Oct 28, 2019

Virtual Piano using Computer Vision

arXiv:1910.12539v1
2 citations
Originality: Incremental advance
AI Analysis

This work addresses automated piano performance analysis for musicians and researchers. It is an incremental advance: it builds on existing computer vision techniques (Hough transform, binary thresholding, CNNs) and adds an early-fusion spatial-temporal CNN for key press intensity detection.

The research analyzes piano performances from visual information alone. It develops a computer vision system that locates the keyboard and its keys, detects key presses, and estimates key press intensity using CNNs, including a novel spatial-temporal CNN with early fusion for intensity detection. The authors also created a new dataset for training, and they trained the intensity models on both video frames and their optical flow images to evaluate effectiveness.
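As a rough illustration of the early-fusion idea, the sketch below stacks several consecutive frames along the channel axis, so that a network's first 2D convolution would see all time steps at once. The function names `early_fuse` and `sliding_windows`, the nested-list frame layout (height x width x channels), and the window length are assumptions made for illustration; the paper's actual architecture and input pipeline are not specified here.

```python
from typing import List

# Assumed layout: frame[y][x][c], i.e. height x width x channels.
Frame = List[List[List[float]]]


def early_fuse(frames: List[Frame]) -> Frame:
    """Concatenate T frames along the channel axis: T x (H, W, C) -> (H, W, T*C)."""
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    for f in frames:
        if len(f) != height or any(len(row) != width for row in f):
            raise ValueError("all frames must share the same height and width")
    fused = []
    for y in range(height):
        row = []
        for x in range(width):
            pixel = []
            # Channels of frame 0 come first, then frame 1, and so on.
            for f in frames:
                pixel.extend(f[y][x])
            row.append(pixel)
        fused.append(row)
    return fused


def sliding_windows(frames: List[Frame], window: int) -> List[Frame]:
    """Build one fused input per run of `window` consecutive frames (hypothetical helper)."""
    return [early_fuse(frames[i:i + window]) for i in range(len(frames) - window + 1)]
```

With three single-channel 2x2 frames, `early_fuse` returns a 2x2 grid whose pixels each hold three values, one per time step. The same stacking would apply to optical flow images if they share the frame layout.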

In this research, piano performances are analyzed based only on visual information. Computer vision algorithms, e.g., the Hough transform and binary thresholding, are applied to locate the keyboard and individual keys. Convolutional Neural Networks (CNNs) are then used to determine, again from visual information alone, whether specific keys are pressed and with how much intensity. For intensity detection in particular, a new method using spatial and temporal CNN models is devised. An early fusion technique is applied in the temporal CNN architecture to analyze hand movement. We also create a new dataset for training each model. For estimating the intensity of a pressed key, both video frames and their optical flow images are used to train models in order to evaluate their effectiveness.
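To make the binary-thresholding step concrete, here is a minimal sketch of how dark runs along one horizontal scanline could be read as black-key positions. It assumes the keyboard region has already been found and rectified (for example after line detection with a Hough transform) and that the scanline crosses the upper part of the keys. The threshold of 128, the minimum run width, and all function names are illustrative assumptions, not values or code from the paper.

```python
def binarize(scanline, threshold=128):
    """Binary thresholding: 1 for bright pixels (white keys), 0 for dark pixels."""
    return [1 if value >= threshold else 0 for value in scanline]


def dark_runs(binary, min_width=2):
    """Return (start, end) index pairs, end exclusive, for runs of 0s at least min_width long."""
    runs = []
    start = None
    for i, bit in enumerate(binary):
        if bit == 0 and start is None:
            start = i  # a dark run begins here
        elif bit == 1 and start is not None:
            if i - start >= min_width:
                runs.append((start, i))
            start = None
    # Close a run that reaches the end of the scanline.
    if start is not None and len(binary) - start >= min_width:
        runs.append((start, len(binary)))
    return runs


def locate_black_keys(scanline, threshold=128, min_width=2):
    """Hypothetical helper: candidate black-key spans along one grayscale scanline."""
    return dark_runs(binarize(scanline, threshold), min_width)


# Grayscale intensities (0-255) along one row; the isolated dark pixel is treated as noise.
scanline = [200] * 4 + [20] * 3 + [210] * 5 + [30] * 3 + [220] * 2 + [10] + [205] * 2
print(locate_black_keys(scanline))  # [(4, 7), (12, 15)]
```

The minimum-width filter is one simple way to ignore single-pixel noise such as the gaps between white keys. A real system would likely choose the threshold adaptively for the lighting conditions.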

