Hirotaka Niitsuma

1.9 LG May 17, 2016

Word2Vec is a special case of Kernel Correspondence Analysis and Kernels for Natural Language Processing

Hirotaka Niitsuma, Minho Lee

We show that correspondence analysis (CA) is equivalent to defining a Gini index with appropriately scaled one-hot encoding. Using this relation, we introduce a nonlinear kernel extension to CA. This extended CA gives a known analysis for natural language via specialized kernels that use an appropriate contingency table. We propose a semi-supervised CA, which is a special case of the kernel extension to CA. Because CA requires excessive memory if applied to numerous categories, CA has not been used for natural language processing. We address this problem by introducing delayed evaluation to randomized singular value decomposition. The memory-efficient CA is then applied to a word-vector representation task. We propose a tail-cut kernel, which is an extension to the skip-gram within the kernel extension to CA. Our tail-cut kernel outperforms existing word-vector representation methods.
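To make the memory argument in the abstract concrete, here is a minimal sketch of correspondence analysis computed with a randomized SVD in which the standardized residual matrix is never formed explicitly. It is not the authors' implementation. The function name `ca_randomized`, its parameters, and the toy table are hypothetical, and the randomized range finder follows the generic Halko-style scheme rather than anything specified in the abstract. It assumes every row and column of the contingency table has a nonzero total.

```python
import numpy as np

def ca_randomized(N, k=2, oversample=5, n_iter=2, seed=0):
    """Correspondence analysis of a contingency table N (hypothetical helper).

    The standardized residual matrix
        S = D_r^{-1/2} (P - r c^T) D_c^{-1/2}
    is only ever applied to thin matrices, so the dense rank-one
    correction r c^T is never materialized.
    """
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                  # correspondence matrix
    r = P.sum(axis=1)                # row masses (assumed nonzero)
    c = P.sum(axis=0)                # column masses (assumed nonzero)
    rs, cs = np.sqrt(r), np.sqrt(c)

    def S_mul(X):                    # computes S @ X lazily
        return (P @ (X / cs[:, None])) / rs[:, None] - np.outer(rs, cs @ X)

    def St_mul(Y):                   # computes S.T @ Y lazily
        return (P.T @ (Y / rs[:, None])) / cs[:, None] - np.outer(cs, rs @ Y)

    # Randomized range finder with a few power iterations.
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((P.shape[1], k + oversample))
    Q, _ = np.linalg.qr(S_mul(omega))
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(St_mul(Q))
        Q, _ = np.linalg.qr(S_mul(Q))

    # Small projected problem: B = Q.T @ S.
    B = St_mul(Q).T
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub

    # Principal coordinates of rows and columns.
    rows = (U[:, :k] * s[:k]) / rs[:, None]
    cols = (Vt[:k].T * s[:k]) / cs[:, None]
    return rows, cols, s[:k]

# Toy word-by-context count table (made-up numbers, for illustration only).
counts = np.array([[10, 2, 3],
                   [4, 8, 1],
                   [2, 3, 9],
                   [5, 5, 5]])
row_coords, col_coords, sing_vals = ca_randomized(counts, k=2)
```

In a word-vector setting `P` would be a large sparse co-occurrence matrix, and only the products `P @ X` and `P.T @ Y` would touch it; the dense array used here is just for readability.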

1.9 CV Mar 16, 2014

Image processing using miniKanren

Hirotaka Niitsuma

An integral image is one of the most efficient optimization techniques for image processing. However, an integral image is only a special case of a delayed stream or memoization. This research discusses generalizing the concept of the integral image optimization technique, and how to automatically generate integral-image-optimized program code from an abstracted image processing algorithm. In order to abstract algorithms, we focus on miniKanren.
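For readers unfamiliar with the technique, the following is a minimal sketch of a plain integral image (summed-area table) in NumPy. It illustrates only the memoization idea the abstract refers to: prefix sums are computed once and every later box sum reuses them. It is not the miniKanren-based code generation the paper proposes, and the names `integral_image` and `box_sum` are hypothetical.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row and column prepended.

    ii[y, x] holds the sum of img[:y, :x], so all prefix sums are
    computed once and reused (memoized) by every later query.
    """
    img = np.asarray(img)
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom, left:right] using four table lookups."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

# Small integer test image (made-up values, for illustration only).
img = np.arange(12).reshape(3, 4)
ii = integral_image(img)
total = box_sum(ii, 1, 1, 3, 3)   # same value as img[1:3, 1:3].sum()
```

After the one-time table construction, each rectangular sum costs a constant number of lookups regardless of the rectangle size, which is what makes the table useful for box filters and similar window-based operations.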

2 Papers