CL · Jan 28

RusLICA: A Russian-Language Platform for Automated Linguistic Inquiry and Category Analysis

arXiv:2601.20275v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for psycholinguistic analysis tools tailored to Russian and its grammatical and cultural specificities. The contribution is incremental: it adapts an existing methodology to a new language.

The researchers adapt the Linguistic Inquiry and Word Count (LIWC) methodology to Russian-language texts. The resulting platform, RusLICA, covers 96 categories and combines syntactic analysis with predictions from pre-trained language models for automated linguistic inquiry.

Identifying psycholinguistic characteristics in written texts is a task that is attracting increasing attention from researchers. One of the most widely used tools in the field is Linguistic Inquiry and Word Count (LIWC), which was originally developed to analyze English texts and has since been translated into multiple languages. Our approach adapts the LIWC methodology to the Russian language, taking into account its grammatical and cultural specificities. The suggested approach comprises 96 categories, integrating syntactic, morphological, lexical, and general statistical features, as well as predictions obtained from pre-trained language models (LMs) for text analysis. Rather than directly translating existing thesauri, we built the dictionary specifically for the Russian language from several lexicographic resources, semantic dictionaries, and corpora. The paper describes the process of mapping lemmas to 42 psycholinguistic categories and the implementation of the analyzer as part of the RusLICA web service.
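The dictionary-based part of a LIWC-style analysis can be illustrated with a minimal sketch. It assumes the text has already been lemmatized by an external tool. The toy dictionary `LEMMA_CATEGORIES`, its category names, and the function `category_shares` are hypothetical and are not taken from RusLICA. The sketch only shows the general idea of mapping lemmas to categories and reporting each category as a share of all tokens.

```python
from collections import Counter

# Hypothetical toy dictionary: lemma -> set of categories.
# The actual RusLICA dictionary and category names are not reproduced here.
LEMMA_CATEGORIES = {
    "радость": {"positive_emotion"},
    "страх": {"negative_emotion"},
    "думать": {"cognition"},
    "я": {"first_person"},
}


def category_shares(lemmas):
    """Return, per category, the share of tokens whose lemma belongs to it."""
    total = len(lemmas)
    if total == 0:
        return {}
    counts = Counter()
    for lemma in lemmas:
        # A lemma may belong to several categories; unknown lemmas count
        # toward the total but not toward any category.
        for category in LEMMA_CATEGORIES.get(lemma, ()):
            counts[category] += 1
    return {category: n / total for category, n in counts.items()}


# Input is assumed to be a list of lemmas, one per token.
sample = ["я", "думать", "о", "радость"]
shares = category_shares(sample)
```

Dividing by the total token count rather than by the number of matched lemmas keeps scores comparable across texts of different lengths, which is the usual convention for word-count tools of this kind.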

