CL Dec 5, 2025
Morphologically-Informed Tokenizers for Languages with Non-Concatenative Morphology: A case study of Yoloxóchitl Mixtec ASR
Chris Crawford
This paper investigates the impact of using morphologically-informed tokenizers to aid and streamline the interlinear gloss annotation of an audio corpus of Yoloxóchitl Mixtec (YM) using a combination of ASR and text-based sequence-to-sequence tools, with the goal of improving efficiency while reducing the workload of a human annotator. We present two novel tokenization schemes that separate words in a nonlinear manner, preserving information about tonal morphology as much as possible. One of these approaches, a Segment-and-Melody tokenizer, simply extracts the tones without predicting segmentation. The other, a Sequence-of-Processes tokenizer, predicts segmentation for the words, which could allow an end-to-end ASR system to produce segmented and unsegmented transcriptions in a single pass. We find that these novel tokenizers are competitive with BPE and Unigram models; the Segment-and-Melody model outperforms traditional tokenizers on word error rate but does not match their character error rate. In addition, we analyze tokenizers on morphological and information-theoretic metrics to find predictive correlations with downstream performance. Our results suggest that nonlinear tokenizers designed specifically for the non-concatenative morphology of a language are competitive with conventional BPE and Unigram models for ASR. Further research will be necessary to determine the applicability of these tokenizers in downstream processing tasks.
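The abstract reports that a tokenizer can win on word error rate while losing on character error rate. The minimal sketch below shows how the two metrics can diverge. It uses the standard Levenshtein-based definitions; the function names and the placeholder strings are illustrative assumptions and are not taken from the paper or its corpus.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def cer(ref, hyp):
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(ref, hyp) / len(ref)


# Placeholder strings (hypothetical, not Mixtec data).
ref = "aaaa bbbb cccc dddd"
hyp_a = "aaaa bbbb cccc xyzw"  # one word entirely wrong
hyp_b = "aaax bbbx cccc dddd"  # two words each off by one character

print(wer(ref, hyp_a), cer(ref, hyp_a))  # WER 0.25, CER 4/19
print(wer(ref, hyp_b), cer(ref, hyp_b))  # WER 0.5,  CER 2/19
```

Here `hyp_a` has the lower WER but the higher CER: it concentrates its errors in a single word, whereas `hyp_b` spreads small errors across more words.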
SP Sep 2, 2020
American Sign Language Recognition Using RF Sensing
Sevgi Z. Gurbuz, Ali C. Gurbuz, Evie A. Malaia et al.
Many technologies for human-computer interaction have been designed for hearing individuals and depend upon vocalized speech, precluding users of American Sign Language (ASL) in the Deaf community from benefiting from these advancements. While great strides have been made in ASL recognition with video or wearable gloves, the use of video in homes has raised privacy concerns, while wearable gloves severely restrict movement and infringe on daily life. Methods: This paper proposes the use of RF sensors for HCI applications serving the Deaf community. A multi-frequency RF sensor network is used to acquire non-invasive, non-contact measurements of ASL signing irrespective of lighting conditions. The unique patterns of motion present in the RF data due to the micro-Doppler effect are revealed using time-frequency analysis with the Short-Time Fourier Transform. Linguistic properties of RF ASL data are investigated using machine learning (ML). Results: The information content, measured by fractal complexity, of ASL signing is shown to be greater than that of other upper body activities encountered in daily living. This can be used to differentiate daily activities from signing, while features from RF data show that imitation signing by non-signers is 99% differentiable from native ASL signing. Feature-level fusion of RF sensor network data is used to achieve 72.5% accuracy in classification of 20 native ASL signs. Implications: RF sensing can be used to study dynamic linguistic properties of ASL and design Deaf-centric smart environments for non-invasive, remote recognition of ASL. ML algorithms should be benchmarked on native, not imitation, ASL data.
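To illustrate the time-frequency analysis mentioned in the abstract, the following minimal sketch computes a Short-Time Fourier Transform magnitude using only the Python standard library. The test signal is a synthetic two-tone sequence standing in for a time-varying Doppler shift; it is not RF data, and the frame length, hop size, and function name are assumptions chosen for illustration rather than the authors' settings.

```python
import cmath
import math


def stft_magnitude(signal, frame_len, hop):
    """Return one magnitude spectrum (bins 0..frame_len/2) per Hann-windowed frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


FRAME_LEN, HOP = 64, 32  # hypothetical analysis parameters

# Synthetic signal: the frequency jumps from bin 4 to bin 12 halfway through.
signal = [math.cos(2 * math.pi * (4 if n < 256 else 12) * n / FRAME_LEN)
          for n in range(512)]

spectrogram = stft_magnitude(signal, FRAME_LEN, HOP)
peak_bins = [max(range(len(s)), key=s.__getitem__) for s in spectrogram]
print(peak_bins[0], peak_bins[-1])  # dominant bin in the first and last frame
```

Tracking the dominant bin per frame shows the frequency change over time, which is the kind of pattern a spectrogram of micro-Doppler data makes visible. A practical pipeline would use an FFT-based routine instead of the direct DFT shown here.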
RO Feb 25, 2013
Work in Progress: Enabling robot device discovery through robot device descriptions
Monia Anderson, Chris Crawford, Paul Kilgo et al.
There is no dearth of new robots that provide both generalized and customized platforms for learning and research. Unfortunately, as we attempt to adapt existing software components, we are faced with an explosion of device drivers that interface each hardware platform with existing frameworks. We certainly gain the efficiencies of reusing algorithms and tools developed across platforms, but only once the device driver is created. We propose a domain-specific language that describes the development and runtime interface of a robot and defines its link to existing frameworks. The Robot Device Interface Specification (RDIS) takes advantage of the internal firmware present on many existing devices by defining the communication mechanism, syntax, and semantics in such a way as to enable the automatic generation of interface links and resource discovery. We present the current domain model as it relates to differential drive robots, as a mechanism for using the RDIS to link described robots to HTML5 via web sockets and ROS (Robot Operating System).