Taiga Hayami

CV
h-index: 2
5 papers
8 citations
Novelty: 51%
AI Score: 36

5 Papers

CV · Sep 27, 2024
Neural Video Representation for Redundancy Reduction and Consistency Preservation

Taiga Hayami, Takahiro Shindo, Shunsuke Akamatsu et al.

Implicit neural representations (INRs) embed various signals into neural networks. They have gained attention in recent years because of their versatility in handling diverse signal types. In the context of video, INRs achieve compression by embedding video signals directly into networks and then compressing those networks. Conventional methods use as network input either an index that expresses the time of the frame or features extracted from individual frames. The latter provides greater expressive capability because the input is specific to each video. However, the features extracted from frames often contain redundancy, which contradicts the purpose of video compression. Additionally, such redundancies make it challenging to accurately reconstruct high-frequency components in the frames. To address these problems, we focus on separating the high-frequency and low-frequency components of the reconstructed frame. We propose a video representation method that generates both the high-frequency and low-frequency components of the frame, using features extracted from the high-frequency components and temporal information, respectively. Experimental results demonstrate that our method outperforms the existing HNeRV method, achieving superior results in 96 percent of the videos.
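The idea of treating low-frequency and high-frequency content separately can be illustrated with a minimal sketch. It is not the authors' network: it assumes a single scanline and a moving-average filter as the low-pass step, and the names `low_pass` and `split_frequencies` are hypothetical. It only shows that a signal splits into a smooth part and a detail part that sum back to the original.

```python
def low_pass(signal, radius):
    """Moving-average low-pass filter; indices are clamped at the edges."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [
            signal[min(max(k, 0), n - 1)]
            for k in range(i - radius, i + radius + 1)
        ]
        out.append(sum(window) / len(window))
    return out


def split_frequencies(signal, radius=1):
    """Return (low, high) where high is the residual signal - low."""
    low = low_pass(signal, radius)
    high = [s - l for s, l in zip(signal, low)]
    return low, high


# A single sharp spike: the low band smears it, the high band keeps the edge.
scanline = [0.0, 0.0, 9.0, 0.0, 0.0]
low, high = split_frequencies(scanline)
print(low)   # [0.0, 3.0, 3.0, 3.0, 0.0]
print(high)  # [0.0, -3.0, 6.0, -3.0, 0.0]
```

In the abstract, the two components are generated by the network from different inputs (temporal information for the low band, high-frequency features for the high band); the fixed filter here is only a stand-in to make the decomposition concrete.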

CV · Nov 11, 2025
Accurate and Efficient Surface Reconstruction from Point Clouds via Geometry-Aware Local Adaptation

Eito Ogawa, Taiga Hayami, Hiroshi Watanabe

Point cloud surface reconstruction has improved in accuracy with advances in deep learning, enabling applications such as infrastructure inspection. Recent approaches that reconstruct from small local regions rather than entire point clouds have attracted attention for their strong generalization capability. However, prior work typically places local regions uniformly and keeps their size fixed, limiting adaptability to variations in geometric complexity. In this study, we propose a method that improves reconstruction accuracy and efficiency by adaptively modulating the spacing and size of local regions based on the curvature of the input point cloud.

IV · Apr 30, 2025
SR-NeRV: Improving Embedding Efficiency of Neural Video Representation via Super-Resolution

Taiga Hayami, Kakeru Koizumi, Hiroshi Watanabe

Implicit Neural Representations (INRs) have garnered significant attention for their ability to model complex signals in various domains. Recently, INR-based frameworks have shown promise in neural video compression by embedding video content into compact neural networks. However, these methods often struggle to reconstruct high-frequency details under stringent constraints on model size, which are critical in practical compression scenarios. To address this limitation, we propose an INR-based video representation framework that integrates a general-purpose super-resolution (SR) network. This design is motivated by the observation that high-frequency components tend to exhibit low temporal redundancy across frames. By offloading the reconstruction of fine details to a dedicated SR network pre-trained on natural images, the proposed method improves visual fidelity. Experimental results demonstrate that the proposed method outperforms conventional INR-based baselines in reconstruction quality, while maintaining a comparable model size.

CV · Jun 15, 2025
Structure-Preserving Patch Decoding for Efficient Neural Video Representation

Taiga Hayami, Kakeru Koizumi, Hiroshi Watanabe

Implicit neural representations (INRs) are the subject of extensive research, particularly in their application to modeling complex signals by mapping spatial and temporal coordinates to corresponding values. When handling videos, mapping compact inputs to entire frames or spatially partitioned patch images is an effective approach. This strategy better preserves spatial relationships, reduces computational overhead, and improves reconstruction quality compared to coordinate-based mapping. However, predicting entire frames often limits the reconstruction of high-frequency visual details. Additionally, conventional patch-based approaches based on uniform spatial partitioning tend to introduce boundary discontinuities that degrade spatial coherence. We propose a neural video representation method based on Structure-Preserving Patches (SPPs) to address such limitations. Our method separates each video frame into patch images of spatially aligned frames through a deterministic pixel-based splitting similar to PixelUnshuffle. This operation preserves the global spatial structure while allowing patch-level decoding. We train the decoder to reconstruct these structured patches, enabling a global-to-local decoding strategy that captures the global layout first and refines local details. This effectively reduces boundary artifacts and mitigates distortions from naive upsampling. Experiments on standard video datasets demonstrate that our method achieves higher reconstruction quality and better compression performance than existing INR-based baselines.
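The PixelUnshuffle-like splitting mentioned in this abstract can be sketched as follows. This is an illustration under assumptions rather than the paper's implementation: frames are plain nested lists, the factor `r` and both function names are hypothetical, and no decoder is involved. Each patch takes every `r`-th pixel, so it is a low-resolution view of the whole frame instead of a local tile.

```python
def split_into_structured_patches(frame, r):
    """Split an H x W frame into r*r sub-sampled patches.

    Patch (i, j) holds the pixels at rows i, i+r, ... and columns j, j+r, ...
    so every patch covers the full spatial extent of the frame.
    """
    h, w = len(frame), len(frame[0])
    if h % r or w % r:
        raise ValueError("frame height and width must be divisible by r")
    return [
        [row[j::r] for row in frame[i::r]]
        for i in range(r)
        for j in range(r)
    ]


def merge_structured_patches(patches, r):
    """Inverse operation: interleave the patches back into one frame."""
    ph, pw = len(patches[0]), len(patches[0][0])
    frame = [[None] * (pw * r) for _ in range(ph * r)]
    for idx, patch in enumerate(patches):
        i, j = divmod(idx, r)
        for y in range(ph):
            for x in range(pw):
                frame[y * r + i][x * r + j] = patch[y][x]
    return frame


frame = [[y * 4 + x for x in range(4)] for y in range(4)]
patches = split_into_structured_patches(frame, 2)
print(patches[0])  # [[0, 2], [8, 10]]
print(merge_structured_patches(patches, 2) == frame)  # True
```

Because neighbouring pixels of the original frame land in different patches, there are no tile borders in the reassembled frame, which is consistent with the abstract's claim about reduced boundary artifacts.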

CV · Jun 15, 2024
Implicit Neural Representation for Videos Based on Residual Connection

Taiga Hayami, Hiroshi Watanabe

Video compression technology is essential for transmitting and storing videos. Many video compression methods reduce information in videos by removing high-frequency components and exploiting similarities between frames. Alternatively, implicit neural representations (INRs) for videos use networks to represent videos and compress them through model compression. A conventional method improves the quality of reconstruction by using frame features. However, its representation of fine frame detail leaves room for improvement. To improve the quality of reconstructed frames, we propose a method that uses low-resolution frames as a residual connection, which is considered effective for image reconstruction. Experimental results show that our method outperforms the existing method, HNeRV, in PSNR for 46 of the 49 videos.
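A residual connection from a low-resolution frame can be sketched as below. The sketch assumes nearest-neighbour upsampling and an additive skip path; the abstract does not specify either, and `decoder_output` is a hypothetical stand-in for the network's prediction, not the authors' model.

```python
def upsample_nearest(frame, scale):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor."""
    return [
        [value for value in row for _ in range(scale)]
        for row in frame
        for _ in range(scale)
    ]


def reconstruct_with_residual(low_res, decoder_output, scale):
    """Add the upsampled low-resolution frame to the decoder's output.

    With the skip path supplying the coarse content, the decoder output
    only needs to carry the detail that the low-resolution frame lacks.
    """
    base = upsample_nearest(low_res, scale)
    if len(base) != len(decoder_output) or len(base[0]) != len(decoder_output[0]):
        raise ValueError("decoder output must match the upsampled frame size")
    return [
        [b + d for b, d in zip(base_row, detail_row)]
        for base_row, detail_row in zip(base, decoder_output)
    ]


low_res = [[10, 20], [30, 40]]
detail = [[1, -1, 0, 0], [0, 0, 2, -2], [0, 0, 0, 0], [3, 0, 0, -3]]
frame = reconstruct_with_residual(low_res, detail, 2)
print(frame[0])  # [11, 9, 20, 20]
```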