CV · Feb 22
A Two-Stage Detection-Tracking Framework for Stable Apple Quality Inspection in Dense Conveyor-Belt Environments
Keonvin Park, Aditya Pal, Jin Hong Mok
Industrial fruit inspection systems must operate reliably under dense multi-object interactions and continuous motion, yet most existing work evaluates detection or classification at the image level without ensuring temporal stability in video streams. We present a two-stage detection-tracking framework for stable multi-apple quality inspection in conveyor-belt environments. An orchard-trained YOLOv8 model performs apple localization, followed by ByteTrack multi-object tracking to maintain persistent identities. A ResNet18 defect classifier, fine-tuned on a healthy-defective fruit dataset, is applied to cropped apple regions. Track-level aggregation is introduced to enforce temporal consistency and reduce prediction oscillation across frames. We define video-level industrial metrics such as track-level defect ratio and temporal consistency to evaluate system robustness under realistic processing conditions. Results demonstrate improved stability compared to frame-wise inference, suggesting that integrating tracking is essential for practical automated fruit grading systems.
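The track-level aggregation idea can be illustrated with a minimal sketch. The abstract does not define the aggregation rule or the exact metrics, so the majority vote, the `defect_ratio` and `consistency` formulas, and the function name `aggregate_tracks` below are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter, defaultdict


def aggregate_tracks(frame_predictions):
    """Aggregate per-frame labels into one decision per track.

    frame_predictions: iterable of (track_id, label) pairs in frame order,
    where label is "healthy" or "defective" (hypothetical label names).
    """
    labels_by_track = defaultdict(list)
    for track_id, label in frame_predictions:
        labels_by_track[track_id].append(label)

    summary = {}
    for track_id, labels in labels_by_track.items():
        counts = Counter(labels)
        majority_label, majority_count = counts.most_common(1)[0]
        summary[track_id] = {
            # Assumed rule: majority vote over the frames of the track.
            "label": majority_label,
            # Assumed metric: share of frames classified as defective.
            "defect_ratio": counts["defective"] / len(labels),
            # Assumed metric: share of frames agreeing with the majority.
            "consistency": majority_count / len(labels),
        }
    return summary


# Track 1 flips once between frames; track 2 is stable.
preds = [
    (1, "healthy"), (2, "defective"), (1, "defective"),
    (2, "defective"), (1, "healthy"), (1, "healthy"),
]
result = aggregate_tracks(preds)
```

Here the single "defective" frame on track 1 is outvoted, which is the kind of frame-to-frame oscillation that a track-level decision is meant to suppress.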
PM · Mar 9
Joint Return and Risk Modeling with Deep Neural Networks for Portfolio Construction
Keonvin Park
Portfolio construction traditionally relies on separately estimating expected returns and covariance matrices using historical statistics, often leading to suboptimal allocation under time-varying market conditions. This paper proposes a joint return and risk modeling framework based on deep neural networks that enables end-to-end learning of dynamic expected returns and risk structures from sequential financial data. Using daily data from ten large-cap US equities spanning 2010 to 2024, the proposed model is evaluated across return prediction, risk estimation, and portfolio-level performance. Out-of-sample results during 2020 to 2024 show that the deep forecasting model achieves competitive predictive accuracy (RMSE = 0.0264) with economically meaningful directional accuracy (51.9%). More importantly, the learned representation effectively captures volatility clustering and regime shifts. When integrated into portfolio optimization, the proposed Neural Portfolio strategy achieves an annual return of 36.4% and a Sharpe ratio of 0.91, outperforming equal weight and historical mean-variance benchmarks in terms of risk-adjusted performance. These findings demonstrate that jointly modeling return and covariance dynamics can provide consistent improvements over traditional allocation approaches. The framework offers a scalable and practical alternative for data-driven portfolio construction under nonstationary market conditions.
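For readers unfamiliar with the reported metric, the following is a minimal sketch of how an annualized Sharpe ratio is commonly computed from daily portfolio returns. The 252-day annualization, the zero risk-free rate, the helper names, and the toy return values are assumptions; the abstract does not state the conventions used in the paper.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualization convention


def portfolio_returns(weights, asset_returns):
    """Daily portfolio return as the weighted sum of asset returns."""
    return [sum(w * r for w, r in zip(weights, day)) for day in asset_returns]


def annualized_sharpe(daily_returns, daily_risk_free=0.0):
    """Mean excess return over its sample standard deviation, annualized."""
    excess = [r - daily_risk_free for r in daily_returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(TRADING_DAYS)


# Toy data (hypothetical): two assets, three days, equal weights.
days = [(0.01, 0.03), (0.0, 0.0), (0.02, 0.0)]
daily = portfolio_returns([0.5, 0.5], days)
sharpe = annualized_sharpe(daily)
```

In a joint return-and-risk setup, the weights passed to `portfolio_returns` would come from an optimizer fed with the model's predicted means and covariances rather than being fixed in advance.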
LG · Feb 6, 2025
PINT: Physics-Informed Neural Time Series Models with Applications to Long-term Inference on WeatherBench 2m-Temperature Data
Keonvin Park, Jisu Kim, Jaemin Seo
This paper introduces PINT (Physics-Informed Neural Time Series Models), a framework that integrates physical constraints into neural time series models to improve their ability to capture complex dynamics. We apply PINT to the ERA5 WeatherBench dataset, focusing on long-term forecasting of 2m-temperature data. PINT incorporates the Simple Harmonic Oscillator Equation as a physics-informed prior, embedding its periodic dynamics into RNN, LSTM, and GRU architectures. This equation's analytical solutions (sine and cosine functions) facilitate rigorous evaluation of the benefits of incorporating physics-informed constraints. By benchmarking against a linear regression baseline derived from its exact solutions, we quantify the impact of embedding physical principles in data-driven models. Unlike traditional time series models that rely on future observations, PINT is designed for practical forecasting. Using only the first 90 days of observed data, it iteratively predicts the next two years, addressing challenges posed by limited real-time updates. Experiments on the WeatherBench dataset demonstrate PINT's ability to generalize, capture periodic trends, and align with physical principles. This study highlights the potential of physics-informed neural models in bridging machine learning and interpretable climate applications. Our models and datasets are publicly available on GitHub: https://github.com/KV-Park.
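One way to picture a simple-harmonic-oscillator prior is as a penalty on the residual of x'' + ω²x = 0 evaluated on the predicted sequence. The sketch below uses a central finite difference for the second derivative. The discretization, the function name `sho_residual_loss`, and the 365-day period are assumptions; the abstract does not say how PINT forms or weights this term.

```python
import math


def sho_residual_loss(x, dt, omega):
    """Mean squared residual of x'' + omega^2 * x = 0 over interior points.

    The second derivative is approximated with a central finite difference.
    """
    residuals = [
        (x[i + 1] - 2.0 * x[i] + x[i - 1]) / dt**2 + omega**2 * x[i]
        for i in range(1, len(x) - 1)
    ]
    return sum(r * r for r in residuals) / len(residuals)


omega = 2.0 * math.pi / 365.0  # assumed annual cycle, time measured in days
periodic = [math.sin(omega * t) for t in range(730)]  # satisfies the equation
drifting = [0.01 * t for t in range(730)]             # linear drift, no cycle

loss_periodic = sho_residual_loss(periodic, 1.0, omega)
loss_drifting = sho_residual_loss(drifting, 1.0, omega)
```

A sequence that follows the sine solution incurs almost no penalty, while a drifting forecast is penalized, so adding such a term to a data-fitting loss would push long rollouts toward periodic behavior.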
AP · Feb 22
Dynamic Elasticity Between Forest Loss and Carbon Emissions: A Subnational Panel Analysis of the United States
Keonvin Park
Accurate quantification of the relationship between forest loss and associated carbon emissions is critical for both environmental monitoring and policy evaluation. Although many studies have documented spatial patterns of forest degradation, there is limited understanding of the dynamic elasticity linking tree cover loss to carbon emissions at subnational scales. In this paper, we construct a comprehensive panel dataset of annual forest loss and carbon emission estimates for U.S. subnational administrative units from 2001 to 2023, based on the Hansen Global Forest Change dataset. We apply fixed effects and dynamic panel regression techniques to isolate within-region variation and account for temporal persistence in emissions. Our results show that forest loss has a significant positive short-run elasticity with carbon emissions, and that emissions exhibit strong persistence over time. Importantly, the estimated long-run elasticity, accounting for autoregressive dynamics, is substantially larger than the short-run effect, indicating cumulative impacts of repeated forest loss events. These findings highlight the importance of modeling temporal dynamics when assessing environmental responses to land cover change. The dynamic elasticity framework proposed here offers a robust and interpretable tool for analyzing environmental change processes, and can inform both regional monitoring systems and carbon accounting frameworks.
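The gap between short-run and long-run elasticity follows from the autoregressive structure. Assuming a log-log model of the form ln E_t = ρ ln E_{t-1} + β ln L_t + (fixed effects and error), with |ρ| < 1, a permanent change in forest loss accumulates to β / (1 − ρ). The model form and the coefficient values below are illustrative assumptions, not estimates from the paper.

```python
def long_run_elasticity(beta, rho):
    """Long-run elasticity beta / (1 - rho) for a stable AR(1) dynamic model."""
    if not abs(rho) < 1.0:
        raise ValueError("rho must satisfy |rho| < 1 for a finite long-run effect")
    return beta / (1.0 - rho)


def cumulative_response(beta, rho, horizon):
    """Effect of a permanent unit change in ln(loss) after `horizon` periods."""
    return sum(beta * rho**k for k in range(horizon))


# Hypothetical coefficients, chosen only to illustrate the arithmetic.
beta, rho = 0.2, 0.6
short_run = beta
long_run = long_run_elasticity(beta, rho)
after_200 = cumulative_response(beta, rho, 200)
```

With these illustrative values the long-run elasticity is 2.5 times the short-run one, and the period-by-period sum converges to the same figure, which is why strong persistence implies a much larger cumulative effect.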
CV · Feb 9
Understanding Image2Video Domain Shift in Food Segmentation: An Instance-level Analysis on Apples
Keonvin Park, Aditya Pal, Jin Hong Mok
Food segmentation models trained on static images have achieved strong performance on benchmark datasets; however, their reliability in video settings remains poorly understood. In real-world applications such as food monitoring and instance counting, segmentation outputs must be temporally consistent, yet image-trained models often break down when deployed on videos. In this work, we analyze this failure through an instance segmentation and tracking perspective, focusing on apples as a representative food category. Models are trained solely on image-level food segmentation data and evaluated on video sequences using an instance segmentation with tracking-by-matching framework, enabling object-level temporal analysis. Our results reveal that high frame-wise segmentation accuracy does not translate to stable instance identities over time. Temporal appearance variations, particularly illumination changes, specular reflections, and texture ambiguity, lead to mask flickering and identity fragmentation, resulting in significant errors in apple counting. These failures are largely overlooked by conventional image-based metrics, which substantially overestimate real-world video performance. Beyond diagnosing the problem, we examine practical remedies that do not require full video supervision, including post-hoc temporal regularization and self-supervised temporal consistency objectives. Our findings suggest that the root cause of failure lies in image-centric training objectives that ignore temporal coherence, rather than model capacity. This study highlights a critical evaluation gap in food segmentation research and motivates temporally-aware learning and evaluation protocols for video-based food analysis.
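Identity fragmentation can be shown with a toy tracking-by-matching loop. The sketch below links masks across consecutive frames by greedy mask IoU. The greedy rule, the 0.5 threshold, the pixel-set mask representation, and the names `track_by_matching` and `box` are assumptions; the abstract does not specify the matching procedure used in the paper.

```python
def mask_iou(a, b):
    """IoU of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def box(r0, c0, h, w):
    """Rectangular toy mask as a set of pixel coordinates."""
    return {(r, c) for r in range(r0, r0 + h) for c in range(c0, c0 + w)}


def track_by_matching(frames, iou_threshold=0.5):
    """Assign IDs by greedily matching each frame's masks to the previous frame."""
    next_id = 0
    previous = []  # (track_id, mask) pairs from the previous frame
    all_ids = []
    for masks in frames:
        candidates = sorted(
            ((mask_iou(m, pm), i, j)
             for i, m in enumerate(masks)
             for j, (_, pm) in enumerate(previous)),
            reverse=True,
        )
        ids = [None] * len(masks)
        used_previous = set()
        for iou, i, j in candidates:
            if iou < iou_threshold:
                break  # remaining candidates overlap too little to match
            if ids[i] is None and j not in used_previous:
                ids[i] = previous[j][0]
                used_previous.add(j)
        for i in range(len(masks)):
            if ids[i] is None:  # unmatched mask starts a new identity
                ids[i] = next_id
                next_id += 1
        previous = list(zip(ids, masks))
        all_ids.append(ids)
    return all_ids


# One apple over three frames; the mask shrinks sharply in the last frame.
frames = [[box(0, 0, 4, 4)], [box(0, 1, 4, 4)], [box(0, 2, 2, 2)]]
ids = track_by_matching(frames)
apple_count = len({i for frame_ids in ids for i in frame_ids})
```

The mask flicker in the third frame drops the IoU below the threshold, so the same apple receives a second identity and the count becomes two, even though every individual frame contains a plausible mask.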