Orca: Ocean Significant Wave Height Estimation with Spatio-temporally Aware Large Language Models
This work addresses accurate wave height prediction for applications such as marine energy and early warning systems. The contribution is incremental: it builds on existing LLMs and adds a novel encoding module.
The paper tackles significant wave height estimation in marine science by proposing Orca, a framework that adds spatio-temporal awareness to large language models so that they can work with limited observational data. It reports state-of-the-art performance on the Gulf of Mexico dataset.
Significant wave height (SWH) is a vital metric in marine science, and accurate SWH estimation is crucial for applications such as marine energy development, fishery, and early warning systems for potential risks. Traditional SWH estimation methods based on numerical models and physical theories are hindered by computational inefficiency. Recently, machine learning has emerged as an appealing alternative that improves accuracy and reduces computation time. However, because observational technology is limited and costly, the scarcity of real-world data restricts the potential of machine learning models. To overcome these limitations, we propose Orca, an ocean SWH estimation framework. Specifically, Orca enhances the limited spatio-temporal reasoning abilities of classic LLMs with a novel spatio-temporally aware encoding module. By segmenting the limited buoy observational data temporally, encoding the buoys' locations spatially, and designing prompt templates, Orca capitalizes on the robust generalization ability of LLMs to estimate significant wave height effectively with limited data. Experimental results on the Gulf of Mexico demonstrate that Orca achieves state-of-the-art performance in SWH estimation.
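The three input-preparation steps named in the abstract (temporal segmentation, spatial encoding of buoy locations, prompt templates) can be illustrated with a minimal sketch. It is not Orca's implementation: the abstract gives no details, so the window length, stride, sinusoidal location features, prompt wording, and all function names (`segment_series`, `encode_location`, `build_prompt`) are assumptions chosen for illustration only.

```python
import math


def segment_series(values, patch_len, stride):
    """Split a 1-D observation series into fixed-length windows.

    Assumption: overlapping windows ("patches"); the paper's actual
    segmentation scheme is not specified in the abstract.
    """
    last_start = len(values) - patch_len
    return [values[i:i + patch_len] for i in range(0, last_start + 1, stride)]


def encode_location(lat_deg, lon_deg, num_freqs=2):
    """Map latitude/longitude to sinusoidal features.

    Assumption: a generic sin/cos encoding used as a stand-in for the
    paper's spatial encoding, which is not described in the abstract.
    """
    feats = []
    for angle in (math.radians(lat_deg), math.radians(lon_deg)):
        for k in range(num_freqs):
            feats.append(math.sin((2 ** k) * angle))
            feats.append(math.cos((2 ** k) * angle))
    return feats


def build_prompt(buoy_id, lat_deg, lon_deg, n_patches):
    """Fill a hypothetical prompt template with buoy metadata."""
    return (
        f"Buoy {buoy_id} at latitude {lat_deg:.2f}, longitude {lon_deg:.2f}. "
        f"Given {n_patches} segments of recent observations, "
        "estimate the significant wave height."
    )


# Illustrative values only, not real buoy measurements.
series = [1.0, 1.2, 1.1, 1.4, 1.3, 1.5]
patches = segment_series(series, patch_len=4, stride=2)
location = encode_location(25.0, -90.0)
prompt = build_prompt("B1", 25.0, -90.0, len(patches))
```

In a full pipeline, the patches and location features would be embedded and passed to the language model together with the prompt text; that part depends on model details the abstract does not provide.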