CV Sep 14, 2024
On the Generalizability of Foundation Models for Crop Type Mapping
Yi-Chia Chang, Adam J. Stewart, Favyen Bastani et al.
Foundation models pre-trained using self-supervised learning have shown powerful transfer learning capabilities on various downstream tasks, including language understanding, text generation, and image recognition. The Earth observation (EO) field has produced several foundation models pre-trained directly on multispectral satellite imagery for applications like precision agriculture, wildfire and drought monitoring, and natural disaster response. However, few studies have investigated the ability of these models to generalize to new geographic locations, and potential concerns of geospatial bias -- models trained on data-rich developed nations not transferring well to data-scarce developing nations -- remain. We evaluate three popular EO foundation models, SSL4EO-S12, SatlasPretrain, and ImageNet, on five crop classification datasets across five continents. Results show that pre-trained weights designed explicitly for Sentinel-2, such as SSL4EO-S12, outperform general pre-trained weights like ImageNet. While only 100 labeled images are sufficient for achieving high overall accuracy, 900 images are required to mitigate class imbalance and improve average accuracy.
17.3 NI May 6
SILC: Lookahead Caching for Short-form Video Delivery Systems
Maleeha Masood, Shreya Kannan, Om Chabra et al.
Short video platforms like TikTok, Instagram Reels, and YouTube Shorts have gained immense popularity in the last few years and are responsible for a large and growing fraction of Internet traffic. We identify two unique opportunities for improving short video delivery using their existing interactions with content delivery networks (CDNs). First, short videos use a push-based recommendation system, where the user is presented with a sequence of videos recommended by the algorithm rather than explicitly picking content to watch (as on YouTube, for example). Such push-based short video systems offer a unique opportunity for system design by providing visibility into upcoming requests. Second, the popularity of these videos follows a highly skewed Pareto distribution, leading to geographical and temporal overlap among the videos being served. We leverage these opportunities to build SILC, a lookahead-aware caching system aimed at (i) reducing CDN cache miss rates and (ii) reducing midgress bandwidth between the CDN and the origin server. Our evaluation of SILC uses traces that we collect from real users through (i) an in-person user study and (ii) a data donation program involving 100 TikTok users across the world. Using a combination of these traces, we simulate traffic from 10,000 simultaneous users. Our evaluation shows that, compared to 10 state-of-the-art heuristic and learning-based cache eviction policies, SILC reduces a CDN's midgress costs by 11.1% to 111%.
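To illustrate why visibility into upcoming requests can help a cache, here is a minimal sketch of lookahead-aware eviction. It is not SILC's algorithm, which the abstract does not specify. It is a Belady-style "evict the farthest next use" rule restricted to a finite lookahead window, falling back to least-recently-used (LRU) order for items that do not appear in the window. The function name `simulate_lookahead_cache` and its parameters are hypothetical, and every video is assumed to occupy one cache slot.

```python
from collections import OrderedDict


def simulate_lookahead_cache(requests, capacity, lookahead):
    """Count cache misses for a list of video IDs (illustrative only).

    requests:  list of video IDs in request order
    capacity:  number of videos the cache can hold (assumed unit-sized)
    lookahead: how many upcoming requests are visible at each step;
               0 reduces the policy to plain LRU
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")

    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0

    for i, video in enumerate(requests):
        if video in cache:
            cache.move_to_end(video)  # hit: mark as most recently used
            continue

        misses += 1
        if len(cache) >= capacity:
            # The upcoming requests that the cache is allowed to see.
            window = requests[i + 1 : i + 1 + lookahead]
            victim, farthest = None, -1
            for candidate in cache:  # iterates in LRU-first order
                if candidate not in window:
                    # Not needed within the window: evict the least
                    # recently used such item right away.
                    victim = candidate
                    break
                distance = window.index(candidate)
                if distance > farthest:
                    victim, farthest = candidate, distance
            del cache[victim]
        cache[video] = True

    return misses
```

On the toy trace `["a", "b", "c", "a", "b", "d", "a", "b"]` with a two-slot cache, a lookahead of 0 (plain LRU) misses on every request, while a lookahead of 3 avoids two of those misses. Each miss at the CDN triggers a fetch from the origin, so fewer misses also means less midgress traffic in this simplified model.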