83.6LGApr 12Code
TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and AnalysisZhengpeng Feng, Clement Atzberger, Sadiq Jaffer et al.
Satellite Earth-observation (EO) time series in the optical and microwave ranges of the electromagnetic spectrum are often irregular due to orbital patterns and cloud obstruction. Compositing addresses these issues but loses information with respect to vegetation phenology, which is critical for many downstream tasks. Instead, we present TESSERA, a pixel-wise foundation model for multi-modal (Sentinel-1/2) EO time series that learns robust, label-efficient embeddings. During model training, TESSERA uses Barlow Twins and sparse random temporal sampling to enforce invariance to the selection of valid observations. We employ two key regularizers: global shuffling to decorrelate spatial neighborhoods and mix-based regulation to improve invariance under extreme sparsity. We find that for diverse classification, segmentation, and regression tasks, TESSERA embeddings deliver state-of-the-art accuracy with high label efficiency, often requiring only a small task head and minimal computation. To democratize access, adhere to FAIR - principles, and simplify use, we release global, annual, 10m, pixel-wise int8 embeddings together with open weights/code and lightweight adaptation heads, thus providing practical tooling for large-scale retrieval and inference at planetary scale. All code and data are available at: https://github.com/ucam-eo/tessera.
CROct 27, 2023
Enhancing Enterprise Network Security: Comparing Machine-Level and Process-Level Analysis for Dynamic Malware DetectionBaskoro Adi Pratomo, Toby Jackson, Pete Burnap et al.
Analysing malware is important to understand how malicious software works and to develop appropriate detection and prevention methods. Dynamic analysis can overcome evasion techniques commonly used to bypass static analysis and provide insights into malware runtime activities. Much research on dynamic analysis focused on investigating machine-level information (e.g., CPU, memory, network usage) to identify whether a machine is running malicious activities. A malicious machine does not necessarily mean all running processes on the machine are also malicious. If we can isolate the malicious process instead of isolating the whole machine, we could kill the malicious process, and the machine can keep doing its job. Another challenge dynamic malware detection research faces is that the samples are executed in one machine without any background applications running. It is unrealistic as a computer typically runs many benign (background) applications when a malware incident happens. Our experiment with machine-level data shows that the existence of background applications decreases previous state-of-the-art accuracy by about 20.12% on average. We also proposed a process-level Recurrent Neural Network (RNN)-based detection model. Our proposed model performs better than the machine-level detection model; 0.049 increase in detection rate and a false-positive rate below 0.1.
LGJun 25, 2025
TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and AnalysisZhengpeng Feng, Clement Atzberger, Sadiq Jaffer et al.
Satellite Earth-observation (EO) time series in the optical and microwave ranges of the electromagnetic spectrum are often irregular due to orbital patterns and cloud obstruction. Compositing addresses these issues but loses information with respect to vegetation phenology, which is critical for many downstream tasks. Instead, we present TESSERA, a pixel-wise foundation model for multi-modal (Sentinel-1/2) EO time series that learns robust, label-efficient embeddings. During model training, TESSERA uses Barlow Twins and sparse random temporal sampling to enforce invariance to the selection of valid observations. We employ two key regularizers: global shuffling to decorrelate spatial neighborhoods and mix-based regulation to improve invariance under extreme sparsity. We find that for diverse classification, segmentation, and regression tasks, TESSERA embeddings deliver state-of-the-art accuracy with high label efficiency, often requiring only a small task head and minimal computation. To democratize access, adhere to FAIR principles, and simplify use, we release global, annual, 10m, pixel-wise int8 embeddings together with open weights/code and lightweight adaptation heads, thus providing practical tooling for large-scale retrieval and inference at planetary scale. The model training/inference code, downstream task code, and pre-generated embeddings can be accessed at https://github.com/ucam-eo