CVMay 4, 2022
Self-Supervised Learning for Invariant Representations from Multi-Spectral and SAR ImagesPallavi Jain, Bianca Schoen-Phelan, Robert Ross
Self-Supervised learning (SSL) has become the new state-of-art in several domain classification and segmentation tasks. Of these, one popular category in SSL is distillation networks such as BYOL. This work proposes RSDnet, which applies the distillation network (BYOL) in the remote sensing (RS) domain where data is non-trivially different from natural RGB images. Since Multi-spectral (MS) and synthetic aperture radar (SAR) sensors provide varied spectral and spatial resolution information, we utilised them as an implicit augmentation to learn invariant feature embeddings. In order to learn RS based invariant features with SSL, we trained RSDnet in two ways, i.e., single channel feature learning and three channel feature learning. This work explores the usefulness of single channel feature learning from random MS and SAR bands compared to the common notion of using three or more bands. In our linear evaluation, these single channel features reached a 0.92 F1 score on the EuroSAT classification task and 59.6 mIoU on the DFC segmentation task for certain single bands. We also compared our results with ImageNet weights and showed that the RS based SSL model outperforms the supervised ImageNet based model. We further explored the usefulness of multi-modal data compared to single modality data, and it is shown that utilising MS and SAR data learn better invariant representations than utilising only MS data.
10.0GTMar 30
Bribery's Influence on Ranked AggregationPallavi Jain, Anshul Thakur
Kemeny Consensus is a well-known rank aggregation method in social choice theory. In this method, given a set of rankings, the goal is to find a ranking $Î $ that minimizes the total Kendall tau distance to the input rankings. Computing a Kemeny consensus is NP-hard, and even verifying whether a given ranking is a Kemeny consensus is coNP-complete. Fitzsimmons and Hemaspaandra [IJCAI 2021] established the computational intractability of achieving a desired consensus through manipulative actions. Kemeny Consensus is an optimisation problem related to Kemeny's rule. In this paper, we consider a decision problem related to Kemeny's rule, known as Kemeny Score, in which the goal is to decide whether there exists a ranking $Î $ whose total Kendall tau distance from the given rankings is at most $k$. Computation of Kemeny score is known to be NP-complete. In this paper, we investigate the impact of several manipulation actions on the Kemeny Score problem, in which given a set of rankings, an integer $k$, and a ranking $X$, the question is to decide whether it is possible to manipulate the given rankings so that the total Kendall tau distance of $X$ from the manipulated rankings is at most $k$. We show that this problem can be solved in polynomial time for various manipulation actions. Interestingly, these same manipulation actions are known to be computationally hard for Kemeny consensus.
CVApr 28, 2025Code
EcoWikiRS: Learning Ecological Representation of Satellite Images from Weak Supervision with Species Observations and WikipediaValerie Zermatten, Javiera Castillo-Navarro, Pallavi Jain et al.
The presence of species provides key insights into the ecological properties of a location such as land cover, climatic conditions or even soil properties. We propose a method to predict such ecological properties directly from remote sensing (RS) images by aligning them with species habitat descriptions. We introduce the EcoWikiRS dataset, consisting of high-resolution aerial images, the corresponding geolocated species observations, and, for each species, the textual descriptions of their habitat from Wikipedia. EcoWikiRS offers a scalable way of supervision for RS vision language models (RS-VLMs) for ecology. This is a setting with weak and noisy supervision, where, for instance, some text may describe properties that are specific only to part of the species' niche or is irrelevant to a specific image. We tackle this by proposing WINCEL, a weighted version of the InfoNCE loss. We evaluate our model on the task of ecosystem zero-shot classification by following the habitat definitions from the European Nature Information System (EUNIS). Our results show that our approach helps in understanding RS images in a more ecologically meaningful manner. The code and the dataset are available at https://github.com/eceo-epfl/EcoWikiRS.
CVAug 16, 2025Code
TimeSenCLIP: A Vision-Language Model for Remote Sensing Using Single-Pixel Time SeriesPallavi Jain, Diego Marcos, Dino Ienco et al.
Vision-language models have shown significant promise in remote sensing applications, particularly for land-use and land-cover (LULC) via zero-shot classification and retrieval. However, current approaches face two key challenges: reliance on large spatial tiles that increase computational cost, and dependence on text-based supervision, which is often not readily available. In this work, we present TimeSenCLIP, a lightweight framework that reevaluate the role of spatial context by evaluating the effectiveness of a single pixel by leveraging its temporal and spectral dimensions, for classifying LULC and ecosystem types. By leveraging spectral and temporal information from Sentinel-2 imagery and cross-view learning with geo-tagged ground-level photos, we minimises the need for caption-based training while preserving semantic alignment between overhead (satellite) and ground perspectives. Our approach is grounded in the LUCAS and Sen4Map datasets, and evaluated on classification tasks including LULC, crop type, and ecosystem type. We demonstrate that single pixel inputs, when combined with temporal and spectral cues, are sufficient for thematic mapping, offering a scalable and efficient alternative for large-scale remote sensing applications. Code is available at https://github.com/pallavijain-pj/TimeSenCLIP
CVDec 11, 2024
SenCLIP: Enhancing zero-shot land-use mapping for Sentinel-2 with ground-level promptingPallavi Jain, Dino Ienco, Roberto Interdonato et al.
Pre-trained vision-language models (VLMs), such as CLIP, demonstrate impressive zero-shot classification capabilities with free-form prompts and even show some generalization in specialized domains. However, their performance on satellite imagery is limited due to the underrepresentation of such data in their training sets, which predominantly consist of ground-level images. Existing prompting techniques for satellite imagery are often restricted to generic phrases like a satellite image of ..., limiting their effectiveness for zero-shot land-use and land-cover (LULC) mapping. To address these challenges, we introduce SenCLIP, which transfers CLIPs representation to Sentinel-2 imagery by leveraging a large dataset of Sentinel-2 images paired with geotagged ground-level photos from across Europe. We evaluate SenCLIP alongside other SOTA remote sensing VLMs on zero-shot LULC mapping tasks using the EuroSAT and BigEarthNet datasets with both aerial and ground-level prompting styles. Our approach, which aligns ground-level representations with satellite imagery, demonstrates significant improvements in classification accuracy across both prompt styles, opening new possibilities for applying free-form textual descriptions in zero-shot LULC mapping.
GTMay 23, 2025
Efficient Algorithms for Electing Successive CommitteesPallavi Jain, Andrzej Kaczmarczyk
In a recently introduced model of successive committee elections (Bredereck et al., AAAI-20) for a given set of ordinal or approval preferences one aims to find a sequence of a given length of "best" same-size committees such that each candidate is a member of a limited number of consecutive committees. However, the practical usability of this model remains limited, as the described task turns out to be NP-hard for most selection criteria already for seeking committees of size three. Non-trivial or somewhat efficient algorithms for these cases are lacking too. Motivated by a desire to unlock the full potential of the described temporal model of committee elections, we devise (parameterized) algorithms that effectively solve the mentioned hard cases in realistic scenarios of a moderate number of candidates or of a limited time horizon.
GTDec 9, 2020
Participatory Budgeting with Project GroupsPallavi Jain, Krzysztof Sornat, Nimrod Talmon et al.
We study a generalization of the standard approval-based model of participatory budgeting (PB), in which voters are providing approval ballots over a set of predefined projects and -- in addition to a global budget limit, there are several groupings of the projects, each group with its own budget limit. We study the computational complexity of identifying project bundles that maximize voter satisfaction while respecting all budget limits. We show that the problem is generally intractable and describe efficient exact algorithms for several special cases, including instances with only few groups and instances where the group structure is close to be hierarchical, as well as efficient approximation algorithms. Our results could allow, e.g., municipalities to hold richer PB processes that are thematically and geographically inclusive.
DSFeb 19, 2017
Polynomial Time Efficient Construction Heuristics for Vertex Separation Minimization ProblemPallavi Jain, Gur Saran, Kamal Srivastava
Vertex Separation Minimization Problem (VSMP) consists of finding a layout of a graph G = (V,E) which minimizes the maximum vertex cut or separation of a layout. It is an NP-complete problem in general for which metaheuristic techniques can be applied to find near optimal solution. VSMP has applications in VLSI design, graph drawing and computer language compiler design. VSMP is polynomially solvable for grids, trees, permutation graphs and cographs. Construction heuristics play a very important role in the metaheuristic techniques as they are responsible for generating initial solutions which lead to fast convergence. In this paper, we have proposed three construction heuristics H1, H2 and H3 and performed experiments on Grids, Small graphs, Trees and Harwell Boeing graphs, totaling 248 instances of graphs. Experiments reveal that H1, H2 and H3 are able to achieve best results for 88.71%, 43.5% and 37.1% of the total instances respectively while the best construction heuristic in the literature achieves the best solution for 39.9% of the total instances. We have also compared the results with the state-of-the-art metaheuristic GVNS and observed that the proposed construction heuristics improves the results for some of the input instances. It was found that GVNS obtained best results for 82.9% instances of all input instances and the heuristic H1 obtained best results for 82.3% of all input instances.