Alina Zare

CV
h-index: 7
49 papers
569 citations
Novelty: 36%
AI Score: 42

49 Papers

CV | Oct 25, 2022
Shared Manifold Learning Using a Triplet Network for Multiple Sensor Translation and Fusion with Missing Data

Aditya Dutt, Alina Zare, Paul Gader

Heterogeneous data fusion can enhance the robustness and accuracy of an algorithm on a given task. However, due to the difference in various modalities, aligning the sensors and embedding their information into discriminative and compact representations is challenging. In this paper, we propose a Contrastive learning based MultiModal Alignment Network (CoMMANet) to align data from different sensors into a shared and discriminative manifold where class information is preserved. The proposed architecture uses a multimodal triplet autoencoder to cluster the latent space in such a way that samples of the same classes from each heterogeneous modality are mapped close to each other. Since all the modalities exist in a shared manifold, a unified classification framework is proposed. The resulting latent space representations are fused to perform more robust and accurate classification. In a missing sensor scenario, the latent space of one sensor is easily and efficiently predicted using another sensor's latent space, thereby allowing sensor translation. We conducted extensive experiments on a manually labeled multimodal dataset containing hyperspectral data from AVIRIS-NG and NEON, and LiDAR (light detection and ranging) data from NEON. Lastly, the model is validated on two benchmark datasets: Berlin Dataset (hyperspectral and synthetic aperture radar) and MUUFL Gulfport Dataset (hyperspectral and LiDAR). A comparison made with other methods demonstrates the superiority of this method. We achieved a mean overall accuracy of 94.3% on the MUUFL dataset and the best overall accuracy of 71.26% on the Berlin dataset, which is better than other state-of-the-art approaches.

CV | Sep 8, 2022
Histogram Layers for Synthetic Aperture Sonar Imagery

Joshua Peeples, Alina Zare, Jeffrey Dale et al.

Synthetic aperture sonar (SAS) imagery is crucial for several applications, including target recognition and environmental segmentation. Deep learning models have led to much success in SAS analysis; however, the features extracted by these approaches may not be suitable for capturing certain textural information. To address this problem, we present a novel application of histogram layers on SAS imagery. The addition of histogram layer(s) within the deep learning models improved performance by incorporating statistical texture information on both synthetic and real-world datasets.

CV | Oct 11, 2021 | Code
Learnable Adaptive Cosine Estimator (LACE) for Image Classification

Joshua Peeples, Connor McCurley, Sarah Walker et al.

In this work, we propose a new loss to improve feature discriminability and classification performance. Motivated by the adaptive cosine/coherence estimator (ACE), our proposed method incorporates angular information that is inherently learned by artificial neural networks. Our learnable ACE (LACE) transforms the data into a new "whitened" space that improves the inter-class separability and intra-class compactness. We compare our LACE to alternative state-of-the-art softmax-based and feature regularization approaches. Our results show that the proposed method can serve as a viable alternative to cross entropy and angular softmax approaches. Our code is publicly available: https://github.com/GatorSense/LACE.

LG | Jan 1, 2020 | Code
Histogram Layers for Texture Analysis

Joshua Peeples, Weihuang Xu, Alina Zare

An essential aspect of texture analysis is the extraction of features that describe the distribution of values in local, spatial regions. We present a localized histogram layer for artificial neural networks. Instead of computing global histograms as done previously, the proposed histogram layer directly computes the local, spatial distribution of features for texture analysis, and parameters for the layer are estimated during backpropagation. We compare our method with state-of-the-art texture encoding methods such as the Deep Encoding Network Pooling, Deep Texture Encoding Network, Fisher Vector convolutional neural network, and Multi-level Texture Encoding and Representation on three material/texture datasets: (1) the Describable Texture Dataset; (2) an extension of the ground terrain in outdoor scenes; and (3) a subset of the Materials in Context dataset. Results indicate that the inclusion of the proposed histogram layer improves performance. The source code for the histogram layer is publicly available: https://github.com/GatorSense/Histogram_Layer.
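The idea of a local, soft histogram can be sketched in a few lines. The version below is only an illustration, not the released layer: it assumes radial-basis-function binning on a 1-D signal, and the names `soft_local_histogram`, `centers`, and `widths` are hypothetical. In a network, the bin centers and widths would be the parameters updated by backpropagation; here they are fixed by hand.

```python
import math


def soft_local_histogram(values, centers, widths, window):
    """Return one soft histogram per sliding window of a 1-D signal.

    Assumption: each value votes for bin b with weight
    exp(-(widths[b] ** 2) * (value - centers[b]) ** 2), and the votes
    are averaged over the window. This is a sketch, not the authors' code.
    """
    histograms = []
    for start in range(len(values) - window + 1):
        patch = values[start:start + window]
        row = []
        for mu, gamma in zip(centers, widths):
            votes = [math.exp(-(gamma ** 2) * (x - mu) ** 2) for x in patch]
            row.append(sum(votes) / window)
        histograms.append(row)
    return histograms


# Toy signal with a "dark" region followed by a "bright" region.
signal = [0.0, 0.0, 1.0, 1.0]
for row in soft_local_histogram(signal, centers=[0.0, 1.0], widths=[2.0, 2.0], window=2):
    print([round(v, 3) for v in row])
```

Because the binning is smooth rather than a hard count, the output is differentiable with respect to the centers and widths, which is what makes learning them by gradient descent possible.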

CV | Apr 4
Task-Guided Multi-Annotation Triplet Learning for Remote Sensing Representations

Meilun Zhou, Alina Zare

Prior multi-task triplet loss methods relied on static weights to balance supervision between various types of annotation. However, static weighting requires tuning and does not account for how tasks interact when shaping a shared representation. To address this, the proposed task-guided multi-annotation triplet loss removes this dependency by selecting triplets through a mutual-information criterion that identifies triplets most informative across tasks. This strategy modifies which samples influence the representation rather than adjusting loss magnitudes. Experiments on an aerial wildlife dataset compare the proposed task-guided selection against several triplet loss setups for shaping a representation in an effective multi-task manner. The results show improved classification and regression performance and demonstrate that task-aware triplet selection produces a more effective shared representation for downstream tasks.

CV | Apr 4
Beyond Task-Driven Features for Object Detection

Meilun Zhou, Alina Zare

Task-driven features learned by modern object detectors optimize end task loss yet often capture shortcut correlations that fail to reflect underlying annotation structure. Such representations limit transfer, interpretability, and robustness when task definitions change or supervision becomes sparse. This paper introduces an annotation-guided feature augmentation framework that injects embeddings into an object detection backbone. The method constructs dense spatial feature grids from annotation-guided latent spaces and fuses them with feature pyramid representations to influence region proposal and detection heads. Experiments across wildlife and remote sensing datasets evaluate classification, localization, and data efficiency under multiple supervision regimes. Results show consistent improvements in object focus, reduced background sensitivity, and stronger generalization to unseen or weakly supervised tasks. The findings demonstrate that aligning features with annotation geometry yields more meaningful representations than purely task optimized features.

CV | Mar 25, 2024
Histogram Layers for Neural Engineered Features

Joshua Peeples, Salim Al Kharsa, Luke Saleh et al.

In the computer vision literature, many effective histogram-based features have been developed. These engineered features include local binary patterns and edge histogram descriptors, among others, and they have been shown to be informative features for a variety of computer vision tasks. In this paper, we explore whether these features can be learned through histogram layers embedded in a neural network and, therefore, be leveraged within deep learning frameworks. By using histogram features, local statistics of the feature maps from the convolutional neural networks can be used to better represent the data. We present neural versions of local binary pattern and edge histogram descriptors that jointly improve the feature representation and perform image classification. Experiments are presented on benchmark and real-world datasets.

CV | Apr 10, 2025
Multi-Task Learning with Multi-Annotation Triplet Loss for Improved Object Detection

Meilun Zhou, Aditya Dutt, Alina Zare

Triplet loss traditionally relies only on class labels and does not use all available information in multi-task scenarios where multiple types of annotations are available. This paper introduces a Multi-Annotation Triplet Loss (MATL) framework that extends triplet loss by incorporating additional annotations, such as bounding box information, alongside class labels in the loss formulation. By using these complementary annotations, MATL improves multi-task learning for tasks requiring both classification and localization. Experiments on an aerial wildlife imagery dataset demonstrate that MATL outperforms conventional triplet loss in both classification and localization. These findings highlight the benefit of using all available annotations for triplet loss in multi-task learning frameworks.
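For reference, the conventional class-label triplet loss that MATL extends can be sketched as follows. This is the standard hinge formulation with Euclidean distance, not the MATL loss itself; how bounding box annotations enter the formulation is not stated in the abstract, so it is not attempted here. The margin value and function names are illustrative choices.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embeddings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    `positive` shares the anchor's class label and `negative` does not.
    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


anchor = [0.0, 0.0]
near = [0.0, 1.0]   # distance 1 from the anchor
far = [3.0, 4.0]    # distance 5 from the anchor

print(triplet_loss(anchor, positive=near, negative=far))  # 0.0, already separated
print(triplet_loss(anchor, positive=far, negative=near))  # 5.0, badly ordered triplet
```

Only the class label decides which samples count as positive or negative here, which is the limitation the abstract points to when other annotations are available.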

LG | Jan 2, 2025
DiagrammaticLearning: A Graphical Language for Compositional Training Regimes

Mason Lary, Richard Samuelson, Alexander Wilentz et al.

Motivated by deep learning regimes with multiple interacting yet distinct model components, we introduce learning diagrams, graphical depictions of training setups that capture parameterized learning as data rather than code. A learning diagram compiles to a unique loss function on which component models are trained. The result of training on this loss is a collection of models whose predictions "agree" with one another. We show that a number of popular learning setups such as few-shot multi-task learning, knowledge distillation, and multi-modal learning can be depicted as learning diagrams. We further implement learning diagrams in a library that allows users to build diagrams of PyTorch and Flux.jl models. By implementing some classic machine learning use cases, we demonstrate how learning diagrams allow practitioners to build complicated models as compositions of smaller components, identify relationships between workflows, and manipulate models during or after training. Leveraging a category theoretic framework, we introduce a rigorous semantics for learning diagrams that puts such operations on a firm mathematical foundation.

CV | Jun 27, 2024
Cost-efficient Active Illumination Camera For Hyper-spectral Reconstruction

Yuxuan Zhang, T. M. Sazzad, Yangyang Song et al.

Hyper-spectral imaging has recently gained increasing attention for use in different applications, including agricultural investigation, ground tracking, remote sensing and many others. However, the high cost, large physical size and complicated operation process stop hyperspectral cameras from being employed for various applications and research fields. In this paper, we introduce a cost-efficient, compact and easy to use active illumination camera that may benefit many applications. We developed a fully functional prototype of such a camera. With the hope of helping with agricultural research, we tested our camera for plant root imaging. In addition, a U-Net model for spectral reconstruction was trained by using a reference hyperspectral camera's data as ground truth and our camera's data as input. We demonstrated our camera's ability to obtain additional information over a typical RGB camera. In addition, the ability to reconstruct hyperspectral data from multi-spectral input makes our device compatible with models and algorithms developed for hyperspectral applications with no modifications required.

LG | Jun 24, 2024
Quantifying Heterogeneous Ecosystem Services With Multi-Label Soft Classification

Zhihui Tian, John Upchurch, G. Austin Simon et al.

Understanding and quantifying ecosystem services are crucial for sustainable environmental management, conservation efforts, and policy-making. The advancement of remote sensing technology and machine learning techniques has greatly facilitated this process. Yet, ground truth labels, such as biodiversity, are very difficult and expensive to measure. In addition, more easily obtainable proxy labels, such as land use, often fail to capture the complex heterogeneity of the ecosystem. In this paper, we demonstrate how land use proxy labels can be implemented with a soft, multi-label classifier to predict ecosystem services with complex heterogeneity.

CV | Jan 20, 2022
PRMI: A Dataset of Minirhizotron Images for Diverse Plant Root Study

Weihuang Xu, Guohao Yu, Yiming Cui et al.

Understanding a plant's root system architecture (RSA) is crucial for a variety of plant science problem domains including sustainability and climate adaptation. Minirhizotron (MR) technology is a widely-used approach for phenotyping RSA non-destructively by capturing root imagery over time. Precisely segmenting roots from the soil in MR imagery is a critical step in studying RSA features. In this paper, we introduce a large-scale dataset of plant root images captured by MR technology. In total, there are over 72K RGB root images across six different species including cotton, papaya, peanut, sesame, sunflower, and switchgrass in the dataset. The images span a variety of conditions including varied root age, root structures, soil types, and depths under the soil surface. All of the images have been annotated with weak image-level labels indicating whether each image contains roots or not. The image-level labels can be used to support weakly supervised learning in plant root segmentation tasks. In addition, 63K images have been manually annotated to generate pixel-level binary masks indicating whether each pixel corresponds to root or not. These pixel-level binary masks can be used as ground truth for supervised learning in semantic segmentation tasks. By introducing this dataset, we aim to facilitate the automatic segmentation of roots and the research of RSA with deep learning and other image analysis algorithms.

CV | Dec 12, 2021
Image-to-Height Domain Translation for Synthetic Aperture Sonar

Dylan Stewart, Shawn Johnson, Alina Zare

Observations of seabed texture with synthetic aperture sonar are dependent upon several factors. In this work, we focus on collection geometry with respect to isotropic and anisotropic textures. The low grazing angle of the collection geometry, combined with orientation of the sonar path relative to anisotropic texture, poses a significant challenge for image-alignment and other multi-view scene understanding frameworks. We previously proposed using features captured from estimated seabed relief to improve scene understanding. While several methods have been developed to estimate seabed relief via intensity, no large-scale study exists in the literature. Furthermore, a dataset of coregistered seabed relief maps and sonar imagery is nonexistent to learn this domain translation. We address these problems by producing a large simulated dataset containing coregistered pairs of seabed relief and intensity maps from two unique sonar data simulation techniques. We apply three types of models, with varying complexity, to translate intensity imagery to seabed relief: a Gaussian Markov Random Field approach (GMRF), a conditional Generative Adversarial Network (cGAN), and UNet architectures. Methods are compared in reference to the coregistered simulated datasets using L1 error. Additionally, predictions on simulated and real SAS imagery are shown. Finally, models are compared on two datasets of hand-aligned SAS imagery and evaluated in terms of L1 error across multiple aspects in comparison to using intensity. Our comprehensive experiments show that the proposed UNet architectures outperform the GMRF and pix2pix cGAN models on seabed relief estimation for simulated and real SAS imagery.

LG | Nov 10, 2021
Cross-Layered Distributed Data-driven Framework For Enhanced Smart Grid Cyber-Physical Security

Allen Starke, Keerthiraj Nagaraj, Cody Ruben et al.

Smart Grid (SG) research and development has drawn much attention from academia, industry and government due to the great impact it will have on society, economics and the environment. Securing the SG is a considerably significant challenge due to the increased dependency on communication networks to assist in physical process control, exposing them to various cyber-threats. In addition to attacks that change measurement values using False Data Injection (FDI) techniques, attacks on the communication network may disrupt the power system's real-time operation by intercepting messages, or by flooding the communication channels with unnecessary data. Addressing these attacks requires a cross-layer approach. In this paper, a cross-layered strategy is presented, called Cross-Layer Ensemble CorrDet with Adaptive Statistics (CECD-AS), which integrates the detection of faulty SG measurement data as well as inconsistent network inter-arrival times and transmission delays for more reliable and accurate anomaly detection and attack interpretation. Numerical results show that CECD-AS can detect multiple False Data Injections, Denial of Service (DoS) and Man In The Middle (MITM) attacks with a high F1-score compared to current approaches that only use SG measurement data for detection such as the traditional physics-based State Estimation, Ensemble CorrDet with Adaptive Statistics strategy and other machine learning classification-based detection schemes.

LG | Oct 19, 2021
Robust Semi-Supervised Classification using GANs with Self-Organizing Maps

Ronald Fick, Paul Gader, Alina Zare

Generative adversarial networks (GANs) have shown tremendous promise in learning to generate data and are effective at aiding semi-supervised classification. However, to this point, semi-supervised GAN methods make the assumption that the unlabeled data set contains only samples of the joint distribution of the classes of interest, referred to as inliers. Consequently, when presented with a sample from other distributions, referred to as outliers, GANs perform poorly at determining that they are not qualified to make a decision on the sample. The problem of discriminating outliers from inliers while maintaining classification accuracy is referred to here as the DOIC problem. In this work, we describe an architecture that combines self-organizing maps (SOMs) with SS-GANs with the goal of mitigating the DOIC problem, and we present experimental results indicating that the architecture achieves the goal. Multiple experiments were conducted on hyperspectral image data sets. The SS-GANs performed slightly better than supervised GANs on classification problems with and without the SOM. Incorporating the SOMs into the SS-GANs and the supervised GANs led to substantial mitigation of the DOIC problem when compared to SS-GANs and GANs without the SOMs. Furthermore, the SS-GANs performed much better than GANs on the DOIC problem, even without the SOMs.

CV | Oct 14, 2021
Possibilistic Fuzzy Local Information C-Means with Automated Feature Selection for Seafloor Segmentation

Joshua Peeples, Daniel Suen, Alina Zare et al.

The Possibilistic Fuzzy Local Information C-Means (PFLICM) method is presented as a technique to segment side-look synthetic aperture sonar (SAS) imagery into distinct regions of the sea-floor. In this work, we investigate and present the results of an automated feature selection approach for SAS image segmentation. The chosen features and resulting segmentation from the image will be assessed based on a select quantitative clustering validity criterion and the subset of the features that reach a desired threshold will be used for the segmentation process.

CV | May 5, 2021
RandCrowns: A Quantitative Metric for Imprecisely Labeled Tree Crown Delineation

Dylan Stewart, Alina Zare, Sergio Marconi et al.

Supervised methods for object delineation in remote sensing require labeled ground-truth data. Gathering sufficient high quality ground-truth data is difficult, especially when targets are of irregular shape or difficult to distinguish from background or neighboring objects. Tree crown delineation provides key information from remote sensing images for forestry, ecology, and management. However, tree crowns in remote sensing imagery are often difficult to label and annotate due to irregular shape, overlapping canopies, shadowing, and indistinct edges. There are also multiple approaches to annotation in this field (e.g., rectangular boxes vs. convex polygons) that further contribute to annotation imprecision. However, current evaluation methods do not account for this uncertainty in annotations, and quantitative metrics for evaluation can vary across multiple annotators. In this paper, we address these limitations by developing an adaptation of the Rand index for weakly-labeled crown delineation that we call RandCrowns. Our new RandCrowns evaluation metric provides a method to appropriately evaluate delineated tree crowns while taking into account imprecision in the ground-truth delineations. The RandCrowns metric reformulates the Rand index by adjusting the areas over which each term of the index is computed to account for uncertain and imprecise object delineation labels. Quantitative comparisons to the commonly used intersection over union method show a decrease in the variance generated by differences among multiple annotators. Combined with qualitative examples, our results suggest that the RandCrowns metric is more robust for scoring target delineations in the presence of uncertainty and imprecision in annotations that are inherent to tree crown delineation.
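The sensitivity of intersection over union (IoU) to annotator disagreement is easy to see with a small example. The sketch below implements plain IoU for axis-aligned boxes given as `(x_min, y_min, x_max, y_max)`; the box coordinates are made up for illustration, and the RandCrowns metric itself is not reproduced here.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical case: two annotators outline the same crown slightly differently.
annotator_1 = (0.0, 0.0, 10.0, 10.0)
annotator_2 = (2.0, 2.0, 12.0, 12.0)
prediction = (0.0, 0.0, 10.0, 10.0)

print(box_iou(prediction, annotator_1))
print(box_iou(prediction, annotator_2))
```

The same prediction receives a perfect score against one annotation and a score below 0.5 against the other, which is the kind of annotator-dependent variance the abstract describes.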

CV | Mar 8, 2021
The Weakly-Labeled Rand Index

Dylan Stewart, Anna Hampton, Alina Zare et al.

Synthetic Aperture Sonar (SAS) surveys produce imagery with large regions of transition between seabed types. Due to these regions, it is difficult to label and segment the imagery and, furthermore, challenging to score the image segmentations appropriately. While there are many approaches to quantify performance in standard crisp segmentation schemes, drawing hard boundaries in remote sensing imagery where gradients and regions of uncertainty exist is inappropriate. These cases warrant weak labels and an associated appropriate scoring approach. In this paper, a labeling approach and an associated modified version of the Rand index for weakly-labeled data are introduced to address these issues. Results are evaluated with the new index and compared to traditional segmentation evaluation methods. Experimental results on a SAS data set containing must-link and cannot-link labels show that our Weakly-Labeled Rand index scores segmentations appropriately in reference to qualitative performance and is more suitable than traditional quantitative metrics for scoring weakly-labeled data.
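As background, the classical (crisp) Rand index that the weakly-labeled version modifies can be sketched as below. It counts the fraction of sample pairs on which two labelings agree about "same segment" versus "different segment". The weakly-labeled modification for must-link and cannot-link labels is not specified in the abstract and is not attempted here.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Classical Rand index between two crisp labelings of the same samples."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same samples")
    if len(labels_a) < 2:
        raise ValueError("at least two samples are needed to form a pair")
    pairs = list(combinations(range(len(labels_a)), 2))
    # A pair agrees when both labelings link it or both labelings separate it.
    agreeing = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreeing / len(pairs)


reference = [0, 0, 1, 1]
print(rand_index(reference, [1, 1, 0, 0]))  # same partition, different label names
print(rand_index(reference, [0, 1, 0, 1]))  # a conflicting partition
```

Because the index is defined on pairs rather than on label names, it pairs naturally with must-link and cannot-link style annotations, which are themselves statements about pairs.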

IV | Jan 6, 2021
Explainable Systematic Analysis for Synthetic Aperture Sonar Imagery

Sarah Walker, Joshua Peeples, Jeff Dale et al.

In this work, we present an in-depth and systematic analysis using tools such as local interpretable model-agnostic explanations (LIME) (arXiv:1602.04938) and divergence measures to analyze what changes lead to improvement in performance in fine tuned models for synthetic aperture sonar (SAS) data. We examine the sensitivity to factors in the fine tuning process such as class imbalance. Our findings show not only an improvement in seafloor texture classification, but also provide greater insight into what features play critical roles in improving performance as well as a knowledge of the importance of balanced data for fine tuning deep learning models for seafloor classification in SAS imagery.

LG | Dec 31, 2020
Divergence Regulated Encoder Network for Joint Dimensionality Reduction and Classification

Joshua Peeples, Sarah Walker, Connor McCurley et al.

Feature representation is an important aspect of remote-sensing based image classification. While deep convolutional neural networks are able to effectively amalgamate information, large numbers of parameters often make learned features inscrutable and difficult to transfer to alternative models. In order to better represent statistical texture information for remote-sensing image classification, in this paper, we investigate performing joint dimensionality reduction and classification using a novel histogram neural network. Motivated by a popular dimensionality reduction approach, t-Distributed Stochastic Neighbor Embedding (t-SNE), our proposed method incorporates a classification loss computed on samples in a low-dimensional embedding space. We compare the learned sample embeddings against coordinates found by t-SNE in terms of classification accuracy and qualitative assessment. We also explore use of various divergence measures in the t-SNE objective. The proposed method has several advantages such as readily embedding out-of-sample points and reducing feature dimensionality while retaining class discriminability. Our results show that the proposed approach maintains and/or improves classification performance and reveals characteristics of features produced by neural networks that may be helpful for other applications.

CV | Jul 30, 2020
Weakly Supervised Minirhizotron Image Segmentation with MIL-CAM

Guohao Yu, Alina Zare, Weihuang Xu et al.

We present a multiple instance learning class activation map (MIL-CAM) approach for pixel-level minirhizotron image segmentation given weak image-level labels. Minirhizotrons are used to image plant roots in situ. Minirhizotron imagery is often composed of soil containing a few long and thin root objects of small diameter. The roots prove to be challenging for existing semantic image segmentation methods to discriminate. In addition to learning from weak labels, our proposed MIL-CAM approach re-weights the root versus soil pixels during analysis for improved performance due to the heavy imbalance between soil and root pixels. The proposed approach outperforms other attention map and multiple instance learning methods for localization of root objects in minirhizotron imagery.

LG | Jul 2, 2020
Outlier Detection through Null Space Analysis of Neural Networks

Matthew Cook, Alina Zare, Paul Gader

Many machine learning classification systems lack competency awareness. Specifically, many systems lack the ability to identify when outliers (e.g., samples that are distinct from and not represented in the training data distribution) are being presented to the system. The ability to detect outliers is of practical significance since it can help the system behave in a reasonable way when encountering unexpected data. In prior work, outlier detection is commonly carried out in a processing pipeline that is distinct from the classification model. Thus, for a complete system that incorporates outlier detection and classification, two models must be trained, increasing the overall complexity of the approach. In this paper, we use the concept of the null space to integrate an outlier detection method directly into a neural network used for classification. Our method, called Null Space Analysis (NuSA) of neural networks, works by computing and controlling the magnitude of the null space projection as data is passed through a network. Using these projections, we can then calculate a score that can differentiate between normal and abnormal data. Results are shown that indicate networks trained with NuSA retain their classification performance while also being able to detect outliers at rates similar to commonly used outlier detection algorithms.
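The geometric idea behind a null space projection can be illustrated for a single linear layer. The sketch below is an assumption-laden toy, not the NuSA method: it takes a weight matrix `weights` (rows are output units), builds an orthonormal basis of its row space with Gram-Schmidt, and measures how much of an input lies outside that row space. That leftover component is mapped to zero by the layer, so the layer cannot "see" it. The NuSA score and training objective are not given in the abstract and are not reproduced.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_rows(weights, tol=1e-12):
    """Gram-Schmidt: orthonormal basis for the row space of `weights`."""
    basis = []
    for row in weights:
        r = list(row)
        for q in basis:
            c = dot(r, q)
            r = [ri - c * qi for ri, qi in zip(r, q)]
        norm = math.sqrt(dot(r, r))
        if norm > tol:  # skip rows that are linearly dependent on earlier ones
            basis.append([ri / norm for ri in r])
    return basis


def null_space_norm(weights, x):
    """Length of the component of x that the layer x -> weights @ x discards."""
    residual = list(x)
    for q in orthonormal_rows(weights):
        c = dot(residual, q)
        residual = [ri - c * qi for ri, qi in zip(residual, q)]
    return math.sqrt(dot(residual, residual))


# Hypothetical layer that only looks at the first two input coordinates.
weights = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
print(null_space_norm(weights, [1.0, 2.0, 0.0]))  # input lies in the row space
print(null_space_norm(weights, [1.0, 2.0, 3.0]))  # third coordinate is invisible to the layer
```

A large value signals that much of the input falls in directions the layer ignores, which is the intuition for using such magnitudes as an outlier cue.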

CV | Mar 30, 2020
Super Resolution for Root Imaging

Jose F. Ruiz-Munoz, Jyothier K. Nimmagadda, Tyler G. Dowd et al.

High-resolution cameras have become very helpful for plant phenotyping by providing a mechanism for tasks such as target versus background discrimination, and the measurement and analysis of fine above-ground plant attributes. However, the acquisition of high-resolution (HR) imagery of plant roots is more challenging than above-ground data collection. Thus, an effective super-resolution (SR) algorithm is desired for overcoming resolution limitations of sensors, reducing storage space requirements, and boosting the performance of later analysis, such as automatic segmentation. We propose an SR framework for enhancing images of plant roots by using convolutional neural networks (CNNs). We compare three alternatives for training the SR model: i) training with non-plant-root images, ii) training with plant-root images, and iii) pretraining the model with non-plant-root images and fine-tuning with plant-root images. We demonstrate on a collection of publicly available datasets that the SR models outperform the basic bicubic interpolation even when trained with non-root datasets. Also, our segmentation experiments show that high performance on this task can be achieved independently of the SNR. Therefore, we conclude that the quality of the image enhancement depends on the application.

CV | Oct 20, 2019
Peanut Maturity Classification using Hyperspectral Imagery

Sheng Zou, Yu-Chien Tseng, Alina Zare et al.

Seed maturity in peanut (Arachis hypogaea L.) determines economic return to a producer because of its impact on seed weight (yield), and critically influences seed vigor and other quality characteristics. During seed development, the inner mesocarp layer of the pericarp (hull) transitions in color from white to black as the seed matures. The maturity assessment process involves the removal of the exocarp of the hull and visually categorizing the mesocarp color into varying color classes from immature (white, yellow, orange) to mature (brown, and black). This visual color classification is time consuming because the exocarp must be manually removed. In addition, the visual classification process involves human assessment of colors, which leads to large variability of color classification from observer to observer. A more objective, digital imaging approach to peanut maturity is needed, optimally without the requirement of removal of the hull's exocarp. This study examined the use of a hyperspectral imaging (HSI) process to determine pod maturity with intact pericarps. The HSI method leveraged spectral differences between mature and immature pods within a classification algorithm to identify the mature and immature pods. The results showed a high classification accuracy with consistency using samples from different years and cultivars. In addition, the proposed method was capable of estimating a continuous-valued, pixel-level maturity value for individual peanut pods, allowing for a valuable tool that can be utilized in seed quality research. This new method solves issues of labor intensity and subjective error that all current methods of peanut maturity determination have.

IV | Sep 7, 2019
Multi-Target Multiple Instance Learning for Hyperspectral Target Detection

Susan Meerdink, James Bocinsky, Alina Zare et al.

In remote sensing, it is often challenging to acquire or collect a large dataset that is accurately labeled. This difficulty is usually due to several issues, including but not limited to the study site's spatial area and accessibility, errors in the global positioning system (GPS), and mixed pixels caused by an image's spatial resolution. We propose an approach, with two variations, that estimates multiple target signatures from training samples with imprecise labels: Multi-Target Multiple Instance Adaptive Cosine Estimator (Multi-Target MI-ACE) and Multi-Target Multiple Instance Spectral Match Filter (Multi-Target MI-SMF). The proposed methods address the problems above by directly considering the multiple-instance, imprecisely labeled dataset. They learn a dictionary of target signatures that optimizes detection against a background using the Adaptive Cosine Estimator (ACE) and Spectral Match Filter (SMF). Experiments were conducted to test the proposed algorithms using a simulated hyperspectral dataset, the MUUFL Gulfport hyperspectral dataset collected over the University of Southern Mississippi-Gulfpark Campus, and the AVIRIS hyperspectral dataset collected over Santa Barbara County, California. Both simulated and real hyperspectral target detection experiments show the proposed algorithms are effective at learning target signatures and performing target detection.
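The Adaptive Cosine Estimator used as the detector here can be sketched as a cosine similarity computed after whitening by the background statistics. The version below makes two simplifying assumptions that are not from the abstract: the background covariance is diagonal (so whitening is a per-band division by the standard deviation), and the target signature is whitened without mean subtraction. Conventions vary between implementations, and all names are illustrative.

```python
import math


def ace_score(x, target, bg_mean, bg_var):
    """Cosine between a whitened sample and a whitened target signature.

    Assumptions (for brevity): diagonal background covariance `bg_var`,
    sample is mean-subtracted, target signature is not.
    Returns a value in [-1, 1]; values near 1 indicate a target-like sample.
    """
    xw = [(xi - m) / math.sqrt(v) for xi, m, v in zip(x, bg_mean, bg_var)]
    sw = [si / math.sqrt(v) for si, v in zip(target, bg_var)]
    norm_x = math.sqrt(sum(a * a for a in xw))
    norm_s = math.sqrt(sum(a * a for a in sw))
    if norm_x == 0.0 or norm_s == 0.0:
        return 0.0  # undefined direction; treat as no detection
    return sum(a * b for a, b in zip(xw, sw)) / (norm_x * norm_s)


# Hypothetical two-band example.
bg_mean = [0.0, 0.0]
bg_var = [1.0, 4.0]
target = [1.0, 2.0]

print(ace_score([2.0, 4.0], target, bg_mean, bg_var))   # sample is a scaled target
print(ace_score([1.0, -2.0], target, bg_mean, bg_var))  # orthogonal after whitening
```

Because the score depends only on direction in the whitened space, a sample that is a brighter or dimmer copy of the target signature still scores highly, which suits sub-pixel and mixed-pixel detection.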

LGApr 30, 2019
Investigation of Initialization Strategies for the Multiple Instance Adaptive Cosine Estimator

James Bocinsky, Connor McCurley, Daniel Shats et al.

Sensors which use electromagnetic induction (EMI) to excite a response in conducting bodies have long been investigated for subsurface explosive hazard detection. In particular, EMI sensors have been used to discriminate between different types of objects, and to detect objects with low metal content. One successful, previously investigated approach is the Multiple Instance Adaptive Cosine Estimator (MI-ACE). In this paper, a number of new initialization techniques for MI-ACE are proposed and evaluated using their respective performance and speed. The cross validated learned signatures, as well as learned background statistics, are used with Adaptive Cosine Estimator (ACE) to generate confidence maps, which are clustered into alarms. Alarms are scored against a ground truth and the initialization approaches are compared.

IVApr 1, 2019
Comparison of Possibilistic Fuzzy Local Information C-Means and Possibilistic K-Nearest Neighbors for Synthetic Aperture Sonar Image Segmentation

Joshua Peeples, Matthew Cook, Daniel Suen et al.

Synthetic aperture sonar (SAS) imagery can generate high resolution images of the seafloor. Thus, segmentation algorithms can be used to partition the images into different seafloor environments. In this paper, we compare two possibilistic segmentation approaches. Possibilistic approaches allow for the ability to detect novel or outlier environments as well as well known classes. The Possibilistic Fuzzy Local Information C-Means (PFLICM) algorithm has been previously applied to segment SAS imagery. Additionally, the Possibilistic K-Nearest Neighbors (PKNN) algorithm has been used in other domains such as landmine detection and hyperspectral imagery. In this paper, we compare the segmentation performance of a semi-supervised approach using PFLICM and a supervised method using Possibilistic K-NN. We include final segmentation results on multiple SAS images and a quantitative assessment of each algorithm.

LGMar 22, 2019
Comparison of Hand-held WEMI Target Detection Algorithms

Connor H. McCurley, James Bocinsky, Alina Zare

Wide-band electromagnetic induction (WEMI) sensors have been used for a number of years in subsurface detection of explosive hazards. While WEMI sensors have proven effective at localizing objects exhibiting large magnetic responses, detecting objects lacking or containing very low amounts of conductive materials can be challenging. In this paper, we compare a number of target detection algorithms in the literature in terms of detection performance. In the comparison, methods are tested on two real-world data sets: one containing relatively low amounts of ground noise pollution, and the other demonstrating highly-magnetic soil interference. Results are quantitatively evaluated through receiver operating characteristic (ROC) curves and are used to highlight the strengths and weaknesses of the compared approaches in hand-held explosive hazard detection.

CVMar 22, 2019
Overcoming Small Minirhizotron Datasets Using Transfer Learning

Weihuang Xu, Guohao Yu, Alina Zare et al.

Minirhizotron technology is widely used for studying the development of roots. Such systems collect visible-wavelength color imagery of plant roots in-situ by scanning an imaging system within a clear tube driven into the soil. Automated analysis of root systems could facilitate new scientific discoveries that would be critical to address the world's pressing food, resource, and climate issues. A key component of automated analysis of plant roots from imagery is the automated pixel-level segmentation of roots from their surrounding soil. Supervised learning techniques appear to be an appropriate tool for the challenge due to varying local soil and root conditions; however, the lack of sufficient annotated training data is a major limitation due to the error-prone and time-consuming manual labeling process. In this paper, we investigate the use of deep neural networks based on the U-net architecture for automated, precise pixel-wise root segmentation in minirhizotron imagery. We compiled two minirhizotron image datasets to accomplish this study: one with 17,550 peanut root images and another with 28 switchgrass root images. Both datasets were paired with manually labeled ground truth masks. We trained three neural networks with different architectures on the larger peanut root dataset to explore the effect of the neural network depth on segmentation performance. To tackle the more limited switchgrass root dataset, we showed that models initialized with features pre-trained on the peanut dataset and then fine-tuned on the switchgrass dataset can improve segmentation performance significantly. We obtained 99% segmentation accuracy in switchgrass imagery using only 21 training images. We also observed that features pre-trained on a closely related but moderately sized dataset like our peanut dataset are more effective than features pre-trained on the large but unrelated ImageNet dataset.

CVMar 18, 2019
Complex Scene Classification of PolSAR Imagery based on a Self-paced Learning Approach

Wenshuai Chen, Shuiping Gou, Xinlin Wang et al.

Existing polarimetric synthetic aperture radar (PolSAR) image classification methods cannot achieve satisfactory performance on complex scenes characterized by several types of land cover with significant levels of noise or similar scattering properties across land cover types. Hence, we propose a supervised classification method aimed at constructing a classifier based on self-paced learning (SPL). SPL has been demonstrated to be effective at dealing with complex data while providing classifier. In this paper, a novel Support Vector Machine (SVM) algorithm based on SPL with neighborhood constraints (SVM_SPLNC) is proposed. The proposed method leverages the easiest samples first to obtain an initial parameter vector. Then, more complex samples are gradually incorporated to update the parameter vector iteratively. Moreover, neighborhood constraints are introduced during the training process to further improve performance. Experimental results on three real PolSAR images show that the proposed method performs well on complex scenes.
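The easy-to-hard sample schedule described in this abstract can be illustrated with the commonly used hard-threshold form of self-paced learning, in which a sample is included once its loss falls below a pace parameter that grows over iterations. The sketch below is only an illustration under stated assumptions: the function names, the pace values, and the fixed per-sample losses are hypothetical, the losses would normally be recomputed after each classifier update, and the neighborhood constraints of SVM_SPLNC are not modeled.

```python
def self_paced_weights(losses, lam):
    """Hard self-paced weights: 1 for 'easy' samples (loss below lam), else 0."""
    return [1 if loss < lam else 0 for loss in losses]


def pace_schedule(losses, lam0, growth, steps):
    """Return the selection mask at each step as the pace parameter grows.

    Assumption: losses are held fixed here purely for illustration; a real
    training loop would refit the classifier on the selected samples and
    recompute the losses before the next step.
    """
    masks = []
    lam = lam0
    for _ in range(steps):
        masks.append(self_paced_weights(losses, lam))
        lam *= growth  # admit harder samples at the next step
    return masks


# Hypothetical per-sample losses, from easy (small) to hard (large).
losses = [0.1, 0.5, 0.9, 2.0]
masks = pace_schedule(losses, lam0=0.3, growth=2.0, steps=3)
for step, mask in enumerate(masks):
    print(step, mask)
```

With these toy values the selected set grows by one sample per step, which mirrors the "easiest samples first" behavior the abstract describes.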

CVMar 7, 2019
Root Identification in Minirhizotron Imagery with Multiple Instance Learning

Guohao Yu, Alina Zare, Hudanyun Sheng et al.

In this paper, multiple instance learning (MIL) algorithms to automatically perform root detection and segmentation in minirhizotron imagery using only image-level labels are proposed. Root and soil characteristics vary from location to location; thus, supervised machine learning approaches that are trained with local data provide the best ability to identify and segment roots in minirhizotron imagery. However, labeling roots for training data (or otherwise) is an extremely tedious and time-consuming task. This paper aims to address this problem by labeling data at the image level (rather than the individual root or root pixel level) and training algorithms to perform individual root pixel level segmentation using MIL strategies. Three MIL methods (multiple instance adaptive cosine coherence estimator, multiple instance support vector machine, multiple instance learning with randomized trees) were applied to root detection and compared to non-MIL approaches. The results show that MIL methods improve root segmentation in challenging minirhizotron imagery and reduce the labeling burden. In our results, multiple instance support vector machine outperformed other methods. The multiple instance adaptive cosine coherence estimator algorithm was a close second with an added advantage that it learned an interpretable root signature which identified the traits used to distinguish roots from soil and did not require parameter selection.

CVMay 2, 2018
Multi-Resolution Multi-Modal Sensor Fusion For Remote Sensing Data With Label Uncertainty

Xiaoxiao Du, Alina Zare

In remote sensing, each sensor can provide complementary or reinforcing information. It is valuable to fuse outputs from multiple sensors to boost overall performance. Previous supervised fusion methods often require accurate labels for each pixel in the training data. However, in many remote sensing applications, pixel-level labels are difficult or infeasible to obtain. In addition, outputs from multiple sensors often have different resolution or modalities. For example, rasterized hyperspectral imagery presents data in a pixel grid while airborne Light Detection and Ranging (LiDAR) generates dense three-dimensional (3D) point clouds. It is often difficult to directly fuse such multi-modal, multi-resolution data. To address these challenges, we present a novel Multiple Instance Multi-Resolution Fusion (MIMRF) framework that can fuse multi-resolution and multi-modal sensor outputs while learning from automatically-generated, imprecisely-labeled data. Experiments were conducted on the MUUFL Gulfport hyperspectral and LiDAR data set and a remotely-sensed soybean and weed data set. Results show improved, consistent performance on scene understanding and agricultural applications when compared to traditional fusion methods.

CVMar 11, 2018
Multiple Instance Choquet Integral Classifier Fusion and Regression for Remote Sensing Applications

Xiaoxiao Du, Alina Zare

In classifier (or regression) fusion, the aim is to combine the outputs of several algorithms to boost overall performance. Standard supervised fusion algorithms often require accurate and precise training labels. However, accurate labels may be difficult to obtain in many remote sensing applications. This paper proposes novel classification and regression fusion models that can be trained given ambiguously and imprecisely labeled training data in which training labels are associated with sets of data points (i.e., "bags") instead of individual data points (i.e., "instances") following a multiple instance learning framework. Experiments were conducted based on the proposed algorithms on both synthetic data and applications such as target detection and crop yield prediction given remote sensing data. The proposed algorithms show effective classification and regression performance.

CVOct 31, 2017
Multiple Instance Hybrid Estimator for Hyperspectral Target Characterization and Sub-pixel Target Detection

Changzhe Jiao, Chao Chen, Ronald G. McGarvey et al.

The Multiple Instance Hybrid Estimator for discriminative target characterization from imprecisely labeled hyperspectral data is presented. In many hyperspectral target detection problems, acquiring accurately labeled training data is difficult. Furthermore, each pixel containing target is likely to be a mixture of both target and non-target signatures (i.e., sub-pixel targets), making extracting a pure prototype signature for the target class from the data extremely difficult. The proposed approach addresses these problems by introducing a data mixing model and optimizing the response of the hybrid sub-pixel detector within a multiple instance learning framework. The proposed approach iterates between estimating a set of discriminative target and non-target signatures and solving a sparse unmixing problem. After learning target signatures, a signature based detector can then be applied on test data. Both simulated and real hyperspectral target detection experiments show the proposed algorithm is effective at learning discriminative target signatures and achieves superior performance over state-of-the-art comparison algorithms.

CVSep 28, 2017
Possibilistic Fuzzy Local Information C-Means for Sonar Image Segmentation

Alina Zare, Nicholas Young, Daniel Suen et al.

Side-look synthetic aperture sonar (SAS) can produce very high quality images of the sea-floor. When viewing this imagery, a human observer can often easily identify various sea-floor textures such as sand ripple, hard-packed sand, sea grass and rock. In this paper, we present the Possibilistic Fuzzy Local Information C-Means (PFLICM) approach to segment SAS imagery into sea-floor regions that exhibit these various natural textures. The proposed PFLICM method incorporates fuzzy and possibilistic clustering methods and leverages (local) spatial information to perform soft segmentation. Results are shown on several SAS scenes and compared to alternative segmentation approaches.

MLJun 11, 2017
Multiple Instance Dictionary Learning for Beat-to-Beat Heart Rate Monitoring from Ballistocardiograms

Changzhe Jiao, Bo-Yu Su, Princess Lyons et al.

A multiple instance dictionary learning approach, Dictionary Learning using Functions of Multiple Instances (DL-FUMI), is used to perform beat-to-beat heart rate estimation and to characterize heartbeat signatures from ballistocardiogram (BCG) signals collected with a hydraulic bed sensor. DL-FUMI estimates a "heartbeat concept" that represents an individual's personal ballistocardiogram heartbeat pattern. DL-FUMI formulates heartbeat detection and heartbeat characterization as a multiple instance learning problem to address the uncertainty inherent in aligning BCG signals with ground truth during training. Experimental results show that the estimated heartbeat concept found by DL-FUMI is an effective heartbeat prototype and achieves superior performance over comparison algorithms.

CVMar 17, 2017
Hyperspectral Unmixing with Endmember Variability using Semi-supervised Partial Membership Latent Dirichlet Allocation

Sheng Zou, Hao Sun, Alina Zare

A semi-supervised Partial Membership Latent Dirichlet Allocation approach is developed for hyperspectral unmixing and endmember estimation while accounting for spectral variability and spatial information. Partial Membership Latent Dirichlet Allocation is an effective approach for spectral unmixing while representing spectral variability and leveraging spatial information. In this work, we extend Partial Membership Latent Dirichlet Allocation to incorporate any available (imprecise) label information to help guide unmixing. Experimental results on two hyperspectral datasets show that the proposed semi-supervised PM-LDA can yield improved hyperspectral unmixing and endmember estimation results.

CVJan 9, 2017
Multiple Instance Hybrid Estimator for Learning Target Signatures

Changzhe Jiao, Alina Zare

Signature-based detectors for hyperspectral target detection rely on knowing the specific target signature in advance. However, target signatures are often difficult or impossible to obtain. Furthermore, common methods for obtaining target signatures, such as from laboratory measurements or manual selection from an image scene, usually do not capture the discriminative features of the target class. In this paper, an approach for estimating a discriminative target signature from imprecise labels is presented. The proposed approach maximizes the response of the hybrid sub-pixel detector within a multiple instance learning framework and estimates a set of discriminative target signatures. After learning target signatures, any signature-based detector can then be applied on test data. Both simulated and real hyperspectral target detection experiments are shown to illustrate the effectiveness of the method.

CVJan 6, 2017
Map-guided Hyperspectral Image Superpixel Segmentation Using Proportion Maps

Hao Sun, Alina Zare

A map-guided superpixel segmentation method for hyperspectral imagery is developed and introduced. The proposed approach develops a hyperspectral-appropriate version of the SLIC superpixel segmentation algorithm, leverages map information to guide segmentation, and incorporates the semi-supervised Partial Membership Latent Dirichlet Allocation (sPM-LDA) to obtain a final superpixel segmentation. The proposed method is applied to two real hyperspectral data sets and quantitative cluster validity metrics indicate that the proposed approach outperforms existing hyperspectral superpixel segmentation methods.

CVDec 28, 2016
Partial Membership Latent Dirichlet Allocation

Chao Chen, Alina Zare, Huy Trinh et al.

Topic models (e.g., pLSA, LDA, sLDA) have been widely used for segmenting imagery. However, these models are confined to crisp segmentation, forcing a visual word (i.e., an image patch) to belong to one and only one topic. Yet, there are many images in which some regions cannot be assigned a crisp categorical label (e.g., transition regions between a foggy sky and the ground or between sand and water at a beach). In these cases, a visual word is best represented with partial memberships across multiple topics. To address this, we present a partial membership latent Dirichlet allocation (PM-LDA) model and an associated parameter estimation algorithm. This model can be useful for imagery where a visual word may be a mixture of multiple topics. Experimental results on visual and sonar imagery show that PM-LDA can produce both crisp and soft semantic image segmentations; a capability previous topic modeling methods do not have.

CVSep 12, 2016
Hyperspectral Unmixing with Endmember Variability using Partial Membership Latent Dirichlet Allocation

Sheng Zou, Alina Zare

The application of Partial Membership Latent Dirichlet Allocation (PM-LDA) for hyperspectral endmember estimation and spectral unmixing is presented. PM-LDA provides a model for hyperspectral image analysis that accounts for spectral variability and incorporates spatial information through the use of superpixel-based 'documents.' In our application of PM-LDA, we employ the Normal Compositional Model in which endmembers are represented as Normal distributions to account for spectral variability and proportion vectors are modeled as random variables governed by a Dirichlet distribution. The use of the Dirichlet distribution enforces positivity and sum-to-one constraints on the proportion values. Algorithm results on real hyperspectral data indicate that PM-LDA produces endmember distributions that represent the ground truth classes and their associated variability.
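The statement that a Dirichlet distribution enforces positivity and sum-to-one constraints can be checked with a small sketch. It draws one proportion vector using the standard construction of normalizing independent Gamma draws. The concentration parameters and the helper name `sample_dirichlet` are illustrative assumptions and are not taken from the paper.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one vector from Dirichlet(alpha) by normalizing Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical concentration parameters for three endmembers.
proportions = sample_dirichlet([2.0, 3.0, 5.0], rng)

print(proportions)
print(sum(proportions))  # equals 1 up to floating-point rounding
```

Every entry is strictly positive and the entries sum to one, so any draw is a valid proportion vector for an unmixing model of this kind.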

CVJun 20, 2016
Multiple Instance Hyperspectral Target Characterization

Alina Zare, Changzhe Jiao, Taylor Glenn

In this paper, two methods for multiple instance target characterization, MI-SMF and MI-ACE, are presented. MI-SMF and MI-ACE estimate a discriminative target signature from imprecisely-labeled and mixed training data. In many applications, such as sub-pixel target detection in remotely-sensed hyperspectral imagery, accurate pixel-level labels on training data are often unavailable and infeasible to obtain. Furthermore, since sub-pixel targets are smaller in size than the resolution of a single pixel, training data is comprised only of mixed data points (in which target training points are mixtures of responses from both target and non-target classes). Results show improved, consistent performance over existing multiple instance concept learning methods on several hyperspectral sub-pixel target detection problems.

CVMay 16, 2016
Heart Beat Characterization from Ballistocardiogram Signals using Extended Functions of Multiple Instances

Changzhe Jiao, Princess Lyons, Alina Zare et al.

A multiple instance learning (MIL) method, extended Function of Multiple Instances (eFUMI), is applied to ballistocardiogram (BCG) signals produced by a hydraulic bed sensor. The goal of this approach is to learn a personalized heartbeat "concept" for an individual. This heartbeat concept is a prototype (or "signature") that characterizes the heartbeat pattern for an individual in ballistocardiogram data. The eFUMI method models the problem of learning a heartbeat concept from a BCG signal as a MIL problem. This approach elegantly addresses the uncertainty inherent in a BCG signal (e.g., misalignment between training data and ground truth, mis-collection of heartbeat by some transducers, etc.). Given a BCG training signal coupled with a ground truth signal (e.g., a pulse finger sensor), training "bags" labeled with only binary labels denoting whether or not a training bag contains a heartbeat signal can be generated. Then, using these bags, eFUMI learns a personalized concept of heartbeat for a subject as well as several non-heartbeat background concepts. After learning the heartbeat concept, heartbeat detection and heart rate estimation can be applied to test data. Experimental results show that the estimated heartbeat concept found by eFUMI is a more representative and more discriminative prototype of the heartbeat signals than those found by comparison MIL methods in the literature.

CVMar 21, 2016
Instance Influence Estimation for Hyperspectral Target Signature Characterization using Extended Functions of Multiple Instances

Sheng Zou, Alina Zare

The Extended Functions of Multiple Instances (eFUMI) algorithm is a generalization of Multiple Instance Learning (MIL). In eFUMI, only bag level (i.e. set level) labels are needed to estimate target signatures from mixed data. The training bags in eFUMI are labeled positive if any data point in a bag contains or represents any proportion of the target signature and are labeled as a negative bag if all data points in the bag do not represent any target. From these imprecise labels, eFUMI has been shown to be effective at estimating target signatures in hyperspectral subpixel target detection problems. One motivating scenario for the use of eFUMI is where an analyst circles objects/regions of interest in a hyperspectral scene such that the target signatures of these objects can be estimated and be used to determine whether other instances of the object appear elsewhere in the image collection. The regions highlighted by the analyst serve as the imprecise labels for eFUMI. Often, an analyst may want to iteratively refine their imprecise labels. In this paper, we present an approach for estimating the influence on the estimated target signature if the label for a particular input data point is modified. This "instance influence estimation" guides an analyst to focus on (re-)labeling the data points that provide the largest change in the resulting estimated target signature and, thus, reduce the amount of time an analyst needs to spend refining the labels for a hyperspectral scene. Results are shown on real hyperspectral sub-pixel target detection data sets.
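The bag-labeling rule stated in this abstract (a bag is positive if any data point contains any proportion of target, and negative only if no data point does) can be written as a short sketch. The per-point target proportions below are made-up values used only to show the rule. In practice those proportions are unknown, which is why only the bag-level label is available to the learner.

```python
def bag_label(target_proportions):
    """Return 1 (positive bag) if any point has nonzero target proportion, else 0."""
    return 1 if any(p > 0 for p in target_proportions) else 0


# Hypothetical bags: each list holds the (unobserved) target proportion per data point.
bags = {
    "circled_region_a": [0.0, 0.0, 0.15, 0.0],  # one sub-pixel target -> positive
    "circled_region_b": [0.6, 0.3],             # mostly target -> positive
    "background_patch": [0.0, 0.0, 0.0],        # no target anywhere -> negative
}

labels = {name: bag_label(props) for name, props in bags.items()}
print(labels)
```

Note that a positive label says nothing about which points in the bag contain target, which is the imprecision that instance influence estimation helps an analyst resolve.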

CVMar 19, 2016
Adaptive coherence estimator (ACE) for explosive hazard detection using wideband electromagnetic induction (WEMI)

Brendan Alvey, Alina Zare, Matthew Cook et al.

The adaptive coherence estimator (ACE) estimates the squared cosine of the angle between a known target vector and a sample vector in a whitened coordinate space. The space is whitened according to an estimate of the background statistics, which directly affects the performance of the statistic as a target detector. In this paper, the ACE detection statistic is used to detect buried explosive hazards with data from a Wideband Electromagnetic Induction (WEMI) sensor. Target signatures are based on a dictionary defined using a Discrete Spectrum of Relaxation Frequencies (DSRF) model. Results are summarized as a receiver operating characteristic (ROC) curve and compared to other leading methods.
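The definition above (squared cosine between target and sample in a space whitened by the background statistics) can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the paper's implementation: the background mean `mu` and covariance `cov` are assumed to be given, the mean is subtracted from the sample only (conventions differ on whether it is also subtracted from the target), and the small `solve` helper is a hypothetical stand-in for a linear algebra library.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ace(x, s, mu, cov):
    """Squared cosine between target s and sample x after background whitening."""
    d = [xi - mi for xi, mi in zip(x, mu)]  # remove the background mean
    cinv_s = solve(cov, s)                  # cov^{-1} s
    cinv_d = solve(cov, d)                  # cov^{-1} (x - mu)
    return dot(s, cinv_d) ** 2 / (dot(s, cinv_s) * dot(d, cinv_d))


# Toy two-dimensional example with a diagonal background covariance.
mu = [0.0, 0.0]
cov = [[4.0, 0.0], [0.0, 1.0]]
target = [1.0, 0.0]
print(ace([2.0, 0.0], target, mu, cov))  # sample aligned with the target
print(ace([2.0, 1.0], target, mu, cov))  # sample at 45 degrees after whitening
```

Because the statistic is a squared cosine, it lies between 0 and 1 and is invariant to the scale of the sample, so only the direction of the sample in the whitened space matters.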

CVMar 19, 2016
Buried object detection using handheld WEMI with task-driven extended functions of multiple instances

Matthew Cook, Alina Zare, Dominic Ho

Many effective supervised discriminative dictionary learning methods have been developed in the literature. However, when training these algorithms, precise ground-truth of the training data is required to provide very accurate point-wise labels. Yet, in many applications, accurate labels are not always feasible. This is especially true in the case of buried object detection, in which the sizes of the objects are not consistent. In this paper, a new multiple instance dictionary learning algorithm for detecting buried objects using a handheld WEMI sensor is detailed. The new algorithm, Task Driven Extended Functions of Multiple Instances, can handle data that lacks very precise point-wise labels and still learn a highly discriminative dictionary. Results are presented and discussed on measured WEMI data.

CVNov 9, 2015
Multiple Instance Dictionary Learning using Functions of Multiple Instances

Changzhe Jiao, Alina Zare

A multiple instance dictionary learning method using functions of multiple instances (DL-FUMI) is proposed to address target detection and two-class classification problems with inaccurate training labels. Given inaccurate training labels, DL-FUMI learns a set of target dictionary atoms that describe the most distinctive and representative features of the true positive class as well as a set of nontarget dictionary atoms that account for the shared information found in both the positive and negative instances. Experimental results show that the estimated target dictionary atoms found by DL-FUMI are more representative prototypes and identify better discriminative features of the true positive class than existing methods in the literature. DL-FUMI is shown to have significantly better performance on several target detection and classification problems as compared to other multiple instance learning (MIL) dictionary learning algorithms on a variety of MIL problems.

MLNov 9, 2015
Partial Membership Latent Dirichlet Allocation

Chao Chen, Alina Zare, J. Tory Cobb

Topic models (e.g., pLSA, LDA, SLDA) have been widely used for segmenting imagery. These models are confined to crisp segmentation. Yet, there are many images in which some regions cannot be assigned a crisp label (e.g., transition regions between a foggy sky and the ground or between sand and water at a beach). In these cases, a visual word is best represented with partial memberships across multiple topics. To address this, we present a partial membership latent Dirichlet allocation (PM-LDA) model and associated parameter estimation algorithms. Experimental results on two natural image datasets and one SONAR image dataset show that PM-LDA can produce both crisp and soft semantic image segmentations; a capability existing methods do not have.

CVOct 30, 2015
Estimating Target Signatures with Diverse Density

Taylor Glenn, Alina Zare

Hyperspectral target detection algorithms rely on knowing the desired target signature in advance. However, obtaining an effective target signature can be difficult; signatures obtained from laboratory measurements or hand-spectrometers in the field may not transfer to airborne imagery effectively. One approach to dealing with this difficulty is to learn an effective target signature from training data. An approach for learning target signatures from training data is presented. The proposed approach addresses uncertainty and imprecision in groundtruth in the training data using a multiple instance learning, diverse density (DD) based objective function. After learning the target signature given data with uncertain and imprecise groundtruth, target detection can be applied on test data. Results are shown on simulated and real data.