Hanan Samet

CL
h-index: 6
5 papers
4,324 citations
Novelty: 46%
AI Score: 34

5 Papers

CL · Jun 3, 2025
DistRAG: Towards Distance-Based Spatial Reasoning in LLMs

Nicole R Schneider, Nandini Ramachandran, Kent O'Sullivan et al.

Many real-world tasks where Large Language Models (LLMs) can be used require spatial reasoning, such as Point of Interest (POI) recommendation and itinerary planning. However, on their own LLMs lack reliable spatial reasoning capabilities, especially about distances. To address this problem, we develop a novel approach, DistRAG, that enables an LLM to retrieve relevant spatial information not explicitly learned during training. Our method encodes the geodesic distances between cities and towns in a graph and retrieves a context subgraph relevant to the question. Using this technique, our method enables an LLM to answer distance-based reasoning questions that it otherwise cannot answer. Given the vast array of possible places an LLM could be asked about, DistRAG offers a flexible first step towards providing a rudimentary "world model" to complement the linguistic knowledge held in LLMs.
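
The idea of a distance graph plus question-specific retrieval can be sketched roughly as follows. This is a minimal illustration, not the DistRAG implementation: the haversine formula stands in for a true geodesic computation, the coordinates are approximate, and the name-matching retrieval and the helper names (`build_distance_graph`, `retrieve_context`) are hypothetical.

```python
import math

# Approximate (latitude, longitude) in degrees; illustrative values only.
CITIES = {
    "Paris": (48.8566, 2.3522),
    "London": (51.5074, -0.1278),
    "Berlin": (52.5200, 13.4050),
    "Madrid": (40.4168, -3.7038),
}

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def build_distance_graph(cities):
    """Complete graph: one distance-weighted edge per unordered city pair."""
    names = sorted(cities)
    return {
        (u, v): haversine_km(cities[u], cities[v])
        for i, u in enumerate(names)
        for v in names[i + 1:]
    }


def retrieve_context(graph, question):
    """Return the subgraph over cities named in the question, as text.

    Plain substring matching is an assumed stand-in for the retrieval step.
    """
    q = question.lower()
    mentioned = {name for pair in graph for name in pair if name.lower() in q}
    edges = {pair: d for pair, d in graph.items() if set(pair) <= mentioned}
    return [f"{u} is about {d:.0f} km from {v}."
            for (u, v), d in sorted(edges.items())]


graph = build_distance_graph(CITIES)
context = retrieve_context(graph, "Is Paris closer to London or to Berlin?")
print("\n".join(context))  # lines that could be prepended to the LLM prompt
```

Only the edges among the cities named in the question are returned, so the retrieved context stays small even when the full graph covers many places.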

HC · May 4, 2020
Equal Area Breaks: A Classification Scheme for Data to Obtain an Evenly-colored Choropleth Map

Anis Abboud, John Kastner, Hanan Samet

An efficient algorithm for computing the choropleth map classification scheme known as equal area breaks or geographical quantiles is introduced. An equal area breaks classification aims to obtain a coloring for the map such that the area associated with each of the colors is approximately equal. It is an alternative to the quantiles approach, which assigns an equal number of regions to each color and can therefore leave the mapped area dominated by one or a few colors while the other colors are barely discernible. This happens when some regions are much larger than others (e.g., compare Switzerland with Russia). A number of algorithms of varying computational complexity are presented to achieve an equal area assignment to regions. They include a pair of greedy algorithms, as well as an optimal algorithm that is based on dynamic programming. The classification obtained from the optimal equal area algorithm is compared with the quantiles and Jenks natural breaks algorithms and found, in a user study, to be visually superior. Finally, a modified approach is presented which enables users to trade off the conflicting goals of equal area for each color and an equal number of regions for each color.
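
A dynamic-programming formulation of equal area breaks might look like the sketch below. It assumes the regions are already sorted by property value, that each class is a contiguous run in that order, and that the objective is the sum of squared deviations of each class's area from the ideal `total / k`; the abstract does not state the objective, so that choice and the function name `equal_area_breaks` are assumptions.

```python
from itertools import accumulate


def equal_area_breaks(areas, k):
    """Split value-sorted regions into k contiguous classes of similar area.

    areas: region areas, ordered by the mapped property value.
    Returns the exclusive end index of each class, e.g. [3, 4] means
    classes areas[0:3] and areas[3:4].
    """
    n = len(areas)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of regions")

    prefix = [0.0] + list(accumulate(areas))  # prefix[i] = area of first i regions
    target = prefix[-1] / k                   # ideal area per class
    inf = float("inf")

    # cost[c][i]: best cost of splitting the first i regions into c classes.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0

    for c in range(1, k + 1):
        for i in range(c, n + 1):
            for j in range(c - 1, i):         # last class is areas[j:i]
                if cost[c - 1][j] == inf:
                    continue
                dev = (prefix[i] - prefix[j]) - target
                cand = cost[c - 1][j] + dev * dev
                if cand < cost[c][i]:
                    cost[c][i] = cand
                    back[c][i] = j

    # Walk the back-pointers to recover the class boundaries.
    ends, i = [], n
    for c in range(k, 0, -1):
        ends.append(i)
        i = back[c][i]
    return ends[::-1]


# Three small regions and one large one, two colors.
print(equal_area_breaks([1, 1, 1, 3], 2))
```

For the areas `[1, 1, 1, 3]`, a quantiles split of two regions per color would cover areas 2 and 4, whereas this sketch groups the three small regions together so that each color covers an area of 3. The triple loop runs in O(k·n²) time.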

IR · Feb 28, 2020
NewsStand CoronaViz: A Map Query Interface for Spatio-Temporal and Spatio-Textual Monitoring of Disease Spread

John Kastner, Hanan Samet, Hong Wei

With the rapid continuing spread of COVID-19, it is important to track the progress of the virus over time, both to anticipate its emergence and spread in new regions and to detect declines in its presence that lead to or justify "reopening" decisions. Many applications and web sites monitor officially released case numbers, which are likely to be the most accurate means of tracking the progress of the virus; however, they will not necessarily paint a complete picture. To begin filling any gaps in official reports, we have developed the NewsStand CoronaViz web application (https://coronaviz.umiacs.io), which runs on desktops and mobile devices. It allows users to explore the geographic spread of discussions about the virus through analysis of keyword prevalence in geotagged news articles and tweets, in relation to the real spread of the virus as measured by confirmed case numbers reported by the appropriate authorities. Via a map query interface, NewsStand CoronaViz users have access to dynamic variants of the disease-related variables corresponding to the numbers of confirmed cases, active cases, deaths, and recoveries (where they are provided). Users can step forward and backward in time using a variety of temporal window sizes (day, week, month, or combinations thereof). They can also vary the spatial window, either by direct manipulation actions (e.g., pan, zoom, and hover) or textually (e.g., by the name of the containing country, state or province, or county, as well as textually-specified spatially-adjacent combinations thereof). Finally, they can query by the amount of spatio-temporally-varying news and tweet volume involving COVID-19.

LG · Jun 7, 2017
Training Quantized Nets: A Deeper Understanding

Hao Li, Soham De, Zheng Xu et al.

Currently, deep neural networks are deployed on low-power portable devices by first training a full-precision model using powerful hardware, and then deriving a corresponding low-precision model for efficient inference on such systems. However, training models directly with coarsely quantized weights is a key step towards learning on embedded platforms that have limited computing resources, memory capacity, and power consumption. Numerous recent publications have studied methods for training quantized networks, but these studies have mostly been empirical. In this work, we investigate training methods for quantized neural networks from a theoretical viewpoint. We first explore accuracy guarantees for training methods under convexity assumptions. We then look at the behavior of these algorithms for non-convex problems, and show that training algorithms that exploit high-precision representations have an important greedy search phase that purely quantized training methods lack, which explains the difficulty of training using low-precision arithmetic.
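
One way to see why a high-precision representation can matter is the toy sketch below. It is not the paper's analysis: the one-dimensional quadratic loss, the integer weight grid, the learning rate, and the use of deterministic rounding for the purely quantized variant are all assumptions chosen to keep the example small.

```python
import math

TARGET = 0.3   # minimiser of the toy loss f(w) = (w - TARGET)^2
LR = 0.05      # learning rate (assumed)
STEPS = 200


def grad(w):
    """Gradient of the toy loss at w."""
    return 2.0 * (w - TARGET)


def quantize(w):
    """Round to the nearest integer grid point (half rounds up)."""
    return math.floor(w + 0.5)


def train_purely_quantized(w0):
    """Only the quantized weight is stored; every update is rounded at once."""
    w_q = quantize(w0)
    for _ in range(STEPS):
        w_q = quantize(w_q - LR * grad(w_q))
    return w_q


def train_with_fp_accumulator(w0):
    """A full-precision copy accumulates updates; the gradient is taken
    at the quantized weight."""
    w_fp = float(w0)
    for _ in range(STEPS):
        w_fp -= LR * grad(quantize(w_fp))
    return quantize(w_fp)


pure = train_purely_quantized(3.0)
accumulated = train_with_fp_accumulator(3.0)
print(pure, accumulated)
```

In this setup each update is smaller than half a grid step, so the purely quantized weight rounds back to its starting value on every step, while the full-precision accumulator lets small updates add up until the quantized weight moves next to the minimiser.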

CV · Aug 31, 2016
Pruning Filters for Efficient ConvNets

Hao Li, Asim Kadav, Igor Durdanovic et al.

The success of CNNs in various applications is accompanied by a significant increase in the computation and parameter storage costs. Recent efforts toward reducing these overheads involve pruning and compressing the weights of various layers without hurting original accuracy. However, magnitude-based pruning of weights reduces a significant number of parameters from the fully connected layers and may not adequately reduce the computation costs in the convolutional layers due to irregular sparsity in the pruned networks. We present an acceleration method for CNNs, where we prune filters from CNNs that are identified as having a small effect on the output accuracy. By removing whole filters in the network together with their connecting feature maps, the computation costs are reduced significantly. In contrast to pruning weights, this approach does not result in sparse connectivity patterns. Hence, it does not need the support of sparse convolution libraries and can work with existing efficient BLAS libraries for dense matrix multiplications. We show that even simple filter pruning techniques can reduce inference costs for VGG-16 by up to 34% and ResNet-110 by up to 38% on CIFAR10 while regaining close to the original accuracy by retraining the networks.
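
Removing whole filters together with their connecting feature maps can be sketched as follows. The abstract does not say how low-impact filters are identified, so ranking by the L1 norm of each filter's weights is an assumption here, as are the nested-list weight layout `[out_channel][in_channel][row][col]` and the function names.

```python
def l1_norm(filt):
    """Sum of absolute weights of one filter, laid out as [in_channel][row][col]."""
    return sum(abs(w) for channel in filt for row in channel for w in row)


def prune_filters(layer, next_layer, num_prune):
    """Drop the num_prune filters of `layer` with the smallest L1 norm.

    Each removed filter also removes one output feature map, so the
    matching input channel is deleted from every filter in `next_layer`.
    Returns (pruned_layer, pruned_next_layer, kept_indices).
    """
    norms = [l1_norm(f) for f in layer]
    by_norm = sorted(range(len(layer)), key=lambda i: norms[i])
    dropped = set(by_norm[:num_prune])
    keep = [i for i in range(len(layer)) if i not in dropped]

    pruned = [layer[i] for i in keep]
    pruned_next = [[filt[i] for i in keep] for filt in next_layer]
    return pruned, pruned_next, keep


# Three 1x1 filters over one input channel, followed by a layer with
# two filters that each read the three resulting feature maps.
conv1 = [[[[0.5]]], [[[-0.1]]], [[[2.0]]]]
conv2 = [
    [[[1.0]], [[2.0]], [[3.0]]],
    [[[4.0]], [[5.0]], [[6.0]]],
]

new_conv1, new_conv2, kept = prune_filters(conv1, conv2, num_prune=1)
print(kept)
```

Both layers stay dense after pruning, just with fewer channels, which is why no sparse convolution support is needed. In practice the pruned network would then be retrained to recover accuracy.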