Lyndon Kennedy

h-index: 15

Papers: 5

Citations: 1,802

Novelty: 43%

AI Score: 19

Ranked #186,723 of 194,257 authors (top 96%); #58,006 in CV (top 98%)

5 Papers

5.7 · IR · Jul 15

Personalizing Incremental Video Search with Hybrid Text and ID Embeddings

Vivek Kanojiya, Vishalaksh Aggarwal, Daeho Baek et al.

Incremental video search requires high-quality ranking after each keystroke, where intent is often underspecified (e.g., 1-3 character prefixes). We present a personalization system for Apple TV search that combines complementary semantic and collaborative signals at ranking time. Our approach learns two item embedding spaces: (i) a text-based multilingual encoder (TextEmb) fine-tuned on co-engagement triplets via contrastive learning, and (ii) an ID-based collaborative embedding model (IdEmb) trained on interaction-derived positives. At serving time, we construct user representations from recent watch history and inject text- and ID-based user-item cosine similarities into a pairwise XGBoost ranker. We evaluate with temporally held-out offline datasets and a three-week online controlled experiment. Offline, for sessions with user history, the personalized ranker improves NDCG@10 by 2.99% and MRR by 3.30% over the non-personalized baseline. Slice analyses show that personalization is most needed in incremental search, where intent is still forming: on ambiguous prefix queries (1-3 characters), NDCG@10 lift is +8.63%, versus +1.46% on longer, fully specified queries. Longer-history users benefit more: NDCG lift rises from +2.13% for users with 1-5 history items to +4.37% for users with 51-100, even though baseline relevance is lower for these cohorts (NDCG@10 drops from 0.733 to 0.680), indicating that personalization adds the most value where default ranking underperforms. Online, treatment yields statistically significant gains of +1.14% tap-through rate and +1.23% conversion rate, with a 2.91% improvement in converted-item rank position. We further analyze coverage-precision trade-offs between semantic and collaborative embeddings via ablations isolating each signal, and evaluate embedding quality on a held-out corpus with LLM-judged similarity labels to reduce click/exposure bias.
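The serving-time step described above (building a user representation from recent watch history and computing text- and ID-based user-item cosine similarities as ranker features) could look roughly like the sketch below. This is a minimal illustration, not the authors' implementation: the abstract does not say how the user representation is aggregated, so mean pooling is an assumption, and the names `mean_pool`, `personalization_features`, `text_sim`, and `id_sim` are hypothetical. The resulting features would then be passed to the ranker alongside the non-personalized features.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def mean_pool(vectors: List[Vector]) -> List[float]:
    """Average a non-empty list of equal-length vectors (assumed aggregation)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def personalization_features(
    history: List[str],
    candidate: str,
    text_emb: Dict[str, Vector],
    id_emb: Dict[str, Vector],
) -> Dict[str, float]:
    """Compute one similarity feature per embedding space for a candidate item.

    Items missing from an embedding table are skipped; if nothing usable
    remains, the feature falls back to 0.0 (a hypothetical default).
    """
    feats: Dict[str, float] = {}
    for name, table in (("text_sim", text_emb), ("id_sim", id_emb)):
        known = [table[item] for item in history if item in table]
        if not known or candidate not in table:
            feats[name] = 0.0
        else:
            feats[name] = cosine(mean_pool(known), table[candidate])
    return feats


# Toy embeddings, made up for illustration only.
text_emb = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
id_emb = {"a": [1.0, 0.0], "c": [0.0, 1.0]}
feats = personalization_features(["a", "b"], "c", text_emb, id_emb)
```

Keeping the two similarities as separate features, rather than merging the embedding spaces, lets the ranker weigh the semantic and collaborative signals independently, which matches the coverage-precision trade-off the abstract mentions.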

1.2 · CV · Jun 17, 2020

Pain Intensity Estimation from Mobile Video Using 2D and 3D Facial Keypoints

Matthew Lee, Lyndon Kennedy, Andreas Girgensohn et al.

Managing post-surgical pain is critical for successful surgical outcomes. One of the challenges of pain management is accurately assessing the pain level of patients. Self-reported numeric pain ratings are limited because they are subjective, can be affected by mood, and can influence the patient's perception of pain when making comparisons. In this paper, we introduce an approach that analyzes 2D and 3D facial keypoints of post-surgical patients to estimate their pain intensity level. Our approach leverages the previously unexplored capabilities of a smartphone to capture a dense 3D representation of a person's face as input for pain intensity level estimation. Our contributions are: (i) a data collection study with post-surgical patients to collect ground-truth labeled sequences of 2D and 3D facial keypoints for developing a pain estimation algorithm; (ii) a pain estimation model that uses multiple instance learning to overcome inherent limitations in facial keypoint sequences; and (iii) the preliminary results of the pain estimation model using 2D and 3D features, with comparisons to alternate approaches.

3.3 · HC · Jun 11, 2020

Automatic Photo to Ideophone Manga Matching

David A. Shamma, Tony Dunnigan, Lyndon Kennedy

Photo applications offer tools for annotation via text and stickers. Ideophones (mimetic and onomatopoeic words), which are common in graphic novels, have yet to be explored for photo annotation use. We present a method for automatic ideophone recommendation and positioning of the text on photos. These annotations are accomplished by obtaining a list of ideophones with English definitions and applying a suite of visual object detectors to the image. Next, a semantic embedding maps the visual objects to the possible relevant ideophones. Our system stands in contrast to traditional computer vision-based annotation systems, which stop at recommending object and scene-level annotation, by providing annotations that are communicative, fun, and engaging. We test these annotations in Japanese and find that they are strongly preferred and increase enjoyment and sharing likelihood when compared to unannotated and object-based annotated photos.

1.1 · CV · Apr 21, 2016

Visual Congruent Ads for Image Search

Yannis Kalantidis, Ayman Farahat, Lyndon Kennedy et al.

The quality of user experience online is affected by the relevance and placement of advertisements. We propose a new system for selecting and displaying visual advertisements in image search result sets. Our method compares the visual similarity of candidate ads to the image search results and selects the most visually similar ad to be displayed. The method further selects an appropriate location in the displayed image grid to minimize the perceptual visual differences between the ad and its neighbors. We conduct an experiment with about 900 users and find that our proposed method provides significant improvement in the users' overall satisfaction with the image search experience, without diminishing the users' ability to see the ad or recall the advertised brand.
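The two selection steps in this abstract (choosing the ad most visually similar to the search results, then choosing a grid location that minimizes visual differences with its neighbors) can be sketched as follows. This is an illustrative simplification under stated assumptions: the abstract does not specify the visual features or the similarity measure, so Euclidean distance over generic feature vectors and a 4-neighbor grid are assumptions, and `pick_ad` and `pick_slot` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_ad(ads: List[Vector], results: List[Vector]) -> int:
    """Return the index of the ad with the smallest mean distance to the results."""
    def mean_dist(ad: Vector) -> float:
        return sum(distance(ad, r) for r in results) / len(results)

    return min(range(len(ads)), key=lambda i: mean_dist(ads[i]))


def pick_slot(ad: Vector, grid: List[List[Vector]]) -> Tuple[int, int]:
    """Return the grid cell whose 4-neighbors are, on average, closest to the ad."""
    rows, cols = len(grid), len(grid[0])
    best, best_cost = (0, 0), float("inf")
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                grid[r + dr][c + dc]
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
            if not neighbors:  # a 1x1 grid has no neighbors to compare against
                continue
            cost = sum(distance(ad, n) for n in neighbors) / len(neighbors)
            if cost < best_cost:
                best, best_cost = (r, c), cost
    return best


# Toy feature vectors, made up for illustration only.
ads = [[0.0, 0.0], [5.0, 5.0]]
results = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [9.0, 9.0]]
grid = [[[0.0, 0.5], [9.0, 9.0], [8.0, 8.0]]]

chosen_ad = pick_ad(ads, results)
chosen_slot = pick_slot(ads[chosen_ad], grid)
```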

3.0 · CV · Apr 21, 2016

LOH and behold: Web-scale visual search, recommendation and clustering using Locally Optimized Hashing

Yannis Kalantidis, Lyndon Kennedy, Huy Nguyen et al.

We propose a novel hashing-based matching scheme, called Locally Optimized Hashing (LOH), based on a state-of-the-art quantization algorithm that can be used for efficient, large-scale search, recommendation, clustering, and deduplication. We show that matching with LOH only requires set intersections and summations to compute and so is easily implemented in generic distributed computing systems. We further show application of LOH to: a) large-scale search tasks where performance is on par with other state-of-the-art hashing approaches; b) large-scale recommendation where queries consisting of thousands of images can be used to generate accurate recommendations from collections of hundreds of millions of images; and c) efficient clustering with a graph-based algorithm that can be scaled to massive collections in a distributed environment or can be used for deduplication for small collections, like search results, performing better than traditional hashing approaches while only requiring a few milliseconds to run. In this paper we experiment on datasets of up to 100 million images, but in practice our system can scale to larger collections and can be used for other types of data that have a vector representation in a Euclidean space.
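The claim that matching needs only set intersections and summations can be illustrated with a simplified sketch. This is not the LOH algorithm itself, whose quantization details are not given in the abstract. It assumes each image has already been reduced to a set of `(subquantizer index, centroid id)` codes, which is a hypothetical encoding, and it scores a pair of images by the size of their code-set intersection. A multi-image query, as in the recommendation use case, sums those scores per candidate. The names `match_score` and `recommend` are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical encoding: one (subquantizer index, centroid id) pair per subspace.
Code = Tuple[int, int]


def match_score(a: FrozenSet[Code], b: FrozenSet[Code]) -> int:
    """Similarity as the number of shared codes (a set intersection)."""
    return len(a & b)


def recommend(
    query_items: List[FrozenSet[Code]],
    index: Dict[str, FrozenSet[Code]],
    top_k: int,
) -> List[Tuple[str, int]]:
    """Sum intersection scores over all query items and return the top_k candidates."""
    scores: Counter = Counter()
    for query in query_items:
        for item_id, codes in index.items():
            shared = match_score(query, codes)
            if shared:
                scores[item_id] += shared
    return scores.most_common(top_k)


# Toy index and queries, made up for illustration only.
index = {
    "x": frozenset({(0, 1), (1, 2), (2, 3)}),
    "y": frozenset({(0, 1), (1, 9), (2, 8)}),
    "z": frozenset({(0, 7), (1, 7), (2, 7)}),
}
queries = [
    frozenset({(0, 1), (1, 2), (2, 3)}),
    frozenset({(0, 1), (1, 2), (2, 8)}),
]
top = recommend(queries, index, top_k=2)
```

The exhaustive loop over `index` is for readability only. Because the score is a sum over per-code matches, the same computation maps naturally onto an inverted index keyed by code, which is one reason this kind of matching suits generic distributed computing systems.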