Omar Javed

h-index 67

3 papers

4 citations

Novelty 43%

AI Score 34

Ranked #130,697 of 205,806 authors (top 64%); #40,917 in CV (top 69%)

3 Papers

CV, Feb 12

Adapting Vision-Language Models for E-commerce Understanding at Scale

Matteo Nulli, Vladimir Orshulevich, Tala Bazazo et al.

E-commerce product understanding by nature demands strong multimodal comprehension of text, images, and structured attributes. General-purpose Vision-Language Models (VLMs) enable generalizable multimodal latent modelling, yet there is no documented, well-known strategy for adapting them to the attribute-centric, multi-image, and noisy nature of e-commerce data without sacrificing general performance. In this work, we show through a large-scale experimental study how targeted adaptation of general VLMs can substantially improve e-commerce performance while preserving broad multimodal capabilities. Furthermore, we propose a novel, extensive evaluation suite covering deep product understanding, strict instruction following, and dynamic attribute extraction.

CR, Jan 11, 2021

Understanding the Quality of Container Security Vulnerability Detection Tools

Omar Javed, Salman Toor

Virtualization enables the information and communications technology industry to better manage computing resources. In this regard, improvements in virtualization approaches, together with the need for a consistent runtime environment, lower overhead, and smaller package size, have led to the growing adoption of containers. A container packages an application, its dependencies, and the operating system (OS) to run as an isolated unit. However, a pressing concern with the use of containers is their susceptibility to security attacks. Consequently, a number of container scanning tools are available for detecting container security vulnerabilities. In this study, we investigate the quality of existing container scanning tools by proposing two metrics that reflect coverage and accuracy. We analyze 59 popular public container images for Java applications hosted on DockerHub using different container scanning tools (such as Clair, Anchore, and Microscanner). Our findings show that existing container scanning approaches do not detect application package vulnerabilities. Furthermore, existing tools do not have high accuracy, since 34% of vulnerabilities are missed by the best-performing tool. Finally, we also demonstrate the quality of Docker images for Java applications hosted on DockerHub by assessing the complete vulnerability landscape, i.e., the number of vulnerabilities detected in images.

CV, Oct 25, 2015

Depth Extraction from Videos Using Geometric Context and Occlusion Boundaries

S. Hussain Raza, Omar Javed, Aveek Das et al.

We present an algorithm to estimate depth in dynamic video scenes. We propose to learn and infer depth in videos from appearance, motion, occlusion boundaries, and geometric context of the scene. Using our method, depth can be estimated from unconstrained videos with no requirement of camera pose estimation, and with significant background/foreground motions. We start by decomposing a video into spatio-temporal regions. For each spatio-temporal region, we learn the relationship of depth to visual appearance, motion, and geometric classes. Then we infer the depth information of new scenes using a piecewise planar parametrization estimated within a Markov random field (MRF) framework by combining learned appearance-to-depth mappings and occlusion-boundary-guided smoothness constraints. Subsequently, we perform temporal smoothing to obtain temporally consistent depth maps. To evaluate our depth estimation algorithm, we provide a novel dataset with ground truth depth for outdoor video scenes. We present a thorough evaluation of our algorithm on our new dataset and the publicly available Make3D static image dataset.