Sagi Eppel

CV
h-index: 12
21 papers
368 citations
Novelty: 43%
AI Score: 41

21 Papers

CV · Dec 1, 2022
One-shot recognition of any material anywhere using contrastive learning with physics-based rendering

Manuel S. Drehwald, Sagi Eppel, Jolina Li et al.

Visual recognition of materials and their states is essential for understanding most aspects of the world, such as determining whether food is cooked, metal is rusted, or a chemical reaction has occurred. However, current image recognition methods are limited to specific classes and properties and can't handle the vast number of material states in the world. To address this, we present MatSim: the first dataset and benchmark for computer vision-based recognition of similarities and transitions between materials and textures, focusing on identifying any material under any conditions using one or a few examples. The dataset contains synthetic and natural images. The synthetic images were rendered using giant collections of textures, objects, and environments generated by computer graphics artists. We use mixtures and gradual transitions between materials to allow the system to learn cases with smooth transitions between states (like gradually cooked food). We also render images with materials inside transparent containers to support beverage and chemistry lab use cases. We use this dataset to train a siamese net that identifies the same material in different objects, mixtures, and environments. The descriptor generated by this net can be used to identify the states of materials and their subclasses using a single image. We also present the first few-shot material recognition benchmark with images from a wide range of fields, including the state of foods and drinks, types of grounds, and many other use cases. We show that a net trained on the MatSim synthetic dataset outperforms state-of-the-art models like CLIP on the benchmark and also achieves good results on other unsupervised material classification tasks.

AI · Sep 23, 2022
Predicting the Future of AI with AI: High-quality link prediction in an exponentially growing knowledge network

Mario Krenn, Lorenzo Buffoni, Bruno Coutinho et al.

A tool that could suggest new personalized research directions and ideas by taking insights from the scientific literature could significantly accelerate the progress of science. A field that might benefit from such an approach is artificial intelligence (AI) research, where the number of scientific publications has been growing exponentially over recent years, making it challenging for human researchers to keep track of the progress. Here, we use AI techniques to predict the future research directions of AI itself. We develop a new graph-based benchmark based on real-world data -- the Science4Cast benchmark, which aims to predict the future state of an evolving semantic network of AI. For that, we use more than 100,000 research papers and build up a knowledge network with more than 64,000 concept nodes. We then present ten diverse methods to tackle this task, ranging from pure statistical to pure learning methods. Surprisingly, the most powerful methods use a carefully curated set of network features, rather than an end-to-end AI approach. This indicates great untapped potential for pure ML approaches that do not rely on human knowledge. Ultimately, better predictions of new future research directions will be a crucial component of more advanced research suggestion tools.

CV · Jan 8
Coding the Visual World: From Image to Simulation Using Vision Language Models

Sagi Eppel

The ability to construct mental models of the world is a central aspect of understanding. Similarly, visual understanding can be viewed as the ability to construct a representative model of the system depicted in an image. This work explores the capacity of Vision Language Models (VLMs) to recognize and simulate the systems and mechanisms depicted in images using the Im2Sim methodology. The VLM is given a natural image of a real-world system (e.g., cities, clouds, vegetation) and is tasked with describing the system and writing code that simulates and generates it. This generative code is then executed to produce a synthetic image, which is compared against the original. This approach is tested on various complex emergent systems, ranging from physical systems (waves, lights, clouds) to vegetation, cities, materials, and geological formations. Through analysis of the models and images generated by the VLMs, we examine their understanding of the systems in images. The results show that leading VLMs (GPT, Gemini) have the ability to understand and model complex, multi-component systems across multiple layers of abstraction and a wide range of domains. At the same time, the VLMs exhibit limited ability to replicate fine details and low-level arrangements of patterns in the image. These findings reveal an interesting asymmetry: VLMs combine high-level, deep visual understanding of images with limited perception of fine details.

CV · Nov 3, 2025
SciTextures: Collecting and Connecting Visual Patterns, Models, and Code Across Science and Art

Sagi Eppel, Alona Strugatski

The ability to connect visual patterns with the processes that form them represents one of the deepest forms of visual understanding. Textures of clouds and waves, the growth of cities and forests, or the formation of materials and landscapes are all examples of patterns emerging from underlying mechanisms. We present the SciTextures dataset, a large-scale collection of textures and visual patterns from all domains of science, tech, and art, along with the models and code that generate these images. Covering over 1,200 different models and 100,000 images of patterns and textures from physics, chemistry, biology, sociology, technology, mathematics, and art, this dataset offers a way to explore the connection between the visual patterns that shape our world and the mechanisms that produce them. The dataset was created by an agentic AI pipeline that autonomously collects and implements models in standardized form. We use SciTextures to evaluate the ability of leading AI models to link visual patterns to the models and code that generate them, and to identify different patterns that emerged from the same process. We also test AI's ability to infer and recreate the mechanisms behind visual patterns by providing a natural image of a real-world pattern and asking the AI to identify, model, and code the mechanism that formed the pattern, then run this code to generate a simulated image that is compared to the real image. These benchmarks show that vision-language models (VLMs) can understand and simulate the physical system behind a visual pattern. The dataset and code are available at: https://zenodo.org/records/17485502

CV · Mar 29, 2025
Shape and Texture Recognition in Large Vision-Language Models

Sagi Eppel, Mor Bismut, Alona Faktor-Strugatski

Shapes and textures are the basic building blocks of visual perception. The ability to identify shapes regardless of orientation, texture, or context, and to recognize textures and materials independently of their associated objects, is essential for a general visual understanding of the world. This work introduces the Large Shape and Textures dataset (LAS&T), a giant collection of highly diverse shapes and textures, created by unsupervised extraction of patterns from natural images. This dataset is used to benchmark how effectively leading Large Vision-Language Models (LVLMs) recognize and represent shapes, textures, and materials in 2D and 3D scenes. For shape recognition, we test the models' ability to match images of identical shapes that differ in orientation, texture, color, or environment. Our results show that the shape recognition capabilities of the LVLMs remain significantly below human performance. LVLMs rely predominantly on high-level and semantic features and struggle with abstract shapes lacking class associations. For texture and material recognition, we evaluated the models' ability to identify images with identical textures and materials across different objects and environments. Interestingly, leading LVLMs approach human-level performance in recognizing materials in 3D scenes, yet substantially underperform humans when identifying simpler, more abstract 2D textures and shapes. These results are consistent across a wide range of leading LVLMs (GPT/Gemini/Qwen) and foundation vision models (DINO/CLIP), exposing major deficiencies in the ability of leading models to extract and represent low-level visual features. In contrast, humans and simple nets trained directly for these tasks achieve high accuracy. The LAS&T dataset, featuring over 700,000 images for 2D/3D shape, texture, and material recognition and retrieval, is freely available.

CV · Mar 5, 2024
Learning Zero-Shot Material States Segmentation, by Implanting Natural Image Patterns in Synthetic Data

Sagi Eppel, Jolina Li, Manuel Drehwald et al.

Visual recognition of materials and their states is essential for understanding the physical world, from identifying wet regions on surfaces or stains on fabrics to detecting infected areas on plants or minerals in rocks. Collecting data that captures this vast variability is complex due to the scattered and gradual nature of material states. Manually annotating real-world images is constrained by cost and precision, while synthetic data, although accurate and inexpensive, lacks real-world diversity. This work aims to bridge this gap by infusing patterns automatically extracted from real-world images into synthetic data. Hence, patterns collected from natural images are used to generate and map materials into synthetic scenes. This unsupervised approach captures the complexity of the real world while maintaining the precision and scalability of synthetic data. We also present the first comprehensive benchmark for zero-shot material state segmentation, utilizing real-world images across a diverse range of domains, including food, soils, construction, plants, liquids, and more, each appearing in various states such as wet, dry, infected, cooked, burned, and many others. The annotation includes partial similarity between regions with similar but not identical materials and hard segmentation of only identical material states. This benchmark eluded top foundation models, exposing the limitations of existing data collection methods. Meanwhile, nets trained on the infused data performed significantly better on this and related tasks. The dataset, code, and trained model are available. We also share 300,000 extracted textures and SVBRDF/PBR materials to facilitate future dataset generation.

CV · Dec 14, 2024
Do large language vision models understand 3D shapes?

Sagi Eppel

Large vision-language models (LVLMs) are the leading AI approach for achieving a general visual understanding of the world. Models such as GPT, Claude, Gemini, and Llama can use images to understand and analyze complex visual scenes. 3D objects and shapes are the basic building blocks of the world, and recognizing them is a fundamental part of human perception. The goal of this work is to test whether LVLMs truly understand 3D shapes by testing the models' ability to identify and match objects of the exact same 3D shape but with different orientations and materials/textures. A large number of test images were created using CGI with a huge number of highly diverse objects, materials, and scenes. The results of this test show that the ability of such models to match 3D shapes is significantly below that of humans but much higher than random guessing. This suggests that the models have gained some abstract understanding of 3D shapes but still trail far behind humans in this task. In particular, the models can easily identify the same object in a different orientation, as well as match identical 3D shapes of the same orientation but with different materials and textures. However, when both the object material and orientation are changed, all models perform poorly relative to humans. Code and benchmark are available.

CV · Jun 24, 2024
Vastextures: Vast repository of textures and PBR materials extracted from real-world images using unsupervised methods

Sagi Eppel

Vastextures is a vast repository of 500,000 textures and PBR materials extracted from real-world images using an unsupervised process. The extracted materials and textures are extremely diverse and cover a vast range of real-world patterns, but are at the same time less refined than those in existing repositories. The repository is composed of 2D textures cropped from natural images and SVBRDF/PBR materials generated from these textures. Textures and PBR materials are essential for CGI. Existing material repositories focus on games, animation, and art, which demand a limited number of high-quality assets. However, virtual worlds and synthetic data are becoming increasingly important for training AI systems for computer vision. This application demands a huge number of diverse assets but is at the same time less affected by noisy and unrefined assets. Vastextures aims to address this need by creating a free, huge, and diverse asset repository that covers as many real-world materials as possible. The materials are automatically extracted from natural images in two steps: 1) Automatically scanning a giant number of images to identify and crop regions with uniform textures. This is done by splitting the image into a grid of cells and identifying regions in which all of the cells share a similar statistical distribution. 2) Extracting the properties of the PBR material from the cropped texture. This is done by randomly guessing every correlation between the properties of the texture image and the properties of the PBR material. The resulting PBR materials exhibit a vast range of real-world patterns as well as unexpected emergent properties. Neural nets trained on this repository outperformed nets trained using handcrafted assets.

CV · Sep 30, 2021
Seeing Glass: Joint Point Cloud and Depth Completion for Transparent Objects

Haoping Xu, Yi Ru Wang, Sagi Eppel et al.

The basis of many object manipulation algorithms is RGB-D input. Yet, commodity RGB-D sensors can only provide distorted depth maps for a wide range of transparent objects due to light refraction and absorption. To tackle the perception challenges posed by transparent objects, we propose TranspareNet, a joint point cloud and depth completion method, with the ability to complete the depth of transparent objects in cluttered and complex scenes, even with partially filled fluid contents within the vessels. To address the shortcomings of existing transparent object data collection schemes in the literature, we also propose an automated dataset creation workflow that consists of robot-controlled image collection and vision-based automatic annotation. Through this automated workflow, we created the Toronto Transparent Objects Depth Dataset (TODD), which consists of nearly 15,000 RGB-D images. Our experimental evaluation demonstrates that TranspareNet outperforms existing state-of-the-art depth completion methods on multiple datasets, including ClearGrasp, and that it also handles cluttered scenes when trained on TODD. Code and dataset will be released at https://www.pair.toronto.edu/TranspareNet/

CV · Sep 15, 2021
Predicting 3D shapes, masks, and properties of materials, liquids, and objects inside transparent containers, using the TransProteus CGI dataset

Sagi Eppel, Haoping Xu, Yi Ru Wang et al.

We present TransProteus, a dataset and methods for predicting the 3D structure, masks, and properties of materials, liquids, and objects inside transparent vessels from a single image without prior knowledge of the image source and camera parameters. Manipulating materials in transparent containers is essential in many fields and depends heavily on vision. This work supplies a new procedurally generated dataset consisting of 50k images of liquids and solid objects inside transparent containers. The image annotations include 3D models, material properties (color/transparency/roughness...), and segmentation masks for the vessel and its content. The synthetic (CGI) part of the dataset was procedurally generated using 13k different objects, 500 different environments (HDRI), and 1450 material textures (PBR) combined with simulated liquids and procedurally generated vessels. In addition, we supply 104 real-world images of objects inside transparent vessels with depth maps of both the vessel and its content. We propose a camera-agnostic method that predicts 3D models from an image as an XYZ map. This allows the trained net to predict the 3D model as a map with XYZ coordinates per pixel without prior knowledge of the image source. To calculate the training loss, we use the distance between pairs of points inside the 3D model instead of the absolute XYZ coordinates. This makes the loss function translation invariant. We use this to predict 3D models of vessels and their content from a single image. Finally, we demonstrate a net that uses a single image to predict the material properties of the vessel content and surface.
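
The pairwise-distance loss described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pairwise_distance_loss`, the random pair sampling, and the use of an absolute difference are all assumptions made for illustration. It only shows why comparing distances between point pairs, rather than absolute XYZ coordinates, is unaffected by a global translation.

```python
import math
import random


def pairwise_distance_loss(pred, target, num_pairs=200, seed=0):
    """Mean absolute difference between distances of sampled point pairs.

    pred and target are equal-length lists of (x, y, z) tuples.
    Hypothetical helper, not taken from the paper's code.
    """
    rng = random.Random(seed)
    n = len(pred)
    total = 0.0
    for _ in range(num_pairs):
        i, j = rng.randrange(n), rng.randrange(n)
        d_pred = math.dist(pred[i], pred[j])      # distance in the prediction
        d_true = math.dist(target[i], target[j])  # distance in the ground truth
        total += abs(d_pred - d_true)
    return total / num_pairs


# Toy ground-truth XYZ points (made-up values).
target = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 2.0, 1.5), (1.0, 2.0, 2.0)]
# A prediction that is correct up to a global shift ...
shifted = [(x + 5.0, y - 3.0, z + 10.0) for x, y, z in target]
# ... and one with the wrong scale.
scaled = [(2 * x, 2 * y, 2 * z) for x, y, z in target]

loss_shifted = pairwise_distance_loss(shifted, target)  # close to zero
loss_scaled = pairwise_distance_loss(scaled, target)    # clearly positive
```

The shifted prediction is not penalized because every pairwise distance is preserved, while the rescaled one is, which is the translation invariance the abstract refers to.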

CV · May 4, 2021
Computer vision for liquid samples in hospitals and medical labs using hierarchical image segmentation and relations prediction

Sagi Eppel, Haoping Xu, Alan Aspuru-Guzik

This work explores the use of computer vision for image segmentation and classification of medical fluid samples in transparent containers (for example, tubes, syringes, infusion bags). Handling fluids such as infusion fluids, blood, and urine samples is a significant part of the work carried out in medical labs and hospitals. The ability to accurately identify and segment the liquids and the vessels that contain them from images can help in automating such processes. Modern computer vision typically involves training deep neural nets on large datasets of annotated images. This work presents a new dataset containing 1,300 annotated images of medical samples involving vessels containing liquids and solid material. The images are annotated with the type of liquid (e.g., blood, urine), the phase of the material (e.g., liquid, solid, foam, suspension), the type of vessel (e.g., syringe, tube, cup, infusion bottle/bag), and the properties of the vessel (transparent, opaque). In addition, vessel parts such as corks, labels, spikes, and valves are annotated. Relations and hierarchies between vessels and materials are also annotated, such as which vessel contains which material or which vessels are linked or contain each other. Three neural networks are trained on the dataset: One network learns to detect vessels, a second net detects the materials and parts inside each vessel, and a third net identifies relationships and connectivity between vessels.

LG · Dec 17, 2020
Deep Molecular Dreaming: Inverse machine learning for de-novo molecular design and interpretability with surjective representations

Cynthia Shen, Mario Krenn, Sagi Eppel et al.

Computer-based de-novo design of functional molecules is one of the most prominent challenges in cheminformatics today. As a result, generative and evolutionary inverse designs from the field of artificial intelligence have emerged at a rapid pace, with aims to optimize molecules for a particular chemical property. These models 'indirectly' explore the chemical space by learning latent spaces, policies, or distributions, or by applying mutations on populations of molecules. However, the recent development of the SELFIES string representation of molecules, a surjective alternative to SMILES, has made possible other potential techniques. Based on SELFIES, we therefore propose PASITHEA, a direct gradient-based molecule optimization that applies inceptionism techniques from computer vision. PASITHEA exploits the use of gradients by directly reversing the learning process of a neural network, which is trained to predict real-valued chemical properties. Effectively, this forms an inverse regression model, which is capable of generating molecular variants optimized for a certain property. Although our results are preliminary, we observe a shift in distribution of a chosen property during inverse-training, a clear indication of PASITHEA's viability. A striking property of inceptionism is that we can directly probe the model's understanding of the chemical space it was trained on. We expect that extending PASITHEA to larger datasets, molecules and more complex properties will lead to advances in the design of new functional molecules as well as the interpretation and explanation of machine learning models.

CV · Aug 24, 2019
Generator evaluator-selector net for panoptic image segmentation and splitting unfamiliar objects into parts

Sagi Eppel, Alan Aspuru-Guzik

In machine learning and other fields, suggesting a good solution to a problem is usually a harder task than evaluating the quality of such a solution. This asymmetry is the basis for a large number of selection-oriented methods that use a generator system to guess a set of solutions and an evaluator system to rank and select the best solutions. This work examines the use of this approach for the problem of panoptic image segmentation and class-agnostic parts segmentation. The generator/evaluator approach for this case consists of two independent convolutional neural nets: a generator net that suggests a variety of segments corresponding to objects, stuff and parts regions in the image, and an evaluator net that chooses the best segments to be merged into the segmentation map. The result is a trial and error evolutionary approach in which a generator that guesses segments with low average accuracy, but with wide variability, can still produce good results when coupled with an accurate evaluator. The generator consists of a Pointer net that receives an image and a point in the image, and predicts the region of the segment containing the point. Generating and evaluating each segment separately is essential in this case since it demands exponentially fewer guesses compared to a system that guesses and evaluates the full segmentation map in each try. The classification of the selected segments is done by an independent region-specific classification net. This allows the segmentation to be class agnostic and hence, capable of segmenting unfamiliar categories that were not part of the training set. The method was examined on the COCO Panoptic segmentation benchmark and gave results comparable to those of the basic semantic segmentation and Mask-RCNN methods. In addition, the system was used for the task of splitting objects of unseen classes (that did not appear in the training set) into parts.

CV · Feb 20, 2019
Class-independent sequential full image segmentation, using a convolutional net that finds a segment within an attention region, given a pointer pixel within this segment

Sagi Eppel

This work examines the use of a fully convolutional net (FCN) to find an image segment, given a pixel within this segment region. The net receives an image, a point in the image and a region of interest (RoI) mask. The net output is a binary mask of the segment in which the point is located. The region where the segment can be found is contained within the input RoI mask. Full image segmentation can be achieved by running this net sequentially, region-by-region on the image, and stitching the output segments into a single segmentation map. This simple method addresses two major challenges of image segmentation: 1) Segmentation of unknown categories that were not included in the training set. 2) Segmentation of both individual object instances (things) and non-objects (stuff), such as sky and vegetation. Hence, if the pointer pixel is located within a person in a group, the net will output a mask that covers that individual person; if the pointer pixel is located within the sky region, the net returns the region of the sky in the image. This is true even if no example for sky or person appeared in the training set. The net was trained and tested on the COCO panoptic dataset and achieved 67% IOU for segmentation of familiar classes (that were part of the net training set) and 53% IOU for segmentation of unfamiliar classes (that were not included in the training).

CV · Dec 1, 2018
Classifying a specific image region using convolutional nets with an ROI mask as input

Sagi Eppel

Convolutional neural nets (CNN) are the leading computer vision method for classifying images. In some cases, it is desirable to classify only a specific region of the image that corresponds to a certain object. Hence, assuming that the region of the object in the image is known in advance and is given as a binary region of interest (ROI) mask, the goal is to classify the object in this region using a convolutional neural net. This goal is achieved using a standard image classification net with the addition of a side branch, which converts the ROI mask into an attention map. This map is then combined with the image classification net. This allows the net to focus the attention on the object region while still extracting contextual cues from the background. This approach was evaluated using the COCO object dataset and the OpenSurfaces materials dataset. In both cases, it gave superior results to methods that completely ignore the background region. In addition, it was found that combining the attention map at the first layer of the net gave better results than combining it at higher layers of the net. The advantages of this method are most apparent in the classification of small regions which demands a great deal of contextual information from the background.

CV · Oct 14, 2017
Hierarchical semantic segmentation using modular convolutional neural networks

Sagi Eppel

Image recognition tasks that involve identifying parts of an object or the contents of a vessel can be viewed as a hierarchical problem, which can be solved by initial recognition of the main object, followed by recognition of its parts or contents. To achieve such modular recognition, it is necessary to use the output of one recognition method (which identifies the general object) as the input for a second method (which identifies the parts or contents). In recent years, convolutional neural networks have emerged as the dominant method for segmentation and classification of images. This work examines a method for serially connecting convolutional neural networks for semantic segmentation of materials inside transparent vessels. It applies one fully convolutional neural net to segment the image into vessel and background, and the vessel region is used as an input for a second net which recognizes the contents of the glass vessel. Transferring the segmentation map generated by the first net to the second net was performed using the valve filter attention method that involves using different filters on different segments of the image. This modular semantic segmentation method outperforms a single step method in which both the vessel and its contents are identified using a single net. An advantage of the modular neural net is that it allows networks to be built from existing trained modules, as well as the transfer and reuse of trained net modules without the need for any retraining of the assembled net.

CV · Aug 29, 2017
Setting an attention region for convolutional neural networks using region selective features, for recognition of materials within glass vessels

Sagi Eppel

Convolutional neural networks have emerged as the leading method for the classification and segmentation of images. In some cases, it is desirable to focus the attention of the net on a specific region in the image; one such case is the recognition of the contents of transparent vessels, where the vessel region in the image is already known. This work presents a valve filter approach for focusing the attention of the net on a region of interest (ROI). In this approach, the ROI is inserted into the net as a binary map. The net uses a different set of convolution filters for the ROI and background image regions, resulting in a different set of features being extracted from each region. More accurately, for each filter used on the image, a corresponding valve filter exists that acts on the ROI map and determines the regions in which the corresponding image filter will be used. This valve filter effectively acts as a valve that inhibits specific features in different image regions according to the ROI map. In addition, a new data set for images of materials in glassware vessels in a chemistry laboratory setting is presented. This data set contains a thousand images with pixel-wise annotation according to categories ranging from filled and empty to the exact phase of the material inside the vessel. The results of the valve filter approach and fully convolutional neural nets (FCN) with no ROI input are compared based on this data set.
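
The valve filter idea in the abstract above can be sketched in a few lines. This is a rough illustration under stated assumptions, not the paper's implementation: it assumes the valve filter response on the ROI map is combined with the image filter response by element-wise multiplication, it uses a single hand-picked 3x3 kernel pair instead of learned filters, and the names `conv2d_same` and `valve_filtered_feature` are hypothetical.

```python
def conv2d_same(grid, kernel):
    """Plain 2D cross-correlation with zero padding (output has the input size)."""
    rows, cols = len(grid), len(grid[0])
    k = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i, kernel_row in enumerate(kernel):
                for j, w in enumerate(kernel_row):
                    rr, cc = r + i - k, c + j - k
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += w * grid[rr][cc]
            out[r][c] = acc
    return out


def valve_filtered_feature(image, roi, image_kernel, valve_kernel):
    """Gate an image feature map with the valve filter response on the ROI map.

    Assumption: the two responses are multiplied element-wise.
    """
    feature = conv2d_same(image, image_kernel)
    valve = conv2d_same(roi, valve_kernel)
    return [[f * v for f, v in zip(f_row, v_row)]
            for f_row, v_row in zip(feature, valve)]


# Toy 4x4 image of ones; the ROI covers the two left-hand columns.
image = [[1.0] * 4 for _ in range(4)]
roi = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

gated = valve_filtered_feature(image, roi, identity, identity)
```

With these identity kernels the feature survives inside the ROI and is suppressed outside it. A trained valve filter could instead learn to pass a feature only in the background, which is how different features would end up being extracted from each region.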

CV · Jan 31, 2016
Tracing liquid level and material boundaries in transparent vessels using the graph cut computer vision approach

Sagi Eppel

Detection of boundaries of materials stored in transparent vessels is essential for identifying properties such as liquid level and phase boundaries, which are vital for controlling numerous processes in industry and the chemistry laboratory. This work presents a computer vision method for identifying the boundary of materials in transparent vessels using the graph-cut algorithm. The method receives an image of a transparent vessel containing a material and the contour of the vessel in the image. The boundary of the material in the vessel is found by the graph cut method. In general, the method uses the vessel region of the image to create a graph, where pixels are vertices, and the cost of an edge between two pixels is inversely correlated with their intensity difference. The bottom 10% of the vessel region in the image is assumed to correspond to the material phase and defined as the graph source. The top 10% of the pixels in the vessel are assumed to correspond to the air phase and defined as the graph sink. The minimal cut that splits the resulting graph between the source and sink (hence, material and air) is traced using the max-flow/min-cut approach. This cut corresponds to the boundary of the material in the image. The method gave high accuracy in boundary recognition for a wide range of liquid, solid, granular and powder materials in various glass vessels from everyday life and the chemistry laboratory, such as bottles, jars, glasses, chromatography columns and separatory funnels.
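
The graph construction described in the abstract above can be sketched on a toy grayscale image. The following is an illustrative sketch only: it assumes the vessel fills a rectangular image, uses 4-neighbor edges with weight `1 / (1 + |intensity difference|)` as one possible "inversely correlated" cost, and solves max-flow with a simple Edmonds-Karp loop. The function name `material_top_row` and the toy intensities are made up.

```python
from collections import deque

INF = float("inf")


def material_top_row(image, frac=0.1):
    """Return the top-most row on the material (source) side of the min cut."""
    rows, cols = len(image), len(image[0])
    n = rows * cols
    source, sink = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities per vertex

    def link(u, v, forward, backward):
        cap[u][v] = cap[u].get(v, 0.0) + forward
        cap[v][u] = cap[v].get(u, 0.0) + backward

    k = max(1, int(rows * frac))  # number of rows tied to the sink / source
    for r in range(rows):
        for c in range(cols):
            p = r * cols + c
            if r < k:                # top rows: assumed air, tied to the sink
                link(p, sink, INF, 0.0)
            if r >= rows - k:        # bottom rows: assumed material, tied to the source
                link(source, p, INF, 0.0)
            for rr, cc in ((r + 1, c), (r, c + 1)):
                if rr < rows and cc < cols:
                    # Similar pixels get an expensive edge, dissimilar ones a cheap edge.
                    w = 1.0 / (1.0 + abs(image[r][c] - image[rr][cc]))
                    link(p, rr * cols + cc, w, w)

    while True:  # Edmonds-Karp: augment along shortest residual paths
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, residual in cap[u].items():
                if residual > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # parent now holds the source side of the minimum cut
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow

    return min(p // cols for p in parent if p < n)


# Toy vessel region: six bright "air" rows above four dark "material" rows.
image = [[200] * 4 for _ in range(6)] + [[50] * 4 for _ in range(4)]
boundary_row = material_top_row(image)
```

In this toy image the cheapest edges are the ones crossing the intensity jump between rows 5 and 6, so the cut falls there and the material region starts at row 6.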

CV · May 30, 2015
Using curvature to distinguish between surface reflections and vessel contents in computer vision based recognition of materials in transparent vessels

Sagi Eppel

The recognition of materials and objects inside transparent containers using computer vision has a wide range of applications, ranging from industrial bottle filling to the automation of the chemistry laboratory. One of the main challenges in such recognition is the ability to distinguish between image features resulting from the vessel's surface and image features resulting from the material inside the vessel. Reflections and the functional parts of a vessel's surface can create strong edges that can be mistakenly identified as corresponding to the vessel contents, and cause recognition errors. The ability to evaluate whether a specific edge in an image stems from the vessel's surface or from its contents can considerably improve the ability to identify materials inside transparent vessels. This work suggests a method for such evaluation, based on the following two assumptions: 1) Areas of high curvature on the vessel surface are likely to cause strong edges due to changes in reflectivity, as is the appearance of functional parts (e.g. corks or valves). 2) Most transparent vessels (bottles, glasses) have high symmetry (cylindrical). As a result, the curvature angle of the vessel's surface at each point of the image is similar to the curvature angle of the contour line of the vessel in the same row in the image. These assumptions allow the identification of image regions with strong edges corresponding to the vessel surface reflections. Combining this method with existing image analysis methods for detecting materials inside transparent containers allows considerable improvement in accuracy.

CVJan 20, 2015
Tracing the boundaries of materials in transparent vessels using computer vision

Sagi Eppel

Visual recognition of material boundaries in transparent vessels is valuable for numerous applications. Such recognition is essential for estimation of fill-level, volume and phase-boundaries as well as for tracking of such chemical processes as precipitation, crystallization, condensation, evaporation and phase-separation. The problem of material boundary recognition in images is particularly complex for materials with non-flat surfaces, i.e., solids, powders and viscous fluids, in which the material interfaces have unpredictable shapes. This work demonstrates a general method for finding the boundaries of materials inside transparent containers in images. The method uses an image of the transparent vessel containing the material and the boundary of the vessel in this image. The recognition is based on the assumption that the material boundary appears in the image in the form of a curve (with various constraints) whose endpoints are both positioned on the vessel contour. The probability that a curve matches the material boundary in the image is evaluated using a cost function based on some image properties along this curve. Several image properties were examined as indicators for the material boundary. The optimal boundary curve was found using Dijkstra's algorithm. The method was successfully tested on the recognition of various types of phase-boundaries, including liquid-air, solid-air and solid-liquid interfaces, as well as on various types of glassware containers from everyday life and the chemistry laboratory (e.g., bottles, beakers, flasks, jars, columns, vials and separation-funnels). In addition, the method can be easily extended to materials carried on top of carrier vessels (e.g., plates, spoons, spatulas).
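The search for an optimal boundary curve can be sketched with Dijkstra's algorithm on a pixel grid. The sketch makes several assumptions that are not from the paper: the vessel region is a rectangle whose left and right columns play the role of the vessel contour, the curve moves one column to the right per step while shifting at most one row, and the per-pixel cost is a hypothetical 1 / (1 + vertical intensity change). `trace_boundary` is an illustrative name.

```python
import heapq


def trace_boundary(image):
    """Cheapest left-to-right curve through the vertical-gradient cost map.

    Returns a list of (row, col) pairs; row r marks the boundary between
    image rows r and r + 1.
    """
    rows, cols = len(image), len(image[0])
    n = rows - 1
    # Low cost where the intensity changes sharply between adjacent rows.
    cost = [
        [1.0 / (1.0 + abs(image[r + 1][c] - image[r][c])) for c in range(cols)]
        for r in range(n)
    ]
    dist, parent, heap = {}, {}, []
    for r in range(n):  # every left-column pixel is a possible start
        dist[(r, 0)] = cost[r][0]
        parent[(r, 0)] = None
        heap.append((cost[r][0], r, 0))
    heapq.heapify(heap)

    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale queue entry
        if c == cols - 1:
            # The first right-column pixel popped ends the optimal curve.
            path, node = [], (r, c)
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for dr in (-1, 0, 1):
            nr = r + dr
            if 0 <= nr < n:
                nd = d + cost[nr][c + 1]
                if nd < dist.get((nr, c + 1), float("inf")):
                    dist[(nr, c + 1)] = nd
                    parent[(nr, c + 1)] = (r, c)
                    heapq.heappush(heap, (nd, nr, c + 1))
    return []
```

Allowing the row to shift between columns is what lets the curve follow non-flat surfaces; a stricter or looser step constraint changes how irregular a boundary the search can trace.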

CVApr 28, 2014
Computer vision-based recognition of liquid surfaces and phase boundaries in transparent vessels, with emphasis on chemistry applications

Sagi Eppel, Tal Kachman

The ability to recognize the liquid surface and the liquid level in transparent containers is perhaps the most commonly used evaluation method when dealing with fluids. Such recognition is essential in determining the liquid volume, fill level, phase boundaries and phase separation in various fluid systems. The recognition of liquid surfaces is particularly important in solution chemistry, where it is essential to many laboratory techniques (e.g., extraction, distillation, titration). A general method for the recognition of interfaces between liquid and air or between phase-separating liquids could have a wide range of applications and contribute to the understanding of the visual properties of such interfaces. This work examines a computer vision method for the recognition of liquid surfaces and liquid levels in various transparent containers. The method can be applied to recognition of both liquid-air and liquid-liquid surfaces. No prior knowledge of the number of phases is required. The method receives the image of the liquid container and the boundaries of the container in the image and scans all possible curves that could correspond to the outlines of liquid surfaces in the image. The method then compares each curve to the image to rate its correspondence with the outline of the real liquid surface by examining various image properties in the area surrounding each point of the curve. The image properties that were found to give the best indication of the liquid surface are the relative intensity change, the edge density change and the gradient direction relative to the curve normal.
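The scan-and-rate idea can be sketched in a heavily simplified form. The sketch assumes that the candidate curves are horizontal lines only and rates each one by a single property, a relative intensity change defined here as |below - above| / max(below, above). That formula, the restriction to straight lines, and the name `best_surface_row` are assumptions made for illustration; the abstract does not give the exact definitions, and the edge density and gradient direction properties are omitted.

```python
def best_surface_row(image):
    """Return (row, score) for the best-rated horizontal candidate line.

    The candidate at row r separates image rows r - 1 and r. The score is
    the mean relative intensity change across the line.
    """
    best_row, best_score = None, -1.0
    for r in range(1, len(image)):
        changes = []
        for above, below in zip(image[r - 1], image[r]):
            brightest = max(above, below)
            # Avoid dividing by zero when both pixels are black.
            changes.append(abs(below - above) / brightest if brightest else 0.0)
        score = sum(changes) / len(changes)
        if score > best_score:
            best_row, best_score = r, score
    return best_row, best_score
```

Keeping every candidate whose score exceeds a threshold, instead of only the maximum, would be one way to handle several phases without knowing their number in advance.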