Artur Grigorev

CV
h-index42
17papers
660citations
Novelty50%
AI Score41

17 Papers

CVDec 14, 2022
HOOD: Hierarchical Graphs for Generalized Modelling of Clothing Dynamics

Artur Grigorev, Bernhard Thomaszewski, Michael J. Black et al.

We propose a method that leverages graph neural networks, multi-level message passing, and unsupervised training to enable real-time prediction of realistic clothing dynamics. Whereas existing methods based on linear blend skinning must be trained for specific garments, our method is agnostic to body shape and applies to tight-fitting garments as well as loose, free-flowing clothing. Our method furthermore handles changes in topology (e.g., garments with buttons or zippers) and material properties at inference time. As one key contribution, we propose a hierarchical message-passing scheme that efficiently propagates stiff stretching modes while preserving local detail. We empirically show that our method outperforms strong baselines quantitatively and that its results are perceived as more realistic than state-of-the-art methods.

CVSep 12, 2024
Gaussian Garments: Reconstructing Simulation-Ready Clothing with Photorealistic Appearance from Multi-View Video

Boxiang Rong, Artur Grigorev, Wenbo Wang et al.

We introduce Gaussian Garments, a novel approach for reconstructing realistic simulation-ready garment assets from multi-view videos. Our method represents garments with a combination of a 3D mesh and a Gaussian texture that encodes both the color and high-frequency surface details. This representation enables accurate registration of garment geometries to multi-view videos and helps disentangle albedo textures from lighting effects. Furthermore, we demonstrate how a pre-trained graph neural network (GNN) can be fine-tuned to replicate the real behavior of each garment. The reconstructed Gaussian Garments can be automatically combined into multi-garment outfits and animated with the fine-tuned GNN.

LGSep 19, 2022
Traffic incident duration prediction via a deep learning framework for text description encoding

Artur Grigorev, Adriana-Simona Mihaita, Khaled Saleh et al.

Predicting the traffic incident duration is a hard problem to solve due to the stochastic nature of incident occurrence in space and time, a lack of information at the beginning of a reported traffic disruption, and lack of advanced methods in transport engineering to derive insights from past accidents. This paper proposes a new fusion framework for predicting the incident duration from limited information by using an integration of machine learning with traffic flow/speed and incident description as features, encoded via several Deep Learning methods (ANN autoencoder and character-level LSTM-ANN sentiment classifier). The paper constructs a cross-disciplinary modelling approach in transport and data science. The approach improves the incident duration prediction accuracy over the top-performing ML models applied to baseline incident reports. Results show that our proposed method can improve the accuracy by $60\%$ when compared to standard linear or support vector regression models, and a further $7\%$ improvement with respect to the hybrid deep learning auto-encoded GBDT model which seems to outperform all other models. The application area is the city of San Francisco, rich in both traffic incident logs (Countrywide Traffic Accident Data set) and past historical traffic congestion information (5-minute precision measurements from Caltrans Performance Measurement System).

LGMay 10, 2022
Incident duration prediction using a bi-level machine learning framework with outlier removal and intra-extra joint optimisation

Artur Grigorev, Adriana-Simona Mihaita, Seunghyeon Lee et al.

Predicting the duration of traffic incidents is a challenging task due to the stochastic nature of events. The ability to accurately predict how long accidents will last can provide significant benefits to both end-users in their route choice and traffic operation managers in handling of non-recurrent traffic congestion. This paper presents a novel bi-level machine learning framework enhanced with outlier removal and intra-extra joint optimisation for predicting the incident duration on three heterogeneous data sets collected for both arterial roads and motorways from Sydney, Australia and San-Francisco, U.S.A. Firstly, we use incident data logs to develop a binary classification prediction approach, which allows us to classify traffic incidents as short-term or long-term. We find the optimal threshold between short-term versus long-term traffic incident duration, targeting both class balance and prediction performance while also comparing the binary versus multi-class classification approaches. Secondly, for more granularity of the incident duration prediction to the minute level, we propose a new Intra-Extra Joint Optimisation algorithm (IEO-ML) which extends multiple baseline ML models tested against several regression scenarios across the data sets. Final results indicate that: a) 40-45 min is the best split threshold for identifying short versus long-term incidents and that these incidents should be modelled separately, b) our proposed IEO-ML approach significantly outperforms baseline ML models in $66\%$ of all cases showcasing its great potential for accurate incident duration prediction. Lastly, we evaluate the feature importance and show that time, location, incident type, incident reporting source and weather at among the top 10 critical factors which influence how long incidents will last.

CVSep 20, 2022
Traffic Accident Risk Forecasting using Contextual Vision Transformers

Khaled Saleh, Artur Grigorev, Adriana-Simona Mihaita

Recently, the problem of traffic accident risk forecasting has been getting the attention of the intelligent transportation systems community due to its significant impact on traffic clearance. This problem is commonly tackled in the literature by using data-driven approaches that model the spatial and temporal incident impact, since they were shown to be crucial for the traffic accident risk forecasting problem. To achieve this, most approaches build different architectures to capture the spatio-temporal correlations features, making them inefficient for large traffic accident datasets. Thus, in this work, we are proposing a novel unified framework, namely a contextual vision transformer, that can be trained in an end-to-end approach which can effectively reason about the spatial and temporal aspects of the problem while providing accurate traffic accident risk predictions. We evaluate and compare the performance of our proposed methodology against baseline approaches from the literature across two large-scale traffic accident datasets from two different geographical locations. The results have shown a significant improvement with roughly 2\% in RMSE score in comparison to previous state-of-art works (SoTA) in the literature. Moreover, our proposed approach has outperformed the SoTA technique over the two datasets while only requiring 23x fewer computational requirements.

LGJun 27, 2024Code
Predicting the duration of traffic incidents for Sydney greater metropolitan area using machine learning methods

Artur Grigorev, Sajjad Shafiei, Hanna Grzybowska et al.

This research presents a comprehensive approach to predicting the duration of traffic incidents and classifying them as short-term or long-term across the Sydney Metropolitan Area. Leveraging a dataset that encompasses detailed records of traffic incidents, road network characteristics, and socio-economic indicators, we train and evaluate a variety of advanced machine learning models including Gradient Boosted Decision Trees (GBDT), Random Forest, LightGBM, and XGBoost. The models are assessed using Root Mean Square Error (RMSE) for regression tasks and F1 score for classification tasks. Our experimental results demonstrate that XGBoost and LightGBM outperform conventional models with XGBoost achieving the lowest RMSE of 33.7 for predicting incident duration and highest classification F1 score of 0.62 for a 30-minute duration threshold. For classification, the 30-minute threshold balances performance with 70.84% short-term duration classification accuracy and 62.72% long-term duration classification accuracy. Feature importance analysis, employing both tree split counts and SHAP values, identifies the number of affected lanes, traffic volume, and types of primary and secondary vehicles as the most influential features. The proposed methodology not only achieves high predictive accuracy but also provides stakeholders with vital insights into factors contributing to incident durations. These insights enable more informed decision-making for traffic management and response strategies. The code is available by the link: https://github.com/Future-Mobility-Lab/SydneyIncidents

CVApr 29, 2024
4D-DRESS: A 4D Dataset of Real-world Human Clothing with Semantic Annotations

Wenbo Wang, Hsuan-I Ho, Chen Guo et al.

The studies of human clothing for digital avatars have predominantly relied on synthetic datasets. While easy to collect, synthetic data often fall short in realism and fail to capture authentic clothing dynamics. Addressing this gap, we introduce 4D-DRESS, the first real-world 4D dataset advancing human clothing research with its high-quality 4D textured scans and garment meshes. 4D-DRESS captures 64 outfits in 520 human motion sequences, amounting to 78k textured scans. Creating a real-world clothing dataset is challenging, particularly in annotating and segmenting the extensive and complex 4D human scans. To address this, we develop a semi-automatic 4D human parsing pipeline. We efficiently combine a human-in-the-loop process with automation to accurately label 4D scans in diverse garments and body movements. Leveraging precise annotations and high-quality garment meshes, we establish several benchmarks for clothing simulation and reconstruction. 4D-DRESS offers realistic and challenging data that complements synthetic sources, paving the way for advancements in research of lifelike human clothing. Website: https://ait.ethz.ch/4d-dress.

GRMay 15, 2024
ContourCraft: Learning to Resolve Intersections in Neural Multi-Garment Simulations

Artur Grigorev, Giorgio Becherini, Michael J. Black et al.

Learning-based approaches to cloth simulation have started to show their potential in recent years. However, handling collisions and intersections in neural simulations remains a largely unsolved problem. In this work, we present \moniker{}, a learning-based solution for handling intersections in neural cloth simulations. Unlike conventional approaches that critically rely on intersection-free inputs, \moniker{} robustly recovers from intersections introduced through missed collisions, self-penetrating bodies, or errors in manually designed multi-layer outfits. The technical core of \moniker{} is a novel intersection contour loss that penalizes interpenetrations and encourages rapid resolution thereof. We integrate our intersection loss with a collision-avoiding repulsion objective into a neural cloth simulation method based on graph neural networks (GNNs). We demonstrate our method's ability across a challenging set of diverse multi-layer outfits under dynamic human motions. Our extensive analysis indicates that \moniker{} significantly improves collision handling for learned simulation and produces visually compelling results.

CVDec 23, 2024
ChatGarment: Garment Estimation, Generation and Editing via Large Language Models

Siyuan Bian, Chenghao Xu, Yuliang Xiu et al.

We introduce ChatGarment, a novel approach that leverages large vision-language models (VLMs) to automate the estimation, generation, and editing of 3D garments from images or text descriptions. Unlike previous methods that struggle in real-world scenarios or lack interactive editing capabilities, ChatGarment can estimate sewing patterns from in-the-wild images or sketches, generate them from text descriptions, and edit garments based on user instructions, all within an interactive dialogue. These sewing patterns can then be draped on a 3D body and animated. This is achieved by finetuning a VLM to directly generate a JSON file that includes both textual descriptions of garment types and styles, as well as continuous numerical attributes. This JSON file is then used to create sewing patterns through a programming parametric model. To support this, we refine the existing programming model, GarmentCode, by expanding its garment type coverage and simplifying its structure for efficient VLM fine-tuning. Additionally, we construct a large-scale dataset of image-to-sewing-pattern and text-to-sewing-pattern pairs through an automated data pipeline. Extensive evaluations demonstrate ChatGarment's ability to accurately reconstruct, generate, and edit garments from multimodal inputs, highlighting its potential to simplify workflows in fashion and gaming applications. Code and data are available at https://chatgarment.github.io/ .

LGMar 20, 2024
Enhancing Traffic Incident Management with Large Language Models: A Hybrid Machine Learning Approach for Severity Classification

Artur Grigorev, Khaled Saleh, Yuming Ou et al.

This research showcases the innovative integration of Large Language Models into machine learning workflows for traffic incident management, focusing on the classification of incident severity using accident reports. By leveraging features generated by modern language models alongside conventional data extracted from incident reports, our research demonstrates improvements in the accuracy of severity classification across several machine learning algorithms. Our contributions are threefold. First, we present an extensive comparison of various machine learning models paired with multiple large language models for feature extraction, aiming to identify the optimal combinations for accurate incident severity classification. Second, we contrast traditional feature engineering pipelines with those enhanced by language models, showcasing the superiority of language-based feature engineering in processing unstructured text. Third, our study illustrates how merging baseline features from accident reports with language-based features can improve the severity classification accuracy. This comprehensive approach not only advances the field of incident management but also highlights the cross-domain application potential of our methodology, particularly in contexts requiring the prediction of event outcomes from unstructured textual data or features translated into textual representation. Specifically, our novel methodology was applied to three distinct datasets originating from the United States, the United Kingdom, and Queensland, Australia. This cross-continental application underlines the robustness of our approach, suggesting its potential for widespread adoption in improving incident management processes globally.

LGApr 29, 2024
IncidentResponseGPT: Generating Traffic Incident Response Plans with Generative Artificial Intelligence

Artur Grigorev, Adriana-Simona Mihaita Khaled Saleh, Yuming Ou

The proposed IncidentResponseGPT framework - a novel system that applies generative artificial intelligence (AI) to potentially enhance the efficiency and effectiveness of traffic incident response. This model allows for synthesis of region-specific incident response guidelines and generates incident response plans adapted to specific area, aiming to expedite decision-making for traffic management authorities. This approach aims to accelerate incident resolution times by suggesting various recommendations (e.g. optimal rerouting strategies, estimating resource needs) to minimize the overall impact on the urban traffic network. The system suggests specific actions, including dynamic lane closures, optimized rerouting and dispatching appropriate emergency resources. We utilize the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) to rank generated response plans based on criteria like impact minimization and resource efficiency based on their proximity to an human-proposed solution.

SYJun 3, 2025
Automated Traffic Incident Response Plans using Generative Artificial Intelligence: Part 1 -- Building the Incident Response Benchmark

Artur Grigorev, Khaled Saleh, Jiwon Kim et al.

Traffic incidents remain a critical public safety concern worldwide, with Australia recording 1,300 road fatalities in 2024, which is the highest toll in 12 years. Similarly, the United States reports approximately 6 million crashes annually, raising significant challenges in terms of a fast reponse time and operational management. Traditional response protocols rely on human decision-making, which introduces potential inconsistencies and delays during critical moments when every minute impacts both safety outcomes and network performance. To address this issue, we propose a novel Incident Response Benchmark that uses generative artificial intelligence to automatically generate response plans for incoming traffic incidents. Our approach aims to significantly reduce incident resolution times by suggesting context-appropriate actions such as variable message sign deployment, lane closures, and emergency resource allocation adapted to specific incident characteristics. First, the proposed methodology uses real-world incident reports from the Performance Measurement System (PeMS) as training and evaluation data. We extract historically implemented actions from these reports and compare them against AI-generated response plans that suggest specific actions, such as lane closures, variable message sign announcements, and/or dispatching appropriate emergency resources. Second, model evaluations reveal that advanced generative AI models like GPT-4o and Grok 2 achieve superior alignment with expert solutions, demonstrated by minimized Hamming distances (averaging 2.96-2.98) and low weighted differences (approximately 0.27-0.28). Conversely, while Gemini 1.5 Pro records the lowest count of missed actions, its extremely high number of unnecessary actions (1547 compared to 225 for GPT-4o) indicates an over-triggering strategy that reduces the overall plan efficiency.

SYJun 3, 2025
Rapid Urban Visibility Hotspots: Quantifying Building Vertex Visibility from Connected Vehicle Trajectories using Spatial Indexing

Artur Grigorev, Adriana-Simona Mihaita

Effective placement of Out-of-Home advertising and street furniture requires accurate identification of locations offering maximum visual exposure to target audiences, particularly vehicular traffic. Traditional site selection methods often rely on static traffic counts or subjective assessments. This research introduces a data-driven methodology to objectively quantify location visibility by analyzing large-scale connected vehicle trajectory data (sourced from Compass IoT) within urban environments. We model the dynamic driver field-of-view using a forward-projected visibility area for each vehicle position derived from interpolated trajectories. By integrating this with building vertex locations extracted from OpenStreetMap, we quantify the cumulative visual exposure, or ``visibility count'', for thousands of potential points of interest near roadways. The analysis reveals that visibility is highly concentrated, identifying specific ``visual hotspots'' that receive disproportionately high exposure compared to average locations. The core technical contribution involves the construction of a BallTree spatial index over building vertices. This enables highly efficient (O(logN) complexity) radius queries to determine which vertices fall within the viewing circles of millions of trajectory points across numerous trips, significantly outperforming brute-force geometric checks. Analysis reveals two key findings: 1) Visibility is highly concentrated, identifying distinct 'visual hotspots' receiving disproportionately high exposure compared to average locations. 2) The aggregated visibility counts across vertices conform to a Log-Normal distribution.

CVApr 16, 2021
StylePeople: A Generative Model of Fullbody Human Avatars

Artur Grigorev, Karim Iskakov, Anastasia Ianina et al.

We propose a new type of full-body human avatars, which combines parametric mesh-based body model with a neural texture. We show that with the help of neural textures, such avatars can successfully model clothing and hair, which usually poses a problem for mesh-based approaches. We also show how these avatars can be created from multiple frames of a video using backpropagation. We then propose a generative model for such avatars that can be trained from datasets of images and videos of people. The generative model allows us to sample random avatars as well as to create dressed avatars of people from one or few images. The code for the project is available at saic-violet.github.io/style-people.

CVApr 16, 2021
Point-Based Modeling of Human Clothing

Ilya Zakharkin, Kirill Mazur, Artur Grigorev et al.

We propose a new approach to human clothing modeling based on point clouds. Within this approach, we learn a deep model that can predict point clouds of various outfits, for various human poses, and for various human body shapes. Notably, outfits of various types and topologies can be handled by the same model. Using the learned model, we can infer the geometry of new outfits from as little as a single image, and perform outfit retargeting to new bodies in new poses. We complement our geometric model with appearance modeling that uses the point cloud geometry as a geometric scaffolding and employs neural point-based graphics to capture outfit appearance from videos and to re-render the captured outfits. We validate both geometric modeling and appearance modeling aspects of the proposed approach against recently proposed methods and establish the viability of point-based clothing modeling.

CVApr 24, 2020
Neural Head Reenactment with Latent Pose Descriptors

Egor Burkov, Igor Pasechnik, Artur Grigorev et al.

We propose a neural head reenactment system, which is driven by a latent pose representation and is capable of predicting the foreground segmentation alongside the RGB image. The latent pose representation is learned as a part of the entire reenactment system, and the learning process is based solely on image reconstruction losses. We show that despite its simplicity, with a large and diverse enough training dataset, such learning successfully decomposes pose from identity. The resulting system can then reproduce mimics of the driving person and, furthermore, can perform cross-person reenactment. Additionally, we show that the learned descriptors are useful for other pose-related tasks, such as keypoint prediction and pose-based retrieval.

CVNov 28, 2018
Coordinate-based Texture Inpainting for Pose-Guided Image Generation

Artur Grigorev, Artem Sevastopolsky, Alexander Vakhitov et al.

We present a new deep learning approach to pose-guided resynthesis of human photographs. At the heart of the new approach is the estimation of the complete body surface texture based on a single photograph. Since the input photograph always observes only a part of the surface, we suggest a new inpainting method that completes the texture of the human body. Rather than working directly with colors of texture elements, the inpainting network estimates an appropriate source location in the input image for each element of the body surface. This correspondence field between the input image and the texture is then further warped into the target image coordinate frame based on the desired pose, effectively establishing the correspondence between the source and the target view even when the pose change is drastic. The final convolutional network then uses the established correspondence and all other available information to synthesize the output image. A fully-convolutional architecture with deformable skip connections guided by the estimated correspondence field is used. We show state-of-the-art result for pose-guided image synthesis. Additionally, we demonstrate the performance of our system for garment transfer and pose-guided face resynthesis.