8.0 · CV · May 24
Fine-Tuning Vision-Language Models for Understanding Current Damage and Scoring Priority with Quality Guard Agent
Takato Yasuno
Bridge inspection in Japan requires mandatory visual assessments every five years, yet qualitative damage ratings (levels a-e) assigned by different engineers exhibit significant inter-rater variability -- a critical barrier to consistent infrastructure management. The aging of skilled engineers further threatens inspection capacity. This paper presents a methodology for automating bridge damage understanding and repair priority scoring using fine-tuned Vision-Language Models (VLMs). We fine-tune LLaVA-1.5-7B with QLoRA on up to 4,000 paired bridge damage images and inspection text records, then evaluate on a fixed test set of 800 images. The model outputs natural language descriptions identifying structural members and damage patterns, from which a rule-based scoring engine calculates a five-level repair priority index. A progressive training study (1k/2k/3k/4k samples) reveals that 2k training samples achieve near-optimal validation loss in only 2.9 hours of training; beyond 2k, validation loss improves by no more than 0.2% per doubling of training samples, exhibiting clear diminishing returns. Furthermore, semantic similarity on the held-out test set peaks at 3k (0.6909) and degrades at 4k (0.6739), indicating that quality-curated mid-scale data outperforms larger but noisier corpora. Inference optimization combining torch.compile() and batch processing (batch_size=8) achieves 10.06 seconds per image -- a 70.2% reduction over the unoptimized baseline. Our approach contributes to data governance in bridge inspection, reduces inter-rater variability, and provides AI-assisted triage to augment expert engineers in inspection workflows. Furthermore, we introduce a two-stage Quality Guard using a fine-tuned Swallow-8B SLM to reject low-quality VLM outputs before priority scoring, preventing spurious scores from damaged or unrecognised images.
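The inference figures quoted above can be cross-checked with a short calculation. This is only a sketch: the per-image time and the 70.2% reduction come from the abstract, while the baseline time and the total test-set time are derived here and are not reported in the source.

```python
# Figures quoted in the abstract.
optimized_s_per_image = 10.06   # with torch.compile() and batch_size=8
reduction = 0.702               # 70.2% reduction over the unoptimized baseline
test_images = 800               # fixed test set size

# Derived values (not stated in the abstract; implied by the numbers above).
implied_baseline_s = optimized_s_per_image / (1.0 - reduction)
test_set_hours = optimized_s_per_image * test_images / 3600.0

print(f"implied baseline: {implied_baseline_s:.1f} s/image")
print(f"optimized pass over the test set: {test_set_hours:.2f} h")
```

Under these numbers the unoptimized baseline would be roughly 34 seconds per image, assuming the reduction is measured on the same per-image basis.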
IR · Mar 3 · Code
Suppressing Domain-Specific Hallucination in Construction LLMs: A Knowledge Graph Foundation for GraphRAG and QLoRA on River and Sediment Control Technical Standards
Takato Yasuno
This paper addresses the challenge of answering technical questions derived from Japan's River and Sediment Control Technical Standards -- a multi-volume regulatory document covering survey, planning, design, and maintenance of river levees, dams, and sabo structures -- using open-source large language models running entirely on local hardware. We implement and evaluate three complementary approaches: Case A (plain 20B LLM baseline), Case B (8B LLM with QLoRA domain fine-tuning on 715 graph-derived QA pairs), and Case C (20B LLM augmented with a Neo4j knowledge graph via GraphRAG). All three cases use the Swallow series of Japanese-adapted LLMs and are evaluated on a 100-question benchmark spanning 8 technical categories, judged automatically by an independent LLM (Qwen2.5-14B, score 0--3). The key finding is a performance inversion: the 8B QLoRA fine-tuned model (Case B) achieves a judge average of 2.92/3 -- surpassing both the 20B plain baseline (Case A: 2.29/3, $+$0.63) and the 20B GraphRAG approach (Case C: 2.62/3, $+$0.30) -- while running at 3$\times$ faster latency (14.2s vs. 42.2s for Case A). GraphRAG provides moderate gains ($+$0.33 over baseline) but is outperformed by domain-specific fine-tuning in both quality and efficiency. We document the full engineering pipeline, including knowledge graph construction (200 nodes, 268 relations), QLoRA training data generation from Neo4j relations, training on a single GPU (16 GB VRAM) using unsloth, GGUF Q4_K_M quantisation and Ollama deployment, and the graph retrieval and re-ranking design. High-level engineering lessons are distilled in the main body; implementation pitfalls and toolchain details are documented in Supplementary Materials.
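The score gaps and the latency ratio reported in this abstract can be reproduced from the quoted averages. The sketch below uses only numbers from the abstract; the variable and function names are illustrative.

```python
# Judge averages on the 0-3 scale and latencies, as quoted in the abstract.
scores = {"A": 2.29, "B": 2.92, "C": 2.62}
latency_s = {"A": 42.2, "B": 14.2}

def gap(x, y):
    """Difference in judge average between case x and case y, to 2 decimals."""
    return round(scores[x] - scores[y], 2)

b_over_a = gap("B", "A")   # QLoRA 8B vs. plain 20B baseline
b_over_c = gap("B", "C")   # QLoRA 8B vs. 20B GraphRAG
c_over_a = gap("C", "A")   # GraphRAG gain over baseline
speedup = latency_s["A"] / latency_s["B"]

print(b_over_a, b_over_c, c_over_a, round(speedup, 2))
```

The latency ratio comes out slightly below 3, which matches the abstract's rounded "3$\times$" figure.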
55.6 · AP · May 19
Understanding Deterioration Random Effects for Causal Discovery in Infrastructure Management
Takato Yasuno
Infrastructure deterioration poses significant challenges for asset management, yet existing approaches rely on population-averaged models that overlook equipment-specific heterogeneity. We present a novel framework that combines Bayesian hierarchical hazard modeling with causal discovery to identify operational patterns that drive heterogeneous deterioration rates in pump equipment. Our approach first estimates pump-specific random effects $u_i$ using GPU-accelerated No-U-Turn Sampling (NUTS), achieving 3--5$\times$ speedup over CPU implementations. We then employ DirectLiNGAM to discover causal relationships between 22 engineered time-series features and deterioration rates, stratified by positive ($u_i > 0$, faster deterioration) versus negative ($u_i \leq 0$, slower deterioration) random effects. Analyzing 112 pumps with 92,861 observations over 650 days, we uncover striking heterogeneity: the negative group exhibits causal effects 400$\times$ larger than the positive group, with standard deviation (std) showing a strong positive causal effect ($+1.515$) on deterioration rates in low-risk equipment. We validate linearity assumptions through NonlinearLiNGAM comparison and demonstrate practical scalability through GPU acceleration. Our findings enable targeted maintenance strategies by revealing that different operational regimes require fundamentally distinct management approaches, advancing predictive maintenance from population-averaged to heterogeneity-aware decision making.
3.4 · LG · Apr 27
Heterogeneous Variational Inference for Markov Degradation Hazard Models: Discretized Mixture with Interpretable Clusters
Takato Yasuno
Bayesian finite mixture models can identify discrete risk clusters (low-risk vs. high-risk equipment), but face three critical bottlenecks: (1) insufficient degradation signals from coarse state discretization, (2) unstable cluster identification when data inherently supports fewer clusters than explored, and (3) computational infeasibility of Markov Chain Monte Carlo (MCMC) methods for production deployment (7+ hours per model). We propose a practical framework combining (1) 8-state global percentile discretization that amplifies degradation events, (2) 30-dimensional feature engineering integrating statistical trends (22 features), continuous health indicators, and text embeddings (PCA-compressed to 3 dimensions), (3) interpretable model selection rules enforcing minimum cluster share and separation alongside WAIC, and (4) Automatic Differentiation Variational Inference (ADVI) with full-rank covariance for stable, fast estimation. Applied to 280 industrial pumps with 104,703 inspection records, we demonstrate that: (1) for random effect models (baseline), ADVI and NUTS produce nearly identical estimates with a 15$\times$ speedup, validating ADVI accuracy; (2) finite mixture models identify the optimal number of clusters under interpretability constraints; and (3) NUTS exhibits severe convergence issues and label switching, while ADVI provides stable results in 84$\times$ less time. Our contributions are: (1) the first demonstration that fine-grained state discretization (8-state) is essential for mixture model stability in survival analysis; (2) a comprehensive feature engineering strategy combining statistical, continuous, and semantic signals; (3) practical interpretability rules preventing overfitting in automated model selection; and (4) empirical evidence that ADVI outperforms NUTS for finite mixture models in terms of convergence, stability, and computational efficiency.
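As a rough illustration of the "8-state global percentile discretization" step, the sketch below pools all readings, takes cut points at equally spaced percentiles, and maps each reading to a state index. The abstract does not say which percentiles are used, so the equal spacing, the function names, and the toy data are assumptions.

```python
from bisect import bisect_right
from statistics import quantiles

def fit_global_edges(values, n_states=8):
    """Return n_states - 1 cut points at equally spaced percentiles.

    Assumption: edges are fitted once on readings pooled across all
    equipment ("global"), rather than per unit.
    """
    return quantiles(values, n=n_states, method="inclusive")

def to_state(x, edges):
    """Map a reading to a state index in 0..len(edges)."""
    return bisect_right(edges, x)

# Toy data standing in for pooled health-indicator readings.
pooled = list(range(1, 101))
edges = fit_global_edges(pooled)
states = [to_state(v, edges) for v in pooled]
print(edges)
print(states[0], states[-1])
```

With eight states, each bin holds about 12.5% of the pooled readings, so a drift in the indicator crosses state boundaries more often than under a coarser grid.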
CV · Mar 3, 2023
One-class Damage Detector Using Deeper Fully-Convolutional Data Descriptions for Civil Application
Takato Yasuno, Masahiro Okano, Junichiro Fujii
Infrastructure managers must maintain high standards to ensure user satisfaction during the lifecycle of infrastructures. Surveillance cameras and visual inspections have enabled progress in automating the detection of anomalous features and assessing the occurrence of deterioration. However, collecting damage data is typically time consuming and requires repeated inspections. The one-class damage detection approach has an advantage in that normal images can be used to optimize model parameters. Additionally, visual evaluation of heatmaps enables us to understand localized anomalous features. The authors highlight damage vision applications utilized in the robust property and localized damage explainability. First, we propose a civil-purpose application for automating one-class damage detection reproducing a fully convolutional data description (FCDD) as a baseline model. We have obtained accurate and explainable results demonstrating experimental studies on concrete damage and steel corrosion in civil engineering. Additionally, to develop a more robust application, we applied our method to another outdoor domain that contains complex and noisy backgrounds using natural disaster datasets collected using various devices. Furthermore, we propose a valuable solution of deeper FCDDs focusing on other powerful backbones to improve the performance of damage detection and implement ablation studies on disaster datasets. The key results indicate that the deeper FCDDs outperformed the baseline FCDD on datasets representing natural disaster damage caused by hurricanes, typhoons, earthquakes, and four-event disasters.
CV · Mar 2, 2022
VAE-iForest: Auto-encoding Reconstruction and Isolation-based Anomalies Detecting Fallen Objects on Road Surface
Takato Yasuno, Junichiro Fujii, Riku Ogata et al.
In road monitoring, detecting changes in the road surface at an early stage is important for preventing damage to third parties. Fallen objects may include trees brought down by the external force of a flood or an earthquake, and rocks that have fallen from a slope. Generative deep learning can flexibly detect such fallen objects as anomalies on the road surface. We prototype a method that combines auto-encoding reconstruction with an isolation-based anomaly detector for road surface monitoring. We apply the method to a set of test images in which fallen stones and plywood are added to the raw inputs, and to images of snow-covered winter roads. Finally, we discuss future work toward practical application.
1.0 · LG · Apr 20
Bridge-Centered Metapath Classification Using R-GCN-VGAE for Disaster-Resilient Maintenance Decisions
Takato Yasuno
Daily infrastructure management in preparation for disasters is critical for urban resilience. When bridges remain resilient against disaster-induced external forces, access to hospitals, shops, and residences via metapaths can be sustained, maintaining essential urban functions. However, prioritizing bridge maintenance under limited budgets requires quantifying the multi-dimensional roles that bridges play in disaster scenarios -- a challenge that existing single-indicator approaches fail to address. We focus on metapaths from national highways through bridges to buildings (hospitals, shops, residences), constructing a heterogeneous graph with road, bridge, and building layers. A Relation-centric Graph Convolutional Network Variational Autoencoder (R-GCN-VGAE) learns metapath-based feature representations, enabling classification of bridges into disaster-preparedness categories: Supply Chain (commercial logistics), Medical Access (emergency healthcare), and Residential Protection (preventing isolation). Using OSMnx and open data, we validate our methodology on three diverse cities in Ibaraki Prefecture, Japan: Mito (697 bridges), Chikusei (258 bridges), and Moriya (148 bridges), totaling 1,103 bridges. The heterogeneous graph construction from open data enables redefining bridge roles for disaster scenarios, supporting maintenance budget decision-making. Our contributions are: (1) an open-data methodology for constructing urban heterogeneous graphs; (2) a redefinition of bridge roles for disaster scenarios via metapath-based classification; (3) a maintenance budget decision support methodology; (4) a k-NN tuning strategy validated across diverse city scales; and (5) an empirical demonstration of UMAP superiority over t-SNE/PCA for multi-role bridge visualization.
CV · Jan 15, 2023
MN-Pair Contrastive Damage Representation and Clustering for Prognostic Explanation
Takato Yasuno, Masahiro Okano, Junichiro Fujii
For infrastructure inspections, damage representation does not constantly match the predefined classes of damage grade, resulting in detailed clusters of unseen damages or more complex clusters from overlapped space between two grades. The damage representation has fundamentally complex features; consequently, not all the damage classes can be perfectly predefined. The proposed MN-pair contrastive learning method helps to explore an embedding damage representation beyond the predefined classes by including more detailed clusters. It maximizes both the similarity of M-1 positive images close to an anchor and dissimilarity of N-1 negative images using both weighting loss functions. It learns faster than the N-pair algorithm using one positive image. We proposed a pipeline to obtain the damage representation and used a density-based clustering on a 2-D reduction space to automate finer cluster discrimination. We also visualized the explanation of the damage feature using Grad-CAM for MN-pair damage metric learning. We demonstrated our method in three experimental studies: steel product defect, concrete crack, and the effectiveness of our method and discuss future works.
CV · Jul 13, 2022
River Surface Patch-wise Detector Using Mixture Augmentation for Scum-cover-index
Takato Yasuno, Junichiro Fujii, Masazumi Amakata
Urban rivers provide a water environment that influences residential living. River surface monitoring has become crucial for making decisions about where to prioritize cleaning and when to automatically start the cleaning treatment. We focus on the organic mud, or "scum", that accumulates on the river's surface and contributes to the river's odor and has external economic effects on the landscape. Because of its feature of a sparsely distributed and unstable pattern of organic shape, automating the monitoring process has proved difficult. We propose a patch-wise classification pipeline to detect scum features on the river surface using mixture image augmentation to increase the diversity between the scum floating on the river and the entangled background on the river surface reflected by nearby structures like buildings, bridges, poles, and barriers. Furthermore, we propose a scum-index cover on rivers to help monitor worse grade online, collect floating scum, and decide on chemical treatment policies. Finally, we demonstrate the application of our method on a time series dataset with frames every ten minutes recording river scum events over several days. We discuss the significance of our pipeline and its experimental findings.
47.8 · LG · Mar 12
Adapting Methods for Domain-Specific Japanese Small LMs: Scale, Architecture, and Quantization
Takato Yasuno
This paper presents a systematic methodology for building domain-specific Japanese small language models using QLoRA fine-tuning. We address three core questions: optimal training scale, base-model selection, and architecture-aware quantization. Stage 1 (Training scale): Scale-learning experiments (1k--5k samples) identify n=4,000 as optimal, where test-set NLL reaches minimum (1.127) before overfitting at 5k samples. Stage 2 (Compare finetuned SLMs): Comparing four Japanese LLMs shows that Llama-3 models with Japanese continual pre-training (Swallow-8B, ELYZA-JP-8B) outperform multilingual models (Qwen2.5-7B). Stage 3 (Quantization): Llama-3 architectures improve under Q4_K_M quantization, while GQA architectures degrade severely (Qwen2.5: -0.280 points). Production recommendation: Swallow-8B Q4_K_M achieves 2.830/3 score, 8.9 s/question, 4.9 GB size. The methodology generalizes to low-resource technical domains and provides actionable guidance for compact Japanese specialist LMs on consumer hardware.
41.1 · CV · Mar 24
Quantized Vision-Language Models for Damage Assessment: A Comparative Study of LLaVA-1.5-7B Quantization Levels
Takato Yasuno
Bridge infrastructure inspection is a critical but labor-intensive task requiring expert assessment of structural damage such as rebar exposure, cracking, and corrosion. This paper presents a comprehensive study of quantized Vision-Language Models (VLMs) for automated bridge damage assessment, focusing on the trade-offs between description quality, inference speed, and resource requirements. We develop an end-to-end pipeline combining LLaVA-1.5-7B for visual damage analysis, structured JSON extraction, and rule-based priority scoring. To enable deployment on consumer-grade GPUs, we conduct a systematic comparison of three quantization levels: Q4_K_M, Q5_K_M, and Q8_0 across 254 rebar exposure images. We introduce a 5-point quality evaluation framework assessing damage type recognition and severity classification. Our results demonstrate that Q5_K_M achieves the optimal balance: quality score 3.18$\pm$1.35/5.0, inference time 5.67s/image, and 0.56 quality/sec efficiency -- 8.5% higher quality than Q4_K_M with only 4.5% speed reduction, while matching Q8_0's quality with 25% faster inference. Statistical analysis reveals Q5_K_M exhibits the weakest text-quality correlation (-0.148), indicating consistent performance regardless of description length.
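The "quality/sec" efficiency figure appears to be the mean quality score divided by the mean inference time. A minimal check, assuming that definition (the abstract does not spell it out):

```python
def quality_per_second(quality_score, seconds_per_image):
    """Efficiency as quality points per second of inference (assumed definition)."""
    return quality_score / seconds_per_image

# Q5_K_M figures quoted in the abstract.
q5_efficiency = quality_per_second(3.18, 5.67)
print(round(q5_efficiency, 2))
```

Rounded to two decimals this reproduces the 0.56 reported for Q5_K_M.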
LG · Feb 16
Hybrid Feature Learning with Time Series Embeddings for Equipment Anomaly Prediction
Takato Yasuno
In predictive maintenance of equipment, deep learning-based time series anomaly detection has garnered significant attention; however, pure deep learning approaches often fail to achieve sufficient accuracy on real-world data. This study proposes a hybrid approach that integrates 64-dimensional time series embeddings from Granite TinyTimeMixer with 28-dimensional statistical features based on domain knowledge for HVAC equipment anomaly prediction tasks. Specifically, we combine time series embeddings extracted from a Granite TinyTimeMixer encoder fine-tuned with LoRA (Low-Rank Adaptation) and 28 types of statistical features including trend, volatility, and drawdown indicators, which are then learned using a LightGBM gradient boosting classifier. In experiments using 64 equipment units and 51,564 samples, we achieved Precision of 91--95\% and ROC-AUC of 0.995 for anomaly prediction at 30-day, 60-day, and 90-day horizons. Furthermore, we achieved production-ready performance with a false positive rate of 1.1\% or less and a detection rate of 88--94\%, demonstrating the effectiveness of the system for predictive maintenance applications. This work demonstrates that practical anomaly detection systems can be realized by leveraging the complementary strengths between deep learning's representation learning capabilities and statistical feature engineering.
CV · Jul 29, 2024
Cell Culture Assistive Application for Precipitation Image Diagnosis
Takato Yasuno
In regenerative medicine research, we experimentally design the composition of the chemical medium. We add different components to 384-well plates and culture the biological cells. We monitor the condition of the cells and take time-lapse bioimages for morphological assay. In particular, precipitation can appear as artefacts in the image and introduce noise into the imaging assay. Inspecting precipitates is a tedious task for the observer, and differences in experience can lead to variations in judgement from person to person. A machine learning approach removes the burden of human inspection and provides consistent inspection. In addition, precipitation features are as small as 10-20 μm. Resizing a 1200-pixel-square well image with a resolution of 2.82 μm/pixel shrinks the precipitation features. Dividing the well images into 240-pixel squares and learning without resizing preserves the resolution of the original image. In this study, we developed an application to automatically detect precipitation on 384-well plates utilising optical microscope images. We apply MN-pair contrastive clustering to extract precipitation classes from approximately 20,000 patch images. To detect precipitation features, we compare deeper FCDD detectors with optional backbones and build a machine learning pipeline to detect precipitation from the maximum score of quadruplet well images using the Isolation Forest algorithm, where the anomaly score ranges from zero to one. Furthermore, using this application we can visualise an in-situ precipitation heatmap on a 384-well plate.
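The patching step described above, splitting a 1200-pixel-square well image into 240-pixel squares without resizing, can be sketched as a grid of crop boxes. The function name and the (left, top, right, bottom) box convention are illustrative choices, not taken from the paper, and non-overlapping tiles are assumed.

```python
def patch_boxes(width=1200, height=1200, patch=240):
    """List non-overlapping (left, top, right, bottom) crop boxes.

    Assumes the patch size divides the image size evenly, as it does
    for a 1200-pixel-square image cut into 240-pixel squares.
    """
    boxes = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            boxes.append((left, top, left + patch, top + patch))
    return boxes

boxes = patch_boxes()
print(len(boxes), boxes[0], boxes[-1])
```

Each well image therefore yields a 5 by 5 grid of patches at the original resolution, so small precipitation features keep their pixel footprint.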
CV · Jul 24, 2023
Few-shot $\mathbf{1/a}$ Anomalies Feedback: Damage Vision Mining Opportunity and Embedding Feature Imbalance
Takato Yasuno
Over the past decade, previous balanced datasets have been used to advance deep learning algorithms for industrial applications. In urban infrastructures and living environments, damage data mining cannot avoid imbalanced data issues because of rare unseen events and the high-quality status of improved operations. For visual inspection, the deteriorated class acquired from the surface of concrete and steel components are occasionally imbalanced. From numerous related surveys, we conclude that imbalanced data problems can be categorised into four types: 1) missing range of target and label valuables, 2) majority-minority class imbalance, 3) foreground background of spatial imbalance, and 4) long-tailed class of pixel-wise imbalance. Since 2015, many imbalanced studies have been conducted using deep-learning approaches, including regression, image classification, object detection, and semantic segmentation. However, anomaly detection for imbalanced data is not well known. In this study, we highlight a one-class anomaly detection application, whether anomalous class or not, and demonstrate clear examples of imbalanced vision datasets: medical disease, hazardous behaviour, material deterioration, plant disease, river sludge, and disaster damage. We provide key results on the advantage of damage-vision mining, hypothesising that the more effective the range of the positive ratio, the higher the accuracy gain of the anomalies feedback. In our imbalanced studies, compared with the balanced case with a positive ratio of $1/1$, we find that there is an applicable positive ratio $1/a$ where the accuracy is consistently high. However, the extremely imbalanced range is from one shot to $1/2a$, the accuracy of which is inferior to that of the applicable ratio. In contrast, with a positive ratio ranging over $2/a$, it shifts in the over-mining phase without an effective gain in accuracy.
CV · Jun 5, 2023
Disaster Anomaly Detector via Deeper FCDDs for Explainable Initial Responses
Takato Yasuno, Masahiro Okano, Junichiro Fujii
Extreme natural disasters can have devastating effects on both urban and rural areas. In any disaster event, an initial response is the key to rescue within 72 hours and prompt recovery. During the initial stage of disaster response, it is important to quickly assess the damage over a wide area and identify priority areas. Among machine learning algorithms, deep anomaly detection is effective in detecting devastation features that are different from everyday features. In addition, explainable computer vision applications should justify the initial responses. In this paper, we propose an anomaly detection application utilizing deeper fully convolutional data descriptions (FCDDs), that enables the localization of devastation features and visualization of damage-marked heatmaps. More specifically, we show numerous training and test results for a dataset AIDER with the four disaster categories: collapsed buildings, traffic incidents, fires, and flooded areas. We also implement ablation studies of anomalous class imbalance and the data scale competing against the normal class. Our experiments provide results of high accuracies over 95% for F1. Furthermore, we found that the deeper FCDD with a VGG16 backbone consistently outperformed other baselines CNN27, ResNet101, and Inceptionv3. This study presents a new solution that offers a disaster anomaly detection application for initial responses with higher accuracy and devastation explainability, providing a novel contribution to the prompt disaster recovery problem in the research area of anomaly scene understanding. Finally, we discuss future works to improve more robust, explainable applications for effective initial responses.
7.3 · LG · Apr 9
Heterogeneous Graph Importance Scoring and Clustering with Automated LLM-based Interpretation
Takato Yasuno
Urban bridge networks are critical infrastructure whose disruption can cascade into severe impacts on transportation, emergency services, and economic activity. This paper presents a comprehensive methodology for assessing bridge importance through heterogeneous graph analysis, unsupervised clustering, and automated interpretation via large language models (LLMs). Our approach addresses three fundamental challenges: (1) quantifying multi-dimensional bridge importance using only open data sources, (2) discovering functional bridge archetypes across different cities, and (3) generating policy-relevant interpretations automatically. We construct heterogeneous graphs from OpenStreetMap (OSM) data incorporating bridges, road networks, buildings, and public facilities. Five social impact indicators are computed: transit desert score, hospital access score, isolation risk score, supply chain impact score, and green space access score. These 52-dimensional feature vectors undergo dimensionality reduction via UMAP and density-based clustering via HDBSCAN. Discovered clusters are interpreted using temperature-optimized LLMs (Elyza8b, trained on construction domain corpus). Our contributions are: (1) a complete open-data pipeline from OSM to actionable bridge importance rankings, (2) a five-indicator scoring methodology with 40$\times$ computational optimization, (3) a UMAP+HDBSCAN clustering framework validated on multi-city data, (4) an LLM interpretation methodology including temperature optimization and model selection rationale, and (5) transferability demonstration across cities via configuration-only adaptation.
LG · Feb 22
FedAvg-Based CTMC Hazard Model for Federated Bridge Deterioration Assessment
Takato Yasuno
Bridge periodic inspection records contain sensitive information about public infrastructure, making cross-organizational data sharing impractical under existing data governance constraints. We propose a federated framework for estimating a Continuous-Time Markov Chain (CTMC) hazard model of bridge deterioration, enabling municipalities to collaboratively train a shared benchmark model without transferring raw inspection records. Each User holds local inspection data and trains a log-linear hazard model over three deterioration-direction transitions -- Good$\to$Minor, Good$\to$Severe, and Minor$\to$Severe -- with covariates for bridge age, coastline distance, and deck area. Local optimization is performed via mini-batch stochastic gradient descent on the CTMC log-likelihood, and only a 12-dimensional pseudo-gradient vector is uploaded to a central server per communication round. The server aggregates User updates using sample-weighted Federated Averaging (FedAvg) with momentum and gradient clipping. All experiments in this paper are conducted on fully synthetic data generated from a known ground-truth parameter set with region-specific heterogeneity, enabling controlled evaluation of federated convergence behaviour. Simulation results across heterogeneous Users show consistent convergence of the average negative log-likelihood, with the aggregated gradient norm decreasing as User scale increases. Furthermore, the federated update mechanism provides a natural participation incentive: Users who register their local inspection datasets on a shared technical-standard platform receive in return the periodically updated global benchmark parameters -- information that cannot be obtained from local data alone -- thereby enabling evidence-based life-cycle planning without surrendering data sovereignty.
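The server-side step described above (sample-weighted FedAvg with momentum and gradient clipping over 12-dimensional pseudo-gradients) might look roughly like the sketch below. It is not the paper's implementation: clipping each User's update before aggregation, the heavy-ball momentum form, and all hyperparameter values are assumptions, and the function names are hypothetical.

```python
import math

def clip_by_norm(vec, max_norm):
    """Scale vec down so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    scale = max_norm / norm
    return [v * scale for v in vec]

def fedavg_round(global_params, updates, sample_counts, velocity,
                 lr=1.0, momentum=0.9, max_norm=1.0):
    """One communication round on the server.

    updates: one pseudo-gradient per User (same length as global_params).
    sample_counts: local sample sizes used as aggregation weights.
    Assumption: clipping is applied per User before weighting.
    """
    total = sum(sample_counts)
    dim = len(global_params)
    aggregated = [0.0] * dim
    for update, n in zip(updates, sample_counts):
        clipped = clip_by_norm(update, max_norm)
        for k in range(dim):
            aggregated[k] += (n / total) * clipped[k]
    # Heavy-ball momentum on the aggregated pseudo-gradient.
    new_velocity = [momentum * v + g for v, g in zip(velocity, aggregated)]
    new_params = [p - lr * v for p, v in zip(global_params, new_velocity)]
    return new_params, new_velocity

# Two hypothetical Users with 1 and 3 samples; clipping effectively disabled.
params0 = [0.0] * 12
params1, velocity1 = fedavg_round(
    params0, [[1.0] * 12, [3.0] * 12], [1, 3], [0.0] * 12, max_norm=100.0)
print(params1[0], velocity1[0])
```

Only the pseudo-gradient vectors and sample counts reach the server in this sketch, which is the property that keeps raw inspection records local.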
CV · May 9, 2023
Wooden Sleeper Deterioration Detection for Rural Railway Prognostics Using Unsupervised Deeper FCDDs
Takato Yasuno, Masahiro Okano, Junichiro Fujii
Maintaining high standards for user safety during daily railway operations is crucial for railway managers. To aid in this endeavor, top- or side-view cameras and GPS positioning systems have facilitated progress toward automating periodic inspections of defective features and assessing the deteriorating status of railway components. However, collecting data on deteriorated status can be time-consuming and requires repeated data acquisition because of the extreme temporal occurrence imbalance. In supervised learning, thousands of paired data sets containing defective raw images and annotated labels are required. However, the one-class classification approach offers the advantage of requiring fewer images to optimize parameters for training normal and anomalous features. The deeper fully-convolutional data descriptions (FCDDs) have been applied to several damage data sets of concrete/steel components in structures, and to fallen trees and wooden building collapse in disasters. However, their feasibility for railway components is not yet known. In this study, we devised a prognostic discriminator pipeline to automate one-class damage classification using the deeper FCDDs for defective railway components. We also performed ablation studies of the deeper backbone based on convolutional neural networks (CNNs). Furthermore, we visualized deterioration features by using transposed Gaussian upsampling. We demonstrated our application to railway inspection using a video acquisition dataset of railway track from a backward view in cloudy and sunny scenes. Finally, we examined the usability of our approach for prognostics and future work on railway inspection.
ML · Dec 6, 2021
Flood Inflow Forecast Using L2-norm Ensemble Weighting Sea Surface Feature
Takato Yasuno, Masazumi Amakata, Junichiro Fujii et al.
It is important to forecast dam inflow for flood damage mitigation. The hydrograph provides critical information such as the start time, peak level, and volume. Particularly, dam management requires a 6-h lead time of the dam inflow forecast based on a future hydrograph. The authors propose novel target inflow weights to create an ocean feature vector extracted from the analyzed images of the sea surface. We extracted 4,096 elements of the dimension vector in the fc6 layer of the pre-trained VGG16 network. Subsequently, we reduced it to three dimensions of t-SNE. Furthermore, we created the principal component of the sea temperature weights using PCA. We found that these weights contribute to the stability of predictor importance by numerical experiments. As base regression models, we calibrate the least squares with kernel expansion, the quantile random forest minimized out-of bag error, and the support vector regression with a polynomial kernel. When we compute the predictor importance, we visualize the stability of each variable importance introduced by our proposed weights, compared with other results without weights. We apply our method to a dam at Kanto region in Japan and focus on the trained term from 2007 to 2018, with a limited flood term from June to October. We test the accuracy over the 2019 flood term. Finally, we present the applied results and further statistical learning for unknown flood forecast.
CV · Jun 30, 2021
One-class Steel Detector Using Patch GAN Discriminator for Visualising Anomalous Feature Map
Takato Yasuno, Junichiro Fujii, Sakura Fukami
For steel product manufacturing in indoor factories, steel defect detection is important for quality control. For example, a steel sheet is extremely delicate, and must be accurately inspected. However, to maintain the painted steel parts of the infrastructure around a severe outdoor environment, corrosion detection is critical for predictive maintenance. In this paper, we propose a general-purpose application for steel anomaly detection that consists of the following four components. The first, a learner, is a unit image classification network to determine whether the region of interest or background has been recognised, after dividing the original large sized image into 256 square unit images. The second, an extractor, is a discriminator feature encoder based on a pre-trained steel generator with a patch generative adversarial network (GAN) discriminator. The third, an anomaly detector, is a one-class support vector machine (SVM) to predict the anomaly score using the discriminator feature. The fourth, an indicator, is an anomalous probability map used to visually explain the anomalous features. Furthermore, we demonstrated our method through the inspection of steel sheet defects with 13,774 unit images using high-speed cameras, and painted steel corrosion with 19,766 unit images based on an eye inspection of the photographs. Finally, we visualise anomalous feature maps of steel using a strip and painted steel inspection dataset.
CV Feb 28, 2021
Snowy Night-to-Day Translator and Semantic Segmentation Label Similarity for Snow Hazard Indicator
Takato Yasuno, Hiroaki Sugawara, Junichiro Fujii et al.
In 2021, Japan recorded more than three times as much snowfall as usual, so road users may encounter dangerous situations. The poor visibility caused by snow triggers traffic accidents. For example, on January 19, 2021, dry snow and strong winds of 27 m/s caused blizzards that severely impaired visibility. Because of the whiteout phenomenon, multiple accidents with 17 casualties occurred, and 134 vehicles were stuck for 10 hours over 1 km. At night, the temperature drops and the road surface tends to freeze. CCTV images of the road surface have the advantage of allowing the status of major points to be monitored at the same time. Road managers are required to make decisions on road closures and snow removal work based on road surface conditions, even at night. In parallel, they need to alert road users to hazardous road surfaces. This paper proposes a method to automate a snow hazard indicator, in which the road surface region is generated from a night snow image using the conditional GAN pix2pix. In addition, the road surface and the snow-covered ROI are predicted using DeepLabv3+ semantic segmentation with a MobileNet backbone, and the snow hazard indicator automatically computes how much of the night road surface is covered with snow. We demonstrate several results for a cold, snowy region of Japan from January 19 to 21, 2021, and discuss the usefulness of the high similarity between the snowy night-to-day fake output and real snowy day images for night snow visibility.
CV Jan 14, 2021
Road Surface Translation Under Snow-covered and Semantic Segmentation for Snow Hazard Index
Takato Yasuno, Junichiro Fujii, Hiroaki Sugawara et al.
In 2020, there was a record heavy snowfall owing to climate change. Indeed, 2,000 vehicles were stuck on the highway for three days. Because the road surface froze, 10 vehicles were involved in a chain-reaction collision. Road managers are required to provide indicators that alert drivers to snow cover at hazardous locations. This study proposes a deep learning application with live image post-processing to automatically calculate a snow hazard ratio indicator. First, the road surface hidden under snow is translated using a generative adversarial network, pix2pix. Second, snow-covered and road surface classes are detected by semantic segmentation using DeepLabv3+ with MobileNet as a backbone. Based on these trained networks, we automatically compute the road-to-snow rate hazard index, which indicates how much of the road surface is covered with snow. We demonstrate the results of applying the method to 1,155 live snow images from a cold region in Japan. We discuss the usefulness and practical robustness of our study.
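As a rough illustration of the final step, the sketch below computes a snow hazard ratio from a segmentation label mask. The class ids and the exact formula, snow pixels divided by snow plus road pixels, are assumptions made for this example; the abstract does not state how the index is defined.

```python
# Hypothetical class ids; the label encoding used in the paper is not given here.
BACKGROUND, ROAD, SNOW = 0, 1, 2


def snow_hazard_ratio(mask):
    """Return snow pixels / (snow + road pixels) for a 2D label mask.

    The formula is an assumed definition of the hazard index.
    Returns None when the mask contains neither road nor snow pixels.
    """
    snow = sum(row.count(SNOW) for row in mask)
    road = sum(row.count(ROAD) for row in mask)
    total = snow + road
    if total == 0:
        return None
    return snow / total


# Toy 3x4 mask standing in for a segmentation output.
mask = [
    [0, 0, 0, 0],
    [2, 2, 1, 1],
    [2, 2, 2, 1],
]
print(snow_hazard_ratio(mask))  # 5 snow and 3 road pixels -> 0.625
```

Excluding background pixels from the denominator keeps the ratio comparable across cameras whose field of view contains different amounts of sky or roadside.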
LG Sep 30, 2020
Rain-Code Fusion : Code-to-code ConvLSTM Forecasting Spatiotemporal Precipitation
Takato Yasuno, Akira Ishii, Masazumi Amakata
Recently, flood damage has become a social problem owing to unprecedented weather conditions arising from climate change. An immediate response to heavy rain is important for mitigating economic losses and for rapid recovery. Spatiotemporal precipitation forecasts may enhance the accuracy of dam inflow prediction more than 6 hours ahead for flood damage mitigation. However, the ordinary ConvLSTM has a limited predictable range beyond 3 timesteps in real-world precipitation forecasting, owing to the irreducible bias between the target prediction and the ground-truth value. This paper proposes a rain-code approach for spatiotemporal precipitation code-to-code forecasting. We propose a novel rainy feature that represents a temporal rainy process using multi-frame fusion to reduce the number of timesteps. We perform rain-code studies with various term ranges based on the standard ConvLSTM. We applied the method to a dam region using hourly precipitation data for the Japanese rainy term, May to October of every year from 2006 to 2019, approximately 127 thousand hours. We use hourly radar analysis data for the broader central region, with an area of 136 × 148 km². Finally, we provide sensitivity studies relating the rain-code size to hourly accuracy over several forecasting ranges.
IV Jun 27, 2020
Generative Damage Learning for Concrete Aging Detection using Auto-flight Images
Takato Yasuno, Akira Ishii, Junichiro Fujii et al.
To monitor the state of large-scale infrastructure, image acquisition by autonomous flight drones is an efficient way to obtain stable-angle, high-quality images. Supervised learning requires a large data set consisting of images and annotation labels. It takes a long time to accumulate images, including identifying the damaged regions of interest (ROIs). In recent years, unsupervised deep learning approaches to anomaly detection, such as generative adversarial networks (GANs), have progressed. When a damaged image is given as the generator input, the output tends to revert from the damaged state to a generated healthy-state image. Using the distance between the distributions of the real damaged image and the generated, reverse-aged healthy-state fake image, concrete damage can be detected automatically through unsupervised learning. This paper proposes an anomaly detection method using unpaired image-to-image translation that maps damaged images to reverse-aged fakes approximating healthy conditions. We apply our method to field studies and examine its usefulness for health monitoring of concrete damage.
CV May 7, 2020
Synthetic Image Augmentation for Damage Region Segmentation using Conditional GAN with Structure Edge
Takato Yasuno, Michihiro Nakajima, Tomoharu Sekiguchi et al.
Social infrastructure is aging, and its predictive maintenance has become an important issue. To monitor the state of infrastructure, bridge inspection is performed by human eye or by drone. For diagnosis, primary damage regions are recognized as repair targets. However, degradation at the worse levels rarely occurs, and the damage regions of interest are often narrow, so they occupy an extremely small share of the pixels in each image, 0.6 to 1.5 percent in our experience. Both the scarcity and the imbalance of damage regions of interest limit damage detection performance. If an additional data set of damaged images can be generated, it may improve the accuracy of damage region segmentation algorithms. We propose a synthetic augmentation procedure that generates damaged images using image-to-image translation, mapping from a tri-categorical label, which consists of both the semantic label and the structure edge, to the real damage image. We use the Sobel gradient operator to enhance the structure edge. For bridge inspection, we apply the procedure to reinforced concrete (RC) structures using 208 eye-inspection photos in which rebar exposure has occurred, from which 840 block images of size 224 by 224 were prepared. We applied popular per-pixel segmentation algorithms such as FCN-8s, SegNet, and DeepLabv3+Xception-v2. We demonstrate that re-training on a data set augmented with the synthetic procedure yields higher accuracy when predicting test images, as measured by mean IoU, damage region of interest IoU, precision, recall, and BF score.
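For readers unfamiliar with the Sobel gradient operator mentioned above, here is a minimal pure-Python sketch of the standard 3x3 Sobel gradient magnitude on a grayscale image. How the paper thresholds or combines the edge with the semantic label is not described in the abstract, so this shows only the generic operator; leaving border pixels at zero is a simplification chosen for this example.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (KX) and vertical (KY) gradients.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude of a 2D grayscale image given as a list of rows.

    Border pixels are left at 0.0 (a simplification for brevity).
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += KX[dy][dx] * pixel
                    gy += KY[dy][dx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


# A vertical step edge: dark on the left, bright on the right.
image = [[0, 0, 1, 1] for _ in range(4)]
edges = sobel_magnitude(image)
print(edges[1])  # [0.0, 4.0, 4.0, 0.0]
```

The response is concentrated on the two columns that straddle the step, which is the behaviour that makes the operator useful for highlighting structure edges.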
CV Apr 22, 2020
Per-pixel Classification Rebar Exposures in Bridge Eye-inspection
Takato Yasuno, Nakajima Michihiro, Noda Kazuhiro
Efficient inspection and accurate diagnosis are required for civil infrastructure 50 years after completion. Especially in municipalities, the shortage of technical staff and budget constraints on repair expenses have become a critical problem. If damage can be detected automatically at the pixel level in photos from inspection records, in addition to the 5-step judgment and countermeasure classification of eye-inspection, then countermeasure information can be provided more flexibly, such as whether repair is needed and how large the exposed damage of interest is. Damage in a photo is often sparse unless the photo is zoomed in on the damage; the region in which the detection target appears is at most only 1% of the image. Rebar exposure occurs frequently, and there are many opportunities to decide on repair measures. In this paper, we propose three damage detection methods based on transfer learning that enable semantic segmentation in images with low pixel counts, using damage photos from human eye-inspection. We also trained a deep convolutional network from scratch, with preprocessing that generates random crops with rotations. We show the results of applying these methods to 208 rebar exposure images from 106 real-world bridges. Finally, we discuss future tasks in damage detection modeling.
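The random-crop-with-rotation preprocessing can be sketched as follows. The crop size, the restriction to rotations in multiples of 90 degrees, and the function names are all assumptions made for this illustration; the abstract does not give the actual settings.

```python
import random


def rotate90(tile):
    """Rotate a 2D tile (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*tile[::-1])]


def random_crop_rotate(image, size, rng):
    """Take a random size x size crop and rotate it by a random multiple of 90 degrees.

    Multiples of 90 degrees are an assumed simplification; they avoid interpolation.
    """
    h, w = len(image), len(image[0])
    top = rng.randint(0, h - size)    # inclusive bounds keep the crop inside the image
    left = rng.randint(0, w - size)
    crop = [row[left:left + size] for row in image[top:top + size]]
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        crop = rotate90(crop)
    return crop


rng = random.Random(0)  # fixed seed so the example is repeatable
image = [[10 * r + c for c in range(6)] for r in range(5)]
crops = [random_crop_rotate(image, 3, rng) for _ in range(4)]
for crop in crops:
    print(crop)
```

For segmentation, the same crop offsets and rotation would need to be applied to the label mask so that image and label stay aligned; this sketch transforms a single array only.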
CV Apr 21, 2020
Natural Disaster Classification using Aerial Photography Explainable for Typhoon Damaged Feature
Takato Yasuno, Masazumi Amakata, Masahiro Okano
In recent years, typhoon damage has become a social problem owing to climate change. On 9 September 2019, Typhoon Faxai passed over Chiba, Japan, causing damage that included power outages due to strong winds recorded at a maximum of 45 meters per second. A large number of trees fell, and neighboring electric poles fell at the same time. Because of these disaster features, recovery took 18 days, longer than for past typhoons. Immediate responses are important for faster recovery. As far as possible, an aerial survey for global screening of the devastated region is needed as decision support for determining where to recover first. This paper proposes a practical method to visualize damaged areas, focused on typhoon disaster features, using aerial photography. The method classifies eight classes, comprising land covers without damage and areas with disaster damage. Using the target feature class probabilities, we visualize a disaster feature map scaled to a color range. Furthermore, we produce an explainable map for each unit grid image by computing the convolutional activation map using Grad-CAM. We demonstrate case studies applied to aerial photographs recorded in the Chiba region after the typhoon.
LG Apr 21, 2020
Generative Synthetic Augmentation using Label-to-Image Translation for Nuclei Image Segmentation
Takato Yasuno
In medical image diagnosis, pathology image analysis using semantic segmentation has become important for efficient screening in the field of digital pathology. Spatial augmentation is ordinarily used for semantic segmentation. Images of malignant tumors are rare, and annotating the labels of nuclei regions is very time-consuming. We need to use the dataset effectively to maximize segmentation accuracy. Augmentation that transforms images into generalized ones is expected to influence segmentation performance. We propose a synthetic augmentation using label-to-image translation, mapping from a semantic label with the edge structure to a real image. Specifically, this paper deals with stained slides of nuclei in tumors. We demonstrate several segmentation algorithms applied to the initial dataset of real images and labels, with synthetic augmentation used to add generalized images. We compute and report that the proposed synthetic augmentation procedure improves their accuracy.