97.3 CL May 27
ATLAS: All-round Testing of Long-context Abilities across Scales
Deli Huang, Cunguang Wang, Hongyin Tang et al.
Long-context language models now advertise context windows up to millions of tokens, yet evaluations typically report a single length or a narrow task family, masking two failure modes: performance can collapse as length grows, and strong retrieval need not transfer to downstream use. We present ATLAS, a benchmarking framework that redefines long-context evaluation as length-dependent capability profiling. ATLAS contributes three methodological principles: (i) a layered taxonomy separating foundational operations from application workloads so failures can be attributed, (ii) length-aware AUC scoring that integrates score-length curves over a fixed 8K-1M grid, replacing single-point metrics with full degradation profiles, and (iii) ATLAScore, a harmonic-mean aggregate over taxonomy categories that penalizes imbalanced profiles, with end-to-end uncertainty propagation from subset scores through the nonlinear final aggregate. We instantiate the framework across eight capability dimensions with nine auditable components and 6,438 instances, and evaluate 26 models. Gemini-3.1-Pro-Preview leads at 128K, while Claude-Opus-4.6 leads at 1M. Rankings reshuffle substantially between ATLASscore@8K-128K and ATLASscore@8K-1M: 7 models move by at least two ranks, and the two taxonomy layers share only 61% of cross-model variance, with individual rank gaps up to 12 positions. These results support reporting long-context quality by capability and length, not by a single headline score.
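The length-aware AUC and harmonic-mean aggregate described in the ATLAS abstract can be illustrated with a minimal sketch. The abstract does not specify the integration rule or axis scaling, so the log2 length axis, the trapezoidal rule, and the normalization below are assumptions, and the function names `length_auc` and `harmonic_mean` are hypothetical. The scores are made up for illustration and are not results from the paper.

```python
import math

def length_auc(scores_by_length):
    """Normalized area under a score-vs-length curve.

    Assumptions (not taken from the paper): lengths are placed on a
    log2 axis, the area uses the trapezoidal rule, and the result is
    divided by the axis span so a constant score s gives an AUC of s.
    """
    points = sorted(scores_by_length.items())
    xs = [math.log2(length) for length, _ in points]
    ys = [score for _, score in points]
    area = sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))
    return area / (xs[-1] - xs[0])

def harmonic_mean(values):
    """Harmonic mean of category scores; 0.0 if any score is not positive."""
    if any(v <= 0 for v in values):
        return 0.0
    return len(values) / sum(1.0 / v for v in values)

# Hypothetical per-category AUCs with the same arithmetic mean (0.6).
balanced = harmonic_mean([0.6, 0.6])      # stays at 0.6
imbalanced = harmonic_mean([0.9, 0.3])    # drops to 0.45
```

Because the harmonic mean is dominated by the weakest category, two profiles with the same arithmetic mean can receive different aggregates, which is the sense in which such an aggregate penalizes imbalance.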
100.0 LG Mar 10 Code
KernelSkill: A Multi-Agent Framework for GPU Kernel Optimization
Qitong Sun, Jun Han, Tianlin Li et al.
Improving GPU kernel efficiency is crucial for advancing AI systems. Recent work has explored leveraging large language models (LLMs) for GPU kernel generation and optimization. However, existing LLM-based kernel optimization pipelines typically rely on opaque, implicitly learned heuristics within the LLMs to determine optimization strategies. This leads to inefficient trial-and-error and weakly interpretable optimizations. Our key insight is to replace implicit heuristics with expert optimization skills that are knowledge-driven and aware of task trajectories. Specifically, we present KernelSkill, a multi-agent framework with a dual-level memory architecture. KernelSkill operates by coordinating agents with long-term memory of reusable expert skills and short-term memory to prevent repetitive backtracking. On KernelBench Levels 1-3, KernelSkill achieves a 100% success rate and average speedups of 5.44x, 2.82x, and 1.92x over Torch Eager on Levels 1, 2, and 3, respectively, outperforming prior baselines. Code is available at https://github.com/0satan0/KernelMem/.
LG Jul 18, 2024
Hierarchical Stage-Wise Training of Linked Deep Neural Networks for Multi-Building and Multi-Floor Indoor Localization Based on Wi-Fi RSSI Fingerprinting
Sihao Li, Kyeong Soo Kim, Zhe Tang et al.
In this paper, we present a new solution to the problem of large-scale multi-building and multi-floor indoor localization based on linked neural networks, where each neural network is dedicated to a sub-problem and trained under a hierarchical stage-wise training framework. When the measured data from sensors have a hierarchical representation as in multi-building and multi-floor indoor localization, it is important to exploit the hierarchical nature in data processing to provide a scalable solution. In this regard, the hierarchical stage-wise training framework extends the original stage-wise training framework to the case of multiple linked networks by training a lower-hierarchy network based on the prior knowledge gained from the training of higher-hierarchy networks. The experimental results with the publicly-available UJIIndoorLoc multi-building and multi-floor Wi-Fi RSSI fingerprint database demonstrate that the linked neural networks trained under the proposed hierarchical stage-wise training framework can achieve a three-dimensional localization error of 8.19 m, which, to the best of the authors' knowledge, is the most accurate result ever obtained for neural network-based models trained and evaluated with the full datasets of the UJIIndoorLoc database, and that, when applied to a model based on hierarchical convolutional neural networks, the proposed training framework can also significantly reduce the three-dimensional localization error from 11.78 m to 8.71 m.
NI Nov 19, 2022
On the Multidimensional Augmentation of Fingerprint Data for Indoor Localization in A Large-Scale Building Complex Based on Multi-Output Gaussian Process
Zhe Tang, Sihao Li, Kyeong Soo Kim et al.
Wi-Fi fingerprinting has become a dominant solution for large-scale indoor localization due to its major advantage of not requiring new infrastructure and dedicated devices. The number and the distribution of Reference Points (RPs) for the measurement of localization fingerprints like RSSI during the offline phase, however, greatly affect the localization accuracy; for instance, the UJIIndoorLoc database is known to have the issue of uneven spatial distribution of RPs over buildings and floors. Data augmentation has been proposed as a feasible solution to not only address the small number and the uneven distribution of RPs in the existing fingerprint databases but also reduce the labor and time costs of constructing new fingerprint databases. In this paper, we propose the multidimensional augmentation of fingerprint data for indoor localization in a large-scale building complex based on Multi-Output Gaussian Process (MOGP) and systematically investigate the impact of augmentation ratio as well as MOGP kernel functions and models with their hyperparameters on the performance of indoor localization using the UJIIndoorLoc database and the state-of-the-art neural network indoor localization model based on a hierarchical RNN. The investigation based on experimental results suggests that we can generate synthetic RSSI fingerprint data up to ten times the original data -- i.e., the augmentation ratio of 10 -- through the proposed multidimensional MOGP-based data augmentation without significantly affecting the indoor localization performance compared to that of the original data alone, which extends the spatial coverage of the combined RPs and thereby could improve the localization performance at the locations that are not part of the test dataset.
LG Jul 18, 2024
Mean Teacher based SSL Framework for Indoor Localization Using Wi-Fi RSSI Fingerprinting
Sihao Li, Zhe Tang, Kyeong Soo Kim et al.
Wi-Fi fingerprinting is widely applied for indoor localization due to the widespread availability of Wi-Fi devices. However, traditional methods are not ideal for multi-building and multi-floor environments due to scalability issues. Therefore, more and more researchers have employed deep learning techniques to enable scalable indoor localization. This paper introduces a novel semi-supervised learning framework for neural networks based on wireless access point selection, noise injection, and a Mean Teacher model, which leverages unlabeled fingerprints to enhance localization performance. The proposed framework can manage hybrid in/outsourcing and voluntarily contributed databases and continually expand the fingerprint database with newly submitted unlabeled fingerprints during service. The viability of the proposed framework was examined using two established deep-learning models with the UJIIndoorLoc database. The experimental results suggest that the proposed framework significantly improves localization performance compared to the supervised learning-based approach in terms of floor-level coordinate estimation using the EvAAL metric. It shows enhancements of up to 10.99% and 8.98% in the former scenario and 4.25% and 9.35% in the latter, respectively, and additional studies highlight the importance of the essential components of the proposed framework.
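The Mean Teacher idea named in the abstract above can be sketched in a few lines: the teacher's weights track an exponential moving average (EMA) of the student's weights, and a consistency loss pulls the student's predictions on unlabeled inputs toward the teacher's. This is a generic sketch of the standard technique on flat lists of numbers, not the authors' implementation; the function names and the decay value are assumptions.

```python
def ema_update(teacher_weights, student_weights, decay=0.99):
    """Move each teacher weight toward the student weight by (1 - decay)."""
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_weights, student_weights)]

def consistency_loss(student_pred, teacher_pred):
    """Mean squared difference between student and teacher predictions.

    On unlabeled fingerprints this term needs no ground-truth label:
    the teacher's output serves as the target.
    """
    n = len(student_pred)
    return sum((s - t) ** 2 for s, t in zip(student_pred, teacher_pred)) / n

# Toy usage with hypothetical values.
teacher = ema_update([1.0, 0.0], [0.0, 1.0], decay=0.9)
loss = consistency_loss([1.0, 2.0], [1.0, 4.0])
```

In a real training loop the EMA update would run after every optimizer step on the student, and the consistency term would be added to the supervised loss computed on labeled fingerprints.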
CL Apr 30, 2025 Code
Meeseeks: A Feedback-Driven, Iterative Self-Correction Benchmark evaluating LLMs' Instruction Following Capability
Jiaming Wang, Yunke Zhao, Peng Ding et al.
The capability to precisely adhere to instructions is a cornerstone for Large Language Models (LLMs) to function as dependable agents in real-world scenarios. However, confronted with complex prompts, LLMs frequently encounter difficulties in fulfilling all specified requirements within a single response. Drawing inspiration from recent advancements in Chain-of-Thought (CoT) prompting and self-correction methodologies, we introduce Meeseeks (the name is inspired by Mr. Meeseeks from "Rick and Morty," a character renowned for efficiently accomplishing assigned tasks; see https://en.wikipedia.org/wiki/Mr._Meeseeks), a fully automated iterative instruction-following benchmark equipped with an integrated feedback mechanism. Meeseeks identifies erroneous components in model responses and provides corresponding feedback accurately, thereby iteratively guiding the model toward self-correction. The dataset contains over 700 curated instances annotated by 32 distinct capability tags in Chinese and English. Extensive experimental results reveal that different state-of-the-art commercial and open-source LLMs exhibit vastly disparate performance, and even after 20 turns of iterative feedback-driven self-correction, nearly all models demonstrate suboptimal performance. We conducted a comprehensive analysis at both the macro and instance levels, uncovering numerous common issues prevalent in current state-of-the-art models, as well as several counterintuitive phenomena. We have open-sourced our work at https://github.com/ADoublLEN/Meeseeks.
SP Aug 24, 2024
SGP-RI: A Real-Time-Trainable and Decentralized IoT Indoor Localization Model Based on Sparse Gaussian Process with Reduced-Dimensional Inputs
Zhe Tang, Sihao Li, Zichen Huang et al.
As Internet of Things (IoT) devices are deployed in the field, there is an enormous amount of untapped potential in local computing on those IoT devices. Harnessing this potential for indoor localization, therefore, becomes an exciting research area. Conventionally, the training and deployment of indoor localization models are based on centralized servers with substantial computational resources. This centralized approach faces several challenges, including the database's inability to accommodate the dynamic and unpredictable nature of the indoor electromagnetic environment, the model retraining costs, and the susceptibility of centralized servers to security breaches. To mitigate these challenges, we aim to amalgamate the offline and online phases of traditional indoor localization methods using a real-time-trainable and decentralized IoT indoor localization model based on Sparse Gaussian Process with Reduced-dimensional Inputs (SGP-RI), where the number and dimension of the input data are reduced through reference point and wireless access point filtering, respectively. The experimental results, based on a multi-building and multi-floor static database as well as a single-building and single-floor dynamic database, demonstrate that the proposed SGP-RI model with less than half the training samples as inducing inputs can produce localization performance comparable to that of the standard Gaussian Process model with the whole training samples. The SGP-RI model enables the decentralization of indoor localization, facilitating its deployment to resource-constrained IoT devices, and thereby could provide enhanced security and privacy as well as reduced costs and network dependency. Also, the model's capability of real-time training makes it possible to quickly adapt to the time-varying indoor electromagnetic environment.
CL Oct 10, 2025 Code
SOP-Maze: Evaluating Large Language Models on Complicated Business Standard Operating Procedures
Jiaming Wang, Zhe Tang, Yilin Jin et al.
As large language models (LLMs) are widely deployed as domain-specific agents, many benchmarks have been proposed to evaluate their ability to follow instructions and make decisions in real-world scenarios. However, business scenarios often involve complex standard operating procedures (SOPs), and the evaluation of LLM capabilities in such contexts has not been fully explored. To bridge this gap, we propose SOP-Maze, a benchmark constructed from real-world business data and adapted into a collection of 397 tasks from 23 complex SOP scenarios. We further categorize SOP tasks into two broad classes: Lateral Root System (LRS), representing wide-option tasks that demand precise selection; and Heart Root System (HRS), which emphasizes deep logical reasoning with complex branches. Extensive experiments reveal that nearly all state-of-the-art models struggle with SOP-Maze. We conduct a comprehensive analysis and identify three key error categories: (i) route blindness: difficulty following procedures; (ii) conversational fragility: inability to handle real dialogue nuances; and (iii) calculation errors: mistakes in time or arithmetic reasoning under complex contexts. The systematic study explores LLM performance across SOP tasks that challenge both breadth and depth, offering new insights for improving model capabilities. We have open-sourced our work on https://github.com/ADoublLEN/SOP-Maze.
LG Mar 19, 2024 Code
SEVEN: Pruning Transformer Model by Reserving Sentinels
Jinying Xiao, Ping Li, Jie Nie et al.
Large-scale Transformer models (TM) have demonstrated outstanding performance across various tasks. However, their considerable parameter size restricts their applicability, particularly on mobile devices. Due to the dynamic and intricate nature of gradients on TM compared to Convolutional Neural Networks, commonly used pruning methods tend to retain weights with larger gradient noise. This results in pruned models that are sensitive to sparsity and datasets, exhibiting suboptimal performance. Symbolic Descent (SD) is a general approach for training and fine-tuning TM. In this paper, we attempt to describe the noisy batch gradient sequences on TM through the cumulative process of SD. We utilize this design to dynamically assess the importance scores of weights. We introduce SEVEN, which particularly favors weights with consistently high sensitivity, i.e., weights with small gradient noise, and tends to preserve them. Extensive experiments on various TM in natural language, question-answering, and image classification domains are conducted to validate the effectiveness of SEVEN. The results demonstrate significant improvements of SEVEN in multiple pruning scenarios and across different sparsity levels. Additionally, SEVEN exhibits robust performance under various fine-tuning strategies. The code is publicly available at https://github.com/xiaojinying/SEVEN.
LG Feb 20, 2024
Static vs. Dynamic Databases for Indoor Localization based on Wi-Fi Fingerprinting: A Discussion from a Data Perspective
Zhe Tang, Ruocheng Gu, Sihao Li et al.
Wi-Fi fingerprinting has emerged as the most popular approach to indoor localization. The use of ML algorithms has greatly improved the localization performance of Wi-Fi fingerprinting, but its success depends on the availability of fingerprint databases composed of a large number of RSSIs, the MAC addresses of access points, and other measurement information. However, most fingerprint databases do not adequately reflect the time-varying nature of electromagnetic interference in complicated modern indoor environments. This could result in significant changes in the statistical characteristics of training/validation and testing datasets, which are often constructed at different times, and even the characteristics of the testing datasets could be different from those of the data submitted by users during the operation of localization systems after their deployment. In this paper, we consider the implications of time-varying Wi-Fi fingerprints on indoor localization from a data-centric point of view and discuss the differences between static and dynamic databases. As a case study, we have constructed a dynamic database covering three floors of the IR building of XJTLU based on RSSI measurements over 44 days and investigated the differences between static and dynamic databases in terms of statistical characteristics and localization performance. The analyses based on variance calculations and Isolation Forest show temporal shifts in RSSIs, which result in a noticeable trend of increasing localization error for a Gaussian process regression model, with a maximum error of 6.65 m after 14 days of training without model adjustments. The results of the case study with the XJTLU dynamic database clearly demonstrate the limitations of static databases and the importance of the creation and adoption of dynamic databases for future indoor localization research and real-world deployment.
CV Oct 27, 2025
EndoWave: Rational-Wavelet 4D Gaussian Splatting for Endoscopic Reconstruction
Taoyu Wu, Yiyi Miao, Jiaxin Guo et al.
In robot-assisted minimally invasive surgery, accurate 3D reconstruction from endoscopic video is vital for downstream tasks and improved outcomes. However, endoscopic scenarios present unique challenges, including photometric inconsistencies, non-rigid tissue motion, and view-dependent highlights. Most 3DGS-based methods rely solely on appearance constraints for optimization, which is often insufficient in this context, as these dynamic visual artifacts can mislead the optimization process and lead to inaccurate reconstructions. To address these limitations, we present EndoWave, a unified spatio-temporal Gaussian Splatting framework that incorporates an optical flow-based geometric constraint and multi-resolution rational wavelet supervision. First, we adopt a unified spatio-temporal Gaussian representation that directly optimizes primitives in a 4D domain. Second, we propose a geometric constraint derived from optical flow to enhance temporal coherence and effectively constrain the 3D structure of the scene. Third, we propose a multi-resolution rational orthogonal wavelet as a constraint, which can effectively separate the details of the endoscope and enhance the rendering performance. Extensive evaluations on two real surgical datasets, EndoNeRF and StereoMIS, demonstrate that our method EndoWave achieves state-of-the-art reconstruction quality and visual accuracy compared to the baseline method.
CV Nov 18, 2025
Silhouette-to-Contour Registration: Aligning Intraoral Scan Models with Cephalometric Radiographs
Yiyi Miao, Taoyu Wu, Ji Jiang et al.
Reliable 3D-2D alignment between intraoral scan (IOS) models and lateral cephalometric radiographs is critical for orthodontic diagnosis, yet conventional intensity-driven registration methods struggle under real clinical conditions, where cephalograms exhibit projective magnification, geometric distortion, low-contrast dental crowns, and acquisition-dependent variation. These factors hinder the stability of appearance-based similarity metrics and often lead to convergence failures or anatomically implausible alignments. To address these limitations, we propose DentalSCR, a pose-stable, contour-guided framework for accurate and interpretable silhouette-to-contour registration. Our method first constructs a U-Midline Dental Axis (UMDA) to establish a unified cross-arch anatomical coordinate system, thereby stabilizing initialization and standardizing projection geometry across cases. Using this reference frame, we generate radiograph-like projections via a surface-based DRR formulation with coronal-axis perspective and Gaussian splatting, which preserves clinical source-object-detector magnification and emphasizes external silhouettes. Registration is then formulated as a 2D similarity transform optimized with a symmetric bidirectional Chamfer distance under a hierarchical coarse-to-fine schedule, enabling both large capture range and subpixel-level contour agreement. We evaluate DentalSCR on 34 expert-annotated clinical cases. Experimental results demonstrate substantial reductions in landmark error (particularly at posterior teeth), tighter dispersion on the lower jaw, and low Chamfer and controlled Hausdorff distances at the curve level. These findings indicate that DentalSCR robustly handles real-world cephalograms and delivers high-fidelity, clinically inspectable 3D-2D alignment, outperforming conventional baselines.
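The symmetric bidirectional Chamfer distance used as the registration objective in the abstract above can be illustrated on two small 2D point sets. This is a brute-force sketch of the general definition, not the DentalSCR implementation; averaging each direction and summing the two directions is one common convention and is an assumption here, as are the function names.

```python
import math

def directed_chamfer(src, dst):
    """Average distance from each point in src to its nearest point in dst."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

def symmetric_chamfer(a, b):
    """Sum of both directed terms, so neither point set is privileged."""
    return directed_chamfer(a, b) + directed_chamfer(b, a)

# Toy contours (hypothetical): two horizontal segments one unit apart.
silhouette = [(0.0, 0.0), (1.0, 0.0)]
contour = [(0.0, 1.0), (1.0, 1.0)]
distance = symmetric_chamfer(silhouette, contour)  # 1.0 + 1.0 = 2.0
```

Using both directions matters for registration: a one-directional term can be driven to zero by collapsing one contour onto a small part of the other, whereas the symmetric form also penalizes target points that are left unmatched.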
CV Nov 18, 2025
ArchMap: Arch-Flattening and Knowledge-Guided Vision Language Model for Tooth Counting and Structured Dental Understanding
Bohan Zhang, Yiyi Miao, Taoyu Wu et al.
A structured understanding of intraoral 3D scans is essential for digital orthodontics. However, existing deep-learning approaches rely heavily on modality-specific training, large annotated datasets, and controlled scanning conditions, which limit generalization across devices and hinder deployment in real clinical workflows. Moreover, raw intraoral meshes exhibit substantial variation in arch pose, incomplete geometry caused by occlusion or tooth contact, and a lack of texture cues, making unified semantic interpretation highly challenging. To address these limitations, we propose ArchMap, a training-free and knowledge-guided framework for robust structured dental understanding. ArchMap first introduces a geometry-aware arch-flattening module that standardizes raw 3D meshes into spatially aligned, continuity-preserving multi-view projections. We then construct a Dental Knowledge Base (DKB) encoding hierarchical tooth ontology, dentition-stage policies, and clinical semantics to constrain the symbolic reasoning space. We validate ArchMap on 1060 pre-/post-orthodontic cases, demonstrating robust performance in tooth counting, anatomical partitioning, dentition-stage classification, and the identification of clinical conditions such as crowding, missing teeth, prosthetics, and caries. Compared with supervised pipelines and prompted VLM baselines, ArchMap achieves higher accuracy, reduced semantic drift, and superior stability under sparse or artifact-prone conditions. As a fully training-free system, ArchMap demonstrates that combining geometric normalization with ontology-guided multimodal reasoning offers a practical and scalable solution for the structured analysis of 3D intraoral scans in modern digital orthodontics.
CV Nov 18, 2025
Dental3R: Geometry-Aware Pairing for Intraoral 3D Reconstruction from Sparse-View Photographs
Yiyi Miao, Taoyu Wu, Tong Chen et al.
Intraoral 3D reconstruction is fundamental to digital orthodontics, yet conventional methods like intraoral scanning are inaccessible for remote tele-orthodontics, which typically relies on sparse smartphone imagery. While 3D Gaussian Splatting (3DGS) shows promise for novel view synthesis, its application to the standard clinical triad of unposed anterior and bilateral buccal photographs is challenging. The large view baselines, inconsistent illumination, and specular surfaces common in intraoral settings can destabilize simultaneous pose and geometry estimation. Furthermore, sparse-view photometric supervision often induces a frequency bias, leading to over-smoothed reconstructions that lose critical diagnostic details. To address these limitations, we propose Dental3R, a pose-free, graph-guided pipeline for robust, high-fidelity reconstruction from sparse intraoral photographs. Our method first constructs a Geometry-Aware Pairing Strategy (GAPS) to intelligently select a compact subgraph of high-value image pairs. The GAPS focuses on correspondence matching, thereby improving the stability of the geometry initialization and reducing memory usage. Building on the recovered poses and point cloud, we train the 3DGS model with a wavelet-regularized objective. By enforcing band-limited fidelity using a discrete wavelet transform, our approach preserves fine enamel boundaries and interproximal edges while suppressing high-frequency artifacts. We validate our approach on a large-scale dataset of 950 clinical cases and an additional video-based test set of 195 cases. Experimental results demonstrate that Dental3R effectively handles sparse, unposed inputs and achieves superior novel view synthesis quality for dental occlusion visualization, outperforming state-of-the-art methods.
RO Sep 27, 2025
EKF-Based Fusion of Wi-Fi/LiDAR/IMU for Indoor Localization and Navigation
Zeyi Li, Zhe Tang, Kyeong Soo Kim et al.
Conventional Wi-Fi received signal strength indicator (RSSI) fingerprinting cannot meet the growing demand for accurate indoor localization and navigation due to its lower accuracy, while solutions based on light detection and ranging (LiDAR) can provide better localization performance but are limited by their higher deployment cost and complexity. To address these issues, we propose a novel indoor localization and navigation framework integrating Wi-Fi RSSI fingerprinting, LiDAR-based simultaneous localization and mapping (SLAM), and inertial measurement unit (IMU) navigation based on an extended Kalman filter (EKF). Specifically, coarse localization by deep neural network (DNN)-based Wi-Fi RSSI fingerprinting is refined by IMU-based dynamic positioning using a Gmapping-based SLAM to generate an occupancy grid map and output high-frequency attitude estimates, which is followed by EKF prediction-update integrating sensor information while effectively suppressing Wi-Fi-induced noise and IMU drift errors. Multi-group real-world experiments conducted in the IR building at Xi'an Jiaotong-Liverpool University demonstrate that the proposed multi-sensor fusion framework suppresses the instability caused by individual approaches and thereby provides stable accuracy across all path configurations, with mean two-dimensional (2D) errors ranging from 0.2449 m to 0.3781 m. In contrast, the mean 2D errors of Wi-Fi RSSI fingerprinting reach up to 1.3404 m in areas with severe signal interference, and those of LiDAR/IMU localization are between 0.6233 m and 2.8803 m due to cumulative drift.
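The EKF prediction-update cycle mentioned in the abstract above can be sketched for a scalar state. This is the textbook extended Kalman filter recursion, not the paper's Wi-Fi/LiDAR/IMU state model: the scalar state, the function name `ekf_step`, and the toy motion and measurement models are all assumptions chosen to keep the example self-contained.

```python
def ekf_step(x, p, u, z, f, f_jac, h, h_jac, q, r):
    """One predict-update cycle of a scalar extended Kalman filter.

    x, p : prior state estimate and its variance
    u, z : control input and measurement
    f, h : motion and measurement models; f_jac, h_jac are their derivatives
    q, r : process and measurement noise variances
    """
    # Predict: propagate the state and linearize the motion model.
    x_pred = f(x, u)
    big_f = f_jac(x, u)
    p_pred = big_f * p * big_f + q
    # Update: weigh the measurement residual by the Kalman gain.
    big_h = h_jac(x_pred)
    residual = z - h(x_pred)
    s = big_h * p_pred * big_h + r
    gain = p_pred * big_h / s
    x_new = x_pred + gain * residual
    p_new = (1.0 - gain * big_h) * p_pred
    return x_new, p_new

# Toy usage (hypothetical models): dead-reckoned displacement u plays the
# role of an IMU step, and z plays the role of a coarse position fix.
x_new, p_new = ekf_step(
    x=0.0, p=1.0, u=1.0, z=2.0,
    f=lambda x, u: x + u, f_jac=lambda x, u: 1.0,
    h=lambda x: x, h_jac=lambda x: 1.0,
    q=0.0, r=1.0,
)
```

With equal predicted and measurement variances the gain is 0.5, so the estimate lands halfway between the prediction (1.0) and the measurement (2.0), and the variance halves. A noisier position source would be given a larger `r` and thus a smaller influence.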
LG Mar 19, 2024
LNPT: Label-free Network Pruning and Training
Jinying Xiao, Ping Li, Zhe Tang et al.
Pruning before training enables the deployment of neural networks on smart devices. By retaining weights conducive to generalization, pruned networks can be accommodated on resource-constrained smart devices. It is commonly held that the distance on weight norms between the initialized and the fully-trained networks correlates with generalization performance. However, we have uncovered inconsistency between this metric and generalization during training, which poses an obstacle to determining the pruned structures on smart devices in advance. In this paper, we introduce the concept of the learning gap, emphasizing its accurate correlation with generalization. Experiments show that the learning gap, in the form of feature maps from the penultimate layer of networks, aligns with variations of generalization performance. We propose a novel learning framework, LNPT, which enables mature networks on the cloud to provide online guidance for network pruning and learning on smart devices with unlabeled data. Our results demonstrate the superiority of this approach over supervised training.
NI Feb 4, 2022
Multi-Output Gaussian Process-Based Data Augmentation for Multi-Building and Multi-Floor Indoor Localization
Zhe Tang, Sihao Li, Kyeong Soo Kim et al.
Location fingerprinting based on RSSI has become a mainstream indoor localization technique due to its advantage of not requiring the installation of new infrastructure and the modification of existing devices, especially given the prevalence of Wi-Fi-enabled devices and the ubiquitous Wi-Fi access in modern buildings. The use of AI/ML technologies like DNNs makes location fingerprinting more accurate and reliable, especially for large-scale multi-building and multi-floor indoor localization. The application of DNNs for indoor localization, however, depends on a large amount of preprocessed and deliberately-labeled data for their training. Considering the difficulty of the data collection in an indoor environment, especially under the current epidemic situation of COVID-19, we investigate three different methods of RSSI data augmentation based on Multi-Output Gaussian Process (MOGP), i.e., by a single floor, by neighboring floors, and by a single building; unlike Single-Output Gaussian Process (SOGP), MOGP can take into account the correlation among RSSI observations from multiple Access Points (APs) deployed closely to each other (e.g., APs on the same floor of a building) by collectively handling them. The feasibility of the MOGP-based RSSI data augmentation is demonstrated through experiments based on the state-of-the-art RNN indoor localization model and the UJIIndoorLoc, i.e., the most popular publicly-available multi-building and multi-floor indoor localization database, where the RNN model trained with the UJIIndoorLoc database augmented by using the whole RSSI data of a building in fitting an MOGP model (i.e., by a single building) outperforms the other two augmentation methods as well as the RNN model trained with the original UJIIndoorLoc database, resulting in the mean three-dimensional positioning error of 8.42 m.
IV Dec 13, 2021
The Brain Tumor Sequence Registration (BraTS-Reg) Challenge: Establishing Correspondence Between Pre-Operative and Follow-up MRI Scans of Diffuse Glioma Patients
Bhakti Baheti, Satrajit Chakrabarty, Hamed Akbari et al.
Registration of longitudinal brain MRI scans containing pathologies is challenging due to dramatic changes in tissue appearance. Although there has been progress in developing general-purpose medical image registration techniques, they have not yet attained the requisite precision and reliability for this task, highlighting its inherent complexity. Here we describe the Brain Tumor Sequence Registration (BraTS-Reg) challenge as the first public benchmark environment for deformable registration algorithms focusing on estimating correspondences between pre-operative and follow-up scans of the same patient diagnosed with a diffuse brain glioma. The BraTS-Reg data comprise de-identified multi-institutional multi-parametric MRI (mpMRI) scans, curated for size and resolution according to a canonical anatomical template, and divided into training, validation, and testing sets. Clinical experts annotated ground truth (GT) landmark points of anatomical locations distinct across the temporal domain. Quantitative evaluation and ranking were based on the Median Euclidean Error (MEE), Robustness, and the determinant of the Jacobian of the displacement field. The top-ranked methodologies yielded similar performance across all evaluation metrics and shared several methodological commonalities, including pre-alignment, deep neural networks, inverse consistency analysis, and test-time instance optimization on a per-case basis as a post-processing step. The top-ranked method attained the MEE at or below that of the inter-rater variability for approximately 60% of the evaluated landmarks, underscoring the scope for further accuracy and robustness improvements, especially relative to human experts. The aim of BraTS-Reg is to continue to serve as an active resource for research, with the data and online evaluation tools accessible at https://bratsreg.github.io/.
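The Median Euclidean Error named in the abstract above can be computed from paired landmark coordinates as follows. This sketch follows the plain reading of the metric's name (the median of per-landmark Euclidean distances); the challenge's exact evaluation protocol, including units and any per-case aggregation, is not given here, and the function name and coordinates are hypothetical.

```python
import math
import statistics

def median_euclidean_error(predicted, ground_truth):
    """Median of per-landmark Euclidean distances between paired points."""
    if len(predicted) != len(ground_truth):
        raise ValueError("landmark lists must have the same length")
    errors = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return statistics.median(errors)

# Toy landmarks (hypothetical): per-landmark errors are 0, 5, and 1.
warped = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]
annotated = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
mee = median_euclidean_error(warped, annotated)  # median of [0, 5, 1]
```

The median makes the score less sensitive than the mean to a few badly registered landmarks, which is one reason a separate robustness measure is reported alongside it.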
AI Nov 13, 2020
A differential evolution-based optimization tool for interplanetary transfer trajectory design
Mingcheng Zuo, Guangming Dai, Lei Peng et al.
The extremely sensitive and highly nonlinear search space of interplanetary transfer trajectory design poses major challenges for global optimization. As a representative example, the currently known best solution of the global trajectory optimization problem (GTOP) designed by the European Space Agency (ESA) is very hard to find. To deal with this difficulty, a powerful differential evolution-based optimization tool named COoperative Differential Evolution (CODE) is proposed in this paper. CODE employs a two-stage evolutionary process, which concentrates on learning global structure in the earlier process and then tends to self-adaptively learn the structures of different local spaces. Besides, considering the spatial distribution of the global optimum on different problems and the gradient information on different variables, a multiple boundary check technique has been employed. Also, Covariance Matrix Adaptation Evolutionary Strategies (CMA-ES) is used as a local optimizer. Previous studies have shown that a specific swarm intelligent optimization algorithm can usually solve only one or two GTOP problems. However, the experimental test results show that CODE can find the currently known best solutions of Cassini1 and Sagas directly, and the cooperation with CMA-ES can solve Cassini2, GTOC1, Messenger (reduced) and Rosetta. For the most complicated Messenger (full) problem, even though CODE cannot find the currently known best solution, the best solution found, with an objective function value of 3.38 km/s, is still at a level that other swarm intelligent algorithms cannot easily reach.
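For readers unfamiliar with the base algorithm, a classic differential evolution loop (the DE/rand/1/bin variant) is sketched below. This is explicitly not CODE: it has no two-stage process, no multiple boundary check, and no CMA-ES local search. The simple clamp to bounds, the parameter defaults, and the sphere test function are assumptions for illustration.

```python
import random

def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=100, seed=0):
    """Minimize objective over box bounds with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated coordinate
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # simple clamp to the bounds
                else:
                    v = pop[i][j]
                trial.append(v)
            # Greedy selection: keep the trial only if it is no worse.
            trial_fit = objective(trial)
            if trial_fit <= fitness[i]:
                pop[i], fitness[i] = trial, trial_fit
    best = min(range(pop_size), key=lambda k: fitness[k])
    return pop[best], fitness[best]

def sphere(x):
    """Smooth test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best_x, best_f = differential_evolution(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

The sphere function is far easier than trajectory problems such as those in GTOP, whose search spaces are highly multimodal; the sketch only shows the mutation, crossover, and selection steps that differential evolution variants build on.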
LG Oct 17, 2018
XJTLUIndoorLoc: A New Fingerprinting Database for Indoor Localization and Trajectory Estimation Based on Wi-Fi RSS and Geomagnetic Field
Zhenghang Zhong, Zhe Tang, Xiangxing Li et al.
In this paper, we present a new location fingerprinting database comprised of Wi-Fi received signal strength (RSS) and geomagnetic field intensity measured with multiple devices at a multi-floor building in Xi'an Jiaotong-Liverpool University, Suzhou, China. We also provide preliminary results of localization and trajectory estimation based on a convolutional neural network (CNN) and a long short-term memory (LSTM) network with this database. For localization, we map RSS data for a reference point to an image-like, two-dimensional array and then apply a CNN, which is popular in image and video analysis and recognition. For trajectory estimation, we use a modified random waypoint model to efficiently generate continuous step traces imitating human walking and train a stacked two-layer LSTM network with the generated data to remember the changing pattern of geomagnetic field intensity against (x,y) coordinates. Experimental results demonstrate the usefulness of our new database and the feasibility of the CNN and LSTM-based localization and trajectory estimation with the database.