CV | Aug 13, 2024 | Code
Towards Cross-Domain Single Blood Cell Image Classification via Large-Scale LoRA-based Segment Anything Model
Yongcheng Li, Lingcong Cai, Ying Lu et al.
Accurate classification of blood cells plays a vital role in hematological analysis, as it aids physicians in diagnosing various medical conditions. In this study, we present BC-SAM, a novel approach for classifying blood cell images. BC-SAM leverages the large-scale Segment Anything Model (SAM) foundation model and fine-tunes it with LoRA, allowing it to extract general image embeddings from blood cell images. To enhance the applicability of BC-SAM across different blood cell image datasets, we introduce an unsupervised cross-domain autoencoder that focuses on learning intrinsic features while suppressing artifacts in the images. To assess the performance of BC-SAM, we employ four widely used machine learning classifiers (Random Forest, Support Vector Machine, Artificial Neural Network, and XGBoost) to construct blood cell classification models and compare them against existing state-of-the-art methods. Experiments on two publicly available blood cell datasets (Matek-19 and Acevedo-20) demonstrate that BC-SAM achieves a new state-of-the-art result, surpassing the baseline methods by a significant margin. The source code of this paper is available at https://github.com/AnoK3111/BC-SAM.
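To make the LoRA idea mentioned in this abstract concrete, here is a minimal pure-Python sketch of a low-rank adapter wrapped around a frozen linear layer. It is not the authors' implementation (their repository is linked above); the class name `LoRALinear` and the parameters `rank`, `alpha`, and `seed` are illustrative assumptions. The adapter adds a trainable update `(alpha / rank) * B @ A` to the frozen weight, with `B` initialised to zero so that the adapted layer initially reproduces the pretrained output.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-lists matrix by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A.

    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # pretrained weight, kept frozen
        # A starts as small random values, B starts at zero (standard LoRA init).
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.weight, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        """Number of parameters in A and B (the only ones that would be trained)."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Toy 4x4 identity "pretrained" weight with a rank-1 adapter.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
layer = LoRALinear(identity, rank=1)
output = layer.forward([1.0, 2.0, 3.0, 4.0])
full_params = 4 * 4
lora_params = layer.trainable_params()  # 1*4 + 4*1 = 8
```

In this toy case the adapter trains 8 parameters instead of 16; for the large projection matrices of a model such as SAM the saving is much larger, which is the usual motivation for LoRA.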
CV | Aug 14, 2024 | Code
Domain-invariant Representation Learning via Segment Anything Model for Blood Cell Classification
Yongcheng Li, Lingcong Cai, Ying Lu et al.
Accurate classification of blood cells is of vital significance in the diagnosis of hematological disorders. However, in real-world scenarios, domain shifts caused by variability in laboratory procedures and settings lead to a rapid deterioration of a model's generalization performance. To address this issue, we propose a novel framework of domain-invariant representation learning (DoRL) via the Segment Anything Model (SAM) for blood cell classification. DoRL comprises two main components: a LoRA-based SAM (LoRA-SAM) and a cross-domain autoencoder (CAE). The advantage of DoRL is that it can extract domain-invariant representations from various blood cell datasets in an unsupervised manner. Specifically, we first leverage the large-scale SAM foundation model, fine-tuned with LoRA, to learn general image embeddings and segment blood cells. Additionally, we introduce the CAE to learn domain-invariant representations across datasets from different domains while mitigating image artifacts. To validate the effectiveness of the domain-invariant representations, we employ five widely used machine learning classifiers to construct blood cell classification models. Experimental results on two public blood cell datasets and a private real dataset demonstrate that DoRL achieves a new state-of-the-art cross-domain performance, surpassing existing methods by a significant margin. The source code is available at https://github.com/AnoK3111/DoRL.
QUANT-PH | Mar 29, 2022
Quantum compiling with a variational instruction set for accurate and fast quantum computing
Ying Lu, Peng-Fei Zhou, Shao-Ming Fei et al.
The quantum instruction set (QIS) is defined as the set of quantum gates that are physically realizable by controlling the qubits in quantum hardware. Compiling quantum circuits into a product of gates in a properly defined QIS is a fundamental step in quantum computing. Here we propose the quantum variational instruction set (QuVIS), formed by flexibly designed multi-qubit gates, for higher speed and accuracy in quantum computing. The control of the qubits needed to realize the gates in a QuVIS is achieved variationally using the fine-grained time optimization algorithm. Significant reductions in both error accumulation and time cost are demonstrated in realizing swaps of multiple qubits and quantum Fourier transformations, compared with compiling with a standard QIS such as the quantum microinstruction set (QuMIS, formed by several one- and two-qubit gates including one-qubit rotations and controlled-NOT gates). With the same requirements on quantum hardware, the time cost for QuVIS is reduced to less than one half of that for QuMIS. Simultaneously, the error is suppressed algebraically as the depth of the compiled circuit is reduced. As a general compiling approach with high flexibility and efficiency, QuVIS can be defined for different quantum circuits and adapted to quantum hardware with different interactions.
LG | Sep 5, 2023
Developing A Fair Individualized Polysocial Risk Score (iPsRS) for Identifying Increased Social Risk of Hospitalizations in Patients with Type 2 Diabetes (T2D)
Yu Huang, Jingchuan Guo, William T Donahoo et al.
Background: Racial and ethnic minority groups and individuals facing social disadvantages, which often stem from their social determinants of health (SDoH), bear a disproportionate burden of type 2 diabetes (T2D) and its complications. It is therefore crucial to implement effective social risk management strategies at the point of care. Objective: To develop an electronic health record (EHR)-based machine learning (ML) analytical pipeline to identify the unmet social needs associated with hospitalization risk in patients with T2D. Methods: We identified 10,192 T2D patients in EHR data (2012 to 2022) from the University of Florida Health Integrated Data Repository, including contextual SDoH (e.g., neighborhood deprivation) and individual-level SDoH (e.g., housing stability). We developed an EHR-based ML analytic pipeline, namely the individualized polysocial risk score (iPsRS), to identify high social risk associated with hospitalizations in T2D patients, along with explainable AI (XAI) techniques and fairness assessment and optimization. Results: Our iPsRS achieved a C statistic of 0.72 in predicting 1-year hospitalization after fairness optimization across racial-ethnic groups. The iPsRS showed excellent utility for capturing individuals at high hospitalization risk; the actual 1-year hospitalization rate in the top 5% of iPsRS was ~13 times as high as that in the bottom decile. Conclusion: Our ML pipeline iPsRS can fairly and accurately screen for T2D patients who have increased social risk leading to hospitalization.
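For readers unfamiliar with the C statistic reported in this abstract: for a binary outcome it is the probability that a randomly chosen patient who was hospitalized received a higher risk score than a randomly chosen patient who was not, with ties counted as one half. The sketch below computes it by direct pair counting. The function name `c_statistic` and the toy scores are made up for illustration and have no connection to the study's data or code.

```python
def c_statistic(scores, outcomes):
    """Concordance (C) statistic for binary outcomes via pairwise comparison.

    scores   -- predicted risk scores, one per patient
    outcomes -- 1 if the event (e.g. hospitalization) occurred, else 0
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # event patient ranked higher: concordant pair
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(positives) * len(negatives))


# Toy example with invented scores: 3 of the 4 event/non-event pairs are concordant.
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_outcomes = [1, 0, 1, 0]
toy_c = c_statistic(toy_scores, toy_outcomes)  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.72 sits between the two.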
32.0 | CL | May 25
A general tensor-structured compression scheme for efficient large language models
Ying Lu, Peng-Fei Zhou, Qi-Xuan Fang et al.
Large language models (LLMs) are dominated by dense linear transformations, whose storage, memory and computational overheads hinder efficient adaptation and deployment while masking the functional impacts of structural simplification. Here we present Tensor Mixture (MixT), a general tensor-structured compression scheme that replaces targeted dense linear layers with natively executable mixtures of tensor operators. Operating directly on generic linear projections instead of model-specific components, MixT is potentially applicable across Transformer-based LLMs and other dense neural mappings. We evaluate MixT on Qwen3-8B and LLaMA2-7B under a unified recovery protocol, identifying a broad compressible regime in which MMLU accuracy is largely preserved before an abrupt transition at model-specific boundaries. This transition coincides with coordinated shifts in output entropy, prediction entropy and inter-layer geometry. At the LLaMA2-7B transition boundary, MixT reduces full-model parameters by 47.5%, inference FLOPs by 37.1%, training FLOPs by 52.1% and peak inference memory by 60.4%, demonstrating its practical potential for lower-cost LLM compression.
QUANT-PH | Jul 21, 2023
Persistent Ballistic Entanglement Spreading with Optimal Control in Quantum Spin Chains
Ying Lu, Pei Shi, Xiao-Han Wang et al.
Entanglement propagation provides a key route to understanding quantum many-body dynamics in and out of equilibrium. The entanglement entropy (EE) usually approaches a sub-saturation value known as the Page value $\tilde{S}_{P} = \tilde{S} - dS$ (with $\tilde{S}$ the maximum of EE and $dS$ the Page correction) in, e.g., random unitary evolutions. Ballistic spreading of EE usually appears at early times and breaks down long before the Page value is reached. In this work, we uncover that the magnetic field that maximizes the EE robustly induces persistent ballistic spreading of entanglement in quantum spin chains. The linear growth of EE is demonstrated to persist until the maximal $\tilde{S}$ (along with a flat entanglement spectrum) is reached. The robustness of ballistic spreading and the enhancement of EE under such an optimal control are demonstrated, in particular when the initial state is perturbed by random pure states (RPSs). These results are argued to follow from the endomorphism of the time evolution under such an entanglement-enhancing optimal control for the RPSs.
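As a numerical illustration of the relation $\tilde{S}_{P} = \tilde{S} - dS$, the sketch below evaluates Page's closed-form average entanglement entropy for a Haar-random pure state split into subsystems of Hilbert-space dimensions $m \le n$, namely $\sum_{k=n+1}^{mn} 1/k - (m-1)/(2n)$ in nats. Identifying this textbook formula with the Page value used in the paper is an assumption, since the abstract does not spell out its definition; the function names are hypothetical.

```python
import math


def page_entropy(m, n):
    """Page's average entanglement entropy (in nats) of a Haar-random pure
    state on a bipartite system with subsystem dimensions m and n."""
    if m > n:
        m, n = n, m  # the formula is stated for m <= n
    harmonic_tail = sum(1.0 / k for k in range(n + 1, m * n + 1))
    return harmonic_tail - (m - 1) / (2.0 * n)


def page_correction(m, n):
    """dS = maximal entropy ln(min(m, n)) minus the Page value."""
    return math.log(min(m, n)) - page_entropy(m, n)


# One qubit entangled with one qubit: the Page value is exactly 1/3 nat,
# below the maximum ln 2 ~ 0.693.
two_qubit_page = page_entropy(2, 2)
two_qubit_correction = page_correction(2, 2)
```

The correction is strictly positive for any finite subsystem sizes, which is why reaching the full maximum $\tilde{S}$, as claimed in the abstract, goes beyond what random unitary evolution typically achieves.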
6.0 | CE | Apr 15
An End-to-end Building Load Forecasting Framework with Patch-based Information Fusion Network and Error-weighted Adaptive Loss
Hang Fan, Ying Lu, Weican Liu et al.
Accurate building load forecasting plays a critical role in facilitating demand response aggregation and optimizing energy management. However, the complex temporal dependencies and high volatility of building loads limit the improvement of prediction accuracy. To this end, we propose a novel end-to-end building load forecasting framework. Specifically, the framework can be divided into two main stages. In the two-stage data preprocessing module enhanced by interpretable feature selection, we utilize the Local Outlier Factor (LOF) algorithm to accurately detect and correct anomalies in the original building load series. Furthermore, we employ SVM-SHAP feature analysis to quantify the impact of environmental variables, filtering out critical feature combinations to mitigate redundancy. In the building load forecasting module, we propose the patch-based information fusion network (PIF-Net). This model applies patching technology to process input series into local blocks, extracting temporal features through a shared Gated Recurrent Unit (GRU) network with residual connections. Subsequently, an information fusion module based on a customized gating mechanism integrates the ensemble hidden states to weight the importance of different temporal patches dynamically. Additionally, the framework is trained using a novel Error-weighted Adaptive Loss (EWAL) function. By combining a rational quadratic function and logarithmic loss to dynamically adjust penalty weights based on real-time prediction error distributions, EWAL significantly enhances the model's robustness under extreme load conditions. Finally, extensive experiments demonstrate the effectiveness and superiority of our proposed framework.
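The patching step described in this abstract, which splits an input series into local blocks before they are passed to a shared recurrent network, can be sketched as follows. Only the slicing is shown; the GRU, gating, and EWAL components are not reproduced. The names `make_patches`, `patch_len`, and `stride` are illustrative assumptions, and the abstract does not say whether the patches overlap.

```python
def make_patches(series, patch_len, stride):
    """Split a 1-D series into fixed-length patches taken every `stride` steps.

    Trailing values that do not fill a complete patch are dropped.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(series) - patch_len
    return [series[start:start + patch_len] for start in range(0, last_start + 1, stride)]


# Toy load series of 8 readings, patches of length 4 with 50% overlap.
toy_load = [0, 1, 2, 3, 4, 5, 6, 7]
patches = make_patches(toy_load, patch_len=4, stride=2)
# patches == [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7]]
```

Each patch would then be encoded by the same recurrent encoder, so the number of patches, rather than the raw series length, determines how many hidden states the fusion module has to weight.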
CV | Dec 4, 2024 | Code
Stain-aware Domain Alignment for Imbalance Blood Cell Classification
Yongcheng Li, Lingcong Cai, Ying Lu et al.
Blood cell identification is critical for hematological analysis, as it aids physicians in diagnosing various blood-related diseases. In real-world scenarios, blood cell image datasets often suffer from domain shift and data imbalance, posing challenges for accurate blood cell identification. To address these issues, we propose a novel blood cell classification method, termed SADA, based on stain-aware domain alignment. The primary objective of this work is to mine domain-invariant features in the presence of domain shifts and data imbalances. To accomplish this objective, we propose a stain-based augmentation approach and a local alignment constraint to learn domain-invariant features. Furthermore, we propose a domain-invariant supervised contrastive learning strategy to capture discriminative features. We decouple the training process into two stages, domain-invariant feature learning and classification training, which alleviates the problem of data imbalance. Experimental results on four public blood cell datasets and a private real dataset collected from the Third Affiliated Hospital of Sun Yat-sen University demonstrate that SADA sets a new state of the art, outperforming existing cutting-edge methods by a large margin. The source code is available at https://github.com/AnoK3111/SADA.
CV | Mar 24, 2024
Opportunities and challenges in the application of large artificial intelligence models in radiology
Liangrui Pan, Zhenyu Zhao, Ying Lu et al.
Influenced by ChatGPT, research and development of large artificial intelligence (AI) models has seen a global upsurge. As people enjoy the convenience these large AI models bring, more and more large models for specialized fields are being proposed, especially in radiology imaging. This article first introduces the development history of large models, their technical details and workflow, and the working principles of multimodal large models and video generation large models. Second, we summarize the latest research progress of large AI models in radiology education, radiology report generation, and unimodal and multimodal radiology applications. Finally, we summarize some of the challenges facing large AI models in radiology, with the aim of better promoting the rapid revolution in the field of radiography.
SI | Sep 14, 2021
Embedding Node Structural Role Identity Using Stress Majorization
Lili Wang, Chenghan Huang, Weicheng Ma et al.
Nodes in networks may have one or more functions that determine their role in the system. As opposed to local proximity, which captures the local context of nodes, the role identity captures the functional "role" that nodes play in a network, such as being the center of a group or the bridge between two groups. This means that nodes far apart in a network can have similar structural role identities. Several recent works have explored methods for embedding the roles of nodes in networks. However, these methods all rely on either approximation or indirect modeling of structural equivalence. In this paper, we present a novel and flexible framework that uses stress majorization to transform the high-dimensional role identities in networks directly (without approximation or indirect modeling) into a low-dimensional embedding space. Our method is also flexible in that it does not rely on specific structural similarity definitions. We evaluated our method on the tasks of node classification, clustering, and visualization, using three real-world and five synthetic networks. Our experiments show that our framework achieves better results than existing methods in learning node role representations.
QUANT-PH | Jun 3, 2021
Preparation of Many-body Ground States by Time Evolution with Variational Microscopic Magnetic Fields and Incomplete Interactions
Ying Lu, Yue-Min Li, Peng-Fei Zhou et al.
State preparation is of fundamental importance in quantum physics. It can be realized by constructing a quantum circuit as a unitary that transforms the initial state to the target, or by implementing a quantum control protocol that evolves the system to the target state with a designed Hamiltonian. In this work, we study the latter on quantum many-body systems, using time evolution with fixed couplings and variational magnetic fields. Specifically, we consider preparing the ground states of Hamiltonians containing certain interactions that are missing from the Hamiltonians used for the time evolution. An optimization method is proposed that optimizes the magnetic fields by "fine-graining" the discretization of time, in order to gain high precision and stability. The back propagation technique is utilized to obtain the gradients of the logarithmic fidelity with respect to the fields. Our method is tested on preparing the ground state of the Heisenberg chain with time evolution under the XY and Ising interactions, and its performance surpasses two baseline methods that use local and global optimization strategies, respectively. Our work can be applied and generalized to other quantum models, such as those defined on higher-dimensional lattices. It suggests a way to reduce the complexity of the interactions required for implementing quantum control or other tasks in quantum information and computation by optimizing the magnetic fields.
SI | Nov 3, 2020
Embedding Node Structural Role Identity into Hyperbolic Space
Lili Wang, Ying Lu, Chenghan Huang et al.
Recently, there has been interest in embedding networks in hyperbolic space, since hyperbolic space has been shown to capture graph/network structure well and can naturally reflect some properties of complex networks. However, work on network embedding in hyperbolic space has focused on microscopic node embedding. In this work, we are the first to present a framework to embed the structural roles of nodes into hyperbolic space. Our framework extends struc2vec, a well-known structural-role-preserving embedding method, by moving it to a hyperboloid model. We evaluated our method on four real-world networks and one synthetic network. Our results show that hyperbolic space is more effective than Euclidean space in learning latent representations for the structural role of nodes.
LG | Apr 3, 2019
D$^2$-City: A Large-Scale Dashcam Video Dataset of Diverse Traffic Scenarios
Zhengping Che, Guangyu Li, Tracy Li et al.
Driving datasets accelerate the development of intelligent driving and related computer vision technologies, while substantial and detailed annotations boost the value of such datasets for improving learning-based models. We propose D$^2$-City, a large-scale comprehensive collection of dashcam videos collected by vehicles on DiDi's platform. D$^2$-City contains more than 10,000 video clips, which reflect the diversity and complexity of real-world traffic scenarios in China. We also provide bounding boxes and tracking annotations of 12 classes of objects in all frames of 1,000 videos, and detection annotations on keyframes for the remainder of the videos. Compared with existing datasets, D$^2$-City features data in varying weather, road, and traffic conditions and a large number of elaborate detection and tracking annotations. By bringing a diverse set of challenging cases to the community, we expect the D$^2$-City dataset to advance perception and related areas of intelligent driving.
CV | Feb 21, 2018
Discriminative Label Consistent Domain Adaptation
Lingkun Luo, Liming Chen, Ying Lu et al.
Domain adaptation (DA) is a form of transfer learning that aims to learn an effective predictor on target data from source data despite the mismatch between the source and target data distributions. In this paper, we present a novel unsupervised DA method for cross-domain visual recognition that simultaneously optimizes the three terms of a theoretically established error bound. Specifically, the proposed DA method iteratively searches for a latent shared feature subspace in which not only is the divergence between the source and target data distributions decreased, as in most state-of-the-art DA methods, but the inter-class distances are also increased to facilitate discriminative learning. Moreover, the proposed DA method sparsely regresses class labels from the features obtained in the shared subspace while minimizing the prediction errors on the source data and ensuring label consistency between source and target. Data outliers are also accounted for to further avoid negative knowledge transfer. Comprehensive experiments and in-depth analysis verify the effectiveness of the proposed DA method, which consistently outperforms state-of-the-art DA methods on standard DA benchmarks, i.e., 12 cross-domain image classification tasks.
CV | Jan 17, 2018
Brenier approach for optimal transportation between a quasi-discrete measure and a discrete measure
Ying Lu, Liming Chen, Alexandre Saidi et al.
Correctly estimating the discrepancy between two data distributions has always been an important task in machine learning. Recently, Cuturi proposed the Sinkhorn distance, which uses an approximate optimal transport cost between two distributions as a distance to describe distribution discrepancy. Although it has since been successfully adopted in various machine learning applications (e.g., in natural language processing and computer vision), the Sinkhorn distance suffers from two non-negligible limitations. The first is that the Sinkhorn distance only gives an approximation of the true Wasserstein distance. The second is the 'divide by zero' problem, which often occurs during matrix scaling when the entropy regularization coefficient is set to a small value. In this paper, we introduce a new Brenier approach for calculating a more accurate Wasserstein distance between two discrete distributions. This approach avoids both limitations of the Sinkhorn distance and provides an alternative way of estimating distribution discrepancy.
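Both limitations of the Sinkhorn distance named in this abstract can be reproduced with a few lines of plain Python. The sketch below is a textbook Sinkhorn matrix-scaling loop, not the paper's Brenier approach; the function name `sinkhorn_cost` and the toy histograms and cost matrix are illustrative assumptions. With a moderate regularization coefficient the result is close to, but not equal to, the exact transport cost (which is 1.0 for this toy problem). With a very small coefficient every kernel entry `exp(-C/eps)` underflows to zero and the first scaling step divides by zero.

```python
import math


def sinkhorn_cost(a, b, cost, eps, n_iter=200):
    """Entropy-regularized transport cost between histograms a and b."""
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps); entries underflow to 0.0 for tiny eps.
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Row scaling u = a / (K v); a zero row sum triggers the divide-by-zero failure.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Column scaling v = b / (K^T u).
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v); return the cost <P, C>.
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j] for i in range(n) for j in range(m))


a = [0.5, 0.5]
b = [0.5, 0.5]
cost = [[1.0, 2.0], [2.0, 1.0]]  # exact optimal transport cost is 1.0

approx_cost = sinkhorn_cost(a, b, cost, eps=0.1)  # slightly above 1.0

try:
    sinkhorn_cost(a, b, cost, eps=1e-4)  # exp(-10000) underflows to 0.0
    small_eps_failed = False
except ZeroDivisionError:
    small_eps_failed = True
```

Practical implementations usually work around the underflow by iterating in the log domain, which addresses numerical stability but still returns a regularized approximation rather than the exact Wasserstein distance.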
CV | Dec 28, 2017
Discriminative and Geometry Aware Unsupervised Domain Adaptation
Lingkun Luo, Liming Chen, Shiqiang Hu et al.
Domain adaptation (DA) aims to generalize a learning model across training and testing data despite the mismatch of their data distributions. In light of a theoretical estimate of the upper error bound, we argue in this paper that an effective DA method should 1) search for a shared feature subspace where source and target data are not only aligned in terms of distributions, as in most state-of-the-art DA methods, but also discriminative, in that instances of different classes are well separated; and 2) account for the geometric structure of the underlying data manifold when inferring data labels on the target domain. In comparison with a baseline DA method that only considers data distribution alignment between source and target, we derive three different DA models, namely CDDA, GA-DA, and DGA-DA, to highlight the contribution of Close yet Discriminative DA (CDDA) based on 1), Geometry Aware DA (GA-DA) based on 2), and finally Discriminative and Geometry Aware DA (DGA-DA), which implements 1) and 2) jointly. Using both synthetic and real data, we show the effectiveness of the proposed approach, which consistently outperforms state-of-the-art DA methods on 36 image classification DA tasks across 6 popular benchmarks. We further carry out an in-depth analysis of the proposed DA method, quantifying the contribution of each term of our DA model, and provide insights into the proposed DA methods by visualizing both real and synthetic data.
CV | Sep 9, 2017
Optimal Transport for Deep Joint Transfer Learning
Ying Lu, Liming Chen, Alexandre Saidi
Training a Deep Neural Network (DNN) from scratch requires a large amount of labeled data. For a classification task where only a small amount of training data is available, a common solution is to fine-tune a DNN that has been pre-trained on related source data. This consecutive training process is time-consuming and does not explicitly consider the relatedness between different source and target tasks. In this paper, we propose a novel method to jointly fine-tune a Deep Neural Network with source data and target data. By adding an Optimal Transport loss (OT loss) between source and target classifier predictions as a constraint on the source classifier, the proposed Joint Transfer Learning Network (JTLN) can effectively learn useful knowledge for target classification from source data. Furthermore, by using different kinds of metrics as the cost matrix for the OT loss, JTLN can incorporate different prior knowledge about the relatedness between target categories and source categories. We carried out experiments with JTLN based on AlexNet on image classification datasets, and the results verify the effectiveness of the proposed JTLN in comparison with standard consecutive fine-tuning. This Joint Transfer Learning with OT loss is general and can also be applied to other kinds of neural networks.
DB | Jun 2, 2017
Efficient Detection of Points of Interest from Georeferenced Visual Content
Ying Lu, Juan A. Colmenares
Many people take photos and videos with smartphones, and more recently with 360-degree cameras, at popular places and events, and share them on social media. Such visual content is produced in large volumes in urban areas, and it is a source of information that online users could exploit to learn what has attracted the interest of the general public on the streets of the cities where they live or plan to visit. A key step in providing users with that information is to identify the k most popular spots in specified areas. In this paper, we propose a clustering and incremental sampling (C&IS) approach that trades off accuracy of top-k results for detection speed. It uses clustering to determine areas with a high density of visual content, and incremental sampling, controlled by stopping criteria, to limit the amount of computational work. It leverages spatial metadata, which represent the scenes in the visual content, to rapidly detect the hotspots, and uses a recently proposed Gaussian probability model to describe the capture intention distribution in the query area. We evaluate the approach with metadata derived from a non-synthetic, user-generated dataset, for regular mobile and 360-degree visual content. Our results show that the C&IS approach offers 2.8x-19x reductions in processing time over an optimized baseline, while in most cases correctly identifying 4 out of 5 top locations.