Jianpeng Ma

SP
h-index32
7papers
397citations
Novelty55%
AI Score40

7 Papers

LGAug 21, 2025Code
Intern-S1: A Scientific Multimodal Foundation Model

Lei Bai, Zhongrui Cai, Yuhang Cao et al.

In recent years, a plethora of open-source foundation models have emerged, achieving remarkable progress in some widely attended fields, with performance being quite close to that of closed-source models. However, in high-value but more challenging scientific professional fields, either the fields still rely on expert models, or the progress of general foundation models lags significantly compared to those in popular areas, far from sufficient for transforming scientific research and leaving substantial gap between open-source models and closed-source models in these scientific domains. To mitigate this gap and explore a step further toward Artificial General Intelligence (AGI), we introduce Intern-S1, a specialized generalist equipped with general understanding and reasoning capabilities with expertise to analyze multiple science modal data. Intern-S1 is a multimodal Mixture-of-Experts (MoE) model with 28 billion activated parameters and 241 billion total parameters, continually pre-trained on 5T tokens, including over 2.5T tokens from scientific domains. In the post-training stage, Intern-S1 undergoes offline and then online reinforcement learning (RL) in InternBootCamp, where we propose Mixture-of-Rewards (MoR) to synergize the RL training on more than 1000 tasks simultaneously. Through integrated innovations in algorithms, data, and training systems, Intern-S1 achieved top-tier performance in online RL training. On comprehensive evaluation benchmarks, Intern-S1 demonstrates competitive performance on general reasoning tasks among open-source models and significantly outperforms open-source models in scientific domains, surpassing closed-source state-of-the-art models in professional tasks, such as molecular synthesis planning, reaction condition prediction, predicting thermodynamic stabilities for crystals. Our models are available at https://huggingface.co/internlm/Intern-S1.

CVOct 9, 2023
CAMEL2: Enhancing weakly supervised learning for histopathology images by incorporating the significance ratio

Gang Xu, Shuhao Wang, Lingyu Zhao et al.

Histopathology image analysis plays a crucial role in cancer diagnosis. However, training a clinically applicable segmentation algorithm requires pathologists to engage in labour-intensive labelling. In contrast, weakly supervised learning methods, which only require coarse-grained labels at the image level, can significantly reduce the labeling efforts. Unfortunately, while these methods perform reasonably well in slide-level prediction, their ability to locate cancerous regions, which is essential for many clinical applications, remains unsatisfactory. Previously, we proposed CAMEL, which achieves comparable results to those of fully supervised baselines in pixel-level segmentation. However, CAMEL requires 1,280x1,280 image-level binary annotations for positive WSIs. Here, we present CAMEL2, by introducing a threshold of the cancerous ratio for positive bags, it allows us to better utilize the information, consequently enabling us to scale up the image-level setting from 1,280x1,280 to 5,120x5,120 while maintaining the accuracy. Our results with various datasets, demonstrate that CAMEL2, with the help of 5,120x5,120 image-level binary annotations, which are easy to annotate, achieves comparable performance to that of a fully supervised baseline in both instance- and slide-level classifications.

ITAug 9, 2021
Deep Learning Based Antenna-time Domain Channel Extrapolation for Hybrid mmWave Massive MIMO

Shunbo Zhang, Shun Zhang, Jianpeng Ma et al.

In a time-varying massive multiple-input multipleoutput (MIMO) system, the acquisition of the downlink channel state information at the base station (BS) is a very challenging task due to the prohibitively high overheads associated with downlink training and uplink feedback. In this paper, we consider the hybrid precoding structure at BS and examine the antennatime domain channel extrapolation. We design a latent ordinary differential equation (ODE)-based network under the variational auto-encoder (VAE) framework to learn the mapping function from the partial uplink channels to the full downlink ones at the BS side. Specifically, the gated recurrent unit is adopted for the encoder and the fully-connected neural network is used for the decoder. The end-to-end learning is utilized to optimize the network parameters. Simulation results show that the designed network can efficiently infer the full downlink channels from the partial uplink ones, which can significantly reduce the channel training overhead.

SPMay 14, 2021
Deep Learning Based RIS Channel Extrapolation with Element-grouping

Shunbo Zhang, Shun Zhang, Feifei Gao et al.

Reconfigurable intelligent surface (RIS) is considered as a revolutionary technology for future wireless communication networks. In this letter, we consider the acquisition of the cascaded channels, which is a challenging task due to the massive number of passive RIS elements. To reduce the pilot overhead, we adopt the element-grouping strategy, where each element in one group shares the same reflection coefficient and is assumed to have the same channel condition. We analyze the channel interference caused by the element-grouping strategy and further design two deep learning based networks. The first one aims to refine the partial channels by eliminating the interference, while the second one tries to extrapolate the full channels from the refined partial channels. We cascade the two networks and jointly train them. Simulation results show that the proposed scheme provides significant gain compared to the conventional element-grouping method without interference elimination.

SPSep 3, 2020
Deep Learning Based Antenna Selection for Channel Extrapolation in FDD Massive MIMO

Yindi Yang, Shun Zhang, Feifei Gao et al.

In massive multiple-input multiple-output (MIMO) systems, the large number of antennas would bring a great challenge for the acquisition of the accurate channel state information, especially in the frequency division duplex mode. To overcome the bottleneck of the limited number of radio links in hybrid beamforming, we utilize the neural networks (NNs) to capture the inherent connection between the uplink and downlink channel data sets and extrapolate the downlink channels from a subset of the uplink channel state information. We study the antenna subset selection problem in order to achieve the best channel extrapolation and decrease the data size of NNs. The probabilistic sampling theory is utilized to approximate the discrete antenna selection as a continuous and differentiable function, which makes the back propagation of the deep learning feasible. Then, we design the proper off-line training strategy to optimize both the antenna selection pattern and the extrapolation NNs. Finally, numerical results are presented to verify the effectiveness of our proposed massive MIMO channel extrapolation algorithm.

SPSep 3, 2020
Deep Learning Optimized Sparse Antenna Activation for Reconfigurable Intelligent Surface Assisted Communication

Shunbo Zhang, Shun Zhang, Feifei Gao et al.

To capture the communications gain of the massive radiating elements with low power cost, the conventional reconfigurable intelligent surface (RIS) usually works in passive mode. However, due to the cascaded channel structure and the lack of signal processing ability, it is difficult for RIS to obtain the individual channel state information and optimize the beamforming vector. In this paper, we add signal processing units for a few antennas at RIS to partially acquire the channels. To solve the crucial active antenna selection problem, we construct an active antenna selection network that utilizes the probabilistic sampling theory to select the optimal locations of these active antennas. With this active antenna selection network, we further design two deep learning (DL) based schemes, i.e., the channel extrapolation scheme and the beam searching scheme, to enable the RIS communication system. The former utilizes the selection network and a convolutional neural network to extrapolate the full channels from the partial channels received by the active RIS antennas, while the latter adopts a fully-connected neural network to achieve the direct mapping between the partial channels and the optimal beamforming vector with maximal transmission rate. Simulation results are provided to demonstrate the effectiveness of the designed DL-based schemes.

IVAug 28, 2019
CAMEL: A Weakly Supervised Learning Framework for Histopathology Image Segmentation

Gang Xu, Zhigang Song, Zhuo Sun et al.

Histopathology image analysis plays a critical role in cancer diagnosis and treatment. To automatically segment the cancerous regions, fully supervised segmentation algorithms require labor-intensive and time-consuming labeling at the pixel level. In this research, we propose CAMEL, a weakly supervised learning framework for histopathology image segmentation using only image-level labels. Using multiple instance learning (MIL)-based label enrichment, CAMEL splits the image into latticed instances and automatically generates instance-level labels. After label enrichment, the instance-level labels are further assigned to the corresponding pixels, producing the approximate pixel-level labels and making fully supervised training of segmentation models possible. CAMEL achieves comparable performance with the fully supervised approaches in both instance-level classification and pixel-level segmentation on CAMELYON16 and a colorectal adenoma dataset. Moreover, the generality of the automatic labeling methodology may benefit future weakly supervised learning studies for histopathology image analysis.