Shengwei Wang

h-index: 82

2 papers

22,727 citations

2 Papers

2.6 LG Apr 29, 2024

Bridging Data Barriers among Participants: Assessing the Potential of Geoenergy through Federated Learning

Weike Peng, Jiaxin Gao, Yuntian Chen et al.

Machine learning algorithms are emerging as a promising approach in energy fields, but their practical application is hindered by data barriers stemming from high collection costs and privacy concerns. This study introduces a novel federated learning (FL) framework based on XGBoost models, enabling safe collaborative modeling with accessible yet concealed data from multiple parties. Hyperparameter tuning of the models is achieved through Bayesian Optimization. To ascertain the merits of the proposed FL-XGBoost method, a comparative analysis against separate and centralized models is conducted on a classical binary classification problem in the geoenergy sector. The results reveal that the proposed FL framework strikes an optimal balance between privacy and accuracy. FL models demonstrate superior accuracy and generalization capabilities compared to separate models, particularly for participants with limited data or low-correlation features, and offer significant privacy benefits compared to the centralized model. The aggregated optimization approach within the FL agreement proves effective in tuning hyperparameters. This study opens new avenues for assessing unconventional reservoirs through collaborative and privacy-preserving FL techniques.

17.1 HC Sep 5, 2016

Crowdsourcing Information Extraction for Biomedical Systematic Reviews

Yalin Sun, Pengxiang Cheng, Shengwei Wang et al.

Information extraction is a critical step in conducting biomedical systematic literature reviews. Extracted structured data can be aggregated via methods such as statistical meta-analysis. Typically, highly trained domain experts extract data for systematic reviews. The high expense of conducting biomedical systematic reviews has motivated researchers to explore lower-cost methods that achieve similar rigor without compromising quality. Crowdsourcing represents one such promising approach. In this work-in-progress study, we designed a crowdsourcing task for biomedical information extraction. We briefly report the iterative design process and the results of two pilot tests. We found that giving more concrete examples in the task instructions can help workers better understand the task, especially for concepts that are abstract and confusing. We also found that a few workers completed most of the work, and that our payment level appeared more attractive to workers from low-income countries. In the future, we will further evaluate our results with reference to gold standard extractions, thus assessing the feasibility of tasking crowd workers with extracting biomedical intervention information for systematic reviews.