Guangsheng Yu

CR
h-index: 17
22 papers
263 citations
Novelty: 44%
AI Score: 54

22 Papers

CR · May 25
Counted NFT Transfers

Qin Wang, Minfeng Qi, Guangsheng Yu et al.

Non-fungible tokens (NFTs) on Ethereum currently follow a binary mobility paradigm: ERC-721 enables unrestricted transfers, whereas SBTs (ERC-5192) prohibit transfers entirely. We identify a design gap in which no standard mechanism supports bounded transferability, where ownership mobility is allowed but limited to a finite number of programmable transfers. We study counted NFT transfers and introduce ERC-7634 as a minimal realization compatible with ERC-721. The design augments each token with a transfer counter and configurable cap L, allowing ownership to evolve under a finite transfer budget. ERC-7634 defines a minimal extension interface with three lightweight functions (transferCountOf, setTransferLimit, and transferLimitOf), two events, and native-transfer hooks, requiring fewer than 60 additional lines of Solidity while preserving full backward compatibility with existing NFT infrastructure. We analyze behavioral and economic consequences of counted transfers. Our results reveal (i) a mobility premium induced by remaining transfer capacity, (ii) a protocol-level costing signal that can deter wash trading in cap-aware markets through irreversible budget consumption, (iii) bounded recursive collateralization enabled by limited ownership turnover, and (iv) associated security and gas-cost implications, including wrapper-bypass trade-offs. Evaluation on calibrated simulations shows that moderate limits (e.g., L = 10) affect fewer than 15% of tokens under representative transfer distributions, while repeated manipulation becomes unprofitable after a few cycles in a cap-aware pricing model; the additional gas overhead remains below 11% per transfer. We further position ERC-7634 within the NFT mobility design space, derive practical cap-selection guidelines, and discuss post-cap ownership outcomes including soulbound conversion, auto-burn, and provenance freeze.
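The counter-and-cap mechanism described in this abstract can be illustrated with a small in-memory model. This is a minimal sketch in Python rather than Solidity, and it is not the ERC-7634 reference code: the method names mirror the three functions named in the abstract (transferCountOf, setTransferLimit, transferLimitOf), while the `CountedNFT` class, the `mint` helper, the exception type, and the rule that a cap cannot be set below the count already used are all assumptions made for illustration.

```python
class TransferLimitReached(Exception):
    """Raised when a token has exhausted its transfer budget."""


class CountedNFT:
    """Toy registry: ERC-721-style ownership plus a per-token transfer cap L."""

    def __init__(self):
        self._owner = {}  # token_id -> current owner
        self._count = {}  # token_id -> transfers performed so far
        self._limit = {}  # token_id -> cap L

    def mint(self, token_id, owner, limit):
        # Hypothetical helper: create a token with a fresh counter.
        self._owner[token_id] = owner
        self._count[token_id] = 0
        self._limit[token_id] = limit

    def owner_of(self, token_id):
        return self._owner[token_id]

    def transfer_count_of(self, token_id):
        return self._count[token_id]

    def transfer_limit_of(self, token_id):
        return self._limit[token_id]

    def set_transfer_limit(self, token_id, limit):
        # Assumption: the cap may not drop below the transfers already used.
        if limit < self._count[token_id]:
            raise ValueError("limit below current transfer count")
        self._limit[token_id] = limit

    def transfer(self, token_id, sender, recipient):
        if self._owner[token_id] != sender:
            raise PermissionError("sender is not the owner")
        # The hook: refuse the transfer once the budget is spent.
        if self._count[token_id] >= self._limit[token_id]:
            raise TransferLimitReached(token_id)
        self._owner[token_id] = recipient
        self._count[token_id] += 1
```

In this model a token minted with a limit of 2 can change hands twice; a third call to `transfer` raises `TransferLimitReached`, which corresponds to the post-cap state the abstract discusses (for example, soulbound conversion).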

LG · Jan 7, 2023
IronForge: An Open, Secure, Fair, Decentralized Federated Learning

Guangsheng Yu, Xu Wang, Caijun Sun et al.

Federated learning (FL) provides an effective machine learning (ML) architecture to protect data privacy in a distributed manner. However, the inevitable network asynchrony, the over-dependence on a central coordinator, and the lack of an open and fair incentive mechanism collectively hinder its further development. We propose IronForge, a new-generation FL framework that features a Directed Acyclic Graph (DAG)-based data structure and eliminates the need for central coordinators to achieve fully decentralized operations. IronForge runs in a public and open network, and launches a fair incentive mechanism by enabling state consistency in the DAG, so that the system fits in networks where training resources are unevenly distributed. In addition, dedicated defense strategies against prevalent FL attacks on incentive fairness and data privacy are presented to ensure the security of IronForge. Experimental results based on a newly developed testbed, FLSim, highlight the superiority of IronForge over existing prevalent FL frameworks under various specifications in performance, fairness, and security. To the best of our knowledge, IronForge is the first secure and fully decentralized FL framework that can be applied in open networks with realistic network and training settings.

LG · Aug 16, 2024 · Code
Parallel Unlearning in Inherited Model Networks

Xiao Liu, Mingyuan Li, Guangsheng Yu et al.

Unlearning is challenging in generic learning frameworks with the continuous growth and updates of models exhibiting complex inheritance relationships. This paper presents a novel unlearning framework that enables fully parallel unlearning among models exhibiting inheritance. We use a chronological Directed Acyclic Graph (DAG) to capture various unlearning scenarios occurring in model inheritance networks. Central to our framework is the Fisher Inheritance Unlearning (FIUn) method, designed to enable efficient parallel unlearning within the DAG. FIUn utilizes the Fisher Information Matrix (FIM) to assess the significance of model parameters for unlearning tasks and adjusts them accordingly. To handle multiple unlearning requests simultaneously, we propose the Merging-FIM (MFIM) function, which consolidates FIMs from multiple upstream models into a unified matrix. This design supports all unlearning scenarios captured by the DAG, enabling one-shot removal of inherited knowledge while significantly reducing computational overhead. Experiments confirm the effectiveness of our unlearning framework. For single-class tasks, it achieves complete unlearning with 0% accuracy for unlearned labels while maintaining 94.53% accuracy for retained labels. For multi-class tasks, the accuracy is 1.07% for unlearned labels and 84.77% for retained labels. Our framework accelerates unlearning by 99% compared to alternative methods. Code is available at https://github.com/MJLee00/Parallel-Unlearning-in-Inherited-Model-Networks.

LG · Dec 10, 2025 · Code
CFLight: Enhancing Safety with Traffic Signal Control through Counterfactual Learning

Mingyuan Li, Chunyu Liu, Zhuojun Li et al.

Traffic accidents result in millions of injuries and fatalities globally, with a significant number occurring at intersections each year. Traffic Signal Control (TSC) is an effective strategy for enhancing safety at these urban junctures. Despite the growing popularity of Reinforcement Learning (RL) methods in optimizing TSC, these methods often prioritize driving efficiency over safety, thus failing to address the critical balance between these two aspects. Additionally, these methods usually lack interpretability. CounterFactual (CF) learning is a promising approach for various causal analysis fields. In this study, we introduce a novel framework to improve RL for safety aspects in TSC. This framework introduces a novel method based on CF learning to address the question: "What if, when an unsafe event occurs, we backtrack to perform alternative actions, and will this unsafe event still occur in the subsequent period?" To answer this question, we propose a new structural causal model to predict the result after executing different actions, and we propose a new CF module that integrates with additional "X" modules to promote safe RL practices. Our new algorithm, CFLight, which is derived from this framework, effectively tackles challenging safety events and significantly improves safety at intersections through a near-zero collision control strategy. Through extensive numerical experiments on both real-world and synthetic datasets, we demonstrate that CFLight reduces collisions and improves overall traffic performance compared to conventional RL methods and the recent safe RL model. Moreover, our method represents a generalized and safe framework for RL methods, opening possibilities for applications in other domains. The data and code are available on GitHub at https://github.com/MJLee00/CFLight-Enhancing-Safety-with-Traffic-Signal-Control-through-Counterfactual-Learning.

LG · Jul 17, 2023
A Secure Aggregation for Federated Learning on Long-Tailed Data

Yanna Jiang, Baihe Ma, Xu Wang et al.

As a distributed learning paradigm, Federated Learning (FL) faces two challenges: the unbalanced distribution of training data among participants, and model attacks by Byzantine nodes. In this paper, we consider the long-tailed distribution with the presence of Byzantine nodes in the FL scenario. A novel two-layer aggregation method is proposed for the rejection of malicious models and the advisable selection of valuable models containing tail class data information. We introduce the concept of a think tank to leverage the wisdom of all participants. Preliminary experiments validate that the think tank can make effective model selections for global aggregation.

CR · Feb 24
SoK: Agentic Skills -- Beyond Tool Use in LLM Agents

Yanna Jiang, Delong Li, Haiyu Deng et al.

Agentic systems increasingly rely on reusable procedural capabilities, a.k.a. agentic skills, to execute long-horizon workflows reliably. These capabilities are callable modules that package procedural knowledge with explicit applicability conditions, execution policies, termination criteria, and reusable interfaces. Unlike one-off plans or atomic tool calls, skills operate (and often do well) across tasks. This paper maps the skill layer across the full lifecycle (discovery, practice, distillation, storage, composition, evaluation, and update) and introduces two complementary taxonomies. The first is a system-level set of seven design patterns capturing how skills are packaged and executed in practice, from metadata-driven progressive disclosure and executable code skills to self-evolving libraries and marketplace distribution. The second is an orthogonal representation × scope taxonomy describing what skills are (natural language, code, policy, hybrid) and what environments they operate over (web, OS, software engineering, robotics). We analyze the security and governance implications of skill-based agents, covering supply-chain risks, prompt injection via skill payloads, and trust-tiered execution, grounded by a case study of the ClawHavoc campaign in which nearly 1,200 malicious skills infiltrated a major agent marketplace, exfiltrating API keys, cryptocurrency wallets, and browser credentials at scale. We further survey deterministic evaluation approaches, anchored by recent benchmark evidence that curated skills can substantially improve agent success rates while self-generated skills may degrade them. We conclude with open challenges toward robust, verifiable, and certifiable skills for real-world autonomous agents.

CR · Aug 17, 2024
ByCAN: Reverse Engineering Controller Area Network (CAN) Messages from Bit to Byte Level

Xiaojie Lin, Baihe Ma, Xu Wang et al.

As the primary standard protocol for modern cars, the Controller Area Network (CAN) is a critical research target for automotive cybersecurity threats and autonomous applications. As the decoding specification of CAN is a proprietary black-box maintained by Original Equipment Manufacturers (OEMs), conducting related research and industry developments can be challenging without a comprehensive understanding of the meaning of CAN messages. In this paper, we propose a fully automated reverse-engineering system, named ByCAN, to reverse engineer CAN messages. ByCAN outperforms existing research by introducing byte-level clusters and integrating multiple features at both byte and bit levels. ByCAN employs the clustering and template matching algorithms to automatically decode the specifications of CAN frames without the need for prior knowledge. Experimental results demonstrate that ByCAN achieves high accuracy in slicing and labeling performance, i.e., the identification of CAN signal boundaries and labels. In the experiments, ByCAN achieves slicing accuracy of 80.21%, slicing coverage of 95.21%, and labeling accuracy of 68.72% for general labels when analyzing the real-world CAN frames.

CR · Apr 30
MEV in Binance Builder

Qin Wang, Ruiqiang Li, Guangsheng Yu et al.

We study builder-driven MEV arbitrage on BNB Smart Chain (BSC). BSC's Proposer-Builder Separation (PBS) adopts a leaner design: only whitelisted builders can participate, blocks are produced at shorter intervals, and private order flow bypasses the public mempool. These features have long raised community concerns over centralization, which we empirically confirm by tracing the arbitrage activities of the two dominant builders from Apr. 1, 2025 to Feb. 28, 2026 (full observable activity cycle). Within months, the two leading builders, 48Club and Blockrazor, produced over 87% of blocks and captured 90% or more of MEV profits. We find that profits concentrate in short, low-hop arbitrage routes over wrapped tokens and stablecoins, and that block construction rapidly converges toward monopoly. Beyond concentration alone, our analysis reveals a structural source of inequality: BSC's short block interval and whitelisted PBS collapse the contestable window for MEV competition, amplifying latency advantages and excluding slower builders and searchers. MEV extraction on BSC is not only more centralized than on Ethereum, but also structurally more vulnerable to censorship and fairness erosion.

CR · Mar 27
Clawed and Dangerous: Can We Trust Open Agentic Systems?

Shiping Chen, Qin Wang, Guangsheng Yu et al.

Open agentic systems combine LLM-based planning with external capabilities, persistent memory, and privileged execution. They are used in coding assistants, browser copilots, and enterprise automation. OpenClaw is a visible instance of this broader class. Although it has received little attention so far, their security challenge is fundamentally different from that of traditional software, which relies on predictable execution and well-defined control flow. In open agentic systems, everything is "probabilistic": plans are generated at runtime, key decisions may be shaped by untrusted natural-language inputs and tool outputs, execution unfolds in uncertain environments, and actions are taken under authority delegated by human users. The central challenge is therefore not merely robustness against individual attacks, but the governance of agentic behavior under persistent uncertainty. This paper systematizes the area through a software engineering lens. We introduce a six-dimensional analytical taxonomy and synthesize 50 papers spanning attacks, benchmarks, defenses, audits, and adjacent engineering foundations. From this synthesis, we derive a reference doctrine for secure-by-construction agent platforms, together with an evaluation scorecard for assessing platform security posture. Our review shows that the literature is relatively mature in attack characterization and benchmark construction, but remains weak in deployment controls, operational governance, persistent-memory integrity, and capability revocation. These gaps define a concrete engineering agenda for building agent ecosystems that are governable, auditable, and resilient under compromise.

CR · Mar 19
PlanTwin: Privacy-Preserving Planning Abstractions for Cloud-Assisted LLM Agents

Guangsheng Yu, Qin Wang, Rui Lang et al.

Cloud-hosted large language models (LLMs) have become the de facto planners in agentic systems, coordinating tools and guiding execution over local environments. In many deployments, however, the environment being planned over is private, containing source code, files, credentials, and metadata that cannot be exposed to the cloud. Existing solutions address adjacent concerns, such as execution isolation, access control, or confidential inference, but they do not control what cloud planners observe during planning: within the permitted scope, raw environment state is still exposed. We introduce PlanTwin, a privacy-preserving architecture for cloud-assisted planning without exposing raw local context. The key idea is to project the real environment into a planning-oriented digital twin: a schema-constrained and de-identified abstract graph that preserves planning-relevant structure while removing reconstructable details. The cloud planner operates solely on this sanitized twin through a bounded capability interface, while a local gatekeeper enforces safety policies and cumulative disclosure budgets. We further formalize the privacy-utility trade-off as a capability granularity problem, define architectural privacy goals using (k, δ)-anonymity and ε-unlinkability, and mitigate compositional leakage through multi-turn disclosure control. We implement PlanTwin as middleware between local agents and cloud planners and evaluate it on 60 agentic tasks across ten domains with four cloud planners. PlanTwin achieves full sensitive-item non-disclosure (SND = 1.0) while maintaining planning quality close to full-context systems: three of four planners achieve PQS > 0.79, and the full pipeline incurs less than 2.2% utility loss.
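The idea of a local gatekeeper enforcing a cumulative disclosure budget across turns can be sketched as a simple meter. This is only an illustration under stated assumptions, not PlanTwin's implementation: the `Gatekeeper` class, the `release` method, the integer cost units, and the example item strings are all hypothetical, and the abstract does not say how disclosure cost is actually measured.

```python
class DisclosureBudgetExceeded(Exception):
    """Raised when a release would push cumulative disclosure past the budget."""


class Gatekeeper:
    """Toy local gatekeeper that meters what the cloud planner may observe."""

    def __init__(self, budget):
        self.budget = budget  # total disclosure allowed across all turns
        self.spent = 0        # disclosure consumed so far

    def release(self, abstract_item, cost):
        """Return the item only if its cost still fits in the remaining budget."""
        if cost < 0:
            raise ValueError("cost must be non-negative")
        if self.spent + cost > self.budget:
            raise DisclosureBudgetExceeded(abstract_item)
        self.spent += cost
        return abstract_item


# Hypothetical usage: two abstract graph elements consume the whole budget.
gate = Gatekeeper(budget=3)
gate.release("node:file#1 (type=source)", cost=1)
gate.release("edge:file#1 -> module#2", cost=2)
# Any further release with a positive cost would raise DisclosureBudgetExceeded.
```

Because the meter is cumulative rather than per-request, a sequence of individually harmless queries cannot exceed the total, which is the multi-turn concern the abstract raises.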

CE · Mar 19
In the Margins: An Empirical Study of Ethereum Inscriptions

Xihan Xiong, Minfeng Qi, Shiping Chen et al.

Ethereum Inscriptions (Ethscriptions) repurpose Ethereum calldata into a persistent inscription channel by embedding data: URI payloads. These transactions typically target externally owned accounts, allowing the payload to bypass EVM execution while remaining permanently replicated across full nodes. Although calldata was originally designed for compact smart-contract parameters, this repurposing enables structured data embedding with long-term storage consequences. We present the first large-scale empirical study of Ethscriptions, treating them as a distinct calldata-resident workload rather than merely a subset of general calldata usage. Our analysis focuses on the Ethscription operational subset, which consists of payloads that decode to JSON and conform to a token-operation grammar (e.g., p, op, tick, amt). From 6.27 million Ethscription candidates (U1), we extract 4.75 million Ethscription operations (U2, 75.8% of U1). This result shows that structured token-like activity dominates the ecosystem. Our measurements further reveal (i) a complete workload lifecycle compressed into nine months (bootstrap, expansion, saturation), (ii) proliferation of 30+ competing protocols without convergence toward a dominant standard, (iii) a lifecycle funnel exhibiting 201× deploy-to-mint amplification and a 57.6:1 mint-to-transfer collapse indicative of speculative minting, (iv) extreme participation inequality (Gini 0.86), and (v) a measurable permanent data footprint imposed on the Ethereum network.
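The filtering step this abstract describes (calldata that carries a data: URI, decodes to JSON, and matches a token-operation grammar) might look roughly like the following. It is a sketch under assumptions, not the study's pipeline: the function names are hypothetical, and requiring all four example fields (p, op, tick, amt) is a simplification, since the abstract lists them only as examples of the grammar.

```python
import base64
import json
from urllib.parse import unquote

# Assumption: treat the four example fields from the abstract as required.
REQUIRED_KEYS = ("p", "op", "tick", "amt")


def decode_data_uri(calldata_hex):
    """Return the payload bytes if calldata is a UTF-8 'data:' URI, else None."""
    hex_body = calldata_hex[2:] if calldata_hex.startswith("0x") else calldata_hex
    try:
        text = bytes.fromhex(hex_body).decode("utf-8")
    except ValueError:  # bad hex or invalid UTF-8
        return None
    if not text.startswith("data:") or "," not in text:
        return None
    header, body = text[5:].split(",", 1)
    if header.endswith(";base64"):
        try:
            return base64.b64decode(body, validate=True)
        except ValueError:  # binascii.Error subclasses ValueError
            return None
    return unquote(body).encode("utf-8")


def parse_operation(calldata_hex):
    """Return the JSON object if the payload looks like a token operation."""
    payload = decode_data_uri(calldata_hex)
    if payload is None:
        return None
    try:
        obj = json.loads(payload)
    except ValueError:  # not JSON, or not decodable text
        return None
    if isinstance(obj, dict) and all(key in obj for key in REQUIRED_KEYS):
        return obj
    return None
```

In the abstract's terms, transactions for which `decode_data_uri` succeeds play the role of candidates, and those for which `parse_operation` also succeeds play the role of operations.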

CR · Mar 13 · Code
Why Neural Structural Obfuscation Can't Kill White-Box Watermarks for Good!

Yanna Jiang, Guangsheng Yu, Qingyuan Yu et al.

Neural Structural Obfuscation (NSO) (USENIX Security'23) is a family of "zero cost" structure-editing transforms (nso_zero, nso_clique, nso_split) that inject dummy neurons. By combining neuron permutation and parameter scaling, NSO makes a radical modification to the network structure and parameters while strictly preserving functional equivalence, thereby disrupting white-box watermark verification. This capability has been a fundamental challenge to the reliability of existing white-box watermarking schemes. We rethink NSO and, for the first time, fully recover from the damage it has caused. We redefine NSO as a graph-consistent threat model within a producer-consumer paradigm. This formulation posits that any obfuscation of a producer node necessitates a compatible layout update in all downstream consumers to maintain structural integrity. Building on these consistency constraints on signal propagation, we present Canon, a recovery framework that probes the attacked model to identify redundancy/dummy channels and then globally canonicalizes the network by rewriting all downstream consumers by construction, synchronizing layouts across fan-out, add, and cat. Extensive experiments demonstrate that, even under strong composed and extended NSO attacks, Canon achieves 100% recovery success, restoring watermark verifiability while preserving task utility. Our code is available at https://anonymous.4open.science/r/anti-NSO-9874.

AI · Apr 19
Knows: Agent-Native Structured Research Representations

Guangsheng Yu, Xu Wang

Research artifacts are distributed primarily as reader-oriented documents like PDFs. This creates a bottleneck for increasingly agent-assisted and agent-native research workflows, in which LLM agents need to infer fine-grained, task-relevant information from lengthy full documents, a process that is expensive, repetitive, and unstable at scale. We introduce Knows, a lightweight companion specification that binds structured claims, evidence, provenance, and verifiable relations to existing research artifacts in a form LLM agents can consume directly. Knows addresses the gap with a thin YAML sidecar (KnowsRecord) that coexists with the original PDF, requires no changes to the publication itself, and is validated by a deterministic schema linter. We evaluate Knows on 140 comprehension questions across 20 papers spanning 14 academic disciplines, comparing PDF-only, sidecar-only, and hybrid conditions across six LLM agents of varying capacity. Weak models (0.8B-2B parameters) improve from 19-25% to 47-67% accuracy (+29 to +42 percentage points) when reading the sidecar instead of the PDF, while consuming 29-86% fewer input tokens; an LLM-as-judge re-scoring confirms that weak-model sidecar accuracy (75-77%) approaches stronger-model PDF accuracy (78-83%). Beyond this controlled evaluation, a community sidecar hub at https://knows.academy/ has already indexed over ten thousand publications and continues to grow daily, providing independent evidence that the format is adoption-ready at scale.

CR · Apr 9, 2024
Is Your AI Truly Yours? Leveraging Blockchain for Copyrights, Provenance, and Lineage

Qin Wang, Guangsheng Yu, Yilin Sai et al.

As Artificial Intelligence (AI) integrates into diverse areas, particularly in content generation, ensuring rightful ownership and ethical use becomes paramount. AI service providers are expected to prioritize responsibly sourcing training data and obtaining licenses from data owners. However, existing studies primarily center on safeguarding static copyrights, which simply treat metadata/datasets as non-fungible items with transferable/trading capabilities, neglecting the dynamic nature of training procedures that can shape an ongoing trajectory. In this paper, we present IBis, a blockchain-based framework tailored for AI model training workflows. Our design can dynamically manage copyright compliance and data provenance in decentralized AI model training processes, ensuring that intellectual property rights are respected throughout iterative model enhancements and licensing updates. Technically, IBis integrates on-chain registries for datasets, licenses and models, alongside off-chain signing services to facilitate collaboration among multiple participants. Further, IBis provides APIs designed for seamless integration with existing contract management software, minimizing disruptions to established model training processes. We implement IBis using Daml on the Canton blockchain. Evaluation results showcase the feasibility and scalability of IBis across varying numbers of users, datasets, models, and licenses.

CR · Feb 26, 2024
BlockFUL: Enabling Unlearning in Blockchained Federated Learning

Xiao Liu, Mingyuan Li, Xu Wang et al.

Unlearning in Federated Learning (FL) presents significant challenges, as models grow and evolve with complex inheritance relationships. This complexity is amplified when blockchain is employed to ensure the integrity and traceability of FL, where the need to edit multiple interlinked blockchain records and update all inherited models complicates the process. In this paper, we introduce Blockchained Federated Unlearning (BlockFUL), a novel framework with a dual-chain structure comprising a live chain and an archive chain for enabling unlearning capabilities within Blockchained FL. BlockFUL introduces two new unlearning paradigms, i.e., parallel and sequential paradigms, which can be effectively implemented through gradient-ascent-based and re-training-based unlearning methods. These methods enhance the unlearning process across multiple inherited models by enabling efficient consensus operations and reducing computational costs. Our extensive experiments validate that these methods effectively reduce data dependency and operational overhead, thereby boosting the overall performance of unlearning inherited models within BlockFUL on CIFAR-10 and Fashion-MNIST datasets using AlexNet, ResNet18, and MobileNetV2 models.

CR · Jul 13, 2025
CAN-Trace Attack: Exploit CAN Messages to Uncover Driving Trajectories

Xiaojie Lin, Baihe Ma, Xu Wang et al.

Driving trajectory data remains vulnerable to privacy breaches despite existing mitigation measures. Traditional methods for detecting driving trajectories typically rely on map-matching the path using Global Positioning System (GPS) data, which is susceptible to GPS data outage. This paper introduces CAN-Trace, a novel privacy attack mechanism that leverages Controller Area Network (CAN) messages to uncover driving trajectories, posing a significant risk to drivers' long-term privacy. A new trajectory reconstruction algorithm is proposed to transform the CAN messages, specifically vehicle speed and accelerator pedal position, into weighted graphs accommodating various driving statuses. CAN-Trace identifies driving trajectories using graph-matching algorithms applied to the created graphs in comparison to road networks. We also design a new metric to evaluate matched candidates, which allows for potential data gaps and matching inaccuracies. Empirical validation under various real-world conditions, encompassing different vehicles and driving regions, demonstrates the efficacy of CAN-Trace: it achieves an attack success rate of up to 90.59% in the urban region, and 99.41% in the suburban region.

CR · Jun 30, 2025
SoK: Semantic Privacy in Large Language Models

Baihe Ma, Yanna Jiang, Xu Wang et al.

As Large Language Models (LLMs) are increasingly deployed in sensitive domains, traditional data privacy measures prove inadequate for protecting information that is implicit, contextual, or inferable - what we define as semantic privacy. This Systematization of Knowledge (SoK) introduces a lifecycle-centric framework to analyze how semantic privacy risks emerge across input processing, pretraining, fine-tuning, and alignment stages of LLMs. We categorize key attack vectors and assess how current defenses, such as differential privacy, embedding encryption, edge computing, and unlearning, address these threats. Our analysis reveals critical gaps in semantic-level protection, especially against contextual inference and latent representation leakage. We conclude by outlining open challenges, including quantifying semantic leakage, protecting multimodal inputs, balancing de-identification with generation quality, and ensuring transparency in privacy enforcement. This work aims to inform future research on designing robust, semantically aware privacy-preserving techniques for LLMs.

CR · May 18, 2025
PoLO: Proof-of-Learning and Proof-of-Ownership at Once with Chained Watermarking

Haiyu Deng, Yanna Jiang, Guangsheng Yu et al.

Machine learning models are increasingly shared and outsourced, raising requirements of verifying training effort (Proof-of-Learning, PoL) to ensure claimed performance and establishing ownership (Proof-of-Ownership, PoO) for transactions. When models are trained by untrusted parties, PoL and PoO must be enforced together to enable protection, attribution, and compensation. However, existing studies typically address them separately, which not only weakens protection against forgery and privacy breaches but also leads to high verification overhead. We propose PoLO, a unified framework that simultaneously achieves PoL and PoO using chained watermarks. PoLO splits the training process into fine-grained training shards and embeds a dedicated watermark in each shard. Each watermark is generated using the hash of the preceding shard, certifying the training process of the preceding shard. The chained structure makes it computationally difficult to forge any individual part of the whole training process. The complete set of watermarks serves as the PoL, while the final watermark provides the PoO. PoLO offers more efficient and privacy-preserving verification compared to the vanilla PoL solutions that rely on gradient-based trajectory tracing and inadvertently expose training data during verification, while maintaining the same level of ownership assurance of watermark-based PoO schemes. Our evaluation shows that PoLO achieves 99% watermark detection accuracy for ownership verification, while preserving data privacy and cutting verification costs to just 1.5-10% of traditional methods. Forging PoLO demands 1.1-4x more resources than honest proof generation, with the original proof retaining over 90% detection accuracy even after attacks.
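The chaining idea in this abstract, where each shard's watermark is derived from the hash of the preceding shard, can be illustrated with a plain hash chain. This is a minimal sketch and not PoLO's actual construction: the function names, the use of SHA-256, the `genesis` secret that seeds the first shard, and folding the previous key into each hash are all assumptions, and shard states are modeled as opaque byte strings rather than real model checkpoints.

```python
import hashlib


def derive_watermark_keys(shard_states, genesis):
    """Derive one watermark key per shard, each bound to all earlier shards.

    shard_states: list of bytes, a stand-in for serialized per-shard checkpoints.
    genesis: bytes, a hypothetical owner secret that seeds the first key.
    """
    keys = []
    prev = hashlib.sha256(genesis).digest()
    for state in shard_states:
        keys.append(prev.hex())  # key to embed while training this shard
        # The next key commits to this shard's state and to the chain so far.
        prev = hashlib.sha256(prev + state).digest()
    return keys


def verify_chain(shard_states, keys, genesis):
    """Recompute the chain and check that the claimed keys match."""
    return keys == derive_watermark_keys(shard_states, genesis)
```

Altering any shard state changes every later key, which gives an intuition for why forging one part of the training history is costly. In this toy version the last shard's state is not covered by any later key, and actual watermark embedding and detection in model weights are out of scope.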

CR · May 25, 2023
Distributed Trust Through the Lens of Software Architecture

Sin Kit Lo, Yue Liu, Guangsheng Yu et al.

Distributed trust is a nebulous concept that has evolved from different perspectives in recent years. While one can attribute its current prominence to blockchain and cryptocurrency, the distributed trust concept has been cultivating progress in federated learning, trustworthy and responsible AI in an ecosystem setting, data sharing, privacy issues across organizational boundaries, and zero trust cybersecurity. This paper will survey the concept of distributed trust in multiple disciplines. It will take a system/software architecture point of view to look at trust redistribution/shift and the associated tradeoffs in systems and applications enabled by distributed trust technologies.

LG · May 8, 2023
Blockchained Federated Learning for Internet of Things: A Comprehensive Survey

Yanna Jiang, Baihe Ma, Xu Wang et al.

The demand for intelligent industries and smart services based on big data is rising rapidly with the increasing digitization and intelligence of the modern world. This survey comprehensively reviews Blockchained Federated Learning (BlockFL), which combines the benefits of both Blockchain and Federated Learning to provide a secure and efficient solution for this demand. We compare the existing BlockFL models in four Internet-of-Things (IoT) application scenarios: Personal IoT (PIoT), Industrial IoT (IIoT), Internet of Vehicles (IoV), and Internet of Health Things (IoHT), with a focus on security and privacy, trust and reliability, efficiency, and data heterogeneity. Our analysis shows that the features of decentralization and transparency make BlockFL a secure and effective solution for distributed model training, while the overhead and compatibility still need further study. It also reveals that each domain presents unique challenges, e.g., the requirement of accommodating dynamic environments in IoV and the high demands of identity and permission management in IoHT, in addition to some common challenges identified, such as privacy, resource constraints, and data heterogeneity. Furthermore, we examine the existing technologies that can benefit BlockFL, thereby helping researchers and practitioners to make informed decisions about the selection and development of BlockFL for various IoT application scenarios.

SE · Oct 26, 2021
Defining Blockchain Governance Principles: A Comprehensive Framework

Yue Liu, Qinghua Lu, Guangsheng Yu et al.

Blockchain eliminates the need for trusted third-party intermediaries in business by enabling decentralised architecture design in software applications. However, the vulnerabilities in on-chain autonomous decision-making and cumbersome off-chain coordination lead to serious concerns about blockchain's ability to behave in a trustworthy and efficient way. Blockchain governance has received considerable attention to support the decision-making process during the use and evolution of blockchain. Nevertheless, the conventional governance frameworks do not apply to blockchain due to its distributed architecture and decentralised decision process. These inherent features lead to the absence of a clear source of authority in the blockchain ecosystem. Currently, there is a lack of systematic guidance on the governance of blockchain. Therefore, in this paper, we present a comprehensive blockchain governance framework, which elucidates an integrated view of the degree of decentralisation, decision rights, incentives, accountability, ecosystem, and legal and ethical responsibilities. The above aspects are formulated as six high-level principles for blockchain governance. We demonstrate a qualitative analysis of the proposed framework, including case studies on five extant blockchain platforms, and comparison with existing blockchain governance frameworks. The results show that our proposed framework is feasible and applicable in a real-world context.

CR · Jun 27, 2021
Capacity Analysis of Public Blockchain

Xu Wang, Wei Ni, Xuan Zha et al.

As distributed ledgers, blockchains run consensus protocols which trade capacity for consistency, especially in non-ideal networks with incomplete connectivity and erroneous links. Existing studies on the tradeoff between capacity and consistency are only qualitative or rely on specific assumptions. This paper presents discrete-time Markov chain models to quantify the capacity of Proof-of-Work based public blockchains in non-ideal networks. The comprehensive model is collapsed to be ergodic under the eventual consistency of blockchains, achieving tractability and efficient evaluations of blockchain capacity. A closed-form expression for the capacity is derived in the case of two miners. Another important aspect is that we extend the ergodic model to analyze the capacity under strong consistency, evaluating the robustness of blockchains against double-spending attacks. Validated by simulations, the proposed models are accurate and reveal the effect of link quality and the distribution of mining rates on blockchain capacity and the ratio of stale blocks.