Zhiqiu Huang

CR
15papers
139citations
Novelty49%
AI Score48

15 Papers

CVJul 19, 2022
Shrinking the Semantic Gap: Spatial Pooling of Local Moment Invariants for Copy-Move Forgery Detection

Chao Wang, Zhiqiu Huang, Shuren Qi et al.

Copy-move forgery is a manipulation of copying and pasting specific patches from and to an image, with potentially illegal or unethical uses. Recent advances in the forensic methods for copy-move forgery have shown increasing success in detection accuracy and robustness. However, for images with high self-similarity or strong signal corruption, the existing algorithms often exhibit inefficient processes and unreliable results. This is mainly due to the inherent semantic gap between low-level visual representation and high-level semantic concept. In this paper, we present a very first study of trying to mitigate the semantic gap problem in copy-move forgery detection, with spatial pooling of local moment invariants for midlevel image representation. Our detection method expands the traditional works on two aspects: 1) we introduce the bag-of-visual-words model into this field for the first time, may meaning a new perspective of forensic study; 2) we propose a word-to-phrase feature description and matching pipeline, covering the spatial structure and visual saliency information of digital images. Extensive experimental results show the superior performance of our framework over state-of-the-art algorithms in overcoming the related problems caused by the semantic gap.

CRApr 21Code
Security Is Relative: Training-Free Vulnerability Detection via Multi-Agent Behavioral Contract Synthesis

Yongchao Wang, Zhiqiu Huang

Deep learning for vulnerability detection has shown promising results on early benchmarks, but recent evaluations reveal catastrophic degradation: models achieving F1 > 0.68 on legacy datasets collapse to 0.031 under strict deduplication. We identify the root cause as the semantic ambiguity problem: identical code can be secure or vulnerable depending on project-specific behavioral contracts, rendering global classification fundamentally inadequate. We propose Phoenix, a training-free multi-agent framework that resolves this ambiguity through Behavioral Contract Synthesis. Phoenix decomposes detection into three stages: a Semantic Slicer extracting minimal vulnerability-relevant context, a Requirement Reverse Engineer synthesizing Gherkin behavioral specifications encoding the security contract, and a Contract Judge evaluating code against these specifications via strict compliance checking. On PrimeVul Paired, Phoenix achieves F1 = 0.825 and Pair-Correct = 64.4%, surpassing RASM-Vul (F1 = 0.668) and VulTrial (F1 = 0.563) while using open-source models up to 48x smaller (7-14B vs. 671B). Ablation across 25 configurations demonstrates Gherkin specifications as the decisive driver (+0.09 to +0.35 F1). Error analysis reveals 18% of "False Positives" identify genuine security concerns in patched code, demonstrating that security is a relative property defined against behavioral contracts, not an absolute property of code syntax.

NEOct 15, 2019Code
DeepVS: An Efficient and Generic Approach for Source Code Modeling Usage

Yasir Hussain, Zhiqiu Huang, Yu Zhou et al.

The source code suggestions provided by current IDEs are mostly dependent on static type learning. These suggestions often end up proposing irrelevant suggestions for a peculiar context. Recently, deep learning-based approaches have shown great potential in the modeling of source code for various software engineering tasks. However, these techniques lack adequate generalization and resistance to acclimate the use of such models in a real-world software development environment. This letter presents \textit{DeepVS}, an end-to-end deep neural code completion tool that learns from existing codebases by exploiting the bidirectional Gated Recurrent Unit (BiGRU) neural net. The proposed tool is capable of providing source code suggestions instantly in an IDE by using pre-trained BiGRU neural net. The evaluation of this work is two-fold, quantitative and qualitative. Through extensive evaluation on ten real-world open-source software systems, the proposed method shows significant performance enhancement and its practicality. Moreover, the results also suggest that \textit{DeepVS} tool is capable of suggesting zero-day (unseen) code tokens by learning coding patterns from real-world software systems.

SEApr 19
Project Prometheus: Bridging the Intent Gap in Agentic Program Repair via Reverse-Engineered Executable Specifications

Yongchao Wang, Zhiqiu Huang

The transition from neural machine translation to agentic workflows has revolutionized Automated Program Repair (APR). However, existing agents, despite their advanced reasoning capabilities, frequently suffer from the ``Intent Gap'' -- the misalignment between the generated patch and the developer's original intent. Current solutions relying on natural language summaries or adversarial sampling often fail to provide the deterministic constraints required for surgical repairs. In this paper, we introduce \textsc{Prometheus}, a novel framework that bridges this gap by prioritizing \textit{Specification Inference} over code generation. We employ Behavior-Driven Development (BDD) as an executable contract, utilizing a multi-agent architecture to reverse-engineer Gherkin specifications from runtime failure reports. To resolve the ``Hallucination of Intent,'' we propose a \textbf{Requirement Quality Assurance (RQA) Loop}, a mechanism that leverages ground-truth code as a proxy oracle to validate inferred specifications. We evaluated \textsc{Prometheus} on 680 defects from the Defects4J benchmark. The results are transformative: our framework achieved a total correct patch rate of \textbf{93.97\%} (639/680). More significantly, it demonstrated a \textbf{Rescue Rate of 74.4\%}, successfully repairing 119 complex bugs that a strong blind agent failed to resolve. Qualitative analysis reveals that explicit intent guides agents away from structurally invasive over-engineering toward precise, minimal corrections. Our findings suggest that the future of APR lies not in larger models, but in the capability to align code with verified, \textbf{Executable Specifications} -- whether pre-existing or reverse-engineered.

ITApr 21
Explicit Factorization of $x^{p+1}-1$ over $\mathbb{Z}_{p^e}$: A Structural Approach via Dickson Polynomials

Yongchao Wang, Yang Ding, Jiansheng Yang et al.

Let $p$ be an odd prime. The factorization of the polynomial $x^{p+1}-1$ over the integer residue ring $\mathbb{Z}_{p^e}$ is pivotal for constructing cyclic codes with Hermitian symmetry, a critical resource for Linear Complementary Dual (LCD) codes and Entanglement-Assisted Quantum Error-Correcting Codes (EAQECC). Traditionally, lifting factorizations relies on the generic Hensel's Lemma, masking the underlying algebraic structure. In this paper, we establish a structural isomorphism between the lifting process and the roots of a special auxiliary polynomial $V(x)$, unveiling a deterministic link to Dickson polynomials. Based on this theory, we develop \texttt{Dickson-Engine}, a linear-time algorithm ($O(ep)$) that outperforms standard libraries by orders of magnitude. Applying this engine to $\mathbb{Z}_{169}$, we explicitly construct a family of classical LCD codes of length $n=182$ via the isometric Gray map. Our search reveals codes with parameters (e.g., $[182, 1, 168]_{13}$ and $[182, 2, 144]_{13}$) that are \textbf{near-optimal} with respect to the theoretical Griesmer Bound. Notably, we discover a ``robustness plateau'' starting from non-trivial dimensions ($k=4$), where the minimum distance remains stable ($d=120$) even as the dimension triples ($k=4 \rightarrow 12$). These codes provide exceptional resources for post-quantum cryptography and quantum error correction without entanglement consumption ($c=0$).

CVMay 18, 2023
Spatial-Frequency Discriminability for Revealing Adversarial Perturbations

Chao Wang, Shuren Qi, Zhiqiu Huang et al.

The vulnerability of deep neural networks to adversarial perturbations has been widely perceived in the computer vision community. From a security perspective, it poses a critical risk for modern vision systems, e.g., the popular Deep Learning as a Service (DLaaS) frameworks. For protecting deep models while not modifying them, current algorithms typically detect adversarial patterns through discriminative decomposition for natural and adversarial data. However, these decompositions are either biased towards frequency resolution or spatial resolution, thus failing to capture adversarial patterns comprehensively. Also, when the detector relies on few fixed features, it is practical for an adversary to fool the model while evading the detector (i.e., defense-aware attack). Motivated by such facts, we propose a discriminative detector relying on a spatial-frequency Krawtchouk decomposition. It expands the above works from two aspects: 1) the introduced Krawtchouk basis provides better spatial-frequency discriminability, capturing the differences between natural and adversarial data comprehensively in both spatial and frequency distributions, w.r.t. the common trigonometric or wavelet basis; 2) the extensive features formed by the Krawtchouk decomposition allows for adaptive feature selection and secrecy mechanism, significantly increasing the difficulty of the defense-aware attack, w.r.t. the detector with few fixed features. Theoretical and numerical analyses demonstrate the uniqueness and usefulness of our detector, exhibiting competitive scores on several deep models and image sets against a variety of adversarial attacks.

CRMay 27, 2021
SDN-based Runtime Security Enforcement Approach for Privacy Preservation of Dynamic Web Service Composition

Yunfei Meng, Zhiqiu Huang, Guohua Shen et al.

Aiming at the privacy preservation of dynamic Web service composition, this paper proposes a SDN-based runtime security enforcement approach for privacy preservation of dynamic Web service composition. The main idea of this approach is that the owner of service composition leverages the security policy model (SPM) to define the access control relationships that service composition must comply with in the application plane, then SPM model is transformed into the low-level security policy model (RSPM) containing the information of SDN data plane, and RSPM model is uploaded into the SDN controller. After uploading, the virtual machine access control algorithm integrated in the SDN controller monitors all of access requests towards service composition at runtime. Only the access requests that meet the definition of RSPM model can be forwarded to the target terminal. Any access requests that do not meet the definition of RSPM model will be automatically blocked by Openflow switches or deleted by SDN controller, Thus, this approach can effectively solve the problems of network-layer illegal accesses, identity theft attacks and service leakages when Web service composition is running. In order to verify the feasibility of this approach, this paper implements an experimental system by using POX controller and Mininet virtual network simulator, and evaluates the effectiveness and performance of this approach by using this system. The final experimental results show that the method is completely effective, and the method can always get the correct calculation results in an acceptable time when the scale of RSPM model is gradually increasing.

CRMay 27, 2020
A Security Policy Model Transformation and Verification Approach for Software Defined Networking

Yunfei Meng, Zhiqiu Huang, Guohua Shen et al.

Software defined networking (SDN) has been adopted to enforce the security of large-scale and complex networks because of its programmable, abstract, centralized intelligent control and global and real-time traffic view. However, the current SDN-based security enforcement mechanisms require network managers to fully understand the underlying configurations of network. Facing the increasingly complex and huge SDN networks, we urgently need a novel security policy management mechanism which can be completely transparent to any underlying information. That is it can permit network managers to define upper-level security policies without containing any underlying information of network, and by means of model transformation system, these upper-level security policies can be transformed into their corresponding lower-level policies containing underlying information automatically. Moreover, it should ensure system model updated by the generated lower-level policies can hold all of security properties defined in upper-level policies. Based on these insights, we propose a security policy model transformation and verification approach for SDN in this paper. We first present the formal definition of a security policy model (SPM) which can be used to specify the security policies used in SDN. Then, we propose a model transformation system based on SDN system model and mapping rules, which can enable network managers to convert SPM model into corresponding underlying network configuration policies automatically, i.e., flow table model (FTM). In order to verify SDN system model updated by the generated FTM models can hold the security properties defined in SPM models, we design a security policy verification system based on model checking. Finally, we utilize a comprehensive case to illustrate the feasibility of the proposed approach.

NIMar 15, 2020
SOM-based DDoS Defense Mechanism using SDN for the Internet of Things

Yunfei Meng, Zhiqiu Huang, Senzhang Wang et al.

To effectively tackle the security threats towards the Internet of things, we propose a SOM-based DDoS defense mechanism using software-defined networking (SDN) in this paper. The main idea of the mechanism is to deploy a SDN-based gateway to protect the device services in the Internet of things. The gateway provides DDoS defense mechanism based on SOM neural network. By means of SOM-based DDoS defense mechanism, the gateway can effectively identify the malicious sensing devices in the IoT, and automatically block those malicious devices after detecting them, so that it can effectively enforce the security and robustness of the system when it is under DDoS attacks. In order to validate the feasibility and effectiveness of the mechanism, we leverage POX controller and Mininet emulator to implement an experimental system, and further implement the aforementioned security enforcement mechanisms with Python. The final experimental results illustrate that the mechanism is truly effective under the different test scenarios.

SEFeb 4, 2020
Boosting API Recommendation with Implicit Feedback

Yu Zhou, Xinying Yang, Taolue Chen et al.

Developers often need to use appropriate APIs to program efficiently, but it is usually a difficult task to identify the exact one they need from a vast of candidates. To ease the burden, a multitude of API recommendation approaches have been proposed. However, most of the currently available API recommenders do not support the effective integration of users' feedback into the recommendation loop. In this paper, we propose a framework, BRAID (Boosting RecommendAtion with Implicit FeeDback), which leverages learning-to-rank and active learning techniques to boost recommendation performance. By exploiting users' feedback information, we train a learning-to-rank model to re-rank the recommendation results. In addition, we speed up the feedback learning process with active learning. Existing query-based API recommendation approaches can be plugged into BRAID. We select three state-of-the-art API recommendation approaches as baselines to demonstrate the performance enhancement of BRAID measured by Hit@k (Top-k), MAP, and MRR. Empirical experiments show that, with acceptable overheads, the recommendation performance improves steadily and substantially with the increasing percentage of feedback data, comparing with the baselines.

LGOct 12, 2019
Deep Transfer Learning for Source Code Modeling

Yasir Hussain, Zhiqiu Huang, Yu Zhou et al.

In recent years, deep learning models have shown great potential in source code modeling and analysis. Generally, deep learning-based approaches are problem-specific and data-hungry. A challenging issue of these approaches is that they require training from starch for a different related problem. In this work, we propose a transfer learning-based approach that significantly improves the performance of deep learning-based source code models. In contrast to traditional learning paradigms, transfer learning can transfer the knowledge learned in solving one problem into another related problem. First, we present two recurrent neural network-based models RNN and GRU for the purpose of transfer learning in the domain of source code modeling. Next, via transfer learning, these pre-trained (RNN and GRU) models are used as feature extractors. Then, these extracted features are combined into attention learner for different downstream tasks. The attention learner leverages from the learned knowledge of pre-trained models and fine-tunes them for a specific downstream task. We evaluate the performance of the proposed approach with extensive experiments with the source code suggestion task. The results indicate that the proposed approach outperforms the state-of-the-art models in terms of accuracy, precision, recall, and F-measure without training the models from scratch.

CRAug 23, 2019
Behavior-aware Service Access Control Mechanism using Security Policy Monitoring for SOA Systems

Yunfei Meng, Zhiqiu Huang, Senzhang Wang et al.

Service-oriented architecture (SOA) system has been widely utilized at many present business areas. However, SOA system is loosely coupled with multiple services and lacks the relevant security protection mechanisms, thus it can easily be attacked by unauthorized access and information theft. The existed access control mechanism can only prevent unauthorized users from accessing the system, but they can not prevent those authorized users (insiders) from attacking the system. To address this problem, we propose a behavior-aware service access control mechanism using security policy monitoring for SOA system. In our mechanism, a monitor program can supervise consumer's behaviors in run time. By means of trustful behavior model (TBM), if finding the consumer's behavior is of misusing, the monitor will deny its request. If finding the consumer's behavior is of malicious, the monitor will early terminate the consumer's access authorizations in this session or add the consumer into the Blacklist, whereby the consumer will not access the system from then on. In order to evaluate the feasibility of proposed mechanism, we implement a prototype system. The final results illustrate that our mechanism can effectively monitor consumer's behaviors and make effective responses when malicious behaviors really occur in run time. Moreover, as increasing the rule's number in TBM continuously, our mechanism can still work well.

SEMar 3, 2019
User Review-Based Change File Localization for Mobile Applications

Yu Zhou, Yanqi Su, Taolue Chen et al.

In the current mobile app development, novel and emerging DevOps practices (e.g., Continuous Delivery, Integration, and user feedback analysis) and tools are becoming more widespread. For instance, the integration of user feedback (provided in the form of user reviews) in the software release cycle represents a valuable asset for the maintenance and evolution of mobile apps. To fully make use of these assets, it is highly desirable for developers to establish semantic links between the user reviews and the software artefacts to be changed (e.g., source code and documentation), and thus to localize the potential files to change for addressing the user feedback. In this paper, we propose RISING (Review Integration via claSsification, clusterIng, and linkiNG), an automated approach to support the continuous integration of user feedback via classification, clustering, and linking of user reviews. RISING leverages domain-specific constraint information and semi-supervised learning to group user reviews into multiple fine-grained clusters concerning similar users' requests. Then, by combining the textual information from both commit messages and source code, it automatically localizes potential change files to accommodate the users' requests. Our empirical studies demonstrate that the proposed approach outperforms the state-of-the-art baseline work in terms of clustering and localization accuracy, and thus produces more reliable results.

NEMar 3, 2019
CodeGRU: Context-aware Deep Learning with Gated Recurrent Unit for Source Code Modeling

Yasir Hussain, Zhiqiu Huang, Yu Zhou et al.

Recently deep learning based Natural Language Processing (NLP) models have shown great potential in the modeling of source code. However, a major limitation of these approaches is that they take source code as simple tokens of text and ignore its contextual, syntactical and structural dependencies. In this work, we present CodeGRU, a gated recurrent unit based source code language model that is capable of capturing source code's contextual, syntactical and structural dependencies. We introduce a novel approach which can capture the source code context by leveraging the source code token types. Further, we adopt a novel approach which can learn variable size context by taking into account source code's syntax, and structural information. We evaluate CodeGRU with real-world data set and it shows that CodeGRU outperforms the state-of-the-art language models and help reduce the vocabulary size up to 24.93\%. Unlike previous works, we tested CodeGRU with an independent test set which suggests that our methodology does not requisite the source code comes from the same domain as training data while providing suggestions. We further evaluate CodeGRU with two software engineering applications: source code suggestion, and source code completion. Our experiment confirms that the source code's contextual information can be vital and can help improve the software language models. The extensive evaluation of CodeGRU shows that it outperforms the state-of-the-art models. The results further suggest that the proposed approach can help reduce the vocabulary size and is of practical use for software developers.

CRSep 19, 2016
Idology and Its Applications in Public Security and Network Security

Shenghui Su, Jianhua Zheng, Shuwang Lu et al.

Fraud (swindling money, property, or authority by fictionizing, counterfeiting, forging, or imitating things, or by feigning other persons privately) forms its threats against public security and network security. Anti-fraud is essentially the identification of a person or thing. In this paper, the authors first propose the concept of idology - a systematic and scientific study of identifications of persons and things, and give the definitions of a symmetric identity and an asymmetric identity. Discuss the converting symmetric identities (e.g., fingerprints) to asymmetric identities. Make a comparison between a symmetric identity and an asymmetric identity, and emphasize that symmetric identities cannot guard against inside jobs. Compare asymmetric RFIDs with BFIDs, and point out that a BFID is lightweight, economical, convenient, and environmentalistic, and more suitable for the anti-counterfeiting and source tracing of consumable merchandise such as foods, drugs, and cosmetics. The authors design the structure of a united verification platform for BFIDs and the composition of an identification system, and discuss the wide applications of BFIDs in public security and network security - antiterrorism and dynamic passwords for example.