CVSep 24, 2023Code
Causal-DFQ: Causality Guided Data-free Network QuantizationYuzhang Shang, Bingxin Xu, Gaowen Liu et al.
Model quantization, which aims to compress deep neural networks and accelerate inference speed, has greatly facilitated the development of cumbersome models on mobile and edge devices. There is a common assumption in quantization methods from prior works that training data is available. In practice, however, this assumption cannot always be fulfilled due to reasons of privacy and security, rendering these methods inapplicable in real-life situations. Thus, data-free network quantization has recently received significant attention in neural network compression. Causal reasoning provides an intuitive way to model causal relationships to eliminate data-driven correlations, making causality an essential component of analyzing data-free problems. However, causal formulations of data-free quantization are inadequate in the literature. To bridge this gap, we construct a causal graph to model the data generation and discrepancy reduction between the pre-trained and quantized models. Inspired by the causal understanding, we propose the Causality-guided Data-free Network Quantization method, Causal-DFQ, to eliminate the reliance on data via approaching an equilibrium of causality-driven intervened distributions. Specifically, we design a content-style-decoupled generator, synthesizing images conditioned on the relevant and irrelevant factors; then we propose a discrepancy reduction loss to align the intervened distributions of the pre-trained and quantized models. It is worth noting that our work is the first attempt towards introducing causality to data-free quantization problem. Extensive experiments demonstrate the efficacy of Causal-DFQ. The code is available at https://github.com/42Shawn/Causal-DFQ.
CVSep 6, 2023Code
Fast and Resource-Efficient Object Tracking on Edge Devices: A Measurement StudySanjana Vijay Ganesh, Yanzhao Wu, Gaowen Liu et al.
Object tracking is an important functionality of edge video analytic systems and services. Multi-object tracking (MOT) detects the moving objects and tracks their locations frame by frame as real scenes are being captured into a video. However, it is well known that real time object tracking on the edge poses critical technical challenges, especially with edge devices of heterogeneous computing resources. This paper examines the performance issues and edge-specific optimization opportunities for object tracking. We will show that even the well trained and optimized MOT model may still suffer from random frame dropping problems when edge devices have insufficient computation resources. We present several edge specific performance optimization strategies, collectively coined as EMO, to speed up the real time object tracking, ranging from window-based optimization to similarity based optimization. Extensive experiments on popular MOT benchmarks demonstrate that our EMO approach is competitive with respect to the representative methods for on-device object tracking techniques in terms of run-time performance and tracking accuracy. EMO is released on Github at https://github.com/git-disl/EMO.
LGJan 15, 2023
Adaptive Deep Neural Network Inference Optimization with EENetFatih Ilhan, Ka-Ho Chow, Sihao Hu et al. · gatech
Well-trained deep neural networks (DNNs) treat all test samples equally during prediction. Adaptive DNN inference with early exiting leverages the observation that some test examples can be easier to predict than others. This paper presents EENet, a novel early-exiting scheduling framework for multi-exit DNN models. Instead of having every sample go through all DNN layers during prediction, EENet learns an early exit scheduler, which can intelligently terminate the inference earlier for certain predictions, which the model has high confidence of early exit. As opposed to previous early-exiting solutions with heuristics-based methods, our EENet framework optimizes an early-exiting policy to maximize model accuracy while satisfying the given per-sample average inference budget. Extensive experiments are conducted on four computer vision datasets (CIFAR-10, CIFAR-100, ImageNet, Cityscapes) and two NLP datasets (SST-2, AgNews). The results demonstrate that the adaptive inference by EENet can outperform the representative existing early exit techniques. We also perform a detailed visualization analysis of the comparison results to interpret the benefits of EENet.
18.0CRMay 5
Quantum-Resistant Networks: A Review of Primitives, Protocols and Best PracticesElisa Bertino, Ramana Kompella, Ashish Kundu et al.
Large-scale quantum computers threaten the public-key cryptographic foundations underpinning today's network security infrastructures. While significant progress has been made in standardizing post-quantum cryptographic (PQC) primitives and adapting individual protocols such as TLS and SSH, far less attention has been paid to the broader architectural consequences of the post-quantum transition for networked systems. In particular, many real-world deployments such as mobile networks, industrial control systems, IoT environments, and regulated infrastructures cannot assume the universal availability, deployability, or desirability of PQ public-key infrastructures. This paper presents the first comprehensive systematization of PQ-resistant network architectures, focusing on key distribution and management as a system-level design problem rather than a protocol-local substitution. We introduce a unified taxonomy spanning cryptographic foundations (symmetric-only, PQ-PKI, hybrid, and information-theoretic multi-path), key-distribution architectures (centralized, hierarchical, replicated, threshold, MPC-backed, and serverless), trust and threat models, key-management lifecycle, and deployment environments. Using this framework, we analyze the security, scalability, and operational trade-offs of a wide range of architectures under realistic PQ adversary assumptions, including harvest-now, decrypt-later attacks and partial infrastructure compromise. Our study highlights fundamental gaps in existing approaches, clarifies when PQ-PKI is necessary or avoidable, and identifies promising research directions for building cryptographically agile, quantum-resilient network infrastructures.
CRJun 14, 2022
Edge Security: Challenges and IssuesXin Jin, Charalampos Katsis, Fan Sang et al.
Edge computing is a paradigm that shifts data processing services to the network edge, where data are generated. While such an architecture provides faster processing and response, among other benefits, it also raises critical security issues and challenges that must be addressed. This paper discusses the security threats and vulnerabilities emerging from the edge network architecture spanning from the hardware layer to the system layer. We further discuss privacy and regulatory compliance challenges in such networks. Finally, we argue the need for a holistic approach to analyze edge network security posture, which must consider knowledge from each layer.
90.8QUANT-PHMar 29
Asynchronous Routing for Multipartite Entanglement in Quantum NetworksChenliang Tian, Zebo Yang, Raj Jain et al.
In quantum networks, one way to communicate is to distribute entanglements through swapping at intermediate nodes. Most existing work primarily aims to create efficient two-party end-to-end entanglement over long distances. However, some scenarios also require remote multipartite entanglement for applications such as quantum secret sharing and multi-party computation. Our previous study improved end-to-end entanglement rates using an asynchronous, tree-based routing scheme that relies solely on local knowledge of entanglement links, conserving unused entanglement and avoiding synchronous operations. This article extends this approach to multipartite entanglements, particularly the three-party Greenberger-Horne-Zeilinger (GHZ) states. It shows that our asynchronous protocol outperforms traditional synchronous methods in entanglement rates, especially as coherence times increase. This approach can also be extended to four-party and larger multipartite GHZ states, highlighting the effectiveness and adaptability of asynchronous routing for multipartite scenarios across various network topologies.
47.8MAApr 3
Scaling Multi-agent Systems: A Smart Middleware for Improving Agent InteractionsCharles Fleming, Ramana Kompella, Peter Bosch et al.
As Large Language Model (LLM) based Multi-Agent Systems (MAS) evolve from experimental pilots to complex, persistent ecosystems, the limitations of direct agent-to-agent communication have become increasingly apparent. Current architectures suffer from fragmented context, stochastic hallucinations, rigid security boundaries, and inefficient topology management. This paper introduces Cognitive Fabric Nodes (CFN), a novel middleware layer that creates an omnipresent "Cognitive Fabric" between agents. Unlike traditional message queues or service meshes, CFNs are not merely pass-through mechanisms; they are active, intelligent intermediaries. Central to this architecture is the elevation of Memory from simple storage to an active functional substrate that informs four other critical capabilities: Topology Selection, Semantic Grounding, Security Policy Enforcement, and Prompt Transformation. We propose that each of these functions be governed by learning modules utilizing Reinforcement Learning (RL) and optimization algorithms to improve system performance dynamically. By intercepting, analyzing, and rewriting inter-agent communication, the Cognitive Fabric ensures that individual agents remain lightweight while the ecosystem achieves coherence, safety, and semantic alignment. We evaluate the effectiveness of the CFN on the HotPotQA and MuSiQue datasets in a multi-agent environment and demonstrate that the CFN improves performance by more than 10\% on both datasets over direct agent to agent communication.
79.2QUANT-PHMar 29
RADAR-Q: Resource-Aware Distributed Asynchronous Routing for Entanglement Distribution in Multi-Tenant Quantum NetworksChenliang Tian, Zebo Yang, Raj Jain et al.
Scalable quantum networks must support concurrent entanglement requests, yet existing routing protocols fail when users compete for shared repeater resources, wasting fragile quantum states. This paper presents RADAR-Q, a resource-aware decentralized routing protocol embedding real-time resource contention into path selection. Unlike prior designs requiring global coordination or central anchors, RADAR-Q makes intelligent local decisions balancing path length and fidelity, instantaneous quantum memory availability, and intermediate Bell-State Measurement (BSM) operations. By identifying the Nearest Common Ancestor (NCA) within a DODAG hierarchy, RADAR-Q localizes entanglement swapping close to communicating users - avoiding unnecessary central detours and reducing BSM chain length and decoherence exposure. We evaluate RADAR-Q on grid and random topologies against synchronous and root-centric asynchronous baselines. Results show RADAR-Q achieves aggregate throughputs 2.5x and 7.6x higher than synchronized and root-centric designs, respectively. While baselines suffer catastrophic fidelity collapse below the 0.5 threshold under high load, RADAR-Q consistently maintains end-to-end fidelity above 0.76, ensuring pairs remain usable. Furthermore, RADAR-Q exhibits near-perfect fairness (Jain's Fairness Index 96-98%) and retains over 50% of its ideal throughput under stringent 1.0 ms coherence times. These findings establish contention-aware decentralized routing as a scalable foundation for multi-tenant quantum networks.
25.3NIMar 30
Study of Post Quantum status of Widely Used ProtocolsTushin Mallick, Ashish Kundu, Ramana Kompella
The advent of quantum computing poses significant threats to classical public-key cryptographic primitives such as RSA and elliptic-curve cryptography. As many critical network and security protocols depend on these primitives for key exchange and authentication, there is an urgent need to understand their quantum vulnerability and assess the progress made towards integrating post-quantum cryptography (PQC). This survey provides a detailed examination of nine widely deployed protocols - TLS, IPsec, BGP, DNSSEC, SSH, QUIC, OpenID Connect, OpenVPN, and Signal Protocol - analysing their cryptographic foundations, quantum risks, and the current state of PQC migration. We find that TLS and Signal lead the transition with hybrid post-quantum key exchange already deployed at scale, while IPsec and SSH have standardised mechanisms but lack widespread production adoption. DNSSEC and BGP face the most significant structural barriers, as post-quantum signature sizes conflict with fundamental protocol constraints. Across all protocols, key exchange proves consistently easier to migrate than authentication, and protocol-level limitations such as message size and fragmentation often dominate over raw algorithm performance. We also discuss experimental deployments and emerging standards that are shaping the path towards a quantum-resistant communication infrastructure.
89.8LGMar 19
Context Bootstrapped Reinforcement LearningSaaket Agashe, Jayanth Srinivasa, Gaowen Liu et al.
Reinforcement Learning from Verifiable Rewards (RLVR) suffers from exploration inefficiency, where models struggle to generate successful rollouts, resulting in minimal learning signal. This challenge is particularly severe for tasks that require the acquisition of novel reasoning patterns or domain-specific knowledge. To address this, we propose Context Bootstrapped Reinforcement Learning (CBRL), which augments RLVR training by stochastically prepending few-shot demonstrations to training prompts. The injection probability follows a curriculum that starts high to bootstrap early exploration, then anneals to zero so the model must ultimately succeed without assistance. This forces the policy to internalize reasoning patterns from the demonstrations rather than relying on them at test time. We validate CBRL across two model families and five Reasoning Gym tasks. Our results demonstrate that CBRL consistently improves success rate, provides better exploration efficiency, and is algorithm-agnostic. We further demonstrate CBRL's practical applicability on Q, a domain-specific programming language that diverges significantly from mainstream language conventions.
23.8CRMay 20
Onion-Routed Multi-Circuit Key Establishment for Quantum-Resilient SessionsTushin Mallick, Ashish Kundu, Ramana Kompella
Public-key primitives that today anchor session-key establishment - RSA, Diffie-Hellman, and elliptic-curve cryptography - reduce to integer factorization or discrete logarithm and are therefore vulnerable to Shor's algorithm on a sufficiently capable quantum computer. The harvest-now, decrypt-later (HNDL) threat model turns this future capability into a present liability: ciphertext archived today can be decrypted retrospectively once a cryptographically relevant quantum computer becomes available. We propose a session-key establishment scheme that distributes a freshly generated key as multiple, independently encrypted fragments across distinct, ephemeral Tor circuits between an onion-service proxy and an onion-service client. Reconstruction requires every fragment; each fragment travels its own per-bundle circuit established via a NEWNYM signal. The security argument rests on the standard end-to-end correlation bound for onion routing: an adversary controlling a fraction of Tor relays must independently deanonymize every fresh circuit to correlate the fragments belonging to one session, and the per-fragment probability of success decays multiplicatively in the number of fragments. We implement the design as a Flask-based prototype on AWS EC2, with both the proxy and the client deployed as Tor onion services, and measure end-to-end key-establishment latency. The implemented prototype completes a key establishment in 13-20 s on average (7-50 s including tails), of which approximately 88% is attributable to Tor-related delay - a cost we discuss in the context of the privacy-versus-responsiveness trade-off.
CLJan 12, 2024
Large Language Models Can Learn Temporal ReasoningSiheng Xiong, Ali Payani, Ramana Kompella et al.
While large language models (LLMs) have demonstrated remarkable reasoning capabilities, they are not without their flaws and inaccuracies. Recent studies have introduced various methods to mitigate these limitations. Temporal reasoning (TR), in particular, presents a significant challenge for LLMs due to its reliance on diverse temporal concepts and intricate temporal logic. In this paper, we propose TG-LLM, a novel framework towards language-based TR. Instead of reasoning over the original context, we adopt a latent representation, temporal graph (TG) that enhances the learning of TR. A synthetic dataset (TGQA), which is fully controllable and requires minimal supervision, is constructed for fine-tuning LLMs on this text-to-TG translation task. We confirmed in experiments that the capability of TG translation learned on our dataset can be transferred to other TR tasks and benchmarks. On top of that, we teach LLM to perform deliberate reasoning over the TGs via Chain-of-Thought (CoT) bootstrapping and graph data augmentation. We observed that those strategies, which maintain a balance between usefulness and diversity, bring more reliable CoTs and final results than the vanilla CoT distillation.
57.0CRMay 7
Aquaman: A Transparent Proxy Architecture for Quantum Resilient Key EstablishmentTushin Mallick, Ashish Kundu, Ramana Kompella
The harvest-now, decrypt-later (HNDL) threat--adversaries intercepting and archiving ciphertext today for retrospective decryption once quantum computers mature--turns the future quantum threat into a present liability for the public-key primitives (RSA, Diffie-Hellman, ECC) that anchor modern session-key exchange. We present Aquaman, a transparent-proxy architecture for quantum-resilient session-key establishment. A transparent proxy intercepts session-key requests at the edge of a trusted network without requiring client-side configuration, deploying quantum-resistant capability at the network boundary on behalf of clients that may themselves lack post-quantum cryptography (PQC). Aquaman supports four operating modes: PQC offloaded to the proxy for clients without trusted PQC stacks; classical multi-path key fragmentation over heterogeneous media (with an optional anonymous proxy-pool variant); QKD with the SKIP/ETSI GS QKD 014 key-delivery interface; and classical/PQC hybrid handshakes. We implement and evaluate the first two modes; the latter two are well-trodden in the PQC literature and we discuss but do not implement them. The implemented multi-path mode splits the session key into ciphertext fragments distributed across diverse media (Wi-Fi, Bluetooth, NFC, cellular, Ethernet); reconstruction requires all fragments. We formalize the security argument and prove that recovery probability decays as (B/d)^n in the diversity dimension. A 1,000-run prototype evaluation on AWS EC2 shows that latency is dominated by network transmission, not by multi-path overhead.
CVMar 31, 2024
Training-Free Semantic Segmentation via LLM-SupervisionWenfang Sun, Yingjun Du, Gaowen Liu et al.
Recent advancements in open vocabulary models, like CLIP, have notably advanced zero-shot classification and segmentation by utilizing natural language for class-specific embeddings. However, most research has focused on improving model accuracy through prompt engineering, prompt learning, or fine-tuning with limited labeled data, thereby overlooking the importance of refining the class descriptors. This paper introduces a new approach to text-supervised semantic segmentation using supervision by a large language model (LLM) that does not require extra training. Our method starts from an LLM, like GPT-3, to generate a detailed set of subclasses for more accurate class representation. We then employ an advanced text-supervised semantic segmentation model to apply the generated subclasses as target labels, resulting in diverse segmentation results tailored to each subclass's unique characteristics. Additionally, we propose an assembly that merges the segmentation maps from the various subclass descriptors to ensure a more comprehensive representation of the different aspects in the test images. Through comprehensive experiments on three standard benchmarks, our method outperforms traditional text-supervised semantic segmentation methods by a marked margin.
DBMar 22, 2025
A Generative Caching System for Large Language ModelsArun Iyengar, Ashish Kundu, Ramana Kompella et al.
Caching has the potential to be of significant benefit for accessing large language models (LLMs) due to their high latencies which typically range from a small number of seconds to well over a minute. Furthermore, many LLMs charge money for queries; caching thus has a clear monetary benefit. This paper presents a new caching system for improving user experiences with LLMs. In addition to reducing both latencies and monetary costs for accessing LLMs, our system also provides important features that go beyond the performance benefits typically associated with caches. A key feature we provide is generative caching, wherein multiple cached responses can be synthesized to provide answers to queries which have never been seen before. Our generative caches function as repositories of valuable information which can be mined and analyzed. We also improve upon past semantic caching techniques by tailoring the caching algorithms to optimally balance cost and latency reduction with the quality of responses provided. Performance tests indicate that our caches are considerably faster than GPTcache.
LGFeb 12, 2025
A First-order Generative Bilevel Optimization Framework for Diffusion ModelsQuan Xiao, Hui Yuan, A F M Saif et al.
Diffusion models, which iteratively denoise data samples to synthesize high-quality outputs, have achieved empirical success across domains. However, optimizing these models for downstream tasks often involves nested bilevel structures, such as tuning hyperparameters for fine-tuning tasks or noise schedules in training dynamics, where traditional bilevel methods fail due to the infinite-dimensional probability space and prohibitive sampling costs. We formalize this challenge as a generative bilevel optimization problem and address two key scenarios: (1) fine-tuning pre-trained models via an inference-only lower-level solver paired with a sample-efficient gradient estimator for the upper level, and (2) training diffusion model from scratch with noise schedule optimization by reparameterizing the lower-level problem and designing a computationally tractable gradient estimator. Our first-order bilevel framework overcomes the incompatibility of conventional bilevel methods with diffusion processes, offering theoretical grounding and computational practicality. Experiments demonstrate that our method outperforms existing fine-tuning and hyperparameter search baselines.
61.8CRApr 9
Post-Quantum Cryptographic Analysis of Message Transformations Across the Network StackAshish Kundu, Vishal Chakraborty, Ramana Kompella
When a user sends a message over a wireless network, the message does not travel as-is. It is encrypted, authenticated, encapsulated, and transformed as it descends the protocol stack from the application layer to the physical medium. Each layer may apply its own cryptographic operations using its own algorithms, and these algorithms differ in their vulnerability to quantum computers. The security of the overall communication depends not on any single layer but on the \emph{composition} of transformations across all layers. We develop a preliminary formal framework for analyzing these cross-layer cryptographic transformations with respect to post-quantum cryptographic (PQC) readiness. We classify every per-layer cryptographic operation into one of four quantum vulnerability categories, define how per-layer PQC statuses compose across the full message transformation chain, and prove that this composition forms a bounded lattice with confidentiality composing via the join (max) operator and authentication via the meet (min). We apply the framework to five communication scenarios spanning Linux and iOS platforms, and identify several research challenges. Among our findings: WPA2-Personal provides strictly better PQC posture than both WPA3-Personal and WPA2-Enterprise; a single post-quantum layer suffices for payload confidentiality but \emph{every} layer must migrate for complete authentication; and metadata protection depends solely on the outermost layer.
53.6CRApr 3
The Quantum-Cryptographic Co-evolutionAshish Kundu, Ramana Kompella
As quantum computing matures toward the realization of Cryptographically Relevant Quantum Computers (CRQC), global cryptographic infrastructure faces an existential threat. This paper introduces a two-dimensional coordinate system to map the co-evolution of cryptographic resilience (x-axis) and computational capability (y-axis). By analyzing the four resulting quadrants, we categorize the transition from legacy classical systems to quantum-resilient architectures. We argue that the "Quantum Gap" - the delta between CRQC arrival and quantum-safe adoption represents the highest systemic risk, necessitating an immediate transition to crypto-agile frameworks.
NINov 24, 2025
A Layered Protocol Architecture for the Internet of AgentsCharles Fleming, Luca Muscariello, Vijoy Pandey et al.
Large Language Models (LLMs) have demonstrated remarkable performance improvements and the ability to learn domain-specific languages (DSLs), including APIs and tool interfaces. This capability has enabled the creation of AI agents that can perform preliminary computations and act through tool calling, which is now being standardized via protocols like MCP. However, LLMs face fundamental limitations: their context windows cannot grow indefinitely, restricting their memory and computational capacity. Agent collaboration emerges as essential for solving increasingly complex problems, mirroring how computational systems rely on different types of memory to scale. The "Internet of Agents" (IoA) represents the communication stack that enables agents to scale by distributing computation across collaborating entities. Current network architectural stacks (OSI and TCP/IP) were designed for data delivery between hosts and processes, not for agent collaboration with semantic understanding. To address this gap, we propose two new layers: an Agent Communication Layer (L8) and an Agent Semantic Layer (L9). L8 formalizes the structure of communication, standardizing message envelopes, speech-act performatives (e.g., REQUEST, INFORM), and interaction patterns (e.g., request-reply, publish-subscribe), building on protocols like MCP. The proposed L9 layer: (1) formalizes semantic context discovery and negotiation, (2) provides semantic grounding by binding terms to semantic context, and (3) semantically validates incoming prompts and performs disambiguation as needed. Furthermore, L9 introduces primitives for coordination and consensus, allowing agents to achieve alignment on shared states, collective goals, and distributed beliefs. Together, these layers provide the foundation for scalable, distributed agent collaboration, enabling the next generation of multi-agentic systems.
LGOct 15, 2025
FedHFT: Efficient Federated Finetuning with Heterogeneous Edge ClientsFatih Ilhan, Selim Furkan Tekin, Tiansheng Huang et al. · gatech
Fine-tuning pre-trained large language models (LLMs) has become a common practice for personalized natural language understanding (NLU) applications on downstream tasks and domain-specific datasets. However, there are two main challenges: (i) limited and/or heterogeneous data for fine-tuning due to proprietary data confidentiality or privacy requirements, and (ii) varying computation resources available across participating clients such as edge devices. This paper presents FedHFT - an efficient and personalized federated fine-tuning framework to address both challenges. First, we introduce a mixture of masked adapters to handle resource heterogeneity across participating clients, enabling high-performance collaborative fine-tuning of pre-trained language model(s) across multiple clients in a distributed setting, while keeping proprietary data local. Second, we introduce a bi-level optimization approach to handle non-iid data distribution based on masked personalization and client clustering. Extensive experiments demonstrate significant performance and efficiency improvements over various natural language understanding tasks under data and resource heterogeneity compared to representative heterogeneous federated learning methods.
LGOct 26, 2024
Prompt Diffusion Robustifies Any-Modality Prompt LearningYingjun Du, Gaowen Liu, Yuzhang Shang et al.
Foundation models enable prompt-based classifiers for zero-shot and few-shot learning. Nonetheless, the conventional method of employing fixed prompts suffers from distributional shifts that negatively impact generalizability to unseen samples. This paper introduces prompt diffusion, which uses a diffusion model to gradually refine the prompts to obtain a customized prompt for each sample. Specifically, we first optimize a collection of prompts to obtain over-fitted prompts per sample. Then, we propose a prompt diffusion model within the prompt space, enabling the training of a generative transition process from a random prompt to its overfitted prompt. As we cannot access the label of a test image during inference, our model gradually generates customized prompts solely from random prompts using our trained, prompt diffusion. Our prompt diffusion is generic, flexible, and modality-agnostic, making it a simple plug-and-play module seamlessly embedded into existing prompt learning methods for textual, visual, or multi-modal prompt learning. Our diffusion model uses a fast ODE-based sampling strategy to optimize test sample prompts in just five steps, offering a good trade-off between performance improvement and computational efficiency. For all prompt learning methods tested, adding prompt diffusion yields more robust results for base-to-new generalization, cross-dataset generalization, and domain generalization in classification tasks tested over 15 diverse datasets.
LGMay 17, 2023
Mitigating Group Bias in Federated Learning: Beyond Local FairnessGanghua Wang, Ali Payani, Myungjin Lee et al.
The issue of group fairness in machine learning models, where certain sub-populations or groups are favored over others, has been recognized for some time. While many mitigation strategies have been proposed in centralized learning, many of these methods are not directly applicable in federated learning, where data is privately stored on multiple clients. To address this, many proposals try to mitigate bias at the level of clients before aggregation, which we call locally fair training. However, the effectiveness of these approaches is not well understood. In this work, we investigate the theoretical foundation of locally fair training by studying the relationship between global model fairness and local model fairness. Additionally, we prove that for a broad class of fairness metrics, the global model's fairness can be obtained using only summary statistics from local clients. Based on that, we propose a globally fair training algorithm that directly minimizes the penalized empirical loss. Real-data experiments demonstrate the promising performance of our proposed approach for enhancing fairness while retaining high accuracy compared to locally fair training methods.
DCJul 27, 2021
Parallel Detection for Efficient Video Analytics at the EdgeYanzhao Wu, Ling Liu, Ramana Kompella
Deep Neural Network (DNN) trained object detectors are widely deployed in many mission-critical systems for real time video analytics at the edge, such as autonomous driving and video surveillance. A common performance requirement in these mission-critical edge services is the near real-time latency of online object detection on edge devices. However, even with well-trained DNN object detectors, the online detection quality at edge may deteriorate for a number of reasons, such as limited capacity to run DNN object detection models on heterogeneous edge devices, and detection quality degradation due to random frame dropping when the detection processing rate is significantly slower than the incoming video frame rate. This paper addresses these problems by exploiting multi-model multi-device detection parallelism for fast object detection in edge systems with heterogeneous edge devices. First, we analyze the performance bottleneck of running a well-trained DNN model at edge for real time online object detection. We use the offline detection as a reference model, and examine the root cause by analyzing the mismatch among the incoming video streaming rate, video processing rate for object detection, and output rate for real time detection visualization of video streaming. Second, we study performance optimizations by exploiting multi-model detection parallelism. We show that the model-parallel detection approach can effectively speed up the FPS detection processing rate, minimizing the FPS disparity with the incoming video frame rate on heterogeneous edge devices. We evaluate the proposed approach using SSD300 and YOLOv3 on benchmark videos of different video stream rates. The results show that exploiting multi-model detection parallelism can speed up the online object detection processing rate and deliver near real-time object detection performance for efficient video analytics at edge.
AIFeb 11, 2021
A Metamodel and Framework for Artificial General Intelligence From Theory to PracticeHugo Latapie, Ozkan Kilic, Gaowen Liu et al.
This paper introduces a new metamodel-based knowledge representation that significantly improves autonomous learning and adaptation. While interest in hybrid machine learning / symbolic AI systems leveraging, for example, reasoning and knowledge graphs, is gaining popularity, we find there remains a need for both a clear definition of knowledge and a metamodel to guide the creation and manipulation of knowledge. Some of the benefits of the metamodel we introduce in this paper include a solution to the symbol grounding problem, cumulative learning, and federated learning. We have applied the metamodel to problems ranging from time series analysis, computer vision, and natural language understanding and have found that the metamodel enables a wide variety of learning mechanisms ranging from machine learning, to graph network analysis and learning by reasoning engines to interoperate in a highly synergistic way. Our metamodel-based projects have consistently exhibited unprecedented accuracy, performance, and ability to generalize. This paper is inspired by the state-of-the-art approaches to AGI, recent AGI-aspiring work, the granular computing community, as well as Alfred Korzybski's general semantics. One surprising consequence of the metamodel is that it not only enables a new level of autonomous learning and optimal functioning for machine intelligences, but may also shed light on a path to better understanding how to improve human cognition.