Albert Y. Zomaya

LG
h-index: 92
25 papers
1,417 citations
Novelty: 41%
AI Score: 48

25 Papers

LGSep 23, 2024
SHFL: Secure Hierarchical Federated Learning Framework for Edge Networks

Omid Tavallaie, Kanchana Thilakarathna, Suranga Seneviratne et al.

Federated Learning (FL) is a distributed machine learning paradigm designed for privacy-sensitive applications that run on resource-constrained devices with non-Independent and Identically Distributed (non-IID) data. Traditional FL frameworks adopt the client-server model with a single-level aggregation (AGR) process, where the server builds the global model by aggregating all trained local models received from client devices. However, this conventional approach encounters challenges, including susceptibility to model/data poisoning attacks. In recent years, advancements in the Internet of Things (IoT) and edge computing have enabled the development of hierarchical FL systems with a two-level AGR process running at edge and cloud servers. In this paper, we propose a Secure Hierarchical FL (SHFL) framework to address poisoning attacks in hierarchical edge networks. By aggregating trained models at the edge, SHFL employs two novel methods to address model/data poisoning attacks in the presence of client adversaries: 1) a client selection algorithm running at the edge for choosing IoT devices to participate in training, and 2) a model AGR method designed based on convex optimization theory to reduce the impact of edge models from networks with adversaries in the process of computing the global model (at the cloud level). The evaluation results reveal that compared to state-of-the-art methods, SHFL significantly increases the maximum accuracy achieved by the global model in the presence of client adversaries applying model/data poisoning attacks.

LGOct 26, 2022
Hierarchical Federated Learning with Momentum Acceleration in Multi-Tier Networks

Zhengjie Yang, Sen Fu, Wei Bao et al.

In this paper, we propose Hierarchical Federated Learning with Momentum Acceleration (HierMo), a three-tier worker-edge-cloud federated learning algorithm that applies momentum for training acceleration. Momentum is calculated and aggregated in the three tiers. We provide convergence analysis for HierMo, showing a convergence rate of O(1/T). In the analysis, we develop a new approach to characterize model aggregation, momentum aggregation, and their interactions. Based on this result, we prove that HierMo achieves a tighter convergence upper bound compared with HierFAVG without momentum. We also propose HierOPT, which optimizes the aggregation periods (worker-edge and edge-cloud aggregation periods) to minimize the loss given a limited training time.

35.6DCMar 25
The Evolution of Decentralized Systems: From Gray's Framework to Blockchain and Beyond

Zhongli Dong, Young Choon Lee, Albert Y. Zomaya

Blockchain technology is often discussed as if it emerged from nowhere, yet its architectural DNA traces directly to the decentralized computing principles James N. Gray articulated in 1986. This paper maps the conceptual lineage from Gray's requestor/server model to modern blockchain architectures, showing how his emphasis on modularity, autonomy, data integrity, and standardized communication anticipated the design of systems like Bitcoin and Ethereum, and, more recently, the Web3 movement and Layer-2 scaling architectures. We examine consensus mechanisms, cryptographic foundations, rollup-based Layer-2 protocols, and cross-chain interoperability through this historical lens, identify persistent challenges in scalability and modularity, and outline future directions toward Web4: an intelligent, decentralized internet integrating blockchain, artificial intelligence, and the Internet of Things.

LGAug 16, 2024
RBLA: Rank-Based-LoRA-Aggregation for Fine-tuning Heterogeneous Models in FLaaS

Shuaijun Chen, Omid Tavallaie, Niousha Nazemi et al.

Federated Learning (FL) is a promising privacy-aware distributed learning framework that can be deployed on various devices, such as mobile phones, desktops, and devices equipped with CPUs or GPUs. In the context of server-based Federated Learning as a Service (FLaaS), FL enables a central server to coordinate the training process across multiple devices without direct access to local data, thereby enhancing privacy and data security. Low-Rank Adaptation (LoRA) is a method that efficiently fine-tunes models by focusing on a low-dimensional subspace of the model's parameters. This approach significantly reduces computational and memory costs compared to fine-tuning all parameters from scratch. When integrated with FL, particularly in a FLaaS environment, LoRA allows for flexible and efficient deployment across diverse hardware with varying computational capabilities by adjusting the local model's rank. However, in LoRA-enabled FL, different clients may train models with varying ranks, which poses challenges for model aggregation on the server. Current methods for aggregating models of different ranks involve padding weights to a uniform shape, which can degrade the global model's performance. To address this issue, we propose Rank-Based LoRA Aggregation (RBLA), a novel model aggregation method designed for heterogeneous LoRA structures. RBLA preserves key features across models with different ranks. This paper analyzes the issues with current padding methods used to reshape models for aggregation in a FLaaS environment. Then, we introduce RBLA, a rank-based aggregation method that maintains both low-rank and high-rank features. Finally, we demonstrate the effectiveness of RBLA through comparative experiments with state-of-the-art methods.
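The padding problem described in this abstract can be illustrated with a small sketch. It assumes each client holds one LoRA factor stored as a list of rank rows, and it compares zero-padding followed by plain averaging with an average taken only over the clients that actually own each row. The second function is one plausible reading of rank-based aggregation, not the authors' published algorithm; the function names and toy values are hypothetical.

```python
def pad_and_average(client_factors):
    """Zero-pad every factor to the largest rank, then average over all clients."""
    max_rank = max(len(f) for f in client_factors)
    width = len(client_factors[0][0])
    n = len(client_factors)
    out = [[0.0] * width for _ in range(max_rank)]
    for factor in client_factors:
        for r, row in enumerate(factor):
            for c, value in enumerate(row):
                out[r][c] += value / n  # missing rows contribute implicit zeros
    return out


def rank_aware_average(client_factors):
    """Average each rank row only over clients that trained that row (assumption)."""
    max_rank = max(len(f) for f in client_factors)
    width = len(client_factors[0][0])
    out = []
    for r in range(max_rank):
        owners = [f[r] for f in client_factors if len(f) > r]
        out.append([sum(row[c] for row in owners) / len(owners) for c in range(width)])
    return out


# Toy example: a rank-1 client and a rank-2 client with a two-column factor.
low_rank_client = [[1.0, 1.0]]
high_rank_client = [[1.0, 1.0], [4.0, 4.0]]

padded = pad_and_average([low_rank_client, high_rank_client])
aware = rank_aware_average([low_rank_client, high_rank_client])
```

In this toy case the padded average halves the second row, which only the rank-2 client trained, while the rank-aware average leaves it unchanged. That dilution is the kind of degradation the abstract attributes to padding.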

82.5DCMay 11
HiRL: Hierarchical Reinforcement Learning for Coordinated Resource Management in Heterogeneous Edge Computing

Jianyong Zhu, Hao Chen, Juan Zhang et al.

Edge computing faces unprecedented resource orchestration challenges from multi-dimensional heterogeneity across device architectures, diverse task requirements (CPU-intensive, GPU-intensive, and I/O-intensive), and dynamic network conditions. Edge environments demand real-time task processing within strict energy budgets, yet conventional approaches struggle with mixed continuous-discrete optimization while meeting deadline and energy constraints. This paper presents HiRL, a hierarchical reinforcement learning framework that decomposes complex resource orchestration into coordinated power control and task allocation decisions. Our approach separates continuous power management using the Twin Delayed Deep Deterministic Policy Gradient (TD3) and discrete task placement using Double Deep Q-Network (DDQN), unified through a coordination engine with five-dimensional queue state representation. We propose a heterogeneous assessment of resource compatibility with deadline-oriented prioritization and failure-penalized adaptive sampling to enhance decision quality under resource constraints. To improve practical applicability, the framework models comprehensive system dynamics including device mobility, queue congestion patterns, infrastructure heterogeneity, and priority-sensitive scheduling demands. Experimental results show that HiRL achieves effective latency-energy trade-offs with 28% latency reduction compared to Single-DDQN and maintains nearly 100% task completion rates under all load conditions. Compared to baseline algorithms, HiRL reduces energy consumption by up to 51% under low load while achieving 24% better latency performance than static optimization approaches under high load, establishing effective resource orchestration in heterogeneous edge environments.

DCJul 18, 2025
Edge Intelligence with Spiking Neural Networks

Shuiguang Deng, Di Yu, Changze Lv et al.

The convergence of artificial intelligence and edge computing has spurred growing interest in enabling intelligent services directly on resource-constrained devices. While traditional deep learning models require significant computational resources and centralized data management, the resulting latency, bandwidth consumption, and privacy concerns have exposed critical limitations in cloud-centric paradigms. Brain-inspired computing, particularly Spiking Neural Networks (SNNs), offers a promising alternative by emulating biological neuronal dynamics to achieve low-power, event-driven computation. This survey provides a comprehensive overview of Edge Intelligence based on SNNs (EdgeSNNs), examining their potential to address the challenges of on-device learning, inference, and security in edge scenarios. We present a systematic taxonomy of EdgeSNN foundations, encompassing neuron models, learning algorithms, and supporting hardware platforms. Three representative practical considerations of EdgeSNN are discussed in depth: on-device inference using lightweight SNN models, resource-aware training and updating under non-stationary data conditions, and secure and privacy-preserving issues. Furthermore, we highlight the limitations of evaluating EdgeSNNs on conventional hardware and introduce a dual-track benchmarking strategy to support fair comparisons and hardware-aware optimization. Through this study, we aim to bridge the gap between brain-inspired learning and practical edge deployment, offering insights into current advancements, open challenges, and future research directions. To the best of our knowledge, this is the first dedicated and comprehensive survey on EdgeSNNs, providing an essential reference for researchers and practitioners working at the intersection of neuromorphic computing and edge intelligence.

LGDec 20, 2024
AutoRank: MCDA Based Rank Personalization for LoRA-Enabled Distributed Learning

Shuaijun Chen, Omid Tavallaie, Niousha Nazemi et al.

As data volumes expand rapidly, distributed machine learning has become essential for addressing the growing computational demands of modern AI systems. However, training models in distributed environments is challenging when participants hold skewed, non-independent and identically distributed (non-IID) data. Low-Rank Adaptation (LoRA) offers a promising solution to this problem: by personalizing low-rank updates rather than optimizing the entire model, LoRA-enabled distributed learning minimizes computational cost and maximizes personalization for each participant, enabling more robust and efficient training in distributed learning settings, especially in large-scale, heterogeneous systems. Despite the strengths of current state-of-the-art methods, they often require manual configuration of the initial rank, which is increasingly impractical as the number of participants grows. This manual tuning is not only time-consuming but also prone to suboptimal configurations. To address this limitation, we propose AutoRank, an adaptive rank-setting algorithm inspired by the bias-variance trade-off. AutoRank leverages the MCDA method TOPSIS to dynamically assign local ranks based on the complexity of each participant's data. By evaluating data distribution and complexity through our proposed data complexity metrics, AutoRank provides fine-grained adjustments to the rank of each participant's local LoRA model. This adaptive approach effectively mitigates the challenges of double-imbalanced, non-IID data. Experimental results demonstrate that AutoRank significantly reduces computational overhead, enhances model performance, and accelerates convergence in highly heterogeneous federated learning environments. Through its strong adaptability, AutoRank offers a scalable and flexible solution for distributed machine learning.
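TOPSIS, the MCDA method this abstract names, scores each alternative by its closeness to an ideal solution. The sketch below implements the standard TOPSIS steps and then maps each score to a LoRA rank. The two criteria (label entropy and sample count), the equal weights, the rank range, and the linear score-to-rank mapping are illustrative assumptions; the abstract does not specify AutoRank's actual complexity metrics or mapping.

```python
import math


def topsis(matrix, weights, benefit):
    """Return the TOPSIS closeness score in [0, 1] for each row (alternative).

    matrix  : one row per alternative, one column per criterion
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, else False
    Assumes no criterion column is all zeros.
    """
    n_crit = len(weights)
    # Step 1 and 2: vector-normalize each column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Step 3: ideal best and ideal worst point per criterion.
    cols = [[row[j] for row in v] for j in range(n_crit)]
    best = [max(col) if benefit[j] else min(col) for j, col in enumerate(cols)]
    worst = [min(col) if benefit[j] else max(col) for j, col in enumerate(cols)]
    # Step 4 and 5: Euclidean distances and relative closeness.
    scores = []
    for row in v:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        total = d_best + d_worst
        scores.append(d_worst / total if total > 0 else 0.5)
    return scores


def score_to_rank(score, min_rank=2, max_rank=16):
    """Hypothetical linear mapping from closeness score to a LoRA rank."""
    return min_rank + round(score * (max_rank - min_rank))


# Hypothetical per-participant metrics: [label entropy, number of samples].
participants = [[0.2, 100], [0.9, 500], [0.5, 300]]
scores = topsis(participants, weights=[0.5, 0.5], benefit=[True, True])
ranks = [score_to_rank(s) for s in scores]
```

Under these assumptions the participant with the most complex data receives the largest rank and the one with the simplest data the smallest, which matches the bias-variance intuition mentioned in the abstract.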

AIDec 15, 2023
CGS-Mask: Making Time Series Predictions Intuitive for All

Feng Lu, Wei Li, Yifei Sun et al.

Artificial intelligence (AI) has immense potential in time series prediction, but most explainable tools have limited capabilities in providing a systematic understanding of important features over time. These tools typically rely on evaluating a single time point, overlook the time ordering of inputs, and neglect the time-sensitive nature of time series applications. These factors make it difficult for users, particularly those without domain knowledge, to comprehend AI model decisions and obtain meaningful explanations. We propose CGS-Mask, a post-hoc and model-agnostic cellular genetic strip mask-based saliency approach to address these challenges. CGS-Mask uses consecutive time steps as a cohesive entity to evaluate the impact of features on the final prediction, providing binary and sustained feature importance scores over time. Our algorithm optimizes the mask population iteratively to obtain the optimal mask in a reasonable time. We evaluated CGS-Mask on synthetic and real-world datasets, and it outperformed state-of-the-art methods in elucidating the importance of features over time. According to our pilot user study via a questionnaire survey, CGS-Mask is the most effective approach in presenting easily understandable time series prediction results, enabling users to comprehend the decision-making process of AI models with ease.

LGApr 11, 2025
Personalizing Federated Learning for Hierarchical Edge Networks with Non-IID Data

Seunghyun Lee, Omid Tavallaie, Shuaijun Chen et al.

Accommodating edge networks between IoT devices and the cloud server in Hierarchical Federated Learning (HFL) enhances communication efficiency without compromising data privacy. However, devices connected to the same edge often share geographic or contextual similarities, leading to varying edge-level data heterogeneity with different subsets of labels per edge, on top of device-level heterogeneity. This hierarchical non-Independent and Identically Distributed (non-IID) nature, which implies that each edge has its own optimization goal, has been overlooked in HFL research. Therefore, existing edge-accommodated HFL demonstrates inconsistent performance across edges in various hierarchical non-IID scenarios. To ensure robust performance with diverse edge-level non-IID data, we propose a Personalized Hierarchical Edge-enabled Federated Learning (PHE-FL), which personalizes each edge model to perform well on the unique class distributions specific to each edge. We evaluated PHE-FL across 4 scenarios with varying levels of edge-level non-IIDness, with extreme IoT device level non-IIDness. To accurately assess the effectiveness of our personalization approach, we deployed test sets on each edge server instead of the cloud server, and used both balanced and imbalanced test sets. Extensive experiments show that PHE-FL achieves up to 83 percent higher accuracy compared to existing federated learning approaches that incorporate edge networks, given the same number of training rounds. Moreover, PHE-FL exhibits improved stability, as evidenced by reduced accuracy fluctuations relative to the state-of-the-art FedAvg with two-level (edge and cloud) aggregation.

SPOct 14, 2021
Federated Learning for COVID-19 Detection with Generative Adversarial Networks in Edge Cloud Computing

Dinh C. Nguyen, Ming Ding, Pubudu N. Pathirana et al.

COVID-19 has spread rapidly across the globe and become a deadly pandemic. Recently, many artificial intelligence-based approaches have been used for COVID-19 detection, but they often require public data sharing with cloud datacentres and thus raise privacy concerns. This paper proposes a new federated learning scheme, called FedGAN, to generate realistic COVID-19 images for facilitating privacy-enhanced COVID-19 detection with generative adversarial networks (GANs) in edge cloud computing. Particularly, we first propose a GAN where a discriminator and a generator based on convolutional neural networks (CNNs) at each edge-based medical institution are trained alternately to mimic the real COVID-19 data distribution. Then, we propose a new federated learning solution which allows local GANs to collaborate and exchange learned parameters with a cloud server, aiming to enrich the global GAN model for generating realistic COVID-19 images without the need for sharing actual data. To enhance the privacy in federated COVID-19 data analytics, we integrate a differential privacy solution at each hospital institution. Moreover, we propose a new blockchain-based FedGAN framework for secure COVID-19 data analytics, by decentralizing the FL process with a new mining solution for low running latency. Simulation results demonstrate the superiority of our approach for COVID-19 detection over the state-of-the-art schemes.

NIFeb 22, 2021
InaudibleKey: Generic Inaudible Acoustic Signal based Key Agreement Protocol for Mobile Devices

Weitao Xu, Zhenjiang Li, Wanli Xue et al.

Secure Device-to-Device (D2D) communication is becoming increasingly important with the ever-growing number of Internet-of-Things (IoT) devices in our daily life. To achieve secure D2D communication, key agreement between different IoT devices without any prior knowledge is becoming desirable. Although various approaches have been proposed in the literature, they suffer from a number of limitations, such as low key generation rate and short pairing distance. In this paper, we present InaudibleKey, an inaudible acoustic signal-based key generation protocol for mobile devices. Based on acoustic channel reciprocity, InaudibleKey exploits the acoustic channel frequency response of two legitimate devices as a common secret to generate keys. InaudibleKey employs several novel technologies to significantly improve its performance. We conduct extensive experiments to evaluate the proposed system in different real environments. Compared to state-of-the-art works, InaudibleKey improves key generation rate by 3-145 times, extends pairing distance by 3.2-44 times, and reduces information reconciliation counts by 2.5-16 times. Security analysis demonstrates that InaudibleKey is resilient to a number of malicious attacks. We also implement InaudibleKey on modern smartphones and resource-limited IoT devices. Results show that it is energy-efficient and can run on both powerful and resource-limited IoT devices without incurring excessive resource consumption.

SYFeb 10, 2021
Adaptive Processor Frequency Adjustment for Mobile Edge Computing with Intermittent Energy Supply

Tiansheng Huang, Weiwei Lin, Xiaobin Hong et al.

With astonishing speed, bandwidth, and scale, Mobile Edge Computing (MEC) has played an increasingly important role in the next generation of connectivity and service delivery. Yet, along with the massive deployment of MEC servers, the ensuing energy issue is now on an increasingly urgent agenda. In the current context, the large-scale deployment of renewable-energy-supplied MEC servers is perhaps the most promising solution for the incoming energy issue. Nonetheless, as a result of the intermittent nature of their power sources, these specially designed MEC servers must be more cautious about their energy usage, in a bid to maintain their service sustainability as well as service standard. Targeting optimization in a single-server MEC scenario, in this paper we propose NAFA, an adaptive processor frequency adjustment solution, to enable effective planning of the server's energy usage. By learning from historical data revealing request arrival and energy harvest patterns, the deep reinforcement learning-based solution is capable of making intelligent schedules for the server's processor frequency, so as to strike a good balance between service sustainability and service quality. The superior performance of NAFA is substantiated by real-data-based experiments, wherein NAFA demonstrates up to 20% increase in average request acceptance ratio and up to 50% reduction in average request processing time.

LGDec 10, 2020
DONE: Distributed Approximate Newton-type Method for Federated Edge Learning

Canh T. Dinh, Nguyen H. Tran, Tuan Dung Nguyen et al.

There is growing interest in applying distributed machine learning to edge computing, forming federated edge learning. Federated edge learning faces non-i.i.d. and heterogeneous data, and the communication between edge workers, possibly through distant locations and with unstable wireless networks, is more costly than their local computational overhead. In this work, we propose DONE, a distributed approximate Newton-type algorithm with fast convergence rate for communication-efficient federated edge learning. First, with strongly convex and smooth loss functions, DONE approximates the Newton direction in a distributed manner using the classical Richardson iteration on each edge worker. Second, we prove that DONE has linear-quadratic convergence and analyze its communication complexities. Finally, the experimental results with non-i.i.d. and heterogeneous data show that DONE attains performance comparable to Newton's method. Notably, DONE requires fewer communication iterations compared to distributed gradient descent and outperforms DANE and FEDL, state-of-the-art approaches, in the case of non-quadratic loss functions.
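The Richardson iteration mentioned in this abstract can be sketched on a single machine. The Newton direction d solves H d = -g, where H is the Hessian and g the gradient, and Richardson iteration approaches that solution using only matrix-vector products. The sketch below omits DONE's distributed averaging across edge workers; the step size, iteration count, and toy values are assumptions chosen so the iteration converges (it requires 0 < alpha < 2 / lambda_max(H) for a symmetric positive definite H).

```python
def mat_vec(H, x):
    """Dense matrix-vector product for small list-of-lists matrices."""
    return [sum(h * xi for h, xi in zip(row, x)) for row in H]


def richardson_newton_direction(H, g, alpha, iters):
    """Approximate d = -H^{-1} g with d <- d - alpha * (H d + g)."""
    d = [0.0] * len(g)
    for _ in range(iters):
        Hd = mat_vec(H, d)
        # The residual of H d = -g is (H d + g); step against it.
        d = [di - alpha * (hdi + gi) for di, hdi, gi in zip(d, Hd, g)]
    return d


# Toy strongly convex quadratic: Hessian H and gradient g at the current point.
H = [[3.0, 1.0], [1.0, 2.0]]
g = [1.0, 2.0]
d = richardson_newton_direction(H, g, alpha=0.4, iters=200)
# The exact Newton direction for these values is [0, -1].
```

No matrix inverse is ever formed, which is why this kind of iteration suits workers that can only afford Hessian-vector products.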

LGNov 17, 2020
Stochastic Client Selection for Federated Learning with Volatile Clients

Tiansheng Huang, Weiwei Lin, Li Shen et al.

Federated Learning (FL), arising as a privacy-preserving machine learning paradigm, has received notable attention from the public. In each round of synchronous FL training, only a fraction of available clients are chosen to participate, and the selection decision might have a significant effect on the training efficiency, as well as the final model performance. In this paper, we investigate the client selection problem under a volatile context, in which the local training of heterogeneous clients may fail for various reasons and at different frequencies. Intuitively, too many training failures may reduce the training efficiency, while selecting the more stable clients too often may introduce bias and thereby degrade the training effectiveness. To tackle this tradeoff, we formulate the client selection problem under joint consideration of effective participation and fairness. Further, we propose E3CS, a stochastic client selection scheme to solve the problem, and we corroborate its effectiveness by conducting real data-based experiments. According to our experimental results, the proposed selection scheme is able to achieve up to 2x faster convergence to a fixed model accuracy while maintaining the same level of final model accuracy, compared with the state-of-the-art selection schemes.

LGNov 3, 2020
An Efficiency-boosting Client Selection Scheme for Federated Learning with Fairness Guarantee

Tiansheng Huang, Weiwei Lin, Wentai Wu et al.

The issue of potential privacy leakage during centralized AI model training has drawn intensive concern from the public. A Parallel and Distributed Computing (PDC) scheme, termed Federated Learning (FL), has emerged as a new paradigm to cope with the privacy issue by allowing clients to perform model training locally, without the necessity to upload their personal sensitive data. In FL, the number of clients could be sufficiently large, but the bandwidth available for model distribution and re-upload is quite limited, making it sensible to involve only part of the volunteers in the training process. The client selection policy is critical to an FL process in terms of training efficiency, the final model's quality, and fairness. In this paper, we model fairness-guaranteed client selection as a Lyapunov optimization problem and propose a C2MAB-based method to estimate the model exchange time between each client and the server, based on which we design a fairness-guaranteed algorithm termed RBCS-F. The regret of RBCS-F is strictly bounded by a finite constant, justifying its theoretical feasibility. Beyond the theoretical results, further empirical evidence is derived from our real training experiments on public datasets.

LGSep 18, 2020
Federated Learning with Nesterov Accelerated Gradient

Zhengjie Yang, Wei Bao, Dong Yuan et al.

Federated learning (FL) is a fast-developing technique that allows multiple workers to train a global model based on a distributed dataset. Conventional FL (FedAvg) employs the gradient descent algorithm, which may not be efficient enough. Momentum can improve the situation by adding a momentum step to accelerate convergence, and it has demonstrated its benefits in both centralized and FL environments. It is well known that Nesterov Accelerated Gradient (NAG) is a more advantageous form of momentum, but it is not yet clear how to quantify the benefits of NAG in FL. This motivates us to propose FedNAG, which employs NAG in each worker as well as NAG momentum and model aggregation in the aggregator. We provide a detailed convergence analysis of FedNAG and compare it with FedAvg. Extensive experiments based on real-world datasets and trace-driven simulation are conducted, demonstrating that FedNAG increases the learning accuracy by 3-24% and decreases the total training time by 11-70% compared with the benchmarks under a wide range of settings.
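The scheme this abstract describes, NAG on each worker with both the model and the momentum aggregated, can be sketched on a toy problem. The sketch below uses a common look-ahead formulation of NAG, scalar models, one-dimensional quadratic losses, and equal worker weights; all of these, along with the function names and hyperparameters, are illustrative assumptions rather than the paper's exact algorithm.

```python
def nag_step(w, v, grad_fn, lr, gamma):
    """One NAG step: evaluate the gradient at the look-ahead point w + gamma * v."""
    v_new = gamma * v - lr * grad_fn(w + gamma * v)
    return w + v_new, v_new


def fednag_sketch(targets, rounds, local_steps, lr=0.1, gamma=0.9):
    """Each worker i minimizes 0.5 * (w - targets[i])**2 locally with NAG.

    After every round the aggregator averages both the local models and the
    local momentum terms, then broadcasts the averages back to all workers.
    """
    w_global, v_global = 0.0, 0.0
    for _ in range(rounds):
        models, momenta = [], []
        for c in targets:
            w, v = w_global, v_global  # start from the aggregated state
            for _ in range(local_steps):
                w, v = nag_step(w, v, lambda x, c=c: x - c, lr, gamma)
            models.append(w)
            momenta.append(v)
        # Equal weights assume every worker holds the same amount of data.
        w_global = sum(models) / len(models)
        v_global = sum(momenta) / len(momenta)
    return w_global


targets = [1.0, 3.0, 8.0]
w_final = fednag_sketch(targets, rounds=50, local_steps=5)
# The minimizer of the averaged loss is the mean of the targets, 4.0.
```

Averaging the momentum alongside the model keeps each worker's next local step consistent with the aggregated trajectory, instead of restarting momentum from zero after every round.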

DCAug 5, 2020
Fast Adaptive Task Offloading in Edge Computing based on Meta Reinforcement Learning

Jin Wang, Jia Hu, Geyong Min et al.

Multi-access edge computing (MEC) aims to extend cloud service to the network edge to reduce network traffic and service latency. A fundamental problem in MEC is how to efficiently offload heterogeneous tasks of mobile applications from user equipment (UE) to MEC hosts. Recently, many deep reinforcement learning (DRL) based methods have been proposed to learn offloading policies through interacting with the MEC environment that consists of UE, wireless channels, and MEC hosts. However, these methods have weak adaptability to new environments because they have low sample efficiency and need full retraining to learn updated policies for new environments. To overcome this weakness, we propose a task offloading method based on meta reinforcement learning, which can adapt fast to new environments with a small number of gradient updates and samples. We model mobile applications as Directed Acyclic Graphs (DAGs) and the offloading policy by a custom sequence-to-sequence (seq2seq) neural network. To efficiently train the seq2seq network, we propose a method that synergizes the first order approximation and clipped surrogate objective. The experimental results demonstrate that this new offloading method can reduce the latency by up to 25% compared to three baselines while being able to adapt fast to new environments.

CRJun 2, 2020
MusicID: A Brainwave-based User Authentication System for Internet of Things

Jinani Sooriyaarachchi, Suranga Seneviratne, Kanchana Thilakarathna et al.

We propose MusicID, an authentication solution for smart devices that uses music-induced brainwave patterns as a behavioral biometric modality. We experimentally evaluate MusicID using data collected from real users whilst they are listening to two forms of music: a popular English song and the individual's favorite song. We show that an accuracy over 98% for user identification and an accuracy over 97% for user verification can be achieved by using data collected from a 4-electrode commodity brainwave headset. We further show that a single electrode is able to provide an accuracy of approximately 85% and the use of two electrodes provides an accuracy of approximately 95%. As already shown by commodity brain-sensing headsets for meditation applications, we believe including dry EEG electrodes in smart headsets is feasible, and MusicID has the potential to provide an entry-point and continuous authentication framework for the upcoming surge of smart devices mainly driven by Augmented Reality (AR)/Virtual Reality (VR) applications.

LGOct 29, 2019
Federated Learning over Wireless Networks: Convergence Analysis and Resource Allocation

Canh T. Dinh, Nguyen H. Tran, Minh N. H. Nguyen et al.

There is an increasing interest in a fast-growing machine learning technique called Federated Learning, in which the model training is distributed over mobile user equipments (UEs), exploiting UEs' local computation and training data. Despite its advantages in preserving data privacy, Federated Learning (FL) still has challenges in heterogeneity across UEs' data and physical resources. We first propose an FL algorithm which can handle the heterogeneous UEs' data challenge without further assumptions except strongly convex and smooth loss functions. We provide the convergence rate characterizing the trade-off between the local computation rounds each UE uses to update its local model and the global communication rounds used to update the FL global model. We then employ the proposed FL algorithm in wireless networks as a resource allocation optimization problem that captures the trade-off between the FL convergence wall clock time and energy consumption of UEs with heterogeneous computing and power resources. Even though the wireless resource allocation problem of FL is non-convex, we exploit this problem's structure to decompose it into three sub-problems and analyze their closed-form solutions as well as insights to problem design. Finally, we illustrate the theoretical analysis for the new algorithm with TensorFlow experiments and extensive numerical results for the wireless resource allocation sub-problems. The experiment results not only verify the theoretical convergence but also show that our proposed algorithm outperforms the vanilla FedAvg algorithm in terms of convergence rate and testing accuracy.

DCOct 11, 2019
Orchestrating the Development Lifecycle of Machine Learning-Based IoT Applications: A Taxonomy and Survey

Bin Qian, Jie Su, Zhenyu Wen et al.

Machine Learning (ML) and Internet of Things (IoT) are complementary advances: ML techniques unlock the full potential of IoT with intelligence, and IoT applications increasingly feed data collected by sensors into ML models, thereby employing results to improve their business processes and services. Hence, orchestrating ML pipelines that encompass model training and implication involved in the holistic development lifecycle of an IoT application often leads to complex system integration. This paper provides a comprehensive and systematic survey on the development lifecycle of ML-based IoT applications. We outline the core roadmap and taxonomy, and subsequently assess and compare existing standard techniques used in each individual stage.

HCJun 5, 2019
CreativeBioMan: Brain and Body Wearable Computing based Creative Gaming System

Min Chen, Yingying Jiang, Yong Cao et al.

Current artificial intelligence (AI) technology is mainly used in rational work such as computation and logical analysis. How to make the machine as aesthetic and creative as humans has gradually gained attention. This paper presents a creative game system (i.e., CreativeBioMan) for the first time. It combines brain wave data and multimodal emotion data, and then uses an AI algorithm for intelligent decision fusion, which can be used in artistic creation, aiming at freeing the artist from repetitive creative labor. To imitate the process of humans' artistic creation, the creation process of the algorithm is related to artists' previous artworks and their emotion. EEG data is used to analyze the style of artists and then match them with a style from a data set of historical works. Then, universal AI algorithms are combined with the unique creativity of each artist to evolve into a personalized creation algorithm. According to the results of cloud emotion recognition, the color of the artworks is corrected so that the artist's emotions are fully reflected in the works, and thus novel works of art are created. This allows the machine to integrate the understanding of past art and emotions with the ability to create new art forms, in the same manner as humans. This paper introduces the system architecture of CreativeBioMan from two aspects: data collection of the brain and body wearable devices, as well as the intelligent decision-making fusion of models. A testbed platform is built for an experiment and the creativity of the works produced by the system is analyzed.

CYJun 27, 2016
Privacy Knowledge Modelling for Internet of Things: A Look Back

Charith Perera, Chang Liu, Rajiv Ranjan et al.

Internet of Things (IoT) and cloud computing together give us the ability to sense, collect, process, and analyse data so we can use them to better understand behaviours, habits, preferences and life patterns of users and lead them to consume resources more efficiently. In such knowledge discovery activities, privacy becomes a significant challenge due to the extremely personal nature of the knowledge that can be derived from the data and the potential risks involved. Therefore, understanding the privacy expectations and preferences of stakeholders is an important task in the IoT domain. In this paper, we review how privacy knowledge has been modelled and used in the past in different domains. Our goal is not only to analyse, compare and consolidate past research work but also to appreciate their findings and discuss their applicability towards the IoT. Finally, we discuss major research challenges and opportunities.

DCApr 19, 2016
Improving Raw Image Storage Efficiency by Exploiting Similarity

Binqi Zhang, Chen Wang, Bing Bing Zhou et al.

To improve the temporal and spatial storage efficiency, researchers have intensively studied various techniques, including compression and deduplication. Through our evaluation, we find that methods such as photo tags or local features help to identify the content-based similarity between raw images. The images can then be compressed more efficiently to get better storage space savings. Furthermore, storing similar raw images together enables rapid data sorting, searching and retrieval if the images are stored in a distributed and large-scale environment by reducing fragmentation. In this paper, we evaluated the compressibility by designing experiments and observing the results. We found that on a statistical basis the higher similarity photos have, the better compression results are. This research helps provide a clue for future large-scale storage system design.

DCMar 14, 2013
Statistical Regression to Predict Total Cumulative CPU Usage of MapReduce Jobs

Nikzad Babaii Rizvandi, Javid Taheri, Reza Moraveji et al.

Recently, businesses have started using MapReduce as a popular computation framework for processing large amounts of data, such as spam detection and different data mining tasks, in both public and private clouds. Two of the challenging questions in such environments are (1) choosing suitable values for MapReduce configuration parameters, e.g., number of mappers, number of reducers, and DFS block size, and (2) predicting the amount of resources that a user should lease from the service provider. Currently, the tasks of both choosing configuration parameters and estimating required resources are solely the users' responsibilities. In this paper, we present an approach to provision the total CPU usage in clock cycles of jobs in a MapReduce environment. For a MapReduce job, a profile of total CPU usage in clock cycles is built from the job's past executions with different values of two configuration parameters, e.g., number of mappers and number of reducers. Then, a polynomial regression is used to model the relation between these configuration parameters and total CPU usage in clock cycles of the job. We also briefly study the influence of input data scaling on measured total CPU usage in clock cycles. This derived model along with the scaling result can then be used to provision the total CPU usage in clock cycles of the same jobs with different input data sizes. We validate the accuracy of our models using three realistic applications (WordCount, Exim MainLog parsing, and TeraSort). Results show that the predicted total CPU usage in clock cycles of generated resource provisioning options are less than 8% of the measured total CPU usage in clock cycles in our 20-node virtual Hadoop cluster.
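
The regression step described in this abstract can be illustrated with a minimal sketch. It assumes a degree-2 polynomial in the two parameters and an ordinary least-squares fit; the abstract does not state the polynomial degree or fitting procedure, and the function names and the synthetic profile below are hypothetical, not taken from the paper.

```python
import numpy as np

def design_matrix(mappers, reducers, degree=2):
    """Build columns m^i * r^j for all i + j <= degree (constant term included)."""
    m = np.asarray(mappers, dtype=float)
    r = np.asarray(reducers, dtype=float)
    cols = [m**i * r**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.column_stack(cols)

def fit_cpu_model(mappers, reducers, cpu_cycles, degree=2):
    """Least-squares fit of total CPU usage against the two parameters."""
    X = design_matrix(mappers, reducers, degree)
    y = np.asarray(cpu_cycles, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_cpu(coef, mappers, reducers, degree=2):
    """Evaluate the fitted polynomial at new parameter values."""
    return design_matrix(mappers, reducers, degree) @ coef

# Synthetic profile (illustrative numbers only, not measurements):
# every combination of 10 mapper counts and 10 reducer counts.
grid_m, grid_r = np.meshgrid(np.arange(4, 44, 4), np.arange(2, 22, 2))
m_obs, r_obs = grid_m.ravel(), grid_r.ravel()
cpu_obs = (5.0 + 0.8 * m_obs + 1.5 * r_obs
           + 0.02 * m_obs**2 + 0.05 * r_obs**2 + 0.01 * m_obs * r_obs)

coef = fit_cpu_model(m_obs, r_obs, cpu_obs)
pred = predict_cpu(coef, [18], [7])  # a setting absent from the profile
```

Because the synthetic data is noise-free and exactly quadratic, the fit recovers the generating polynomial; with real profiles the degree would presumably be chosen by checking the error on held-out runs.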

DCJan 21, 2013
Pattern Matching for Self-Tuning of MapReduce Jobs

Nikzad Babaii Rizvandi, Javid Taheri, Albert Y. Zomaya

In this paper, we study CPU utilization time patterns of several MapReduce applications. After the running patterns of several applications are extracted, they are saved in a reference database to be later used to tweak system parameters to efficiently execute unknown applications in the future. To achieve this goal, CPU utilization patterns of new applications are compared with the already known ones in the reference database to find/predict their most probable execution patterns. Because the patterns differ in length, Dynamic Time Warping (DTW) is utilized for such comparison; a correlation analysis is then applied to the DTW outcomes to produce feasible similarity patterns. Three real applications (WordCount, Exim Mainlog parsing and Terasort) are used to evaluate our hypothesis in tweaking system parameters in executing similar applications. Results were very promising and showed the effectiveness of our approach on pseudo-distributed MapReduce platforms.
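
The comparison step in this abstract can be sketched as follows. This is a minimal illustration, assuming the classic dynamic-programming form of DTW with absolute difference as the local cost, and Pearson correlation of the warped series as the similarity score; the abstract does not specify either choice. The function names and the toy reference database are hypothetical.

```python
import numpy as np

def dtw_align(a, b):
    """Return the DTW distance and the alignment path between two series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    i, j, path = n, m, []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return cost[n, m], path

def warped_correlation(a, b):
    """Pearson correlation of the two series after DTW alignment."""
    _, path = dtw_align(a, b)
    ia, ib = zip(*path)
    wa = np.asarray(a, dtype=float)[list(ia)]
    wb = np.asarray(b, dtype=float)[list(ib)]
    return float(np.corrcoef(wa, wb)[0, 1])

def best_match(new_pattern, reference_db):
    """Pick the reference pattern most correlated with the new one."""
    scores = {name: warped_correlation(new_pattern, ref)
              for name, ref in reference_db.items()}
    return max(scores, key=scores.get), scores

# Toy CPU-utilization patterns (illustrative values only).
new_pattern = [0, 0, 1, 3, 1, 0]
reference_db = {
    "stretched_burst": np.repeat(new_pattern, 2),  # same shape, twice as long
    "steady_decline": [3, 2.5, 2, 1.5, 1, 0.5, 0],
}
distance, _ = dtw_align(new_pattern, reference_db["stretched_burst"])
best, scores = best_match(new_pattern, reference_db)
```

Here the new pattern is a time-compressed copy of `stretched_burst`, so DTW aligns them at zero cost even though their lengths differ, which is the property the abstract relies on.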