Tshilidzi Marwala

27 papers

194 citations

Novelty 24%

AI Score 20

Ranked #193,726 of 205,806 authors (top 94%) · #13,109 in AI (top 92%)

27 Papers

LG · Aug 31, 2023

The Use of Synthetic Data to Train AI Models: Opportunities and Risks for Sustainable Development

Tshilidzi Marwala, Eleonore Fournier-Tombs, Serge Stinckwich

In the current data driven era, synthetic data, artificially generated data that resembles the characteristics of real world data without containing actual personal information, is gaining prominence. This is due to its potential to safeguard privacy, increase the availability of data for research, and reduce bias in machine learning models. This paper investigates the policies governing the creation, utilization, and dissemination of synthetic data. Synthetic data can be a powerful instrument for protecting the privacy of individuals, but it also presents challenges, such as ensuring its quality and authenticity. A well crafted synthetic data policy must strike a balance between privacy concerns and the utility of data, ensuring that it can be utilized effectively without compromising ethical or legal standards. Organizations and institutions must develop standardized guidelines and best practices in order to capitalize on the benefits of synthetic data while addressing its inherent challenges.

CL · Apr 8, 2023

MphayaNER: Named Entity Recognition for Tshivenda

Rendani Mbuvha, David I. Adelani, Tendani Mutavhatsindi et al.

Named Entity Recognition (NER) plays a vital role in various Natural Language Processing tasks such as information retrieval, text classification, and question answering. However, NER can be challenging, especially in low-resource languages with limited annotated datasets and tools. This paper adds to the effort of addressing these challenges by introducing MphayaNER, the first Tshivenda NER corpus in the news domain. We establish NER baselines by fine-tuning state-of-the-art models on MphayaNER. The study also explores zero-shot transfer between Tshivenda and other related Bantu languages, with chiShona and Kiswahili showing the best results. Augmenting MphayaNER with chiShona data was also found to improve model performance significantly. Both MphayaNER and the baseline models are made publicly available.

LG · Nov 17, 2022

Imputation of Missing Streamflow Data at Multiple Gauging Stations in Benin Republic

Rendani Mbuvha, Julien Yise Peniel Adounkpe, Wilson Tsakane Mongwe et al.

Streamflow observation data are vital for flood monitoring and for agricultural and settlement planning. However, such streamflow data are commonly plagued with missing observations due to various causes such as harsh environmental conditions and constrained operational resources. This problem is often more pervasive in under-resourced areas such as Sub-Saharan Africa. In this work, we reconstruct streamflow time series data through bias correction of the GEOGloWS ECMWF streamflow service (GESS) forecasts at ten river gauging stations in Benin Republic. We perform bias correction by fitting Quantile Mapping, Gaussian Process, and Elastic Net regression in a constrained training period. We show by simulating missingness in a testing period that GESS forecasts have a significant bias that results in low predictive skill over the ten Beninese stations. Our findings suggest that overall bias correction by Elastic Net and Gaussian Process regression achieves superior skill relative to traditional imputation by Random Forest, k-Nearest Neighbour, and GESS lookup. The findings of this work provide a basis for integrating global GESS streamflow data into operational early-warning decision-making systems (e.g., flood alert) in countries vulnerable to drought and flooding due to extreme weather events.
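
To make the quantile-mapping idea mentioned above concrete, here is a minimal sketch of empirical quantile mapping in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the plotting position `(rank - 1) / (n - 1)`, and the toy numbers are all hypothetical choices made for this example.

```python
from bisect import bisect_right


def empirical_cdf(sorted_values, x):
    """Empirical non-exceedance probability of x, clamped to [0, 1]."""
    n = len(sorted_values)
    rank = bisect_right(sorted_values, x)  # number of training values <= x
    return min(1.0, max(0.0, (rank - 1) / (n - 1)))


def empirical_quantile(sorted_values, p):
    """Quantile at probability p, by linear interpolation between order statistics."""
    position = p * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = position - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def quantile_map(forecast_train, observed_train, forecast_new):
    """Map each new forecast onto the observed distribution at the same quantile."""
    if len(forecast_train) < 2 or len(observed_train) < 2:
        raise ValueError("need at least two training values in each series")
    f_sorted = sorted(forecast_train)
    o_sorted = sorted(observed_train)
    return [empirical_quantile(o_sorted, empirical_cdf(f_sorted, x)) for x in forecast_new]


# Hypothetical toy numbers: the forecasts run about ten times too high.
corrected = quantile_map([10.0, 20.0, 30.0, 40.0], [1.0, 2.0, 3.0, 4.0], [5.0, 20.0, 40.0])
```

Forecasts below or above the training range are clamped to the smallest or largest observed value, which is one simple way to handle extrapolation; other choices are possible.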

RO · Oct 10, 2021

Nano Version Control and Robots of Robots: Data Driven, Regenerative Production Code

Lukasz Machowski, Tshilidzi Marwala

Reflection on the Corona pandemic highlights the need for more sustainable production systems using automation. The goal is to retain automation of repetitive tasks while allowing complex parts to come together. We recognize the fragility of traditional automation and how hard it is to create. We introduce a method which converts one really hard problem of producing sustainable production code into three simpler problems: data, patterns and working prototypes. We use developer seniority as a metric to measure whether the proposed method is easier. By using agent-based simulation and NanoVC repos for agent arbitration, we are able to create a simulated environment where patterns developed by people are used to transform working prototypes into templates that data can be fed through to create the robots that create the production code. Having two layers of robots allows early implementation choices to be replaced as we gather more feedback from the working system. Several benefits of this approach have been discovered, with the most notable being that the Robot of Robots encodes a legacy of the person that designed it in the form of the three ingredients (data, patterns and working prototypes). This method allows us to achieve our goal of reducing the fragility of the production code while removing the difficulty of getting there.

ML · Jul 5, 2021

Antithetic Riemannian Manifold And Quantum-Inspired Hamiltonian Monte Carlo

Wilson Tsakane Mongwe, Rendani Mbuvha, Tshilidzi Marwala

Markov Chain Monte Carlo inference of target posterior distributions in machine learning is predominantly conducted via Hamiltonian Monte Carlo and its variants. This is due to the ability of Hamiltonian Monte Carlo based samplers to suppress random-walk behaviour. As with other Markov Chain Monte Carlo methods, Hamiltonian Monte Carlo produces auto-correlated samples, which results in high variance in the estimators and low effective sample size rates in the generated samples. Adding antithetic sampling to Hamiltonian Monte Carlo has been previously shown to produce higher effective sample rates compared to vanilla Hamiltonian Monte Carlo. In this paper, we present new algorithms which are antithetic versions of Riemannian Manifold Hamiltonian Monte Carlo and Quantum-Inspired Hamiltonian Monte Carlo. The Riemannian Manifold Hamiltonian Monte Carlo algorithm improves on Hamiltonian Monte Carlo by taking into account the local geometry of the target, which is beneficial for target densities that may exhibit strong correlations in the parameters. Quantum-Inspired Hamiltonian Monte Carlo is based on quantum particles that can have random mass. Quantum-Inspired Hamiltonian Monte Carlo uses a random mass matrix, which results in better sampling than Hamiltonian Monte Carlo on spiky and multi-modal distributions such as jump diffusion processes. The analysis is performed on jump diffusion processes using real-world financial market data, as well as on real-world benchmark classification tasks using Bayesian logistic regression.
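
The variance-reduction idea behind antithetic sampling can be illustrated outside of Hamiltonian Monte Carlo. The sketch below applies classical antithetic variates to plain Monte Carlo integration over uniform draws; it is deliberately not an implementation of the samplers described above, and the function names and the choice of integrand are assumptions made for this example.

```python
import math
import random


def plain_estimate(f, n_samples, rng):
    """Ordinary Monte Carlo estimate of E[f(U)] for U uniform on [0, 1)."""
    return sum(f(rng.random()) for _ in range(n_samples)) / n_samples


def antithetic_estimate(f, n_pairs, rng):
    """Estimate E[f(U)] by averaging over negatively correlated pairs (u, 1 - u)."""
    total = 0.0
    for _ in range(n_pairs):
        u = rng.random()
        # Each pair shares one random draw; for monotone f the two terms
        # are negatively correlated, which lowers the estimator variance.
        total += 0.5 * (f(u) + f(1.0 - u))
    return total / n_pairs


# Example: the integral of exp(u) over [0, 1] is e - 1.
rng = random.Random(0)
estimate = antithetic_estimate(math.exp, 2000, rng)
```

For a linear integrand the paired errors cancel entirely, which is the extreme case of the effect that antithetic chains aim to exploit.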

AP · Jun 12, 2021

Predicting Higher Education Throughput in South Africa Using a Tree-Based Ensemble Technique

Rendani Mbuvha, Patience Zondo, Aluwani Mauda et al.

We use gradient boosting machines and logistic regression to predict academic throughput at a South African university. The results highlight the significant influence of socio-economic factors and field of study as predictors of throughput. We further find that socio-economic factors become less of a predictor relative to the field of study as the time to completion increases. We provide recommendations on interventions to counteract the identified effects, which include academic, psychosocial and financial support.

ML · Feb 14, 2021

Healing Products of Gaussian Processes

Samuel Cohen, Rendani Mbuvha, Tshilidzi Marwala et al.

Gaussian processes (GPs) are nonparametric Bayesian models that have been applied to regression and classification problems. One of the approaches to alleviate their cubic training cost is the use of local GP experts trained on subsets of the data. In particular, product-of-expert models combine the predictive distributions of local experts through a tractable product operation. While these expert models allow for massively distributed computation, their predictions typically suffer from erratic behaviour of the mean or uncalibrated uncertainty quantification. By calibrating predictions via a tempered softmax weighting, we provide a solution to these problems for multiple product-of-expert models, including the generalised product of experts and the robust Bayesian committee machine. Furthermore, we leverage the optimal transport literature and propose a new product-of-expert model that combines predictions of local experts by computing their Wasserstein barycenter, which can be applied to both regression and classification.
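
As a rough sketch of how local expert predictions might be combined at a single test point, the code below shows a weighted product of one-dimensional Gaussian predictions and the Wasserstein-2 barycenter of the same Gaussians. The function names are hypothetical, and the tempered softmax here acts on arbitrary per-expert scores supplied by the caller; which scores the paper uses is not stated in the abstract, so that part is an assumption.

```python
import math


def tempered_softmax(scores, temperature=1.0):
    """Softmax of temperature-scaled scores; a larger temperature sharpens the weights."""
    scaled = [temperature * s for s in scores]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def generalised_poe(means, variances, weights):
    """Weighted product of 1-D Gaussian experts: precisions add, means are precision-weighted."""
    precision = sum(w / v for w, v in zip(weights, variances))
    mean = sum(w * m / v for w, m, v in zip(weights, means, variances)) / precision
    return mean, 1.0 / precision


def wasserstein_barycenter_1d(means, variances, weights):
    """Wasserstein-2 barycenter of 1-D Gaussians: average the means and the standard deviations."""
    mean = sum(w * m for w, m in zip(weights, means))
    std = sum(w * math.sqrt(v) for w, v in zip(weights, variances))
    return mean, std ** 2


# Two hypothetical experts with equal scores, hence equal weights.
weights = tempered_softmax([0.0, 0.0])
poe_mean, poe_var = generalised_poe([0.0, 2.0], [1.0, 1.0], weights)
bary_mean, bary_var = wasserstein_barycenter_1d([0.0, 2.0], [1.0, 1.0], weights)
```

The weights are assumed to sum to one; with that normalisation the barycenter keeps the combined variance on the same scale as the individual experts.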

ML · Jan 6, 2020

An Automatic Relevance Determination Prior Bayesian Neural Network for Controlled Variable Selection

Rendani Mbuvha, Illyes Boulkaibet, Tshilidzi Marwala

We present an Automatic Relevance Determination prior Bayesian Neural Network (BNN-ARD) weight l2-norm measure as a feature importance statistic for the model-x knockoff filter. We show on both simulated data and the Norwegian wind farm dataset that the proposed feature importance statistic yields statistically significant improvements relative to similar feature importance measures in both variable selection power and predictive performance on a real world dataset.

GN · Oct 21, 2019

Relative Net Utility and the Saint Petersburg Paradox

Daniel Muller, Tshilidzi Marwala

The famous Saint Petersburg Paradox (St. Petersburg Paradox) shows that the theory of expected value does not capture the real-world economics of decision-making problems. Over the years, many economic theories were developed to resolve the paradox and explain gaps in the economic value theory in the evaluation of economic decisions, the subjective utility of the expected outcomes, and risk aversion as observed in the game of the St. Petersburg Paradox. In this paper, we use the concept of the relative net utility to resolve the St. Petersburg Paradox. Because the net utility concept is able to explain both behavioral economics and the St. Petersburg Paradox, it is deemed to be a universal approach to handling utility. This paper shows how the information content of the notion of net utility value allows us to capture a broader context of the impact of a decision's possible achievements. It discusses the necessary conditions that the utility function has to conform to in order to avoid the paradox. Combining these necessary conditions allows us to define the theorem of indifference in the evaluation of economic decisions and to present the role of the relative net utility and net utility polarity in a value rational decision-making process.
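
For readers unfamiliar with the paradox, the following sketch shows numerically why expected value fails in the classical game, where a payout of 2^k is won with probability 1/2^k. It contrasts the expected payout with expected logarithmic utility, which is Bernoulli's classical resolution and not the relative net utility proposed in the paper; the function names and the truncation at a finite number of rounds are assumptions made for this example.

```python
import math


def truncated_expected_value(max_rounds):
    """Expected payout when the game is cut off after max_rounds coin flips."""
    # Each round contributes (1/2^k) * 2^k = 1, so the sum grows without bound.
    return sum((0.5 ** k) * (2 ** k) for k in range(1, max_rounds + 1))


def truncated_expected_log_utility(max_rounds):
    """Expected log utility of the payout under the same truncation."""
    # Each round contributes k * ln(2) / 2^k, and the series converges to 2 * ln(2).
    return sum((0.5 ** k) * math.log(2 ** k) for k in range(1, max_rounds + 1))


value_after_50 = truncated_expected_value(50)
utility_after_200 = truncated_expected_log_utility(200)
```

The expected payout equals the number of rounds allowed, while the expected log utility settles near 2 ln 2, which illustrates the kind of condition a utility function must meet to avoid the paradox.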

ML · Jun 14, 2019

Automatic Relevance Determination Bayesian Neural Networks for Credit Card Default Modelling

Rendani Mbuvha, Illyes Boulkaibet, Tshilidzi Marwala

Credit risk modelling is an integral part of the global financial system. While there has been great attention paid to neural network models for credit default prediction, such models often lack the required interpretation mechanisms and measures of the uncertainty around their predictions. This work develops and compares Bayesian Neural Networks (BNNs) for credit card default modelling. This includes BNNs trained by Gaussian approximation and the first implementation of BNNs trained by Hybrid Monte Carlo (HMC) in credit risk modelling. The results on the Taiwan Credit Dataset show that BNNs with Automatic Relevance Determination (ARD) outperform normal BNNs without ARD. The results also show that BNNs trained by Gaussian approximation display similar predictive performance to those trained by HMC. The results further show that BNNs with ARD can be used to draw inferences about the relative importance of different features, thus critically aiding decision makers in explaining model output to consumers. The robustness of this result is reinforced by high levels of congruence between the features identified as important using the two different approaches for training BNNs.

AI · Feb 13, 2019

Relative rationality: Is machine rationality subjective?

Tshilidzi Marwala

Rational decision making in its linguistic description means making logical decisions. In essence, a rational agent optimally processes all relevant information to achieve its goal. Rationality has two elements and these are the use of relevant information and the efficient processing of such information. In reality, relevant information is incomplete and imperfect, and the processing engine, which is a brain for humans, is suboptimal. Humans are risk averse rather than utility maximizers. In the real world, problems are predominantly non-convex and this makes the idea of rational decision-making fundamentally unachievable; Herbert Simon called this bounded rationality. There is a trade-off between the amount of information used for decision-making and the complexity of the decision model used. This paper explores whether machine rationality is subjective and concludes that indeed it is.

AI · Dec 25, 2018

Can rationality be measured?

Tshilidzi Marwala

This paper studies whether rationality can be computed. Rationality is defined as the use of complete information, which is processed with a perfect biological or physical brain, in an optimized fashion. To compute rationality one needs to quantify how complete is the information, how perfect is the physical or biological brain and how optimized is the entire decision making system. The rationality of a model (i.e. physical or biological brain) is measured by the expected accuracy of the model. The rationality of the optimization procedure is measured as the ratio of the achieved objective (i.e. utility) to the global objective. The overall rationality of a decision is measured as the product of the rationality of the model and the rationality of the optimization procedure. The conclusion reached is that rationality can be computed for convex optimization problems.
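
The product rule described above can be written out in a few lines. This is a minimal sketch of the arithmetic as the abstract states it, assuming accuracy lies in [0, 1] and utilities are positive; the function names and the example numbers are hypothetical.

```python
def optimization_rationality(achieved_utility, global_utility):
    """Ratio of the achieved objective to the global (best attainable) objective."""
    if global_utility <= 0:
        raise ValueError("global_utility must be positive")
    return achieved_utility / global_utility


def overall_rationality(model_accuracy, achieved_utility, global_utility):
    """Product of the model's expected accuracy and the optimization ratio."""
    if not 0.0 <= model_accuracy <= 1.0:
        raise ValueError("model_accuracy must lie in [0, 1]")
    return model_accuracy * optimization_rationality(achieved_utility, global_utility)


# Hypothetical example: a 90% accurate model whose optimizer reaches 8 of a possible 10.
score = overall_rationality(0.9, 8.0, 10.0)
```

Knowing the global objective is what restricts this calculation to problems where the optimum can be found, which is consistent with the abstract's conclusion about convex problems.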

AI · Dec 16, 2018

The limit of artificial intelligence: Can machines be rational?

Tshilidzi Marwala

This paper studies the question of whether machines can be rational. It observes the existing reasons why humans are not rational, which are imperfect and limited information, limited and inconsistent processing power through the brain, and the inability to optimize decisions and achieve maximum utility. It studies whether these limitations of humans are transferred to the limitations of machines. The conclusion reached is that even though machines are not rational, advances in technological developments make these machines more rational. It also concludes that machines can be more rational than humans.

AI · Jul 21, 2018

Creativity and Artificial Intelligence: A Digital Art Perspective

Bo Xing, Tshilidzi Marwala

This paper describes the application of artificial intelligence to the creation of digital art. AI is a computational paradigm that codifies intelligence into machines. There are generally three types of artificial intelligence and these are machine learning, evolutionary programming and soft computing. Machine learning is the statistical approach to building intelligent systems. Evolutionary programming is the use of natural evolutionary systems to design intelligent machines. Some of the evolutionary programming systems include genetic algorithms, which are inspired by the principles of evolution, and swarm optimization, which is inspired by the swarming of birds, fish, ants etc. Soft computing includes techniques such as agent based modelling and fuzzy logic. Opportunities for applying these to digital art are explored.

AI · Feb 13, 2018

Blockchain and Artificial Intelligence

Tshilidzi Marwala, Bo Xing

It is undeniable that artificial intelligence (AI) and blockchain concepts are spreading at a phenomenal rate. Both technologies have distinct degrees of technological complexity and multi-dimensional business implications. However, a common misunderstanding about the blockchain concept, in particular, is that blockchain is decentralized and is not controlled by anyone. But the underlying development of a blockchain system is still attributed to a cluster of core developers. Take the smart contract as an example: it is essentially a collection of code (or functions) and data (or states) that is programmed and deployed on a blockchain (say, Ethereum) by different human programmers. It is thus, unfortunately, less likely to be free of loopholes and flaws. In this article, through a brief overview of how artificial intelligence could be used to deliver bug-free smart contracts so as to achieve the goal of blockchain 2.0, we emphasize that blockchain implementation can be assisted or enhanced via various AI techniques. The alliance of AI and blockchain is expected to create numerous possibilities.

SI · Sep 18, 2017

Early prediction of the duration of protests using probabilistic Latent Dirichlet Allocation and Decision Trees

Satyakama Paul, Madhur Hasija, Tshilidzi Marwala

Protests and agitations are an integral part of every democratic civil society. In recent years, South Africa has seen a large increase in its protests. The objective of this paper is to provide an early prediction of the duration of protests from their free-flowing English text descriptions. Free-flowing descriptions of the protests help us in capturing their various nuances such as multiple causes, courses of action etc. Next we use a combination of unsupervised learning (topic modeling) and supervised learning (decision trees) to predict the duration of the protests. Our results show a high degree (close to 90%) of accuracy in early prediction of the duration of protests. We expect the work to help police and other security services in planning and managing their resources to better handle protests in future.

AI · Mar 29, 2017

Rational Choice and Artificial Intelligence

Tshilidzi Marwala

The theory of rational choice assumes that when people make decisions they do so in order to maximize their utility. In order to achieve this goal they ought to use all the information available and consider all the choices available to choose an optimal choice. This paper investigates what happens when decisions are made by artificially intelligent machines in the market rather than human beings. Firstly, the expectations of the future are more consistent if they are made by an artificially intelligent machine, and the decisions are more rational, and thus the marketplace becomes more rational.

AI · Mar 20, 2017

Artificial Intelligence and Economic Theories

Tshilidzi Marwala, Evan Hurwitz

The advent of artificial intelligence has changed many disciplines such as engineering, social science and economics. Artificial intelligence is a computational technique which is inspired by natural intelligence such as the swarming of birds, the working of the brain and the pathfinding of ants. These techniques have an impact on economic theories. This book studies the impact of artificial intelligence on economic theories, a subject that has not been extensively studied. The theories that are considered are: demand and supply, asymmetrical information, pricing, rational choice, rational expectation, game theory, efficient market hypotheses, mechanism design, prospect, bounded rationality, portfolio theory, rational counterfactual and causality. The benefit of this book is that it evaluates existing theories of economics and updates them based on developments in the field of artificial intelligence.

AI · Jul 1, 2016

Missing Data Estimation in High-Dimensional Datasets: A Swarm Intelligence-Deep Neural Network Approach

Collins Leke, Tshilidzi Marwala

In this paper, we examine the problem of missing data in high-dimensional datasets by taking into consideration the Missing Completely at Random and Missing at Random mechanisms, as well as the Arbitrary missing pattern. Additionally, this paper employs a methodology based on Deep Learning and Swarm Intelligence algorithms in order to provide reliable estimates for missing data. The deep learning technique is used to extract features from the input data via an unsupervised learning approach by modeling the data distribution based on the input. This deep learning technique is then used as part of the objective function for the swarm intelligence technique in order to estimate the missing data after a supervised fine-tuning phase by minimizing an error function based on the interrelationship and correlation between features in the dataset. The investigated methodology in this paper therefore has longer running times; however, the promising potential outcomes justify the trade-off. Also, basic knowledge of statistics is presumed.

NE · Dec 4, 2015

Proposition of a Theoretical Model for Missing Data Imputation using Deep Learning and Evolutionary Algorithms

Collins Leke, Tshilidzi Marwala, Satyakama Paul

In the last couple of decades, there have been major advancements in the domain of missing data imputation. The techniques in the domain include, amongst others, Expectation Maximization, Neural Networks with Evolutionary Algorithms or optimization techniques, and K-Nearest Neighbor approaches to solve the problem. The presence of missing data entries in databases renders the tasks of decision-making and data analysis nontrivial. As a result this area has attracted a lot of research interest, with the aim being to yield accurate, time-efficient and sensitive missing data imputation techniques, especially where time-sensitive applications such as power plants and winding processes are concerned. In this article, considering arbitrary and monotone missing data patterns, we hypothesize that the use of deep neural networks built using autoencoders and denoising autoencoders in conjunction with genetic algorithms, swarm intelligence and maximum likelihood estimator methods as novel data imputation techniques will lead to better imputed values than existing techniques. Also considered are the missing at random, missing completely at random and missing not at random missing data mechanisms. We also intend to use fuzzy logic in tandem with deep neural networks to perform the missing data imputation tasks, as well as different building blocks for the deep neural networks like Stacked Restricted Boltzmann Machines and Deep Belief Networks to test our hypothesis. The motivation behind this article is the need for missing data imputation techniques that lead to better imputed values than existing methods with higher accuracies and lower errors.

AI · Oct 10, 2015

Artificial Intelligence and Asymmetric Information Theory

Tshilidzi Marwala, Evan Hurwitz

When human agents come together to make decisions, it is often the case that one human agent has more information than the other. This phenomenon is called information asymmetry and this distorts the market. Often if one human agent intends to manipulate a decision in its favor the human agent can signal wrong or right information. Alternatively, one human agent can screen for information to reduce the impact of asymmetric information on decisions. With the advent of artificial intelligence, signaling and screening have been made easier. This paper studies the impact of artificial intelligence on the theory of asymmetric information. It is surmised that artificially intelligent agents reduce the degree of information asymmetry and thus the market where these agents are deployed becomes more efficient. It is also postulated that the more artificially intelligent agents there are deployed in the market, the lower the volume of trades in the market. This is because for many trades to happen the asymmetry of information on goods and services to be traded should exist, creating a sense of arbitrage.

AI · Sep 16, 2015

Causal Model Analysis using Collider v-structure with Negative Percentage Mapping

Pramod Kumar Parida, Tshilidzi Marwala, Snehashish Chakraverty

A major problem of causal inference is the arrangement of dependent nodes in a directed acyclic graph (DAG) with path coefficients and observed confounders. Path coefficients do not provide the units to measure the strength of information flowing from one node to the other. Here we propose a method of causal structure learning using collider v-structures (CVS) with Negative Percentage Mapping (NPM) to get selective thresholds of information strength, to direct the edges and subjective confounders in a DAG. The NPM is used to scale the strength of information passed through nodes in units of percentage on the interval from 0 to 1. The causal structures are constructed by a bottom-up approach using path coefficients, causal directions and confounders, derived by implementing collider v-structures and NPM. The method is self-sufficient to observe all the latent confounders present in the causal model and capable of detecting every responsible causal direction. The results are tested on simulated datasets of non-Gaussian distributions and compared with DirectLiNGAM and ICA-LiNGAM to check the efficiency of the proposed method.

NA · Oct 15, 2015

Monte Carlo Dynamically Weighted Importance Sampling For Finite Element Model Updating

Daniel J Joubert, Tshilidzi Marwala

The Finite Element Method (FEM) is generally unable to accurately predict natural frequencies and mode shapes of structures (eigenvalues and eigenvectors). Engineers develop numerical methods and a variety of techniques to compensate for this misalignment of modal properties between experimentally measured data and the computed result from the FEM of structures. In this paper we compare two indirect methods of updating, namely the Adaptive Metropolis Hastings algorithm and a newly applied algorithm called Monte Carlo Dynamically Weighted Importance Sampling (MCDWIS). The approximation of a posterior predictive distribution is based on Bayesian inference of continuous multivariate Gaussian probability density functions, defining the variability of physical properties affected by forced vibration. The motivation behind applying MCDWIS is in the complexity of computing normalizing constants in higher dimensional or multimodal systems. The MCDWIS accounts for this intractability by analytically computing importance sampling estimates at each time step of the algorithm. In addition, a dynamic weighting step with an Adaptive Pruned Enriched Population Control Scheme (APEPCS) allows for further control over weighted samples and population size. The performance of the MCDWIS simulation is graphically illustrated for all algorithm dependent parameters and shows unbiased, stable sample estimates.

AI · Apr 8, 2014

Rational Counterfactuals

Tshilidzi Marwala

This paper introduces the concept of rational counterfactuals, which is the idea of identifying a counterfactual from the factual (whether perceived or real) that maximizes the attainment of the desired consequent. In counterfactual thinking, if we have a factual statement like: Saddam Hussein invaded Kuwait and consequently George Bush declared war on Iraq, then its counterfactual is: If Saddam Hussein did not invade Kuwait then George Bush would not have declared war on Iraq. The theory of rational counterfactuals is applied to identify the antecedent that gives the desired consequent necessary for rational decision making. The rational counterfactual theory is applied to identify the values of the variables Allies, Contingency, Distance, Major Power, Capability, Democracy, as well as Economic Interdependency that give the desired consequent Peace.

AI · Aug 10, 2013

Applying the Negative Selection Algorithm for Merger and Acquisition Target Identification

Satyakama Paul, Andreas Janecek, Fernando Buarque de Lima Neto et al.

In this paper, we propose a new methodology to identify takeover targets based on the Negative Selection Algorithm, which belongs to the field of Computational Intelligence, specifically Artificial Immune Systems. Although considerable research based on customary statistical techniques and some contemporary Computational Intelligence techniques has been devoted to identifying takeover targets, most of the existing studies are based upon multiple previous mergers and acquisitions. Contrary to previous research, the novelty of this proposal lies in its ability to suggest takeover targets for novice firms that are at the beginning of their merger and acquisition spree. We first discuss the theoretical perspective and then provide a case study with details for practical implementation, both capitalizing on the unique generalization capabilities of artificial immune systems algorithms.

AI · Jun 9, 2013

Flexibly-bounded Rationality and Marginalization of Irrationality Theories for Decision Making

Tshilidzi Marwala

In this paper the theory of flexibly-bounded rationality which is an extension to the theory of bounded rationality is revisited. Rational decision making involves using information which is almost always imperfect and incomplete together with some intelligent machine which if it is a human being is inconsistent to make decisions. In bounded rationality, this decision is made irrespective of the fact that the information to be used is incomplete and imperfect and that the human brain is inconsistent and thus this decision that is to be made is taken within the bounds of these limitations. In the theory of flexibly-bounded rationality, advanced information analysis is used, the correlation machine is applied to complete missing information and artificial intelligence is used to make more consistent decisions. Therefore flexibly-bounded rationality expands the bounds within which rationality is exercised. Because human decision making is essentially irrational, this paper proposes the theory of marginalization of irrationality in decision making to deal with the problem of satisficing in the presence of irrationality.

AI · May 26, 2013

Semi-bounded Rationality: A model for decision making

Tshilidzi Marwala

In this paper the theory of semi-bounded rationality is proposed as an extension of the theory of bounded rationality. In particular, it is proposed that a decision making process involves two components and these are the correlation machine, which estimates missing values, and the causal machine, which relates the cause to the effect. Rational decision making involves using information which is almost always imperfect and incomplete as well as some intelligent machine which if it is a human being is inconsistent to make decisions. In the theory of bounded rationality this decision is made irrespective of the fact that the information to be used is incomplete and imperfect and the human brain is inconsistent and thus this decision that is to be made is taken within the bounds of these limitations. In the theory of semi-bounded rationality, signal processing is used to filter noise and outliers in the information and the correlation machine is applied to complete the missing information and artificial intelligence is used to make more consistent decisions.