SE Aug 25, 2021
Toward Formal Data Set Verification for Building Effective Machine Learning Models
Jorge López, Maxime Labonne, Claude Poletti
To train a machine learning model properly, data must be properly collected. One way to guarantee proper data collection is to verify that the collected data set holds certain properties, for example, that the data set contains samples across the whole input space, or that it is balanced w.r.t. the different classes. We present a formal approach for verifying a set of arbitrarily stated properties over a data set. The proposed approach relies on the transformation of the data set into a first-order logic formula, which can later be verified w.r.t. the different properties, also stated in the same logic. A prototype tool, which uses the z3 solver, has been developed; the prototype takes as input a set of properties stated in a formal language and formally verifies a given data set w.r.t. the given set of properties. Preliminary experimental results show the feasibility and performance of the proposed approach, as well as its flexibility for expressing properties of interest.
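The two example properties named in the abstract (class balance and coverage of the input space) can be illustrated with a minimal sketch. This is not the authors' prototype: it swaps the first-order logic encoding and the z3 solver for plain Python predicates, and the sample format, the function names (`is_balanced`, `covers_input_space`, `verify`), the tolerance, and the bin count are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical toy format: (single feature value, class label).
Sample = Tuple[float, str]


def is_balanced(data: List[Sample], tolerance: float = 0.1) -> bool:
    """True if every class share is within `tolerance` of the uniform share."""
    counts = Counter(label for _, label in data)
    if not counts:
        return False
    uniform = 1.0 / len(counts)
    return all(abs(c / len(data) - uniform) <= tolerance for c in counts.values())


def covers_input_space(data: List[Sample], low: float, high: float, bins: int = 4) -> bool:
    """True if each of `bins` equal-width intervals of [low, high] holds a sample."""
    width = (high - low) / bins
    hit = set()
    for x, _ in data:
        if low <= x <= high:
            # Clamp so that x == high falls into the last interval.
            hit.add(min(int((x - low) / width), bins - 1))
    return len(hit) == bins


def verify(data: List[Sample], properties: Dict[str, Callable[[List[Sample]], bool]]) -> Dict[str, bool]:
    """Evaluate every named property over the data set."""
    return {name: prop(data) for name, prop in properties.items()}


DATA: List[Sample] = [(0.1, "a"), (0.3, "b"), (0.6, "a"), (0.9, "b")]
RESULT = verify(
    DATA,
    {
        "balanced": is_balanced,
        "coverage": lambda d: covers_input_space(d, 0.0, 1.0),
    },
)
```

A solver-based encoding, as described in the abstract, would state the same properties as logical formulas over the data set rather than as executable predicates.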
NI Jul 15, 2021
Dynamic Link Network Emulation: a Model-based Design
Erick Petersen, Jorge López, Natalia Kushik et al.
This paper presents the design and architecture of a network emulator whose links' parameters (such as delay and bandwidth) vary at different time instances. The emulator can thus be used in order to test and evaluate novel solutions for such networks, before their final deployment. To achieve this goal, different existing technologies are carefully combined to emulate link dynamicity, automatic traffic generation, and overall network device emulation. The emulator takes as an input a formal model of the network to emulate and configures all required software to execute live software instances of the desired network components, in the requested topology. We devote our study to the so-called dynamic link networks, with potentially asymmetric links. Since emulating asymmetric dynamic links is far from trivial (even with the existing state-of-the-art tools), we provide a detailed design architecture that allows this. As a case study, a satellite network emulation is presented. Experimental results show the precision of our dynamic assignments and the overall flexibility of the proposed solution.
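One way to picture a model of links whose parameters vary at different time instances is sketched below. It is only an assumed data structure, not the emulator's actual input format: the class names (`LinkParams`, `DynamicLink`), the field names, the node names, and all numeric values are hypothetical. Asymmetry is represented by modelling each direction as its own directed link.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LinkParams:
    delay_ms: float
    bandwidth_mbps: float


@dataclass
class DynamicLink:
    """Directed link src -> dst whose parameters change at given time instances."""
    src: str
    dst: str
    # (start time in seconds, parameters), sorted by start time.
    schedule: List[Tuple[float, LinkParams]]

    def params_at(self, t: float) -> LinkParams:
        """Return the parameters in force at time t."""
        times = [start for start, _ in self.schedule]
        idx = bisect_right(times, t) - 1
        if idx < 0:
            raise ValueError(f"no parameters defined at t={t}")
        return self.schedule[idx][1]


# Asymmetric pair: the two directions carry different schedules.
uplink = DynamicLink(
    "ground", "sat",
    [(0.0, LinkParams(250.0, 10.0)), (60.0, LinkParams(280.0, 5.0))],
)
downlink = DynamicLink("sat", "ground", [(0.0, LinkParams(250.0, 50.0))])
```

Calling `uplink.params_at(30.0)` returns the first scheduled parameters, while any time from 60 seconds onward returns the second.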
NI Nov 29, 2020
Short-Term Flow-Based Bandwidth Forecasting using Machine Learning
Maxime Labonne, Jorge López, Claude Poletti et al.
This paper proposes a novel framework to predict traffic flows' bandwidth ahead of time. Modern network management systems share a common issue: the network situation evolves between the moment a decision is made and the moment when actions (countermeasures) are applied. This framework converts packets from real-life traffic into flows containing relevant features. Machine learning models, including Decision Tree, Random Forest, XGBoost, and Deep Neural Network, are trained on these data to predict the bandwidth at the next time instance for every flow. Predictions can be fed to the management system instead of the flows' current bandwidth, so that decisions are based on a more accurate network state. Experiments were performed on 981,774 flows and 15 different time windows (from 0.03s to 4s). They show that the Random Forest is the best performing and most reliable model, with a predictive performance consistently better than relying on the current bandwidth (+19.73% in mean absolute error and +18.00% in root mean square error). Experimental results indicate that this framework can help network management systems make more informed decisions using a predicted network state.
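The comparison against "relying on the current bandwidth" can be sketched as follows: a naive baseline uses the bandwidth in the current window as its guess for the next window, and a model is scored against it with mean absolute error and root mean square error. The data and the model predictions below are made up, and reading the reported percentages as a relative error reduction is an assumption, since the abstract does not define them.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def relative_improvement(baseline_error: float, model_error: float) -> float:
    """Error reduction of the model over the baseline, in percent (assumed definition)."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Hypothetical bandwidth of one flow over consecutive time windows.
series = [10.0, 12.0, 11.0, 15.0, 14.0]
actual_next = series[1:]    # what the network looks like one window later
naive = series[:-1]         # baseline: assume the current bandwidth persists
model = [11.0, 11.5, 13.0, 14.5]  # hypothetical model predictions

mae_gain = relative_improvement(mae(actual_next, naive), mae(actual_next, model))
rmse_gain = relative_improvement(rmse(actual_next, naive), rmse(actual_next, model))
```

On this toy series the naive baseline has a mean absolute error of 2.0 and the hypothetical model 1.0, so `mae_gain` is 50 percent; the figures carry no meaning beyond illustrating the calculation.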
SE Sep 21, 2020
On using SMT-solvers for Modeling and Verifying Dynamic Network Emulators
Erick Petersen, Jorge López, Natalia Kushik et al.
A novel model-based approach to verify dynamic networks is proposed; the approach consists of formally describing the network topology and dynamic link parameters. A many-sorted first-order logic formula is constructed to check the model with respect to a set of properties. The network consistency is verified using an SMT-solver, and the formula is used for the run-time network verification when a given static network instance is implemented. The z3 solver is used for this purpose, and the corresponding preliminary experiments showcase the expressiveness and current limitations of the proposed approach.
SE Mar 26, 2018
Source Code Optimization using Equivalent Mutants
Jorge López, Natalia Kushik, Nina Yevtushenko
A mutant is a program obtained by syntactically modifying a program's source code; an equivalent mutant is a mutant that is functionally equivalent to the original program. Mutants are primarily used in mutation testing, and when deriving a test suite, obtaining an equivalent mutant is considered highly undesirable, although such equivalent mutants could be used for other purposes. We present an approach that considers equivalent mutants valuable and utilizes them for source code optimization. Source code optimization enhances a program's source code while preserving its behavior. We showcase a procedure to achieve source code optimization based on equivalent mutants and discuss proper mutation operators. Experimental evaluation with Java and C programs demonstrates the applicability of the proposed approach. An algorithmic approach for source code optimization using equivalent mutants is proposed, and it is shown that, whenever applicable, the approach can outperform traditional compiler optimizations.
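The idea of an equivalent mutant that is also an optimization can be illustrated with a small, assumed example (in Python rather than the Java and C used in the paper's evaluation, and not taken from the paper). The mutant differs from the original by a single syntactic change to a loop bound; it returns the same sorted list but performs fewer comparisons. The comparison counter is instrumentation added for this sketch only.

```python
import random
from typing import List, Tuple


def bubble_sort(xs: List[int]) -> Tuple[List[int], int]:
    """Original program: returns (sorted list, number of comparisons made)."""
    out, comparisons = list(xs), 0
    n = len(out)
    for i in range(n):
        for j in range(n - 1):
            comparisons += 1
            if out[j] > out[j + 1]:
                out[j], out[j + 1] = out[j + 1], out[j]
    return out, comparisons


def bubble_sort_mutant(xs: List[int]) -> Tuple[List[int], int]:
    """Mutant: the inner loop bound `n - 1` is replaced by `n - 1 - i`."""
    out, comparisons = list(xs), 0
    n = len(out)
    for i in range(n):
        for j in range(n - 1 - i):  # the only syntactic change
            comparisons += 1
            if out[j] > out[j + 1]:
                out[j], out[j + 1] = out[j + 1], out[j]
    return out, comparisons


def agree_on_random_inputs(trials: int = 200, seed: int = 0) -> bool:
    """Random testing gives evidence of equivalence, not a proof."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 12))]
        if bubble_sort(xs)[0] != bubble_sort_mutant(xs)[0]:
            return False
    return True
```

For an input of length n, the original makes n(n-1) comparisons and the mutant n(n-1)/2, so the mutant halves the comparison count while producing the same output.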