Francisco A. Rodrigues

LG
h-index: 14
Papers: 9
Citations: 715
Novelty: 22%
AI Score: 30

9 Papers

LG Apr 7, 2022
Forecasting new diseases in low-data settings using transfer learning

Kirstin Roster, Colm Connaughton, Francisco A. Rodrigues

Recent infectious disease outbreaks, such as the COVID-19 pandemic and the Zika epidemic in Brazil, have demonstrated both the importance and difficulty of accurately forecasting novel infectious diseases. When new diseases first emerge, we have little knowledge of the transmission process, the level and duration of immunity to reinfection, or other parameters required to build realistic epidemiological models. Time series forecasts and machine learning, while less reliant on assumptions about the disease, require large amounts of data that are also not available in early stages of an outbreak. In this study, we examine how knowledge of related diseases can help make predictions of new diseases in data-scarce environments using transfer learning. We implement both an empirical and a theoretical approach. Using empirical data from Brazil, we compare how well different machine learning models transfer knowledge between two different disease pairs: (i) dengue and Zika, and (ii) influenza and COVID-19. In the theoretical analysis, we generate data using different transmission and recovery rates with an SIR compartmental model, and then compare the effectiveness of different transfer learning methods. We find that transfer learning offers the potential to improve predictions, even beyond a model based on data from the target disease, though the appropriate source disease must be chosen carefully. While imperfect, these models offer an additional input for decision makers during pandemic response.
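
The theoretical setup described in this abstract, generating data from an SIR compartmental model under different transmission and recovery rates, can be illustrated with a minimal sketch. The function name `simulate_sir`, the forward-Euler integrator, the initial conditions, and the specific rate values are all assumptions chosen for illustration; the abstract does not state how the authors simulated their data.

```python
def simulate_sir(beta, gamma, s0=0.99, i0=0.01, days=100, dt=0.1):
    """Integrate the SIR model on population fractions with forward Euler.

    beta  : transmission rate per day (illustrative value, not from the paper)
    gamma : recovery rate per day (illustrative value, not from the paper)
    Returns the infected fraction at day 0, 1, ..., days.
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    steps_per_day = round(1 / dt)
    infected = [i]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt   # flow S -> I
            new_recoveries = gamma * i * dt      # flow I -> R
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        infected.append(i)
    return infected


# Hypothetical "source" and "target" diseases that differ only in beta.
source = simulate_sir(beta=0.3, gamma=0.1)
target = simulate_sir(beta=0.5, gamma=0.1)
```

In a transfer learning experiment of the kind the abstract describes, a forecasting model would be fitted on a curve like `source` and then evaluated or fine-tuned on a short prefix of `target`; that step is left out here because the abstract does not specify the models used.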

LG Oct 16, 2023
Machine learning in physics: a short guide

Francisco A. Rodrigues

Machine learning is a rapidly growing field with the potential to revolutionize many areas of science, including physics. This review provides a brief overview of machine learning in physics, covering the main concepts of supervised, unsupervised, and reinforcement learning, as well as more specialized topics such as causal inference, symbolic regression, and deep learning. We present some of the principal applications of machine learning in physics and discuss the associated challenges and perspectives.

SOC-PH Nov 19, 2023
Beyond the Power Law: Estimation, Goodness-of-Fit, and a Semiparametric Extension in Complex Networks

Nixon Jerez-Lillo, Francisco A. Rodrigues, Paulo H. Ferreira et al.

Scale-free networks play a fundamental role in the study of complex networks and various applied fields due to their ability to model a wide range of real-world systems. A key characteristic of these networks is their degree distribution, which often follows a power-law distribution, where the probability mass function is proportional to $x^{-\alpha}$, with $\alpha$ typically in the range $2 < \alpha < 3$. In this paper, we introduce Bayesian inference methods to obtain precise credible intervals and more accurate estimates than those of traditional methods, which are often biased. Through a simulation study, we demonstrate that our approach provides nearly unbiased estimates for the scaling parameter, enhancing the reliability of inferences. We also evaluate new goodness-of-fit tests to improve the effectiveness of the Kolmogorov-Smirnov test, commonly used for this purpose. Our findings show that the Watson test offers superior power while maintaining a controlled type I error rate, enabling us to better determine whether data adhere to a power-law distribution. Finally, we propose a piecewise extension of this model to provide greater flexibility, evaluating the estimation and its goodness-of-fit features as well. In the complex networks field, this extension allows us to model the full degree distribution, instead of just focusing on the tail, as is commonly done. We demonstrate the utility of these novel methods through applications to two real-world datasets, showcasing their practical relevance and potential to advance the analysis of power-law behavior.
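
For reference, the proportionality stated in this abstract can be written as a normalized probability mass function. The lower cutoff $x_{\min}$ is an assumption introduced here to make the normalization explicit; the abstract gives only the proportionality and does not state the paper's exact parameterization.

```latex
p(x) = \frac{x^{-\alpha}}{\zeta(\alpha, x_{\min})},
\qquad x = x_{\min},\, x_{\min}+1,\, \dots,
\qquad
\zeta(\alpha, x_{\min}) = \sum_{n=0}^{\infty} (n + x_{\min})^{-\alpha}
```

Here $\zeta(\alpha, x_{\min})$ is the Hurwitz zeta function, which converges for $\alpha > 1$ and so covers the range $2 < \alpha < 3$ mentioned above.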

LG Aug 27, 2025
Discovering equations from data: symbolic regression in dynamical systems

Beatriz R. Brum, Luiza Lober, Isolde Previdelli et al.

The process of discovering equations from data lies at the heart of physics and many other areas of research, including mathematical ecology and epidemiology. Recently, machine learning methods known as symbolic regression have automated this process. As several methods are available in the literature, it is important to compare them, particularly for dynamical systems that describe complex phenomena. In this paper, five symbolic regression methods were used to recover equations from nine dynamical processes, including chaotic dynamics and epidemic models, with the PySR method proving to be the most suitable for inferring equations. Benchmark results demonstrate its high predictive power and accuracy, with some estimates being indistinguishable from the original analytical forms. These results highlight the potential of symbolic regression as a robust tool for inferring and modelling real-world phenomena.

SP Oct 7, 2021
EEG functional connectivity and deep learning for automatic diagnosis of brain disorders: Alzheimer's disease and schizophrenia

Caroline L. Alves, Aruane M. Pineda, Kirstin Roster et al.

Mental disorders are among the leading causes of disability worldwide. The first step in treating these conditions is to obtain an accurate diagnosis, but the absence of established clinical tests makes this task challenging. Machine learning algorithms can provide a possible solution to this problem, as we describe in this work. We present a method for the automatic diagnosis of mental disorders based on the matrix of connections obtained from EEG time series and deep learning. We show that our approach can classify patients with Alzheimer's disease and schizophrenia with a high level of accuracy. A comparison with traditional approaches, which use raw EEG time series, shows that our method provides the highest precision. Therefore, the application of deep neural networks to data from brain connections is a very promising method for the diagnosis of neurological disorders.

LG Jun 22, 2021
Neural networks for dengue forecasting: a systematic review

Luiza Lober, Francisco A. Rodrigues, Kirstin Roster

Background: Early forecasts of dengue are an important tool for disease mitigation. Neural networks are powerful predictive models that have made contributions to many areas of public health. In this study, we reviewed the application of neural networks in the dengue forecasting literature, with the objective of informing model design for future work. Methods: Following PRISMA guidelines, we conducted a systematic search of studies that use neural networks to forecast dengue in human populations. We summarized the relative performance of neural networks and comparator models, architectures and hyper-parameters, choices of input features, geographic spread, and model transparency. Results: Sixty-two papers were included. Most studies implemented shallow feed-forward neural networks, using historical dengue incidence and climate variables. Prediction horizons varied greatly, as did the model selection and evaluation approach. Building on the strengths of neural networks, most studies used granular observations at the city level or for city subdivisions, while also commonly employing weekly data. Performance of neural networks relative to comparators, such as tree-based supervised models, varied across study contexts: 63% of all studies include at least one such model as a baseline, and in about half of those cases neural networks are reported as the best-performing model. Conclusions: The studies suggest that neural networks can provide competitive forecasts for dengue, and can reliably be included in the set of candidate models for future dengue prediction efforts. The use of deep networks is relatively unexplored but offers promising avenues for further research, as does the use of a broader set of input features and prediction in light of structural changes in the data generation mechanism.

AP Aug 8, 2018
Pattern Recognition Approach to Violin Shapes of MIMO database

Thomas Peron, Francisco A. Rodrigues, Luciano da F. Costa

Since the landmarks established by the Cremonese school in the 16th century, the history of violin design has been marked by experimentation. While great effort has been invested since the early 19th century by the scientific community in researching violin acoustics, substantially less attention has been given to the statistical characterization of how the violin shape evolved over time. In this paper we study the morphology of violins retrieved from the Musical Instrument Museums Online (MIMO) database, the largest freely accessible platform providing information about instruments held in public museums. From the violin images, we derive a set of measurements that reflect relevant geometrical features of the instruments. The application of Principal Component Analysis (PCA) uncovered similarities between violin makers and their respective copyists, as well as among luthiers belonging to the same family lineage, in the context of historical narrative. Combined with a time-windowed approach, thin-plate spline visualizations revealed that the average violin outline has remained mostly stable over time, not adhering to any particular trends of design across different periods in music history.

LG Dec 26, 2016
Clustering Algorithms: A Comparative Approach

Mayra Z. Rodriguez, Cesar H. Comin, Dalcimar Casanova et al.

Many real-world systems can be studied in terms of pattern recognition tasks, so that proper use (and understanding) of machine learning methods in practical applications becomes essential. While a myriad of classification methods have been proposed, there is no consensus on which methods are more suitable for a given dataset. As a consequence, it is important to comprehensively compare methods in many possible scenarios. In this context, we performed a systematic comparison of 7 well-known clustering methods available in the R language. In order to account for the many possible variations of data, we considered artificial datasets with several tunable properties (number of classes, separation between classes, etc.). In addition, we also evaluated the sensitivity of the clustering methods with regard to their parameter configuration. The results revealed that, when considering the default configurations of the adopted methods, the spectral approach usually outperformed the other clustering algorithms. We also found that the default configuration of the adopted implementations was not always accurate. In these cases, a simple approach based on random selection of parameter values proved to be a good alternative to improve the performance. All in all, the reported approach provides guidance for choosing clustering algorithms.

SOC-PH Jun 17, 2016
Complex systems: features, similarity and connectivity

Cesar H. Comin, Thomas K. DM. Peron, Filipi N. Silva et al.

The increasing interest in complex networks research has been a consequence of several intrinsic features of this area, such as the generality of the approach to represent and model virtually any discrete system, and the incorporation of concepts and methods deriving from many areas, from statistical physics to sociology, which are often used in an independent way. Yet, for this same reason, it would be desirable to integrate these various aspects into a more coherent and organic framework, which would bring several benefits normally afforded by systematization in science, including the identification of new types of problems and the cross-fertilization between fields. More specifically, the identification of the main areas to which the concepts frequently used in complex networks can be applied paves the way to adopting and applying a larger set of concepts and methods deriving from those respective areas. Among the several areas that have been used in complex networks research, pattern recognition, optimization, linear algebra, and time series analysis seem to play a more basic and recurrent role. In the present manuscript, we propose a systematic way to integrate the concepts from these diverse areas regarding complex networks research. In order to do so, we start by grouping the multidisciplinary concepts into three main groups, namely features, similarity, and network connectivity. Then we show that several of the analysis and modeling approaches to complex networks can be thought of as a composition of maps between these three groups, with emphasis on nine main types of mappings, which are presented and illustrated. Such a systematization of principles and approaches also provides an opportunity to review some of the most closely related works in the literature, which is also developed in this article.