Irene Córdoba

LG
6papers
32citations
Novelty36%
AI Score19

6 Papers

MLJun 2, 2020
Sparse Cholesky covariance parametrization for recovering latent structure in ordered data

Irene Córdoba, Concha Bielza, Pedro Larrañaga et al.

The sparse Cholesky parametrization of the inverse covariance matrix can be interpreted as a Gaussian Bayesian network; however its counterpart, the covariance Cholesky factor, has received, with few notable exceptions, little attention so far, despite having a natural interpretation as a hidden variable model for ordered signal data. To fill this gap, in this paper we focus on arbitrary zero patterns in the Cholesky factor of a covariance matrix. We discuss how these models can also be extended, in analogy with Gaussian Bayesian networks, to data where no apparent order is available. For the ordered scenario, we propose a novel estimation method that is based on matrix loss penalization, as opposed to the existing regression-based approaches. The performance of this sparse model for the Cholesky factor, together with our novel estimator, is assessed in a simulation setting, as well as over spatial and temporal real data where a natural ordering arises among the variables. We give guidelines, based on the empirical results, about which of the methods analysed is more appropriate for each setting.

PLDec 2, 2018
Ann: A domain-specific language for the effective design and validation of Java annotations

Irene Córdoba, Juan de Lara

This paper describes a new modelling language for the effective design and validation of Java annotations. Since their inclusion in the 5th edition of Java, annotations have grown from a useful tool for the addition of meta-data to play a central role in many popular software projects. Usually they are not conceived in isolation, but in groups, with dependency and integrity constraints between them. However, the native support provided by Java for expressing this design is very limited. To overcome its deficiencies and make explicit the rich conceptual model which lies behind a set of annotations, we propose a domain-specific modelling language. The proposal has been implemented as an Eclipse plug-in, including an editor and an integrated code generator that synthesises annotation processors. The environment also integrates a model finder, able to detect unsatisfiable constraints between different annotations, and to provide examples of correct annotation usages for validation. The language has been tested using a real set of annotations from the Java Persistence API (JPA). Within this subset we have found enough rich semantics expressible with Ann and omitted nowadays by the Java language, which shows the benefits of Ann in a relevant field of application.

LGDec 1, 2018
Towards Gaussian Bayesian Network Fusion

Irene Córdoba, Concha Bielza, Pedro Larrañaga

Data sets are growing in complexity thanks to the increasing facilities we have nowadays to both generate and store data. This poses many challenges to machine learning that are leading to the proposal of new methods and paradigms, in order to be able to deal with what is nowadays referred to as Big Data. In this paper we propose a method for the aggregation of different Bayesian network structures that have been learned from separate data sets, as a first step towards mining data sets that need to be partitioned in an horizontal way, i.e. with respect to the instances, in order to be processed. Considerations that should be taken into account when dealing with this situation are discussed. Scalable learning of Bayesian networks is slowly emerging, and our method constitutes one of the first insights into Gaussian Bayesian network aggregation from different sources. Tested on synthetic data it obtains good results that surpass those from individual learning. Future research will be focused on expanding the method and testing more diverse data sets.

PLJul 10, 2018
A modelling language for the effective design of Java annotations

Irene Córdoba, Juan de Lara

This paper describes a new modelling language for the effective design of Java annotations. Since their inclusion in the 5th edition of Java, annotations have grown from a useful tool for the addition of meta-data to play a central role in many popular software projects. Usually they are conceived as sets with dependency and integrity constraints within them; however, the native support provided by Java for expressing this design is very limited. To overcome its deficiencies and make explicit the rich conceptual model which lies behind a set of annotations, we propose a domain-specific modelling language. The proposal has been implemented as an Eclipse plug-in, including an editor and an integrated code generator that synthesises annotation processors. The language has been tested using a real set of annotations from the Java Persistence API (JPA). It has proven to cover a greater scope with respect to other related work in different shared areas of application.

LGJun 28, 2018
Bayesian optimization of the PC algorithm for learning Gaussian Bayesian networks

Irene Córdoba, Eduardo C. Garrido-Merchán, Daniel Hernández-Lobato et al.

The PC algorithm is a popular method for learning the structure of Gaussian Bayesian networks. It carries out statistical tests to determine absent edges in the network. It is hence governed by two parameters: (i) The type of test, and (ii) its significance level. These parameters are usually set to values recommended by an expert. Nevertheless, such an approach can suffer from human bias, leading to suboptimal reconstruction results. In this paper we consider a more principled approach for choosing these parameters in an automatic way. For this we optimize a reconstruction score evaluated on a set of different Gaussian Bayesian networks. This objective is expensive to evaluate and lacks a closed-form expression, which means that Bayesian optimization (BO) is a natural choice. BO methods use a model to guide the search and are hence able to exploit smoothness properties of the objective surface. We show that the parameters found by a BO method outperform those found by a random search strategy and the expert recommendation. Importantly, we have found that an often overlooked statistical test provides the best over-all reconstruction results.

MEJun 23, 2016
A review of Gaussian Markov models for conditional independence

Irene Córdoba, Concha Bielza, Pedro Larrañaga

Markov models lie at the interface between statistical independence in a probability distribution and graph separation properties. We review model selection and estimation in directed and undirected Markov models with Gaussian parametrization, emphasizing the main similarities and differences. These two model classes are similar but not equivalent, although they share a common intersection. We present the existing results from a historical perspective, taking into account the amount of literature existing from both the artificial intelligence and statistics research communities, where these models were originated. We cover classical topics such as maximum likelihood estimation and model selection via hypothesis testing, but also more modern approaches like regularization and Bayesian methods. We also discuss how the Markov models reviewed fit in the rich hierarchy of other, higher level Markov model classes. Finally, we close the paper overviewing relaxations of the Gaussian assumption and pointing out the main areas of application where these Markov models are nowadays used.