Fernando Sancho-Caparrini

AI
4papers
3citations
Novelty40%
AI Score20

4 Papers

AIJul 16, 2023
A Recursive Bateson-Inspired Model for the Generation of Semantic Formal Concepts from Spatial Sensory Data

Jaime de Miguel-Rodriguez, Fernando Sancho-Caparrini

Neural-symbolic approaches to machine learning incorporate the advantages from both connectionist and symbolic methods. Typically, these models employ a first module based on a neural architecture to extract features from complex data. Then, these features are processed as symbols by a symbolic engine that provides reasoning, concept structures, composability, better generalization and out-of-distribution learning among other possibilities. However, neural approaches to the grounding of symbols in sensory data, albeit powerful, still require heavy training and tedious labeling for the most part. This paper presents a new symbolic-only method for the generation of hierarchical concept structures from complex spatial sensory data. The approach is based on Bateson's notion of difference as the key to the genesis of an idea or a concept. Following his suggestion, the model extracts atomic features from raw data by computing elemental sequential comparisons in a stream of multivariate numerical values. Higher-level constructs are built from these features by subjecting them to further comparisons in a recursive process. At any stage in the recursion, a concept structure may be obtained from these constructs and features by means of Formal Concept Analysis. Results show that the model is able to produce fairly rich yet human-readable conceptual representations without training. Additionally, the concept structures obtained through the model (i) present high composability, which potentially enables the generation of 'unseen' concepts, (ii) allow formal reasoning, and (iii) have inherent abilities for generalization and out-of-distribution learning. Consequently, this method may offer an interesting angle to current neural-symbolic research. Future work is required to develop a training methodology so that the model can be tested against a larger dataset.

LGJul 20, 2019
Improving Skip-Gram based Graph Embeddings via Centrality-Weighted Sampling

Pedro Almagro-Blanco, Fernando Sancho-Caparrini

Network embedding techniques inspired by word2vec represent an effective unsupervised relational learning model. Commonly, by means of a Skip-Gram procedure, these techniques learn low dimensional vector representations of the nodes in a graph by sampling node-context examples. Although many ways of sampling the context of a node have been proposed, the effects of the way a node is chosen have not been analyzed in depth. To fill this gap, we have re-implemented the main four word2vec inspired graph embedding techniques under the same framework and analyzed how different sampling distributions affects embeddings performance when tested in node classification problems. We present a set of experiments on different well known real data sets that show how the use of popular centrality distributions in sampling leads to improvements, obtaining speeds of up to 2 times in learning times and increasing accuracy in all cases.

AISep 7, 2017
Semantic Preserving Embeddings for Generalized Graphs

Pedro Almagro-Blanco, Fernando Sancho-Caparrini

A new approach to the study of Generalized Graphs as semantic data structures using machine learning techniques is presented. We show how vector representations maintaining semantic characteristics of the original data can be obtained from a given graph using neural encoding architectures and considering the topological properties of the graph. Semantic features of these new representations are tested by using some machine learning tasks and new directions on efficient link discovery, entitity retrieval and long distance query methodologies on large relational datasets are investigated using real datasets. ---- En este trabajo se presenta un nuevo enfoque en el contexto del aprendizaje automático multi-relacional para el estudio de Grafos Generalizados. Se muestra cómo se pueden obtener representaciones vectoriales que mantienen características semánticas del grafo original utilizando codificadores neuronales y considerando las propiedades topológicas del grafo. Además, se evalúan las características semánticas capturadas por estas nuevas representaciones y se investigan nuevas metodologías eficientes relacionadas con Link Discovery, Entity Retrieval y consultas a larga distancia en grandes conjuntos de datos relacionales haciendo uso de bases de datos reales.

LGAug 18, 2017
Induction of Decision Trees based on Generalized Graph Queries

Pedro Almagro-Blanco, Fernando Sancho-Caparrini

Usually, decision tree induction algorithms are limited to work with non relational data. Given a record, they do not take into account other objects attributes even though they can provide valuable information for the learning task. In this paper we present GGQ-ID3, a multi-relational decision tree learning algorithm that uses Generalized Graph Queries (GGQ) as predicates in the decision nodes. GGQs allow to express complex patterns (including cycles) and they can be refined step-by-step. Also, they can evaluate structures (not only single records) and perform Regular Pattern Matching. GGQ are built dynamically (pattern mining) during the GGQ-ID3 tree construction process. We will show how to use GGQ-ID3 to perform multi-relational machine learning keeping complexity under control. Finally, some real examples of automatically obtained classification trees and semantic patterns are shown. ----- Normalmente, los algoritmos de inducción de árboles de decisión trabajan con datos no relacionales. Dado un registro, no tienen en cuenta los atributos de otros objetos a pesar de que éstos pueden proporcionar información útil para la tarea de aprendizaje. En este artículo presentamos GGQ-ID3, un algoritmo de aprendizaje de árboles de decisiones multi-relacional que utiliza Generalized Graph Queries (GGQ) como predicados en los nodos de decisión. Los GGQs permiten expresar patrones complejos (incluyendo ciclos) y pueden ser refinados paso a paso. Además, pueden evaluar estructuras (no solo registros) y llevar a cabo Regular Pattern Matching. En GGQ-ID3, los GGQ son construidos dinámicamente (pattern mining) durante el proceso de construcción del árbol. Además, se muestran algunos ejemplos reales de árboles de clasificación multi-relacionales y patrones semánticos obtenidos automáticamente.