IT, May 8, 2020
Sparsifying Parity-Check Matrices
Luís M. S. Russo, Tobias Dietz, José Rui Figueira et al.
Parity-check matrices (PCMs) define linear error-correcting codes, which ensure reliable information transmission over noisy channels. The set of codewords of such a code is the null space of its binary PCM. We consider the problem of minimizing the number of one-entries in parity-check matrices. In maximum-likelihood (ML) decoding, the number of ones in the PCM is directly related to the time required to decode messages. We propose a simple matrix row manipulation heuristic which alters the PCM, but not the code itself. We apply simulated annealing and greedy local searches to obtain PCMs with a small number of one-entries quickly, i.e., within a few minutes or hours on mainstream hardware. The resulting matrices provide faster ML decoding procedures, especially for large codes.
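The abstract does not spell out the row manipulation, so the following is only a minimal sketch of one plausible reading: adding one row to another over GF(2) (a bitwise XOR) leaves the row space, and hence the null space, unchanged, and a greedy local search keeps such a move only when it lowers the number of ones. The names `weight`, `greedy_sparsify` and `codewords`, the bitmask row encoding and the small example matrix are all assumptions made for illustration, not the authors' implementation, and the simulated annealing variant is not shown.

```python
def weight(rows):
    """Total number of one-entries in a matrix whose rows are int bitmasks."""
    return sum(bin(r).count("1") for r in rows)


def greedy_sparsify(rows):
    """Greedy local search: replace row i by row i XOR row j when that is sparser.

    Each accepted move strictly lowers the total weight, so the loop terminates.
    XOR-ing one row into another is an invertible row operation over GF(2),
    so the code (the null space of the matrix) is unchanged.
    """
    rows = list(rows)
    improved = True
    while improved:
        improved = False
        for i in range(len(rows)):
            for j in range(len(rows)):
                if i == j:
                    continue
                candidate = rows[i] ^ rows[j]
                if bin(candidate).count("1") < bin(rows[i]).count("1"):
                    rows[i] = candidate
                    improved = True
    return rows


def codewords(rows, n):
    """Brute-force null space: all n-bit vectors with even overlap with every row."""
    return {
        x
        for x in range(1 << n)
        if all(bin(r & x).count("1") % 2 == 0 for r in rows)
    }


# Hypothetical 3 x 4 parity-check matrix, one bitmask per row.
H = [0b1101, 0b1111, 0b0110]
H_sparse = greedy_sparsify(H)
print(weight(H), "->", weight(H_sparse))
print(codewords(H, 4) == codewords(H_sparse, 4))
```

The brute-force `codewords` check is only practical for tiny matrices; it is included here to make the "same code, fewer ones" property visible.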
DC, Nov 11, 2016
Beyond NGS data sharing and towards open science
Bruno Dantas, Calmenelias Fleitas, Alexandre P. Francisco et al.
The biosciences have been revolutionized by next-generation sequencing (NGS) technologies in recent years, leading to new perspectives in medical, industrial and environmental applications. Although our motivation comes from the biosciences, the following holds for many areas of science: published results are usually hard to reproduce because either the data or the tools are not readily available, which delays the adoption of new methodologies and hinders innovation. Our focus is on tool readiness and pipeline availability. Even though most tools are freely available, pipelines for data analysis are in general barely described and their configuration is far from trivial, with many parameters to be tuned. In this paper we discuss how to effectively build and use pipelines, relying on state-of-the-art computing technologies to execute them without users needing to configure, install and manage tools, servers and complex workflow management systems. We perform an in-depth comparative analysis of state-of-the-art frameworks and systems. We propose the NGSPipes framework, showing that public pipelines can be ready to process and analyse experimental data, produced for instance by high-throughput technologies, without relying on centralized servers or Web services. The NGSPipes framework and its underlying architecture provide a major step towards open science and true collaboration on tools and pipelines among computational biology researchers and practitioners. We show that it is possible to execute data analysis pipelines in a decentralized and platform-independent way. Approaches like the one proposed are crucial for archiving and reusing data analysis pipelines in the medium and long term. The NGSPipes framework is freely available at http://ngspipes.github.io/.
DS, Jul 19, 2012
Quick HyperVolume
Luís M. S. Russo, Alexandre P. Francisco
We present a new algorithm to calculate exact hypervolumes. Given a set of $d$-dimensional points, it computes the hypervolume of the dominated space. Determining this value is an important subroutine of Multiobjective Evolutionary Algorithms (MOEAs). We analyze the "Quick Hypervolume" (QHV) algorithm theoretically and experimentally. The theoretical results are a significant contribution to the current state of the art. Moreover, the experimental performance is also very competitive compared with existing exact hypervolume algorithms. A full description of the algorithm has been submitted to IEEE Transactions on Evolutionary Computation.
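To make the quantity being computed concrete, here is a naive exact baseline, not the QHV algorithm, whose details the abstract does not give. It assumes a maximization setting with the reference point at the origin and non-negative coordinates, so each point dominates the axis-aligned box between the origin and itself, and the hypervolume is the volume of the union of those boxes. The sketch uses inclusion-exclusion over subsets of points, which is exponential in the number of points and only suitable for tiny inputs; the function name `hypervolume` is an assumption.

```python
from itertools import combinations
from math import prod


def hypervolume(points):
    """Exact volume of the union of boxes [0, p_1] x ... x [0, p_d].

    Inclusion-exclusion: the intersection of the boxes of a subset of points
    is the box whose corner is the coordinate-wise minimum of that subset.
    """
    if not points:
        return 0.0
    d = len(points[0])
    total = 0.0
    for size in range(1, len(points) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(points, size):
            corner = [min(p[k] for p in subset) for k in range(d)]
            total += sign * prod(corner)
    return total


# Three mutually non-dominated points in 2D form a staircase of area 3 + 2 + 1.
print(hypervolume([(3, 1), (2, 2), (1, 3)]))  # → 6.0
```

A baseline like this is mainly useful as a correctness check for faster exact algorithms on small instances.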