Fast low-level pattern matching algorithm
This is an incremental improvement for bioinformatics researchers, enabling faster and more efficient DNA sequence analysis.
The paper tackles the problem of pattern matching in DNA sequences by overcoming the limitation of small pattern lengths in a previous prime number encoding method, achieving significant time and memory savings through modular arithmetic and low-level optimizations.
This paper focuses on pattern matching in DNA sequences. It was inspired by a previously reported method that encodes both the pattern and the sequence using prime numbers. Although fast, that method is limited to rather small pattern lengths because of computing precision problems. Our approach handles large patterns by using modular arithmetic in the implementation. To obtain results quickly, the code was adapted for multithreaded and parallel implementations. The method is reduced to assembly-language-level instructions, so the final result shows significant time and memory savings compared to the reference algorithm.
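
The abstract does not spell out the encoding, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a Rabin-Karp-style rolling hash in which each nucleotide is mapped to a small prime and every intermediate value is reduced modulo a fixed modulus, so the encoded value stays bounded however long the pattern is. The mapping `PRIME_CODE`, the radix `BASE`, the modulus `MOD`, and the function name `find_pattern` are all hypothetical choices made for illustration.

```python
# Hypothetical nucleotide-to-prime mapping (an assumption, not taken from the paper).
PRIME_CODE = {"A": 2, "C": 3, "G": 5, "T": 7}
BASE = 11            # assumed radix, larger than every code above
MOD = (1 << 61) - 1  # assumed modulus (a Mersenne prime)


def find_pattern(sequence, pattern):
    """Return all start indices where pattern occurs in sequence."""
    n, m = len(sequence), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the leading symbol in a window of length m, reduced mod MOD.
    high = pow(BASE, m - 1, MOD)

    # Encode the pattern and the first window; reducing at every step keeps
    # the values below MOD regardless of the pattern length.
    p_hash = 0
    w_hash = 0
    for i in range(m):
        p_hash = (p_hash * BASE + PRIME_CODE[pattern[i]]) % MOD
        w_hash = (w_hash * BASE + PRIME_CODE[sequence[i]]) % MOD

    matches = []
    for start in range(n - m + 1):
        # Equal hashes can still be a collision, so confirm with a direct comparison.
        if w_hash == p_hash and sequence[start:start + m] == pattern:
            matches.append(start)
        if start + m < n:
            # Slide the window: drop the leading symbol, append the next one.
            w_hash = (w_hash - PRIME_CODE[sequence[start]] * high) % MOD
            w_hash = (w_hash * BASE + PRIME_CODE[sequence[start + m]]) % MOD
    return matches
```

For example, `find_pattern("ACGTACGT", "CGT")` returns `[1, 5]`. Without the reduction modulo `MOD`, the encoded value would grow with the pattern length and overflow a fixed-width machine word; that is the kind of precision limit the abstract attributes to the reference method.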