Abstractions, Algorithms and Data Structures for Structural Bioinformatics in PyCogent
This work provides structural bioinformatics tools for researchers who use PyCogent. The contribution is incremental: it adds functionality to an existing framework rather than introducing a new paradigm.
The authors extended PyCogent, a sequence-based bioinformatics framework, to handle three-dimensional structural data by developing Python modules with object-oriented abstractions, efficient data structures, and fast algorithms for structural processing. With both kinds of analysis in the same framework, sequence-based work can draw on structure-derived data, and structural studies can draw on PyCogent's sequence-based tools.
To facilitate flexible and efficient structural bioinformatics analyses, new functionality for three-dimensional structure processing and analysis has been introduced into PyCogent, a popular, feature-rich framework for sequence-based bioinformatics, but one that has lacked equally powerful tools for handling structural (coordinate-based) data. Extensible Python modules have been developed that provide object-oriented abstractions (based on a hierarchical representation of macromolecules), efficient data structures (e.g. kD-trees), fast implementations of common algorithms (e.g. surface-area calculations), read/write support for Protein Data Bank-related file formats, and wrappers for external command-line applications (e.g. Stride). Integration of this code into PyCogent is symbiotic, allowing sequence-based work to benefit from structure-derived data and, reciprocally, enabling structural studies to leverage PyCogent's versatile tools for phylogenetic and evolutionary analyses.
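To illustrate why a kD-tree suits coordinate-based data, the following is a minimal, generic sketch of a three-dimensional kD-tree with a fixed-radius neighbour query, the kind of lookup that contact and surface-area calculations typically rely on. It is not PyCogent's implementation or API: the names `KDNode`, `build_kdtree` and `within_radius` are hypothetical, and the coordinates are made-up values chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass
class KDNode:
    """One node of a 3-D kD-tree (hypothetical name, illustration only)."""
    point: Point
    index: int                      # position of the point in the input list
    axis: int                       # splitting axis: 0 = x, 1 = y, 2 = z
    left: Optional["KDNode"] = None
    right: Optional["KDNode"] = None


def build_kdtree(points: Sequence[Point],
                 indices: Optional[List[int]] = None,
                 depth: int = 0) -> Optional[KDNode]:
    """Build the tree recursively, splitting at the median and cycling x, y, z."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % 3
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    i = indices[mid]
    return KDNode(points[i], i, axis,
                  build_kdtree(points, indices[:mid], depth + 1),
                  build_kdtree(points, indices[mid + 1:], depth + 1))


def within_radius(node: Optional[KDNode], query: Point, radius: float,
                  found: Optional[List[int]] = None) -> List[int]:
    """Collect the indices of all points within `radius` of `query`."""
    if found is None:
        found = []
    if node is None:
        return found
    dist_sq = sum((a - b) ** 2 for a, b in zip(node.point, query))
    if dist_sq <= radius * radius:
        found.append(node.index)
    # Signed distance from the query to this node's splitting plane.
    delta = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if delta < 0 else (node.right, node.left)
    within_radius(near, query, radius, found)
    # The far subtree can only hold hits if the plane is within the radius.
    if abs(delta) <= radius:
        within_radius(far, query, radius, found)
    return found


# Made-up coordinates (arbitrary units), not taken from any real structure.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0),
          (10.0, 10.0, 10.0), (3.8, 0.0, 0.0)]
tree = build_kdtree(coords)
print(sorted(within_radius(tree, (0.0, 0.0, 0.0), 2.0)))  # prints [0, 1, 2]
```

The pruning test on the splitting plane is what lets a query skip whole subtrees, so on reasonably balanced data most points are never examined, in contrast to a brute-force scan over every coordinate.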