Morteza Pourreza Shahri

h-index5

3papers

1,037citations

Novelty27%

AI Score21

Ranked #182,608 of 194,257 authors (top 94%)#2,361 in SE (top 78%)

3 Papers

31.0CLOct 5, 2020

An Ensemble Approach for Automatic Structuring of Radiology Reports

Morteza Pourreza Shahri, Amir Tahmasebi, Bingyang Ye et al.

Automatic structuring of electronic medical records is of high demand for clinical workflow solutions to facilitate extraction, storage, and querying of patient care information. However, developing a scalable solution is extremely challenging, specifically for radiology reports, as most healthcare institutes use either no template or department/institute specific templates. Moreover, radiologists' reporting style varies from one to another as sentences are telegraphic and do not follow general English grammar rules. We present an ensemble method that consolidates the predictions of three models, capturing various attributes of textual information for automatic labeling of sentences with section labels. These three models are: 1) Focus Sentence model, capturing context of the target sentence; 2) Surrounding Context model, capturing the neighboring context of the target sentence; and finally, 3) Formatting/Layout model, aimed at learning report formatting cues. We utilize Bi-directional LSTMs, followed by sentence encoders, to acquire the context. Furthermore, we define several features that incorporate the structure of reports. We compare our proposed approach against multiple baselines and state-of-the-art approaches on a proprietary dataset as well as 100 manually annotated radiology notes from the MIMIC-III dataset, which we are making publicly available. Our proposed approach significantly outperforms other approaches by achieving 97.1% accuracy.

5.0SEApr 16, 2019

Metamorphic Testing for Quality Assurance of Protein Function Prediction Tools

Morteza Pourreza Shahri, Madhusudan Srinivasan, Gillian Reynolds et al.

Proteins are the workhorses of life and gaining insight on their functions is of paramount importance for applications such as drug design. However, the experimental validation of functions of proteins is highly-resource consuming. Therefore, recently, automated protein function prediction (AFP) using machine learning has gained significant interest. Many of these AFP tools are based on supervised learning models trained using existing gold-standard functional annotations, which are known to be incomplete. The main challenge associated with conducting systematic testing on AFP software is the lack of a test oracle, which determines passing or failing of a test case; unfortunately, due to the incompleteness of gold-standard data, the exact expected outcomes are not well defined for the AFP task. Thus, AFP tools face the \emph{oracle problem}. In this work, we use metamorphic testing (MT) to test nine state-of-the-art AFP tools by defining a set of metamorphic relations (MRs) that apply input transformations to protein sequences. According to our results, we observe that several AFP tools fail all the test cases causing concerns over the quality of their predictions.

12.9SEFeb 20, 2018

Quality Assurance of Bioinformatics Software: A Case Study of Testing a Biomedical Text Processing Tool Using Metamorphic Testing

Madhusudan Srinivasan, Morteza Pourreza Shahri, Upulee Kanewala et al.

Bioinformatics software plays a very important role in making critical decisions within many areas including medicine and health care. However, most of the research is directed towards developing tools, and little time and effort is spent on testing the software to assure its quality. In testing, a test oracle is used to determine whether a test is passed or failed during testing, and unfortunately, for much of bioinformatics software, the exact expected outcomes are not well defined. Thus, the main challenge associated with conducting systematic testing on bioinformatics software is the oracle problem. Metamorphic testing (MT) is a technique used to test programs that face the oracle problem. MT uses metamorphic relations (MRs) to determine whether a test has passed or failed and specifies how the output should change according to a specific change made to the input. In this work, we use MT to test LingPipe, a tool for processing text using computational linguistics, often used in bioinformatics for bio-entity recognition from biomedical literature. First, we identify a set of MRs for testing any bio-entity recognition program. Then we develop a set of test cases that can be used to test LingPipe's bio-entity recognition functionality using these MRs. To evaluate the effectiveness of this testing process, we automatically generate a set of faulty versions of LingPipe. According to our analysis of the experimental results, we observe that our MRs can detect the majority of these faulty versions, which shows the utility of this testing technique for quality assurance of bioinformatics software.