Josef Pichler

2papers

2 Papers

LGApr 22, 2022Code
Federated Learning Enables Big Data for Rare Cancer Boundary Detection

Sarthak Pati, Ujjwal Baid, Brandon Edwards et al.

Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to-date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.

SEJun 13, 2017
Exploring Code Clones in Programmable Logic Controller Software

Hannes Thaller, Rudolf Ramler, Josef Pichler et al.

The reuse of code fragments by copying and pasting is widely practiced in software development and results in code clones. Cloning is considered an anti-pattern as it negatively affects program correctness and increases maintenance efforts. Programmable Logic Controller (PLC) software is no exception in the code clone discussion as reuse in development and maintenance is frequently achieved through copy, paste, and modification. Even though the presence of code clones may not necessary be a problem per se, it is important to detect, track and manage clones as the software system evolves. Unfortunately, tool support for clone detection and management is not commonly available for PLC software systems or limited to generic tools with a reduced set of features. In this paper, we investigate code clones in a real-world PLC software system based on IEC 61131-3 Structured Text and C/C++. We extended a widely used tool for clone detection with normalization support. Furthermore, we evaluated the different types and natures of code clones in the studied system and their relevance for refactoring. Results shed light on the applicability and usefulness of clone detection in the context of industrial automation systems and it demonstrates the benefit of adapting detection and management tools for IEC 611313-3 languages.