Zdeněk Straka

h-index: 5

4 papers

36 citations

Novelty: 38%

AI Score: 36

Ranked #102,309 of 194,257 authors (top 53%); #34,326 in CV (top 58%)

4 Papers

3.6 | CV | Jul 18, 2025

Multi-Centre Validation of a Deep Learning Model for Scoliosis Assessment

Šimon Kubov, Simon Klíčník, Jakub Dandár et al.

Scoliosis affects roughly 2 to 4 percent of adolescents, and treatment decisions depend on precise Cobb angle measurement. Manual assessment is time-consuming and subject to inter-observer variation. We conducted a retrospective, multi-centre evaluation of fully automated deep learning software (Carebot AI Bones, Spine Measurement functionality; Carebot s.r.o.) on 103 standing anteroposterior whole-spine radiographs collected from ten hospitals. Two musculoskeletal radiologists independently measured each study and served as reference readers. Agreement between the AI and each radiologist was assessed with Bland-Altman analysis, mean absolute error (MAE), root mean squared error (RMSE), Pearson correlation coefficient, and Cohen's kappa for four-grade severity classification. Against Radiologist 1, the AI achieved an MAE of 3.89° (RMSE 4.77°) with a bias of 0.70° and limits of agreement from -8.59° to +9.99°. Against Radiologist 2, the AI achieved an MAE of 3.90° (RMSE 5.68°) with a bias of 2.14° and limits of agreement from -8.23° to +12.50°. Pearson correlations were r = 0.906 and r = 0.880 (inter-reader r = 0.928), while Cohen's kappa for severity grading reached 0.51 and 0.64 (inter-reader kappa 0.59). These results demonstrate that the proposed software reproduces expert-level Cobb angle measurements and categorical grading across multiple centres, suggesting its utility for streamlining scoliosis reporting and triage in clinical workflows.
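The agreement statistics named in this abstract can be illustrated with a minimal sketch. It assumes paired Cobb angle measurements in degrees, differences taken as AI minus reader, and the conventional Bland-Altman limits of bias ± 1.96 standard deviations; the abstract does not state which convention the authors used, and the function name `agreement_metrics` is hypothetical.

```python
import math

def agreement_metrics(ai, reader):
    """Agreement between two paired series of angle measurements (degrees).

    Assumptions (not taken from the paper): differences are ai - reader,
    and limits of agreement are bias +/- 1.96 * sample SD of the differences.
    """
    if len(ai) != len(reader) or len(ai) < 2:
        raise ValueError("need two paired series with at least two values")
    n = len(ai)
    diffs = [a - r for a, r in zip(ai, reader)]

    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)

    # Pearson correlation coefficient between the two series.
    mean_a, mean_r = sum(ai) / n, sum(reader) / n
    cov = sum((a - mean_a) * (r - mean_r) for a, r in zip(ai, reader))
    var_a = sum((a - mean_a) ** 2 for a in ai)
    var_r = sum((r - mean_r) ** 2 for r in reader)
    pearson = cov / math.sqrt(var_a * var_r)

    return {"mae": mae, "rmse": rmse, "bias": bias,
            "limits": limits, "pearson": pearson}

# Made-up illustrative values, not data from the study.
result = agreement_metrics([10, 20, 30, 40], [12, 18, 33, 39])
```

MAE and RMSE capture the size of the disagreement, while the bias and limits of agreement show whether one rater reads systematically higher than the other, which a correlation coefficient alone would not reveal.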

5.0 | CV | Apr 30, 2020 | Code

PreCNet: Next-Frame Video Prediction Based on Predictive Coding

Zdenek Straka, Tomas Svoboda, Matej Hoffmann

Predictive coding, currently a highly influential theory in neuroscience, has not yet been widely adopted in machine learning. In this work, we transform the seminal model of Rao and Ballard (1999) into a modern deep learning framework while remaining maximally faithful to the original schema. The resulting network we propose (PreCNet) is tested on a widely used next-frame video prediction benchmark, which consists of images from an urban environment recorded from a car-mounted camera, and achieves state-of-the-art performance. Performance on all measures (MSE, PSNR, SSIM) was further improved when a larger training set (2M images from BDD100k) was used, pointing to the limitations of the KITTI training set. This work demonstrates that an architecture carefully grounded in a neuroscience model, without being explicitly tailored to the task at hand, can exhibit exceptional performance.

2.9 | RO | Oct 11, 2018

Toward safe separation distance monitoring from RGB-D sensors in human-robot interaction

P. Svarny, Z. Straka, M. Hoffmann

The interaction of humans and robots in less constrained environments has gained a lot of attention lately, and the safety of such interaction is of utmost importance. Two ways of risk assessment are prescribed by recent safety standards: (i) power and force limiting and (ii) speed and separation monitoring. Unlike typical solutions in the industry that are restricted to mere safety zone monitoring, we present a framework that realizes separation distance monitoring between a robot and a human operator in a detailed, yet versatile, transparent, and tunable fashion. The separation distance is assessed pair-wise for all keypoints on the robot and the human body and as such can be selectively modified to account for specific conditions. The operation of this framework is illustrated on a Nao humanoid robot interacting with a human partner perceived by a RealSense RGB-D sensor and employing the OpenPose human skeleton estimation algorithm.
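The pair-wise separation check described in this abstract could look roughly like the sketch below. It assumes keypoints are already available as 3D positions in metres in a common frame; the function name `separation_report`, the default threshold, and the per-pair `overrides` mechanism are hypothetical illustrations of "selectively modified" thresholds, not the authors' implementation.

```python
import math

def separation_report(robot, human, default_threshold=0.5, overrides=None):
    """Check every robot-keypoint / human-keypoint pair against a distance limit.

    robot, human: dict mapping keypoint name -> (x, y, z) in metres.
    overrides: optional dict mapping (robot_name, human_name) -> limit,
               a hypothetical way to tune individual pairs.
    Returns (closest_pair, violations).
    """
    overrides = overrides or {}
    closest = None
    violations = []
    for r_name, r_pos in robot.items():
        for h_name, h_pos in human.items():
            distance = math.dist(r_pos, h_pos)  # Euclidean distance
            limit = overrides.get((r_name, h_name), default_threshold)
            if closest is None or distance < closest[2]:
                closest = (r_name, h_name, distance)
            if distance < limit:
                violations.append((r_name, h_name, distance, limit))
    return closest, violations

# Made-up positions for illustration only.
robot = {"hand": (0.0, 0.0, 0.0), "elbow": (0.0, 0.0, 1.0)}
human = {"head": (0.3, 0.4, 0.0), "wrist": (2.0, 0.0, 0.0)}

closest, violations = separation_report(robot, human, default_threshold=0.4)
# A stricter limit for one specific pair, e.g. robot hand near human head.
_, strict = separation_report(robot, human, default_threshold=0.4,
                              overrides={("hand", "head"): 0.6})
```

Keeping the limit per pair rather than per zone is what makes such a check tunable: a larger margin can be required around the head than around the hands without changing the rest of the loop.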

1.5 | NE | Jun 8, 2017 | Code

Where is my forearm? Clustering of body parts from simultaneous tactile and linguistic input using sequential mapping

Karla Stepanova, Matej Hoffmann, Zdenek Straka et al.

Humans and animals are constantly exposed to a continuous stream of sensory information from different modalities. At the same time, they form more compressed representations like concepts or symbols. In species that use language, this process is further structured by this interaction, where a mapping between the sensorimotor concepts and linguistic elements needs to be established. There is evidence that children might be learning language by simply disambiguating potential meanings based on multiple exposures to utterances in different contexts (cross-situational learning). In existing models, the mapping between modalities is usually found in a single step by directly using frequencies of referent and meaning co-occurrences. In this paper, we present an extension of this one-step mapping and introduce a newly proposed sequential mapping algorithm together with a publicly available Matlab implementation. For demonstration, we have chosen a less typical scenario: instead of learning to associate objects with their names, we focus on body representations. A humanoid robot receives tactile stimulation on its body while at the same time listening to utterances of the body part names (e.g., hand, forearm, and torso). With the goal of arriving at the correct "body categories", we demonstrate how a sequential mapping algorithm outperforms one-step mapping. In addition, the effects of data set size and noise in the linguistic input are studied.
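The one-step baseline mentioned in this abstract, which maps modalities directly from co-occurrence frequencies, can be sketched generically as follows. This is an assumed frequency-based baseline for illustration, not the paper's sequential mapping algorithm or its Matlab code; the function name `one_step_mapping` and the toy observations are hypothetical.

```python
from collections import Counter, defaultdict

def one_step_mapping(observations):
    """Assign each tactile cluster the word it co-occurred with most often.

    observations: iterable of (cluster_id, word) pairs, one per exposure.
    Ties are resolved in favour of the word seen first.
    """
    counts = defaultdict(Counter)
    for cluster, word in observations:
        counts[cluster][word] += 1
    return {cluster: words.most_common(1)[0][0]
            for cluster, words in counts.items()}

# Toy exposures with some noise in the linguistic input.
observations = [
    (0, "hand"), (0, "hand"), (0, "forearm"),
    (1, "forearm"), (1, "forearm"), (1, "torso"),
    (2, "torso"),
]
mapping = one_step_mapping(observations)
```

Because each cluster is decided independently from raw counts, this kind of baseline lets two clusters claim the same word when the input is noisy, which is one plausible motivation for resolving assignments sequentially instead.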