IR Mar 12, 2024
The future of document indexing: GPT and Donut revolutionize table of content processing
Degaga Wolde Feyisa, Haylemicheal Berihun, Amanuel Zewdu et al.
Industrial projects rely heavily on lengthy, complex specification documents, making tedious manual extraction of structured information a major bottleneck. This paper introduces an innovative approach to automate this process, leveraging the capabilities of two cutting-edge AI models: Donut, a model that extracts information directly from scanned documents without OCR, and OpenAI GPT-3.5 Turbo, a robust large language model. The proposed method first acquires the tables of contents (ToCs) from construction specification documents and then structures the ToC text into JSON data. Remarkable accuracy is achieved, with Donut reaching 85% and GPT-3.5 Turbo reaching 89% in effectively organizing the ToCs. This landmark achievement represents a significant leap forward in document indexing, demonstrating the immense potential of AI to automate information extraction tasks across diverse document types, boosting efficiency and liberating critical resources in various industries.
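To make the target of the task concrete, the following is a minimal rule-based sketch of turning ToC text into nested JSON. It is not the paper's method (which uses Donut and GPT-3.5 Turbo); it is a plain regular-expression baseline. The line shape ("number, title, dot leaders, page"), the JSON field names, and the sample text are all assumptions made for illustration.

```python
import json
import re

# Assumed line shape (hypothetical): "<section number> <title> .... <page>"
TOC_LINE = re.compile(
    r"^\s*(?P<number>\d+(?:\.\d+)*)\.?\s+(?P<title>.+?)\s*\.{2,}\s*(?P<page>\d+)\s*$"
)


def parse_toc(text):
    """Turn plain ToC text into a nested list of dicts, nested by section depth."""
    root = []
    stack = []  # (depth, node) pairs forming the current chain of ancestors
    for line in text.splitlines():
        match = TOC_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like ToC entries
        number = match.group("number")
        node = {
            "number": number,
            "title": match.group("title"),
            "page": int(match.group("page")),
            "children": [],
        }
        depth = number.count(".") + 1
        # Drop ancestors that are at the same depth or deeper than this entry.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        siblings = stack[-1][1]["children"] if stack else root
        siblings.append(node)
        stack.append((depth, node))
    return root


# Invented sample text, not taken from any real specification document.
SAMPLE = """\
1 General Requirements ..... 3
1.1 Scope of Work ..... 4
1.2 Submittals ..... 6
2 Materials ..... 10
"""

toc = parse_toc(SAMPLE)
print(json.dumps(toc, indent=2))
```

A baseline like this breaks down on scanned pages, inconsistent numbering, or multi-line titles, which is the kind of input where learned models such as those in the paper are meant to help.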
LG Sep 21, 2021
Comparison of single and multitask learning for predicting cognitive decline based on MRI data
Vandad Imani, Mithilesh Prakash, Marzieh Zare et al.
The Alzheimer's Disease Assessment Scale-Cognitive subscale (ADAS-Cog) is a neuropsychological tool that has been designed to assess the severity of cognitive symptoms of dementia. Personalized prediction of the changes in ADAS-Cog scores could help in timing therapeutic interventions in dementia and at-risk populations. In the present work, we compared single and multitask learning approaches to predict the changes in ADAS-Cog scores based on T1-weighted anatomical magnetic resonance imaging (MRI). In contrast to most machine learning-based prediction methods for ADAS-Cog changes, we stratified the subjects based on their baseline diagnoses and evaluated the prediction performances in each group. Our experiments indicated a positive relationship between the predicted and observed ADAS-Cog score changes in each diagnostic group, suggesting that T1-weighted MRI has a predictive value for evaluating cognitive decline in the entire AD continuum. We further studied whether correction of the differences in the magnetic field strength of MRI would improve the ADAS-Cog score prediction. The partial least square-based domain adaptation slightly improved the prediction performance, but the improvement was marginal. In summary, this study demonstrated that ADAS-Cog change could be, to some extent, predicted based on anatomical MRI. Based on this study, the recommended method for learning the predictive models is a single-task regularized linear regression due to its simplicity and good performance. It appears important to combine the training data across all subject groups for the most effective predictive models.
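As a rough illustration of what a single-task regularized linear regression looks like, here is a minimal sketch using an L2 (ridge) penalty solved in closed form. The abstract does not say which regularizer the authors used, so ridge is an assumption, as are the function name `ridge_fit`, the penalty values, and the synthetic data, which merely stand in for MRI-derived features and ADAS-Cog score changes.

```python
import random


def ridge_fit(X, y, alpha):
    """Solve (X^T X + alpha * I) w = X^T y by Gauss-Jordan elimination (no intercept)."""
    d = len(X[0])
    # Augmented system [X^T X + alpha * I | X^T y], one row per feature.
    A = [
        [sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0) for j in range(d)]
        + [sum(row[i] * target for row, target in zip(X, y))]
        for i in range(d)
    ]
    for col in range(d):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(d):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][d] / A[i][i] for i in range(d)]


# Synthetic stand-in data (an assumption): 200 "subjects", 3 "features".
rng = random.Random(0)
true_w = [2.0, -1.0, 0.5]
X = [[rng.gauss(0, 1) for _ in true_w] for _ in range(200)]
y = [sum(w * x for w, x in zip(true_w, row)) + rng.gauss(0, 0.1) for row in X]

w_light = ridge_fit(X, y, alpha=1.0)     # mild penalty: close to the true weights
w_heavy = ridge_fit(X, y, alpha=1000.0)  # strong penalty: weights shrink toward zero
print(w_light)
print(w_heavy)
```

Pooling training data across subject groups, as the abstract recommends, would in this sketch simply mean stacking the rows of all groups into one `X` and `y` before fitting.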
DATA-AN Jul 10, 2017
Complexity of eye fixation duration time series in reading of Persian texts: A multifractal detrended fluctuation analysis
Mohammad Sharifi, Hamed Farahani, Farhad Shahbazi et al.
There is growing evidence that cognitive processes may have fractal structures as a signature of complexity. It is an ongoing topic of research to study the class of complexity and how it may differ as a function of cognitive variables. Here, we explore the eye movement trajectories generated during reading different Persian texts. Features of eye movement trajectories were recorded during reading Persian texts using an eye tracker. We show that fixation durations, as the main components of eye movements reflecting cognitive processing, exhibit multifractal behavior. This indicates that multiple exponents are needed to capture the neural and cognitive processes involved in decoding symbols to derive meaning. We test whether multifractal behavior varies as a function of two different fonts, familiarity of the text for readers, reading silently or aloud, and goal-oriented versus non-goal-oriented reading. We find that, while mean fixation duration is affected by some of these factors, the multifractal pattern in time series of eye fixation durations does not change significantly. Our results suggest that multifractal dynamics may be intrinsic to the reading process.
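For readers unfamiliar with multifractal detrended fluctuation analysis, the sketch below estimates the generalized Hurst exponent h(q) for a few values of q. It is a simplified version under stated assumptions: linear (order-1) detrending, forward non-overlapping segments only, and synthetic white noise standing in for a fixation-duration series. The function names and scale choices are illustrative and are not taken from the paper.

```python
import math
import random


def linear_detrend_variance(segment):
    """Mean squared residual of a least-squares line fitted to the segment."""
    n = len(segment)
    mean_x = (n - 1) / 2
    mean_y = sum(segment) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(segment))
    slope = sxy / sxx
    return sum((v - mean_y - slope * (i - mean_x)) ** 2 for i, v in enumerate(segment)) / n


def fluctuation(series, scale, q):
    """q-th order fluctuation function F_q(scale) of the integrated series."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for v in series:
        total += v - mean
        profile.append(total)  # cumulative sum of deviations from the mean
    variances = [
        linear_detrend_variance(profile[start:start + scale])
        for start in range(0, len(profile) - scale + 1, scale)
    ]
    if q == 0:
        # The q -> 0 limit is a logarithmic average.
        return math.exp(0.5 * sum(math.log(v) for v in variances) / len(variances))
    return (sum(v ** (q / 2) for v in variances) / len(variances)) ** (1 / q)


def generalized_hurst(series, scales, q):
    """Least-squares slope of log F_q(s) against log s."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(fluctuation(series, s, q)) for s in scales]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)


# Synthetic white noise (an assumption), not real eye-tracking data.
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(4096)]
scales = [16, 32, 64, 128, 256]
for q in (-2, 0, 2):
    print(q, generalized_hurst(noise, scales, q))
```

A series is called multifractal when h(q) varies noticeably with q; for uncorrelated noise such as the stand-in above, h(q) should stay near 0.5 for all q, so this input serves only as a sanity check of the procedure.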