CL · Aug 3, 2024
Summarization of Investment Reports Using Pre-trained Model
Hiroki Sakaji, Ryotaro Kobayashi, Kiyoshi Izumi et al.
In this paper, we attempt to summarize monthly reports into investment reports. Fund managers have a wide range of tasks, one of which is the preparation of investment reports. In addition to preparing monthly reports on fund management, fund managers prepare management reports that summarize these monthly reports every six months or once a year. Preparing these reports is labor-intensive and time-consuming. Therefore, we tackle the summarization of monthly reports into investment reports using transformer-based models. There are two main types of summarization methods, extractive and abstractive. This study constructs both and examines which is more useful for summarizing investment reports.
CL · Feb 22, 2024 · Code
Is ChatGPT the Future of Causal Text Mining? A Comprehensive Evaluation and Analysis
Takehiro Takayanagi, Masahiro Suzuki, Ryotaro Kobayashi et al.
Causality is fundamental in human cognition and has drawn attention in diverse research fields. With growing volumes of textual data, discerning causalities within text data is crucial, and causal text mining plays a pivotal role in extracting meaningful patterns. This study conducts comprehensive evaluations of ChatGPT's causal text mining capabilities. Firstly, we introduce a benchmark that extends beyond general English datasets, including domain-specific and non-English datasets. We also provide an evaluation framework to ensure fair comparisons between ChatGPT and previous approaches. Finally, our analysis outlines the limitations and future challenges in employing ChatGPT for causal text mining. Specifically, our analysis reveals that ChatGPT serves as a good starting point for various datasets. However, when equipped with a sufficient amount of training data, previous models still surpass ChatGPT's performance. Additionally, ChatGPT suffers from the tendency to falsely recognize non-causal sequences as causal sequences. These issues become even more pronounced with advanced versions of the model, such as GPT-4. In addition, we highlight the constraints of ChatGPT in handling complex causality types, including both intra/inter-sentential and implicit causality. The model also faces challenges with effectively leveraging in-context learning and domain adaptation. We release our code to support further research and development in this field.
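The tendency to label non-causal sequences as causal corresponds to false positives in a binary detection setting, which lowers precision while leaving recall untouched. A minimal sketch of that relationship follows. The function name `precision_recall` and the toy labels are hypothetical illustrations, not the paper's evaluation framework or data.

```python
from typing import Sequence, Tuple


def precision_recall(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for binary labels (1 = causal, 0 = non-causal).

    Hypothetical helper for illustration only; it is not taken from the
    paper's released code.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    # Convention chosen here: report 0.0 when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (made up): the predictor over-predicts "causal".
gold = [1, 0, 0, 1, 0]
pred = [1, 1, 1, 1, 0]
precision, recall = precision_recall(gold, pred)
print(precision, recall)
```

In this toy case every truly causal item is found, but half of the predicted causal items are wrong, so recall stays high while precision drops.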
CR · Mar 17
Impact of File-Open Hook Points on Backup Ratio in ROFBS on XFS
Kosuke Higuchi, Ryotaro Kobayashi
Ransomware continues encrypting files during the delay between attack onset and detection. ROFBS mitigates this problem by backing up pre-modification files in real time upon file-open events. However, because the Linux file-open path traverses multiple kernel functions, it remains unclear how the choice of hook point affects defense effectiveness. In this study, we kept the ROFBS mechanism fixed and changed only the hook points on the Linux file-open path. We compared may_open, inode_permission, do_dentry_open, security_file_open, and xfs_file_open on AlmaLinux with XFS using three ransomware families: AvosLocker, Conti, and IceFire. We used Backup Ratio as the main metric and also compared the number of encrypted files with backups and the total number of encrypted files. The results showed that hook-point selection substantially affected both recoverability and damage scale. For AvosLocker, security_file_open achieved the highest Backup Ratio (82.5%). For Conti and IceFire, xfs_file_open achieved the highest values (100.0% and 63.2%, respectively). Moreover, xfs_file_open minimized the total number of encrypted files for all three ransomware families. These results indicate that, in ROFBS, the layer at which file-open events are observed is a key design factor. In particular, on XFS, hooking the filesystem-specific callback xfs_file_open may be advantageous when the goal is to reduce overall damage.
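The abstract does not spell out the formula for Backup Ratio. The sketch below assumes it is the percentage of encrypted files for which a pre-modification backup exists, that is, the first of the two reported counts divided by the second. The function name `backup_ratio` and the example counts are hypothetical and do not come from the paper.

```python
def backup_ratio(encrypted_with_backup: int, total_encrypted: int) -> float:
    """Backup Ratio in percent, under the assumed definition:

        100 * (encrypted files that have a backup) / (all encrypted files)

    This definition is an assumption inferred from the abstract.
    """
    if encrypted_with_backup < 0 or total_encrypted < 0:
        raise ValueError("counts must be non-negative")
    if encrypted_with_backup > total_encrypted:
        raise ValueError("backed-up count cannot exceed total encrypted count")
    if total_encrypted == 0:
        # Nothing was encrypted, so the ratio is undefined.
        raise ValueError("ratio is undefined when no files were encrypted")
    return 100.0 * encrypted_with_backup / total_encrypted


# Made-up counts for illustration: 3 of 4 encrypted files had backups.
print(backup_ratio(3, 4))
```

Under this definition a high ratio alone does not imply low damage, since the total number of encrypted files can still be large. That may be why the abstract reports both counts alongside the ratio.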