CLJul 11, 2025
MK2 at PBIG Competition: A Prompt Generation SolutionYuzheng Xu, Tosho Hirasawa, Seiya Kawano et al.
The Patent-Based Idea Generation task asks systems to turn real patents into product ideas viable within three years. We propose MK2, a prompt-centric pipeline: Gemini 2.5 drafts and iteratively edits a prompt, grafting useful fragments from weaker outputs; GPT-4.1 then uses this prompt to create one idea per patent, and an Elo loop judged by Qwen3-8B selects the best prompt-all without extra training data. Across three domains, two evaluator types, and six criteria, MK2 topped the automatic leaderboard and won 25 of 36 tests. Only the materials-chemistry track lagged, indicating the need for deeper domain grounding; yet, the results show that lightweight prompt engineering has already delivered competitive, commercially relevant ideation from patents.
CLMay 23, 2024
Data Augmentation Method Utilizing Template Sentences for Variable Definition ExtractionKotaro Nagayama, Shota Kato, Manabu Kano
The extraction of variable definitions from scientific and technical papers is essential for understanding these documents. However, the characteristics of variable definitions, such as the length and the words that make up the definition, differ among fields, which leads to differences in the performance of existing extraction methods across fields. Although preparing training data specific to each field can improve the performance of the methods, it is costly to create high-quality training data. To address this challenge, this study proposes a new method that generates new definition sentences from template sentences and variable-definition pairs in the training data. The proposed method has been tested on papers about chemical processes, and the results show that the model trained with the definition sentences generated by the proposed method achieved a higher accuracy of 89.6%, surpassing existing models.