Sakumoto gave a poster presentation at the 29th Annual Meeting of the Association for Natural Language Processing.
2023-03-14
- news
- research

Sakumoto, a doctoral student at Nagaoka University of Technology, gave a poster presentation titled "Evaluation of Effectiveness Based on Detailed Dataset Classification in Similar Dataset Discovery Tasks" at the 29th Annual Meeting of the Association for Natural Language Processing (NLP2023).
This research is part of the JSPS KAKENHI project (JP20H02384), "Elucidation of the Dynamics of Data Markets and Institutional Design." It proposes dataset classification and evaluation methods to facilitate cross-disciplinary search and discovery of heterogeneous datasets. Dataset discovery is a growing need in data co-creation and collaboration, and we believe that our approach will help realize the coming data society.
Title: Evaluation of Effectiveness Based on Detailed Dataset Classification in Similar Dataset Discovery Tasks
Authors: Takeshi Sakumoto (Nagaoka Univ. of Technology), Teruaki Hayashi (Univ. of Tokyo), Hiroki Sakaji (Univ. of Tokyo), Hiroshi Nonaka (AIT)
Outline: With the recent development of computer technology, methods for retrieving datasets that are similar to each other have been actively studied. In this study, we applied four detailed classification items, based on criteria actually used in data platforms, to the evaluation of similar dataset discovery methods. The results showed that the method using Dice coefficients between variable labels is useful for finding datasets whose issues and main topics match exactly.
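
As a rough illustration of the Dice coefficient between variable labels mentioned in the outline, the sketch below treats each dataset as a set of variable label strings and scores a pair by 2|A ∩ B| / (|A| + |B|). The exact-string matching, the function names `dice_coefficient` and `rank_similar`, and the sample datasets are assumptions made for this example and are not taken from the paper, whose formulation may differ.

```python
def dice_coefficient(labels_a, labels_b):
    """Dice coefficient between two collections of variable labels.

    Assumption: labels are compared as exact strings after being
    collapsed into sets; the paper may preprocess labels differently.
    """
    a, b = set(labels_a), set(labels_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty label sets
    return 2 * len(a & b) / (len(a) + len(b))


def rank_similar(query_labels, candidates):
    """Rank candidate datasets by Dice similarity to the query labels.

    `candidates` maps a (hypothetical) dataset name to its variable labels.
    """
    scored = [
        (name, dice_coefficient(query_labels, labels))
        for name, labels in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical datasets, for illustration only.
query = ["date", "temperature", "humidity"]
candidates = {
    "weather_station": ["date", "temperature", "rainfall"],
    "store_sales": ["date", "product", "price", "quantity"],
    "population": ["prefecture", "year", "population"],
}

for name, score in rank_similar(query, candidates):
    print(f"{name}: {score:.3f}")
```

Here the query shares two of three labels with `weather_station`, so that dataset ranks first, while `population` shares none and scores zero.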