Research on Relation Extraction Method Based on Active Learning

Lianzhai Duan

doi:10.29040/ijcis.v5i2.166

Abstract

Knowledge in contemporary society is growing explosively, and most of it is contained in unstructured natural language text. Information extraction expresses the semantic knowledge in unstructured text as a set of mentioned entities, the relations between these entities, and the events in which these entities participate. As a key part of information extraction, relation extraction judges the relations between given entities, which gives it both theoretical and practical value for text knowledge understanding. Currently, relation extraction based on supervised learning requires a large number of labeled samples. Randomly selecting samples to label not only wastes data resources but also directly affects the final accuracy of the classification model. At the same time, with the development of data collection and storage technology, it has become very easy to obtain large amounts of unlabeled natural language text. It is therefore of great practical value to design an algorithm that can effectively use unlabeled sample sets for relation extraction. To address these problems, this paper takes active learning as its starting point and implements a variety of sampling algorithms, mainly based on uncertainty, diversity, and representativeness. After verifying that active learning is suitable for relation extraction tasks, the paper fuses multiple sampling criteria to obtain an active learning sample selection strategy that remains effective across multiple data sets and multiple learning models. Experiments show that the proposed multi-criteria fusion sampling strategy is effective and general: compared with multiple single-strategy sampling algorithms, it achieves equivalent or higher classification accuracy on multiple data sets.
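
As an illustration of how uncertainty, representativeness, and diversity criteria might be fused into one selection score, the following is a minimal sketch and not the paper's actual method. The abstract does not give the fusion formula, so the weighted sum, the entropy and cosine-similarity measures, the greedy batch loop, and all names (`select_batch`, `weights`, the toy `features` and `probs`) are assumptions made for this example.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (uncertainty)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_batch(features, probs, batch_size, weights=(1.0, 1.0, 1.0)):
    """Greedily pick pool indices to label using a weighted sum of criteria.

    Hypothetical fusion rule (an assumption, not taken from the paper):
    score = w_unc * uncertainty + w_rep * representativeness + w_div * diversity
    """
    w_unc, w_rep, w_div = weights
    n = len(features)
    # Normalise entropy to [0, 1]; fall back to 1.0 for a single class.
    max_entropy = math.log(len(probs[0])) or 1.0
    uncertainty = [entropy(p) / max_entropy for p in probs]
    # Representativeness: mean similarity to the other unlabeled samples.
    representativeness = [
        sum(cosine(features[i], features[j]) for j in range(n) if j != i)
        / max(n - 1, 1)
        for i in range(n)
    ]
    selected = []
    while len(selected) < min(batch_size, n):
        best_index, best_score = None, float("-inf")
        for i in range(n):
            if i in selected:
                continue
            # Diversity: 1 minus the similarity to the closest selected sample.
            closest = max(
                (cosine(features[i], features[j]) for j in selected),
                default=0.0,
            )
            score = (
                w_unc * uncertainty[i]
                + w_rep * representativeness[i]
                + w_div * (1.0 - closest)
            )
            if score > best_score:
                best_index, best_score = i, score
        selected.append(best_index)
    return selected


# Toy pool: samples 0 and 1 are duplicates, sample 3 is confidently classified.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
probs = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.99, 0.01]]
print(select_batch(features, probs, batch_size=2))
```

In this toy pool the diversity term keeps the batch from containing both duplicates, whereas an uncertainty-only weighting such as `weights=(1.0, 0.0, 0.0)` would rank them equally and select both. In a real active learning loop, `probs` would come from the current relation classifier and `features` from its sentence or entity-pair representation.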
