Thesis Information

Chinese Title: 深度学习在人物关系抽取中的应用研究

Name: 田秀敏

Student ID: 20071212542

Confidentiality Level: Public

Thesis Language: Chinese (chi)

Discipline Code: 0252

Discipline Name: Economics - Applied Statistics*

Student Type: Master's student

Degree: Master of Applied Statistics

University: 西安电子科技大学 (Xidian University)

School: 数学与统计学院 (School of Mathematics and Statistics)

Major: Applied Statistics

Research Direction: Natural Language Processing

First Supervisor: 冶继民

First Supervisor's Institution: 西安电子科技大学 (Xidian University)

Second Supervisor: 蔡云龙

Completion Date: 2023-06-08

Defense Date: 2023-05-30

English Title: Application and Research of Deep Learning in Character Relationship Extraction

Chinese Keywords: 人物关系抽取; 深度学习; 三元组重叠; BERT; 实体嵌套

English Keywords: Character relationship extraction; Deep learning; Triple overlap; BERT; Entity nesting

Chinese Abstract:

With the rapid development of Internet technology and the surge of available data, extracting structured information from unstructured or semi-structured text has become a major focus in natural language processing, and relation extraction is one such technique. Relation extraction has important application value in downstream areas such as recommendation systems, intelligent question answering, knowledge graphs, machine translation, and semantic understanding. In recent years, relation extraction models based on deep learning have achieved notable success, yet many problems remain with entity nesting, triple overlap, and error accumulation, all of which hurt extraction accuracy. Starting from multiple perspectives, this thesis applies three methods to tackle these problems. The main work covers the following aspects:

1. To address error accumulation and triple overlap, this thesis fuses the hidden-layer features of the head entity into the tail entity recognition and relation classification modules, and proposes a character relationship extraction method that fuses head entity features. The method consists of two steps: head entity tagging, then tail entity probability calculation and relation classification. Extensive experiments are carried out on four English datasets (NYT, WebNLG, NYT*, WebNLG*), two Chinese datasets (DuIE2.0, CCRE), the Normal (non-overlapping triples), EPO (Entity Pair Overlap, one entity pair holding multiple relations), and SEO (Single Entity Overlap, one entity related to multiple entities) subsets of NYT* and WebNLG*, and test sets containing 1-5 distinct triples. The results show that the proposed method achieves higher F1 scores than previous methods.

2. To resolve entity nesting in character relationship extraction, this thesis first splits training samples into segments of different lengths to obtain multiple candidate entity spans; next, a span classifier filters out overly long spans and spans whose entity type is None; then, the features from the end position of the head entity to the start position of the tail entity are fused to capture contextual semantics for relation classification; finally, the proposed span-based character relationship extraction method is evaluated on DuIE2.0, CCRE, and other datasets, comparing F1 scores with baseline models and accuracy on individual entities. The results demonstrate the advantages of the proposed method.

3. Combining the strengths of the first two methods, this thesis proposes a table-filling character relationship extraction method aimed at nested entities, triple overlap, and error accumulation. The number of tables constructed equals the number of entity types plus the number of relation types, and both dimensions of each table equal the maximum text length. In the entity tables, the position (entity head, entity tail) is marked; in the relation tables, the positions (head-entity head, tail-entity tail) and (head-entity tail, tail-entity head) are marked. Based on this, the loss function is replaced with a multi-label classification loss, and a decoding scheme is designed that decodes all relation triples in a single step. Experiments on six datasets, on the Normal, EPO, and SEO subsets of NYT* and WebNLG*, and on test sets containing 1-5 distinct triples show that the method's F1 score outperforms other existing methods. The model was finally applied to a smart TV project, where its accuracy improved by about 3 percentage points.

English Abstract:

With the rapid development of Internet technology and the emergence of massive amounts of data, extracting structured information from unstructured or semi-structured text has become a hot topic in natural language processing, and relation extraction is one such technology. Relation extraction has important application value in downstream fields such as recommendation systems, intelligent question answering systems, knowledge graphs, machine translation, and semantic understanding. In recent years, deep-learning-based relation extraction models have achieved remarkable results. However, many problems remain with entity nesting, triple overlap, and error accumulation, which degrade the accuracy of relation extraction models. This thesis applies three methods from multiple perspectives to solve these problems, and the main work includes the following aspects:

1. To address error accumulation and triple overlap, this paper integrates the hidden-layer features of the head entity into the tail entity recognition and relation classification modules, proposing a character relationship extraction method that fuses head entity features. The method runs in two steps: head entity tagging, followed by tail entity probability calculation and relation classification. Extensive experiments were conducted on four English datasets (NYT, WebNLG, NYT*, WebNLG*), two Chinese datasets (DuIE2.0, CCRE), the Normal (non-overlapping triples), EPO (Entity Pair Overlap, where the same entity pair holds multiple relations), and SEO (Single Entity Overlap, where one entity is related to multiple entities) subsets of NYT* and WebNLG*, and test sets containing 1-5 distinct triples. The results indicate that the proposed method outperforms previous methods and achieves higher F1 scores.
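The cascade in this first method can be pictured with a minimal sketch, shown below. It assumes a BERT-style encoder that has already produced token hidden states `H`, uses mean-pooling over the head-entity span as the fusion operation, and invents the module name `HeadFusedTagger`; it illustrates the general idea only and is not the thesis's actual implementation.

```python
# Minimal sketch of a cascaded tagger that fuses head-entity features into
# tail-entity recognition and relation classification (illustrative only).
# Assumption: H holds encoder token states of shape (batch, seq_len, hidden).
import torch
import torch.nn as nn

class HeadFusedTagger(nn.Module):
    def __init__(self, hidden, num_rels):
        super().__init__()
        # Step 1: tag head-entity start/end positions.
        self.head_start = nn.Linear(hidden, 1)
        self.head_end = nn.Linear(hidden, 1)
        # Step 2: for a chosen head entity, tag tail start/end per relation,
        # with the head-entity feature fused into every token state.
        self.tail_start = nn.Linear(hidden, num_rels)
        self.tail_end = nn.Linear(hidden, num_rels)

    def forward(self, H, head_span):
        # Head-entity tagging: one sigmoid score per token position.
        p_head_start = torch.sigmoid(self.head_start(H)).squeeze(-1)
        p_head_end = torch.sigmoid(self.head_end(H)).squeeze(-1)

        # Fuse the head entity's hidden feature (mean over its span)
        # into every token representation before tail/relation tagging.
        s, e = head_span
        head_feat = H[:, s:e + 1].mean(dim=1, keepdim=True)  # (batch, 1, hidden)
        H_fused = H + head_feat

        # Tail tagging doubles as relation classification: one sigmoid
        # score per (token position, relation type).
        p_tail_start = torch.sigmoid(self.tail_start(H_fused))
        p_tail_end = torch.sigmoid(self.tail_end(H_fused))
        return p_head_start, p_head_end, p_tail_start, p_tail_end
```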

2. To address entity nesting in character relationship extraction, this paper first splits training samples into segments of different lengths to obtain multiple candidate entity spans; second, it uses a span classifier to filter out overly long spans and spans whose entity type is None; it then fuses the features spanning from the end position of the head entity to the start position of the tail entity to obtain contextual semantic information for relation classification; finally, the proposed span-based character relationship extraction method was evaluated on datasets such as DuIE2.0 and CCRE, comparing F1 scores against baseline models and accuracy on individual entities. The results show the superiority of the proposed method.
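A minimal sketch of such a span-based pipeline is given below. It assumes token states `H` for a single sentence from a BERT-style encoder, a hypothetical maximum span width, and boundary-token concatenation as the span representation; the class and parameter names are illustrative, not the thesis's code.

```python
# Minimal sketch of a span-based extraction pipeline (illustrative only).
# Assumption: H holds token states of shape (seq_len, hidden), one sentence
# at a time; label sets and MAX_WIDTH are hypothetical.
import torch
import torch.nn as nn

MAX_WIDTH = 8          # spans longer than this are filtered out up front
NONE_ENTITY = 0        # spans classified as None are discarded

class SpanRelationExtractor(nn.Module):
    def __init__(self, hidden, num_entity_types, num_relations):
        super().__init__()
        self.span_clf = nn.Linear(2 * hidden, num_entity_types)
        # The relation classifier sees the head span, the tail span, and the
        # context between the head entity's end and the tail entity's start.
        self.rel_clf = nn.Linear(3 * hidden, num_relations)

    def span_repr(self, H, s, e):
        # Represent a span by its boundary token states.
        return torch.cat([H[s], H[e]], dim=-1)

    def forward(self, H):
        n = H.size(0)
        # 1) Enumerate candidate spans up to MAX_WIDTH and classify them.
        entities = []
        for s in range(n):
            for e in range(s, min(s + MAX_WIDTH, n)):
                etype = self.span_clf(self.span_repr(H, s, e)).argmax(-1).item()
                if etype != NONE_ENTITY:
                    entities.append((s, e, etype))
        # 2) Classify relations for ordered pairs of surviving spans, fusing
        #    the context between the head's end and the tail's start.
        triples = []
        for (hs, he, _) in entities:
            for (ts, te, _) in entities:
                if (hs, he) == (ts, te):
                    continue
                if he < ts:
                    context = H[he:ts + 1].mean(dim=0)
                else:
                    context = H.mean(dim=0)  # fallback when spans touch or overlap
                feats = torch.cat([H[hs], H[ts], context], dim=-1)
                rel = self.rel_clf(feats).argmax(-1).item()
                triples.append(((hs, he), rel, (ts, te)))
        return entities, triples
```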

3. Combining the advantages of the first two methods, this paper proposes a table-filling character relationship extraction method that addresses nested entities, triple overlap, and error accumulation. The number of tables constructed equals the number of entity types plus the number of relation types, and both dimensions of each table equal the maximum text length. In the entity tables, the position (entity head, entity tail) is marked; in the relation tables, the positions (head-entity head, tail-entity tail) and (head-entity tail, tail-entity head) are marked. On this basis, the loss function is changed to a multi-label classification loss, and a decoding scheme is designed that decodes all relational triples in a single step. Experiments were conducted on six datasets, on the Normal, EPO, and SEO subsets of NYT* and WebNLG*, and on test sets containing 1-5 distinct triples; the results show that this method's F1 score is superior to other existing methods. Finally, the model was applied to a smart TV project, where its accuracy improved by about 3 percentage points.
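The table-filling formulation can be sketched as follows, assuming bilinear scoring over token states `H`, a plain binary cross-entropy as a stand-in for the thesis's multi-label classification loss, and hypothetical names throughout. The single-step decoder simply reads the marked cells of all tables at once, which is what removes the error accumulation of pipelined decoding.

```python
# Minimal sketch of table filling for joint extraction (illustrative only).
# Assumptions: H holds token states (batch, seq_len, hidden); num_ent entity
# types and num_rel relation types give num_ent + num_rel score tables, each
# of size seq_len x seq_len.
import torch
import torch.nn as nn

class TableFillingExtractor(nn.Module):
    def __init__(self, hidden, num_ent, num_rel):
        super().__init__()
        self.num_ent, self.num_rel = num_ent, num_rel
        # One bilinear scorer per table (entity tables + relation tables).
        self.bilinear = nn.Parameter(torch.randn(num_ent + num_rel, hidden, hidden) * 0.02)

    def forward(self, H):
        # scores[b, t, i, j]: how strongly table t marks cell (i, j).
        return torch.einsum('bih,thk,bjk->btij', H, self.bilinear, H)

    @staticmethod
    def loss(scores, labels):
        # Every cell of every table is an independent binary label, so the
        # whole problem becomes one multi-label classification over cells
        # (BCE here stands in for the thesis's multi-label loss).
        return nn.functional.binary_cross_entropy_with_logits(scores, labels)

def decode(scores, num_ent, threshold=0.0):
    """Single-step decoding for one sentence: scores has shape
    (num_ent + num_rel, seq_len, seq_len); all triples are read out at once."""
    ent_tab, rel_tab = scores[:num_ent], scores[num_ent:]
    # Entity spans: a marked cell (i, j) in an entity table means span (i, j).
    spans = set()
    for _, i, j in torch.nonzero(ent_tab > threshold).tolist():
        if i <= j:
            spans.add((i, j))
    triples = []
    for (hs, he) in spans:
        for (ts, te) in spans:
            for r in range(rel_tab.size(0)):
                # A triple is emitted when both relation cells are marked:
                # (head-entity head, tail-entity tail) and
                # (head-entity tail, tail-entity head).
                if rel_tab[r, hs, te] > threshold and rel_tab[r, he, ts] > threshold:
                    triples.append(((hs, he), r, (ts, te)))
    return triples
```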

CLC Number: 11

Collection Number: 56333

Open Access Date: 2023-12-13
