A Comparative Study of Cross-Sentence Features for Named Entity Recognition

Resource Type: Conference
Authors: Wang, Sheng-Fu; Huang, Jing; Zhang, Baohua; Li, Jia
Source: 2023 2nd International Conference on Innovations and Development of Information Technologies and Robotics (IDITR), pp. 59-64, May 2023
Subject: Computing and Processing
Power, Energy and Industry Applications
Robotics and Control Systems
Signal Processing and Analysis
Measurement
Technological innovation
Feature extraction
Data mining
Information technology
Standards
Robots
Named entity recognition
Information extraction
Contextual feature
Span classification
Language

Online Access

Full Text (IEEE)

Abstract

Recently, a growing number of Named Entity Recognition (NER) methods use cross-sentence features (also known as contexts), rather than single-sentence information alone, to improve model performance. As far as we know, most NER models exploit the preceding and following sentences to capture cross-sentence features. Current NER studies generally focus only on the model architecture to obtain better token representations, and there has been no in-depth exploration of how to better model cross-sentence features. In this paper, based on the span classification model, we investigate the effect of cross-sentence features under different settings. Specifically, we evaluate the impact of context stitching, context window size, context window padding, and the classifier token of the pre-trained language model (PLM) on model performance. Comparative experimental results show that appropriate incorporation of document-level contexts can considerably improve NER metrics. Furthermore, we find that several practices can improve the performance of NER models: (1) use domain-specific PLMs, but not classifier tokens; (2) use only preceding contexts for generic text, and random contexts for specialized text; (3) truncate overly long contexts when the context window is small, and preserve sentence integrity when the window is large; (4) set the context window size to about 200 for the base-size PLM.
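
Findings (2) to (4) in the abstract can be illustrated with a minimal sketch of how preceding-sentence context might be stitched onto a target sentence. This is not the authors' implementation. The function name `build_context_window`, its parameters, the pre-tokenized input format, and the reading of "about 200" as a token budget are all assumptions made for illustration.

```python
from typing import List


def build_context_window(
    sentences: List[List[str]],
    target_idx: int,
    window_size: int = 200,
    keep_whole_sentences: bool = True,
) -> List[str]:
    """Prepend preceding-sentence tokens to the target sentence.

    Hypothetical helper: `window_size` is assumed to be a token budget that
    covers both the context and the target sentence.
    """
    target = sentences[target_idx]
    budget = window_size - len(target)  # tokens left for context
    context: List[str] = []

    # Walk backwards so the sentences nearest the target are added first.
    for prev in reversed(sentences[:target_idx]):
        if budget <= 0:
            break
        if len(prev) <= budget:
            context = prev + context
            budget -= len(prev)
        elif keep_whole_sentences:
            # Preserve sentence integrity: skip a sentence that does not fit.
            break
        else:
            # Truncate: keep only the tokens closest to the target sentence.
            context = prev[-budget:] + context
            budget = 0

    return context + target


doc = [["a", "b", "c"], ["d", "e"], ["f", "g"]]
print(build_context_window(doc, 2, window_size=5, keep_whole_sentences=True))
# ['d', 'e', 'f', 'g']
print(build_context_window(doc, 2, window_size=5, keep_whole_sentences=False))
# ['c', 'd', 'e', 'f', 'g']
```

The `keep_whole_sentences` flag switches between the two behaviours named in finding (3). How the context is separated from the target sentence, and how spans in the context are excluded from classification, is not described in the abstract and is left out of this sketch.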
