统一框架的混合依存句法分析 / Unified Framework for Hybrid Dependency Parsing
- Resource Type
- Academic Journal
- Authors
- 吴福祥; 周付根; WU Fu-xiang; ZHOU Fu-gen
- Source
- Journal of University of Electronic Science and Technology of China (电子科技大学学报). (1):102-150
- Subject
- 条件随机场 (conditional random fields, CRF); 依存句法 (dependency grammar); 混合句法分析 (hybrid dependency parsing); 最大生成树 (maximum spanning tree, MST)
- Language
- Chinese
- ISSN
- 1001-0548
Mainstream dependency parsers are supervised statistical parsers, and their performance depends heavily on manually annotated data, which is expensive and limited. To make full use of existing annotated treebanks without designing a new parser, this paper proposes a hybrid dependency parsing pipeline. The pipeline takes as its base framework a parser built on the maximum spanning tree (MST) algorithm and linear-chain conditional random fields (CRF). It trains on several treebanks together: the dependency skeletons produced by the baseline parsers trained on each treebank are combined, cross information is extracted from them with a set of hybrid feature templates, and a composite dependency parser is built on top of the base framework. Experimental results show that the pipeline improves the parsing accuracy of a single-treebank parser without requiring a new parser design.
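The abstract does not give the hybrid feature templates themselves, so the following is only a minimal sketch of what "cross information" drawn from baseline dependency skeletons could look like. It assumes each baseline parser outputs a head array per sentence; the function names `cross_features` and `arc_feature_table`, the feature strings, and the treebank labels are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

ROOT = 0  # index of the artificial root token (assumption of this sketch)


def cross_features(head: int, dep: int,
                   skeletons: Dict[str, List[int]]) -> List[str]:
    """Hypothetical feature template for one candidate arc head -> dep.

    `skeletons` maps a treebank name to the head array predicted by the
    baseline parser trained on that treebank. heads[i] is the predicted
    head of token i; tokens are 1-based and entry 0 is a placeholder.
    """
    feats = []
    votes = 0
    for name in sorted(skeletons):
        agrees = skeletons[name][dep] == head
        votes += int(agrees)
        # One indicator feature per baseline parser.
        feats.append(f"{name}:arc={'yes' if agrees else 'no'}")
    # One feature counting how many baseline skeletons contain the arc.
    feats.append(f"votes={votes}")
    return feats


def arc_feature_table(n_tokens: int,
                      skeletons: Dict[str, List[int]]
                      ) -> Dict[Tuple[int, int], List[str]]:
    """Cross features for every candidate arc in a sentence.

    An arc-factored decoder such as MST would score each candidate arc
    from features like these, alongside its ordinary features.
    """
    table = {}
    for dep in range(1, n_tokens + 1):
        for head in range(ROOT, n_tokens + 1):
            if head != dep:
                table[(head, dep)] = cross_features(head, dep, skeletons)
    return table


# Two hypothetical baseline skeletons for a three-token sentence.
example = {"A": [-1, 2, 0, 2], "B": [-1, 2, 0, 1]}
table = arc_feature_table(3, example)
```

In this sketch the composite parser would see, for each candidate arc, whether each baseline parser proposed it and how many of them agree. How the paper actually encodes and weights such information is not stated in the abstract.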