Design codes for electrical lighting are numerous and complex, which makes them difficult for designers to query and leads to divergent interpretations of the same provision. To address these problems, this paper proposes using information extraction to build a knowledge graph for the domain and incorporating expert experience into it. In the preprocessing stage, two parameters, mutual information and boundary entropy, are introduced to improve word segmentation so that technical terms are not split. Data triples are extracted by combining semantic role labeling with dependency parsing, which compensates for the inability of semantic role labeling alone to extract multiple objects. The results are stored in the graph database Neo4j, completing the construction of the domain knowledge graph.
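As an illustration of how mutual information and boundary entropy can flag a technical term that a general-purpose segmenter has split, the following minimal sketch scores adjacent token pairs in a toy corpus. It is not the paper's implementation: the corpus, the function names (`pair_statistics`, `score_pair`, `should_merge`), the use of pointwise mutual information with the minimum of left and right context entropy, and the thresholds `pmi_min` and `entropy_min` are all assumptions made for this example.

```python
import math
from collections import Counter

def pair_statistics(sentences):
    """Count tokens, adjacent pairs, and the left/right neighbours of each pair."""
    unigrams, bigrams = Counter(), Counter()
    left_ctx, right_ctx = {}, {}
    for tokens in sentences:
        unigrams.update(tokens)
        for i in range(len(tokens) - 1):
            pair = (tokens[i], tokens[i + 1])
            bigrams[pair] += 1
            # Sentence boundaries are treated as ordinary context symbols.
            left = tokens[i - 1] if i > 0 else "<s>"
            right = tokens[i + 2] if i + 2 < len(tokens) else "</s>"
            left_ctx.setdefault(pair, Counter())[left] += 1
            right_ctx.setdefault(pair, Counter())[right] += 1
    return unigrams, bigrams, left_ctx, right_ctx

def entropy(counter):
    """Shannon entropy (in bits) of the distribution given by the counts."""
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total) for c in counter.values())

def score_pair(pair, stats):
    """Return (pointwise mutual information, boundary entropy) for a pair."""
    unigrams, bigrams, left_ctx, right_ctx = stats
    p_xy = bigrams[pair] / sum(bigrams.values())
    p_x = unigrams[pair[0]] / sum(unigrams.values())
    p_y = unigrams[pair[1]] / sum(unigrams.values())
    pmi = math.log2(p_xy / (p_x * p_y))
    # A real term tends to appear in varied contexts on both sides.
    boundary = min(entropy(left_ctx[pair]), entropy(right_ctx[pair]))
    return pmi, boundary

def should_merge(pair, stats, pmi_min=3.0, entropy_min=1.0):
    """Hypothetical rule: merge when both scores reach illustrative thresholds."""
    pmi, boundary = score_pair(pair, stats)
    return pmi >= pmi_min and boundary >= entropy_min

# Toy, over-segmented corpus (assumed data): "应急" + "照明" should be one term.
corpus = [
    ["疏散", "通道", "应", "设置", "应急", "照明"],
    ["应急", "照明", "的", "照度", "不", "应", "低于", "规定", "值"],
    ["消防", "控制室", "的", "应急", "照明", "应", "保持", "正常", "照度"],
]
stats = pair_statistics(corpus)

for candidate in [("应急", "照明"), ("的", "应急")]:
    pmi, boundary = score_pair(candidate, stats)
    print(candidate, round(pmi, 2), round(boundary, 2), should_merge(candidate, stats))
```

In this sketch, a pair whose parts co-occur far more often than chance (high mutual information) and whose neighbours vary on both sides (high boundary entropy) is treated as a single term and would be added to the segmenter's user dictionary. Suitable thresholds would have to be tuned on a real corpus of design codes.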