A Study on the Usage Characteristics of Noun Phrase Structures Using Analysis of Variance: Focusing on a Korean Quasi-Spoken Corpus

Resource Type: Academic Journal
Authors: 노성화 (Lu, Xing-hua); 박기화 (Piao, Qi-hua)
Source: 우리말글. 2024-03 100:1-22
Subject: Korean
Noun
Phrase Structure
Quasi-Spoken Corpus
One/Two-way ANOVA
Language: Korean
ISSN: 1229-9200
2713-5586

Abstract

In Korean information processing, the word segment (eojeol) is the basic unit of text analysis, so studying how phrase structures are used is important. This study analyzes the usage characteristics of phrase structures, focusing on noun phrase structures, using statistical and linguistic methods on a large-scale Korean quasi-spoken corpus. Statistically, a two-way ANOVA first tested whether noun type and phrase structure type affect the frequency with which phrase structures occur. A one-way ANOVA then showed that the frequencies of the individual phrase structures appearing in noun word segments differ significantly. Linguistically, the study examines the causes of these differences in detail. The results are expected to serve as useful reference material for natural language processing tasks such as morphological analysis and named entity recognition.
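
As a rough illustration of the one-way ANOVA step mentioned in the abstract, the sketch below computes the F statistic for several groups of frequency counts. It is a generic textbook calculation, not the authors' procedure. The function name `one_way_anova_f`, the structure labels, and the toy counts are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy frequencies per phrase-structure type (NOT data from the paper).
toy_counts = {
    "noun + particle": [12, 15, 14],
    "noun only": [7, 9, 8],
    "noun + suffix + particle": [3, 4, 2],
}

f_stat, df_b, df_w = one_way_anova_f(list(toy_counts.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the critical value of the F distribution with the reported degrees of freedom suggests that at least one group mean differs. Obtaining a p-value would need an F-distribution function from a statistics library, which this sketch leaves out.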
