Extraction of Archetype from Near Duplicates in Software Documentation

Resource Type: Conference
Authors: Koznov, Dmitry; Luciv, Dmitry; Chernishev, George; Grigoryev, Dmitry
Source: 2019 Actual Problems of Systems and Software Engineering (APSSE), pp. 126-130, Nov. 2019
Subject: Computing and Processing; Documentation; Software algorithms; Task analysis; Tools; Complexity theory; Software engineering; Visualization; Software Documentation; JavaDoc; Duplicates; Near Duplicates

Abstract

Software documentation contains a large amount of duplicate text, much of which consists of near duplicates: repetitions of the same text with slight differences. They emerge from numerous copy-pastes that have been slightly modified. Uncontrolled near duplicates significantly complicate documentation maintenance. There are some research papers on the detection and management of duplicates in software documentation, but only the Duplicate Finder approach addresses the problem of near duplicates. However, Duplicate Finder's search algorithms do not extract the archetype (common text) of detected groups of near duplicates (a set of near duplicates belongs to one group if its members have much in common). The archetype of a group can be used to visualize the common text and the differences between duplicates for manual analysis, as well as for documentation reuse. In this paper, we present an algorithm for archetype extraction and the results of experiments on the documentation of several well-known open-source Java projects: JUnit, Mockito, and SLF4J.
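To make the notion of an archetype concrete, the following is a minimal sketch, not the algorithm presented in the paper (the abstract does not describe it). It assumes that an archetype can be approximated by the word sequences shared by every member of a group, and it uses Python's standard `difflib.SequenceMatcher` as a stand-in for the matching step. The function names `common_tokens` and `extract_archetype` and the sample comments are hypothetical.

```python
from difflib import SequenceMatcher
from functools import reduce


def common_tokens(a, b):
    """Return the tokens of a that fall inside matching blocks shared with b."""
    # autojunk=False disables difflib's popularity heuristic for long inputs.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    shared = []
    for block in matcher.get_matching_blocks():
        # Each block marks a run of equal tokens: a[block.a : block.a + block.size].
        shared.extend(a[block.a:block.a + block.size])
    return shared


def extract_archetype(duplicates):
    """Approximate the common text of a group (hypothetical helper).

    Folds common_tokens over the group, so the result keeps only tokens
    that survive a pairwise match against every member in turn.
    """
    token_lists = [text.split() for text in duplicates]
    return reduce(common_tokens, token_lists)


# Invented sample group, loosely in the style of JavaDoc comments.
group = [
    "Returns the name of the test case",
    "Returns the name of the test suite",
    "Returns the full name of the test case",
]
archetype = extract_archetype(group)
print(" ".join(archetype))
```

Folding a pairwise matcher is order-dependent and greedy, so it can miss common text that a true multi-sequence alignment would find; it is meant only to illustrate the idea of separating shared text from variation points.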
