Where should I comment my code? A dataset and model for predicting locations that need comments
- Resource Type
- Conference
- Authors
- Louis, Annie; Dash, Santanu Kumar; Barr, Earl T.; Ernst, Michael D.; Sutton, Charles
- Source
- 2020 IEEE/ACM 42nd International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER), pp. 21-24, Oct 2020
- Subject
- Computing and Processing
Computational modeling
Neural networks
Machine learning
Predictive models
Software
Natural language processing
Software engineering
NLP
comments
- Language
Programmers should write code comments, but not on every line of code. We have created a machine learning model that suggests locations where a programmer should write a code comment. We trained it on existing commented code to learn locations that are chosen by developers. Once trained, the model can predict locations in new code. Our models achieved precision of 74% and recall of 13% in identifying comment-worthy locations. This first success opens the door to future work, both in the new where-to-comment problem and in guiding comment generation. Our code and data are available at http://groups.inf.ed.ac.uk/cup/comment-locator/.
CCS CONCEPTS • Software and its engineering → Maintaining software; • Computing methodologies → Neural networks; Natural language processing.
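To make the reported precision and recall figures concrete, the following minimal sketch shows how the two metrics are conventionally computed for predicted comment locations. It is not the authors' evaluation code: the function name `precision_recall` and the line numbers are hypothetical, invented for illustration only, and they are not taken from the paper's dataset.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two collections of line numbers.

    predicted: locations where a model suggests adding a comment.
    actual:    locations where developers actually wrote a comment.
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Precision: fraction of suggested locations that developers also commented.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of developer-commented locations that were suggested.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical line numbers (assumption, for illustration only).
developer_commented = {3, 10, 17, 25, 31, 40, 52, 60, 71, 88}
model_suggested = {10, 25, 40, 41}

precision, recall = precision_recall(model_suggested, developer_commented)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A high-precision, low-recall result like the one in the abstract means that most suggested locations match places developers commented, while many developer-commented locations go unsuggested.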