Title:
|
Identifying Word Relations in Software: A Comparative Study of Semantic Similarity Tools
|
Authors:
|
Giriprasad Sridhara, Emily Hill, Lori Pollock, and K Vijay-Shanker
|
Abstract:
|
Modern software systems are typically large and complex, making comprehension of these systems extremely difficult. Experienced programmers comprehend code by seamlessly processing synonyms and other word relations. Thus, we believe that automated comprehension and software tools can be significantly improved by leveraging word relations in software. In this paper, we perform a comparative study of six state-of-the-art, English-based semantic similarity techniques and evaluate their effectiveness on words from the comments and identifiers in software. Our results suggest that applying English-based semantic similarity techniques to software without any customization could be detrimental to the performance of the client software tools. We propose strategies to customize the existing semantic similarity techniques to software, and describe how various program comprehension tools can benefit from word relation information.
|
Publisher:
|
IEEE
|
Book Title:
|
16th IEEE International Conference on Program Comprehension
|
Date:
|
June 2008
|
Project:
|
Natural Language Program Analysis
|
Document Type:
|
Conference Proceedings
|
Files:
|
[presentation slides: Adobe PDF] (973 KB)
[preprint: Adobe PDF] (186 KB)
|
Bibtex Entry:
|
@inproceedings{123456789/188,
  author = {Giriprasad Sridhara and Emily Hill and Lori Pollock and K Vijay-Shanker},
  title = {Identifying Word Relations in Software: A Comparative Study of Semantic Similarity Tools},
  booktitle = {16th IEEE International Conference on Program Comprehension},
  publisher = {IEEE},
  month = {June},
  year = {2008}
}
|