Dec 11 2017 cs.SE
In this paper, we first collect and track a large number of fixed and unfixed violations across revisions of software. We find that a small number of violation types account for the majority of recurring violations, and that these violations are fixed with similar code changes. To automatically identify patterns in violations and their fixes, we propose an approach that uses convolutional neural networks and clustering. We then evaluate the usefulness of the identified fix patterns by applying them to unfixed violations. The results show that developers accepted and merged 69 of 116 fixes generated from the fix patterns. From the study, we observe that the recurrence of fixed violations may help prioritize violations, that fix patterns can be identified from existing fixed violations, and that these patterns can resolve similar violations existing in the wild.