Software systems typically consist of many lines of source code organized in files that are hierarchically structured into directories and packages. Since the code is the key data in software development, many scenarios require an overview of it, in particular of similar code passages. In this paper, we investigate the visual analysis of source code similarities for local as well as global code passages. To this end, we first compute the occurrence frequencies (support metric) and relative occurrence frequencies (confidence metric) of all subsequences in local as well as global code regions. The resulting textual data, annotated with its occurrence values, is displayed in a triangular matrix. Our visualization tool integrates several interaction techniques, which we illustrate in a case study on similarities in source code written in Assembler consisting of 10,641 characters.
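The support and confidence computation can be illustrated with a minimal sketch. It rests on assumptions that the text above does not fix: subsequences are taken to be contiguous character runs, overlapping occurrences are counted, confidence is read as support divided by the number of same-length windows in the region, and cell (i, j) of the triangular matrix is assumed to hold the support of the subsequence spanning positions i to j. The function names `subsequence_metrics` and `triangular_matrix` are hypothetical and are not taken from the tool.

```python
from collections import Counter


def subsequence_metrics(text, max_len=None):
    """Map each contiguous subsequence of `text` to (support, confidence).

    Assumptions (not from the paper): support is the number of possibly
    overlapping occurrences; confidence is support divided by the number
    of windows of the same length in `text`.
    """
    n = len(text)
    if max_len is None:
        max_len = n
    # Count every window of every length from 1 up to max_len.
    counts = Counter(
        text[i:i + k]
        for k in range(1, min(max_len, n) + 1)
        for i in range(n - k + 1)
    )
    return {
        sub: (cnt, cnt / (n - len(sub) + 1))
        for sub, cnt in counts.items()
    }


def triangular_matrix(text):
    """Upper-triangular matrix of supports.

    Assumed layout: cell (i, j) with i <= j holds the support of
    text[i:j + 1]; cells below the diagonal are None.
    """
    metrics = subsequence_metrics(text)
    n = len(text)
    return [
        [metrics[text[i:j + 1]][0] if j >= i else None for j in range(n)]
        for i in range(n)
    ]
```

Under these assumptions, `subsequence_metrics("abab")["ab"]` yields a support of 2 and a confidence of 2/3, since `"ab"` fills two of the three length-2 windows. A local region can be analyzed by passing a slice of the text, and the global case by passing the whole text. The sketch enumerates a quadratic number of subsequences, so for an input of the size used in the case study a `max_len` bound or a more compact index structure would likely be needed.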
Source code is becoming larger and is hierarchically structured. Visualizing this additional hierarchical organization might therefore be important. The algorithmic computation of a hierarchical clustering based on code similarities might also be of interest.