Introduction
Software projects and their underlying architectures often evolve without documentation, which makes it necessary to recover the architecture from the software's implementation-level artifacts. Various software remodularization techniques exist, but they often suffer from inaccuracies, and evaluating their effectiveness is challenging because accurate “ground-truth” architectures, or reference models, are lacking. In this paper, we propose Automated Construction of Reference Model (ACRM), an approach that automatically constructs reference models for software projects from the metadata of all software versions and historical maintenance records. We evaluate ACRM both quantitatively and qualitatively. The experimental results show that the generated reference models are reasonable, as confirmed by the relationship between the generated reference models and architecture-level bugs or code smells. A qualitative study involving industrial developers and students further validates the generated reference models: in the survey, on average, 89% of the participants agree with the reference models generated by ACRM. Moreover, we propose an improved metric, wc2c, which analyzes the strengths and weaknesses of different types of software clustering techniques against the proposed reference models of the analyzed software. Finally, we discuss the potential benefits of using ACRM in the analyzed projects, particularly in terms of improving software quality, reducing maintenance costs, and enhancing developer productivity.
Studied Subjects
ID | Project | # Versions | # Major Versions | # Stars | KLOC (Avg) | # Classes (Avg) | # Commits |
---|---|---|---|---|---|---|---|
1 | Activemq | 64 | 2 | 1,764 | 324.9 | 3,057 | 11,309 |
2 | Activemq-artemis | 32 | 2 | 602 | 518.3 | 3,324 | 9,680 |
3 | Aeron | 86 | 2 | 5,065 | 51.1 | 330 | 15,842 |
4 | Alluxio | 62 | 3 | 4,613 | 248.0 | 916 | 30,937 |
5 | Apktool | 34 | 2 | 10,220 | 16.6 | 179 | 1,648 |
6 | Assertj-core | 50 | 3 | 1,756 | 109.9 | 2,600 | 2,870 |
7 | Atmosphere | 204 | 3 | 3,430 | 40.6 | 259 | 5,931 |
8 | Atomix | 95 | 3 | 1,901 | 55.6 | 619 | 4,265 |
9 | AxonFramework | 99 | 4 | 2,020 | 93.0 | 724 | 5,951 |
10 | Beam | 83 | 2 | 3,998 | 389.6 | 1,063 | 27,132 |
11 | Bisq | 86 | 2 | 3,102 | 111.1 | 892 | 11,168 |
12 | Byte-buddy | 202 | 2 | 3,485 | 117.0 | 581 | 5,200 |
13 | Calcite | 52 | 2 | 1,894 | 211.5 | 869 | 4,175 |
14 | Camel | 154 | 3 | 3,242 | 680.0 | 7,981 | 45,096 |
15 | Cas | 218 | 4 | 7,620 | 91.1 | 1,219 | 16,869 |
16 | Cassandra | 241 | 4 | 5,950 | 189.2 | 775 | 25,297 |
17 | Conversations | 215 | 3 | 3,541 | 54.6 | 150 | 6,274 |
18 | Cxf | 153 | 2 | 642 | 527.7 | 4,618 | 15,722 |
19 | Dbeaver | 108 | 4 | 13,652 | 286.0 | 2,233 | 16,052 |
20 | Debezium | 73 | 2 | 3,265 | 75.5 | 363 | 3,125 |
21 | Discovery | 76 | 3 | 2,954 | 17.4 | 289 | 2,403 |
22 | Dropwizard | 147 | 3 | 7,657 | 44.0 | 509 | 5,430 |
23 | Eclim | 76 | 2 | 1,026 | 33.2 | 326 | 4,849 |
24 | Flink | 101 | 2 | 13,149 | 698.3 | 4,037 | 22,170 |
25 | Fresco | 40 | 2 | 16,207 | 89.2 | 547 | 2,531 |
26 | Grakn | 45 | 2 | 2,107 | 76.6 | 570 | 4,291 |
27 | Guacamole-client | 33 | 2 | 1,004 | 19.5 | 281 | 5,378 |
28 | Hadoop | 293 | 4 | 10,489 | 972.6 | 1,784 | 23,874 |
29 | Hawtio | 137 | 2 | 1,138 | 63.3 | 199 | 8,803 |
30 | Hive | 40 | 2 | 3,174 | 850.3 | 2,345 | 14,501 |
31 | Java-tron | 51 | 3 | 2,380 | 80.2 | 849 | 14,129 |
32 | Karaf | 82 | 3 | 480 | 80.0 | 655 | 8,197 |
33 | Maxwell | 170 | 2 | 2,141 | 68.8 | 123 | 3,110 |
34 | Nifi | 88 | 2 | 2,066 | 60.1 | 693 | 5,286 |
35 | Okhttp | 95 | 4 | 37,252 | 50.3 | 167 | 4,645 |
36 | Openapi-generator | 53 | 3 | 5,446 | 374.2 | 542 | 14,218 |
37 | Orientdb | 157 | 3 | 4,154 | 368.1 | 2,329 | 19,352 |
38 | Pdfbox | 52 | 2 | 1,162 | 134.7 | 939 | 8,962 |
39 | Pmd | 70 | 2 | 2,887 | 184.3 | 1,415 | 16,532 |
40 | Powermock | 42 | 2 | 3,121 | 36.8 | 590 | 1,607 |
41 | Redisson | 163 | 3 | 13,242 | 74.7 | 486 | 5,675 |
42 | Rest-assured | 56 | 3 | 4,748 | 20.0 | 180 | 1,959 |
43 | Speedment | 67 | 2 | 1,832 | 95.3 | 1,537 | 4,674 |
44 | Spotbugs | 41 | 2 | 1,894 | 227.6 | 1,891 | 16,206 |
45 | Spring-framework | 175 | 3 | 37,411 | 502.5 | 3,773 | 20,896 |
46 | Spring-security | 143 | 4 | 4,843 | 145.0 | 1,231 | 8,732 |
47 | Storm | 33 | 2 | 6,078 | 160.0 | 920 | 10,316 |
48 | Testcontainers-java | 73 | 2 | 3,805 | 8.3 | 175 | 2,008 |
49 | Tika | 56 | 2 | 1,002 | 82.0 | 526 | 4,747 |
50 | Traccar | 31 | 2 | 2,392 | 25.9 | 415 | 6,227 |