This thesis investigates the effectiveness of a metric-based approach compared to machine-learning methods in detecting design patterns in software code. Design patterns provide reusable solutions to common programming problems, which helps improve software quality and maintainability. While machine-learning methods have gained popularity in design pattern detection, they often present challenges due to their reliance on large labelled datasets and extensive training. To address these limitations, I propose a metric-based approach that avoids the need for extensive training by extracting metrics from programs using scripts and evaluating patterns based on predetermined thresholds. By conducting a comparative analysis of the metric-based and machine-learning approaches on eighteen open-source projects in both C++ and Java, this study assesses the accuracy and practicality of each method. The findings demonstrate that the metric-based approach achieves comparable or better results in detecting design patterns without relying on AI. This research contributes to simplifying design pattern analysis and opens up possibilities for practical implementation in software development.
The primary objective of this study is to conduct a comprehensive comparative analysis between the metric-based approach and machine-learning methods using eighteen open-source projects in both C++ and Java. By evaluating the accuracy and practicality of each method, this research aims to determine the effectiveness of the metric-based approach in detecting design patterns without relying on AI. The findings from this analysis reveal that the metric-based approach achieves comparable or even superior results in pattern detection.
The research significantly contributes to the field by offering a practical alternative to the prevalent machine-learning methods, simplifying design pattern analysis, and enhancing software quality. The study emphasizes the potential of this approach to streamline design pattern analysis and improve software maintainability. By overcoming the challenges associated with extensive training and large datasets, the metric-based approach presents opportunities for practical implementation in software development.
In summary, this thesis explores the effectiveness of the metric-based approach in detecting design patterns, providing valuable insights into its accuracy and applicability. The comparative analysis conducted on diverse open-source projects establishes the viability of the metric-based approach as a practical solution, contributing to the advancement of software analysis and quality assurance techniques.
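As a rough illustration of what evaluating patterns against predetermined thresholds can look like, the sketch below checks per-class metrics against a rule for a single pattern. The metric names, the threshold values, and the Singleton rule are hypothetical placeholders chosen for this example; they are not the metrics or thresholds used in the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassMetrics:
    """Hypothetical per-class metrics; not the thesis's actual metric set."""
    name: str
    private_constructors: int
    public_constructors: int
    static_self_fields: int   # static fields whose type is the class itself
    static_self_getters: int  # static methods returning the class itself


# Hypothetical thresholds: (metric name, minimum, maximum or None for no limit).
SINGLETON_RULE = [
    ("private_constructors", 1, None),
    ("public_constructors", 0, 0),
    ("static_self_fields", 1, None),
    ("static_self_getters", 1, None),
]


def matches(metrics, rule):
    """Return True if every metric falls inside its [minimum, maximum] range."""
    for metric, minimum, maximum in rule:
        value = getattr(metrics, metric)
        if value < minimum:
            return False
        if maximum is not None and value > maximum:
            return False
    return True


def detect(classes, rule):
    """Return the names of all classes whose metrics satisfy the rule."""
    return [c.name for c in classes if matches(c, rule)]


if __name__ == "__main__":
    sample = [
        ClassMetrics("Registry", 1, 0, 1, 1),
        ClassMetrics("Parser", 0, 2, 0, 0),
    ]
    print(detect(sample, SINGLETON_RULE))
```

Because a rule is plain data, no training step is involved: adjusting detection only means editing the threshold table.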
- Programming Languages: Java and Python
- Java AST: JavaParser
- C++ AST: LLVM Clang
For this thesis research, a total of 18 repositories were analyzed. These repositories were selected from two different sources: PMART and open-source C++ repositories on GitHub. The repositories were selected due to their clearly labelled design patterns and broad usage in the field. Below is a list of the repositories analyzed:
By analyzing repositories from both PMART and GitHub, we were able to achieve in-depth design pattern detection results in Java and C++.
To perform the Java extraction, ensure that you have Java and Maven installed on your machine. If you haven't installed them yet, follow the instructions provided by their respective documentation.
For the C++ extraction, please follow these steps:
- Ensure that Python is installed on your machine. If you don't have Python installed, you can download and install it from the official Python website.
- Once Python is installed, open your preferred command-line interface.
- Install the Clang Python bindings, which are used to parse the C++ code, by executing the following command:
pip install clang
With the necessary installations completed, you are now ready to proceed with using the project.
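Before running the scripts, it may help to confirm that the tools listed above can be found. The following is a small sketch and not part of the project: it looks for the java and mvn executables on the PATH and checks whether the clang Python package is importable. The function name check_environment is made up for this example. Note that the Clang bindings generally also need a libclang shared library to load at runtime, which this check does not test for.

```python
import importlib.util
import shutil


def check_environment(executables=("java", "mvn"), modules=("clang",)):
    """Report which required executables and Python modules can be found.

    Returns a dict mapping each name to True (found) or False (missing).
    """
    status = {}
    for exe in executables:
        # shutil.which returns the full path, or None if not on PATH.
        status[exe] = shutil.which(exe) is not None
    for module in modules:
        # find_spec returns None for a missing top-level module.
        status[module] = importlib.util.find_spec(module) is not None
    return status


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```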
To run the Java extraction:
- Clone the repository containing the extraction scripts to your local machine and open it in your preferred Integrated Development Environment (IDE).
- Navigate to the appropriate source folder depending on the extraction method you wish to use:
For the JavaParser approach,
cd Scripts/Java/javaparser/src/main/java/nils/dunlop/thesis
For the JavaSymbolSolver approach,
cd Scripts/Java/javasymbolsolver/src/main/java/nils/dunlop/thesis
- Open the MainRunner.java file within the chosen source folder.
- Update the source folder path in the MainRunner.java file to specify the location of the code you want to analyze and save the changes.
- Compile the MainRunner.java file by running the following command:
javac MainRunner.java
To run the C++ extraction:
- Clone the repository and open it in your preferred Integrated Development Environment (IDE).
- Navigate to the Scripts/C++ directory within the cloned repository.
cd Scripts/C++
- Open the main.py file, update the file paths to point to the C++ code you want to analyze, and save the file.
- Run the following command to start the C++ metric extraction process:
python main.py
I extend my sincere gratitude to my supervisor, Jennifer Horkoff, for their guidance and support throughout this research project. I would also like to thank Gothenburg University and Chalmers AI Research Centre for providing the necessary resources and facilities for conducting this research.