GROUP ASSIGNMENT
How to submit: Two submissions to BBLearn:
- Design of the checklist and choice of the projects
- Domain analysis results
Deadline:
- March 10: results of design/choice of projects
- March 28: final results
This is a two-step assignment.
You are requested to
- create a checklist to assess how well-structured projects are for new users and/or new contributors.
- choose the subject projects
Our focus is on analyzing projects on GitHub.
To do so, you must analyze at least the following resources:
- https://opensource.guide/
- https://www.igor.pro.br/publica/papers/IEEESoft_2018.pdf
- https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007296
Your checklist needs to be targeted at assessing the documentation and project structure. You don't need to investigate the code structure, architecture, or other technical aspects at this point.
Your final checklist should have no more than 10 items. Each item needs to be justified based on the references provided and on the goal of your analysis. There are no right or wrong items; whether an item is reasonable depends on your justification.
This checklist will be used in the next step of this assignment.
You are asked to choose 30 projects that will be analyzed using the checklist your group is proposing. The selection criteria need to be objective and defined BEFORE selecting the projects.
Some potential "variables" that you can use to choose the projects are:
- Programming language: you may want to understand projects written in a specific language or projects that span a diverse set of languages
- Size of the project: you may analyze the size of the codebase, the number of commits, the number of different contributors, etc.
- Maturity of the project: age of the project
- Popularity: measured in terms of "stars" on GitHub, or watchers, for example
- Activity: number of issues or pull requests in the last X months
You DON'T need to use all of them. What you have to do is create your selection strategy and justify why you think the criteria you are using are appropriate. One example of such criteria:
We chose 20 projects written in Java, ranked by number of stars with less than 5 years of contributions
We chose Java because the language is BLA BLA BLA and XYZ (compelling reasons, backed by evidence). We decided to study the most popular projects because we wanted to check whether popularity is related to a good structure for attracting new people. Still, we focused on new projects to check whether newer projects are following the guidelines proposed in the recent literature.
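The example strategy above could be applied mechanically along the following lines. This is only a sketch under assumptions: the records in `SAMPLE` are made up, the helper `select_projects` is a hypothetical name, and the field names (`language`, `stargazers_count`, `created_at`) are assumed to match the repository JSON returned by GitHub. How you obtain the records is up to your group.

```python
from datetime import datetime, timezone

# Hypothetical records for illustration; field names follow GitHub's repository JSON.
SAMPLE = [
    {"full_name": "org/alpha", "language": "Java", "stargazers_count": 900,
     "created_at": "2021-03-01T00:00:00Z"},
    {"full_name": "org/beta", "language": "Java", "stargazers_count": 5000,
     "created_at": "2012-06-15T00:00:00Z"},
    {"full_name": "org/gamma", "language": "Python", "stargazers_count": 7000,
     "created_at": "2022-01-10T00:00:00Z"},
    {"full_name": "org/delta", "language": "Java", "stargazers_count": 1500,
     "created_at": "2020-09-30T00:00:00Z"},
]


def select_projects(repos, language, max_age_years, top_n, today):
    """Keep repos written in `language` and younger than `max_age_years`,
    then return the `top_n` with the most stars."""

    def age_years(repo):
        # Timestamps end in "Z"; rewrite it so fromisoformat accepts it
        # on older Python versions as well.
        created = datetime.fromisoformat(repo["created_at"].replace("Z", "+00:00"))
        return (today - created).days / 365.25

    candidates = [
        r for r in repos
        if r["language"] == language and age_years(r) < max_age_years
    ]
    candidates.sort(key=lambda r: r["stargazers_count"], reverse=True)
    return candidates[:top_n]


if __name__ == "__main__":
    # Fix the reference date so the selection is reproducible in the report.
    today = datetime(2024, 3, 1, tzinfo=timezone.utc)
    for repo in select_projects(SAMPLE, "Java", 5, 20, today):
        print(repo["full_name"], repo["stargazers_count"])
```

Writing the criteria down as parameters (language, maximum age, sample size, reference date) before looking at any results is one way to show that the criteria were defined BEFORE the projects were selected.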
Make sure that you correctly curate your sample. Some hints:
- Manually analyze the outcomes to check if they are all software projects
- Check for archived or unused projects
- Verify that the projects use pull requests and issues on GitHub, since this is important to what you are doing.
- Anything else that would help create a valid sample
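Some of the hints above can be pre-screened automatically before the manual pass. The sketch below is an assumption-laden illustration, not a required method: `curation_flags` is a hypothetical helper, the one-year idle threshold is arbitrary, and the fields `archived`, `has_issues`, and `pushed_at` are assumed to come from GitHub's repository JSON. Whether a repository is really a software project, and whether it actually uses pull requests, still needs manual inspection.

```python
from datetime import datetime, timedelta, timezone


def curation_flags(repo, today, max_idle_days=365):
    """Return a list of reasons why `repo` may not belong in the sample.
    An empty list means no automatic red flag was found."""
    flags = []
    if repo.get("archived"):
        flags.append("archived")
    if not repo.get("has_issues"):
        flags.append("issues disabled")
    pushed = datetime.fromisoformat(repo["pushed_at"].replace("Z", "+00:00"))
    if today - pushed > timedelta(days=max_idle_days):
        flags.append("no recent pushes")
    return flags


if __name__ == "__main__":
    today = datetime(2024, 3, 1, tzinfo=timezone.utc)
    # Hypothetical record for illustration only.
    stale = {"archived": True, "has_issues": False,
             "pushed_at": "2020-01-01T00:00:00Z"}
    print(curation_flags(stale, today))
```

Reporting how many candidates were dropped for each flag makes the curation step easier to evaluate.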
Using your checklist, analyze the selected projects and write a report.
The report needs to have:
- All the details about the method followed, including everything from Step 1
- Both quantitative and qualitative results (for the latter: which details are missing, how projects do things differently, etc.)
- Insights about (i) interesting patterns you found in your sample, and (ii) projects that present a completely different way of doing things (outliers).
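For the quantitative part, a simple tally per checklist item is often a useful starting point. The following is a minimal sketch under assumptions: the checklist item names and project names are placeholders, and `item_pass_rates` and `low_scoring_projects` are hypothetical helpers. Low-scoring projects are only candidates for closer qualitative reading, not outliers by definition.

```python
def item_pass_rates(results):
    """results maps project -> {checklist item -> bool}.
    Returns the fraction of projects satisfying each item."""
    totals = {}
    for checks in results.values():
        for item, passed in checks.items():
            ok, n = totals.get(item, (0, 0))
            totals[item] = (ok + int(passed), n + 1)
    return {item: ok / n for item, (ok, n) in totals.items()}


def low_scoring_projects(results, threshold):
    """Projects satisfying fewer than `threshold` checklist items."""
    return sorted(
        project for project, checks in results.items()
        if sum(checks.values()) < threshold
    )


if __name__ == "__main__":
    # Placeholder data; replace with your own checklist and sample.
    results = {
        "project-a": {"README": True, "CONTRIBUTING": True},
        "project-b": {"README": True, "CONTRIBUTING": False},
    }
    print(item_pass_rates(results))
    print(low_scoring_projects(results, threshold=2))
```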
The report may read like a scientific paper or a magazine article.
You will be evaluated in terms of:
[ ] how clear the design decisions are
[ ] how convincing (based on evidence) your choices have been for creating the checklist
[ ] how good was the strategy created to select the sample
[ ] how robust was the evaluation (curation) of the projects before the analysis
[ ] how careful was the analysis of the projects
[ ] the readability of your report (no long sentences, use linked paragraphs, use paragraphs, structure in sections, etc.)
[ ] the creativity of the data analysis
[ ] the conclusions of your paper, given your "informed opinion"