Domain Analysis

GROUP ASSIGNMENT

How to submit: Two submissions to BBLearn:

  1. Design of the checklist and choice of the projects
  2. Domain analysis results

Deadline:

  1. March 10: results of design/choice of projects
  2. March 28: final results

This is a two-step assignment.

Step 1:

You are requested to

  1. create a checklist to assess how well-structured projects are for new users and/or new contributors.
  2. choose the subject projects

Our focus is on analyzing projects on GitHub.

Create the checklist:

To do so, you must analyze at least the following resources:

Your checklist needs to be targeted at assessing the documentation and project structure. You don't need to investigate the code structure, architecture, or other technical aspects at this point.

Your final checklist should have no more than 10 items. Each item needs to be justified based on the references provided and on the goal of your analysis. There are no right or wrong items; what makes an item reasonable is your justification.

This checklist will be used in the next step of this assignment.

Choose the projects:

You are asked to choose 30 projects that will be analyzed using the checklist your group is proposing. The selection criteria need to be objective and defined BEFORE you select the projects.

Some potential "variables" that you can use to choose the projects are:

  • Programming language: you may want to understand projects written in a specific language or projects that cover a diverse set of languages
  • Size of the project: you may analyze the size of the codebase, the number of commits, the number of different contributors, etc.
  • Maturity of the project: age of the project
  • Popularity: measured in terms of "stars" on GitHub, or watchers, for example
  • Activity: number of issues or pull requests in the last X months

You DON'T need to use all of them. What you have to do is create your selection strategy and justify why you think the criteria you are using are appropriate. One example of such criteria:

We chose 20 projects written in Java with less than 5 years of contributions, ranked by number of stars.

We chose Java, because the language is BLA BLA BLA and XYZ (compelling reasons, backed by evidence). We decided to understand the most popular projects, because we wanted to check whether popularity is related to a good structure for attracting new people. Still, we focused on new projects to check if the newer projects are following the guidelines proposed in the recent literature.
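A selection strategy like the example above can be written down as a small, repeatable filter before any project is looked at. The following is only a minimal sketch: the project records and field names are made up for illustration, "less than 5 years of contributions" is interpreted as "created within roughly the last 5 years", and the age cutoff uses an approximate year length.

```python
from datetime import date, timedelta

# Hypothetical project records; the names, fields, and values are
# assumptions for illustration, not real GitHub data.
PROJECTS = [
    {"name": "alpha", "language": "Java", "stars": 900, "created": date(2021, 3, 1)},
    {"name": "beta", "language": "Java", "stars": 1500, "created": date(2012, 6, 15)},
    {"name": "gamma", "language": "Python", "stars": 5000, "created": date(2022, 1, 10)},
    {"name": "delta", "language": "Java", "stars": 300, "created": date(2020, 9, 30)},
]


def select_projects(projects, language, max_age_years, sample_size, today):
    """Apply criteria fixed in advance: language, age limit, then rank by stars."""
    # Approximate cutoff date (365.25 days per year); an assumption of this sketch.
    cutoff = today - timedelta(days=round(365.25 * max_age_years))
    eligible = [
        p for p in projects
        if p["language"] == language and p["created"] >= cutoff
    ]
    # Most-starred projects first, then keep at most sample_size of them.
    ranked = sorted(eligible, key=lambda p: p["stars"], reverse=True)
    return ranked[:sample_size]


selected = select_projects(PROJECTS, "Java", 5, 20, today=date(2024, 3, 10))
selected_names = [p["name"] for p in selected]
print(selected_names)
```

Keeping the criteria as explicit parameters makes it easy to report them in the method section and to show that they were fixed before the sample was drawn.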

Make sure that you correctly curate your sample. Some hints:

  • Manually analyze the outcomes to check if they are all software projects
  • Check for archived or unused projects
  • Verify if the projects use pull requests and issues on GitHub, since this is important to what you are doing.
  • Anything else that would help create a valid sample
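The curation hints above can be recorded as explicit exclusion rules, so that every dropped project has a documented reason. This is a sketch under stated assumptions: the candidate records are invented, and each flag (`is_software`, `archived`, `uses_issues`, `uses_pull_requests`) is a hypothetical field that your group would fill in after inspecting the project manually.

```python
# Hypothetical candidate records; every field name below is an assumption
# chosen for illustration and filled in by manual inspection.
CANDIDATES = [
    {"name": "alpha", "is_software": True, "archived": False,
     "uses_issues": True, "uses_pull_requests": True},
    {"name": "awesome-list", "is_software": False, "archived": False,
     "uses_issues": True, "uses_pull_requests": True},
    {"name": "old-tool", "is_software": True, "archived": True,
     "uses_issues": True, "uses_pull_requests": False},
]


def exclusion_reasons(project):
    """Return the list of reasons a project fails curation (empty if it passes)."""
    reasons = []
    if not project["is_software"]:
        reasons.append("not a software project")
    if project["archived"]:
        reasons.append("archived or unused")
    if not (project["uses_issues"] and project["uses_pull_requests"]):
        reasons.append("does not use issues and pull requests on GitHub")
    return reasons


def curate(candidates):
    """Split candidates into kept names and a name -> reasons map for dropped ones."""
    kept, dropped = [], {}
    for project in candidates:
        reasons = exclusion_reasons(project)
        if reasons:
            dropped[project["name"]] = reasons
        else:
            kept.append(project["name"])
    return kept, dropped


kept, dropped = curate(CANDIDATES)
print("kept:", kept)
for name, reasons in dropped.items():
    print(f"dropped {name}: {', '.join(reasons)}")
```

The `dropped` map can go straight into the report as evidence of how the sample was curated.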

Step 2:

Based on the selected projects and your checklist, you need to analyze the projects and write a report.

The report needs to have:

  1. All the details about the method followed, including everything from Step 1
  2. Results in both a quantitative and a qualitative sense (checking which details are missing, how projects do things differently, etc.)
  3. Interesting insights about (i) patterns you found in your sample, and (ii) projects that do things in a completely different way (outliers).
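For the quantitative part, one simple starting point is to tabulate the checklist answers per project and summarize them per item and per project. The sketch below assumes a yes/no answer for each item; the project names and the three checklist items are placeholders, not a suggested checklist.

```python
# Hypothetical checklist results: project -> {checklist item -> satisfied?}.
# The item names are placeholders, not a recommended checklist.
RESULTS = {
    "alpha": {"README": True, "CONTRIBUTING": True, "CODE_OF_CONDUCT": True},
    "beta": {"README": True, "CONTRIBUTING": True, "CODE_OF_CONDUCT": False},
    "gamma": {"README": True, "CONTRIBUTING": False, "CODE_OF_CONDUCT": False},
    "delta": {"README": False, "CONTRIBUTING": False, "CODE_OF_CONDUCT": False},
}


def item_coverage(results):
    """Fraction of projects that satisfy each checklist item."""
    totals = {}
    for answers in results.values():
        for item, satisfied in answers.items():
            totals[item] = totals.get(item, 0) + int(satisfied)
    return {item: count / len(results) for item, count in totals.items()}


def project_scores(results):
    """Number of checklist items each project satisfies."""
    return {name: sum(answers.values()) for name, answers in results.items()}


coverage = item_coverage(RESULTS)
scores = project_scores(RESULTS)
print(coverage)
print(scores)
```

Projects with unusually low or high scores are natural candidates for the qualitative look at outliers, but the numbers alone do not explain why a project differs.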

The report may read like a scientific paper or a magazine article.

Evaluation (subjective rubric)

You will be evaluated in terms of:

  [ ] how clear the design decisions are
  [ ] how convincing (based on evidence) your choices have been for creating the checklist
  [ ] how good was the strategy created to select the sample
  [ ] how robust was the evaluation (curation) of the projects before the analysis
  [ ] how careful was the analysis of the projects
  [ ] the readability of your report (no long sentences, use linked paragraphs, use paragraphs, structure in sections, etc.)
  [ ] the creativity of the data analysis
  [ ] the conclusions of your paper, given your "informed opinion"