---
title: "5 - Project Clones"
output: html_notebook
---
```{r setup, echo=FALSE, results='hide'}
Sys.setenv(R_NOTEBOOK_HOME = getwd())
source("config.R")
source("helpers.R")
```
First, the input data for `clone_finder` must be exported:
```{r}
exportCloneFinderData(DATASET_NAME, DATASET_PATH)
```
Once the export has finished, run `clone_finder NUM_THREADS FOLDER THRESHOLD`:
```{bash}
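# Run from the notebook's working directory (set in the setup chunk).
cd "$R_NOTEBOOK_HOME"
# Arguments: NUM_THREADS=1, FOLDER=datasets/js, THRESHOLD=2
tools/clone_finder/build/clone_finder 1 datasets/js 2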
```
Here, `NUM_THREADS` is the number of threads the clone finder may use, `FOLDER` is the dataset folder, and `THRESHOLD` is the minimum number of tokens (inclusive) a file must have to be considered. To run the rest of this notebook without modifications, set the number of threads to `1`.
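If you change the arguments, it can help to check them before starting a long run. The following is a minimal sketch, not part of the repository: `clone_finder_cmd` is a hypothetical helper that validates the three arguments described above and prints the resulting command line instead of executing it. The validation rules (a positive integer for `NUM_THREADS`, a non-negative integer for `THRESHOLD`) are assumptions based on the description, not documented behavior of `clone_finder`.

```bash
# Hypothetical helper (not part of the repository): checks the arguments and
# prints the clone_finder command line without running it.
clone_finder_cmd() {
    # $1 = NUM_THREADS, $2 = FOLDER, $3 = THRESHOLD
    case "$1" in
        # Assumption: the thread count must be a positive integer.
        ''|*[!0-9]*|0) echo "NUM_THREADS must be a positive integer" >&2; return 1 ;;
    esac
    case "$3" in
        # Assumption: the token threshold must be a non-negative integer.
        ''|*[!0-9]*) echo "THRESHOLD must be a non-negative integer" >&2; return 1 ;;
    esac
    echo "tools/clone_finder/build/clone_finder $1 $2 $3"
}

# Prints: tools/clone_finder/build/clone_finder 1 datasets/js 2
clone_finder_cmd 1 datasets/js 2
```

The fence is deliberately a plain `bash` block rather than a `{bash}` chunk, so it is shown in the rendered notebook but not executed.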
Then, load the data into the database:
```{r}
importCloneFinderData(DATASET_NAME, DATASET_PATH, 1)
```
The last argument is the number of threads `clone_finder` used. If you ran `clone_finder` with a different thread count, adjust this value to match before running the chunk.
## Next Steps
[Metadata Acquisition](6-metadata.nb.html) in file [`6-metadata.Rmd`](6-metadata.Rmd).