
Environment Preparation

Regardless of the level of reproducibility you aim for, you need to prepare your environment with the following common steps:

  1. Clone this repository:
    git clone https://github.com/sola-st/thinking-like-a-developer.git
  2. Make sure Python 3 (we used Python 3.8) is installed by running:
    python --version
    If it is not installed, install it with:
    sudo apt install python3.8
  3. From the main folder, install the dependencies of this analysis by running:
    pip install -r reqs.txt
    Optional: we encourage you to use a virtual environment if possible, e.g., conda or venv (see the sketch after this list).
  4. Download the data from https://doi.org/10.6084/m9.figshare.14462052 and place the unzipped content in the data folder of this repo.
  5. From the main folder of the repo, launch a Jupyter notebook instance:
    jupyter notebook
  6. From the browser page, navigate to the notebooks folder, which contains the notebooks you will need in the following steps.
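
For example, steps 1 to 3 can be carried out as follows (a minimal sketch, assuming Python 3.8 is available as python3.8 and that you choose venv; adapt as needed):

    # clone the repository and enter it
    git clone https://github.com/sola-st/thinking-like-a-developer.git
    cd thinking-like-a-developer
    # create and activate an isolated environment (optional, Linux/macOS)
    python3.8 -m venv venv
    source venv/bin/activate
    # install the dependencies of the analysis
    pip install -r reqs.txt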

1. Comparative Study: Neural Models versus Human Attention

If you completed the common steps in the previous section, your environment is ready and the installation needed to reproduce the comparative study is complete. Go back to the README.md to reproduce the analysis.

2. Empirical Study: Human Reasoning Recorder Deployment

Before starting, double-check that you completed the common steps in the Environment Preparation section above.

The setup to collect human attention relies on three components:

  • the web interface of the Human Reasoning Recorder (HRR), written in NodeJS, which shows the 20 method naming tasks to the participants. This interface is hosted on a free application instance from Heroku.
  • the database containing the collected data and the information on which experiment set has been delivered to which user (hosted on a free instance of MongoDB Cloud).
  • (OPTIONAL) an Amazon Mechanical Turk requester account. Here we recruit participants and validate those who send us a valid submission. Note that this is the only part that requires spending money, so we invest only in real data, not in infrastructure maintenance, since we use free resources.

To deploy this tool for a software engineering experiment you need:

  • a Heroku account (FREE)
  • a MongoDB account (FREE)

Heroku will host our application, whereas MongoDB will host our NoSQL database to store the logs of the participants.

Install the Heroku CLI

Follow these steps:

  1. Create a Heroku account on heroku.com.
  2. Download and install the Heroku CLI (see the official Heroku documentation for download instructions).
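
As an illustration, on recent Ubuntu systems the CLI can usually be installed via snap; this is only one possible route, so check the official Heroku documentation for the currently recommended method:

    # install the Heroku CLI and verify that it is on the PATH
    sudo snap install --classic heroku
    heroku --version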

Go to the folder where you cloned the project and open a terminal there:

  1. Create an SSH key (see https://devcenter.heroku.com/articles/keys):
    ssh-keygen -t rsa

  2. Add the key to your Heroku account. From the project folder, run:
    heroku keys:add

  3. Create the link between this git repository and the Heroku remote. From the project folder, run:
    heroku create <app_name_you_want_it_to_be>

  4. Add a new heroku remote. Go to the folder where you cloned this project and run:
    git remote add heroku https://git.heroku.com/<app_name_you_want_it_to_be>.git
    

Create a MongoDB Database

  1. Create an account on cloud.mongodb.com.
  2. Create a new project via the website. Let's name it MyProject.
  3. Open the project and create a cluster. Choose the FREE option with 512 MB of storage space.
  4. Go to the Security section > Database Access and add a new database user.
  5. Choose a username and password, then add the user.
  6. Go to the Security section > Network Access, add an IP address, and allow access from everywhere (since we do not know precisely where Heroku will deploy our app), then confirm.
  7. Go to the Data Storage section > Clusters and click Connect on your MyProject cluster.
  8. Select "Connect your application".
  9. Copy the URL endpoint (the connection string).
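
The copied connection string typically follows the pattern below; the placeholders are illustrative, and the exact host and query parameters depend on your cluster:

    mongodb+srv://<your_username>:<your_password>@<your_cluster>.mongodb.net/<your_database>?retryWrites=true&w=majority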

Connect Heroku and MongoDB

Our NodeJS web interface, hosted on Heroku, needs the credentials we just created to store new documents (participant logs) in the database:

  1. Create a copy of the configuration file config/template.settings.yaml and rename it to config/settings.yaml.
  2. Insert the username and password from MongoDB, replacing <your_username> and <your_password>.
  3. Insert the URL endpoint, replacing <your_endpoint>.
  4. Save and commit your changes locally.
  5. Push them to Heroku. From the project folder, run:
git push heroku master

Troubleshooting: Sometimes you might need to force the push:

git push heroku master --force
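
Taken together, steps 4 and 5 correspond to a short git sequence along these lines (the commit message is only an example):

    # commit the filled-in configuration locally and deploy the app to Heroku
    git add config/settings.yaml
    git commit -m "Configure MongoDB credentials"
    git push heroku master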

Install MongoExport CLI Tool

We need mongoexport to efficiently download data from MongoDB. Follow the official documentation to install it: https://docs.mongodb.com/database-tools/mongoexport/
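
As an illustration, a typical export call looks like the one below; the database, collection, and output file names are hypothetical placeholders, not values taken from this project:

    mongoexport --uri="mongodb+srv://<your_username>:<your_password>@<your_cluster>.mongodb.net/<your_database>" \
                --collection=<your_collection> --out=participant_logs.json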

Publish experiments

If you used the default configuration, the JSON files in the folder data/datasets/methods_showed_to_original_participants have already been deployed to the Heroku NodeJS app. However, the information on whether each experiment dataset has already been consumed by a participant is stored in the MongoDB database. In this step, we add records to the MongoDB database to let it know that Heroku has new experiment sets to show to participants.

Open and run the following notebook top to bottom: notebooks/Uplaod_Available_Experiment_Sets.ipynb.

Open the interface

If everything was successful, you can access your Human Reasoning Recorder interface directly from any computer! To get the URL, retrieve the name of the application with:

heroku apps
=== <YOUR_EMAIL> Apps
<YOUR_APP_NAME>

Then go to <YOUR_APP_NAME>.herokuapp.com to see your Human Reasoning Recorder. Congratulations!

Interface Walkthrough

A demo of the environment is available here (the tool starts on demand, so it might take up to 30 seconds to start if you are the first user of the day):

Try the Human Reasoning Recorder NOW!

1. Intro Page

Startup intro page:

screenshot startup page

2. Guidelines Before the Task

Before asking anything, we inform the participant about the task:

screenshot guidelines

3. Main Task: Method Naming

3.1 Interface to explore the method body and choose one of the alternatives:

screenshot alternatives

3.2 Deblurring based code exploration:

screenshot deblur

3.3 Click on tokens to make them always visible:

screenshot click

4. Question on Perceived Difficulty

We ask "How difficult was the previous question?" right after every method inspection.

screenshot question on perceived difficulty

What's in it for ME as a Researcher?

Remember that the platform is fully customizable: the task can be changed according to your needs. It takes only a simple JSON file as input in which you (see the sketch after this list):

  • ask your code-related question
  • propose your custom list of options
  • provide a code snippet (and the tokenization that you want)
  • provide a custom "after task" question with a grading scale
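
As a sketch, such an input file could look like the example below; all field names here are invented for illustration, so refer to the JSON files in data/datasets/methods_showed_to_original_participants for the actual format:

    {
      "question": "Which name best describes this method?",
      "options": ["sortList", "reverseList", "mergeLists", "copyList"],
      "code_tokens": ["public", "void", "f", "(", "List", "l", ")", "{", "...", "}"],
      "after_task_question": "How difficult was the previous question?",
      "after_task_scale": ["very easy", "easy", "medium", "difficult", "very difficult"]
    }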

The platform records the answers of the user together with all timestamped events (e.g., mouse over a token, clicked tokens, mouse over an alternative), letting you perform your own analysis and answer your research questions.

Now that you have had a look at the interface, why not try the experience yourself? Try the Human Reasoning Recorder NOW!