Dockerized RStudio and MongoDB integration to play around with.
This project aims to automatically set up a dockerized multi-application data science platform with RStudio and MongoDB.
- installed Docker Engine and Docker Compose
- installed Python and pip (only needed to load the sample data into MongoDB)
- installed pandas and pymongo packages (only needed to load the sample data into MongoDB)
$ git clone https://github.com/wipatrick/docker-rstudio-mongodb.git
$ cd docker-rstudio-mongodb/
$ docker-compose up -d
By default, RStudio is initialized with the following credentials, as specified in docker-compose.yml:
- username: 'testuser'
- password: 'testpassword'
Check if the instances of RStudio and MongoDB are running correctly.
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2399140168c6 dockerrstudiomongodb_rstudio "/usr/bin/supervisord" 13 minutes ago Up 13 minutes 0.0.0.0:80->8787/tcp rstudio
c14ac842ab18 dockerrstudiomongodb_mongodb "/entrypoint.sh mongo" 13 minutes ago Up 13 minutes 0.0.0.0:27017->27017/tcp mongodb
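Besides `docker ps`, you can confirm from any machine that the two published ports actually accept connections. The following is a minimal sketch, not part of this repository: the function name `port_is_open` is hypothetical, and the ports 80 (RStudio) and 27017 (MongoDB) are taken from the `docker ps` output above. Replace the host with the IP address of your Docker host.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_platform(docker_host):
    """Report reachability of the ports published in the docker ps output."""
    services = {"rstudio": 80, "mongodb": 27017}
    return {name: port_is_open(docker_host, port) for name, port in services.items()}
```

Calling `check_platform("<DockerHost-IP>")` with your real host address returns a dictionary with one boolean per service. A `False` entry only means the port did not answer, so the container may still be starting.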
Before loading the sample data into the dockerized MongoDB, edit data2mongo.py to match your setup (change the IP address to that of your Docker host), then run it.
$ python data2mongo.py
{'obs': 1.0, 'satv': 600.0, 'hse': 10.0, 'gpa': 3.3199999999999998, 'hsm': 10.0, 'hss': 10.0, 'sex': 1.0, 'satm': 670.0}
{'obs': 2.0, 'satv': 640.0, 'hse': 5.0, 'gpa': 2.2599999999999998, 'hsm': 6.0, 'hss': 8.0, 'sex': 1.0, 'satm': 700.0}
{'obs': 3.0, 'satv': 530.0, 'hse': 8.0, 'gpa': 2.3500000000000001, 'hsm': 8.0, 'hss': 6.0, 'sex': 1.0, 'satm': 640.0}
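The contents of data2mongo.py are not reproduced here. As a rough illustration of what such a loader does, the sketch below parses CSV text into one dictionary per row and inserts the result into MongoDB. It is an assumption-laden stand-in, not the actual script: it uses the standard-library `csv` module instead of pandas, the function names are hypothetical, and the database name `db` and collection name `test` are taken from the R session in this README.

```python
import csv
import io


def rows_to_documents(csv_text):
    """Parse CSV text into a list of dicts with float values, one per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{key: float(value) for key, value in row.items()} for row in reader]


def load_into_mongo(documents, host, port=27017, db_name="db", collection_name="test"):
    """Insert the documents into MongoDB and return how many were written."""
    # Imported here so the parsing step above works without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(host, port)
    try:
        result = client[db_name][collection_name].insert_many(documents)
        return len(result.inserted_ids)
    finally:
        client.close()
```

With the first sample row, `rows_to_documents("obs,gpa,satv\n1,3.32,600\n")` yields a single document with float values, which `load_into_mongo` would then write to the host you pass in.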
Open http://<DockerHost-IP> in your browser and log in with the credentials given above. Once logged in, you can connect to MongoDB with the pre-installed RMongo package for R as follows.
> library(RMongo)
Loading required package: rJava
> mongo <- mongoDbConnect("db", "mongodb")
> print(dbShowCollections(mongo))
[1] "system.indexes" "test"
> query <- dbGetQuery(mongo, "test", "{'satv': {'$lt': 500}}")
> summary(query)
      hse              obs             sex              X_id               hss              gpa
 Min.   : 3.000   Min.   :  7.0   Min.   :1.000   Length:113         Min.   : 4.000   Min.   :0.120
 1st Qu.: 7.000   1st Qu.: 74.0   1st Qu.:1.000   Class :character   1st Qu.: 7.000   1st Qu.:2.140
 Median : 8.000   Median :125.0   Median :1.000   Mode  :character   Median : 8.000   Median :2.620
 Mean   : 7.735   Mean   :124.8   Mean   :1.407                      Mean   : 7.628   Mean   :2.564
 3rd Qu.: 9.000   3rd Qu.:185.0   3rd Qu.:2.000                      3rd Qu.: 9.000   3rd Qu.:3.070
 Max.   :10.000   Max.   :224.0   Max.   :2.000                      Max.   :10.000   Max.   :4.000
      satv             satm            hsm
 Min.   :285.0   Min.   :300.0   Min.   : 2.000
 1st Qu.:400.0   1st Qu.:505.0   1st Qu.: 7.000
 Median :440.0   Median :570.0   Median : 8.000
 Mean   :432.9   Mean   :566.7   Mean   : 8.027
 3rd Qu.:470.0   3rd Qu.:630.0   3rd Qu.: 9.000
 Max.   :490.0   Max.   :740.0   Max.   :10.000
As you can see, you can connect to the MongoDB container from RStudio simply by using the name defined for it in docker-compose.yml as the host name, thanks to Docker's container links feature.
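The same filter can also be issued from outside the containers with pymongo, which is already listed among the prerequisites. This is only a sketch under assumptions: the helper names are hypothetical, and from the Docker host you would connect through the published port 27017 using the host's IP address, since the name `mongodb` is only expected to resolve inside the linked containers.

```python
def satv_below(threshold):
    """Build the filter used in the R session above: satv < threshold."""
    return {"satv": {"$lt": threshold}}


def fetch_matching(collection, threshold=500):
    """Return every document in the collection whose satv is below threshold."""
    return list(collection.find(satv_below(threshold)))


def fetch_from_host(docker_host, threshold=500):
    """Connect through the published port and run the query on db.test."""
    # Imported here so the helpers above can be used without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(docker_host, 27017)
    try:
        return fetch_matching(client["db"]["test"], threshold)
    finally:
        client.close()
```

Keeping the filter in its own function makes it easy to compare against the JSON string passed to `dbGetQuery` in R: both describe the same `$lt` condition on `satv`.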
Credit goes to rocker-org and to the docker-library team working on MongoDB for their groundwork.