# Container Provisioner
Container Provisioner is a tool written in [Go](https://go.dev/) that provides a UI for users to interact with the scraper. It uses the [Docker API](https://docs.docker.com/engine/api/) to provision containers and run the scraper. The UI is written in raw HTML and JavaScript, while the backend web framework is [Fiber](https://docs.gofiber.io/).

The scraped reviews are uploaded to [Cloudflare R2 Buckets](https://www.cloudflare.com/lp/pg-r2/) for storage. R2 is S3-compatible; therefore, technically, one can also use AWS S3 to store the scraped reviews.

## Pull the latest scraper Docker image
```bash
docker pull ghcr.io/algo7/tripadvisor-review-scraper/scraper:latest
```

## Credentials Configuration
### R2 Bucket Credentials
You will need to create a folder called `credentials` in the `container_provisioner` directory of the project. The `credentials` folder will contain the credentials for the R2 bucket. The credentials file should be named `creds.json` and should be in the following format:
```json
{
    "bucketName": "<R2_Bucket_Name>",
    "accountId": "<Cloudflare_Account_Id>",
    "accessKeyId": "<R2_Bucket_AccessKey_ID>",
    "accessKeySecret": "<R2_Bucket_AccessKey_Secret>"
}
```
### R2 Bucket URL
You will also have to set the `R2_URL` environment variable in the `docker-compose.yml` file to the URL of the R2 bucket. The URL should end with a `/`.

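For example, the relevant part of the compose file might look like the following sketch. The service name is illustrative; only the `R2_URL` variable and its trailing slash come from the instructions above:

```yaml
services:
  container_provisioner:   # illustrative service name
    environment:
      # URL of the R2 bucket; note the trailing slash
      - R2_URL=https://<Your_R2_Bucket_URL>/
```
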
## Run the container provisioner
The `docker-compose.yml` for the provisioner is located in the `container_provisioner` folder.

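A compose file for a tool like this typically mounts the Docker socket (so the provisioner can drive the Docker API) and the `credentials` folder. The sketch below is a hypothetical illustration under those assumptions; refer to the actual file in the `container_provisioner` folder for the authoritative version:

```yaml
# Hypothetical sketch, not the repository's actual docker-compose.yml.
services:
  container_provisioner:
    image: <provisioner_image>   # placeholder
    ports:
      - "3000:3000"              # port 3000 serves the UI
    environment:
      - R2_URL=https://<Your_R2_Bucket_URL>/
    volumes:
      - ./credentials:/app/credentials             # creds.json; container path is an assumption
      - /var/run/docker.sock:/var/run/docker.sock  # lets the provisioner talk to the Docker API
```
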
## Visit the UI
The UI is accessible at `http://localhost:3000`.

## Live Demo
A live demo of the container provisioner is available at [https://algo7.tools](https://algo7.tools).

# Proxy Pool
Proxy Pool is a Docker image that runs both HTTP and SOCKS5 proxies over OpenVPN (the OpenVPN config is provided by the user via Docker bind mounts). Inside the container, `sockd`, `squid`, and the `openvpn` client are managed by `supervisord`.

The service integrates with the Container Provisioner to provide a pool of proxies for the scraper to use. The provisioner uses `docker-compose` labels to distinguish between proxies; at the moment, it only supports connecting to the Proxy Pool via HTTP proxies. Each service in the `docker-compose.yml` file represents a single proxy in the pool. The `docker-compose.yml` file for the proxy pool is located in the `proxy_pool` folder.

The Proxy Pool service can also be used directly with the scraper. Just make sure that the `PROXY_ADDRESS` environment variable is set in the `docker-compose.yml` file for the scraper.

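A minimal sketch of wiring the scraper to one proxy from the pool, assuming both run on the same compose network; the proxy hostname and port are illustrative assumptions, not values from the repository:

```yaml
services:
  scraper:
    image: ghcr.io/algo7/tripadvisor-review-scraper/scraper:latest
    environment:
      # Points the scraper at one HTTP proxy from the pool;
      # host and port here are assumptions for illustration.
      - PROXY_ADDRESS=http://proxy_1:8080
```
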
## Running the Proxy Pool | ||
1. Pull the latest scraper Docker image | ||
```bash | ||
docker pull ghcr.io/algo7/tripadvisor-review-scraper/vpn_worker:latest | ||
``` | ||
2. Create a docker-compose.yml file containing the configurations for each proxy (see the docker-compose.yml provided in the proxy_pool folder). | ||
3. Place the OpenVPN config file of each proxy in the corresponding bind mount folder speicified in the docker-compose.yml file. | ||
4. Run `docker-compose up` to start the container. |
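
The steps above can be sketched as a compose file with one service per proxy. The label, device, and path values below are illustrative assumptions, not the repository's actual configuration:

```yaml
# Hypothetical proxy_pool/docker-compose.yml sketch: one service per proxy.
services:
  proxy_1:
    image: ghcr.io/algo7/tripadvisor-review-scraper/vpn_worker:latest
    cap_add:
      - NET_ADMIN                    # OpenVPN needs to manage network interfaces
    devices:
      - /dev/net/tun
    labels:
      - "proxy.pool=true"            # a label the provisioner could match on (illustrative)
    volumes:
      - ./ovpn/proxy_1:/etc/openvpn  # bind mount for this proxy's OpenVPN config (path is an assumption)
```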