Merge pull request #2 from experius/release/0.2.0
Release/0.2.0
egordm authored Feb 25, 2020
2 parents 214e70d + 8ea9cad commit 2bacd8a
Showing 6 changed files with 47 additions and 19 deletions.
50 changes: 39 additions & 11 deletions README.md
@@ -1,12 +1,10 @@
![logo](https://github.com/experius/SeoSnap/raw/master/assets/logo.png)

# SeoSnap Beta

Setup for the whole SeoSnap stack, including the dashboard, cache server and cache warmer, used for prerendering and
full page caching of PWAs.

# Installation
* Pull the repo
* Pull the repo (*note: the pull is recursive*)
```
git clone --recursive git@github.com:experius/SeoSnap.git
```
@@ -15,12 +13,6 @@ git clone --recursive git@github.com:experius/SeoSnap.git
```
docker-compose up --build -d && docker-compose down
```
Wait 5 seconds and then build again
```
docker-compose up --build
```

Everything should run now

# Usage
* Dashboard: http://127.0.0.1:8080/ (default login: snaptron/Sn@ptron1337)
@@ -35,7 +27,7 @@ Cache directory ./cache
## Run cache warmer
Make sure you have created a website via the dashboard at http://127.0.0.1:8080/seosnap/website/add/
```
docker-compose run cachewarmer <website id>
docker-compose run cachewarmer cache <website id>
```

## Nginx
@@ -57,4 +49,40 @@ When the crawler is started it connects with the dashboard api. It uses scrapy t
The cache server is a simple file caching server. If a file with the content of the requested page exists, it serves the HTML from that file. If not, it renders the requested URL with rendertron and saves the HTML output to a file. To refresh the cache, the cache warmer uses PUT requests instead of GET, which forces the cache file to be updated.
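
The GET versus PUT behaviour described above can be sketched in a few lines of shell. This is only an illustration, not SeoSnap's actual implementation: `render_page` is a hypothetical stand-in for the rendertron call, and the `cksum`-based file naming in `cache_file` is an assumption made for the example.

```shell
#!/bin/sh
# Sketch of the cache server's decision logic (hypothetical, not SeoSnap's code).

# Directory holding one HTML file per cached URL (./cache as in the README).
CACHE_DIR="${CACHE_DIR:-./cache}"

# Stand-in for a rendertron call; a real setup would fetch rendered HTML here.
render_page() {
  printf '<html>rendered %s</html>\n' "$1"
}

# Derive a filesystem-safe file name from the URL (assumed naming scheme).
cache_file() {
  printf '%s/%s.html' "$CACHE_DIR" "$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)"
}

# handle_request METHOD URL
# GET serves the cached file if present and renders on a miss;
# PUT always re-renders and overwrites the cached file.
handle_request() {
  method="$1"
  url="$2"
  file="$(cache_file "$url")"
  mkdir -p "$CACHE_DIR"
  if [ "$method" = "PUT" ] || [ ! -f "$file" ]; then
    render_page "$url" > "$file"
  fi
  cat "$file"
}
```

Calling `handle_request GET <url>` twice renders only once, while `handle_request PUT <url>` re-renders every time, which is the property the cache warmer relies on when refreshing pages.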

# Built with
![diagram](https://github.com/experius/SeoSnap/raw/master/assets/software.png)
![diagram](https://github.com/experius/SeoSnap/raw/master/assets/software.png)

## Cache warmer usage ([see README](https://github.com/experius/SeoSnap-Cache-Warmer/blob/master/README.md))
### Commands
#### Cache
Handles caching of the pages associated with the given websites
```
Usage: crawl.py cache [OPTIONS] WEBSITE_IDS
Options:
--follow_next BOOLEAN Follows rel-next links if enabled
--recache BOOLEAN Recached all pages instead of not yet cached ones
--use_queue BOOLEAN Cache urls from the queue instead of the sitemap
--load BOOLEAN Whether already loaded urls should be scraped instead
--help Show this message and exit.
```

#### Clean
Handles cleaning of the dashboard queue
```
Usage: crawl.py clean [OPTIONS] WEBSITE_IDS
Options:
--help Show this message and exit.
```

### Examples
```
# Cache the sitemap of website 1
docker-compose run cachewarmer cache 1
# Cache requests in queue for websites 1 and 2
docker-compose run cachewarmer cache 1,2 --use_queue=true
# Clean the queue for websites 1 and 2
docker-compose run cachewarmer clean 1,2
```
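
When the website ids come from a list, for instance in a scheduled job, the comma-separated `WEBSITE_IDS` argument can be assembled with a small helper. The sketch below only prints the command rather than running it; `join_ids` and `warm_command` are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Dry-run sketch: builds the cachewarmer command line without executing it.

# join_ids ID... -> prints the ids joined by commas, e.g. "1,2,3".
join_ids() {
  old_ifs="$IFS"
  IFS=','
  joined="$*"   # "$*" joins the arguments with the first character of IFS
  IFS="$old_ifs"
  printf '%s' "$joined"
}

# warm_command ID... -> prints the cache command for the given website ids.
warm_command() {
  printf 'docker-compose run cachewarmer cache %s\n' "$(join_ids "$@")"
}
```

For example, `warm_command 1 2` prints the command for websites 1 and 2, which can then be inspected or piped to `sh`.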
7 changes: 4 additions & 3 deletions examples/firewall.sh
@@ -3,7 +3,8 @@
sudo sh -c "echo '{ \"iptables\": false }' >> /etc/docker/daemon.json && systemctl restart docker"

sudo ufw default deny incoming
sudo ufw allow from myip to any port 22
sudo ufw allow from myip to any port 8080
sudo ufw allow from myip to any port 5000
sudo ufw allow from <myip> to any port 22
sudo ufw allow from <myip> to any port 8080
sudo ufw allow from <myip> to any port 5000
sudo ufw allow out on docker0 from 172.17.0.0/16
sudo ufw enable
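
Since the `<myip>` placeholder has to be substituted in several rules, one option is to generate the `ufw allow` lines from a single IP and a list of ports. The following is a dry-run sketch under that assumption: `print_allow_rules` is a hypothetical helper that only prints the rules so they can be reviewed before being applied.

```shell
#!/bin/sh
# Dry-run sketch: prints the ufw rules instead of applying them.

# print_allow_rules IP PORT...
# Emits one "ufw allow" rule per port for the given source IP.
print_allow_rules() {
  ip="$1"
  shift
  for port in "$@"; do
    printf 'sudo ufw allow from %s to any port %s\n' "$ip" "$port"
  done
}
```

For instance, `print_allow_rules 203.0.113.7 22 8080 5000` prints the three allow rules from the script with a documentation-range address filled in.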
3 changes: 1 addition & 2 deletions update.sh
@@ -1,4 +1,3 @@
#!/bin/bash



git pull origin master --recurse-submodules
