Scrap ton VIE aims to provide a convenient way to find VIE job offers. A development version lives at https://vie.johanet.fr/
Note: on 11/21/19, the SSL certificates on the civiweb website broke. I had to skip certificate verification, although this is probably going to be fixed.
To ignore the urllib3 warning raised by requests, use:
export PYTHONWARNINGS="ignore:Unverified HTTPS request"
Install dependencies:
npm install
Launch the development server with:
npm run watch
npm install
npm run build
Start scraping by running python scrap.py with a proper config.ini in the same directory, e.g.:
[credentials]
api_key = ABCDEFGHIJKLMNOPQSRTUVW
[civiweb]
offer_list = https://www.civiweb.com/FR/offre-liste/page/
offer_page = https://www.civiweb.com/FR/offre/
[db]
database = my_database
hostname = localhost
port = 5432
username = my_username
password = my_password
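The sections and keys of this file can be read with Python's standard configparser module. The sketch below is only an illustration of that layout: how scrap.py actually loads its configuration is not shown here, and the helper name load_settings is hypothetical.

```python
import configparser

# Same layout as the example config.ini; all values are placeholders.
SAMPLE_CONFIG = """
[credentials]
api_key = ABCDEFGHIJKLMNOPQSRTUVW

[civiweb]
offer_list = https://www.civiweb.com/FR/offre-liste/page/
offer_page = https://www.civiweb.com/FR/offre/

[db]
database = my_database
hostname = localhost
port = 5432
username = my_username
password = my_password
"""


def load_settings(text):
    """Parse config.ini content into a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "api_key": parser["credentials"]["api_key"],
        "offer_list": parser["civiweb"]["offer_list"],
        "offer_page": parser["civiweb"]["offer_page"],
        "database": parser["db"]["database"],
        "hostname": parser["db"]["hostname"],
        # getint converts the port to an integer and fails early on bad input.
        "port": parser.getint("db", "port"),
        "username": parser["db"]["username"],
        "password": parser["db"]["password"],
    }


settings = load_settings(SAMPLE_CONFIG)
print(settings["hostname"], settings["port"])
```

With a real file on disk, parser.read("config.ini") would replace read_string. A missing section or key raises KeyError, which makes an incomplete config.ini visible before any scraping starts.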
In production, use npm run build, then serve index.html together with the dist folder using any web server.
Run the scraper as a cron job. Include full paths to avoid $PATH errors, and redirect stderr to the log file.
crontab -e
0 1 */5 * * /usr/bin/env python3.6 /home/user/scrap/scrap.py >> /home/user/scrap/scrap.log 2>&1
Build the NodeJS app with npm run build, then start the server with forever or pm2.
WITH total AS (SELECT COUNT(*)::numeric FROM offer)
SELECT COUNT(country) AS nombre, ROUND(COUNT(country)/(SELECT * FROM total), 4) * 100 AS percentage , country FROM offer
GROUP BY country
ORDER BY nombre DESC;
SELECT country, UPPER(city), salary FROM offer
GROUP BY salary, UPPER(city), country
ORDER BY salary DESC;
SELECT salary, ntile(n) OVER (ORDER BY salary) AS quantile FROM offer GROUP BY salary;
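In the last query, n stands for the number of buckets (4 for quartiles, 10 for deciles). To make the bucket assignment concrete, here is a small pure-Python sketch of what ntile(n) OVER (ORDER BY salary) computes on distinct salaries. The salary values are made up for illustration, and the function is a re-implementation for explanation only, not code from this project.

```python
def ntile(values, n):
    """Mimic SQL ntile(n) OVER (ORDER BY value) on a list of values.

    Rows are sorted, then split into n buckets of near-equal size;
    when the split is uneven, the first buckets get one extra row.
    """
    ordered = sorted(values)
    base, extra = divmod(len(ordered), n)
    result = []
    index = 0
    for bucket in range(1, n + 1):
        size = base + (1 if bucket <= extra else 0)
        for value in ordered[index:index + size]:
            result.append((value, bucket))
        index += size
    return result


# Hypothetical salaries; set() plays the role of GROUP BY salary.
salaries = set([1800, 2100, 2100, 2500, 2900, 3200])
for salary, quantile in ntile(salaries, 4):
    print(salary, quantile)
```

With five distinct salaries and n = 4, the first bucket holds two rows and the other three hold one each.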