nwihardjo/Startup-Crawler

Crawler

Automatically scrapes startup-related data from startup-database websites using scrapy-splash.

Features

  • Extract startups' funding information
  • Scrape startups' profile information (logo, description, founder)
  • Format the data so it can be uploaded to a MongoDB database
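
The formatting step in the last feature could look roughly like the sketch below. It is only an illustration: the field names (`name`, `description`, `founder`, `logo`, `funding`), the `;` separator between founders, and the `$1.5M` style of funding figure are all assumptions, not the crawler's actual schema.

```python
import re
from typing import Optional

# Multipliers for suffixes assumed to appear in scraped funding figures.
_SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_funding(raw: str) -> Optional[int]:
    """Convert a string such as "$1.5M" to an integer, or None if unparseable."""
    match = re.fullmatch(
        r"\$?\s*(\d[\d,]*(?:\.\d+)?)\s*([KMB])?", raw.strip(), re.IGNORECASE
    )
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").upper()
    return int(round(number * _SUFFIXES.get(suffix, 1)))


def to_document(row: dict) -> dict:
    """Build a MongoDB-ready document from one scraped row (hypothetical fields)."""
    founders = [f.strip() for f in row.get("founder", "").split(";") if f.strip()]
    return {
        "name": row["name"].strip(),
        "description": row.get("description", "").strip(),
        "founders": founders,
        "logo_url": row.get("logo") or None,
        "funding_usd": parse_funding(row.get("funding", "")),
    }
```

A list of such documents can then be handed to whichever MongoDB client is in use; missing or unparseable values are stored as `None` rather than as empty strings.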

Windows-user dependencies:

  • Anaconda / Miniconda - runs the whole framework in an isolated environment
  • Docker
  • Python 3
  • scrapy-splash
  • scrapy
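
Once these dependencies are installed, a Scrapy project is typically connected to Splash through its `settings.py`. The fragment below follows the configuration described in the scrapy-splash README. It assumes a Splash container is listening on `localhost:8050` (for example, one started with `docker run -p 8050:8050 scrapinghub/splash`); this repository's actual settings may differ.

```python
# settings.py fragment (a sketch; the address below is an assumption).

# Where the Splash rendering service can be reached.
SPLASH_URL = "http://localhost:8050"

# Route requests through Splash and keep cookies and compression working.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}

# Avoid storing duplicate Splash arguments with every queued request.
SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}

# Make duplicate filtering aware of Splash request arguments.
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"
```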

Getting the crawled data:

All data is stored in a CSV file named after the corresponding database website.
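
As a minimal sketch, the crawled rows can be read back with Python's standard `csv` module. The `<site name>.csv` naming pattern, the `data_dir` parameter, and both helper names are assumptions made for illustration; check the output directory for the real file names.

```python
import csv
from pathlib import Path
from typing import Dict, List


def csv_path_for(site_name: str, data_dir: str = ".") -> Path:
    """Return the assumed location of the CSV file for one database website."""
    return Path(data_dir) / f"{site_name}.csv"


def load_rows(site_name: str, data_dir: str = ".") -> List[Dict[str, str]]:
    """Read every row of a site's CSV file into a list of dictionaries."""
    path = csv_path_for(site_name, data_dir)
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Each returned dictionary is keyed by the column names in the CSV header, so the available fields depend on what the crawler wrote for that site.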

Installation guide: here

