
VincentLeV/stackoverflow-api-py



Stack Overflow API

Table of Contents

Introduction
Features
Tech Stack
Run The Project Locally
Extract Data Only
Demo

Introduction

This is an API that scrapes the newest unanswered questions on Stack Overflow by tag. It should work with any tag available on Stack Overflow.

With this API, users can quickly browse the unanswered questions on Stack Overflow without any hassle. Each question comes with a link that leads straight to Stack Overflow.

The motivation behind this project is to practice Python, web scraping, and building data pipelines.
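To make the "by tag" idea concrete, here is a minimal sketch of how a listing URL for one tag could be built before it is scraped. The helper name `build_unanswered_url`, the `BASE_URL` path and the `tab` query parameter are assumptions for illustration, not taken from this project's code, and the real scraper may build its requests differently.

```python
from urllib.parse import quote

# Assumption: unanswered questions for a tag are listed under this path.
BASE_URL = "https://stackoverflow.com/unanswered/tagged"


def build_unanswered_url(tag: str, tab: str = "newest") -> str:
    """Return the listing URL for one tag (hypothetical helper)."""
    cleaned = tag.strip().lower()
    if not cleaned:
        raise ValueError("tag must not be empty")
    # safe="" makes quote() escape characters such as "+" and "#",
    # which appear in tags like "c++" and "c#".
    return f"{BASE_URL}/{quote(cleaned, safe='')}?tab={tab}"


if __name__ == "__main__":
    for example_tag in ("python", "c++", "c#"):
        print(build_unanswered_url(example_tag))
```

Escaping the tag matters because characters such as "+" and "#" have a special meaning in URLs and would otherwise point to a different page.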

Features

  • Check unanswered questions by tag
  • Extract data to .csv, .py and .js formats
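As a rough illustration of the extraction feature, the sketch below turns a list of scraped questions into .csv, .py and .js text. The field names (`title`, `link`, `tag`), the function names and the sample record are hypothetical. The project lists Pandas in its tech stack, but this sketch uses only the standard library so it runs without extra dependencies.

```python
import csv
import io
import json

# Hypothetical column names; the real project may use different fields.
FIELDS = ["title", "link", "tag"]


def to_csv_text(questions):
    """Serialize the questions as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(questions)
    return buffer.getvalue()


def to_py_module(questions, name="questions"):
    """Serialize the questions as a Python module defining one variable."""
    return f"{name} = {questions!r}\n"


def to_js_module(questions, name="questions"):
    """Serialize the questions as a JavaScript module exporting one constant."""
    return f"export const {name} = {json.dumps(questions, indent=2)};\n"


if __name__ == "__main__":
    # Placeholder record for demonstration only, not real scraped data.
    sample = [{
        "title": "Example question",
        "link": "https://stackoverflow.com/questions/1",
        "tag": "python",
    }]
    print(to_csv_text(sample))
    print(to_py_module(sample))
    print(to_js_module(sample))
```

Returning strings instead of writing files directly keeps each format easy to inspect; a caller could write each result into the "data" folder afterwards.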

Tech Stack

  1. Python
  2. Jupyter Notebook
  3. Requests-HTML
  4. FastAPI
  5. Pandas

Run The Project Locally

In the terminal, execute:

Windows

./start.ps1

macOS

./start.sh

Check the API out at

Check the app out at http://localhost:8000

After the command runs, the data is extracted into the different formats and can be found in the "data" folder.

Extract Data Only

Execute the following line in the terminal:

Windows

./extract.ps1

macOS

./extract.sh

Demo

https://stackoverflow-api-py.herokuapp.com/