This repository contains code for a Flask-based RESTful API that ranks the columns of Excel files and returns the top 5 columns.
Clone the repository using the following command.
$ git clone https://github.com/sbeen1840/Excel-Files-Ranker-API.git
Note: It is recommended to use a conda environment with Python 3.6 for this code. Before running the commands in this guide, activate the environment using $ source activate <name of the env>
Then use the requirements.txt file given in the repository to install the dependencies via pip.
$ pip install -r requirements.txt
To verify that pandas and flask were installed properly, run the following.
$ python
>>> import pandas
>>> import flask
If importing these dependencies produces no error messages, they are correctly installed.
Step 1: Enter the keyword on the Naver Trends site and download the Excel file.
Step 4: Modify the masterpath, extension, and folders parameters in the Ranker class to specify the location of your Excel files.
self.masterpath = masterpath
self.extension = extension
self.folders = folders
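As a rough illustration of how these three parameters might fit together, here is a minimal sketch. Only the attribute names masterpath, extension, and folders come from this README; the excel_files helper, and the reading of masterpath as a root directory with folders as sub-directories beneath it, are assumptions made for the example and may not match the repository's actual Ranker class.

```python
from pathlib import Path


class Ranker:
    """Hypothetical sketch; only the three attribute names come from the README."""

    def __init__(self, masterpath, extension, folders):
        self.masterpath = masterpath  # assumed: root directory holding the data folders
        self.extension = extension    # assumed: file extension to match, e.g. ".xlsx"
        self.folders = folders        # assumed: sub-folder names under masterpath

    def excel_files(self):
        """Hypothetical helper: list matching files in each folder, sorted by name."""
        found = []
        for folder in self.folders:
            directory = Path(self.masterpath) / folder
            found.extend(sorted(directory.glob(f"*{self.extension}")))
        return found
```

With this sketch, Ranker("C:/data", ".xlsx", ["2021", "2022"]).excel_files() would list every .xlsx file directly inside C:/data/2021 and C:/data/2022 (the paths here are placeholders).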
After running the script, you can view the keywords that represent the search trend, along with their normalized search volume, by visiting http://localhost:5000/home in your web browser. The data is returned as JSON, sorted in descending order.
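For readers unfamiliar with Flask, the following is a minimal sketch of how a /home endpoint could return ranked data as JSON. It is not the repository's actual code: SAMPLE_RANKING and its field names are placeholders invented for the example, and the real application presumably builds its response from the Ranker class.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Placeholder data (assumption); already sorted in descending order of value.
SAMPLE_RANKING = [
    {"keyword": "keyword_a", "value": 120.0},
    {"keyword": "keyword_b", "value": 95.5},
]


@app.route("/home")
def home():
    # jsonify serialises the list and sets the application/json content type.
    return jsonify(SAMPLE_RANKING)
```

Calling app.run(port=5000) would serve this sketch at http://localhost:5000/home. A list is used here rather than a dictionary because a JSON array keeps its element order, whereas jsonify sorts dictionary keys alphabetically by default, which would discard the ranking.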
The API returns a JSON object containing the top 5 columns of each Excel file in the specified folders, ranked by the sum of the values in each column.
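The ranking described above can be sketched with pandas as follows. This is an illustrative sketch under stated assumptions, not the repository's implementation: the function names load_sheet and top_columns are hypothetical, and restricting the ranking to numeric columns is an assumption. The skiprows=5 argument reflects this README's note that the first 5 rows are header information.

```python
import pandas as pd


def load_sheet(path):
    # Skip the first 5 rows, which this README describes as header information.
    # Reading .xlsx files requires an Excel engine such as openpyxl.
    return pd.read_excel(path, skiprows=5)


def top_columns(frame, n=5):
    """Return the n numeric columns with the largest sums, largest first."""
    sums = frame.select_dtypes(include="number").sum()
    # nlargest returns the entries in descending order of value.
    return sums.nlargest(n).to_dict()
```

For example, top_columns(load_sheet("trend.xlsx")) would return a dictionary mapping up to five column names to their sums, where trend.xlsx is a placeholder file name.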
info/excel_files_ranker_api.ipynb describes the execution steps of this program pipeline in detail.
- This code assumes that the first 5 rows of each Excel file are header information and skips them.
- This code only works with .xlsx file extensions.
- The API can be accessed at localhost:5000/home.
- The code has only been tested on Windows machines.
- sbeen1840
- This project is licensed under the MIT License; see the LICENSE file for details.