This project provides a set of functions to analyze employee salary data from a CSV file. It includes capabilities to calculate salary statistics, categorize salaries, filter employees, and check for missing phone numbers.
To run this project, ensure you have Python 3.x installed along with the following libraries:
- pandas
- numpy
You can install the required libraries using pip (ideally in a virtual environment):
pip install pandas numpy
To use the code, run the main block at the bottom of the script. This will execute all the functions defined in the script and print out relevant results.
python -i main.py
This runs the script and then leaves the Python interpreter open (the -i flag), so you can inspect the resulting DataFrames interactively.
Fetches the employee data from a specified CSV URL.
Fetches department data from a specified CSV URL and renames the department ID column.
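As a rough illustration of the fetch-and-rename step, the sketch below reads a CSV and renames one column. The function name fetch_departments, the column names, and the new name DEPARTMENT_IDENTIFIER are all assumptions made for this example; the README does not state them. An in-memory CSV stands in for the real URL, since pd.read_csv accepts either.

```python
import io

import pandas as pd


def fetch_departments(source) -> pd.DataFrame:
    """Hypothetical helper: load department data and rename its ID column."""
    # pd.read_csv accepts a URL string as well as a file-like object.
    departments_df = pd.read_csv(source)
    # Assumed rename; the real old and new column names are not given here.
    return departments_df.rename(columns={"DEPARTMENT_ID": "DEPARTMENT_IDENTIFIER"})


# In-memory stand-in for the CSV that the script downloads.
sample_csv = io.StringIO(
    "DEPARTMENT_ID,DEPARTMENT_NAME\n10,Administration\n20,Marketing\n"
)
departments_df = fetch_departments(sample_csv)
print(departments_df.columns.tolist())
```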
Calculates the average, the median, and the lower and upper quartiles of employee salaries.
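The statistics described above could be computed along these lines. This is a minimal sketch, not the script's actual code: the function name salary_stats and the SALARY column name are assumptions, and the quartiles use pandas' default linear interpolation.

```python
import pandas as pd


def salary_stats(employees_df: pd.DataFrame) -> dict:
    """Hypothetical helper: summary statistics for an assumed SALARY column."""
    salaries = employees_df["SALARY"]
    return {
        "average": salaries.mean(),
        "median": salaries.median(),
        "lower_quartile": salaries.quantile(0.25),
        "upper_quartile": salaries.quantile(0.75),
    }


# Made-up sample data for illustration only.
employees_df = pd.DataFrame({"SALARY": [1000, 2000, 3000, 4000, 5000]})
stats = salary_stats(employees_df)
print(stats)
```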
Calculates the average salary per department and includes the department name in the results.
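One way to get a per-department average with the department name attached is a groupby followed by a merge. The sketch below assumes columns named DEPARTMENT_ID, SALARY, and DEPARTMENT_NAME and an output column AVG_SALARY; none of these names is confirmed by this README, and the data is invented.

```python
import pandas as pd

# Made-up sample data; column names are assumptions.
employees_df = pd.DataFrame(
    {"DEPARTMENT_ID": [10, 20, 20], "SALARY": [1000.0, 2000.0, 4000.0]}
)
departments_df = pd.DataFrame(
    {"DEPARTMENT_ID": [10, 20], "DEPARTMENT_NAME": ["Administration", "Marketing"]}
)

avg_salary_per_department_df = (
    employees_df.groupby("DEPARTMENT_ID", as_index=False)["SALARY"]
    .mean()
    .rename(columns={"SALARY": "AVG_SALARY"})
    # A left merge keeps every department that has at least one employee.
    .merge(departments_df, on="DEPARTMENT_ID", how="left")
)
print(avg_salary_per_department_df)
```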
Creates a new column SALARY_CATEGORY indicating whether the salary is "low" or "high" relative to the average salary.
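A sketch of the categorization, assuming a SALARY column and assuming that salaries strictly below the overall average count as "low" and everything else as "high". How the script treats a salary exactly equal to the average is not stated here, and the function name add_salary_category is hypothetical.

```python
import numpy as np
import pandas as pd


def add_salary_category(employees_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: label each salary against the overall average."""
    result = employees_df.copy()  # leave the caller's DataFrame untouched
    average_salary = result["SALARY"].mean()
    # Assumption: below average is "low", equal or above is "high".
    result["SALARY_CATEGORY"] = np.where(
        result["SALARY"] < average_salary, "low", "high"
    )
    return result


# Made-up sample data: the average is 3000.
categorized_df = add_salary_category(pd.DataFrame({"SALARY": [1000, 2000, 6000]}))
print(categorized_df)
```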
create_salary_catergory_among_department(employees_df: pd.DataFrame, avg_salary_per_department_df: pd.DataFrame) -> pd.DataFrame
Creates another column SALARY_CATEGORY_AMONG_DEPARTMENT to indicate salary status relative to the department average.
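The department-relative comparison might look like the following. It is only a sketch under stated assumptions: the helper name categorize_within_department, the columns DEPARTMENT_ID, SALARY, and AVG_SALARY, and the "low"/"high" labels with a strict below-average rule are all chosen for illustration.

```python
import numpy as np
import pandas as pd


def categorize_within_department(
    employees_df: pd.DataFrame, avg_salary_per_department_df: pd.DataFrame
) -> pd.DataFrame:
    """Hypothetical helper: compare each salary with its department average."""
    merged = employees_df.merge(
        avg_salary_per_department_df[["DEPARTMENT_ID", "AVG_SALARY"]],
        on="DEPARTMENT_ID",
        how="left",
    )
    # Assumption: below the department average is "low", otherwise "high".
    merged["SALARY_CATEGORY_AMONG_DEPARTMENT"] = np.where(
        merged["SALARY"] < merged["AVG_SALARY"], "low", "high"
    )
    # Drop the helper column so only the new category is added.
    return merged.drop(columns="AVG_SALARY")


# Made-up sample data.
employees_df = pd.DataFrame(
    {"DEPARTMENT_ID": [10, 10, 20, 20], "SALARY": [1000, 3000, 4000, 6000]}
)
avg_df = pd.DataFrame({"DEPARTMENT_ID": [10, 20], "AVG_SALARY": [2000.0, 5000.0]})
dept_categorized_df = categorize_within_department(employees_df, avg_df)
print(dept_categorized_df)
```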
Filters the employee DataFrame to include only those in department ID 20.
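Filtering by department is typically a boolean mask. The example below assumes a DEPARTMENT_ID column and uses a hypothetical helper name, filter_department.

```python
import pandas as pd


def filter_department(employees_df: pd.DataFrame, department_id: int = 20) -> pd.DataFrame:
    """Hypothetical helper: keep only rows for one department."""
    # .copy() avoids chained-assignment warnings if the result is modified later.
    return employees_df[employees_df["DEPARTMENT_ID"] == department_id].copy()


# Made-up sample data.
employees_df = pd.DataFrame(
    {"EMPLOYEE_ID": [1, 2, 3], "DEPARTMENT_ID": [10, 20, 20]}
)
department_20_df = filter_department(employees_df)
print(department_20_df)
```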
Increases the salary by 10% for all employees working in department 20.
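A possible shape for the raise, assuming SALARY and DEPARTMENT_ID columns. The helper name raise_department_salary is made up for this sketch, and whether the script modifies the DataFrame in place or returns a copy is not stated here; the example returns a copy.

```python
import pandas as pd


def raise_department_salary(
    employees_df: pd.DataFrame, department_id: int = 20, factor: float = 1.10
) -> pd.DataFrame:
    """Hypothetical helper: multiply salaries in one department by a factor."""
    result = employees_df.copy()
    in_department = result["DEPARTMENT_ID"] == department_id
    # Only the masked rows change; other departments keep their salary.
    result.loc[in_department, "SALARY"] = result.loc[in_department, "SALARY"] * factor
    return result


# Made-up sample data; float salaries avoid dtype changes on assignment.
employees_df = pd.DataFrame(
    {"DEPARTMENT_ID": [10, 20, 20], "SALARY": [1000.0, 2000.0, 3000.0]}
)
raised_df = raise_department_salary(employees_df)
print(raised_df)
```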
Checks whether any phone numbers in the PHONE_NUMBER column are empty.
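What counts as "empty" is not spelled out here. The sketch below assumes that both missing values (NaN/None) and blank or whitespace-only strings count; the helper name has_missing_phone_numbers is hypothetical.

```python
import pandas as pd


def has_missing_phone_numbers(employees_df: pd.DataFrame) -> bool:
    """Hypothetical helper: True if any PHONE_NUMBER is missing or blank."""
    # Treat NaN/None as empty strings, then strip surrounding whitespace.
    cleaned = employees_df["PHONE_NUMBER"].fillna("").astype(str).str.strip()
    return bool((cleaned == "").any())


# Made-up sample data.
complete_df = pd.DataFrame({"PHONE_NUMBER": ["515.123.4567", "590.423.4568"]})
incomplete_df = pd.DataFrame({"PHONE_NUMBER": ["515.123.4567", None, "  "]})
print(has_missing_phone_numbers(complete_df))
print(has_missing_phone_numbers(incomplete_df))
```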
The employee and department data are fetched from the following URLs: