Estimating Croatia's GINI Coefficient using Lagrange Interpolation Method for Lorenz Curve Approximation
This project contains a program that calculates the Gini coefficient for Croatia in 2018. The program first reads a CSV file containing the income distribution by tenths of the population (deciles), taken from Eurostat. It then creates the coordinates needed to plot the Lorenz curve. The Lorenz curve itself is calculated as a polynomial approximation by applying the Lagrange interpolation method to those coordinates. The program then calculates the Gini coefficient by integrating x minus the Lorenz function over [0, 1] (the Gini coefficient is twice that area). Two other methods of calculating the Gini coefficient are included, but only for comparison.
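The steps described above can be sketched as follows. This is a minimal illustration and not the code in lagrange.py: the decile shares are made-up numbers rather than the Eurostat figures, the CSV reading step is skipped, and the function names (lorenz_points, lagrange_coeffs, gini_from_coeffs) are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical income shares per decile (percent of total income).
# These are illustrative values, NOT the Eurostat data for Croatia.
shares = [3.0, 5.0, 6.0, 7.0, 8.0, 9.0, 11.0, 13.0, 16.0, 22.0]


def lorenz_points(shares):
    """Return (xs, ys): cumulative population and income fractions."""
    total = sum(shares)
    xs = [i / len(shares) for i in range(len(shares) + 1)]
    ys = [0.0]
    for s in shares:
        ys.append(ys[-1] + s / total)
    return xs, ys


def lagrange_coeffs(xs, ys):
    """Coefficients (lowest degree first) of the Lagrange interpolating polynomial."""
    coeffs = np.zeros(len(xs))
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        # Build the j-th basis polynomial: product of (x - xm) / (xj - xm), m != j.
        basis = np.array([1.0])
        for m, xm in enumerate(xs):
            if m != j:
                basis = P.polymul(basis, np.array([-xm, 1.0])) / (xj - xm)
        coeffs += yj * basis
    return coeffs


def gini_from_coeffs(coeffs):
    """Gini = 2 * integral over [0, 1] of (x - L(x)), with L given by coeffs."""
    gap = P.polysub(np.array([0.0, 1.0]), coeffs)  # x - L(x)
    antiderivative = P.polyint(gap)
    area = P.polyval(1.0, antiderivative) - P.polyval(0.0, antiderivative)
    return 2.0 * area


xs, ys = lorenz_points(shares)
coeffs = lagrange_coeffs(xs, ys)
print("Gini (Lagrange polynomial):", gini_from_coeffs(coeffs))
```

With eleven points the interpolating polynomial has degree ten. It passes through every Lorenz coordinate, but a polynomial of that degree can oscillate between points, which is one reason to compare it against the point-to-point line.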
The PDF document in this repo explains the theory and the procedure in more detail. Its LaTeX source is in NM_SeminarPaper.zip.
Python version: Python 3.6.2
Matplotlib version: 2.1.2
Sympy version: 1.5.1
Numpy version: 1.14.1
- Download this repo and store it on your computer.
- Go to the directory where the repo is stored.
- From the project's directory, run lagrange.py by typing the following in PowerShell: python lagrange.py
The polynomial approximation of the Lorenz curve obtained with the Lagrange interpolation method is the following:
If we plot a line from point to point (gray line) and compare it to the approximated Lorenz curve (red line), we get:
Additionally, and outside the project's strict scope, three methods were used to integrate the Lorenz curve and obtain the Gini coefficient: Sympy's integrate function, Monte Carlo simulation and Riemann sums. The following plot shows the running time of the three. Interestingly, Monte Carlo simulation gave the most accurate result, with 97.91% accuracy.
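As a rough illustration of two of these methods, the sketch below compares Sympy's integrate with a midpoint Riemann sum. It assumes a stand-in Lorenz curve L(x) = x**2 instead of the Croatian polynomial, so the exact Gini coefficient is 1/3. The helper gini_riemann is a hypothetical name and not necessarily how lagrange.py is organised.

```python
import sympy as sp

x = sp.symbols("x")

# Stand-in Lorenz curve for illustration only (NOT the interpolated polynomial).
lorenz = x**2

# Symbolic integration: Gini = 2 * integral over [0, 1] of (x - L(x)).
gini_exact = 2 * sp.integrate(x - lorenz, (x, 0, 1))


def gini_riemann(gap, n=10_000):
    """Midpoint Riemann sum of gap(x) = x - L(x) over [0, 1], doubled."""
    width = 1.0 / n
    area = sum(gap((i + 0.5) * width) for i in range(n)) * width
    return 2.0 * area


# Turn the symbolic expression into a plain numeric function.
gap = sp.lambdify(x, x - lorenz)
gini_approx = gini_riemann(gap)

print("Sympy integrate:", gini_exact)
print("Riemann sum:    ", gini_approx)
```

The symbolic result is exact for a polynomial, while the Riemann sum approaches it as the number of rectangles n grows.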
Monte Carlo simulation is presented here as a way to integrate the Lorenz curve and obtain the Gini coefficient. The number of trials increases to show how the method works.
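One common way to do this is a hit-or-miss estimate: draw random points in the unit square and count how many fall between the Lorenz curve and the line of equality. The sketch below assumes that variant and uses a stand-in curve L(x) = x**2 (exact Gini 1/3). The function name gini_monte_carlo and the fixed seed are illustrative choices, not taken from the project.

```python
import random


def lorenz(x):
    """Stand-in Lorenz curve for illustration only (NOT the Croatian polynomial)."""
    return x ** 2


def gini_monte_carlo(lorenz, n_samples, seed=0):
    """Hit-or-miss estimate of 2 * (area between y = x and the Lorenz curve)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(n_samples):
        px, py = rng.random(), rng.random()
        # A hit is a point lying above the Lorenz curve and below the equality line.
        if lorenz(px) <= py <= px:
            hits += 1
    # The unit square has area 1, so the hit fraction estimates the area directly.
    return 2.0 * hits / n_samples


# Increasing the number of samples shows the estimate settling down.
for n in (100, 1_000, 10_000, 100_000):
    print(n, gini_monte_carlo(lorenz, n))
```

The statistical error shrinks roughly with the square root of the number of samples, so each extra digit of accuracy needs about a hundred times more points.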
Making this project made me so happy.