- Add functionality to perform some common data cleaning tasks.
- Add `geo.py` module and functionality to set 'close' lat-long coordinates to the same value.
- SeriesProfile now reports gaps in pd.Series with type `datetime64` or for Series with `DatetimeIndex`. gh-20
- `times.py` module has been added with public functions `time_diffs`, `time_diffs_index`, `id_gaps`, `id_gaps_index`, `category_gaps`. gh-20
- `freq_most_least` default parameter for SeriesProfile has been changed to `(10, 5)`.
- Add memory usage to `DataFrameProfile` gh-30
- Improve formatting of `distribution_stats` function output gh-29
- Improved project documentation with project website gh-2
- Split reports module into `profiles` and `stats`
- Renamed `save_report` method to `save`
- Refactored tests to use pytest fixtures
- Add support for improved display in Jupyter Notebooks gh-22
- Allow user to select different string formats for profiles gh-24
- Allow user to specify number of most frequent and least frequent values to display in SeriesProfile gh-25
- Update for Python 3.12
- Switch project build to pyproject.toml gh-18
- Simplify import: `import pandahelper` now imports `DataFrameProfile`, `SeriesProfile`, `frequency_table`, and `distribution_stats` gh-17
- Improved `SeriesProfile` to better handle different data types. gh-19
- Removed excess trailing whitespace on reports gh-21
- Added improved type-checking for functions and profile classes
- First version of Panda-Helper