- Enhanced/Fixed: show_notebook() now outputs a better HTML file, and has added for it (thanks @tvdboom)
- Added: Parameter for verbosity (full, progress_only, off)
- Fixed: Histogram inaccuracies due to incorrect binning (thanks @troy46), fixes #116, #127
- Fixed: Incorrect detection of categorical series in some cases
- Fixed: Deprecation warnings for is_categorical_dtype (thanks @GegznaV), fixes #162
- Updated: jQuery to the latest version 3.7.1 (thanks @hedsnz)
- Updated: the project now uses the latest build & packaging pipelines (pyproject.toml)
- Updated: Using the "warnings" library directly instead of np.warnings
- Fixed: "KeyError: None of ['index'] are in the columns" (for pandas > 2.0)
- Fixed: "AttributeError: numpy has no attribute 'warnings'" (for numpy > 1.23)
- Fixed: "AttributeError: 'DataFrame' object has no attribute 'iteritems'"
- Fixed: np.bool deprecation warning
- Fixed: Pandas 'mad()' function deprecation warning
- Fixed: removed deprecation warnings
- Enhanced: added info tag for comet.ml-generated reports
- Fixed: issue with comet.ml in a notebook environment
- Fixed: division by zero crash in some cases
- Fixed: association graph description text
- Fixed: ignoring warning due to matplotlib/numpy/pandas version mismatches
- Added: support for Comet.ml
- Added: numerical features now show "ZEROES" count in the summary tab
- Fixed: issue causing "FloatingPointError: divide by zero encountered in true_divide" in some edge cases
- Enhanced: feature counts near 0% and 100% will now show a more accurate "<1%" and ">99%"
- Enhanced: added more changes to feature counting to make it more consistent
- Enhanced: now allowing tuples (not just lists) as parameters for naming datasets
- Fixed: appearance of feature numbers above 100 (better alignment) and 1000 (now shown vertically)
- Fixed: sorting issue causing feature summaries above 2500 to disappear
- Fixed: progress bar issues causing it to be repeated or cause line ending issues
- Fixed: report issue introduced in 2.0.5
- Fixed: crashes due to LaTeX escape codes in feature names that were causing "Font family ['STIXGeneral'] not found" errors
- Fixed: better handling of features named "index"
- Enhanced: made feature count more consistent and clear: taking "target" into account as well as explicitly calling out when features are not present in the comparison data frame
- Enhanced: made "Association" buttons more obvious and color-coded (they were a bit too hard to see given their importance)
- Fixed: re-fixed unclickable buttons in some circumstances
- Tweaked: changed default notebook scale factor to 1.0
- Fixed: problem with overlap in some categorical values in vertical layout when scaling is applied
- Fixed: default layout value ignored in show_notebook()
- Fixed: unclickable buttons in some circumstances
- Fixed: issue with width "%" in default value INI
- Added: show_notebook(...) for embedded notebook support (Jupyter, Colab, etc.)
- Added: report size scaling
- Added: vertical report layout
- Added: INI defaults for all show_xxx function parameters
- Updated: disallowed NaN values for target features (resolves many interpretation & reporting issues)
- Fixed: boolean issues with NaN/missing data
- Fixed: association graph label issues
- Fixed: association detail display issues
- Fixed: numerous miscellaneous issues
- Fixed: major display issues with progress bar in notebooks
- Updated: improved progress bar configuration/display
- Updated: restored compact font as default
- Added: CJK font support
- Added: color-coding for % of missing values
- Added: "open_browser" option for show_html()
- Enhanced: multiple report generation fixes and cosmetic updates
- Enhanced: better correlation edge-case handling
- Enhanced: moved logo HTML to be easier to control through INI
- Fixed: issues with columns named 'index' and missing columns in comparison data. Closes #60.
- Fixed: issues with missing data
- Fixed: support for numpy.float32 data. Closes #58.
- Fixed: issues for data columns with a single value
- Fixed: issues with page height
- Fixed: sorting issue when >500 features
- Fixed: multiple minor report generation issues
- Fixed: numerical summary showing 0.00 for small values or ranges
- Fixed: indexing issues that were causing warnings and report inconsistencies
- Fixed: selection issues in the reports
- Added: __version__ and other metadata
- Fixed: "KeyError" crash
- Fixed: error for coercion of boolean series to categorical
- Enhanced: error reporting output for type coercions
- Added: post-report-generation descriptive text for Jupyter/Colab
- Re-added: horizontal scrollbar
- Added: link to check for updates in header
- Fixed: all-NaN columns will not crash and get added to the report (as empty text feature)
- Fixed: detail tab title overflowing to 2 lines when multiple words
- Fixed: progress bar resetting to 0% when reaching 100%
- Fixed: images on PyPI site
- Updated: README
- Added: Support for categorical Pandas data type
- Fixed: MANY crash and general stability/compatibility issues! The library is now much more robust with regard to supporting different data and conditions.
- Fixed: "ValueError: index must be monotonic..." crash with some datasets (#10)
- Fixed: forcing feature names to be strings, to avoid crashing if numerical (#9)
- Improved: error message in case of mixed type feature (#3)
- Added: CHANGELOG.md!
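
For readers curious about the "<1%" and ">99%" entry above, here is a minimal sketch of that kind of display rule. It is an illustration only, not the library's actual implementation: the helper name `format_share` and the "n/a" fallback for an empty total are assumptions made for this example.

```python
def format_share(count: int, total: int) -> str:
    """Format count/total as a whole-number percentage.

    Hypothetical helper: small non-zero shares are shown as "<1%" and
    nearly-complete shares as ">99%", so they are not rounded to a
    misleading "0%" or "100%".
    """
    if total <= 0:
        return "n/a"  # assumption: avoid dividing by zero on empty data
    pct = 100.0 * count / total
    if 0 < pct < 1:
        return "<1%"
    if 99 < pct < 100:
        return ">99%"
    return f"{round(pct)}%"


if __name__ == "__main__":
    for count in (0, 1, 250, 999, 1000):
        print(count, format_share(count, 1000))
```

Only exact zero and exact completeness are reported as "0%" and "100%"; anything in between that would otherwise round to those values gets the inequality form instead.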