PyTorch normalization layer that learns Yeo-Johnson power scaling on tabular data

sedthh/InputNorm

InputNorm

InputNorm is a normalization layer that learns approximations of common scikit-learn scalers (such as the Yeo-Johnson / Box-Cox [1][2][3] based PowerTransformer) in a fully differentiable and numerically stable way.
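For reference, the standard Yeo-Johnson transform that PowerTransformer applies to each feature is shown below. The symbols $\psi$, $\lambda$ and $x$ are the usual textbook notation rather than names taken from this repository, and how InputNorm parameterizes or constrains $\lambda$ is not stated here.

$$
\psi(\lambda, x) =
\begin{cases}
\dfrac{(x + 1)^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0,\ x \geq 0 \\
\ln(x + 1) & \text{if } \lambda = 0,\ x \geq 0 \\
-\dfrac{(1 - x)^{2 - \lambda} - 1}{2 - \lambda} & \text{if } \lambda \neq 2,\ x < 0 \\
-\ln(1 - x) & \text{if } \lambda = 2,\ x < 0
\end{cases}
$$

With $\lambda = 1$ the transform is the identity. scikit-learn estimates $\lambda$ per feature by maximum likelihood, whereas a differentiable layer can treat it as a trainable parameter.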

Most data preprocessing pipelines apply scaling before missing-value imputation. To mimic that order, InputNorm also accepts NaNs in its input and passes them through to its output.

The normalization is applied feature-wise. Unlike BatchNorm, however, the layer keeps no running statistics, so it returns the same results during training and inference. InputNorm also differs from other normalization layers in that it is intended to be applied only once, immediately after the input.
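The behaviour described above can be illustrated with a minimal sketch. It is not InputNorm's actual implementation: the class name `YeoJohnsonSketch`, the `lmbda` parameter and the `eps` guard are hypothetical, and the standardization step that PowerTransformer applies by default is left out. The sketch assumes one learnable Yeo-Johnson lambda per feature, passes NaNs through, and keeps no running statistics.

```python
import torch
from torch import nn


class YeoJohnsonSketch(nn.Module):
    """Hypothetical stand-in for illustration, not InputNorm's actual code."""

    def __init__(self, num_features: int, eps: float = 1e-6):
        super().__init__()
        # One learnable lambda per feature; lambda = 1 is the identity transform.
        self.lmbda = nn.Parameter(torch.ones(num_features))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        nan_mask = torch.isnan(x)
        # Replace NaNs with 0 so they cannot poison the forward or backward pass.
        x_safe = torch.where(nan_mask, torch.zeros_like(x), x)
        lam = self.lmbda.expand_as(x_safe)

        pos = x_safe >= 0
        # Feed each branch only values from its own domain (0 elsewhere),
        # so the unused branch never produces NaN gradients.
        xp = torch.where(pos, x_safe, torch.zeros_like(x_safe))
        xn = torch.where(pos, torch.zeros_like(x_safe), -x_safe)

        # Crude guard against division by zero at lambda = 0 and lambda = 2.
        lam_p = torch.where(lam.abs() < self.eps, torch.full_like(lam, self.eps), lam)
        two_m = 2.0 - lam
        lam_n = torch.where(two_m.abs() < self.eps, torch.full_like(lam, self.eps), two_m)

        out_pos = (torch.pow(xp + 1.0, lam_p) - 1.0) / lam_p
        out_neg = -(torch.pow(xn + 1.0, lam_n) - 1.0) / lam_n
        out = torch.where(pos, out_pos, out_neg)
        # Restore NaNs at their original positions.
        return torch.where(nan_mask, torch.full_like(out, float("nan")), out)


layer = YeoJohnsonSketch(num_features=2)
batch = torch.tensor([[1.0, float("nan")], [-2.0, 3.0]])
transformed = layer(batch)  # same shape as batch, NaN kept in place
```

Because the module holds only the per-feature parameter and no buffers, switching between `train()` and `eval()` does not change its output.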

NOTE:

  • The layer is currently sensitive to the learning rate setting.
  • Exclude InputNorm's parameters from weight decay!
  • You can apply a dropout layer directly after this layer to mimic random tree-like network structures.
  • Although the layer handles most of the necessary data preprocessing steps, extreme outliers can still hinder training, so consider clipping your inputs as a preprocessing step!
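
The notes on weight decay, dropout and clipping could be wired up roughly as follows. This is a sketch under stated assumptions: `InputLayerStandIn` is a hypothetical placeholder for the InputNorm layer (its real constructor is not shown in this README), and the clipping bounds, dropout rate, learning rate and decay value are arbitrary illustrative choices.

```python
import torch
from torch import nn


class InputLayerStandIn(nn.Module):
    """Hypothetical placeholder for the input normalization layer."""

    def __init__(self, num_features: int):
        super().__init__()
        self.lmbda = nn.Parameter(torch.ones(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.lmbda


norm = InputLayerStandIn(4)
# Dropout sits directly after the input layer, as the note above suggests.
head = nn.Sequential(nn.Dropout(p=0.1), nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
model = nn.Sequential(norm, head)

# Two parameter groups: no weight decay for the input layer, decay for the rest.
optimizer = torch.optim.AdamW(
    [
        {"params": norm.parameters(), "weight_decay": 0.0},
        {"params": head.parameters(), "weight_decay": 1e-2},
    ],
    lr=1e-3,
)

# Clip extreme outliers before they reach the model (bounds are illustrative).
x = torch.tensor([[0.5, -1.0, 1e9, 2.0]])
x_clipped = torch.clamp(x, min=-10.0, max=10.0)
prediction = model(x_clipped)
```

In practice the clipping bounds would more plausibly be chosen per feature, for example from quantiles of the training data.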
