1.1.0
Extended the source code for the experiments in our new paper: "Deep Learning on Small Datasets without Pre-Training using Cosine Loss" by Björn Barz and Joachim Denzler.
Backwards-incompatible changes
- The meaning of the `resnet-110-*` architecture names has changed (see the summary sketch after this list):
  - `resnet-110` now always refers to the standard ResNet-110 architecture with 16, 32, and 64 channels per block. Previously, the number of channels in the last block equaled the embedding dimension when learning image embeddings.
  - `resnet-110-fc` previously had twice as many channels as the standard ResNet-110 and always a final fully-connected layer (as opposed to `resnet-110`, which lacks that final layer when learning embeddings). This architecture is now referred to as `resnet-110-wfc`, while `resnet-110-fc` now just always has a final FC layer but the standard number of channels.
- Dataset interfaces have been completely refactored. The API of `get_data_generator` has stayed the same and this function can still be imported from `datasets` (see the import example below). However, `datasets` is no longer a single module but a sub-package, containing one module per dataset interface. Some additional layers of abstraction have been introduced to reduce redundancy in the code.
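
For reference, the renaming can be summarized as follows. This is a hypothetical Python summary for illustration only; the dictionary and its keys do not exist in the repository, and the channel counts are derived from the description above:

```python
# Hypothetical summary of the ResNet-110 renaming described above.
# This mapping is illustrative and is not part of the repository.
RESNET110_VARIANTS = {
    'resnet-110':     {'channels': (16, 32, 64),  'final_fc': False},  # standard width; no final FC when learning embeddings
    'resnet-110-fc':  {'channels': (16, 32, 64),  'final_fc': True},   # standard width, always a final FC layer
    'resnet-110-wfc': {'channels': (32, 64, 128), 'final_fc': True},   # twice the width (previously called `resnet-110-fc`)
}
```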
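Importing the data generator works as before. A minimal sketch: the import location is confirmed by these notes, but the call signature and arguments shown are placeholder assumptions and may differ from the actual API:

```python
# Only the import location is confirmed by the release notes; the exact
# arguments are an assumption for illustration.
from datasets import get_data_generator

data_generator = get_data_generator('CIFAR-100', '/path/to/dataset')
```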
New features
- An interface to the CUB dataset, along with meta-data for 3 different class-hierarchy variants of this dataset. We also provide pre-trained models for CUB.
- A variant of the NAB dataset with input size 448x448 instead of 224x224, referred to as `NAB-large`. We also provide pre-trained models for this variant, which perform better than those for the 224x224 variant.
- Dataset interfaces for the Stanford Cars and Oxford Flowers-102 datasets.
- The CIFAR ResNet architectures now support input sizes other than 32x32, including dynamic ones.
- One-hot class embeddings can now be generated on the fly (i.e., without the need for a pickle file) by specifying `--embedding onehot`. See the first sketch after this list.
- `learn_classifier.py` now supports label smoothing via the `--label_smoothing` CLI argument (illustrated after this list).
- CLI argument `--nesterov` for training with Nesterov momentum.
- CLI argument `--snapshot_best` for snapshotting the best model only.
- `DataSequence` now supports oversampling and multiple repetitions of the data per epoch (sketched after this list).
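
For clarity, one-hot class embeddings are simply the unit vectors, which is why no pre-computed pickle file is needed. A minimal sketch of the concept, not the repository's code:

```python
import numpy as np

num_classes = 100

# One-hot class embeddings: the embedding of class c is the c-th unit vector,
# so the full embedding matrix is just the identity matrix.
class_embeddings = np.eye(num_classes)
```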
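Label smoothing replaces hard one-hot targets with slightly softened ones. A minimal sketch of the standard formulation; the function and variable names are illustrative, not taken from `learn_classifier.py`:

```python
import numpy as np

def smooth_labels(onehot, eps=0.1):
    """Standard label smoothing: move a fraction `eps` of the probability
    mass from the true class to a uniform distribution over all classes."""
    num_classes = onehot.shape[-1]
    return onehot * (1.0 - eps) + eps / num_classes
```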
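The effect of repeating the data several times per epoch can be illustrated with a plain Keras `Sequence`. This is a hypothetical sketch under that assumption, not the actual `DataSequence` implementation:

```python
import numpy as np
from keras.utils import Sequence

class RepeatedSequence(Sequence):
    """Hypothetical illustration of per-epoch data repetition;
    not the repository's actual DataSequence implementation."""

    def __init__(self, x, y, batch_size=32, repeats=1):
        self.x, self.y = np.asarray(x), np.asarray(y)
        self.batch_size = batch_size
        self.repeats = repeats  # number of passes over the data per epoch

    def __len__(self):
        # One "epoch" now spans `repeats` passes over the dataset.
        return int(np.ceil(len(self.x) * self.repeats / self.batch_size))

    def __getitem__(self, idx):
        total = len(self.x) * self.repeats
        start = idx * self.batch_size
        stop = min(start + self.batch_size, total)
        ind = np.arange(start, stop) % len(self.x)  # wrap around the dataset
        return self.x[ind], self.y[ind]
```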
Bug fixes
- Fixed hand-crafted learning rate schedule specification using `--sgd_schedule`, which previously broke right before the last epoch.
- Fixed DenseNet.