ML back-end, workflows, database additions/encoding, and more
- Addition of validated, PaDEL/alvaDesc-generated YSI databases
- Updated repository links and author information
- ecnet.utils.data_utils now forces UTF-8 encoding for all database creation/saving
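  A minimal sketch of the change, assuming a CSV-style database; the filename and header row below are hypothetical:

  ```python
  # Database files are now opened with an explicit UTF-8 encoding instead of
  # relying on the platform default
  with open('my_database.csv', 'w', encoding='utf-8') as db_file:
      db_file.write('DATAID,ASSIGNMENT,Compound Name\n')  # hypothetical header
  ```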
- ML back-end updated to TensorFlow 2.0.0
  - No API changes to ecnet.models.mlp.MultilayerPerceptron
  - Existing .h5 model files will not work with the updated class
  Note: PyTorch was initially evaluated as an alternative; however, both its performance in our tests and the viability of installing it on the high-performance machines available to the ECRL were deemed inadequate, so updating to TensorFlow 2.0.0 was the most appropriate course of action.
- Only the following hyper-parameters are tuned with the built-in functions (see the sketch after this list):
  - Learning rate of the Adam optimization function
  - Learning rate decay of the Adam optimization function
  - Batch size during training
  - Patience (if validating, the number of epochs to wait for an improved validation loss before terminating training)
  - Size of each hidden layer
  Note: given the relatively small number of samples our models are trained with, it does not make sense to adjust hyper-parameters such as beta_1, beta_2, and epsilon; the hyper-parameters listed above are theorized to play a much more important role in how the models train and perform.
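  A hedged sketch of invoking the built-in tuning through ecnet.Server; the database filename and the argument names/values shown are illustrative, not a definitive signature:

  ```python
  from ecnet import Server

  sv = Server()
  sv.load_data('my_database.csv')  # hypothetical database filename
  # Tunes only the hyper-parameters listed above (Adam learning rate and
  # decay, batch size, patience, hidden layer sizes)
  sv.tune_hyperparameters(num_employers=10, num_iterations=10)
  sv.train()
  ```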
- Added the UML ECRL's general publication workflow as ecnet.workflows.ecrl_workflow.create_model
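  A usage sketch; the arguments passed below are hypothetical placeholders, as the signature of create_model is not described here:

  ```python
  from ecnet.workflows.ecrl_workflow import create_model

  # Hypothetical arguments: a training database and a target property name;
  # consult the function's docstring for its actual parameters
  create_model('ysi_database.csv', 'YSI')
  ```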
- If using ecnet.Server and not creating a project, a single model's filename can now be specified as an additional argument (default: model.h5)
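  A sketch assuming the new argument is accepted by Server's constructor; the keyword name model_file and the database filename are assumptions:

  ```python
  from ecnet import Server

  # 'model_file' is an assumed keyword name for the new argument
  sv = Server(model_config='config.yml', model_file='my_model.h5')
  sv.load_data('my_database.csv')
  sv.train()  # the single model is handled as my_model.h5 rather than model.h5
  ```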
- TensorFlow's verbose argument is now propagated from ecnet.Server.train to the model during training (exposed as an additional argument to train)
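  For example (Keras verbosity levels: 0 = silent, 1 = progress bar, 2 = one line per epoch); the surrounding calls are illustrative:

  ```python
  from ecnet import Server

  sv = Server()
  sv.load_data('my_database.csv')  # hypothetical database filename
  # verbose is forwarded to the underlying TensorFlow model during fitting
  sv.train(verbose=1)
  ```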
- ecnet.models.mlp.MultilayerPerceptron.fit now returns a tuple: (learn losses, validation losses)
  - Both lists contain one loss value (mean squared error) per epoch
  - If training a single model using ecnet.Server.train, this tuple is returned by train
  - If not performing validation, the validation losses list is populated with None elements, equal in length to the learn losses list
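  A sketch of consuming the returned losses when training a single model; the database filename is hypothetical:

  ```python
  from ecnet import Server

  sv = Server()
  sv.load_data('my_database.csv')
  # train returns fit's (learn losses, validation losses) tuple when no
  # project is used
  learn_losses, valid_losses = sv.train()
  # One MSE value per epoch; valid_losses holds None at every index if no
  # validation was performed
  for epoch, (t_loss, v_loss) in enumerate(zip(learn_losses, valid_losses)):
      print('Epoch {}: learn MSE = {}, validation MSE = {}'.format(
          epoch, t_loss, v_loss))
  ```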
- If installing using setup.py, installing TensorFlow is optional; to skip installation of the pre-compiled PyPI distribution of TensorFlow, run setup.py with: python setup.py --omit_tf install
  Note: other methods of installing TensorFlow offer clear benefits (GPU support, different CPU instruction sets, etc.), so we provide an option for users to rely on an existing TensorFlow installation instead of forcing the PyPI-sourced version.
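  A hedged sketch of the pattern such a flag typically uses, not ECNet's exact setup.py: strip the custom flag from sys.argv before setuptools parses it, then drop TensorFlow from the requirement list:

  ```python
  import sys
  from setuptools import setup, find_packages

  # TensorFlow pinned per this release; other dependencies omitted for brevity
  install_requires = ['tensorflow==2.0.0']

  # Remove the custom flag before setuptools sees it and skip the TF requirement
  if '--omit_tf' in sys.argv:
      sys.argv.remove('--omit_tf')
      install_requires = [
          r for r in install_requires if not r.startswith('tensorflow')
      ]

  setup(
      name='ecnet',
      packages=find_packages(),
      install_requires=install_requires
  )
  ```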