Releases: JMGaljaard/fltk-testbed

v0.3.1

26 Sep 10:25

In short

This release updates the code base such that:

  1. Documentation is updated to show how docker images can be built and pushed.
  2. Memory issues are resolved/mitigated for FederatedLearning experiments.
  3. BatchOrchestrator has been tested and updated according to the findings from debugging.
    • Supports waiting for HistoricalArrivalTasks
    • Supports stop-and-go deployment through parallelism configuration parameters.

What's Changed

Full Changelog: v0.3.0...v0.3.1

Terraform and repetitions

06 Sep 14:27

In short

This release revamps the deployment on GKE with Terraform, making deployment a breeze. Furthermore, the dependency list is slimmed down from Kubeflow to only Kubeflow-training-operators. This alleviates the overhead on your cluster, as, for example, Istio is no longer required for deployment.

For experiments, the orchestrator allows for running repetitions of experiments directly. This allows you to describe an experiment file once (e.g. a distributed learning configuration), and run it multiple times in a single deployment.

What's Changed

Full Changelog: v0.2.2...v0.3.0

Federated num_epochs_per_round

30 May 10:45

Important information

In the configuration files for Federated Learning (e.g. configs/federated/*.yaml), make sure that you use the parameters as follows:

  • totalEpochs, which you MUST NOT use. It will be removed later, after this year's course has finished. This parameter was also not used before this release, but keep this in mind. A warning will be logged during execution if you do access it.
  • rounds, the number of communication rounds, so how many times the Federator node will sample and contact Client nodes.
  • epochsPerRound, the number of epochs the Clients perform within a communication round.
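
As an illustration of how these parameters relate, the following is a minimal sketch and not the testbed's own parser. The class `FederatedSchedule` and the function `schedule_from_dict` are hypothetical names, and warning as soon as `totalEpochs` is present in the mapping (rather than when it is accessed) is a simplifying assumption.

```python
import warnings
from dataclasses import dataclass


@dataclass
class FederatedSchedule:
    """Hypothetical stand-in for the relevant part of a federated config."""

    rounds: int            # communication rounds run by the Federator
    epochs_per_round: int  # local epochs a sampled Client trains per round

    def max_local_epochs(self) -> int:
        # Upper bound: a Client sampled in every round trains this many epochs.
        return self.rounds * self.epochs_per_round


def schedule_from_dict(raw: dict) -> FederatedSchedule:
    """Build a schedule from an already parsed config mapping."""
    if "totalEpochs" in raw:
        # Assumption: warn on presence; the value itself is ignored.
        warnings.warn(
            "totalEpochs is ignored; use rounds and epochsPerRound instead",
            DeprecationWarning,
            stacklevel=2,
        )
    return FederatedSchedule(
        rounds=int(raw["rounds"]),
        epochs_per_round=int(raw["epochsPerRound"]),
    )


schedule = schedule_from_dict({"rounds": 5, "epochsPerRound": 2})
print(schedule.max_local_epochs())
```

With 5 rounds of 2 epochs each, a Client that is sampled in every round trains for at most 10 local epochs; Clients sampled less often train for fewer.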

What's Changed

Full Changelog: v0.2.1...v0.2.2

Experiment parsing and testing

24 May 11:55

This release introduces the following changes:

  1. All federated and data_parallel experiment parsers now load all their respective parameters.
  2. The loss function can now be any of the functions that inherit from the torch.nn.modules.loss._Loss base class, selected by their CamelCase name.
  3. A series of test cases was added to verify that default values and configured values are parsed correctly.

Issues closed

This version resolves the following issues. For more information read the changelog above.

Kubernetes FLTK

12 May 09:42
Pre-release

This release introduces the following changes:

  1. FederatedLearning datasets other than FashionMNIST failed to load without raising an Exception, which resulted in failed experiment executions. This required a revision of the experiment configuration objects that were used by the learners, see also point 4.
    • To help prevent issues like these from being introduced undetected, a series of smoke tests was introduced that can be run locally.
  2. Distributed (DistributedDataParallel) experiments are now compatible again with KFLTK 🎉. See also #26.
  3. Configuration files with small floating-point numbers without a decimal (such as 10e-5) will no longer be parsed as a string by FedLearningConfig and DistLearningConfig objects. This was mainly an issue for the min_lr configuration parameter, which would result in an exception being raised after 50 epochs.
  4. The (flat) configuration objects for Federated and DistributedDataParallel learning experiments have been partially unified. These objects are now named FedLearningConfig and DistLearningConfig respectively. ⚠️ Make sure to update your imports.
    • Moreover, both classes can now be parsed directly from a JSON file. Note that the FromYamlFile function has been renamed to the more pythonic name from_yaml.
    • These objects now both make use of the Definitions Enum classes, allowing for more readable parsing errors when providing an incorrect type.
  5. A typo has been resolved in the jinja templates such that the aggregation is now properly set.
  6. Datasets for Federated experiments now make use of the test_batch_size, rather than using a default of 16. This parameter has also been introduced in the jinja template.
  7. The fltk.util.env module has been renamed to fltk.util.environment to add it to the repository. This also resolves #30.
  8. A series of linter issues have been resolved (mostly warnings and some typos).
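
The string-parsing problem from point 3 can be guarded against by coercing the value when the configuration object is built. The following is a minimal sketch of that idea, assuming the loader hands the value over as a string; `LearningRateConfig` and its default value are hypothetical and are not the actual FedLearningConfig or DistLearningConfig implementation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class LearningRateConfig:
    """Hypothetical config fragment; only the name min_lr comes from the notes."""

    min_lr: Union[float, str] = 1e-10  # assumed default, for illustration only

    def __post_init__(self) -> None:
        # A loader may deliver "10e-5" as a str; float() accepts that notation
        # and leaves values that are already floats unchanged.
        self.min_lr = float(self.min_lr)


config = LearningRateConfig(min_lr="10e-5")
print(type(config.min_lr).__name__, config.min_lr)
```

Coercing at construction time surfaces a malformed value immediately as a ValueError, instead of failing later once the value is first used in a numeric comparison.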

Issues closed

This version resolves the following issues. For more information read the changelog above.