GAN-based models to flash-simulate the LHCb PID detectors

What is PIDGAN?

PIDGAN is a Python package built upon TensorFlow 2 that provides ready-to-use implementations of several GAN algorithms (listed in the Generative Adversarial Networks table below). The package was originally designed to simplify the training and optimization of GAN-based models for the Particle Identification (PID) system of the LHCb experiment. Today, PIDGAN is a versatile package that can be employed in a wide range of High Energy Physics (HEP) applications and, more generally, in any problem involving tabular data where the goal is to learn the conditional probability distributions of a set of target features. The package is one of the building blocks of a Flash Simulation framework for the LHCb experiment [1].

PIDGAN is (almost) all you need (for flash-simulation)

Standard simulation techniques consume a huge amount of CPU time to reproduce all the radiation-matter interactions occurring within a HEP detector when it is traversed by primary and secondary particles. By directly transforming generated particles into analysis-level objects, Flash Simulation strategies can significantly speed up simulation production, by up to a factor of 1000 [1]. Such transformations can be defined using Generative Adversarial Networks (GAN) [2] trained to take into account the kinematics of the traversing particles and the detection conditions (e.g., magnet polarity, occupancy).

GANs rely on the simultaneous (adversarial) training of two neural networks, called generator and discriminator, whose competition ends once the Nash equilibrium is reached. At this point, the generator can be used as a simulator to generate new data according to the conditional probability distributions learned during training [3]. By relying on the TensorFlow and Keras APIs, PIDGAN allows one to define and train a GAN model in no more than 20 lines of code:

from pidgan.players.generators import Generator
from pidgan.players.discriminators import Discriminator
from pidgan.algorithms import GAN

x = ... # conditions
y = ... # targets

G = Generator(
  output_dim=y.shape[1],
  latent_dim=64,
  output_activation="linear",
)

D = Discriminator(
  output_dim=1,
  output_activation="sigmoid",
)

model = GAN(generator=G, discriminator=D)
model.compile(
  metrics=["accuracy"],
  generator_optimizer="rmsprop",
  discriminator_optimizer="rmsprop",
)

model.fit(x, y, batch_size=256, epochs=100)
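
The `x` and `y` placeholders in the snippet above stand for two-dimensional arrays with one row per particle and one column per feature. As a minimal sketch, assuming NumPy arrays are passed to `fit()` as with any Keras model, a toy dataset could be built as follows. The helper `make_toy_dataset`, its feature names, and its numerical ranges are hypothetical and are not part of PIDGAN.

```python
import numpy as np


def make_toy_dataset(n_samples=10_000, seed=42):
    """Return toy (conditions, targets) arrays; purely illustrative."""
    rng = np.random.default_rng(seed)

    # Hypothetical conditions: momentum, pseudorapidity, detector occupancy
    p = rng.uniform(3.0, 100.0, size=n_samples)
    eta = rng.uniform(2.0, 5.0, size=n_samples)
    n_tracks = rng.integers(10, 300, size=n_samples).astype(float)
    x = np.column_stack([p, eta, n_tracks]).astype(np.float32)

    # Hypothetical targets: two PID-like variables correlated with the conditions
    y1 = np.log(p) + rng.normal(0.0, 0.5, size=n_samples)
    y2 = 0.01 * n_tracks - eta + rng.normal(0.0, 1.0, size=n_samples)
    y = np.column_stack([y1, y2]).astype(np.float32)
    return x, y


x, y = make_toy_dataset()
print(x.shape, y.shape)  # (10000, 3) (10000, 2)
```

With arrays of this kind, `y.shape[1]` in the snippet above would evaluate to 2, the number of target features the generator has to produce.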

Installation guide

First steps

Before installing PIDGAN, we suggest preparing a fully operational TensorFlow installation by following the instructions described in the dedicated guide. If your device is equipped with one of the NVIDIA GPU cards supported by TensorFlow (see Hardware requirements), do not forget to verify the correct installation of the libraries for hardware acceleration by running:

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

If your GPU card is not included in the list printed by the previous command, your device and/or Python environment may be misconfigured. Please refer to the tested build configurations table in the TensorFlow documentation for the CUDA Toolkit and cuDNN versions required by each TensorFlow release.
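
For a slightly more verbose check, the one-liner above can be expanded into a short script. The following is a sketch only: the function name `list_gpus` is made up for this example, and it simply reports an empty list when TensorFlow is missing or no GPU is visible.

```python
def list_gpus():
    """Return the names of the GPUs visible to TensorFlow (empty list if none)."""
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
        return []

    gpus = tf.config.list_physical_devices("GPU")
    print(f"TensorFlow {tf.__version__} sees {len(gpus)} GPU(s)")
    # A CPU-only build never lists GPUs, whatever drivers are installed
    print(f"Built with CUDA support: {tf.test.is_built_with_cuda()}")
    return [gpu.name for gpu in gpus]


gpu_names = list_gpus()
print(gpu_names)
```

An empty list together with `Built with CUDA support: True` usually points to a driver or CUDA/cuDNN mismatch rather than to a CPU-only TensorFlow build.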

How to install

PIDGAN has a minimal list of requirements:

The easiest way to install PIDGAN is via pip:

pip install pidgan

In addition, since hopaas_client is not available on PyPI, you need to install it manually to unlock the complete set of PIDGAN functionalities:

pip install git+https://github.com/landerlini/hopaas_client
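
After installation, it can be handy to check which of the relevant packages are actually visible in the active environment. The sketch below uses only the standard library. The helper `installed_versions` is hypothetical, and the distribution names in the list (in particular `hopaas_client`) are assumptions that may need adjusting.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names are assumptions; adapt them to your environment
report = installed_versions(["pidgan", "tensorflow", "scikit-learn", "hopaas_client"])
for name, ver in report.items():
    print(f"{name:15s} {ver or 'not installed'}")
```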

Optional dependencies

Standard HEP applications may need additional packages for data management, results visualization/validation, and model export. PIDGAN and any additional requirements potentially useful in HEP can be installed via pip in one shot:

pip install "pidgan[hep]"

Models available

The main components of PIDGAN are the algorithms and players modules, which provide, respectively, implementations of several GAN algorithms and of the so-called adversarial neural networks (e.g., generator, discriminator). The objects exposed by these modules are implemented by subclassing the Keras Model class and customizing the training procedure executed when the fit() method is called. Starting from PIDGAN v0.2.0, the package has been extensively rewritten to also be compatible with the new multi-backend Keras 3. At the moment, the custom training procedures defined for the various GAN algorithms are implemented only for the TensorFlow backend, while support for the PyTorch and JAX backends is planned for a future release. The following tables report the complete set of algorithms and players classes currently available, together with a snapshot of their implementation details.
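
To give an idea of what a customized adversarial training procedure does at each step, here is a framework-free sketch written with NumPy only. It does not reproduce PIDGAN's Keras code: the toy linear generator, the toy discriminator, the hand-written gradients, and every name in the block are assumptions made for illustration. Each step updates the discriminator first and the generator second, and no claim is made about convergence.

```python
import numpy as np


def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))


def d_logit(d, x, y):
    # Toy discriminator: the x*y term lets it see the x-y correlation
    return d[0] * x * y + d[1] * y + d[2]


def train_toy_gan(x, y, steps=200, batch_size=64, lr=0.05, seed=0):
    """Alternate one discriminator and one generator update per step."""
    rng = np.random.default_rng(seed)
    g = np.array([0.0, 1.0, 0.0])  # generator: y_fake = g[0]*x + g[1]*z + g[2]
    d = np.zeros(3)                # discriminator weights (see d_logit)
    history = []
    for _ in range(steps):
        idx = rng.integers(0, len(x), size=batch_size)
        xb, yb = x[idx], y[idx]
        z = rng.normal(size=batch_size)  # latent noise
        y_fake = g[0] * xb + g[1] * z + g[2]

        # 1) Discriminator step on the loss -log D(real) - log(1 - D(fake))
        p_real = sigmoid(d_logit(d, xb, yb))
        p_fake = sigmoid(d_logit(d, xb, y_fake))
        d_loss = -np.mean(np.log(p_real + 1e-12)) - np.mean(np.log(1.0 - p_fake + 1e-12))
        d_grad = np.array([
            np.mean((p_real - 1.0) * xb * yb + p_fake * xb * y_fake),
            np.mean((p_real - 1.0) * yb + p_fake * y_fake),
            np.mean((p_real - 1.0) + p_fake),
        ])
        d = d - lr * d_grad

        # 2) Generator step on the loss -log D(fake), discriminator kept fixed
        p_fake = sigmoid(d_logit(d, xb, y_fake))
        g_loss = -np.mean(np.log(p_fake + 1e-12))
        dloss_dy = (p_fake - 1.0) * (d[0] * xb + d[1])  # chain rule through the logit
        g_grad = np.array([np.mean(dloss_dy * xb), np.mean(dloss_dy * z), np.mean(dloss_dy)])
        g = g - lr * g_grad

        history.append((d_loss, g_loss))
    return g, d, np.array(history)


rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=2000)       # one condition
y = 2.0 * x + 0.3 * rng.normal(size=2000)   # one target correlated with it
g, d, history = train_toy_gan(x, y)
print(history[-1])  # last (discriminator loss, generator loss) pair
```

In a Keras-based implementation the same two updates would typically live inside an overridden `train_step()`, with automatic differentiation replacing the hand-written gradients.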

Generative Adversarial Networks

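| Algorithms* | Source | Avail | Test | Lipschitz** | Refs | Tutorial |
|---|---|---|---|---|---|---|
| GAN | k2/k3 | | | | 2, 10, 11 | Open In Colab |
| BceGAN | k2/k3 | | | | 4, 10, 11 | Open In Colab |
| LSGAN | k2/k3 | | | | 5, 10, 11 | Open In Colab |
| WGAN | k2/k3 | | | | 6, 11 | Open In Colab |
| WGAN-GP | k2/k3 | | | | 7, 11 | Open In Colab |
| CramerGAN | k2/k3 | | | | 8, 11 | Open In Colab |
| WGAN-ALP | k2/k3 | | | | 9, 11 | Open In Colab |
| BceGAN-GP | k2/k3 | | | | 4, 7, 11 | Open In Colab |
| BceGAN-ALP | k2/k3 | | | | 4, 9, 11 | Open In Colab |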

*each GAN algorithm is designed to operate taking conditions as input [3]

**the GAN training is regularized to ensure that the discriminator encodes a 1-Lipschitz function
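
One common way to encourage the 1-Lipschitz property is the gradient penalty of WGAN-GP [7], which adds the term λ·E[(‖∇D(x̂)‖₂ − 1)²] to the discriminator loss, with x̂ sampled on the segments joining real and generated points. The sketch below estimates that term with central finite differences in NumPy so that it stays self-contained. It is not how PIDGAN computes it, and the function name and default values are assumptions.

```python
import numpy as np


def gradient_penalty(critic, real, fake, lam=10.0, eps=1e-4, seed=0):
    """Estimate lam * E[(||grad critic(x_hat)||_2 - 1)^2] by central differences."""
    rng = np.random.default_rng(seed)
    # Random points on the segments joining real and fake samples
    alpha = rng.uniform(size=(real.shape[0], 1))
    x_hat = alpha * real + (1.0 - alpha) * fake

    grads = np.zeros_like(x_hat)
    for j in range(x_hat.shape[1]):
        step = np.zeros(x_hat.shape[1])
        step[j] = eps
        # Partial derivative of the critic output along feature j
        grads[:, j] = (critic(x_hat + step) - critic(x_hat - step)) / (2.0 * eps)

    norms = np.linalg.norm(grads, axis=1)
    return lam * np.mean((norms - 1.0) ** 2)


rng = np.random.default_rng(1)
real = rng.normal(size=(128, 3))
fake = rng.normal(loc=0.5, size=(128, 3))

unit_w = np.array([1.0, 0.0, 0.0])
gp_unit = gradient_penalty(lambda x: x @ unit_w, real, fake)            # gradient norm 1
gp_steep = gradient_penalty(lambda x: x @ (2.0 * unit_w), real, fake)   # gradient norm 2
print(gp_unit, gp_steep)
```

A linear critic with unit-norm weights is exactly 1-Lipschitz, so its penalty is essentially zero, while doubling the weights yields a penalty close to λ = 10.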

Generators

| Players | Source | Avail | Test | Skip conn | Refs |
|---|---|---|---|---|---|
| Generator | k2/k3 | | | | 2, 3 |
| ResGenerator | k2/k3 | | | | 2, 3, 12 |

Discriminators

| Players | Source | Avail | Test | Skip conn | Aux proc | Refs |
|---|---|---|---|---|---|---|
| Discriminator | k2/k3 | | | | | 2, 3, 11 |
| ResDiscriminator | k2/k3 | | | | | 2, 3, 11, 12 |
| AuxDiscriminator | k2/k3 | | | | | 2, 3, 11, 12, 13 |

Other players

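| Players | Source | Avail | Test | Skip conn | Aux proc | Multiclass |
|---|---|---|---|---|---|---|
| Classifier | src | | | | | |
| ResClassifier | src | | | | | |
| AuxClassifier | src | | | | | |
| MultiClassifier | src | | | | | |
| MultiResClassifier | src | | | | | |
| AuxMultiClassifier | src | | | | | |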

References

  1. M. Barbetti, "The flash-simulation paradigm and its implementation based on Deep Generative Models for the LHCb experiment at CERN", PhD thesis, University of Firenze, 2024
  2. I.J. Goodfellow et al., "Generative Adversarial Networks", arXiv:1406.2661
  3. M. Mirza, S. Osindero, "Conditional Generative Adversarial Nets", arXiv:1411.1784
  4. A. Radford, L. Metz, S. Chintala, "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks", arXiv:1511.06434
  5. X. Mao et al., "Least Squares Generative Adversarial Networks", arXiv:1611.04076
  6. M. Arjovsky, S. Chintala, L. Bottou, "Wasserstein GAN", arXiv:1701.07875
  7. I. Gulrajani et al., "Improved Training of Wasserstein GANs", arXiv:1704.00028
  8. M.G. Bellemare et al., "The Cramer Distance as a Solution to Biased Wasserstein Gradients", arXiv:1705.10743
  9. D. Terjék, "Adversarial Lipschitz Regularization", arXiv:1907.05681
  10. M. Arjovsky, L. Bottou, "Towards Principled Methods for Training Generative Adversarial Networks", arXiv:1701.04862
  11. T. Salimans et al., "Improved Techniques for Training GANs", arXiv:1606.03498
  12. K. He et al., "Deep Residual Learning for Image Recognition", arXiv:1512.03385
  13. A. Rogachev, F. Ratnikov, "GAN with an Auxiliary Regressor for the Fast Simulation of the Electromagnetic Calorimeter Response", arXiv:2207.06329

Credits

Most of the GAN algorithms are an evolution of those provided by the mbarbetti/tf-gen-models repository. The BceGAN model is loosely inspired by the TensorFlow tutorial Deep Convolutional Generative Adversarial Network and the Keras tutorial Conditional GAN. The WGAN-ALP model is an adaptation of the one provided by the dterjek/adversarial_lipschitz_regularization repository.

Citing PIDGAN

To cite this repository:

@software{pidgan:2023abc,
  author    = "Matteo Barbetti and Lucio Anderlini",
  title     = "{PIDGAN: GAN-based models to flash-simulate the LHCb PID detectors}",
  version   = "v0.2.0",
  url       = "https://github.com/mbarbetti/pidgan",
  doi       = "10.5281/zenodo.10463728",
  publisher = "Zenodo",
  year      = "2023",
}

In the above BibTeX entry, the version number is intended to be the one from pidgan/version.py, while the year corresponds to the project's open-source release.

License

PIDGAN is released under the GNU General Public License v3 (GPLv3), as found in the LICENSE file.