Template TensorFlow code for feed-forward neural networks - learning Gaussian distributions.
Model architecture illustration:
- Input values: vector of numbers $x$.
- Output values: $\mu$, $\sigma$ (parameters of a Gaussian distribution conditioned on the input $x$).
- Loss function: standard negative log-likelihood of the target value $y$ under the model output distribution: $-\log p(y; \mu, \sigma)$, where $\mu, \sigma = f(x)$ and $f$ is the neural network model.
For a dataset $X = \{x_1, x_2, ..., x_n\}$ with target values $Y = \{y_1, y_2, ..., y_n\}$, the negative log-likelihood loss is defined as:

$$\mathcal{L}(X, Y) = -\sum_{i=1}^{n} \log p(y_i; \mu_i, \sigma_i), \quad \text{where } \mu_i, \sigma_i = f(x_i).$$
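As a sketch, the per-sample negative log-likelihood of a Gaussian can be written out directly (this is an illustrative NumPy version, not the template's TensorFlow implementation; the function name `gaussian_nll` is our own):

```python
import numpy as np

def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of targets y under N(mu, sigma^2),
    averaged over samples: 0.5*log(2*pi*sigma^2) + (y - mu)^2 / (2*sigma^2)."""
    return np.mean(0.5 * np.log(2 * np.pi * sigma**2)
                   + (y - mu)**2 / (2 * sigma**2))
```

When $y = \mu$ and $\sigma = 1$, the loss reduces to the normalizing constant $\tfrac{1}{2}\log 2\pi \approx 0.919$; the quadratic term penalizes deviations of $\mu$ from $y$, scaled by the predicted uncertainty $\sigma$.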
Model architecture:
- feed-forward neural network (the model can be extended to recurrent architectures),
- base layers learn joint representations of the inputs,
- parameter layers ($\mu$-layer and $\sigma$-layer) learn representations specific to each output parameter,
- an alternative to regression models with a single numerical output,
- $\mu$-layer output activation function: can be any function, linear or any other that restricts the output range,
- $\sigma$-layer output activation function: should be a function with only positive outputs, such as softplus.
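A minimal Keras sketch of this architecture, assuming two hidden base layers and scalar outputs (layer sizes and the helper name `build_model` are illustrative, not the template's actual code):

```python
import tensorflow as tf

def build_model(input_dim, hidden_units=32):
    # base layers: learn a joint representation of the inputs
    x = tf.keras.Input(shape=(input_dim,))
    h = tf.keras.layers.Dense(hidden_units, activation="relu")(x)
    h = tf.keras.layers.Dense(hidden_units, activation="relu")(h)
    # parameter layers: mu is unconstrained (linear activation),
    # sigma uses softplus so its output is always positive
    mu = tf.keras.layers.Dense(1, activation="linear", name="mu")(h)
    sigma = tf.keras.layers.Dense(1, activation="softplus", name="sigma")(h)
    return tf.keras.Model(inputs=x, outputs=[mu, sigma])
```

The two heads share the base layers, so the joint representation is learned once while each head specializes for its own parameter.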
Model output