
Commit

lcontento committed Jun 24, 2024
2 parents 462b863 + 9dfb944 commit 87b0eac
Showing 5 changed files with 26 additions and 30 deletions.
2 changes: 1 addition & 1 deletion 01_machine_learning.jl
@@ -298,7 +298,7 @@ scatterlines!(iteration, Float32.(reg_loss_valid_l); label = "validation (L2)");
axislegend();
fig

-# 4.3. Programatic hyperparameter tuning
+# 4.3. Programmatic hyperparameter tuning

nn_ho = hyperopt(reg_nn, target_train)
nn_ho.best_hyperparameters
28 changes: 14 additions & 14 deletions 02_SciML.jl
@@ -439,11 +439,11 @@ plotgrid(pred)
# Here, we go through some of the problems one is likely to face when using UDEs in real
# projects and how to think when trying to solve them.

-# 3.1. Akward scales
+# 3.1. Awkward scales

-# Most neural networks work best if the input and target outputs have value that are not too
-# far from the relevant bits of our activation functions. A farily standard practice in ML
-# regression is to standardize input and output to have a mean 0 and std=1 or to ensure that
+# Most neural networks work best if the input and target outputs have values that are not too
+# far from the relevant bits of our activation functions. A fairly standard practice in ML
+# regression is to standardize input and output to have a mean=0 and std=1 or to ensure that
# all values are between 0 and 1. With bad input/output scales, it can be hard to fit a
# model.
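# (A minimal sketch of that standardization idea in plain Julia, using only the
# Statistics stdlib; `x` here stands for a hypothetical vector of raw inputs or
# targets, not a variable from these scripts.)
using Statistics
standardize(x) = (x .- mean(x)) ./ std(x)                    # mean ≈ 0, std ≈ 1
scale01(x) = (x .- minimum(x)) ./ (maximum(x) - minimum(x))  # values in [0, 1]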

@@ -580,7 +580,7 @@ lines(-10000:100:10000, softplus.(-10000:100:10000))
# With UDEs/NeuralODEs, we don't always know exactly what input values the NN will recieve,
# but we can often figure out which order of magnitude they'll have. If we can rescale the
# NN inputs and outputs to be close to 1 then we would be in a much better place. In this
-# case, we know that we're dosing with 1e4 and that there's concervation from Depot to
+# case, we know that we're dosing with 1e4 and that there's conservation from Depot to
# Central.
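# (A rough sketch of such a rescaling, with hypothetical names rather than this
# tutorial's model: divide the NN's state inputs by the known dose magnitude and
# scale its output back up, so the network itself only ever sees values of order one.)
dose_scale = 1e4
scaled_nn_rate(depot, central, nn) =
    dose_scale * only(nn([depot / dose_scale, central / dose_scale]))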


@@ -627,12 +627,12 @@ plotgrid(predict(fpm_rescale; obstimes=0:0.1:10))


# So, be mindful of what scales you expect your nerual network to get as inputs and to need
-#to get as outputs. Also, be mindful of how the regularization may be penalizing automatic
-#rescaling of the input/output layer. Here, we looked at large inputs which could have been
-#solved by the weights of the first neural network being small but where the later need to
-#up-scale in the output layer would be penalized by the regularization. For inputs much
-#smaller than 1, we get that the necessary large weights of the input layer may be
-#over-regularized. It often makes sense not to regularize the input or output layer of the
-#neural network. That avoids this particular problem but it does not always make it easy to
-#find the solution since initial gradients may be close to zero and the optimizer won't know
-#what to do.
+# to get as outputs. Also, be mindful of how the regularization may be penalizing automatic
+# rescaling of the input/output layer. Here, we looked at large inputs which could have been
+# solved by the weights of the first neural network being small but where the later need to
+# up-scale in the output layer would be penalized by the regularization. For inputs much
+# smaller than 1, we get that the necessary large weights of the input layer may be
+# over-regularized. It often makes sense not to regularize the input or output layer of the
+# neural network. That avoids this particular problem but it does not always make it easy to
+# find the solution since initial gradients may be close to zero and the optimizer won't know
+# what to do.
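# (One way to picture that last point, in plain Julia rather than the DeepPumas API:
# an L2 penalty that sums over the hidden weight matrices only, leaving the input and
# output layers unregularized. `Ws` is a hypothetical vector of weight matrices
# ordered from input to output.)
l2_hidden(Ws; lambda = 1.0) = lambda * sum(sum(abs2, W) for W in Ws[2:(end - 1)])
# e.g. l2_hidden([W_in, W_h1, W_h2, W_out]) penalizes only W_h1 and W_h2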
6 changes: 3 additions & 3 deletions 03_MeNets.jl
@@ -95,7 +95,7 @@ fit does.


#=
-The quality of the fits here depend a on a few different things. Among these are:
+The quality of the fits here depends on a few different things. Among these are:
- The number of training subjects
- The number of observations per subject
@@ -139,7 +139,7 @@ plotgrid(pred_test; ylabel="Y (Test data)")

#=
-Another important factor is the 'dimensionality' of outcome heteroneniety in
+Another important factor is the 'dimensionality' of outcome heterogeneity in
the data versus in the model.
Here, the synthetic data has inter-patient variability in c1 and c2. These two
@@ -163,7 +163,7 @@ than our data has? The easiest way for us to play with this here is to reduce
the number of random effects we feed to the neural network in our model_me.
The model is then too 'simple' to be able to prefectly fit the data, but in
-what way will it fail, and how much? Train such a model on nice and clean Data
+what way will it fail, and how much? Train such a model on nice and clean data
to be able to see in what way the fit fails
=#
4 changes: 2 additions & 2 deletions 04_DeepNLME.jl
@@ -203,8 +203,8 @@ Explore freely, but if you want some ideas for what you can look at then here's
the NN capture only what's called EFF in the data generating model. You can stop using R as
an input but you'll need to change the MLPDomain definition for that.
-- Change the number of random effects that's passed to the nerual network. What happens if
+- Change the number of random effects that's passed to the neural network. What happens if
the DeepNLME model has fewer random effects than the data generating model? What happens if
it has more?
-=#
+=#
16 changes: 6 additions & 10 deletions 05_prognostic_factors.jl
@@ -101,16 +101,16 @@ plotgrid(pred_datamodel)
############################################################################################
## Neural-embedded NLME modeling
############################################################################################
-# Here, we define a model where the PD is entirely deterimined by a neural network.
+# Here, we define a model where the PD is entirely determined by a neural network.
# At this point, we're not trying to explain how patient data may inform individual
# parameters


model = @model begin
@param begin
-# Define a multi-layer perceptron (a neural network) which maps from 5 inputs (2
-# state variables + 3 individual parameters) to a single output. Apply L2
-# regularization (equivalent to a Normal prior).
+# Define a multi-layer perceptron (a neural network) which maps from 5 inputs
+# (2 state variables + 3 individual parameters) to a single output.
+# Apply L2 regularization (equivalent to a Normal prior).
NN ∈ MLPDomain(5, 6, 5, (1, identity); reg=L2(1.0))
tvKa ∈ RealDomain(; lower=0)
tvCL ∈ RealDomain(; lower=0)
@@ -239,8 +239,8 @@ mean(abs, pred_residuals(pred_datamodel, pred_augment_ho))
# training covariate models well requires more data than fitting the neural networks
# embedded in dynamical systems. With UDEs, every observation is a data point. With
# prognostic factor models, every subject is a data point. We've (hopefully) managed to
-# improve our model using only 50 subjects, but lets try using data from 1000 patients
-# instead.
+# improve our model using only 50 subjects, but let's try using data from 1000 patients
+# instead.
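# (For a sense of scale, with made-up round numbers: 50 subjects with 20 observations
# each give an embedded NN 1000 residuals to learn from, while a covariate model still
# only sees 50 subject-level data points.)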

target_large = preprocess(model, trainpop_large, coef(fpm), FOCE())
fnn_large = hyperopt(nn, target_large)
@@ -321,7 +321,3 @@ plotgrid!(pred_deep; ipred=false, pred=(; color=Cycled(2), label = "Deep fit pre
# Compare the deviation from the best possible pred.
mean(abs, pred_residuals(pred_datamodel, pred_augment))
mean(abs, pred_residuals(pred_datamodel, pred_deep))
-
-
-
-