
Commit

lcontento committed Jun 24, 2024
2 parents 462b863 + 9dfb944 commit 87b0eac
Showing 5 changed files with 26 additions and 30 deletions.
2 changes: 1 addition & 1 deletion 01_machine_learning.jl
@@ -298,7 +298,7 @@ scatterlines!(iteration, Float32.(reg_loss_valid_l); label = "validation (L2)");
axislegend();
fig

-# 4.3. Programatic hyperparameter tuning
+# 4.3. Programmatic hyperparameter tuning

nn_ho = hyperopt(reg_nn, target_train)
nn_ho.best_hyperparameters
28 changes: 14 additions & 14 deletions 02_SciML.jl
@@ -439,11 +439,11 @@ plotgrid(pred)
# Here, we go through some of the problems one is likely to face when using UDEs in real
# projects and how to think when trying to solve them.

-# 3.1. Akward scales
+# 3.1. Awkward scales

-# Most neural networks work best if the input and target outputs have value that are not too
-# far from the relevant bits of our activation functions. A farily standard practice in ML
-# regression is to standardize input and output to have a mean 0 and std=1 or to ensure that
+# Most neural networks work best if the input and target outputs have values that are not too
+# far from the relevant bits of our activation functions. A fairly standard practice in ML
+# regression is to standardize input and output to have a mean=0 and std=1 or to ensure that
# all values are between 0 and 1. With bad input/output scales, it can be hard to fit a
# model.
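# (A minimal sketch of that standardization idea in plain Julia, using only the
# Statistics stdlib; `x` here stands for a hypothetical vector of raw inputs or
# targets, not a variable from these scripts.)
using Statistics
standardize(x) = (x .- mean(x)) ./ std(x)                    # mean ≈ 0, std ≈ 1
scale01(x) = (x .- minimum(x)) ./ (maximum(x) - minimum(x))  # values in [0, 1]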

@@ -580,7 +580,7 @@ lines(-10000:100:10000, softplus.(-10000:100:10000))
# With UDEs/NeuralODEs, we don't always know exactly what input values the NN will recieve,
# but we can often figure out which order of magnitude they'll have. If we can rescale the
# NN inputs and outputs to be close to 1 then we would be in a much better place. In this
-# case, we know that we're dosing with 1e4 and that there's concervation from Depot to
+# case, we know that we're dosing with 1e4 and that there's conservation from Depot to
# Central.
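# (A rough sketch of such a rescaling, with hypothetical names rather than this
# tutorial's model: divide the NN's state inputs by the known dose magnitude and
# scale its output back up, so the network itself only ever sees values of order one.)
dose_scale = 1e4
scaled_nn_rate(depot, central, nn) =
    dose_scale * only(nn([depot / dose_scale, central / dose_scale]))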


@@ -627,12 +627,12 @@ plotgrid(predict(fpm_rescale; obstimes=0:0.1:10))


# So, be mindful of what scales you expect your nerual network to get as inputs and to need
-#to get as outputs. Also, be mindful of how the regularization may be penalizing automatic
-#rescaling of the input/output layer. Here, we looked at large inputs which could have been
-#solved by the weights of the first neural network being small but where the later need to
-#up-scale in the output layer would be penalized by the regularization. For inputs much
-#smaller than 1, we get that the necessary large weights of the input layer may be
-#over-regularized. It often makes sense not to regularize the input or output layer of the
-#neural network. That avoids this particular problem but it does not always make it easy to
-#find the solution since initial gradients may be close to zero and the optimizer won't know
-#what to do.
+# to get as outputs. Also, be mindful of how the regularization may be penalizing automatic
+# rescaling of the input/output layer. Here, we looked at large inputs which could have been
+# solved by the weights of the first neural network being small but where the later need to
+# up-scale in the output layer would be penalized by the regularization. For inputs much
+# smaller than 1, we get that the necessary large weights of the input layer may be
+# over-regularized. It often makes sense not to regularize the input or output layer of the
+# neural network. That avoids this particular problem but it does not always make it easy to
+# find the solution since initial gradients may be close to zero and the optimizer won't know
+# what to do.
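# (One way to picture that last point, in plain Julia rather than the DeepPumas API:
# an L2 penalty that sums over the hidden weight matrices only, leaving the input and
# output layers unregularized. `Ws` is a hypothetical vector of weight matrices
# ordered from input to output.)
l2_hidden(Ws; lambda = 1.0) = lambda * sum(sum(abs2, W) for W in Ws[2:(end - 1)])
# e.g. l2_hidden([W_in, W_h1, W_h2, W_out]) penalizes only W_h1 and W_h2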
6 changes: 3 additions & 3 deletions 03_MeNets.jl
@@ -95,7 +95,7 @@ fit does.


#=
-The quality of the fits here depend a on a few different things. Among these are:
+The quality of the fits here depends on a few different things. Among these are:
- The number of training subjects
- The number of observations per subject
@@ -139,7 +139,7 @@ plotgrid(pred_test; ylabel="Y (Test data)")

#=
-Another important factor is the 'dimensionality' of outcome heteroneniety in
+Another important factor is the 'dimensionality' of outcome heterogeneity in
the data versus in the model.
Here, the synthetic data has inter-patient variability in c1 and c2. These two
@@ -163,7 +163,7 @@ than our data has? The easiest way for us to play with this here is to reduce
the number of random effects we feed to the neural network in our model_me.
The model is then too 'simple' to be able to prefectly fit the data, but in
-what way will it fail, and how much? Train such a model on nice and clean Data
+what way will it fail, and how much? Train such a model on nice and clean data
to be able to see in what way the fit fails
=#
4 changes: 2 additions & 2 deletions 04_DeepNLME.jl
@@ -203,8 +203,8 @@ Explore freely, but if you want some ideas for what you can look at then here's
the NN capture only what's called EFF in the data generating model. You can stop using R as
an input but you'll need to change the MLPDomain definition for that.
-- Change the number of random effects that's passed to the nerual network. What happens if
+- Change the number of random effects that's passed to the neural network. What happens if
the DeepNLME model has fewer random effects than the data generating model? What happens if
it has more?
-=#
+=#
16 changes: 6 additions & 10 deletions 05_prognostic_factors.jl
@@ -101,16 +101,16 @@ plotgrid(pred_datamodel)
############################################################################################
## Neural-embedded NLME modeling
############################################################################################
-# Here, we define a model where the PD is entirely deterimined by a neural network.
+# Here, we define a model where the PD is entirely determined by a neural network.
# At this point, we're not trying to explain how patient data may inform individual
# parameters


model = @model begin
@param begin
-# Define a multi-layer perceptron (a neural network) which maps from 5 inputs (2
-# state variables + 3 individual parameters) to a single output. Apply L2
-# regularization (equivalent to a Normal prior).
+# Define a multi-layer perceptron (a neural network) which maps from 5 inputs
+# (2 state variables + 3 individual parameters) to a single output.
+# Apply L2 regularization (equivalent to a Normal prior).
NN ∈ MLPDomain(5, 6, 5, (1, identity); reg=L2(1.0))
tvKa ∈ RealDomain(; lower=0)
tvCL ∈ RealDomain(; lower=0)
@@ -239,8 +239,8 @@ mean(abs, pred_residuals(pred_datamodel, pred_augment_ho))
# training covariate models well requires more data than fitting the neural networks
# embedded in dynamical systems. With UDEs, every observation is a data point. With
# prognostic factor models, every subject is a data point. We've (hopefully) managed to
-# improve our model using only 50 subjects, but lets try using data from 1000 patients
-# instead.
+# improve our model using only 50 subjects, but let's try using data from 1000 patients
+# instead.
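# (For a sense of scale, with made-up round numbers: 50 subjects with 20 observations
# each give an embedded NN 1000 residuals to learn from, while a covariate model still
# only sees 50 subject-level data points.)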

target_large = preprocess(model, trainpop_large, coef(fpm), FOCE())
fnn_large = hyperopt(nn, target_large)
@@ -321,7 +321,3 @@ plotgrid!(pred_deep; ipred=false, pred=(; color=Cycled(2), label = "Deep fit pre
# Compare the deviation from the best possible pred.
mean(abs, pred_residuals(pred_datamodel, pred_augment))
mean(abs, pred_residuals(pred_datamodel, pred_deep))
-
-
-
-