Training model with large parameters #1707
ruchitkini started this conversation in General
Replies: 2 comments
-
My question is similar to yours, but in the context of an epidemiology model, where there are quite a lot of parameters and equations.
-
Hi @ruchitkini
-
Hello, @lululxvi and DeepXDE community,
I am fairly new to deep learning and PINNs, so I would like to have an honest conversation. I have some general questions about training models with large parameters, for example a Young's modulus of $E = 10^{8}$ Pa. With small parameters the model trains accurately, but with large parameters it naturally takes more iterations and the NN becomes difficult to train. I would like to know what strategies exist for this.
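One strategy I keep seeing recommended is to nondimensionalize the equations so that everything the network sees is $O(1)$. A minimal sketch for linear elasticity (my own rescaling, with characteristic length $L$ and displacement $U$, not taken from a specific paper):

$$
\hat{x} = \frac{x}{L}, \qquad
\hat{u} = \frac{u}{U}, \qquad
\hat{\varepsilon} = \frac{L}{U}\,\varepsilon, \qquad
\hat{\sigma} = \frac{L}{E\,U}\,\sigma,
$$

so the constitutive law $\sigma = \mathsf{C} : \varepsilon$ with $\|\mathsf{C}\| \sim E$ becomes $\hat{\sigma} = (\mathsf{C}/E) : \hat{\varepsilon}$, and $E = 10^{8}$ never enters the residuals directly.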
For example, I ran some tests on simple 2D geometries with the linear elasticity equations, first with small parameters and then with large ones. For the large-parameter tests, my outputs are of scale $O(10^{-3})$ for displacements and $O(10^{4})$ for stresses. I used the Adam and L-BFGS optimizers, a suitable output transform, and loss weights where necessary.
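For reference, the output transform I mean is along these lines (a sketch assuming the PyTorch backend; the layer sizes and the output ordering `[u, v, s_xx, s_yy, s_xy]` are placeholders for my setup):

```python
import torch
import deepxde as dde

# 2 inputs (x, y) -> 5 outputs [u, v, s_xx, s_yy, s_xy]; sizes are placeholders
net = dde.nn.FNN([2] + [64] * 4 + [5], "tanh", "Glorot uniform")

def output_transform(x, y):
    # Scale the raw O(1) network outputs to the physical magnitudes:
    # displacements ~ O(1e-3), stresses ~ O(1e4)
    return torch.cat([y[:, 0:2] * 1e-3, y[:, 2:5] * 1e4], dim=1)

net.apply_output_transform(output_transform)
```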
Test 1: I did not provide any loss weights. The initial loss was of order $10^{8}$. I trained the model with Adam for 50000 iterations and then with L-BFGS for another 50000 iterations. The loss came down to order $10^{1}$ and the results were fairly good; training for more iterations would presumably give more accurate results.
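Concretely, the Test 1 schedule was along these lines (a sketch; `data` stands for my already-assembled `dde.data.PDE` problem, and the learning rate is a placeholder):

```python
model = dde.Model(data, net)

# Stage 1: Adam
model.compile("adam", lr=1e-3)
model.train(iterations=50000)

# Stage 2: L-BFGS, allowing up to 50000 more iterations
dde.optimizers.set_LBFGS_options(maxiter=50000)
model.compile("L-BFGS")
model.train()
```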
Test 2: Since the initial losses were high, I provided loss weights of order $10^{-6}$ for the PDEs and $10^{-5}$ for the BCs. The initial loss was then of order $10^{3}$. After training for 50000 iterations with Adam the loss came down to order $10^{-1}$, and the results were acceptable; I would have gotten better results with more iterations. As in Test 1, I then continued with L-BFGS, but in this case the losses initially increased and then slowly started decreasing, although they never came back down to the levels Adam had reached. Q: Do the weights assigned to the losses for Adam also affect the L-BFGS optimizer? It looks as though L-BFGS re-explores the entire training space.
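In code, Test 2 looked roughly like this. Note that, as far as I can tell from the API, `compile()` resets the loss weights each time it is called, so they have to be passed again when switching optimizers (`num_pde` and `num_bc` are placeholder counts for my setup):

```python
w = [1e-6] * num_pde + [1e-5] * num_bc  # PDE weights first, then BC weights

model.compile("adam", lr=1e-3, loss_weights=w)
model.train(iterations=50000)

# If loss_weights were omitted here, L-BFGS would minimize the *unweighted*
# loss, which by itself would make the reported loss jump after the switch.
model.compile("L-BFGS", loss_weights=w)
model.train()
```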
Test 3: In this test I first trained on the boundary conditions alone, following one of @lululxvi's papers. For 25000 iterations I trained the model only on the BCs, with loss weights of 1 for all of them. I then trained for another 50000 iterations with PDE loss weights of 0.1 and the BC loss weights reduced to 0.01. The PDE losses increased, which is expected, but so did the BC losses (Q: why do the BC losses also increase?). Since I saved the model every 1000 iterations, I checked the intermediate results even though the total loss was of order $10^{3}$; surprisingly, the results were fairly good. Q: In this test and the one above, the results were acceptable even with a loss of order $10^{2}$ when training with large parameters. Is this true in general, for example for 3D geometries, or only in particular cases? What should the criterion be for judging the accuracy of training from the loss? It does make sense for the losses to be high initially, since we are taking an MSE loss.
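The Test 3 schedule, as a sketch (again with placeholder counts; the checkpoint callback is how I saved the model every 1000 iterations):

```python
# Stage 1: BCs only -- zero out the PDE terms
model.compile("adam", lr=1e-3, loss_weights=[0] * num_pde + [1] * num_bc)
model.train(iterations=25000)

# Stage 2: switch the PDE terms on and down-weight the BCs
model.compile("adam", lr=1e-3, loss_weights=[0.1] * num_pde + [0.01] * num_bc)

# Save a checkpoint every 1000 iterations to inspect intermediate results
ckpt = dde.callbacks.ModelCheckpoint("model/ckpt", period=1000)
model.train(iterations=50000, callbacks=[ckpt])
```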
I can run such tests for simple 2D geometries because they do not take much time, but this is not feasible for 3D geometries, where training takes days. Are there any better training strategies out there? It would be great to learn about them.