This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

training loss hovering around few values #1334

Open
janmejaya-nanda opened this issue Mar 10, 2022 · 0 comments
janmejaya-nanda commented Mar 10, 2022

❓ Questions and Help

Hi all,
I am training this model on a subsample (about 1.6k images) of the Fashionpedia dataset. Here are the steps I followed to incorporate the new dataset:

  1. Created a new Dataset and DataLoader class.
  2. Downloaded the ImageNet pre-trained model (for R-50-C4) from the model zoo, trimmed the last layers, and used this utils function to load the model.

During training, however, the loss hovers around a few values instead of decreasing steadily, as shown below. I have two questions:

  1. Are roughly 1.5k images and training for up to 16 epochs enough to see some progress in the loss?
  2. Are such extremely low loss values expected? They look abnormal to me.

Any help with training would be highly appreciated.
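
For reference, the trimming in step 2 can be sketched roughly as follows. This is only an illustration of the idea, not the exact script or utils function used: the helper `trim_state_dict` and the key names (`backbone.conv1.weight`, `fc1000.*`) are hypothetical, and plain lists stand in for the tensors a real checkpoint would hold.

```python
def trim_state_dict(state_dict, drop_prefixes):
    """Return a copy of state_dict without keys starting with any given prefix."""
    prefixes = tuple(drop_prefixes)
    return {k: v for k, v in state_dict.items() if not k.startswith(prefixes)}


# Hypothetical checkpoint: a real one maps parameter names to tensors.
checkpoint = {
    "backbone.conv1.weight": [0.1],
    "fc1000.weight": [0.2],  # assumed name of the final classification layer
    "fc1000.bias": [0.3],
}

# Drop the final layer so only the backbone weights are loaded.
trimmed = trim_state_dict(checkpoint, ["fc1000"])
print(sorted(trimmed))
```

The original checkpoint dictionary is left untouched, so the same file can be trimmed differently for another head configuration.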

Loss:
2022-03-09 06:21:51,094 - trainer - INFO - epoch : 1
2022-03-09 06:21:51,095 - trainer - INFO - loss_classifier: 0.11012238264083862
2022-03-09 06:21:51,095 - trainer - INFO - loss_box_reg : 0.10907483100891113
2022-03-09 06:21:51,095 - trainer - INFO - attribute_loss : 0.24423398077487946
2022-03-09 06:21:51,095 - trainer - INFO - loss_objectness: 0.032822877168655396
2022-03-09 06:21:51,095 - trainer - INFO - loss_rpn_box_reg: 0.015091314911842346
2022-03-09 06:21:51,095 - trainer - WARNING - Warning: Metric 'val_loss' is not found. Model performance monitoring is disabled.
2022-03-09 06:33:35,350 - trainer - INFO - epoch : 2
2022-03-09 06:33:35,350 - trainer - INFO - loss_classifier: 0.025048378854990005
2022-03-09 06:33:35,350 - trainer - INFO - loss_box_reg : 0.033926017582416534
2022-03-09 06:33:35,350 - trainer - INFO - attribute_loss : 0.1824456751346588
2022-03-09 06:33:35,351 - trainer - INFO - loss_objectness: 0.008666656911373138
2022-03-09 06:33:35,351 - trainer - INFO - loss_rpn_box_reg: 0.0010567393619567156
2022-03-09 06:45:20,282 - trainer - INFO - epoch : 3
2022-03-09 06:45:20,282 - trainer - INFO - loss_classifier: 0.23171865940093994
2022-03-09 06:45:20,283 - trainer - INFO - loss_box_reg : 0.13188767433166504
2022-03-09 06:45:20,283 - trainer - INFO - attribute_loss : 0.2769358158111572
2022-03-09 06:45:20,283 - trainer - INFO - loss_objectness: 0.02191787026822567
2022-03-09 06:45:20,283 - trainer - INFO - loss_rpn_box_reg: 0.00860893540084362
2022-03-09 06:57:04,570 - trainer - INFO - epoch : 4
2022-03-09 06:57:04,570 - trainer - INFO - loss_classifier: 0.10086892545223236
2022-03-09 06:57:04,570 - trainer - INFO - loss_box_reg : 0.13848796486854553
2022-03-09 06:57:04,570 - trainer - INFO - attribute_loss : 0.2500332295894623
2022-03-09 06:57:04,570 - trainer - INFO - loss_objectness: 0.027166806161403656
2022-03-09 06:57:04,571 - trainer - INFO - loss_rpn_box_reg: 0.015889877453446388
2022-03-09 06:57:05,121 - trainer - INFO - Saving checkpoint: saved/models/AttributeHeadGeneralizedRCNN_500x500_BS2/0309_061005/checkpoint-epoch4.pth ...
2022-03-09 07:08:48,914 - trainer - INFO - epoch : 5
2022-03-09 07:08:48,914 - trainer - INFO - loss_classifier: 0.022068515419960022
2022-03-09 07:08:48,914 - trainer - INFO - loss_box_reg : 0.016676167026162148
2022-03-09 07:08:48,914 - trainer - INFO - attribute_loss : 0.2411506325006485
2022-03-09 07:08:48,914 - trainer - INFO - loss_objectness: 0.00628832820802927
2022-03-09 07:08:48,914 - trainer - INFO - loss_rpn_box_reg: 0.0053995708003640175
2022-03-09 07:20:33,071 - trainer - INFO - epoch : 6
2022-03-09 07:20:33,071 - trainer - INFO - loss_classifier: 0.057442087680101395
2022-03-09 07:20:33,071 - trainer - INFO - loss_box_reg : 0.0581355094909668
2022-03-09 07:20:33,071 - trainer - INFO - attribute_loss : 0.19747763872146606
2022-03-09 07:20:33,072 - trainer - INFO - loss_objectness: 0.0021394509822130203
2022-03-09 07:20:33,072 - trainer - INFO - loss_rpn_box_reg: 0.002343851840123534
2022-03-09 07:32:17,115 - trainer - INFO - epoch : 7
2022-03-09 07:32:17,116 - trainer - INFO - loss_classifier: 0.021263597533106804
2022-03-09 07:32:17,116 - trainer - INFO - loss_box_reg : 0.022195173427462578
2022-03-09 07:32:17,116 - trainer - INFO - attribute_loss : 0.11648037284612656
2022-03-09 07:32:17,116 - trainer - INFO - loss_objectness: 0.0010667262831702828
2022-03-09 07:32:17,116 - trainer - INFO - loss_rpn_box_reg: 0.007561735343188047
2022-03-09 07:44:00,811 - trainer - INFO - epoch : 8
2022-03-09 07:44:00,812 - trainer - INFO - loss_classifier: 0.01990237645804882
2022-03-09 07:44:00,812 - trainer - INFO - loss_box_reg : 0.014553537592291832
2022-03-09 07:44:00,812 - trainer - INFO - attribute_loss : 0.1027921661734581
2022-03-09 07:44:00,812 - trainer - INFO - loss_objectness: 0.0009330251486971974
2022-03-09 07:44:00,812 - trainer - INFO - loss_rpn_box_reg: 0.00016308532212860882
2022-03-09 07:44:01,352 - trainer - INFO - Saving checkpoint: saved/models/AttributeHeadGeneralizedRCNN_500x500_BS2/0309_061005/checkpoint-epoch8.pth ...
2022-03-09 07:55:45,827 - trainer - INFO - epoch : 9
2022-03-09 07:55:45,828 - trainer - INFO - loss_classifier: 0.04558339715003967
2022-03-09 07:55:45,828 - trainer - INFO - loss_box_reg : 0.03209593892097473
2022-03-09 07:55:45,828 - trainer - INFO - attribute_loss : 0.2617703974246979
2022-03-09 07:55:45,828 - trainer - INFO - loss_objectness: 0.006617174483835697
2022-03-09 07:55:45,828 - trainer - INFO - loss_rpn_box_reg: 0.0082792853936553
2022-03-09 08:07:30,247 - trainer - INFO - epoch : 10
2022-03-09 08:07:30,247 - trainer - INFO - loss_classifier: 0.07696175575256348
2022-03-09 08:07:30,248 - trainer - INFO - loss_box_reg : 0.051953330636024475
2022-03-09 08:07:30,248 - trainer - INFO - attribute_loss : 0.13248559832572937
2022-03-09 08:07:30,248 - trainer - INFO - loss_objectness: 0.005302210338413715
2022-03-09 08:07:30,248 - trainer - INFO - loss_rpn_box_reg: 0.0033989460207521915
2022-03-09 08:19:15,366 - trainer - INFO - epoch : 11
2022-03-09 08:19:15,366 - trainer - INFO - loss_classifier: 0.04610725864768028
2022-03-09 08:19:15,366 - trainer - INFO - loss_box_reg : 0.0247064009308815
2022-03-09 08:19:15,367 - trainer - INFO - attribute_loss : 0.08765356987714767
2022-03-09 08:19:15,367 - trainer - INFO - loss_objectness: 0.0051475027576088905
2022-03-09 08:19:15,367 - trainer - INFO - loss_rpn_box_reg: 0.004154238849878311
2022-03-09 08:31:05,490 - trainer - INFO - epoch : 12
2022-03-09 08:31:05,490 - trainer - INFO - loss_classifier: 0.03030693158507347
2022-03-09 08:31:05,490 - trainer - INFO - loss_box_reg : 0.007616356015205383
2022-03-09 08:31:05,490 - trainer - INFO - attribute_loss : 0.2480003535747528
2022-03-09 08:31:05,490 - trainer - INFO - loss_objectness: 0.014252716675400734
2022-03-09 08:31:05,491 - trainer - INFO - loss_rpn_box_reg: 0.00047123077092692256
2022-03-09 08:31:06,034 - trainer - INFO - Saving checkpoint: saved/models/AttributeHeadGeneralizedRCNN_500x500_BS2/0309_061005/checkpoint-epoch12.pth ...
2022-03-09 08:42:53,728 - trainer - INFO - epoch : 13
2022-03-09 08:42:53,729 - trainer - INFO - loss_classifier: 0.029441138729453087
2022-03-09 08:42:53,729 - trainer - INFO - loss_box_reg : 0.003846927313134074
2022-03-09 08:42:53,729 - trainer - INFO - attribute_loss : 0.17121224105358124
2022-03-09 08:42:53,729 - trainer - INFO - loss_objectness: 0.1492273509502411
2022-03-09 08:42:53,729 - trainer - INFO - loss_rpn_box_reg: 0.02294902689754963
2022-03-09 08:54:38,583 - trainer - INFO - epoch : 14
2022-03-09 08:54:38,583 - trainer - INFO - loss_classifier: 0.03436742722988129
2022-03-09 08:54:38,584 - trainer - INFO - loss_box_reg : 0.02238697186112404
2022-03-09 08:54:38,584 - trainer - INFO - attribute_loss : 0.15846939384937286
2022-03-09 08:54:38,584 - trainer - INFO - loss_objectness: 0.0023808141704648733
2022-03-09 08:54:38,584 - trainer - INFO - loss_rpn_box_reg: 0.0007849537651054561
2022-03-09 09:06:28,238 - trainer - INFO - epoch : 15
2022-03-09 09:06:28,238 - trainer - INFO - loss_classifier: 0.05352114886045456
2022-03-09 09:06:28,238 - trainer - INFO - loss_box_reg : 0.06448137760162354
2022-03-09 09:06:28,238 - trainer - INFO - attribute_loss : 0.08668316900730133
2022-03-09 09:06:28,238 - trainer - INFO - loss_objectness: 0.018392745405435562
2022-03-09 09:06:28,238 - trainer - INFO - loss_rpn_box_reg: 0.018192943185567856
2022-03-09 09:18:16,502 - trainer - INFO - epoch : 16
2022-03-09 09:18:16,502 - trainer - INFO - loss_classifier: 0.008634211495518684
2022-03-09 09:18:16,502 - trainer - INFO - loss_box_reg : 0.007569539360702038
2022-03-09 09:18:16,502 - trainer - INFO - attribute_loss : 0.22656865417957306
2022-03-09 09:18:16,503 - trainer - INFO - loss_objectness: 0.0011281637707725167
2022-03-09 09:18:16,503 - trainer - INFO - loss_rpn_box_reg: 0.0012207112740725279
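
Since the five components move independently, the trend is easier to read from their per-epoch sum. The following is a minimal sketch that parses log lines in the format above and adds up every `*loss*` entry per epoch; the function name `total_loss_per_epoch` and the regular expression are assumptions made for illustration, and only the first two epochs are included as sample input.

```python
import re
from collections import defaultdict

# Matches "... INFO - <name> : <number>" at the end of a log line.
LINE_RE = re.compile(r"INFO - (\w+)\s*:\s*([0-9.eE+-]+)\s*$")


def total_loss_per_epoch(log_text):
    """Sum all loss components logged under each 'epoch : N' line."""
    totals = defaultdict(float)
    epoch = None
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue  # skips warnings and checkpoint messages
        name, value = match.group(1), match.group(2)
        if name == "epoch":
            epoch = int(value)
        elif epoch is not None and "loss" in name:
            totals[epoch] += float(value)
    return dict(totals)


SAMPLE = """\
2022-03-09 06:21:51,094 - trainer - INFO - epoch : 1
2022-03-09 06:21:51,095 - trainer - INFO - loss_classifier: 0.11012238264083862
2022-03-09 06:21:51,095 - trainer - INFO - loss_box_reg : 0.10907483100891113
2022-03-09 06:21:51,095 - trainer - INFO - attribute_loss : 0.24423398077487946
2022-03-09 06:21:51,095 - trainer - INFO - loss_objectness: 0.032822877168655396
2022-03-09 06:21:51,095 - trainer - INFO - loss_rpn_box_reg: 0.015091314911842346
2022-03-09 06:21:51,095 - trainer - WARNING - Warning: Metric 'val_loss' is not found. Model performance monitoring is disabled.
2022-03-09 06:33:35,350 - trainer - INFO - epoch : 2
2022-03-09 06:33:35,350 - trainer - INFO - loss_classifier: 0.025048378854990005
2022-03-09 06:33:35,350 - trainer - INFO - loss_box_reg : 0.033926017582416534
2022-03-09 06:33:35,350 - trainer - INFO - attribute_loss : 0.1824456751346588
2022-03-09 06:33:35,351 - trainer - INFO - loss_objectness: 0.008666656911373138
2022-03-09 06:33:35,351 - trainer - INFO - loss_rpn_box_reg: 0.0010567393619567156
"""

totals = total_loss_per_epoch(SAMPLE)
for epoch, total in sorted(totals.items()):
    print(f"epoch {epoch}: total loss {total:.4f}")
```

Running the same function over the full log gives one number per epoch, which can then be plotted or compared directly.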
