-
-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OOM Triggered during apt-offline install #62
Comments
In my investigation so far, this will be an intermittent issue, seen only on some systems and seen based on how many bug report were downloaded and how many changelogs generated. And then, your memory would matter.
|
Happy that my root cause is correct.
|
And for the record, the leak is in apt-offline and NOT in python. |
rickysarraf
added a commit
that referenced
this issue
Mar 10, 2017
rickysarraf
added a commit
that referenced
this issue
Apr 9, 2017
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I suspected it to be dupe of #38
But that is NOT the case.
rickysarraf commented an hour ago
So, I've locally hit an OOM under UML.
852731 util-linux : mention upstream bug reporting URL
855203 util-linux : hwclock-set: Synchronize from hwclock despite systemd presence
856833 util-linux : /sbin/wipefs: wipefs does not delete ZFS label/signature
856968 util-linux : /usr/bin/whereis: does not find multiarch libraries
97349 util-linux : hwclock doesn't adjust
(Y) Yes. Proceed with installation
(N) No, Abort.
(R) Redisplay the list of bugs.
(Bug Number) Display the bug report from the Offline Bug Reports.
(?) Display this help message.
What would you like to do next: (y, N, ?)y
base-files_9.8_amd64.deb file synced.
bash_4.4-4+b1_amd64.deb file synced.
gcc-6-base_6.3.0-8_amd64.deb file synced.
coreutils_8.26-3_amd64.deb file synced.
bsdutils_1%3a2.29.1-1_amd64.deb file synced.
libgcc1_1%3a6.3.0-8_amd64.deb file synced.
libstdc++6_6.3.0-8_amd64.deb file synced.
perl-base_5.24.1-1_amd64.deb file synced.
libc6_2.24-9_amd64.deb file synced.
init-system-helpers_1.47_all.deb file synced.
cron_3.0pl1-128+b1_amd64.deb file synced.
dpkg_1.18.23_amd64.deb file synced.
dash_0.5.8-2.4_amd64.deb file synced.
Killed
root@uml:~#
and
[ 15.500000] SCSI subsystem initialized
[ 15.590000] Loading iSCSI transport class v2.0-870.
[ 6005.440000] do_syscall_stub : ret = -12, offset = 1052688, data = 000000009e962010
[ 6005.440000] do_syscall_stub: syscall 9 failed, return value = 0xfffffffffffffff4, expected return value = 0x155bd3000
[ 6005.440000] syscall parameters: 0x155bd3000 0x1000 0x5 0x11 0x3 0x395a3000
[ 6005.440000] UML ran out of memory on the host side! This can happen due to a memory limitation or vm.max_map_count has been reached.
[ 6005.440000] Failed to flush page for address 0x155bd3000
root@uml:
# apt-offline install /var/tmp/lxc/uml.zip ^C# free -mroot@uml:
total used free shared buff/cache available
Mem: 993 25 825 7 142 941
Swap: 0 0 0
root@uml:~#
1 GiB is enough and I don't like the fact that apt-offline is asking for more. Can some optimization be done ?
@rickysarraf
Owner
rickysarraf commented 18 minutes ago
Oh! Boy. This one is going to be a hard one to optimize.
Given that the actual zip archive is hardly 80 MiB, and given below logs, there seems to be a leak in Python. Luckily on a machine with large amounts of RAM, it eventually recovered.
top - 16:43:38 up 21:42, 1 user, load average: 2.73, 1.72, 1.22
Tasks: 355 total, 3 running, 336 sleeping, 0 stopped, 3 zombie
%Cpu0 : 29.3 us, 1.0 sy, 0.0 ni, 67.7 id, 1.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu1 : 72.5 us, 2.9 sy, 0.0 ni, 23.5 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 20.0 us, 2.0 sy, 0.0 ni, 77.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 16.0 us, 2.0 sy, 0.0 ni, 81.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 7907.461 total, 159.203 free, 3889.949 used, 3858.309 buff/cache
MiB Swap: 20796.85+total, 19639.40+free, 1157.453 used. 2933.637 avail Mem
PID to signal/kill [default pid = 17392]
PID PPID nTH USER PR NI S %CPU %MEM TIME+ COMMAND UID
17392 17391 1 root 20 R 98.0 81.0 2:19.81 python 0
27124 8417 7 rrs 20 S 18.0 0.8 2:59.19 nautilus 1000
7068 8439 8 rrs 20 S 12.0 3.8 13:35.96 gnome-shell 1000
8433 8431 2 rrs 20 R 5.0 0.9 15:04.01 Xorg 1000
1019 1 7 root 10 -10 S 2.0 0.0 0:49.22 mergerfs 0
11547 11173 15 rrs 20 S 2.0 1.4 3:03.77 chromium 1000
11624 11173 13 rrs 20 S 2.0 1.0 2:55.04 chromium 1000
8729 8417 4 rrs 20 S 1.0 0.8 1:12.93 terminix 1000
13962 2 1 root 20 S 1.0 0:03.18 kworker/u16:4 0
14221 8417 4 rrs 20 S 1.0 1.0 0:33.23 onboard 1000
15886 15882 41 rrs 20 S 1.0 7.3 3:24.14 java 1000
top - 16:44:11 up 21:43, 1 user, load average: 2.89, 1.86, 1.28
Tasks: 355 total, 4 running, 335 sleeping, 0 stopped, 3 zombie
%Cpu0 : 27.0 us, 1.0 sy, 0.0 ni, 71.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 62.0 us, 2.0 sy, 0.0 ni, 36.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 13.0 us, 4.0 sy, 0.0 ni, 82.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 38.0 us, 2.0 sy, 0.0 ni, 59.0 id, 1.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 7907.461 total, 145.008 free, 3901.461 used, 3860.992 buff/cache
MiB Swap: 20796.85+total, 19639.40+free, 1157.453 used. 2919.355 avail Mem
PID to signal/kill [default pid = 17392]
PID PPID nTH USER PR NI S %CPU %MEM TIME+ COMMAND UID
17392 17391 1 root 20 R 97.1 98.5 2:51.42 python 0
27124 8417 8 rrs 20 R 16.7 0.8 3:04.91 nautilus 1000
7068 8439 8 rrs 20 S 13.7 3.9 13:42.03 gnome-shell 1000
8433 8431 2 rrs 20 S 5.9 0.9 15:06.72 Xorg 1000
1019 1 7 root 10 -10 S 2.0 0.0 0:50.01 mergerfs 0
14221 8417 4 rrs 20 R 2.0 1.0 0:33.76 onboard 1000
8729 8417 4 rrs 20 S 1.0 0.8 1:13.52 terminix 1000
11267 11173 17 rrs 20 S 1.0 1.2 5:00.77 chromium 1000
11624 11173 13 rrs 20 S 1.0 1.0 2:55.25 chromium 1000
15886 15882 41 rrs 20 S 1.0 7.3 3:24.38 java 1000
17480 8807 1 rrs 20 R 1.0 0.0 0:01.68 top 1000
17549 17548 1 root 20 S 1.0 0.0 0:01.27 cachestat-perf 0
This is on my host machine with 8 GiB or RAM and 20 GiB of SWAP.
Here's cachestat
Counting cache functions... Output every 1 seconds.
HITS MISSES DIRTIES RATIO BUFFERS_MB CACHE_MB
3311 1283 0 72.1% 240 3342
2783 582 0 82.7% 240 3341
2299 251 0 90.2% 240 3346
7858 2767 153 74.0% 240 3345
2441 249 0 90.7% 240 3339
4122 326 67 92.7% 240 3333
2431 407 0 85.7% 240 3333
3998 266 0 93.8% 240 3338
2719 392 152 87.4% 240 3338
2312 264 0 89.8% 240 3342
4077 400 0 91.1% 240 3333
2415 268 0 90.0% 240 3333
2448 410 9 85.7% 240 3333
3762 1777 95 67.9% 240 3333
7117 977 29 87.9% 240 3339
6010 2419 0 71.3% 240 3346
3950 340 84 92.1% 240 3338
2422 339 68 87.7% 240 3338
4003 264 0 93.8% 240 3342
4341 825 0 84.0% 240 3337
3546 1557 0 69.5% 240 3343
4299 518 0 89.2% 240 3338
6087 2580 40 70.2% 240 3347
2434 327 0 88.2% 240 3341
3424 1350 4 71.7% 240 3341
16868 2892 242 85.4% 240 3335
5612 190 5 96.7% 240 3335
2855 614 22 82.3% 240 3317
3477 1748 0 66.5% 240 3319
5501 2812 53 66.2% 240 3317
5697 4691 0 54.8% 240 3317
5622 3811 241 59.6% 240 3317
2444 358 0 87.2% 240 3313
3993 245 0 94.2% 240 3318
3877 2029 0 65.6% 240 3318
5104 3135 18 61.9% 240 3313
2595 553 0 82.4% 240 3313
2282 266 0 89.6% 240 3313
5829 4012 0 59.2% 240 3313
2517 413 3 85.9% 240 3313
2510 493 0 83.6% 240 3312
2741 550 0 83.3% 240 3315
2418 325 0 88.2% 240 3313
2722 612 17 81.6% 240 3314
2746 572 0 82.8% 240 3309
2453 460 0 84.2% 240 3309
2415 368 0 86.8% 240 3310
2457 401 0 86.0% 240 3309
2879 862 2904 77.0% 240 3309
5207 2176 0 70.5% 240 3313
3014 993 50 75.2% 240 3314
2604 1133 39 69.7% 240 3328
2576 330 0 88.6% 240 3334
4526 879 74 83.7% 240 3326
4168 522 0 88.9% 240 3314
2686 324 0 89.2% 240 3313
8955 233 5471 97.5% 240 3332
2567 307 0 89.3% 240 3314
2660 690 0 79.4% 240 3314
3805 1217 61 75.8% 240 3314
3581 1561 0 69.6% 240 3318
2544 539 0 82.5% 240 3318
2508 518 9 82.9% 240 3317
3351 1266 0 72.6% 240 3317
2376 346 0 87.3% 240 3318
2756 693 18 79.9% 240 3318
3957 2032 0 66.1% 240 3318
2681 695 0 79.4% 240 3318
2501 497 0 83.4% 240 3318
2823 682 0 80.5% 240 3318
2600 465 0 84.8% 240 3318
2557 537 0 82.6% 240 3318
4577 2537 48 64.3% 240 3318
2378 332 4 87.7% 240 3318
2654 585 0 81.9% 240 3318
2907 807 0 78.3% 240 3314
2392 349 0 87.3% 240 3314
2688 602 0 81.7% 240 3315
2459 312 0 88.7% 240 3315
2238 230 0 90.7% 240 3314
4022 329 2875 92.4% 240 3326
5057 1554 0 76.5% 240 3333
8751 5528 383 61.3% 240 3336
19801 18746 1348 51.4% 240 3337
13433 10772 1034 55.5% 240 3308
8755 6492 586 57.4% 240 3308
19866 19994 1651 49.8% 240 3307
7441 5324 299 58.3% 240 3310
22220 21758 1685 50.5% 240 3310
17127 14967 990 53.4% 240 3310
44986 9213 669 83.0% 240 3316
21874 19886 1939 52.4% 240 3314
15436 15099 1581 50.6% 240 3313
2196265 721 54 100.0% 240 3313
1883 0 0 100.0% 240 3313
1856 0 0 100.0% 240 3312
2357 0 0 100.0% 240 3305
1844 0 0 100.0% 240 3308
1776 0 0 100.0% 240 3308
1968 0 0 100.0% 240 3304
1825 0 0 100.0% 240 3304
1835 0 0 100.0% 240 3304
1847 0 0 100.0% 240 3304
1963 0 0 100.0% 240 3308
1803 0 0 100.0% 240 3308
1848 0 0 100.0% 240 3301
1825 0 0 100.0% 240 3307
5109 21 0 99.6% 240 3309
3460 11 0 99.7% 240 3304
5071 21 0 99.6% 240 3305
^C 1795 0 0 100.0% 240 3305
Ending tracing...
2017-03-07 / 16:44:48 ♒♒♒ ☺
So there's not much to be done in apt-offline, if python may be leaking memory.
@rickysarraf
Owner
rickysarraf commented 6 minutes ago
My suspicion that tempfiles were eating precious memory is wrong. Because tempfiles are being unavailable for deletion because they are dynamically created/removed as and when accesses from the ZipFile handle.
Exception OSError: (2, 'No such file or directory', '/tmp/tmpyyUjYz') in <bound method _TemporaryFileWrapper.del of <closed file '', mode 'w+b' at 0x42e43ae0>> ignored
/tmp/tmptezTbl
Exception OSError: (2, 'No such file or directory', '/tmp/tmptezTbl') in <bound method _TemporaryFileWrapper.del of <closed file '', mode 'w+b' at 0x42e43c00>> ignored
/tmp/tmpDPboX0
Exception OSError: (2, 'No such file or directory', '/tmp/tmpDPboX0') in <bound method _TemporaryFileWrapper.del of <closed file '', mode 'w+b' at 0x42e43ae0>> ignored
/tmp/tmppXlpwx
Exception OSError: (2, 'No such file or directory', '/tmp/tmppXlpwx') in <bound method _TemporaryFileWrapper.del of <closed file '', mode 'w+b' at 0x42e43c00>> ignored
/tmp/tmp8YGeLB
Exception OSError: (2, 'No such file or directory', '/tmp/tmp8YGeLB') in <bound method _TemporaryFileWrapper.del of <closed file '', mode 'w+b' at 0x42e43ae0>> ignored
/tmp/tmpoCjU5f
Exception OSError: (2, 'No such file or directory', '/tmp/tmpoCjU5f') in <bound method _TemporaryFileWrapper.del of <closed file '', mode 'w+b' at 0x42e43c00>> ignored
/tmp/tmpXo82ZN
Exception OSError: (2, 'No such file or directory', '/tmp/tmpXo82ZN') in <bound method _TemporaryFileWrapper.del of <closed file '', mode 'w+b' at 0x42e43ae0>> ignored
/tmp/tmpvK9TvU
Exception OSError: (2, 'No such file or directory', '/tmp/tmpvK9TvU') in <bound method _TemporaryFileWrapper.del of <closed file '', mode 'w+b' at 0x42e43c00>> ignored
/tmp/tmp8W2HvE
The text was updated successfully, but these errors were encountered: