High battery usage, probably network issue #6732
Comments
Doesn't look like anything unusual is happening here, although it does seem as if your network connection is reset pretty regularly. Keeping an application awake in the background non-stop while sending keep-alives to make a network connection stay awake is fundamentally going to cause battery drain. By contrast, GCM can be mostly idle, since they've negotiated idle connection windows with carriers around the world. K9 has a much easier task, since they can use some kind of alarm manager to make periodic checks given the more relaxed latency requirements of email. |
That's not the whole truth. K9 also uses push connections that stay open for instant notification of new mails, so the task is (in my opinion) comparable to Signal's. But: I just learned that K9 did not resolve the issues around doze mode since Android 6. So, although the options imply that it can do instant notifications, it fails to do so. And since Signal does the job right, it is at least reasonable to me now that it uses more energy than K9. Nevertheless, I think the EOFExceptions should not be in the log. If they are handled correctly, then there is no need to print the whole stack trace to the log. |
The log is informational. By that logic, none of the contents of your log would need to be in the log. |
This is still stuck in my mind. Until a few days ago I thought this energy problem was about doze mode and Signal trying to keep the connection open during doze. Now I have learned that my phone (and probably all phones without GCM) does not doze. (The Android documentation states that GCM is a prerequisite for doze mode.) Therefore I don't see a reason why Signal should use more energy than K9, as both have an open push connection and I did several tests to verify that both work reliably. At the moment it appears to me that my energy problem consists of two parts: on the one hand I have increased energy usage when my phone is idling during night/standby, and on the other hand I have peak energy usage during SOT. The latter may be a problem with my phone, but I am not sure about it at the moment; I am still investigating this. For the first one, the only hint I have is some not very special looking lines in the log like:
4721 4774 I chatty : uid=10128(org.thoughtcrime.securesms) Thread-2 expire 15 lines
I have a lot of them during the night, and nothing else. Can someone tell me what they mean, or perhaps where in the code they are produced? They don't look dangerous to me, but I have no other trace at the moment. |
I also have a huge battery drain with Signal when I'm on cellular data. It doesn't occur on a Wi-Fi connection. To add some more info to this topic: it looks like both the Riot.im and Telegram apps still manage to stay connected but don't drain the battery much (still no GAPPS). Maybe their code can be investigated and something similar implemented? |
I can confirm that the drain only happens on mobile data and not on Wi-Fi. I tried to get two logs to compare the situations: https://gist.github.com/b27f7627301914fb058d4aaf7deae40c The first one was taken during a night on Wi-Fi and the second one on mobile data. You can see in the second one that there is constant activity. Unfortunately the log doesn't show what happened, because Android suppressed most messages (too chatty). I will try to get a better one. I verified that my connection was stable during the second log by keeping an ssh connection open all the time, printing the server time every 10 minutes, so I am certain that there is no connection issue. |
@theBoatman regarding the chattiness: maybe the 2nd and 3rd answers to this Stack Overflow question can help. I guess you'll have to do the whitelisting before the log messages you're interested in are generated. |
I have now whitelisted Signal and taken another log: https://gist.github.com/anonymous/2e403b617f00964b7f246a738f0133dd You can see that the connection is constantly renewed because of an EOFException. I verified that my internet connection was stable all the time. |
@theBoatman Looks to me like your mobile carrier is killing connections at an idle interval smaller than Signal's keepalive interval. The fact that this only happens for you on mobile data and not wifi is additional confirmation. This is the biggest problem with not using GCM; we are not organizationally large enough or rich enough to negotiate idle connection windows with mobile carriers all over the world as Google and others have done. To say that you've "verified that your internet connection is stable" is not entirely relevant for this case. It's not clear what your SSH session's keepalive interval is, and your carrier could likely have different idle timeout windows for different hosts, ports, etc. If you want to rule out Signal, you can write a script to simulate idle connections to Signal's production host/port to see how your mobile carrier treats it. |
@moxie0 I'm from a developing country and I can understand there can be possible issues with carriers. However, I'm also pretty sure that neither Riot.im nor Telegram negotiated anything with them, and yet they work like a charm on mobile data. I'm always connected, receive messages instantly and the battery drain is minimal. I think it's worth investigating and I'd appreciate it. |
@dtltf I'm sure there is a way to keep a connection alive, probably with some kind of adaptive keepalive strategy. I don't have any devices without GCM on them, though, so I'm not really the right person to write that. I think it would be better if the people who've been advocating for non-GCM support were the ones to write the code or to investigate issues like this. |
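A rough sketch of what an "adaptive keepalive strategy" could look like, in plain Java without Android dependencies. Everything here is an assumption made for illustration: the class `AdaptiveKeepAlive`, its method names and the grow/fall-back rule are hypothetical and not taken from Signal or libsignal-service-java. The idea is to lengthen the interval while keep alives are still answered, and to fall back to the last interval that worked once a connection is found dead.

```java
/**
 * Hypothetical adaptive keep alive interval tracker (not Signal's code).
 * It probes longer intervals while the connection survives and falls back
 * when a drop suggests that the network's idle timeout was exceeded.
 */
final class AdaptiveKeepAlive {
    private final long minMillis;
    private final long stepMillis;
    private long ceilingMillis;   // largest interval we are still willing to try
    private long currentMillis;   // interval to use for the next keep alive
    private long lastGoodMillis;  // largest interval that is known to have worked

    AdaptiveKeepAlive(long minMillis, long maxMillis, long stepMillis) {
        if (minMillis <= 0 || maxMillis < minMillis || stepMillis <= 0) {
            throw new IllegalArgumentException("invalid interval bounds");
        }
        this.minMillis = minMillis;
        this.stepMillis = stepMillis;
        this.ceilingMillis = maxMillis;
        this.currentMillis = minMillis;
        this.lastGoodMillis = minMillis;
    }

    /** Interval to wait before sending the next keep alive. */
    long currentIntervalMillis() {
        return currentMillis;
    }

    /** Call when the keep alive sent after the current interval was answered. */
    void onKeepAliveAcknowledged() {
        lastGoodMillis = currentMillis;
        currentMillis = Math.min(ceilingMillis, currentMillis + stepMillis);
    }

    /** Call when the connection turned out to be dead after the current interval. */
    void onConnectionLost() {
        if (currentMillis > lastGoodMillis) {
            // The probe went too far: return to the last good interval and stay there.
            ceilingMillis = lastGoodMillis;
            currentMillis = lastGoodMillis;
        } else {
            // Even the known-good interval failed: halve it, but not below the minimum.
            currentMillis = Math.max(minMillis, currentMillis / 2);
            lastGoodMillis = currentMillis;
            ceilingMillis = currentMillis;
        }
    }

    public static void main(String[] args) {
        // 55 s and 15 min are the two periods mentioned in this thread; the 60 s step is arbitrary.
        AdaptiveKeepAlive keepAlive = new AdaptiveKeepAlive(55_000L, 900_000L, 60_000L);
        keepAlive.onKeepAliveAcknowledged();
        keepAlive.onConnectionLost();
        System.out.println(keepAlive.currentIntervalMillis());
    }
}
```

Whether the server would tolerate intervals longer than its own idle timeout is a separate question that this sketch does not address.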
I run Signal on a Nexus 6P with CopperheadOS without GCM, and the battery drain is quite a bit higher than with GCM, no matter whether on Wi-Fi or mobile data. I fully understand that moxie0 doesn't have the time to search for and develop a better strategy for non-GCM users. For now I am happy to use the official Signal client without GCM. Moxie is right. We need to do the research on how this could be optimized. And this doesn't mean we point to other apps that handle this scenario better than we do. We have to understand how. |
So we agree that there is a problem, and we have to determine whether it is a coding problem (and therefore solvable) or a carrier problem (and therefore out of scope for Signal). I am quite sure that it is not a carrier problem. One of my ssh test servers is behind a home internet connection with a dynamic IP and a randomly selected port. I don't think my carrier has a special rule for this. But I have no proof. So, if anyone can tell me which host/port Signal connects to, I will try to do some experiments with this. If I can get a stable connection over mobile data, then we need to take a deeper look at the code. |
I have now done some small tests. I can keep a connection to the server up for 90 seconds over both Wi-Fi and mobile data. Then the server closed the connection, probably because I did not log in or send anything. In the source code, the keep alive timeout is specified as 55 seconds, so at first I thought that should be fine, but then I took a deeper look at the log files and saw that this doesn't work. If the logs are right, then the keep alive packets are sent at far longer intervals. On Wi-Fi it is up to an hour, on mobile data long enough that a connection timeout occurs. I think the problem lies in the KeepAliveSender. There, Thread.sleep() is used for scheduling. I googled a little and found out that sleep doesn't work reliably on Android when the screen is off, so I think that might be our problem. Can anyone confirm that the keep alive isn't sent every 55 seconds? |
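One way to check the question above is to collect the times at which keep alives were actually sent and flag every gap that is clearly longer than 55 seconds. The sketch below assumes the send times have already been extracted from a debug log as millisecond timestamps; the class `KeepAliveGaps`, the tolerance parameter and the sample values are made up for illustration and do not come from a real log.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for spotting late keep alives in a list of send times. */
final class KeepAliveGaps {
    /** Returns every gap between consecutive send times that exceeds expected + tolerance. */
    static List<Long> lateGapsMillis(long[] sentAtMillis, long expectedMillis, long toleranceMillis) {
        List<Long> late = new ArrayList<>();
        for (int i = 1; i < sentAtMillis.length; i++) {
            long gap = sentAtMillis[i] - sentAtMillis[i - 1];
            if (gap > expectedMillis + toleranceMillis) {
                late.add(gap);
            }
        }
        return late;
    }

    public static void main(String[] args) {
        // Made-up send times: two on schedule, then a 3 minute gap and a 1 hour gap.
        long[] sentAt = {0L, 55_000L, 110_000L, 290_000L, 3_890_000L};
        System.out.println(lateGapsMillis(sentAt, 55_000L, 5_000L));
    }
}
```

If the list comes back non-empty for a log taken with the screen off, that would support the suspicion that the sleep-based scheduling is not firing on time.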
I am also experiencing this problem with a GCM enabled device. The app is using a very large amount of energy in the background; neither WiFi nor mobile data connections seem to matter. |
I've had a somewhat cursory look over the discussion, so apologies in advance if I'm repeating something that's already been said. This issue is indeed related to #6644. Inspection of the logs shows that the connection is reset every 1.5 minutes or thereabouts, while the keep alive period is 55 seconds, as @theBoatman pointed out. The keep alive is missed because the keep alive sending implementation is broken for GCM-free devices, as explained in #6644 and, in more detail, in signalapp/libsignal-service-java#45. This probably has nothing to do with the network. I'm not sure why this problem would cause a huge battery drain. It doesn't in my case, where Signal is practically unusable, as it routinely delivers messages with hours of delay. For some reason, in this case, the device periodically wakes up. This causes the drain and also allows Signal to function, as it has a chance to re-establish the connection as soon as it is lost. That the device is awake much of the time can also be seen from the fact that reconnections happen in time:
Between these two log lines, there's a 200ms blocking wait, which can only expire in time, as is the case here, when the device is awake. In my case, this and other waits are delayed indefinitely (and there's no battery drain). Lines such as these on the other hand
imply that the device is not awake all of the time. There is a one minute blocking wait between these two and the fact that it expires after, typically, 1.5 minutes, implies that the device has been sleeping for at least the last 30 sec of that time. Finally, I should note that I can confirm that Signal can, in principle, work without GCM and without a big charge drain on the battery, at least with Android versions up to 7.x.x (see signalapp/libsignal-service-java#45 and #7100 for a proof of concept). |
I tried @dpapavas's patch and I can confirm that Signal runs better with it. There are nearly no exceptions in the log, and the battery usage dropped by 25% and is now very constant. Without the patch the energy consumption was very volatile. The energy consumption is still about 3 times what K9 uses (it was more than 4 times without the patch), so there is still room for improvement. Also, the patch hasn't been accepted by @moxie0 because it breaks the library interface, so it is not a final solution, but a step in the right direction. I am currently trying to find out how this situation is solved in K9, but as I am not an Android developer it will take me some time. |
@moxie0 Too much battery usage on a 2-hour audio call, approx. 628 mAh on a Huawei Mate 10 lite, and my phone overheats :( |
I have a huge battery drain on Android 8.0 and Pixel XL. Usually 700-1000mAh per day. Disabling the background activity makes no difference. It is the same with the Signal beta version. Signal is always the number one battery consuming app even though I use hours of YouTube and Firefox. |
Here is my log: |
@moxie0 |
Seems this battery drain is ignored by the developer. Lots of complaints in the Play Store also. So a big problem. |
@MikkyDoubleB the problem in this case is that moxie0 is of the opinion that it is quite unnatural to use an Android phone without Google apps, and therefore he doesn't support this situation. He accepts code for GCM-free devices, but has no interest in developing it. I tried to find a solution. @dpapavas's patch is a good step in the right direction, but it also fails every now and then, and it needs some not very pretty changes in the code. If someone knows a reliable way to trigger the sending of keep alive packets on Android, please let me know :) |
It was reported in another thread (#6898) that updating to 8.1 has fixed the issue for some. |
@MikkyDoubleB the problem in #6898 is a different one, because it happens on phones with GCM. The problem in this issue only happens on GCM-free devices. I may have a possible fix for our issue now, but I need to test it for a few days. If it runs without exceptions and with acceptable battery usage, I will publish it here. |
How does it fail? I've been using it all this time without any issues, at least none that seem to be related to this particular problem. There seems to be another issue that has to do with connection handovers from mobile broadband to WiFi, but it happens seldom enough that I haven't bothered much to investigate, especially since any potential patch will probably not make it into the repository. I'm happy to hear you have a patch. Keeping a private forked version isn't much fun. |
@dpapavas It usually works for me for about an hour or so, and then my phone sleeps so deeply that the alarm isn't triggered anymore. The reason is that you used setRepeating for setting the alarm, which only works as long as the device is not in a low power mode. The only method (since API level 23) which should work in every power mode is setAndAllowWhileIdle. This is the only important change to your solution. I also did some refactoring, but that was only cosmetic and for easier merging. This works perfectly for me now, except for handover between WiFi and mobile data, which is (as you already stated somewhere) not handled by the current code and always leads to an exception. The problem is that we still need a change in libsignal-service-java for triggering the keep alive from outside. I just reread your pull request there and I think that's the place where we need to continue. Without a change in the lib we will not be able to solve this. I think I will post some thoughts there. |
I find this troubling, in regard to its implications on the universality of the solution. The documentation is not very specific, but it does imply that using the
Furthermore, the section on
None of this is very specific, for instance with respect to what sort of 'going off' is assumed, but it seems to imply that the device should wake up. This seems not to be the case, even though the device is whitelisted from "battery optimizations". That the vague documentation probably leaves developers leeway to interpret "sleep" or "wake" ad libitum doesn't help either, as it makes it less likely that a particular approach will work across devices. As far as I can recall from the time I researched this, the
(Note, in passing, that it's specifically mentioned that Elsewhere the documentation also mentions, again implying that
But then whitelisting is mentioned, as a "partial exemption" from doze which, among other things, means:
Again, "regular AlarmManager alarms" is not very specific, but I interpreted it as non-wake type alarms, which seems to be the case for my device and that of a friend which explicitly mentions entering doze in the logs, but it's not the case for your device. As I've said, this is not very encouraging, in terms of the hope of achieving a universal solution. The fact that
In the end, perhaps a one minute keep alive requirement is overly restrictive and not achievable on mobile devices. A larger period, like 15 minutes would be desirable, but I'm not sure on the implications of this on the server end, which is why I tried to avoid it. It looks like there might be no way around it in the long term though. |
The question is what exactly "low-power idle modes" means. I haven't found any documentation on the different idle modes yet. If it stands for "Doze", then everything is fine, as our phones shouldn't doze at all, and that would explain why it works as expected on my phone. I expect that we will need to call a suitable alarm method for each API level, which is not a big deal, except that you need enough devices/people to test. By the way, I wrongly stated that I used setAndAllowWhileIdle; I really used setExactAndAllowWhileIdle, but I think it doesn't make a difference for the discussion. |
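A minimal sketch of the "suitable alarm method for each API level" idea, written as plain Java so that it runs without the Android SDK. The enum constants stand for the real AlarmManager methods set, setExact (added in API 19) and setExactAndAllowWhileIdle (added in API 23). The class `KeepAliveAlarmChooser` and its helpers are hypothetical and are not part of Signal or of any patch linked in this thread.

```java
/** Hypothetical chooser for the one-shot alarm call to use on a given API level. */
final class KeepAliveAlarmChooser {
    /** Stand-ins for the AlarmManager method a real implementation would call. */
    enum AlarmCall { SET, SET_EXACT, SET_EXACT_AND_ALLOW_WHILE_IDLE }

    static AlarmCall forApiLevel(int apiLevel) {
        if (apiLevel >= 23) {
            // Doze was introduced with API 23; this variant is meant to fire in idle modes too.
            return AlarmCall.SET_EXACT_AND_ALLOW_WHILE_IDLE;
        }
        if (apiLevel >= 19) {
            // From API 19 on, set() may be delivered inexactly, so setExact() is needed.
            return AlarmCall.SET_EXACT;
        }
        return AlarmCall.SET;
    }

    /** One-shot alarms do not repeat, so each delivery has to schedule the next one. */
    static long nextTriggerAtMillis(long nowMillis, long intervalMillis) {
        return nowMillis + intervalMillis;
    }

    public static void main(String[] args) {
        // API 25 corresponds to Android 7.1.1, the version named in the bug report.
        System.out.println(forApiLevel(25));
        System.out.println(nextTriggerAtMillis(0L, 55_000L));
    }
}
```

Unlike setRepeating, these calls arm a single alarm, so the handler would have to re-arm it every time it fires. The AlarmManager documentation also describes the "allow while idle" variants as rate-limited in low-power idle modes, so a 55-second period may not be honoured on every device.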
As far as I recall, and speaking with whatever certainty is allowed by the vague documentation, I'm pretty sure that there's only one sleep mode. When certain conditions are met, basically screen off and no wakelocks kept by any application, the device will sleep. This means that the CPU stops (which, incidentally, is probably why the uptime-based clock stops ticking and the keep alive thread never unblocks) and can only be woken up via some external event. On top of that, there are other mechanisms, which are not related to sleep in the sense of the basic CPU state, but are rather protocols that allow the system to avoid waking the CPU too often, while still allowing most tasks to be carried out. Doze (and App Standby) seem to fall in this category, and more or less work by regulating when power-consuming tasks are allowed to take place and for how long. Note here that while exemptions from these protocols can be requested, they seem to be "partial exemptions", at least according to the documentation, which I've quoted in my previous post. It's not entirely clear, therefore, whether Doze and App Standby are entirely irrelevant for a whitelisted application. The documentation does mention GCM-free devices as more or less the sole purpose of exemptions though, so it should be reasonable to expect that Signal should be able to work without it. |
I created a fork of libsignal and Signal-Android containing my changes, if anyone wants to take a look at it. https://github.com/theBoatman/libsignal-service-java-openKeepalive I also changed the build.gradle to prefer the locally built library, so it should be enough to do a "gradlew build install" in libsignal-service-java and a "gradlew build" in Signal-Android to build the application. This patched version runs fine on my phone and reduced the energy usage of signal by 50%, which gained me a whole day of standby time. I am currently trying to find a more "moxiefied" solution :) |
I forgot to mention: Regarding handover between WiFi and mobile connection, this also seems to be a device-related problem. On my phone (as is the case on theBoatman's, I believe), when switching networks during a call, the connection is lost. Even if not in a call, the device won't be able to receive anything from the server for some time after the handover. On a friend's device, with the same version of Android as mine and without GCM, this is not the case. I've investigated this a bit, and it seems to be a matter of the device's configuration, or perhaps network drivers. On the working device, after handover, all sockets belonging to the disconnected network are immediately closed and destroyed. On mine, they seem to be allowed to linger until they time out. This seems to have the result that Signal promptly recycles the message pipe in one case, but hangs around listening to a dangling socket in the other. That is probably not all of it though, as it wouldn't explain why the working phone can keep a call going during a handover. In any case, I remembered having a fix for that too, but couldn't find the patch, so I decided I probably didn't keep it because it didn't work. Turns out I did. I've cleaned it up a bit and it's available here, for anyone interested: 4359b4f. It basically works by detecting the network change and manually shutting down the pipe in response. It won't help a call survive a handover, but at least you can still use Signal immediately after it. (@theBoatman: note that, to incorporate this fix in your fork, you'll need some way to wake up the socket from all the places it performs blocking sleep, most notably when it waits for a message on the pipe. As it is now, it seems to only send the keep-alive, so that the socket would probably still be blocked and unresponsive for a very long time while the device is sleeping.
It would probably also take a long time to reconnect, as it would be trapped in the blocking wait-before-reconnect sleep, which is also not addressed by just sending a keep-alive. I imagine that these blocking sleep traps, will also affect Signal's behavior during normal use, especially when the connection is bad, but I haven't concretely seen this in the logs, so I'm just speculating.) |
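The "detect the network change and shut down the pipe" idea can be sketched in plain Java as follows. This is not the code from commit 4359b4f, which is not reproduced in this thread; `NetworkChangeWatcher`, its `Pipe` interface and the network labels are hypothetical stand-ins. On Android, `onActiveNetworkChanged` would presumably be driven by a connectivity-change notification from the system.

```java
import java.util.Objects;

/** Hypothetical watcher that recycles the message pipe when the active network changes. */
final class NetworkChangeWatcher {
    /** Stand-in for whatever owns the WebSocket connection. */
    interface Pipe {
        void shutdown();
    }

    private final Pipe pipe;
    private String activeNetwork; // e.g. "wifi" or "mobile"; null while offline

    NetworkChangeWatcher(Pipe pipe, String initialNetwork) {
        this.pipe = Objects.requireNonNull(pipe);
        this.activeNetwork = initialNetwork;
    }

    /**
     * Call on every connectivity notification. Shuts the pipe down only when the
     * active network really changed, so that the owner reconnects on the new
     * network instead of listening on a socket that belongs to the old one.
     *
     * @return true if the pipe was shut down
     */
    boolean onActiveNetworkChanged(String newNetwork) {
        boolean changed = !Objects.equals(activeNetwork, newNetwork);
        activeNetwork = newNetwork;
        if (changed) {
            pipe.shutdown();
        }
        return changed;
    }

    public static void main(String[] args) {
        NetworkChangeWatcher watcher =
                new NetworkChangeWatcher(() -> System.out.println("pipe shut down"), "wifi");
        watcher.onActiveNetworkChanged("wifi");   // duplicate notification
        watcher.onActiveNetworkChanged("mobile"); // handover
    }
}
```

As noted in the comment above, shutting the pipe down only helps if the code blocked on the old socket is actually woken up, which this sketch does not cover.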
I just made a pull request to fix #6447 (signalapp/libsignal-service-java#49). Maybe this also fixes your problems? I never had this problem, so I can't test this. |
If I am right we have currently two problem situations that are not (really) handled by the current code:
My/dpapavas's patch addresses situation 1, rkohrt's patch situation 2. Perhaps we should split this up into two separate issues. @dpapavas: I don't think that I have to do anything more than my patch does. As long as the keep alive packets are sent, the socket seems to listen for incoming data and react immediately to messages, at least on my phone. But I will retest that to be sure. |
I did link to a patch for case 2 above as well, but note that this is tested on my device only. Also note, that this should probably be viewed as a workaround for certain devices, instead of necessary functionality, as some devices seem to work fine (see my previous post for details). As such, a general solution should probably determine why and to what extent a workaround is necessary (as opposed to filing an upstream bug report for instance) and also whether it has any adverse effects on properly working devices. I have done neither.
Well it should work fine most of the time. On some occasions though, there should be observable problems. Say your connection is lost while the phone is asleep. The blocking sleep before reconnection, will never expire for the reasons already explained and you'll stay without a connection, until some alarm, perhaps your own, wakes up the phone for long enough for the timeout to expire. Since the timeout in question starts out relatively short, this will probably happen soon, so the impact will not be too bad in most cases. During a period of poor connectivity though, things might be different. Anyway, it'll probably work well enough in practice. Nevertheless, any sort of blocking wait function while the device is asleep (which is generally the case here), basically means "block for as long as it might happen to take, to accumulate x waking ms", which makes no sense. I therefore consider it a bug, unless the code takes this behavior into account, instead of relying on external conditions, that might or might not happen, to work as it should. |
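The "block until x waking ms have accumulated" behaviour can be illustrated with a small simulation. It assumes, as described in this thread, that the clock behind the blocking wait only advances while the device is awake. `AwakeWait` and the sleep windows are made up for illustration and say nothing about how a particular device actually schedules its sleep.

```java
/** Hypothetical simulation of a blocking wait whose clock is frozen during device sleep. */
final class AwakeWait {
    /**
     * Returns the wall-clock time at which a wait of waitMillis awake milliseconds
     * ends, starting at startMillis. sleepWindows holds {sleepStart, sleepEnd} pairs,
     * sorted and non-overlapping, during which the awake clock does not advance.
     */
    static long expiresAt(long startMillis, long waitMillis, long[][] sleepWindows) {
        long now = startMillis;
        long remaining = waitMillis;
        for (long[] window : sleepWindows) {
            long sleepEnd = window[1];
            if (sleepEnd <= now) {
                continue; // this sleep window is already over
            }
            long sleepStart = Math.max(window[0], now);
            long awakeBefore = sleepStart - now; // awake time until this sleep begins
            if (awakeBefore >= remaining) {
                break; // the wait ends before the device goes to sleep
            }
            remaining -= awakeBefore;
            now = sleepEnd; // the clock was frozen, so resume counting after the sleep
        }
        return now + remaining;
    }

    public static void main(String[] args) {
        // A one minute wait interrupted by 30 seconds of sleep.
        System.out.println(expiresAt(0L, 60_000L, new long[][] {{20_000L, 50_000L}}));
        // A 200 ms wait on a device that stays awake long enough.
        System.out.println(expiresAt(0L, 200L, new long[][] {{1_000L, 5_000L}}));
    }
}
```

With a one minute wait and a single 30-second sleep window, the wait ends after 1.5 minutes of wall-clock time, which matches the reasoning earlier in the thread about the one minute blocking wait. If the device never wakes up, the wait never ends.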
@theBoatman yes exactly. With both of these issues fixed I now have a reliable connection to the Signal server without using GCM. Also note my other pull request (#7388), where I have rewritten @dpapavas's pull request for problem 1 so that it does not break the interface. Hopefully this brings us closer to getting it merged into master. |
@rkohrt your pull request (#7388) looks very promising, but it will not work reliably on my phone, as you are using alarmManager.setRepeating() for setting the alarm. The only way to set an alarm that is triggered in every situation since Marshmallow is (according to the documentation) setExactAndAllowWhileIdle. Can you add that in your version? @dpapavas When my phone loses/switches connection, the invalid connection is detected as soon as the next keep alive is sent, and then the connection is rebuilt instantly. So the scenario you described seems not to happen in my case. |
GitHub Issue Cleanup: |
Bug description
I noticed very high battery usage while Signal is running in the background. Signal uses 3 to 4 times more energy than K9, which has a similar task (open push connection, waiting for messages). The battery drain seems to be constant in the background and happens even if no message is received or sent. The log shows a lot of EOFExceptions in the WebSocketReader, so it seems to be a network-related issue.
I tried to enable/disable the android internal battery optimization - in both cases the Exceptions happened. I also uninstalled Signal for a day to confirm that it is Signal which uses the energy and not an error in the statistics. And I checked with BetterBatteryStats for uncommon wakelocks/alarms without result.
I already reported this issue in a comment in #6717, but it seems to be a different problem.
I am using Android 7.1.1 on a phone without GAPPS.
Device info
Device : motorola Moto G (lineage_falcon)
Android : 7.1.1 (cdc512c087, lineage_falcon-userdebug 7.1.1 NOF27B cdc512c087 release-keys)
App : Signal 4.6.1
Link to debug log
https://gist.github.com/anonymous/efd6d9b9ae725b831134654ff3250bd2
https://gist.github.com/anonymous/216fae3f5f7857c88eee3af101b54c83