False positives #30

Open
iDiogenes opened this issue Apr 11, 2017 · 11 comments

@iDiogenes

Hello,

I pointed this at my Rancher 1.4.3 server, and it reported about half of my stacks as UNHEALTHY and fired off a bunch of emails. However, the Rancher UI shows everything as green.

Is 1.4 supported?

@VAdamec

VAdamec commented Apr 11, 2017

I tried a clean v1.4.3 install with some sample apps from the catalog, and it seems to pick up state changes without any problem. Can you provide logs from the container? We can add some debug output later to see more.

My sample log:

[INFO]   2017-4-11 5:48:22:643     start polling rancher-eventer/rancher-eventer
[INFO]   2017-4-11 5:48:22:644     start polling pxc/pxc
[INFO]   2017-4-11 5:48:22:644     start polling concrete5/cmsmysql
[INFO]   2017-4-11 5:48:22:645     start polling concrete5/concrete5app
[INFO]   2017-4-11 5:48:22:645     start polling dokuwiki2/dokuwiki-server
[INFO]   2017-4-11 6:4:9:301       service concrete5/cmsmysql active -> degraded
[INFO]   2017-4-11 6:4:24:325      service concrete5/cmsmysql degraded -> active
[INFO]   2017-4-11 6:5:24:460      service concrete5/concrete5app active -> upgraded
[INFO]   2017-4-11 6:5:39:492      service concrete5/concrete5app upgraded -> active
[INFO]   2017-4-11 6:5:54:526      service concrete5/concrete5app active -> degraded
[INFO]   2017-4-11 6:6:9:549       service concrete5/concrete5app degraded -> active
[INFO]   2017-4-11 6:6:23:287      stopping pxc due to rolling-back state
[INFO]   2017-4-11 6:6:23:287      stop polling pxc/pxc
[INFO]   2017-4-11 6:6:24:625      service pxc/pxc         upgrading -> degraded
[INFO]   2017-4-11 6:7:23:321      discovered new running service, creating monitor for: pxc/pxc
[INFO]   2017-4-11 6:7:23:322      new monitor up pxc/pxc:
  targets: "(HipchatTarget {\"notify\":\"true\"})"
  healthcheck: {
    "pollInterval": 15000,
    "healthyThreshold": 3,
    "unhealthyThreshold": 4
}

[INFO]   2017-4-11 6:7:23:322      start polling pxc/pxc
[INFO]   2017-4-11 6:7:38:364      service pxc/pxc         active -> degraded
[INFO]   2017-4-11 6:8:23:475      service pxc/pxc         became UNHEALTHY with threshold 4
[INFO]   2017-4-11 6:8:24:64       sent event to Hipchat service pxc in stack pxc became degraded (active) link: http://xxx.xxx.xxx.xxx:8080/env/1a5/apps/stacks/1e7/services/1s8/containers

@iDiogenes
Author

I am not sure if you initiated an issue, but it looks like your service pxc became UNHEALTHY after rancher-alarms started.

Here are my logs. As you can see, 6 of the 10 services were marked as degraded and then UNHEALTHY pretty much right on startup. This resulted in 6 emails being triggered. However, Rancher is showing every service as active/green.

4/11/2017 10:42:51 AM> rancher-alarms@0.1.7 start /usr/src/app
4/11/2017 10:42:51 AM> node bin/rancher-alarms.js
4/11/2017 10:42:51 AM
4/11/2017 10:42:55 AM[INFO] 2017-4-11 17:42:55:112 composing config from env variables
4/11/2017 10:42:55 AM[INFO] 2017-4-11 17:42:55:125 started with config:
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:709 monitors inited:
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:710 mystack3/db:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:710 mystack3/passenger:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:711 mystack3/lb:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:711 letsencrypt/letsencrypt:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:711 pa/db:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:711 pa/shoryuken:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:711 edge/lb:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:711 edge/redirect:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:711 mystack5/db:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:711 mystack5/passenger:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:711 mystack5/lb:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:711 mystack1/db:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:711 mystack1/passenger:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:712 mystack4/db:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:712 mystack4/passenger:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:712 api/db:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:712 api/passenger:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:712 api/lb:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:712 mystack6/db:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:712 mystack6/passenger:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:712 mystack2/db:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:712 mystack2/passenger:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:712 mystack2/lb:
4/11/2017 10:42:56 AM targets: "email:\n recipients: myuser@mydomain.com"
4/11/2017 10:42:56 AM healthcheck: {
4/11/2017 10:42:56 AM "pollInterval": 15000,
4/11/2017 10:42:56 AM "healthyThreshold": "2",
4/11/2017 10:42:56 AM "unhealthyThreshold": "3"
4/11/2017 10:42:56 AM}
4/11/2017 10:42:56 AM
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:714 start polling mystack3/db
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:718 start polling mystack3/passenger
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:719 start polling mystack3/lb
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:719 start polling letsencrypt/letsencrypt
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:719 start polling pa/db
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:719 start polling pa/shoryuken
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:720 start polling edge/lb
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:720 start polling edge/redirect
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:722 start polling mystack5/db
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:722 start polling mystack5/passenger
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:722 start polling mystack5/lb
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:722 start polling mystack1/db
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:722 start polling mystack1/passenger
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:722 start polling mystack4/db
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:722 start polling mystack4/passenger
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:722 start polling api/db
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:723 start polling api/passenger
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:723 start polling api/lb
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:723 start polling mystack6/db
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:723 start polling mystack6/passenger
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:723 start polling mystack2/db
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:723 start polling mystack2/passenger
4/11/2017 10:42:56 AM[INFO] 2017-4-11 17:42:56:723 start polling mystack2/lb
4/11/2017 10:43:12 AM[INFO] 2017-4-11 17:43:12:420 service mystack4/passenger active -> degraded
4/11/2017 10:43:12 AM[INFO] 2017-4-11 17:43:12:436 service mystack1/passenger active -> degraded
4/11/2017 10:43:12 AM[INFO] 2017-4-11 17:43:12:442 service mystack6/passenger active -> degraded
4/11/2017 10:43:12 AM[INFO] 2017-4-11 17:43:12:502 service mystack3/passenger active -> degraded
4/11/2017 10:43:12 AM[INFO] 2017-4-11 17:43:12:511 service mystack5/passenger active -> degraded
4/11/2017 10:43:12 AM[INFO] 2017-4-11 17:43:12:532 service mystack2/passenger active -> degraded
4/11/2017 10:43:43 AM[INFO] 2017-4-11 17:43:43:694 service mystack4/passenger became UNHEALTHY with threshold 3
4/11/2017 10:43:43 AM[INFO] 2017-4-11 17:43:43:856 service mystack1/passenger became UNHEALTHY with threshold 3
4/11/2017 10:43:43 AM[INFO] 2017-4-11 17:43:43:899 service mystack3/passenger became UNHEALTHY with threshold 3
4/11/2017 10:43:43 AM[INFO] 2017-4-11 17:43:43:906 service mystack5/passenger became UNHEALTHY with threshold 3
4/11/2017 10:43:43 AM[INFO] 2017-4-11 17:43:43:909 service mystack6/passenger became UNHEALTHY with threshold 3
4/11/2017 10:43:43 AM[INFO] 2017-4-11 17:43:43:963 service mystack2/passenger became UNHEALTHY with threshold 3
4/11/2017 10:43:44 AM[INFO] 2017-4-11 17:43:44:214 sending email notification to myuser@mydomain.com
4/11/2017 10:43:44 AM[INFO] 2017-4-11 17:43:44:384 sending email notification to myuser@mydomain.com
4/11/2017 10:43:44 AM[INFO] 2017-4-11 17:43:44:406 sending email notification to myuser@mydomain.com
4/11/2017 10:43:44 AM[INFO] 2017-4-11 17:43:44:419 sending email notification to myuser@mydomain.com
4/11/2017 10:43:44 AM[INFO] 2017-4-11 17:43:44:463 sending email notification to myuser@mydomain.com
4/11/2017 10:43:44 AM[INFO] 2017-4-11 17:43:44:485 sending email notification to myuser@mydomain.com
4/11/2017 10:43:45 AM[INFO] 2017-4-11 17:43:45:484 sent email notification to myuser@mydomain.com {
4/11/2017 10:43:45 AM "accepted": [
4/11/2017 10:43:45 AM "myuser@mydomain.com"
4/11/2017 10:43:45 AM ],
4/11/2017 10:43:45 AM "rejected": [],
4/11/2017 10:43:45 AM "response": "250 2.0.0 OK 1491932625 n7sm31840855pfn.0 - gsmtp",
4/11/2017 10:43:45 AM "envelope": {
4/11/2017 10:43:45 AM "from": "myuser@mydomain.com",
4/11/2017 10:43:45 AM "to": [
4/11/2017 10:43:45 AM "myuser@mydomain.com"
4/11/2017 10:43:45 AM ]
4/11/2017 10:43:45 AM },
4/11/2017 10:43:45 AM "messageId": "1491932624641-8e3b5400-57777895-4643747f@mydomain.com"
4/11/2017 10:43:45 AM}
4/11/2017 10:43:45 AM[INFO] 2017-4-11 17:43:45:746 sent email notification to myuser@mydomain.com {
4/11/2017 10:43:45 AM "accepted": [
4/11/2017 10:43:45 AM "myuser@mydomain.com"
4/11/2017 10:43:45 AM ],
4/11/2017 10:43:45 AM "rejected": [],
4/11/2017 10:43:45 AM "response": "250 2.0.0 OK 1491932625 t5sm31763246pgb.58 - gsmtp",
4/11/2017 10:43:45 AM "envelope": {
4/11/2017 10:43:45 AM "from": "myuser@mydomain.com",
4/11/2017 10:43:45 AM "to": [
4/11/2017 10:43:45 AM "myuser@mydomain.com"
4/11/2017 10:43:45 AM ]
4/11/2017 10:43:45 AM },
4/11/2017 10:43:45 AM "messageId": "1491932624502-9f9b38ef-fc021404-00c495c2@mydomain.com"
4/11/2017 10:43:45 AM}
4/11/2017 10:43:46 AM[INFO] 2017-4-11 17:43:46:77 sent email notification to myuser@mydomain.com {
4/11/2017 10:43:46 AM "accepted": [
4/11/2017 10:43:46 AM "myuser@mydomain.com"
4/11/2017 10:43:46 AM ],
4/11/2017 10:43:46 AM "rejected": [],
4/11/2017 10:43:46 AM "response": "250 2.0.0 OK 1491932626 r17sm31801969pfa.13 - gsmtp",
4/11/2017 10:43:46 AM "envelope": {
4/11/2017 10:43:46 AM "from": "myuser@mydomain.com",
4/11/2017 10:43:46 AM "to": [
4/11/2017 10:43:46 AM "myuser@mydomain.com"
4/11/2017 10:43:46 AM ]
4/11/2017 10:43:46 AM },
4/11/2017 10:43:46 AM "messageId": "1491932624634-318d97dc-d8cc5db6-fd492550@mydomain.com"
4/11/2017 10:43:46 AM}
4/11/2017 10:43:46 AM[INFO] 2017-4-11 17:43:46:426 sent email notification to myuser@mydomain.com {
4/11/2017 10:43:46 AM "accepted": [
4/11/2017 10:43:46 AM "myuser@mydomain.com"
4/11/2017 10:43:46 AM ],
4/11/2017 10:43:46 AM "rejected": [],
4/11/2017 10:43:46 AM "response": "250 2.0.0 OK 1491932626 o194sm31854886pfg.66 - gsmtp",
4/11/2017 10:43:46 AM "envelope": {
4/11/2017 10:43:46 AM "from": "myuser@mydomain.com",
4/11/2017 10:43:46 AM "to": [
4/11/2017 10:43:46 AM "myuser@mydomain.com"
4/11/2017 10:43:46 AM ]
4/11/2017 10:43:46 AM },
4/11/2017 10:43:46 AM "messageId": "1491932624637-070282ad-d48bfe01-57a7081f@mydomain.com"
4/11/2017 10:43:46 AM}
4/11/2017 10:43:46 AM[INFO] 2017-4-11 17:43:46:738 sent email notification to myuser@mydomain.com {
4/11/2017 10:43:46 AM "accepted": [
4/11/2017 10:43:46 AM "myuser@mydomain.com"
4/11/2017 10:43:46 AM ],
4/11/2017 10:43:46 AM "rejected": [],
4/11/2017 10:43:46 AM "response": "250 2.0.0 OK 1491932626 m19sm5561930pfg.115 - gsmtp",
4/11/2017 10:43:46 AM "envelope": {
4/11/2017 10:43:46 AM "from": "myuser@mydomain.com",
4/11/2017 10:43:46 AM "to": [
4/11/2017 10:43:46 AM "myuser@mydomain.com"
4/11/2017 10:43:46 AM ]
4/11/2017 10:43:46 AM },
4/11/2017 10:43:46 AM "messageId": "1491932624694-9ada7c1c-3900e933-a89a9bed@mydomain.com"
4/11/2017 10:43:46 AM}
4/11/2017 10:43:47 AM[INFO] 2017-4-11 17:43:47:66 sent email notification to myuser@mydomain.com {
4/11/2017 10:43:47 AM "accepted": [
4/11/2017 10:43:47 AM "myuser@mydomain.com"
4/11/2017 10:43:47 AM ],
4/11/2017 10:43:47 AM "rejected": [],
4/11/2017 10:43:47 AM "response": "250 2.0.0 OK 1491932627 2sm8215793pfs.85 - gsmtp",
4/11/2017 10:43:47 AM "envelope": {
4/11/2017 10:43:47 AM "from": "myuser@mydomain.com",
4/11/2017 10:43:47 AM "to": [
4/11/2017 10:43:47 AM "myuser@mydomain.com"
4/11/2017 10:43:47 AM ]
4/11/2017 10:43:47 AM },
4/11/2017 10:43:47 AM "messageId": "1491932624699-9ddac7c1-9e0ed61e-e97122fc@mydomain.com"
4/11/2017 10:43:47 AM}
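
Judging only from the timestamps in the log above (a 15000 ms poll interval, with UNHEALTHY reported roughly two polls after the first degraded poll at threshold 3), the monitor appears to count consecutive non-active polls. The following is a minimal sketch of that idea, not the rancher-alarms code: the class name `ThresholdTracker`, the `record` method, and the `'HEALTHY'` return value are hypothetical. It also coerces the thresholds to numbers, since the config composed from env variables prints them as strings (`"2"`, `"3"`).

```javascript
// Hypothetical sketch, not taken from rancher-alarms.
class ThresholdTracker {
  constructor({ healthyThreshold, unhealthyThreshold }) {
    // Values read from env variables arrive as strings ("2", "3"), so coerce them.
    this.healthyThreshold = Number(healthyThreshold);
    this.unhealthyThreshold = Number(unhealthyThreshold);
    this.healthy = true;
    this.streak = 0; // consecutive polls that disagree with the current status
  }

  // Record one poll result. Returns 'UNHEALTHY' or 'HEALTHY' when the status
  // flips, and null otherwise. Only the 'active' state counts as healthy here.
  record(state) {
    const ok = state === 'active';
    if (ok === this.healthy) {
      this.streak = 0; // the streak is broken, start counting again
      return null;
    }
    this.streak += 1;
    const needed = this.healthy ? this.unhealthyThreshold : this.healthyThreshold;
    if (this.streak < needed) return null;
    this.healthy = ok;
    this.streak = 0;
    return ok ? 'HEALTHY' : 'UNHEALTHY';
  }
}
```

With `unhealthyThreshold: "3"`, three consecutive `degraded` polls would flip the status, which matches the roughly 30 seconds between the `active -> degraded` and `became UNHEALTHY` lines.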

@VAdamec

VAdamec commented Apr 11, 2017

Well, a GREEN service in the Rancher UI doesn't mean it's healthy; that depends on how you set up healthchecks in the affected services. If you look at the services in the API, are they really healthy?

@iDiogenes
Author

Running ps with the Rancher CLI against the environment shows every service and sidekick as healthy. What is rancher-alarms querying to check for a healthy state?

@iDiogenes
Author

Also, using the "view in API" from the UI is showing the same results - active and healthy.

@VAdamec

VAdamec commented Apr 11, 2017

That's strange. It gets the result from the API (services; see server.es6 and rancher.es6). Do you have more than one environment?

@iDiogenes
Author

There is only a single cattle environment on the server that is being queried.

@VAdamec

VAdamec commented Apr 12, 2017

OK, so please run it with debug output. I need to see more than the standard log.

@VAdamec

VAdamec commented Apr 12, 2017

I'm not familiar with trace() which is used here, but you can easily change it to info() in src/server.es6, line 32

  trace(`loaded services from API\n${JSON.stringify(services, null, 4)}`)
// just change trace to info
  info(`loaded services from API\n${JSON.stringify(services, null, 4)}`)

It will show you the complete API response received from Rancher. Then run it from a shell:

export RANCHER_ACCESS_KEY=..
...
npm start

@iDiogenes
Author

@VAdamec - I found the issue, and it does have to do with the version of Rancher. The _withoutSidekicks function splits the container name on an underscore. In Rancher 1.2 (I believe; it could be 1.3) they changed sidekicks to be separated by a hyphen. I updated my code locally to use the hyphen, and that solved my problem. I'm not sure how you want to address the issue, but I would recommend a fix that supports both formats.

https://github.com/ndelitski/rancher-alarms/blob/master/src/monitor.es6#L195
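
One way a fix supporting both formats might look is sketched below. This is not the project's `_withoutSidekicks` implementation: the helper names `isPrimaryContainer` and `withoutSidekicks`, the `name` property, and the assumed naming scheme (`<stack><sep><service><sep><index>` for primary containers, with an extra sidekick segment before the index for sidekicks) are all assumptions made for illustration. Matching on a known stack/service prefix rather than splitting on the separator avoids miscounting segments when a stack or service name itself contains a hyphen.

```javascript
// Hypothetical sketch, not the actual rancher-alarms code.
// Assumed naming: "<stack><sep><service><sep><index>" for primary containers,
// where <sep> is "_" (older Rancher) or "-" (newer Rancher).
const SEPARATORS = ['_', '-'];

// True when the container name is exactly the stack/service prefix plus a
// numeric index, under either separator.
function isPrimaryContainer(containerName, stackName, serviceName) {
  return SEPARATORS.some((sep) => {
    const prefix = `${stackName}${sep}${serviceName}${sep}`;
    return (
      containerName.startsWith(prefix) &&
      /^\d+$/.test(containerName.slice(prefix.length))
    );
  });
}

// Keep only the primary containers of a service, dropping sidekicks.
function withoutSidekicks(containers, stackName, serviceName) {
  return containers.filter((c) =>
    isPrimaryContainer(c.name, stackName, serviceName)
  );
}
```

For example, `withoutSidekicks([{ name: 'mystack4-passenger-1' }, { name: 'mystack4-passenger-worker-1' }], 'mystack4', 'passenger')` would keep only the first entry, where `worker` stands in for a made-up sidekick name.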

@VAdamec

VAdamec commented May 23, 2017

OK, it seems to be an easy fix. Can you create a PR? We'll see if and when @ndelitski accepts it.
