Make Labcontroller compatible with Python 2 and Python 3 #227

Draft
wants to merge 81 commits into
base: python-3
Changes from 1 commit
Commits
557e290
feat: enable lab-controller build for Python 3
StykMartin Jan 19, 2024
6783017
fix(async): rename module to avoid use of reserved keyword
StykMartin Jan 19, 2024
046ec78
fix(test): use range instead of xrange
StykMartin Jan 19, 2024
b3b291c
fix(test): update exception syntax for Python 3 compatibility
StykMartin Jan 19, 2024
8bd61eb
fix(watchdog): use six to provide xmlrpc client
StykMartin Jan 19, 2024
59ca7ed
fix(watchdog): update exception syntax for Python 3 compatibility
StykMartin Jan 19, 2024
2d257ed
chore(watchdog): format code and sort imports
StykMartin Jan 19, 2024
d015ee2
chore(utils): format code and sort imports
StykMartin Jan 19, 2024
97f1633
fix(transfer): update exception syntax for Python 3 compatibility
StykMartin Jan 19, 2024
160f2d8
chore(transfer): format code and sort imports
StykMartin Jan 19, 2024
e182e16
fix(transfer): handle SSL error on Python 6
StykMartin Jan 19, 2024
fe53e5a
chore(test): format code and sort imports
StykMartin Jan 19, 2024
392ae39
fix(test): use proper octal format
StykMartin Jan 19, 2024
1babbbe
fix(pxemenu): use xmlrpc client and urllib from six
StykMartin Jan 19, 2024
b365a10
chore(pxemenu): format code and sort imports
StykMartin Jan 19, 2024
6a00b9a
fix(proxy): update exception syntax for Python 3 compatibility
StykMartin Jan 19, 2024
f8c3eea
fix(proxy): use xmlrpc move from six
StykMartin Jan 19, 2024
4698017
chore(proxy): format code and sort imports
StykMartin Jan 19, 2024
df27135
fix(provision): use xmlrpc move from six
StykMartin Jan 19, 2024
257de1c
fix(provision): use unicode from six
StykMartin Jan 19, 2024
97b3d7b
fix(provision): update exception syntax for Python 3 compatibility
StykMartin Jan 19, 2024
ceb0a39
fix(provision): handle iterators and dict with six
StykMartin Jan 19, 2024
cd9aecd
chore(provision): format code and sort imports
StykMartin Jan 19, 2024
f315c74
fix(netboot): import StringIO from six
StykMartin Jan 19, 2024
087709f
fix(netboot): use urllib from six
StykMartin Jan 19, 2024
08a8953
fix(netboot): use file mode compatible with six
StykMartin Jan 19, 2024
a078345
chore(netboot): format code and sort imports
StykMartin Jan 19, 2024
13505c8
fix(proxy-main): update exception syntax for Python 3 compatibility
StykMartin Jan 19, 2024
278181b
fix(proxy-main): make xmlrpc imports compatible
StykMartin Jan 19, 2024
8b842b6
chore(proxy-main): format code and sort imports
StykMartin Jan 19, 2024
96efb20
fix(log-storage): update exception syntax for Python 3 compatibility
StykMartin Jan 19, 2024
f9bdd1e
fix(log-storage): use proper octal format
StykMartin Jan 19, 2024
8842505
chore(log-storage): format code and sort imports
StykMartin Jan 19, 2024
40e6bb3
fix(pxemenu): use xmlrpc client and urllib from six
StykMartin Jan 19, 2024
2a9fbe2
chore(expire-distros): format code and sort imports
StykMartin Jan 19, 2024
283e3fd
fix(distro-import): update exception syntax for Python 3 compatibility
StykMartin Jan 19, 2024
ca713cd
fix(distro-import): call fn print
StykMartin Jan 19, 2024
313b0ee
fix(pxemenu): use xmlrpc client,urllib, and configparser from six
StykMartin Jan 19, 2024
9483957
fix(distro-import): use anonymous fn instead string::strip
StykMartin Jan 19, 2024
40a936e
chore(distro-import): format code and sort imports
StykMartin Jan 19, 2024
3919ba2
fix(distro-import): use is to compare None
StykMartin Jan 19, 2024
8857893
fix(distro-import): simplify comparison
StykMartin Jan 19, 2024
8df208e
chore(config): format code and sort imports
StykMartin Jan 19, 2024
8d649ec
fix(concurrency): update exception syntax for Python 3 compatibility
StykMartin Jan 19, 2024
1d2f31c
chore(concurrency): format code and sort imports
StykMartin Jan 19, 2024
daef1d6
chore(clear-netboot): format code and sort imports
StykMartin Jan 19, 2024
d1fce19
tests: use pytest w/ python3
StykMartin Jan 19, 2024
fa58bd9
fix(proxy): use absolute import to import utils
StykMartin Jan 19, 2024
5069073
ci: run unit tests during pull requests
StykMartin Jan 19, 2024
c14e540
fix(provision): use datetime fn instead on using custom
StykMartin Jan 19, 2024
75bef29
fix(common-helpers): always encode unicode to str
StykMartin Jan 19, 2024
497f87b
fix(spec): add werkzeug to builddeps
StykMartin Jan 19, 2024
964772c
fix(test): use file mode compatible with python six
StykMartin Jan 19, 2024
c0736c4
fix(test): always write bytes to test images
StykMartin Jan 19, 2024
9cc592f
fix(test): drop deprecated fn names
StykMartin Jan 19, 2024
12e8abc
fix(concurrency): wait till fileno is ready
StykMartin Jan 21, 2024
a06a82f
fix(concurrency): decode bytes
StykMartin Jan 21, 2024
b89bd27
test: extend test suite to validate concurrency
StykMartin Jan 21, 2024
3ac3e93
fix(proxy-main): use pywsgi instead of deprecated wsgi
StykMartin Jan 21, 2024
3fe496a
test: refactor _assert_process_group_is_removed to use psutil
StykMartin Jan 21, 2024
d83fbc9
ci: run init inside the container that forwards signals
StykMartin Jan 21, 2024
359066a
ci: define timeout for unit and integration tests
StykMartin Jan 21, 2024
4d08ee5
ci: run unit test on CentOS 9 Stream
StykMartin Jan 21, 2024
491b538
ci: show python environment
StykMartin Jan 21, 2024
af764fe
chore: remove init.d configuration
StykMartin Jan 21, 2024
0693013
Make source code cloneable on Windows
JohnVillalovos Dec 4, 2021
0d8c870
chore: remove unused check_output
StykMartin Jan 21, 2024
f8262f1
fix(proxy): explicitly convert ascii_control_chars to list for Py3 co…
StykMartin Jan 21, 2024
138a41d
fix(proxy): use raw string notation for ANSI escape code regex
StykMartin Jan 21, 2024
1932974
fix: use warning instead of deprecated warn in logger
StykMartin Jan 21, 2024
451e548
chore: noqa for broad exceptions in main loops
StykMartin Jan 21, 2024
11d6660
refactor(watchdog): remove shadowing for variable greenlet
StykMartin Jan 21, 2024
f6e8b00
fix(watchdog): handle missing 'Running' task in recipe abort process
StykMartin Jan 21, 2024
932c4aa
refactor(utils): remove unused code
StykMartin Jan 21, 2024
ab2bd02
refactor(test): rename 'id' to 'entity_id' to avoid shadowing built-i…
StykMartin Jan 21, 2024
3c38986
refactor(proxy): always close hub
StykMartin Jan 21, 2024
0ae7e70
docs: generate man page for beaker-import
StykMartin Jan 21, 2024
7062f60
ci: execute unit tests on fedora
StykMartin Jan 21, 2024
584f728
build: manage logrotate and log dir for py3 targets
StykMartin Jan 21, 2024
d84bcdd
refactor: use absolute import for utility bkr utility library
StykMartin Jan 21, 2024
20896e9
fix(provision): encode power env fields on python 2
StykMartin Jan 21, 2024
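
Several commit subjects above name recurring Python 2/3 compatibility idioms (octal literals, `except ... as`, `range`, raw-string regexes). A minimal sketch of those idioms follows. The names `DIR_MODE`, `ANSI_ESCAPE`, `strip_ansi`, and `parse_seconds` are hypothetical, and the regex is only an illustrative pattern, not the one used in the PR.

```python
import re

# "use proper octal format": 0o755 is accepted by Python 2.6+ and Python 3,
# whereas the old 0755 spelling is a SyntaxError on Python 3.
DIR_MODE = 0o755

# "use raw string notation for ANSI escape code regex": a raw string keeps
# backslashes intact for the regex engine and avoids invalid-escape warnings.
# Illustrative pattern only (assumption, not taken from the PR).
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def strip_ansi(text):
    """Remove simple ANSI colour/control sequences from text."""
    return ANSI_ESCAPE.sub("", text)


def parse_seconds(value, default=0):
    """Parse an integer, falling back to a default on bad input."""
    try:
        return int(value)
    # "update exception syntax": `except ValueError as exc` works on both
    # Python 2.6+ and 3; `except ValueError, exc` is Python 2 only.
    except ValueError as exc:
        print("falling back to %d: %s" % (default, exc))
        return default


# "use range instead of xrange": xrange does not exist on Python 3.
squares = [n * n for n in range(4)]
```

For small loops like this one, plain `range` is fine on both interpreters; `six.moves.range` is an option where the Python 2 list allocation would matter.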
180 changes: 106 additions & 74 deletions LabController/src/bkr/labcontroller/watchdog.py
@@ -1,27 +1,26 @@

# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.

import os
import sys
import signal
import logging
import time
import socket
import signal
Contributor

Did you want to sort the imports in alphabetical order? I think you did this last time. Up to you.

Contributor Author

Definitely, this should be fixed.

import subprocess
import lxml.etree
import daemon
from daemon import pidfile
import sys
from optparse import OptionParser
import gevent, gevent.hub, gevent.event, gevent.monkey
from bkr.labcontroller.proxy import ProxyHelper, Monitor
from bkr.labcontroller.config import load_conf, get_conf
from bkr.log import log_to_stream, log_to_syslog

import daemon
import gevent
import gevent.event
import gevent.hub
import gevent.monkey
import lxml.etree
from daemon import pidfile
from six.moves import xmlrpc_client

from bkr.labcontroller.config import get_conf, load_conf
from bkr.labcontroller.proxy import Monitor, ProxyHelper
from bkr.log import log_to_stream, log_to_syslog

# Like beaker-provision and beaker-transfer, this daemon is structured as
# a polling loop. Each iteration of the loop, it asks Beaker for the list of
@@ -40,10 +39,12 @@
# Note that we must construct it *after* we daemonize, inside the main loop below.
shutting_down = None


def shutdown_handler(signum, frame):
logger.info('Received signal %s, shutting down', signum)
logger.info("Received signal %s, shutting down", signum)
shutting_down.set()


def run_monitor(monitor):
while True:
updated = monitor.run()
@@ -53,97 +54,118 @@ def run_monitor(monitor):
if shutting_down.is_set():
break
else:
if shutting_down.wait(timeout=monitor.conf.get('SLEEP_TIME', 20)):
if shutting_down.wait(timeout=monitor.conf.get("SLEEP_TIME", 20)):
break

class Watchdog(ProxyHelper):

class Watchdog(ProxyHelper):
def __init__(self, *args, **kwargs):
super(Watchdog, self).__init__(*args, **kwargs)
self.monitor_greenlets = {} #: dict of (recipe id -> greenlet which is monitoring its console log)
self.monitor_greenlets = (
{}
) #: dict of (recipe id -> greenlet which is monitoring its console log)

def get_active_watchdogs(self):
logger.debug('Polling for active watchdogs')
logger.debug("Polling for active watchdogs")
try:
return self.hub.recipes.tasks.watchdogs('active')
return self.hub.recipes.tasks.watchdogs("active")
except xmlrpc_client.Fault as fault:
if 'not currently logged in' in fault.faultString:
logger.debug('Session expired, re-authenticating')
if "not currently logged in" in fault.faultString:
logger.debug("Session expired, re-authenticating")
self.hub._login()
return self.hub.recipes.tasks.watchdogs('active')
return self.hub.recipes.tasks.watchdogs("active")
else:
raise

def get_expired_watchdogs(self):
logger.debug('Polling for expired watchdogs')
logger.debug("Polling for expired watchdogs")
try:
return self.hub.recipes.tasks.watchdogs('expired')
return self.hub.recipes.tasks.watchdogs("expired")
except xmlrpc_client.Fault as fault:
if 'not currently logged in' in fault.faultString:
logger.debug('Session expired, re-authenticating')
if "not currently logged in" in fault.faultString:
logger.debug("Session expired, re-authenticating")
self.hub._login()
return self.hub.recipes.tasks.watchdogs('expired')
return self.hub.recipes.tasks.watchdogs("expired")
else:
raise

def abort(self, recipe_id, system):
# Don't import this at global scope. It triggers gevent to create its default hub,
# but we need to ensure the gevent hub is not created until *after* we have daemonized.
from bkr.labcontroller.concurrency import MonitoredSubprocess
logger.info('External Watchdog Expired for recipe %s on system %s', recipe_id, system)

logger.info(
"External Watchdog Expired for recipe %s on system %s", recipe_id, system
)
if self.conf.get("WATCHDOG_SCRIPT"):
job = lxml.etree.fromstring(self.get_my_recipe(dict(recipe_id=recipe_id)))
recipe = job.find('recipeSet/guestrecipe')
recipe = job.find("recipeSet/guestrecipe")
if recipe is None:
recipe = job.find('recipeSet/recipe')
for task in recipe.iterfind('task'):
if task.get('status') == 'Running':
recipe = job.find("recipeSet/recipe")
for task in recipe.iterfind("task"):
if task.get("status") == "Running":
break
task_id = task.get('id')
args = [self.conf.get('WATCHDOG_SCRIPT'), str(system), str(recipe_id), str(task_id)]
logger.debug('Invoking external watchdog script %r', args)
p = MonitoredSubprocess(args,
stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
timeout=300)
logger.debug('Waiting on external watchdog script pid %s', p.pid)
task_id = task.get("id")
args = [
self.conf.get("WATCHDOG_SCRIPT"),
str(system),
str(recipe_id),
str(task_id),
]
logger.debug("Invoking external watchdog script %r", args)
p = MonitoredSubprocess(
args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, timeout=300
)
logger.debug("Waiting on external watchdog script pid %s", p.pid)
p.dead.wait()
output = p.stdout_reader.get()
if p.returncode != 0:
logger.error('External watchdog script exited with status %s:\n%s',
p.returncode, output)
logger.error(
"External watchdog script exited with status %s:\n%s",
p.returncode,
output,
)
else:
try:
extend_seconds = int(output)
except ValueError:
logger.error('Expected external watchdog script to print number of seconds '
'to extend watchdog by, got:\n%s', output)
logger.error(
"Expected external watchdog script to print number of seconds "
"to extend watchdog by, got:\n%s",
output,
)
else:
logger.debug('Extending T:%s watchdog %d', task_id, extend_seconds)
logger.debug("Extending T:%s watchdog %d", task_id, extend_seconds)
self.extend_watchdog(task_id, extend_seconds)
# Don't abort it here, we assume the script took care of things.
return
self.recipe_stop(recipe_id, 'abort', 'External Watchdog Expired')
self.recipe_stop(recipe_id, "abort", "External Watchdog Expired")

def spawn_monitor(self, watchdog):
monitor = Monitor(watchdog, self)
greenlet = gevent.spawn(run_monitor, monitor)
self.monitor_greenlets[watchdog['recipe_id']] = greenlet
self.monitor_greenlets[watchdog["recipe_id"]] = greenlet

def completion_callback(greenlet):
if greenlet.exception:
logger.error('Monitor greenlet %r had unhandled exception: %r',
greenlet, greenlet.exception)
del self.monitor_greenlets[watchdog['recipe_id']]
logger.error(
"Monitor greenlet %r had unhandled exception: %r",
greenlet,
greenlet.exception,
)
del self.monitor_greenlets[watchdog["recipe_id"]]

greenlet.link(completion_callback)

def poll(self):
for expired_watchdog in self.get_expired_watchdogs():
try:
recipe_id = expired_watchdog['recipe_id']
system = expired_watchdog['system']
recipe_id = expired_watchdog["recipe_id"]
system = expired_watchdog["system"]
self.abort(recipe_id, system)
except Exception:
# catch and ignore here, so that we keep going through the loop
logger.exception('Failed to abort expired watchdog')
logger.exception("Failed to abort expired watchdog")
if shutting_down.is_set():
return
# Get active watchdogs *after* we finish running
@@ -153,15 +175,16 @@ def poll(self):
active_watchdogs = self.get_active_watchdogs()
# Start a new monitor for any active watchdog we are not already monitoring.
for watchdog in active_watchdogs:
if watchdog['recipe_id'] not in self.monitor_greenlets:
if watchdog["recipe_id"] not in self.monitor_greenlets:
self.spawn_monitor(watchdog)
# Kill any running monitors that are gone from the list.
active_recipes = set(w['recipe_id'] for w in active_watchdogs)
active_recipes = set(w["recipe_id"] for w in active_watchdogs)
for recipe_id, greenlet in list(self.monitor_greenlets.items()):
if recipe_id not in active_recipes:
logger.info('Stopping monitor for recipe %s', recipe_id)
logger.info("Stopping monitor for recipe %s", recipe_id)
greenlet.kill()


def main_loop(watchdog, conf):
global shutting_down
shutting_down = gevent.event.Event()
@@ -170,25 +193,29 @@ def main_loop(watchdog, conf):
signal.signal(signal.SIGINT, shutdown_handler)
signal.signal(signal.SIGTERM, shutdown_handler)

logger.debug('Entering main watchdog loop')
logger.debug("Entering main watchdog loop")
while True:
try:
watchdog.poll()
except:
logger.exception('Failed to poll for watchdogs')
if shutting_down.wait(timeout=conf.get('SLEEP_TIME', 20)):
gevent.hub.get_hub().join() # let running greenlets terminate
logger.exception("Failed to poll for watchdogs")
if shutting_down.wait(timeout=conf.get("SLEEP_TIME", 20)):
gevent.hub.get_hub().join() # let running greenlets terminate
break
logger.debug('Exited main watchdog loop')
logger.debug("Exited main watchdog loop")


def main():
parser = OptionParser()
parser.add_option("-c", "--config",
help="Full path to config file to use")
parser.add_option("-f", "--foreground", default=False, action="store_true",
help="run in foreground (do not spawn a daemon)")
parser.add_option("-p", "--pid-file",
help="specify a pid file")
parser.add_option("-c", "--config", help="Full path to config file to use")
parser.add_option(
"-f",
"--foreground",
default=False,
action="store_true",
help="run in foreground (do not spawn a daemon)",
)
parser.add_option("-p", "--pid-file", help="specify a pid file")
(opts, args) = parser.parse_args()
if opts.config:
load_conf(opts.config)
@@ -197,10 +224,12 @@ def main():
conf = get_conf()
pid_file = opts.pid_file
if pid_file is None:
pid_file = conf.get("WATCHDOG_PID_FILE", "/var/run/beaker-lab-controller/beaker-watchdog.pid")
pid_file = conf.get(
"WATCHDOG_PID_FILE", "/var/run/beaker-lab-controller/beaker-watchdog.pid"
)

# HubProxy will try to log some stuff, even though we
# haven't configured our logging handlers yet. So we send logs to stderr
# HubProxy will try to log some stuff, even though we
# haven't configured our logging handlers yet. So we send logs to stderr
# temporarily here, and configure it again below.
log_to_stream(sys.stderr, level=logging.WARNING)
try:
@@ -215,14 +244,17 @@ def main():
else:
# See BZ#977269
watchdog.close()
with daemon.DaemonContext(pidfile=pidfile.TimeoutPIDLockFile(
pid_file, acquire_timeout=0), detach_process=True):
log_to_syslog('beaker-watchdog')
with daemon.DaemonContext(
pidfile=pidfile.TimeoutPIDLockFile(pid_file, acquire_timeout=0),
detach_process=True,
):
log_to_syslog("beaker-watchdog")
try:
main_loop(watchdog, conf)
except Exception:
logger.exception('Unhandled exception in main_loop')
logger.exception("Unhandled exception in main_loop")
raise

if __name__ == '__main__':

if __name__ == "__main__":
main()
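
A note on `get_active_watchdogs` and `get_expired_watchdogs` in the diff above: both repeat the same "re-authenticate and retry once when the session has expired" logic around `xmlrpc_client.Fault`. Below is a minimal sketch of how that pattern could be factored out. `call_with_relogin` and `FakeHub` are hypothetical and not part of this PR. The fallback import is only there so the sketch runs where `six` is not installed.

```python
try:
    from six.moves import xmlrpc_client  # same import the diff introduces
except ImportError:
    # Fallback (Python 3 only) so the sketch runs without six installed.
    import xmlrpc.client as xmlrpc_client


def call_with_relogin(call, login):
    """Run call(); if the session has expired, log in again and retry once.

    Hypothetical helper for illustration only.
    """
    try:
        return call()
    except xmlrpc_client.Fault as fault:
        if "not currently logged in" in fault.faultString:
            login()
            return call()
        raise  # any other fault is propagated unchanged


class FakeHub(object):
    """Stand-in for the real hub: the first call fails with an expired session."""

    def __init__(self):
        self.logged_in = False
        self.logins = 0

    def login(self):
        self.logged_in = True
        self.logins += 1

    def watchdogs(self, status):
        if not self.logged_in:
            raise xmlrpc_client.Fault(1, "not currently logged in")
        return [{"recipe_id": 1, "system": "host.example", "status": status}]


hub = FakeHub()
result = call_with_relogin(lambda: hub.watchdogs("active"), hub.login)
```

With such a helper, each polling method would reduce to a single call that passes the status string, at the cost of one more level of indirection. Whether that is worth doing in this PR is a judgment call for the author.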