Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metricd: remove janitor process #704

Closed
wants to merge 1 commit into from
Closed

Conversation

jd
Copy link
Member

@jd jd commented Jan 30, 2018

This merge the janitor process in the worker 0 of the metric processor. This
saves some megabytes of memory.

Closes #652

Copy link
Member

@chungg chungg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i just remember why i didn't move forward with this. i imagine this would affect your plans to do indexer-less metricd

gnocchi/opts.py Outdated
@@ -149,12 +149,6 @@ def list_opts():
help="How many seconds to wait between "
"metric ingestion reporting. Set value to -1 to "
"disable reporting"),
cfg.IntOpt('metric_cleanup_delay',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably keep this? part of the issue is that it does nothing on cleanup also as there's little to cleanup. if we remove this, it's checking to do nothing even more often.

Copy link
Member Author

@jd jd Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I'm smart enough to do that. I'll take a look :)

@jd
Copy link
Member Author

jd commented Jan 30, 2018

I actually implemented that on the wrong class.

This merge the janitor process in the worker 0 of the metric processor. This
saves some megabytes of memory.

Closes gnocchixyz#652
@jd
Copy link
Member Author

jd commented Jan 30, 2018

Pull-request updated, HEAD is now 00ff687

@jd
Copy link
Member Author

jd commented Jan 30, 2018

@chungg I'll do the same for the reporting process is that works for you.

@chungg
Copy link
Member

chungg commented Jan 30, 2018

you'll need to set a constraint that reporting interval > processing interval

i'm still slightly concerned that this locks us to indexer. if this is only to lower memory, we will probably save more not loading oslo_db in n*workers vs merging janitor+processing.

@jd
Copy link
Member Author

jd commented Jan 31, 2018

you'll need to set a constraint that reporting interval > processing interval

Same goes for cleanup already. I'll add some code to make that ok.

i'm still slightly concerned that this locks us to indexer. if this is only to lower memory, we will probably save more not loading oslo_db in n*workers vs merging janitor+processing.

I don't understand what you mean here by "locks us to indexer".

@chungg
Copy link
Member

chungg commented Jan 31, 2018

I don't understand what you mean here by "locks us to indexer".

it's my understanding that janitor will always require indexer. if we merge processing and janitor, then the processing worker will always have indexer so i don't know how you'd achieve #548.

i thought about #548 a little bit last night and i'm not entirely sure how that'd be achieved regardless.

@sileht
Copy link
Member

sileht commented Jan 31, 2018

As you want for this, but my preference is for isolated process than saving memory.

@jd
Copy link
Member Author

jd commented Jan 31, 2018

Yeah. The real question is how much memory a process like janitor takes @chungg?

@chungg
Copy link
Member

chungg commented Jan 31, 2018

Yeah. The real question is how much memory a process like janitor takes @chungg?

the janitor process is 40% of total memory and it decreases as a percentage significantly the more workers/children an agent has. you'll save significantly more memory if you can somehow remove the indexer from processing agents.

as i mentioned, this isn't a big issue. i just wanted to point out that the janitor is mostly idle and even in very large environments, i don't think it has much work... was looking to see if there was an easy hack but it's not the end of world if we have an idle janitor per agent.

@chungg
Copy link
Member

chungg commented Jan 31, 2018

so my equivalent comparison would be comparing this to ceilometer-expirer which really only needed to be run every few days... except the janitor has signficantly less work to do than ceilometer-expirer.

@jd
Copy link
Member Author

jd commented Jan 31, 2018

40% of total memory of all metricd workers? that depends on the number of workers?
do you have a size in MB?

@chungg
Copy link
Member

chungg commented Jan 31, 2018

if there is only one processing worker... the janitor and reporting workers are always there (you can disable reporting process technically). reporting worker is roughly 1/2 memory of processing workers... janitor worker is roughly same as processing worker.

for example, if you have one processing worker in a metricd, memory percentage used by janitor is n/(2.5n) (where n is memory of a single processing agent. ~80MB?). if you have 10 processing workers in metricd, it's n/(9.5n) which is significantly less.

but yeah, like i said, it's not GB of memory and if you have a large backwindow, the memory used to actually load data would dwarf anything the janitor uses at idle.

@jd
Copy link
Member Author

jd commented Feb 1, 2018

Ok… Well I'll just close this for the time being. I think it's good we discussed it, but process isolation strikes me as more important for now.

@jd jd closed this Feb 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants