A fork of redis-oplog for infinite scalability #362
Adding @afrokick, @edemaine, @adamgins, @hanselke, @znewsham, @vlasky, @karen2k, @Floriferous, @nathan-muir, @rj-david.
Sorry if you feel spammed, pls unsubscribe if not interested. Trying to involve the most recently-active users.
@ramezrafla Did you think about merging your work into this repo?
This sounds very cool! I do worry about cache consistency / race conditions though. In my app, several users/servers could be modifying the same object at almost the same time, so I wouldn't fully trust the cached copy -- perhaps while you're doing a local update there's also an update coming (but not yet delivered) via Redis from another server? Is there a way to detect this and fix things?
But you nailed it, to avoid race conditions the current redis-oplog is cumbersome and heavy in db hits.

@afrokick Jack of all trades, master of none :)
This does sound cool, I'd be worried about the memory cost of this though:
@znewsham
Instead of the full doc we could look into fetching only the needed fields -- but that complicates things and somewhat negates the approach of not making db calls. It all depends ...
Regarding the timeout, presumably it is a timeout that runs after the last observer has stopped observing? I.e., it will never expire while it's still needed? I think the storage of full documents would need to be an option for this to be useful. I'm thinking of a few different scenarios here:
I looked into this issue a while ago, and took a slightly different approach using a global map of documents for a collection. The individual observers all point to this global map rather than maintaining their own document store, and add/remove fields to it as necessary, so regardless of how many observers you have, a document exists exactly once while observed. This solves part of the memory problem but not all of it - the other part is in the individual subscriptions (per session). I resolved this by, in some cases, removing the mergebox - in a minority of cases this results in too many changes being sent to the client (typically where reactivity is being managed by the server, e.g., counts), but in most cases it results in no extra network use and vastly lower CPU/memory usage.

I'm also interested in the 80% -> 7% CPU drop on your primary: is this mostly caused by the cache (not needing to hit the DB at all) or by spreading the workload across many secondaries?
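A minimal sketch of the shared document store described above (my own illustration with made-up names, not the actual implementation): every observer references the same cached copy of a document, and a reference count tracks when the last observer lets go.

```js
const globalDocStore = new Map(); // collectionName -> Map(_id -> { doc, refCount })

function retainDoc(collectionName, _id, fields) {
  let docs = globalDocStore.get(collectionName);
  if (!docs) {
    docs = new Map();
    globalDocStore.set(collectionName, docs);
  }
  let entry = docs.get(_id);
  if (!entry) {
    entry = { doc: {}, refCount: 0 };
    docs.set(_id, entry);
  }
  entry.refCount += 1;              // another observer now points at the same copy
  Object.assign(entry.doc, fields); // add the fields this observer needs
  return entry.doc;                 // observers hold a reference, not their own clone
}

function releaseDoc(collectionName, _id) {
  const docs = globalDocStore.get(collectionName);
  const entry = docs && docs.get(_id);
  if (entry && --entry.refCount === 0) docs.delete(_id); // last observer stopped
}
```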
So the way we did it is based on access. Even if you are subscribed to the document but it never changes, why would we keep it in cache? So I like the timeout approach more than server-side observers. We fetch the doc when it is needed (say

In terms of full-doc vs selected fields: I see the issue. We don't have large docs by design. We could look into it, maybe have an extra setting. I am trying to avoid a swiss-army approach; I'd rather fit a specific need for 95% of the people than have it bloated and hard to maintain / buggy. Something to think about.

The drop in CPU for the primary was due to caching, i.e. fewer hits. The 2 secondaries saw an increase of only ~8% each due to secondary reads. So effectively we saw a global drop due to caching of 80 - 7 - 8 * 2 = ~60%. I know we can do better with our app and reduce db hits further (e.g. more ID-based calls) -- if a
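A rough sketch of the access-based cache with timeout eviction described here (an assumed shape, not the fork's actual code; `fetchFromDb` stands in for a real lookup against Mongo):

```js
class DocCache {
  constructor(fetchFromDb, timeoutMs = 30 * 60 * 1000) { // eviction window is illustrative
    this.fetchFromDb = fetchFromDb; // e.g. (_id) => collection.findOne(_id) against the DB
    this.timeoutMs = timeoutMs;
    this.docs = new Map();          // _id -> { doc, lastAccess }
    setInterval(() => this.evictStale(), timeoutMs);
  }

  get(_id) {
    let entry = this.docs.get(_id);
    if (!entry) {
      const doc = this.fetchFromDb(_id); // cache miss: the one real DB hit
      if (!doc) return undefined;
      entry = { doc };
      this.docs.set(_id, entry);
    }
    entry.lastAccess = Date.now();       // any access keeps the doc alive
    return entry.doc;
  }

  evictStale() {
    const cutoff = Date.now() - this.timeoutMs;
    for (const [_id, entry] of this.docs) {
      if (entry.lastAccess < cutoff) this.docs.delete(_id); // unused docs drop out
    }
  }
}
```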
Got it. Regarding the cache - we use it for super fast lookups (fields that are heavily referenced by server-side methods); there's no way of knowing which documents are required in advance, so we load everything. We could probably improve this with locality - many older documents aren't heavily required by the server, so we could probably pay the price of the DB round trip in those cases.
@ramezrafla and everyone, I repeat and stress this: RedisOplog is not going into disinvestment. We are taking care of big issues and we still want to ensure a good experience. The option you mention can be easily implemented as an iteration over the "protection against race conditions" kind of subscription; we need to merge it in. Your solution completely overrides the race-condition protection. Most of my clients, even private ones, handle over 10k users with the current version of RedisOplog and the use of channels for subscriptions; CPU is stable and race-condition proofing stays in place.
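For context, the channel-scoped subscriptions mentioned here look roughly like this with the current redis-oplog (based on its README; the collection and the `thread::` channel naming are illustrative):

```js
import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';

const Messages = new Mongo.Collection('messages');

// The publication listens on a dedicated Redis channel instead of the
// collection-wide one, so unrelated writes never reach these observers.
Meteor.publish('thread.messages', function (threadId) {
  return Messages.find({ threadId }, { channel: `thread::${threadId}` });
});

// Mutations dispatch to the same channel, keeping reactivity scoped and cheap.
Meteor.methods({
  'thread.sendMessage'(threadId, text) {
    Messages.insert(
      { threadId, text, createdAt: new Date() },
      { channel: `thread::${threadId}` }
    );
  },
});
```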
Thanks @theodorDiaconu. Sorry if you were offended by my comment - that was not the intent. The pace of updates has admittedly slowed down, and there are many issues + PRs open. If this is sufficient for your customers' needs, then great. And also a thank you for our private conversations.

As discussed, I am gauging community interest in this approach to reach the next level of scalability. We are not talking 10k users as you mention but millions of users. The current approach will not scale well (at least not for us), but may well be sufficient for your customers. We lost business because of it as the servers failed in production and we had to scramble to find a solution.

This new redis-oplog is the result of long hours of work and testing and works great in production (for us). If no one is interested, we will proceed on our own. It's a business imperative for us.
@ramezrafla if you are interested in releasing this and having it maintained by the community, the Meteor Community Packages org would make a great home.
@copleykj Releasing a package for the community takes time and effort to prepare, document, and support. That's the point of this thread: are others facing the same issues, and are they interested? Seems like no.
It's good to explore new ideas! I think the only question here is whether a fork, a merge, or something else is better. Publishing your code would be a good start, so we can explore the differences/similarities... A natural question is whether some of your ideas can be incorporated into this package, e.g., when
I think it's worth releasing it - it's always good to get multiple sets of eyes on it. Perhaps the way you've solved a specific set of problems is better than the way others have, or perhaps there is a way to make your changes less drastic (e.g., optional) so they become a better candidate for merging. Hard to tell without seeing any code :)
@edemaine @copleykj It's a philosophical difference; I can't see how we can merge. But I'll leave it up to you to give me your feedback.
New repo is online: Please don't use it in production until you are sure it is working for you in a staging environment with multiple servers.
@ramezrafla since you mentioned it, have you tried using the Go application in your use-case to see what kind of improvements it would give you? I didn't write the application (credits to @benweissmann; @mammothbane has also contributed quite a lot lately) but I'm a heavy user of

For this reason the combination fitted quite well for us, because it allowed us to skip evaluating changes on a collection nobody has a publication open on. Your use-case might well require a different solution. But I'm thrilled to hear about your use-case - or where you see the biggest bottlenecks. Is your application write- or read-heavy, or is it rather balanced? You've most likely already tried the fine-tuning tips ... Would be nice to have a higher-level perspective on the things you were facing.
@SimonSimCity thanks for your message.

First, a side question: based on your description of your application, why don't you just use the regular mongo oplog? Meteor was specifically designed for your use-case. You wouldn't need to bother with at least 3 other external packages and services (redis, oplogtoredis and redis-oplog).

Based on what you described, our application is the exact opposite of your use-case: we are heavy-reads with a large real-time user base. Which means all the duplicated data that redis-oplog stores kills our memory (I mentioned it to @theodorDiaconu, the 2x duplication of data in

Personally, when things went sour and I had to go into the code to figure things out and saw this trivial bug, I lost confidence in the original package. We lost business because of this! People can say all they want,

Our challenge gets even more complicated when we scale up. The heavy reads killed the memory and the DB; we hit 100% on the primary node -- this is especially true with the continuous reads to avoid race conditions. Caching and mutating locally solved the issue for us (since you have the data, you mutate it and send it to your users without waiting for db results).

Now ... we do intend on using

I hope I answered well. Please let me know if I can clarify further.
@SimonSimCity
Well, because each Meteor instance, using the traditional mongo oplog, scans through the full oplog on the database. Our write-heavy workload changes documents which are not always monitored by the user, and the load of those operations kept our instances busy at 50% CPU - which was the reason we opted in for

Would be nice to have a chat with you on the Meteor Community Slack group (https://github.com/Meteor-Community-Packages/organization#slack). Just ping me there privately.
Just now seeing this. Been busy with our production app seeing a lot of use. Something I'm seeing when our Galaxy containers get a lot of users (e.g. four Galaxy Quad containers sharing 3K to 5K users - which sadly doesn't seem like a lot for four quads) is dropped DDP or Redis commands. A small number of users' apps will just not get the update to stay in sync with the presentation. Could this be related?
@evolross been having a similar issue with my chatrooms using redis-oplog. I've got the subscription using a unique
@jasongrishkoff Have you tried experimenting with @ramezrafla's fork yet? I'm going to be getting to this in the next week or so, so I should know if it helps.
Yes @evolross it looks very promising, but as soon as I rolled it to production my 3x quad galaxy containers all hit 100% CPU and crashed. I've opened a few issues that @ramezrafla is looking into :)
Thanks @evolross and @jasongrishkoff

mini-mongo doesn't support positional operators
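For anyone unfamiliar, a positional-operator update of the kind referred to above looks like this (`Tasks`, `items`, and the selector are illustrative names, not from either package):

```js
import { Mongo } from 'meteor/mongo';

const Tasks = new Mongo.Collection('tasks');

function markDraftItemDone(taskId) {
  Tasks.update(
    { _id: taskId, 'items.name': 'draft' },
    { $set: { 'items.$.status': 'done' } } // '$' targets the array element matched by the selector
  );
}
```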
Just a thought (since there are so many here who have experience with Meteor reactivity at scale)... has anyone experimented with going back to plain old oplog-tailing? Perhaps with larger-sized containers in a smaller quantity? I noticed Galaxy now offers 8x and 16x containers. We run on Galaxy, so this thought had crossed my mind when I saw that those were available. I wondered about performance with oplog-tailing and a smaller number of containers.
Honestly, the main reason I've been experimenting with

I get the idea that @ramezrafla's solution would help considerably take pressure off my database because of the caching involved? But I worry about that cache properly invalidating / propagating changes to users when it needs to.
I truly feel your frustration. We faced the exact same issue. To be honest, I would never go back to oplog-tailing as you cannot scale: you know the drill, as more meteor instances come on board, the load from just watching the oplog grows until you hit a breaking point where you are spending more time looking at the oplog than doing actual work.

My solution DOES work and has been in production for a few months now. I personally invested hours on it and it's crafted and tested with care. I almost lost business because of the original

To get to what you want is really trivial. It's 95% there. Your need is an escape hatch (i.e. a bypass) at the beginning of
There are many mechanisms in place to automatically update the data when it changes, including a race-condition detector which pulls from the DB when there is a risk of the data being stale.
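As a hedged illustration only (my own sketch of how such a staleness check could work, not the fork's actual mechanism): each Redis event could carry a per-document version, and any gap in the sequence forces a refetch from the DB.

```js
function onRedisUpdate(collection, cache, event) {
  // event is assumed to look like { _id, version, changedFields } -- an
  // illustrative message shape, not the fork's real format.
  const cached = cache.get(event._id);
  if (!cached) return; // nothing here is observing this doc
  if (event.version === cached.version + 1) {
    Object.assign(cached.doc, event.changedFields); // in sequence: apply the diff locally
  } else {
    // Missed or out-of-order event: the cached copy may be stale, so pay one DB hit.
    cached.doc = collection.findOne(event._id);
  }
  cached.version = event.version;
}
```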
@ramezrafla 100% down to test. I'll find some time today to send you a brief overview of how my app works just in case you see areas that might cause concern.
Just an update: this morning it dawned on me that I could offload my cron jobs to my mongodb secondary replica sets using
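The comment above is cut off; one common way to direct such batch reads at secondaries (an assumption about what was meant, not a quote from the thread) is the driver's read preference, reachable from Meteor via `rawCollection()`:

```js
import { Mongo } from 'meteor/mongo';

const Reports = new Mongo.Collection('reports'); // illustrative collection

// rawCollection() exposes the Node MongoDB driver collection, which accepts a
// readPreference per operation, so heavy jobs can read from secondaries.
async function nightlyAggregation(since) {
  return Reports.rawCollection()
    .find({ createdAt: { $gte: since } }, { readPreference: 'secondaryPreferred' })
    .toArray();
}
```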
@evolross @SimonSimCity @maxnowack
(pls feel free to tag more people)

Also, redis-oplog is slowly going into disinvestment.

We created a fork (not public yet, considering my options) which does the following (more technical notes below):

- Mutations: the current redis-oplog does a `find`, then the `update`, then a `find` again, which has 2 more hits than needed; we cut those extra hits.
- On `insert`, we build the doc ourselves and send it to the other instances.

RESULTS:

Here is the technical:

- Caching: documents are kept in a server-side cache (instead of being re-fetched with the `fields: {}` option). `collection.findOne` fetches from that cache -- this results in cache hits of 85-98% for our app.
- `Mutator`: we mutate what is in the cache ourselves (if it's not there, we pull it from the DB and mutate it). In other words, we don't do an `update` followed by a `find`, so it is usually a single DB hit (the `update`). We also do a diff to dispatch only the fields that have changed. Same thing with `insert`: we build the doc and dispatch it fully to redis (a sketch of this flow follows below).
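To make the mutator idea concrete, a condensed sketch of the flow in Meteor-style JavaScript (my own paraphrase of the description above, not code from the fork; `applyModifier` and `diffFields` are hypothetical helpers):

```js
function cachedUpdate(collection, cache, redis, _id, modifier) {
  // applyModifier stands in for a minimongo-style modifier application,
  // diffFields for a shallow comparison returning only the changed fields.
  const before = cache.get(_id) || collection.findOne(_id); // pull into the cache on a miss
  const after = applyModifier(before, modifier);            // mutate the in-memory copy
  cache.set(_id, after);

  // Dispatch only the changed fields to Redis right away, without waiting on Mongo.
  redis.publish(collection._name, JSON.stringify({
    e: 'update',
    _id,
    fields: diffFields(before, after),
  }));

  // The write itself is then the single real DB hit (no find before, no find after).
  collection.rawCollection().updateOne({ _id }, modifier);
}
```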
QUESTION:
Is this of interest? Should we have a community version of redis-oplog that we all maintain together?