-
@rdettai This is really, really cool. We still need Redis for synchronization, but adding something like that as a shortcut could be really powerful. Would you be in a position to make an introduction to that team by any chance? I would love to work with them. I have many good friends who came from ETH, and they tend to be really good at what they do...
-
@rdettai I would not worry too much about Redis not being purely serverless. We must be pragmatic there. As long as it's managed by the cloud provider, it's good enough. Shuffling is critical if you want to support interesting SQL queries, and SNS+SQS have unacceptable latencies (100 ms or more, versus sub-millisecond for Redis). Redis will give us a very solid foundation to build upon, at a very low cost. Once we have the platform running, we can spend time optimizing things on that front. But if we try to build everything at once, we won't deliver anything good anytime soon. Let's not reinvent the wheel... Thanks a ton for your help. I will reach out directly.
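To make the shuffle step concrete, here is a minimal sketch of a hash-partitioned shuffle through a list store. Everything in it is an assumption for illustration: `InMemoryLists` is a hypothetical in-process stand-in that only mimics the semantics of Redis `RPUSH` and `LRANGE`, and the key layout `stage:part:N`, the helper names, and the sum aggregation are invented here. It is not the project's actual implementation.

```python
import zlib
from collections import defaultdict


class InMemoryLists:
    """Hypothetical stand-in for a Redis-like list store (not a real client)."""

    def __init__(self):
        self._lists = defaultdict(list)

    def rpush(self, key, *values):
        # Append values to the tail of the list, like Redis RPUSH.
        self._lists[key].extend(values)
        return len(self._lists[key])

    def lrange(self, key, start, stop):
        # Like Redis LRANGE: stop is inclusive, and -1 means "to the end".
        items = self._lists.get(key, [])
        return items[start:] if stop == -1 else items[start:stop + 1]


def partition_for(key, num_partitions):
    # Deterministic hash so every mapper routes a given key identically.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def map_side(store, stage, rows, num_partitions):
    # Each mapper pushes its (key, value) rows into per-partition lists.
    for key, value in rows:
        p = partition_for(key, num_partitions)
        store.rpush(f"{stage}:part:{p}", (key, value))


def reduce_side(store, stage, partition):
    # Each reducer reads exactly one partition and aggregates by key.
    sums = defaultdict(int)
    for key, value in store.lrange(f"{stage}:part:{partition}", 0, -1):
        sums[key] += value
    return dict(sums)
```

With a real Redis client the values would have to be serialized before being pushed, and every push and read would cost a network round trip, which is where the latency difference discussed above matters.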
-
@ghalimi @rdettai I'm one of the authors of the FMI project, and I have been its maintainer since the student working with us moved to a different domain. While our work focuses primarily on high-performance computing, other use cases, such as data processing and analytics, are close to ours, and we want to explore them one day. In the end, many parallel computing principles stay the same across domains :-) Shuffling and data exchange should benefit from the latency and bandwidth of direct connections. Direct connections also fit use cases that might not require persistent data storage, e.g., Spark RDDs and other intermediate containers. FMI is focused on collective operations for parallel computing, but it should be relatively easy to adjust it towards more generic operations; we support P2P communication natively. Please let me know if you have any questions or find the project interesting and useful for your product, even at later stages when optimizing Redis-based communication. I would be happy to discuss it further and hear about your experiences in building distributed computations on top of serverless functions.
-
I am the main author of the https://github.com/cloudfuse-io/lambdatization repository. I would like to get in touch with you regarding your decision to use Redis as a communication medium. I think this defeats the point of having a serverless compute engine and might not play well with queries that need very large shuffles between some stages (sorts, joins...). I was wondering if you knew about the research at ETH Zurich on creating direct connections between lambdas using TCP hole punching: https://arxiv.org/pdf/2202.06646.pdf