
Layers in a monorepo project #8

Open
Pavel910 opened this issue Aug 15, 2019 · 12 comments

@Pavel910

Hi @eahefnawy, as I mentioned in our conversation, a monorepo is a very practical way to organize an entire project, so I wanted to discuss this a little more and share my personal insight into the matter and the issues I encountered. I'd love to use layers to speed up deploys, so let's go step by step:

Current solution

The way the component determines whether it should deploy a layer is a simple check for the existence of node_modules in the inputs.code folder:
https://github.com/serverless-components/aws-lambda/blob/master/serverless.js#L86
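
For reference, that check boils down to something roughly like this (my own paraphrase, not the exact code from serverless.js):

const fs = require("fs")
const path = require("path")

// A layer is only deployed if inputs.code contains a node_modules folder.
const shouldDeployLayer = (inputs) =>
  fs.existsSync(path.join(inputs.code, "node_modules"))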

Issues

There are 2 issues I ran into with this approach:

  1. In a monorepo project, dependencies are hoisted to the root folder of the entire project (the default behavior of yarn workspaces). Hoisting is good, as it reduces the amount of duplicate code in your project, but once hoisted, the dependencies are no longer located in {function}/node_modules, so the layer component would basically deploy an incomplete set of dependencies.
  2. More often than not, projects use Babel/TypeScript (standalone or with webpack), so before deployment there is a build step, which usually generates some kind of ./build folder that is meant to be deployed. If I set inputs.code to point to this build folder, the component will no longer be able to find node_modules, because that is left in the root of the function folder (where package.json is located).

Layer archive creation

This is related, and I think a solution (improvement) here would automatically solve the issues above:

  1. Hash generation
    Currently, you create the hash by reading the entire package.json as a string (https://github.com/serverless-components/aws-lambda-layer/blob/master/utils.js#L73). Reading only the dependencies key from package.json would save you an unnecessary layer deploy in case you change non-production data in package.json (like adding a script, a devDependency, etc.); see the sketch after this list.

  2. By bundling only the production dependencies from package.json, you would reduce the size of the layer archive, as there will very often be devDependencies you don't need in production.
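
To illustrate point 1, here is a minimal sketch of hashing only the dependencies key (hashDependencies is a hypothetical helper name, not something from the component):

const crypto = require("crypto")
const fs = require("fs")

// Hash only the "dependencies" key, so edits to scripts, devDependencies, etc.
// don't trigger an unnecessary layer deploy.
const hashDependencies = (packageJsonPath) => {
  const pkg = JSON.parse(fs.readFileSync(packageJsonPath, "utf8"))
  return crypto
    .createHash("sha256")
    .update(JSON.stringify(pkg.dependencies || {}))
    .digest("hex")
}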

Now the solution proposal

  1. Check whether inputs.code is the folder containing the package.json and, if not, use the find-up package to locate it.
  2. Read the dependencies key and use require.resolve to find the actual location of each dependency (this way you no longer care where the node_modules folder is located, and hoisting is no longer an issue).
// This gives you the absolute path to the dependency location.
// Resolving `package.json` gives you the root package folder, and not some `/lib/index.js` or whatever the "main" field is.
const path = require("path")
path.dirname(require.resolve("lodash/package.json"))
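
Putting both steps together, something along these lines should work (just a sketch; it assumes the find-up CommonJS API, and resolveDependencyPaths is a hypothetical helper name):

const path = require("path")
const findUp = require("find-up")

const resolveDependencyPaths = (codeDir) => {
  // 1. Locate the nearest package.json, starting from inputs.code and walking up.
  const packageJsonPath = findUp.sync("package.json", { cwd: codeDir })
  const { dependencies = {} } = require(packageJsonPath)

  // 2. Resolve each production dependency to its real folder, wherever hoisting put it.
  return Object.keys(dependencies).map((name) =>
    path.dirname(
      require.resolve(`${name}/package.json`, {
        paths: [path.dirname(packageJsonPath)]
      })
    )
  )
}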

Let me know what you think about all of this.
I'm looking forward to discussing this further and I'm sure we'll find a proper solution for this!
Great work on all of this! 🍻

@eahefnawy
Member

Thanks for opening this up @Pavel910, and sorry for the late reply. Was a bit sick the past few days 🤒

Everything you mentioned makes total sense, especially the part about hashing the dependencies key of package.json. I think the solution you proposed would work great. I'll try to get to it soon, but a PR is more than welcome 😊

I don't think the solution you proposed handles dev dependencies though, right? I mean it wouldn't detect changes in dev dependencies, which is great, but once you update your production dependencies, your devDeps would be included in the package as well. Could you think of a way we could ignore those devDeps?

@Pavel910
Author

@eahefnawy that's exactly what point 2 of my proposed solution handles. Since you are building the archive with archiver on the fly, you would only add the packages listed under the dependencies key, completely skipping the devDependencies.
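
Roughly something like this (a sketch under my assumptions: depPathsByName would come from the dependency resolution step above, and zipProductionDeps is a hypothetical name; only the fs/archiver calls are real APIs):

const fs = require("fs")
const path = require("path")
const archiver = require("archiver")

const zipProductionDeps = (depPathsByName, outFile) =>
  new Promise((resolve, reject) => {
    const output = fs.createWriteStream(outFile)
    const archive = archiver("zip")
    output.on("close", () => resolve(outFile))
    archive.on("error", reject)
    archive.pipe(output)
    // Lambda layers expect packages under nodejs/node_modules/<name> inside the zip.
    for (const [name, dir] of Object.entries(depPathsByName)) {
      archive.directory(dir, path.join("nodejs", "node_modules", name))
    }
    archive.finalize()
  })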

I'll be glad to implement this as I already have a good reproduction case to test this on.
Feel free to send more of your input if you feel something is off 🍻

@eahefnawy
Member

aaah archiver on the fly! that would be awesome! Please do send a PR! 😊

@josephluck

This sounds like an ideal approach to support monorepo lambda projects! I can lend a hand testing if needed.

@eahefnawy
Member

@josephluck that would be great! 😊

@Pavel910
Author

@eahefnawy @josephluck I will begin working on this pretty soon. Sorry I couldn't get to this sooner, but there are more urgent things to take care of to get my platform up and running with Serverless :) Once everything is working I'll begin working on optimization; that's where this issue will be resolved. Stay tuned 🚀

@eahefnawy
Member

Thanks @Pavel910 ... looking forward to the PR 🎉

@clouditinerary

@Pavel910 @eahefnawy - thanks for getting a head start on this... Is there any progress on monorepo support for aws-lambda (and Serverless components in general)? Thanks - G

@Pavel910
Author

@clouditinerary I didn't get a chance to work on this, and in our project we haven't used layers so far. We simply bundle every lambda with webpack and that's it.
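
For context, a minimal sketch of the kind of per-lambda webpack config I mean (the entry/output names are illustrative, not from our actual project):

const path = require("path")

module.exports = {
  entry: "./src/handler.js",
  mode: "production",
  target: "node", // keeps Node built-ins (fs, path, ...) out of the bundle
  output: {
    path: path.resolve(__dirname, "build"),
    filename: "handler.js",
    libraryTarget: "commonjs2" // so exports.handler is what Lambda expects
  }
}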

@clouditinerary

Thanks @Pavel910 - I also thought about going the webpack route but there are a lot of pros and cons to consider. Anyway - thanks for the info!

@Pavel910
Author

Yeah, it totally depends on your project. For us, layers were introducing a lot more problems than they solved. So far we're OK with webpack; it removes a lot of headaches, and since we have a lot of totally independent lambdas with all sorts of dependencies, we couldn't isolate a set of dependencies that would actually justify the added complexity of handling layers.

I'd be interested to see your webpack pros/cons list, just to see if there is something we could improve in our project that we may have missed, or maybe to convince you that some cons are not actually such a big deal (since we've been doing it this way for a long time now). So if you'd be so kind, please share :)

@ration

ration commented Feb 27, 2020

While I'm aware that the proposal in this issue is a bit more involved, for anyone curious, here is how I did this:

commonLayer:
  component: "@serverless/aws-lambda-layer"
  inputs:
    name: commonLayer
    code: ../layers/common/dist/
    bucket: ${deploymentBucket}
    
usersLambda:
  component: "../../node_modules/@serverless/aws-lambda"
  inputs:
    name: UsersHandler
    code: ../lambda/users/dist/
    handler: index.handler
    layers:
      - ${commonLayer.arn}

This requires that the lambda component has layers support (e.g. from #15). I left building, rollups, etc. to the build process rather than making it a task for serverless.
