A Summary of Serverless
7/6/2021
Often misunderstood, the term "serverless", in its most basic sense, refers to a way of building and deploying software that takes advantage of cloud services and avoids the use of "servers" directly.
While many have described serverless computing in depth, the goal of this article is to provide an ever-growing, summarized list of serverless concepts, tips, and other info:
What does it mean to be "Serverless"?
- No server or server infrastructure management
- Built-in elasticity & flexible scaling
- Pay for runtime, never pay for idle
- Built-in high availability
- Highly event-driven
Vs. Containers? (e.g., Docker)
- Containers still require servers
- Containers also require infrastructure management for orchestration
- Containers have minimal vendor lock-in
But, what about vendor lock-in??
- Serverless presents more lock-in than containers, but clear benefits such as elasticity, reliability, and developer productivity may outweigh the potential costs of lock-in
- To minimize lock-in and increase portability*
- Keep logic in your lambda functions: lambdas are just functions written in code that can (usually) run with any cloud vendor
- Create abstractions around the “connection points” of your lambdas. For example, create “selector” functions that access data from the events sent to lambdas. That way, if you switch from AWS to Azure, for instance, you can just modify or polymorphically adjust the selectors to match the vendor.
- Frameworks like Serverless and Terraform can help with some of the Infrastructure as Code and deployment work if you are striving to be vendor-agnostic
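The “selector” idea above could look something like the following minimal sketch. Every name here (`select_body_aws`, `select_body_other`, `make_handler`, `order_total`) is made up for illustration. The first selector assumes an API Gateway proxy-style event whose `body` is a JSON string; the second event shape is purely hypothetical and stands in for "some other vendor".

```python
import json
from typing import Any, Callable, Dict

Event = Dict[str, Any]


def order_total(order: Dict[str, Any]) -> float:
    """Vendor-neutral business logic: knows nothing about event shapes."""
    return sum(item["price"] * item["qty"] for item in order.get("items", []))


def select_body_aws(event: Event) -> Dict[str, Any]:
    """Selector assuming an API Gateway proxy-style event (body is a JSON string)."""
    return json.loads(event.get("body") or "{}")


def select_body_other(event: Event) -> Dict[str, Any]:
    """Selector for a hypothetical vendor whose event carries a parsed 'payload'."""
    return event.get("payload", {})


def make_handler(select_body: Callable[[Event], Dict[str, Any]]):
    """Build a handler whose only vendor-specific piece is the injected selector."""
    def handler(event: Event, context: Any = None) -> Dict[str, Any]:
        order = select_body(event)
        return {"total": order_total(order)}
    return handler
```

Switching vendors would then mean swapping the selector passed to `make_handler`, while `order_total` stays untouched.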
Development Challenges
- Lots of cloud service configuration to manage
- Infrastructure as Code can be painful, but frameworks and tools like Serverless and the AWS Cloud Development Kit (CDK) can help
- Unexpected bottlenecks if involving non-serverless services (like relational DBs running on servers)
- Async queues (SQS) and/or throttling can help with this
- Developers must know about a bunch of cloud services, how they work, and how they fit together. Examples (for AWS): Lambda, Cognito, S3, DynamoDB, SQS, SNS, API Gateway, etc.
- Resiliency: Gracefully handling faults or failures in a highly distributed ecosystem
Development/Architecture Characteristics
- Most/all “business logic” will end up in cloud functions (Lambda in the AWS case); this is also a best practice
- Cloud functions will run based on events coming in from one or more cloud services. For example, API Gateway will send an event modeling an HTTP request; SNS will send an event with a particular topic and optional payload
Development/Architecture Best Practices
For Cloud Functions (Lambda in AWS)
- Error handling: Avoid try/catch if possible; let the lambda fail and return the error to the calling service. This provides better visibility into the error
- Avoid monolithic lambdas; keep to the Single Responsibility Principle (SRP) with small, focused lambdas, which are easier to test and maintain
- Idempotency: cloud services can retry events, so the same event can be delivered more than once, and handling it again should not repeat its side effects
- Dead Letter Queues: capture all failed or skipped events. You can set up alerts on this queue, fix bugs, and reprocess
- Understand the different Lambda invocation types and their respective retry behavior
- Don’t store sensitive info in your Lambdas; use a cloud-based secrets store
- For IaC and deployment, use a framework like Serverless and/or AWS CDK or Terraform. No serverless framework can be 100% cloud vendor-agnostic, so choose your vendor wisely
- Testing
- Unit testing forms the bulk. You cannot fully reproduce a cloud-based environment locally, so unit tests are the way to verify your logic locally
- Devs should also run as much integration testing locally as possible; frameworks like Serverless can help with this
- CI/CD: run both unit tests and full integration tests (if feasible) in your pipeline. Full integration means the entire stack (i.e., the whole API call instead of just the lambda)
- Avoid lambdas & APIs with little to no logic that are simple pass-throughs to backend services such as DBs or queues, or that contain mostly data access code. Instead, use API Service Integrations and/or Lambda Destinations
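To make the idempotency point from this list more concrete, here is a minimal sketch. It assumes each event carries a unique `id` and uses an in-memory dictionary as a stand-in for a durable store. All names (`InMemoryIdempotencyStore`, `charge_customer`, `handler`) are hypothetical, and a real deployment would need a shared store with an atomic "write if absent" operation, because the check-then-write below is not safe under concurrency.

```python
from typing import Any, Dict, List, Optional, Tuple


class InMemoryIdempotencyStore:
    """Stand-in for a durable store; state here lives only in this process."""

    def __init__(self) -> None:
        self._results: Dict[str, Dict[str, Any]] = {}

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        return self._results.get(key)

    def put(self, key: str, result: Dict[str, Any]) -> None:
        self._results[key] = result


store = InMemoryIdempotencyStore()
charges: List[Tuple[str, float]] = []  # records side effects so they can be inspected


def charge_customer(order_id: str, amount: float) -> Dict[str, Any]:
    """Hypothetical side effect that must not happen twice for the same event."""
    charges.append((order_id, amount))
    return {"order_id": order_id, "charged": amount}


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    key = event["id"]  # assumption: the event carries a unique id
    previous = store.get(key)
    if previous is not None:
        # Retried delivery: return the earlier result, skip the side effect.
        return previous
    result = charge_customer(event["order_id"], event["amount"])
    store.put(key, result)
    return result
```

Because the handler is a plain function, it can be unit tested locally by calling it twice with the same event and checking that `charges` contains a single entry.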
Additional Best Practices
For APIs
- Use an API Gateway to handle API routing, security, and other cross-cutting concerns for your microservices
- For APIs that trigger async requests
- They will return immediately, so if the caller needs a response after the async work is processed, use polling or websockets to return that data when the async process has finished
- Async is typically a way to avoid pressure and timeouts for non-serverless systems in the call chain (like RDBs)
- WebSocket APIs through API Gateway are now available
- Respect and adhere to good microservices architecture; serverless gives you all the tools to do this. Avoid microservices hell
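The "return immediately, then poll" flow for async APIs could be sketched roughly as below. This is an in-memory illustration only: the `deque` stands in for a queue such as SQS and the `jobs` dict for a database table. The three handler names and the event fields (`payload`, `jobId`) are assumptions made for this example.

```python
import uuid
from collections import deque
from typing import Any, Deque, Dict

jobs: Dict[str, Dict[str, Any]] = {}   # stand-in for a job-status table
queue: Deque[Dict[str, Any]] = deque()  # stand-in for a message queue


def submit_handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """API entry point: record the job, enqueue the work, and return right away."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "PENDING", "result": None}
    queue.append({"job_id": job_id, "payload": event["payload"]})
    return {"statusCode": 202, "jobId": job_id}


def worker_handler(message: Dict[str, Any], context: Any = None) -> None:
    """Queue consumer: does the slow work and stores the outcome."""
    result = message["payload"].upper()  # placeholder for the real slow work
    jobs[message["job_id"]] = {"status": "DONE", "result": result}


def status_handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Polling endpoint: the caller checks back with the job id it was given."""
    job = jobs.get(event["jobId"])
    if job is None:
        return {"statusCode": 404}
    return {"statusCode": 200, **job}
```

The caller would hit the submit endpoint once, then call the status endpoint until `status` is `DONE`. A websocket push would replace that polling loop but keep the same job record.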
For Data access
- DynamoDB (or any key-value database, really)
- Understand the pros and cons of DynamoDB before just using it
- Carefully analyze how you will be querying data and let that guide your table and key design. Unlike with an RDB, DynamoDB must be designed for optimal querying rather than optimal storage space
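As a rough illustration of letting the query drive the key design, suppose the access pattern is "all orders for a customer, optionally within a month". One common approach is a composite key, with the customer in the partition key and a date-prefixed sort key. The sketch below simulates that lookup over a plain Python list. It is not the DynamoDB API, and the key formats and helper names (`order_item`, `query`) are assumptions for this example.

```python
from typing import Any, Dict, List

Item = Dict[str, Any]


def order_item(customer_id: str, order_date: str, order_id: str, total: float) -> Item:
    """Build an item whose keys encode the access pattern."""
    return {
        "pk": f"CUSTOMER#{customer_id}",           # partition key: who
        "sk": f"ORDER#{order_date}#{order_id}",    # sort key: when, then which
        "total": total,
    }


def query(table: List[Item], pk: str, sk_prefix: str = "") -> List[Item]:
    """In-memory stand-in for 'partition key equals X and sort key begins with Y'."""
    matches = [i for i in table if i["pk"] == pk and i["sk"].startswith(sk_prefix)]
    return sorted(matches, key=lambda i: i["sk"])
```

With ISO-style dates in the sort key, a prefix such as `ORDER#2021-07` narrows the results to one month without scanning other customers' items. A pattern the keys were not designed for (say, "all orders over a given total") would not be served this cheaply.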
For Load/Performance Testing
- Although serverless is meant to handle near-infinite load, load testing can still uncover issues with service quotas, memory and concurrency configs, and bottlenecks with non-serverless systems in the call chain
- For testing APIs, use Artillery.io
For DevOps
- Create custom metrics and dashboards to monitor health
- Use an instrumentation tool for in-depth analysis of the full call chain in serverless (errors and performance)
- Make sure to estimate traffic, and have cost controls and budget alarms in place to limit spend if/when you exceed the free tier