A Summary of Serverless

7/6/2021

Often misunderstood, the term "serverless", in its most basic sense, refers to a way of building and deploying software that takes advantage of cloud services and avoids the use of "servers" directly.

While many have described serverless computing in-depth, the goal of this article is to provide an ever-growing and summarized list of serverless concepts, tips, and other info:

What does it mean to be "Serverless"?

No server or server infrastructure management
Built-in elasticity & flexible scaling
Pay for runtime, never pay for idle
Built-in high availability
Highly event-driven

vs. Containers (ie Docker)?

Containers still require servers
Containers also require infrastructure management for orchestration
Containers have minimal vendor lock-in

But what about vendor lock-in?

Serverless presents more lock-in than Containers
- but clear benefits such as elasticity, reliability, and developer productivity may outweigh the potential costs of lock-in
To minimize lock-in and increase portability^*
- Keep logic in your lambda functions --lambdas are just functions written in code that can (usually) run with any cloud vendor
- Create abstractions around the “connection points” of your lambdas. For example, create “selector” functions that access data from the events sent to your lambdas. That way, if you switch from AWS to Azure, for instance, you can just modify or polymorphically adjust the selectors to match the vendor.
- Frameworks like Serverless and Terraform can help with some of the Infrastructure as Code and deployment if you are striving to be agnostic
^*If lock-in even matters to your organization. Don’t over-engineer a vendor-agnostic solution if the likelihood of switching vendors soon is low. The cost of doing so now could be greater than the cost later, or the cost could be unnecessary.
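
The “selector” idea above can be sketched roughly as follows. This is a minimal illustration, not a prescribed design: the AWS selector assumes an API Gateway proxy event (where `body` is a JSON string), while `select_body_other`, its event shape, and the `create_order` logic are hypothetical names invented for the example.

```python
import json
from typing import Any, Callable, Dict

Event = Dict[str, Any]


def select_body_aws(event: Event) -> Dict[str, Any]:
    """Selector for an AWS API Gateway proxy event, whose body is a JSON string."""
    return json.loads(event.get("body") or "{}")


def select_body_other(event: Event) -> Dict[str, Any]:
    """Selector for a hypothetical second vendor that nests an already-parsed payload."""
    return event.get("request", {}).get("payload", {})


# One selector per vendor; only this table changes when the vendor changes.
SELECTORS: Dict[str, Callable[[Event], Dict[str, Any]]] = {
    "aws": select_body_aws,
    "other": select_body_other,
}


def create_order(order: Dict[str, Any]) -> Dict[str, Any]:
    """Vendor-neutral business logic (hypothetical): knows nothing about event shapes."""
    total = sum(item["price"] * item["qty"] for item in order.get("items", []))
    return {"customer": order.get("customer"), "total": total}


def make_handler(vendor: str) -> Callable[[Event, Any], Dict[str, Any]]:
    """Bind the business logic to the selector for the chosen vendor."""
    select_body = SELECTORS[vendor]

    def handler(event: Event, context: Any = None) -> Dict[str, Any]:
        return create_order(select_body(event))

    return handler
```

The business logic never touches the raw event, so moving vendors should mostly mean writing a new selector rather than rewriting the function.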

Development Challenges

Lots of cloud service configuration to manage
- Infrastructure as Code can be painful, but frameworks and tools like Serverless and Cloud Development Kit can help
Unexpected bottlenecks when non-serverless services are involved (like relational DBs running on servers)
- Async queues (SQS) and/or throttling can help with this
Developers must know about a bunch of cloud services, how they work and how they fit together. Examples (for AWS): Lambda, Cognito, S3, DynamoDB, SQS, SNS, API Gateway, etc
Resiliency: Gracefully handling faults or failures in a highly distributed ecosystem
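
One common way to handle transient faults gracefully when calling a slower or throttled downstream service is to retry with exponential backoff and jitter. The list above does not prescribe this technique, so treat the following as a sketch under assumptions: `TransientError` and `call_with_retry` are hypothetical names, and the delay values are arbitrary.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for faults worth retrying (timeouts, throttling)."""


def call_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying TransientError with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return operation()
        except TransientError:
            if attempt >= max_attempts:
                # Out of attempts: re-raise so the failure stays visible upstream.
                raise
            # Full jitter: wait a random time up to base_delay * 2^(attempt - 1).
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
            attempt += 1
```

Only errors marked as transient are retried; anything else propagates immediately, and the final failure is re-raised rather than swallowed.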

Development/Architecture Characteristics

Most/all “business logic” will end up in cloud functions (Lambda in the AWS case) --this is also a best practice
Cloud functions run in response to events coming in from one or more cloud services. For example, API Gateway will send an event modeling an HTTP request, and SNS will send an event with a particular topic and an optional payload
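
To make the two event examples above concrete, here is a rough sketch of one handler per event source. It assumes the AWS API Gateway REST proxy event shape (`httpMethod`, `path`, `body`) and the SNS record shape (`Records[].Sns.Message`); the handler names and the response content are made up for illustration.

```python
import json
from typing import Any, Dict, List


def api_handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Handle an API Gateway proxy event, which models an HTTP request."""
    body = json.loads(event.get("body") or "{}")
    return {
        "statusCode": 200,
        "body": json.dumps({"path": event.get("path"), "received": body}),
    }


def sns_handler(event: Dict[str, Any], context: Any = None) -> List[str]:
    """Handle an SNS event, which carries one or more published messages."""
    messages = []
    for record in event.get("Records", []):
        sns = record["Sns"]
        # TopicArn identifies the topic; Message is the published payload.
        messages.append(sns["Message"])
    return messages
```

Each function is tied to a single event source, which keeps the event-parsing code small and the logic easy to test with plain dictionaries.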

Development/Architecture Best Practices

For Cloud Functions (Lambda in AWS)

Error handling: Avoid try/catch if possible - let the lambda fail and return the error to the calling service. This provides better visibility into the error
Avoid monolithic lambdas; keep to the single-responsibility principle (SRP) with small, focused lambdas, which are easier to test and maintain
Idempotency: cloud services can retry events, so the same event can be delivered more than once. Make sure handling it twice produces the same result as handling it once
Dead Letter Queues --capture all failed or skipped lambda invocations. You can set up alerts on this queue, fix bugs, and reprocess the events
Understand the different Lambda invocation types and their respective retry behavior
Don’t store sensitive info in your Lambdas; use a cloud-based secrets store
For IaC and deployment, use a framework like Serverless and/or AWS CDK or Terraform. No serverless framework can truly be cloud vendor-agnostic 100%, so choose your vendor wisely
Testing
- Unit testing forms the bulk. You cannot fully reproduce a cloud-based environment locally, so unit tests are the main way to verify your logic on your own machine
- Devs should also run as much integration testing locally as possible --frameworks like Serverless can help with this
- CI/CD --run both unit tests and full integration tests (if feasible) in your pipeline. Full integration means exercising the entire stack (ie the whole API call, instead of just the lambda)
Avoid lambdas & APIs with little to no logic: simple pass-throughs to backend services such as DBs or queues, or functions that contain mostly data access code. Instead, use API Service Integrations and/or Lambda Destinations

Additional Best Practices

For APIs

Use an API Gateway to handle API routing, security, and other cross-cutting concerns for your microservices
For APIs that trigger async requests
- They will return immediately, so if the caller needs a response after the async work is processed, use polling or websockets to return that data when the async process has finished
- Async is typically a way to avoid pressure and timeouts for non-serverless systems in the call chain (like RDBs). Websockets APIs through API Gateway are now available
Respect and adhere to good microservices architecture. Serverless gives you all the tools to do this; avoid microservices hell
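
The polling approach for async APIs mentioned above might look roughly like this. It is only a sketch: the `submit`, `worker`, and `get_status` functions, the job fields, and the in-memory job table are all hypothetical, and a real system would enqueue the job (for example on a queue) and persist its status in a database.

```python
import uuid
from typing import Any, Dict

# In-memory stand-in for a durable job table (assumption for this sketch only).
_jobs: Dict[str, Dict[str, Any]] = {}


def submit(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Accept the request and return immediately with a job id to poll."""
    job_id = str(uuid.uuid4())
    _jobs[job_id] = {"status": "PENDING", "payload": payload, "result": None}
    # A real system would place the job on a queue here.
    return {"statusCode": 202, "jobId": job_id}


def worker(job_id: str) -> None:
    """Process the job asynchronously (would be triggered by the queue)."""
    job = _jobs[job_id]
    job["result"] = sum(job["payload"]["numbers"])
    job["status"] = "DONE"


def get_status(job_id: str) -> Dict[str, Any]:
    """Polling endpoint: report status, and the result once it is available."""
    job = _jobs.get(job_id)
    if job is None:
        return {"statusCode": 404}
    return {"statusCode": 200, "status": job["status"], "result": job["result"]}
```

The caller receives a 202 with a job id right away and then polls `get_status` until the status becomes `DONE`; a websocket push would replace the polling loop but keep the same job record.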

For Data access

DynamoDB (or any key-value database, really)
- Understand the pros and cons of DynamoDB before just using it
- Carefully analyze how you will be querying data and let that guide your table and key design. Unlike with an RDB, DynamoDB must be designed for optimal querying rather than optimal storage space
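
To show what "let the queries guide the key design" can mean in practice, here is a small sketch for one assumed access pattern: "list a customer's orders, optionally filtered by month, in date order". The `PK`/`SK` attribute names, the key formats, and the `query_orders` helper are hypothetical; the helper only mimics, in plain Python, a query with an exact partition key match and a begins-with condition on the sort key.

```python
from typing import Dict, List

Item = Dict[str, str]


def customer_pk(customer_id: str) -> str:
    """Partition key: groups everything belonging to one customer."""
    return f"CUSTOMER#{customer_id}"


def order_sk(order_date: str, order_id: str) -> str:
    """Sort key: ISO dates sort lexicographically, so orders come back in date order."""
    return f"ORDER#{order_date}#{order_id}"


def query_orders(table: List[Item], customer_id: str, date_prefix: str = "") -> List[Item]:
    """Mimic a key-condition query: exact PK match plus begins-with on SK."""
    pk = customer_pk(customer_id)
    sk_prefix = f"ORDER#{date_prefix}"
    matches = [item for item in table if item["PK"] == pk and item["SK"].startswith(sk_prefix)]
    return sorted(matches, key=lambda item: item["SK"])


table: List[Item] = [
    {"PK": customer_pk("c1"), "SK": order_sk("2021-07-02", "o2")},
    {"PK": customer_pk("c1"), "SK": order_sk("2021-06-15", "o1")},
    {"PK": customer_pk("c2"), "SK": order_sk("2021-07-01", "o3")},
]
```

With this layout, `query_orders(table, "c1", "2021-07")` needs only the keys to answer the question, which is the point of designing the table around the query rather than around normalized storage.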

For Load/Performance Testing

Although serverless is meant to handle near-infinite load, load testing can still uncover issues with service quotas, memory and concurrency configs, and bottlenecks with non-serverless systems in the call chain
For load testing APIs, a tool like Artillery.io works well

For DevOps

Create custom metrics and dashboards to monitor health
Use an instrumentation tool for in-depth analysis of the full call chain in serverless (errors and performance)
Estimate traffic up front so you know if/when you will exceed the free tier
Put cost controls and budget alarms in place to limit spend once you do
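
For the custom metrics point above, one option on AWS is to emit metrics as structured log lines in the CloudWatch Embedded Metric Format, which Lambda forwards from standard output. The sketch below only builds such a line; the `emf_metric` helper, the namespace, and the metric and dimension names are hypothetical, and other providers or instrumentation tools would use different mechanisms.

```python
import json
import time
from typing import Dict, Optional


def emf_metric(
    namespace: str,
    name: str,
    value: float,
    unit: str = "Count",
    dimensions: Optional[Dict[str, str]] = None,
    timestamp_ms: Optional[int] = None,
) -> str:
    """Build one Embedded Metric Format log line for a single metric value."""
    dims = dimensions or {}
    doc = {
        "_aws": {
            "Timestamp": timestamp_ms if timestamp_ms is not None else int(time.time() * 1000),
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [list(dims.keys())],
                    "Metrics": [{"Name": name, "Unit": unit}],
                }
            ],
        },
        **dims,       # dimension values live at the top level of the document
        name: value,  # so does the metric value itself
    }
    return json.dumps(doc)


if __name__ == "__main__":
    # Inside a Lambda function, printing the line is enough to publish the metric.
    print(emf_metric("OrdersService", "OrdersCreated", 1, dimensions={"Stage": "prod"}))
```

A dashboard or alarm can then be built on the resulting metric without any extra API calls from the function itself.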