The purpose of this series is to introduce the difference between monolithic and microservices-based applications and the pros and cons of each; this is covered in the first chapter. The second chapter looks at how to overcome the drawbacks of a microservices architecture using Kubernetes and how to optimize operational costs. The third chapter looks at serverless, stateless, and stateful microservices, how these affect high availability and scalability, and recommended DevOps techniques in general.
You are developing a server-side enterprise application. It must support a variety of different clients including desktop browsers, mobile browsers, and native mobile applications. The application might also expose an API for 3rd parties to consume. It might also integrate with other applications via either web services or a message broker. The application handles requests (HTTP requests and messages) by executing business logic; accessing a database; exchanging messages with other systems; and returning an HTML/JSON/XML response. There are logical components corresponding to different functional areas of the application.
Despite having a logically modular architecture, the application is packaged and deployed as a monolith.
Monolithic Application Architecture: In the early stages of a project, a monolith works well; indeed, most of the big, successful applications that exist today started out as monoliths.
Benefits of Monolithic Architecture
- Simple to develop
- Simple to deploy
- Simple to test
- Simple to scale
Drawbacks of Monolithic Architecture
- Large code base
- Difficult to onboard new team members because of the large monolithic code base
- The IDE lags, hurting developer productivity
- The application takes a significant amount of time to start
- Deployments become slow; the whole application must be redeployed for even a single-line change
- Scaling the application is difficult because it can only be scaled in one dimension
- An obstacle to scaling the development team
- Requires a long-term commitment to a technology stack
- One misbehaving component can bring down the entire system
Microservices Application Architecture: To tackle the issues faced with a monolithic architecture, we split the application into a set of smaller, interconnected services. Each microservice is a small application consisting of its own business logic along with various adapters. This structures the application as a set of loosely coupled, collaborating services.
Benefits of Microservices Architecture
- Continuous delivery and deployment of large, complex applications
- Improved maintainability – Easier to understand and change
- Easier to test
- Independently deployable
- Organize the development effort around multiple, autonomous teams.
- The IDE is faster, making developers more productive
- The application starts faster, which makes developers more productive, and speeds up deployments
- Improved fault isolation. For example, if there is a memory leak in one service then only that service will be affected. The other services will continue to handle requests
- Eliminates any long-term commitment to a technology stack
Drawbacks of Microservices Architecture
- Increased memory consumption. The microservice architecture replaces N monolithic application instances with N×M service instances. If each service runs in its own JVM (or equivalent), which is usually necessary to isolate the instances, there is the overhead of M times as many JVM runtimes
- Deployment complexity. In production, there is also the operational complexity of deploying and managing a system comprising many different services
Here we will describe in detail how to overcome some of the drawbacks of a microservices architecture within a Kubernetes environment. We will also look at optimizing the operational costs of our application.
Kubernetes preemptible and non-preemptible node pools
We found that our services now consumed much more memory and vCPU, because each microservice, whether running independently or interdependently, requires its own memory and CPU. This first drawback was the main reason our cost increased by 400% compared with the old (monolithic) architecture. Also, to have an exact replica of the live application, the same stack is deployed for the development and staging environments, which makes the situation even worse. So for the staging and development environments we changed the node pool resource type to preemptible, whose instances live for at most 24 hours and cost less than a third of the equivalent non-preemptible compute resources. The Kubernetes master is responsible for maintaining the desired state of the cluster, so if a node, Pod, or container crashes or is evicted, Kubernetes automatically brings up a replacement and restores the service on it. For example, if we have 3 containers of each microservice running across multiple Pods on multiple nodes, then the failure or preemption of any one node causes no downtime, because incoming requests are routed to the remaining nodes rather than the preempted one. We also used predefined machine types for our compute instances, because custom machine types cost a bit more than predefined ones. But how do we reduce the cost of the live application, which still costs a lot?
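As a sketch, a staging workload can be pinned to a preemptible node pool with a node selector and a matching toleration. The label `cloud.google.com/gke-preemptible` is the one GKE sets on preemptible nodes automatically; the Deployment name and image are hypothetical, and the toleration assumes the pool was created with a matching taint:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-service-staging   # hypothetical service name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders-service
  template:
    metadata:
      labels:
        app: orders-service
    spec:
      # Schedule only onto preemptible nodes (GKE labels them automatically)
      nodeSelector:
        cloud.google.com/gke-preemptible: "true"
      # Tolerate the taint we assume was placed on the preemptible pool
      tolerations:
        - key: cloud.google.com/gke-preemptible
          operator: Equal
          value: "true"
          effect: NoSchedule
      containers:
        - name: orders-service
          image: gcr.io/my-project/orders-service:latest
```

With 3 replicas spread across the pool, the preemption of any single node leaves the remaining replicas serving traffic while Kubernetes reschedules the lost Pod.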
So we then changed our services to run as stateless, stateful, and serverless microservices. We used non-preemptible compute resources for the stateful microservices, preemptible resources for the stateless microservices, and GCP's Cloud Run service for the serverless microservices, which run occasionally or at specific intervals via a cron job.
Building an image for each microservice is a crucial part: the build downloads a predefined base image, adds the dependencies or libraries our application requires, and then pushes the resulting image to a registry. The size of each microservice image depends on how its Dockerfile is written, which base image is used, and how many dependencies the application needs. We prefer the predefined Alpine images (about 30x smaller than Debian), because a small image takes less time to download and push and uses less cloud storage space.
Alpine images are lightweight, vanilla images: you need to install the dependencies/libraries/packages your application needs one by one. This incurs some bandwidth cost in return, but we have a way to handle that in the next steps.
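A minimal sketch of such an Alpine-based Dockerfile, assuming a Node.js microservice (the service name, port, and entry point are hypothetical):

```dockerfile
# Small Alpine-based base image instead of a full Debian image
FROM node:18-alpine

WORKDIR /app

# Install only production dependencies to keep the image small
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# Copy the application source last so dependency layers stay cached
COPY . .

EXPOSE 8080
CMD ["node", "server.js"]
```

Because the dependency-install step sits in its own layer, rebuilding after a source-only change reuses the cached `npm ci` layer instead of downloading packages again.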
We found that a network cost is involved when building images for our microservices. Building an image has multiple parts: downloading a predefined base image from a container registry or Docker Hub, downloading the libraries or dependencies required to run the application, and pushing the built image to a registry. If the container registry is in another region, pushing an image incurs not only the storage cost of saving the image but also the network cost of pushing it. Large images require more bandwidth to push, and to download when they are assigned to containers in a Pod. In the case of autoscaling, images are downloaded from the container registry for the newly scaled containers; if the registry and cluster are in the same region this costs little, but with a global registry or Docker Hub the download incurs bandwidth charges as well.
To deploy a microservice, we need to build an image and push it to the GCP Container Registry, a Docker repository, etc. The best practice is to save each image with a unique name, such as the Git commit hash, the Cloud Build ID, or a timestamp. Cloud Build has a caching method that lets us build microservices even faster by using the previously built image as a caching layer. With it, we don't need to download the dependencies from external sources every time the image builds, which saves Cloud Build time; Cloud Build charges for the time it is in use.
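A sketch of a `cloudbuild.yaml` that pulls the previously built image and reuses its layers as a build cache (the image and service names are hypothetical; `$PROJECT_ID` and `$COMMIT_SHA` are standard Cloud Build substitutions):

```yaml
steps:
  # Pull the last image so its layers can serve as a cache (ignore a missing image)
  - name: 'gcr.io/cloud-builders/docker'
    entrypoint: 'bash'
    args: ['-c', 'docker pull gcr.io/$PROJECT_ID/orders-service:latest || exit 0']
  # Build, reusing cached layers from the pulled image
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build',
           '--cache-from', 'gcr.io/$PROJECT_ID/orders-service:latest',
           '-t', 'gcr.io/$PROJECT_ID/orders-service:$COMMIT_SHA',
           '-t', 'gcr.io/$PROJECT_ID/orders-service:latest',
           '.']
images:
  - 'gcr.io/$PROJECT_ID/orders-service:$COMMIT_SHA'
  - 'gcr.io/$PROJECT_ID/orders-service:latest'
```

Tagging both `latest` and the commit SHA keeps a unique name per build while leaving a stable tag for the next build's `--cache-from`.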
In a monolithic architecture, we have to deploy the whole application whenever there is a change, but with microservices we can deploy only the specific microservice that changed. This reduces deployment time, and other microservices are unaffected if the latest build of that specific microservice fails. We used Cloud Build for building and deploying microservices, with build triggers defined on Git branches for the different environments. For example, three branches are defined for three environments: dev, stage, and live. A change on any of these branches in a microservice triggers the Cloud Build for that environment. This reduces not only deployment time but the Cloud Build time cost as well. We can also grant multiple remotely working teams access to specific microservices.
Cloud Run is a managed compute platform that enables you to run stateless containers that are invocable via HTTP requests. Cloud Run is serverless. Microservices that run at specific intervals, such as scheduled jobs, reconciliation, or reporting, can run on Cloud Run. Cloud Run charges only for the resources you use while your container is active, and it can autoscale based on traffic as well. We are not billed for services that are idle most of the time. Cloud Run also offers a monthly free tier per project on GCP. Yay!!
Container Registry only charges for the Cloud Storage space and network egress used by your Docker images. The first time you push an image to Container Registry, the system creates a Cloud Storage bucket to store all of your images, and you are charged for this storage. The default Cloud Storage class used for most Container Registry buckets is Multi-Regional. So if large images sit in the registry for a long time for each environment, the cost adds up over time as a kind of unseen charge. So how did we cut our cost in this case? We made our images small by using Alpine base images in the Dockerfile, as explained in previous sections, and we also dropped the unique image names for our development and staging environments. The reason is that development images are created whenever a developer pushes changes to the development branch: Cloud Build triggers and a new image is built for that microservice. This was happening a lot, with multiple developers pushing on a daily basis. So we keep only one image per microservice, which saves a lot of disk space. For the live environment, we created a script that deletes all images older than a specific time, always leaving the latest image intact during deletion. Otherwise, with hundreds of images for the live environment, the cost grows over time; such charges can be minimized by cleaning up regularly.
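A minimal sketch of the retention logic behind such a cleanup script, written as a pure function so the actual deletion call (e.g. via `gcloud`) can be plugged in separately; the function name and cutoff policy are our own illustration:

```python
from datetime import datetime, timedelta

def images_to_delete(images, cutoff, now=None):
    """Return the tags of images older than `cutoff`, always keeping the newest.

    `images` is a list of (tag, pushed_at) tuples; `cutoff` is a timedelta.
    The most recently pushed image is never returned, so the live environment
    always keeps at least one deployable image.
    """
    if not images:
        return []
    now = now or datetime.utcnow()
    # Sort newest first so the latest image is easy to protect
    ordered = sorted(images, key=lambda i: i[1], reverse=True)
    latest, *rest = ordered
    return [tag for tag, pushed_at in rest if now - pushed_at > cutoff]
```

A scheduled job could feed this function the output of a registry listing and then delete the returned tags.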
We created a script that starts and stops our Compute Engine and DB instances daily at specific times. We shut down the Compute Engine and DB instances for the development and staging environments during off hours and on weekends; our team works 9 hours a day, but those instances had been running 24/7. This cut the cost of the two environments roughly in half for each project and saves a lot of money, because the instances are now automatically shut down and restarted on a 12-hour cycle. Also, by using Kubernetes, Cloud Build, etc., we reduced the system-administration cost overhead, since previously we had to add more people over time to manage the growing number of projects.
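The scheduling decision in such a script can be sketched as a pure helper; the 09:00–21:00 window and Monday–Friday workweek are illustrative assumptions, and a cron job would call this and issue the corresponding start/stop commands:

```python
def should_be_running(weekday, hour, start=9, stop=21):
    """Decide whether dev/staging instances should be up.

    `weekday` follows Python's convention (Monday=0 ... Sunday=6);
    `hour` is the 24-hour clock. Instances run only on weekdays
    within the [start, stop) window.
    """
    is_weekday = weekday < 5          # Mon-Fri only
    in_window = start <= hour < stop  # e.g. 09:00-21:00
    return is_weekday and in_window
```

Keeping the decision separate from the cloud API calls makes the off-hours policy trivial to test and adjust per environment.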
We found a truly unseen charge when we discovered that static IPs are billed for every hour they are not in use. Yes, you are charged when you have reserved a static IP but it is not assigned to any resource, or the resource using it has been terminated/deleted. We were being charged $170/month for reserved static IPs that were not in use. We released all unused static IPs and saved this cost as well.
Project-based GCP discount
GCP provides a per-project discount for each service you use: you can get a discount of up to 30% on the bill for a service if it runs 24/7.
Application architecture and service types when switching from a monolithic to a microservices architecture
What are Stateful applications?
Stateful applications store state from client requests, or from the last operation, on the server itself and use that state to process further requests. Data is stored in the database, but client sessions are stored on the server. For example, a user sends a login request; the server authenticates the user and marks them as logged in, and the requests that follow do not require any token or data from the user for authentication. User sessions are stored on the server, so no database call is needed for authentication, which makes it faster. But there is a drawback: we can't horizontally scale a stateful application. Say there is a load balancer with two servers running the stateful application behind it. The first (login) request goes to server A, and the second request might go to server B; only server A holds the session information, so the user won't be able to log in when the load balancer sends them to server B. Hence it is not possible to horizontally scale stateful applications.
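This failure mode can be sketched with two in-memory session stores standing in for server A and server B (the class and user names are illustrative):

```python
class Server:
    """A server keeping sessions in local memory (stateful)."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # session data lives only on this instance

    def login(self, user):
        self.sessions[user] = {"logged_in": True}

    def is_logged_in(self, user):
        return self.sessions.get(user, {}).get("logged_in", False)

server_a = Server("A")
server_b = Server("B")

# The login request is routed to server A...
server_a.login("alice")

# ...but a later request lands on server B, which has no session for alice
print(server_a.is_logged_in("alice"))  # True
print(server_b.is_logged_in("alice"))  # False: the session did not follow the user
```

This is exactly why a load balancer in front of stateful servers needs some form of affinity, as described next.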
How do Stateful Apps Maintain State Information Between Client Requests?
The load distribution facilities can maintain state information between client requests, including:
- Transaction affinity – in which a transaction’s existence is acknowledged
- Session affinity – in which a client session’s existence is acknowledged
- Server affinity – in which the load distribution facility acknowledges that while multiple servers might be acceptable for a specific client request, a specific server is best suited for processing that particular request
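In Kubernetes, for instance, session affinity can be approximated at the Service level by pinning each client IP to one Pod; `sessionAffinity: ClientIP` is a real Service field, while the service name and ports here are hypothetical:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: login-service   # hypothetical
spec:
  selector:
    app: login-service
  # Route all requests from the same client IP to the same Pod
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800   # default affinity timeout (3 hours)
  ports:
    - port: 80
      targetPort: 8080
```

This keeps a client's session on one Pod, but the session is still lost if that Pod dies, which is the fundamental limit of stateful scaling.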
Benefits of Stateful applications
- The session can be revoked at any time
- Easy to implement and manage for a one-session-server scenario
Drawbacks of Stateful applications
- Increasing server overhead: as the number of logged-in users grows, more server resources are occupied
- Fail to scale: if the sessions are distributed across different servers, we need a tracking mechanism to link a specific user session to its specific session server
What are Stateless applications?
A stateless application is one that neither reads nor stores information about its state from one run to the next; it does not store any state or session on the server. Stateless applications use a database to store all the information. Typically, a user sends a login request with credentials; any server behind the load balancer processes the request, generates an auth token, stores it in the database, and returns the token to the client. The next request is sent along with the token, and no matter which server processes it, that server matches the token against the database and grants the user access. Every request is independent, with no link to the previous or next request.
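The token flow can be sketched with a shared "database" dict that any server instance can consult (a minimal illustration; `secrets.token_hex` stands in for real token generation, and the response strings are our own convention):

```python
import secrets

token_db = {}  # shared database: token -> user (stands in for a real DB)

def login(user):
    """Any server instance can handle login: generate a token and persist it."""
    token = secrets.token_hex(16)
    token_db[token] = user
    return token

def handle_request(token):
    """Any server instance can validate the token against the shared database."""
    user = token_db.get(token)
    if user is None:
        return "401 Unauthorized"
    return f"200 OK for {user}"

token = login("alice")          # served by, say, server A
print(handle_request(token))    # served by server B, still works
print(handle_request("bogus"))  # an unknown token is rejected
```

Because the only shared state lives in the database, any number of identical server instances can be added or removed behind the load balancer.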
Benefits of Stateless applications
- Removes the use of sessions
- Scales horizontally
- New instances of an application added/removed on demand
- Consistency across various applications
- Reduces memory usage at the server side.
How to Adopt Stateless Applications?
Best Practices for Stateless Applications
- Try to avoid sessions at any cost.
- If the load on the application increases exponentially, distribute the load across different servers.
- Sessions are only useful for specific use cases such as FTP (File Transfer Protocol).
- Replicate session functionality using cookies and caching on the client side.
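The last point, replicating session state in a cookie, is usually done by signing the cookie so the client cannot tamper with it. A minimal sketch using only Python's standard library (the secret key and `payload|signature` format are illustrative assumptions, not a production scheme):

```python
import hashlib
import hmac

SECRET = b"server-side-secret"  # illustrative; keep real keys out of source control

def make_cookie(payload: str) -> str:
    """Sign the payload so any server can trust it later without storing it."""
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"

def read_cookie(cookie: str):
    """Return the payload if the signature checks out, else None."""
    payload, _, sig = cookie.rpartition("|")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return payload if hmac.compare_digest(sig, expected) else None

cookie = make_cookie("user=alice")
print(read_cookie(cookie))  # the untampered cookie round-trips

# A forged payload reusing the old signature is rejected
forged = "user=mallory|" + cookie.rpartition("|")[2]
print(read_cookie(forged))  # None
```

The session now travels with the client, so any stateless server instance can verify it without a shared session store.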
What is Serverless and Serverless applications?
Serverless is a cloud computing execution model where the cloud provider dynamically manages the allocation and provisioning of servers.
Serverless applications are event-driven, cloud-based systems where application development relies solely on a combination of third-party services (known as Backend as a Service, or "BaaS"), client-side logic, and cloud-hosted remote procedure calls (Functions as a Service, or "FaaS").
Benefits of Serverless applications
- NO SERVER MANAGEMENT: There is no need to provision or maintain any servers. There is no software or runtime to install, maintain, or administer.
- FLEXIBLE SCALING: Your application can be scaled automatically or by adjusting its capacity through toggling the units of consumption (e.g. throughput, memory) rather than units of individual servers.
- PAY FOR VALUE: You do not have to pay for idle capacity. There is no need to pre- or over-provision capacity for things like compute and storage. For example, there is no charge when your code is not running.
- AUTOMATED HIGH AVAILABILITY: Serverless provides built-in availability and fault tolerance. You don’t need to architect for these capabilities since the services running the application provide them by default.
How to Adopt Serverless Applications?
Any application that works seamlessly as a stateless application can be adapted to serverless. A stateful application cannot be serverless, because it maintains application state, whereas in a serverless architecture the FaaS or BaaS storage is ephemeral.
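As an illustration, a FaaS-style handler is just a pure function from request to response, keeping no state between invocations; the dict-based request/response shape here is a simplified stand-in for a real provider's interface:

```python
def handler(request: dict) -> dict:
    """A stateless, FaaS-style handler: everything it needs is in the request.

    Any state it must rely on (the auth token here) is expected to live in
    external storage, never in the process itself, because the runtime may
    tear the instance down between invocations.
    """
    token = request.get("headers", {}).get("Authorization")
    if not token:
        return {"status": 401, "body": "missing token"}
    name = request.get("params", {}).get("name", "world")
    return {"status": 200, "body": f"hello {name}"}

print(handler({"headers": {"Authorization": "t0k3n"},
               "params": {"name": "alice"}}))
print(handler({"headers": {}}))  # unauthenticated requests are rejected
```

Because nothing survives between calls, the platform is free to scale instances from zero to many and back, which is exactly what makes the pay-per-use billing described above possible.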