Breaking the Monolith
Micro services are everywhere today and are a critical paradigm to consider when building distributed systems. But, as someone I worked with once dryly observed, "everyone is building micro services", so it is critical to be clear about the motivations, which should be grounded in real business needs. Our end state is depicted below, and while part of the journey is the end, in this article I will describe our beginnings: how we actually applied Martin Fowler's ideas on micro services, in conjunction with AWS white papers, to break our monolith into a functioning micro services stack.
In a previous job, I had the opportunity to transform a legacy architecture, a portal built on traditional JEE, JBoss, and Oracle stored procedures, into a micro services architecture as part of launching a new analytics portal. Before I describe our beginnings, I am going to hit the rewind button and show below where we landed, since the end state is far prettier than the messy details that had to be worked out before we got there.
The first step in breaking the monolith is having a rough idea of what parts of your platform will become micro services.
This is a critical decision since a micro service architecture does not come for free: like anything in life, it has costs in terms of inter service communication, data duplication, etc. Hence a key guiding principle is to Organize them around Business Capabilities. Some capabilities, such as Authentication, Authorization, and Communication, are foundational and likely to exist in every business domain. Others are specific to each business, and in our case they became services delivering capabilities such as Analytics, Case Management, Subscription Management, Store Hierarchy Management, etc.
Each of these capabilities would follow its own product lifecycle from concept to production releases, and that is a critical reason for each to be a micro service.
Many a time I see teams building micro services to break larger components into smaller ones, only to find that many of those services simply don't have an independent product lifecycle. Think Products, Not Projects. A key question to always ask is this: if one of my micro services is unavailable, what parts of my overall experience would be unavailable? What capabilities would my customers still be able to use? Granted, some services are critical enough to bring your entire experience down (e.g. Authorization), but you get my point! In addition, when we think of each independent business capability as a product, it becomes an always improving product rather than a one off project. I would also add the caveat that when thinking of product capabilities, not everything has to be external facing: you can have internal micro services that offer a capability to the rest of your ecosystem. In our case the Communications Service was a good example, since it took care of our email, text, and push notifications.
Finally, the "grain" is always a topic of debate when deciding what should and should not be a micro service. There is no silver bullet here, and the key is to embrace iterative development. I always prefer to avoid getting too fine grained at the start and then, depending on how your product evolves, make adjustments to split services into finer grains.
Once we had a blueprint of what our micro services end state would look like, at least in terms of the various services, we began our development journey by adopting the Twelve Factor App pattern. I consider this critical to the success of a micro services architecture, and I have seen instances where not following it eventually hinders the independent lifecycle and business agility that are, among others, the key drivers for building micro services.
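To make the configuration factor concrete, here is a minimal sketch of how a twelve factor service reads its settings from the environment rather than from files baked into the build artifact. This is illustrative plain Java; the variable names are hypothetical, not the ones we actually used:

```java
// Twelve factor config sketch: settings come from the environment, so the
// same build artifact can run unchanged in dev, staging, and production.
public class ServiceConfig {

    // Return the environment value, or a safe default for local development.
    static String fromEnv(String key, String fallback) {
        String value = System.getenv(key);
        return (value == null || value.isEmpty()) ? fallback : value;
    }

    public static void main(String[] args) {
        // Hypothetical variable names, for illustration only.
        String dbUrl = fromEnv("ANALYTICS_DB_URL", "jdbc:oracle:thin:@//localhost:1521/XE");
        String logLevel = fromEnv("LOG_LEVEL", "INFO");
        System.out.println("dbUrl=" + dbUrl + ", logLevel=" + logLevel);
    }
}
```

Because the environment is the only input, promoting a build from staging to production is purely a deployment concern, never a rebuild.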
"How do we rapidly provide business value to our customers?" is something that we as engineers should always be thinking about.
So each of our services, and in many cases individual components, had its own Git repository, its own CI/CD pipeline, and its own infrastructure, giving every team independence and isolation. What comes along with this is the adoption of a DevOps mindset: every team had to run its own service, which includes managing its infrastructure, production operations, tests, and overall quality lifecycle.
Stateless services are great, but how did we achieve Decentralized Data Management? As is the case with most monolithic architectures, all of our data was in a single Oracle database, accessed and modified by various applications even outside the core engineering team. We had made significant investments in our Oracle infrastructure, so to balance near term customer needs with long term strategic goals we applied a two pronged strategy:
- For our relational needs we leveraged our existing Oracle infrastructure, but ensured we had separate schemas and databases for every service, providing the logical isolation that we needed. Thus, from a governance point of view, each service managed its own database, with a shared DB Ops team managing the overall database infrastructure
- For non relational needs we eventually leveraged databases such as Cassandra for columnar data and Neo4j for graphs
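The governance model above can be sketched in a few lines: a service can only resolve credentials for its own schema, which is what enforces the logical isolation. The service and schema names below are made up for illustration, not our actual configuration:

```java
import java.util.Map;

// Illustrative per-service schema ownership on a shared database:
// each service owns exactly one schema and cannot reach any other.
public class SchemaRegistry {

    // Hypothetical service-to-schema mapping, for illustration only.
    static final Map<String, String> SCHEMA_BY_SERVICE = Map.of(
            "case-management", "CASE_MGMT",
            "subscription-management", "SUBSCRIPTION",
            "communications", "COMMS");

    static String schemaFor(String service) {
        String schema = SCHEMA_BY_SERVICE.get(service);
        if (schema == null) {
            // No cross-service access: unknown callers get nothing.
            throw new IllegalArgumentException("no schema owned by: " + service);
        }
        return schema;
    }
}
```

The point of the sketch is the governance boundary, not the lookup itself: the shared DB Ops team owns the instance, while each team owns one entry in this mapping.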
With any micro services architecture, you need to ensure that you are building Smart endpoints & dumb pipes. Thus communication between services needs to follow a specified contract. We adopted both the Request/Response and Publish/Subscribe approaches, depending on the use case. But how exactly did services communicate with each other? For starters, how did they get the addresses of other services? We had adopted Spring Cloud since it was the fastest way for us to address some of the core infrastructure concerns of a micro services architecture. It wraps Netflix OSS, a battle tested framework for building micro services. This gave us Zuul as the API Gateway that our user interfaces (a React single page app and Xamarin based mobile apps) and developer APIs would call, Eureka as the service discovery mechanism for our micro services to communicate with each other, and Hystrix to build circuit breakers and fallbacks. Service to service communication made use of the Feign client, which internally made use of Eureka and Route 53.
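The discovery side of this can be sketched in a few lines of plain Java. This is not Eureka's actual API, just an illustration of the register/deregister/lookup contract that service discovery provides:

```java
import java.util.*;
import java.util.concurrent.*;

// Toy service registry illustrating the idea behind Eureka: instances
// register an address under a logical service name, and callers resolve
// that name at request time instead of hard-coding hosts.
public class ServiceRegistry {

    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();

    public void register(String serviceName, String address) {
        instances.computeIfAbsent(serviceName, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    // A real registry does this automatically for instances that stop
    // sending heartbeats, as Eureka does.
    public void deregister(String serviceName, String address) {
        List<String> live = instances.get(serviceName);
        if (live != null) live.remove(address);
    }

    // Resolve a logical name to any live instance; a real client stack
    // (Feign with Ribbon) would also load-balance across instances.
    public Optional<String> lookup(String serviceName) {
        List<String> live = instances.getOrDefault(serviceName, Collections.emptyList());
        return live.isEmpty() ? Optional.empty() : Optional.of(live.get(0));
    }
}
```

The payoff is that callers depend only on a logical name such as "case-management", so instances can come and go without any client redeploys.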
We also needed to Design for Failure. While we used Hystrix as the core framework, things did not stop there; we designed a range of recovery mechanisms. For example, for interactive services we implemented fallbacks using AWS ElastiCache (Redis), and for asynchronous processes we made use of queues and exponential backoffs to recover. Eureka itself ensured that processes not sending heartbeats were de-registered, so services always communicated with live instances. We used Apache Kafka as our overarching event log and leveraged it wherever we employed an eventually consistent architecture (e.g. for materializing the graph for hierarchy changes). Finally, a key need in micro services architectures is centralized logging and traceability, to allow engineering teams to troubleshoot effectively and quickly improve their QoS. For this we leveraged CloudWatch agents, the managed Elasticsearch service, and Zipkin.
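The two recovery patterns above, cached fallbacks for interactive calls and exponential backoff for asynchronous retries, can be sketched in plain Java. In production Hystrix, Redis, and our queues did this work; the helper names below are ours, not from any library:

```java
import java.util.function.Supplier;

// Sketches of two recovery mechanisms: retry with exponential backoff,
// and a cached fallback for interactive requests.
public class Recovery {

    // Retry the call up to maxAttempts times, doubling the delay each time
    // (baseDelayMs, then 2x, 4x, ...). Rethrows the last failure if all fail.
    static <T> T withBackoff(Supplier<T> call, int maxAttempts, long baseDelayMs) {
        RuntimeException last = new IllegalArgumentException("maxAttempts must be >= 1");
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                try {
                    Thread.sleep(baseDelayMs << attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        throw last;
    }

    // Interactive path: try the live service, fall back to a cached value
    // (in our case the cache was AWS ElastiCache / Redis).
    static <T> T withFallback(Supplier<T> call, T cached) {
        try {
            return call.get();
        } catch (RuntimeException e) {
            return cached;
        }
    }
}
```

In a real stack you would also cap the maximum delay and add jitter, so that many retrying clients do not synchronize into waves of traffic.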
Finally, none of the above is possible without Infrastructure Automation. Immutable infrastructure and treating infrastructure as code are paramount to the success of a micro services architecture.
Like any new venture, a micro services journey involves a lot of upfront build out, and especially for smaller organizations like ours it is easy to fall into the trap of building the perfect foundation, which can take years and keep you from delivering business value to your customers. So we took a more pragmatic approach. We adopted some AWS services initially, such as Elastic Beanstalk, SQS, RDS, and DynamoDB, to reduce time to market. As we delivered value, we gradually improved our automation and moved off certain services (e.g. replacing Elastic Beanstalk with our own automation on EC2). While all these AWS services served us well, we were also extremely cost conscious, so for key infrastructure components such as our Kafka stack we managed the infrastructure ourselves on EC2 using Ansible. Finally, infrastructure automation is incomplete without configuration management. Our adoption of the Twelve Factor App meant that our configurations were maintained separately from code and integrated into our CI/CD pipelines, and we also ended up using HashiCorp Vault for managing secrets.
My last words are that no architecture is perfect, and in retrospect I would do a few things differently myself. Micro services are not a panacea for every problem, but when done right they can enable you to deliver rapid business value to your customers.
My goal in this blog was to use our real world journey to outline how we actually applied these ideas in converting a monolithic architecture into micro services, making pragmatic choices and adopting an iterative mindset.