AWS re:Invent 2023 - Using serverless for event-driven architecture & domain-driven design (NTA203)

All right, I think everybody heard the keynote today from Swami. It's quite amazing, right, all the generative AI stuff? This talk is no different; it's still the same thing: innovation, innovation, innovation. So welcome to NTA203, Using AWS serverless for event-driven architecture and domain-driven design. I'm Balachander Kileri, AWS Solutions Architect and a DevOps systems engineer by trade, and I'm here with Jonathan Luke, Global Director of Software Engineering at City, and also Lee Gilmore, Global Head of Architecture and Technology at City. I've had the amazing opportunity to work with the City team. Please give a big shout-out to the City team in the front here. I've been working with them for the past year, and they have been continuously learning and trying to reinvent themselves using AWS serverless, and today we'll learn a few things about how their journey went in the past year.

With that, here's a quick look at the agenda. We're going to look at the different benefits and advantages of going towards the EDA and serverless approach, then Jonathan is going to go over City's journey with AWS in the past year, and then we'll have Lee Gilmore going over City's architecture.

So with that, how many of us here in the room are already users of AWS serverless or event-driven architecture? Okay, that's a handful. That's very encouraging, because 14 years ago I was super proud of getting my Linux server installed along with the databases configured in HA mode, and you

know, that was amazing back then, writing scripts. Now it's a different age; it's transformational what we can do today. With AWS serverless, all I have to do is focus on my business logic. In this generative AI and AI-driven era, we as enterprises are continuing to transform our businesses, and in this day and age there's an even greater need to reinvent ourselves. So how is AWS going to help you in this journey? If we think about reinventing ourselves, the first

thing we have to think about is what value we are providing to our customers and how we can reinvent ourselves. It's more of a design question, even before technology comes into the picture. And to get to the design question, you first have to think about events. What are events? Events are pretty much everywhere, and to innovate you need to completely understand not just your own product but your business domains and all the events within those business domains. Later on Lee will touch a little bit more on domain-driven design, but specifically about events: events can be anywhere. At the airport, where you went through the process of getting a flight ticket and checking in, that's an event. Or maybe in your hotel and its check-in process. Customer experiences get defined by events. This is so fundamental that, in this day and age of digital software innovation, we as enterprises have the responsibility to do the best for our customers, and events are one way to approach this problem from the bottom up.

With that context, let's talk about what Melvin Conway said: Conway's Law. It's very popular, and I relate to it closely, primarily because as a systems engineer I can understand what he's saying, and we all do: any organization that designs a system, defined broadly, will produce a design whose structure is a copy of the organization's communication structure. Why is this important? The more tightly coupled our processes and our teams are, the more tightly coupled our software is, because software just ends up being an implementation of how our teams are set up and communicate. Within an organization, it is our leaders' responsibility to keep things very loosely coupled, because that will end up building scalable systems over time, and this is all taken into consideration for the long term. Event-driven architecture is one way to help you promote this loosely coupled architecture. So let's take a look at what

AWS serverless does for it, because event-driven architecture is not new. We've been talking about it for a while, but where does AWS serverless come into the picture in enabling this journey of orchestrating event-driven architectures? The most important thing I realized as a systems engineer is that, yes, it was so satisfying to install a server, upgrade a server, doing all of that, but I didn't really care about the bigger picture of my business. I focused on my own individual functional role, and I was happy about it. But when I got to know the bigger picture of how I as an individual can contribute to the bigger business value, then it all made sense. That's where I fell in love with serverless and Werner's talks over time.

This is what I see: every single event-driven architecture needs a number of services across event producers, event consumers and the broker. You can have your team spin up those services, or use AWS serverless. The best part is that with AWS serverless you get all those services across the board, and you're able to do all of the underlying boilerplate with very minimal effort. That's the beauty of it: it's off the shelf, there's no infrastructure to manage, and you only pay for what you use. One more thing I've realized is that cost is an important factor for most enterprises. When something is fancy, complex and too costly, enterprises try to avoid it. They don't want to, you

know, spend too much, even though it looks fancy. But when your developers are unleashed with this freedom to experiment really quickly with an event-driven serverless model, it becomes very interesting how fast we can innovate as a business. That's where the combination of event-driven architecture and serverless together gives us amazing benefits: greater developer agility, scalability and fault tolerance, and lower TCO. So when you think about

greater developer agility and extensibility: tightly coupled systems are hard to innovate in, because feature velocity slows down. You have to understand your entire ecosystem of services and components, and one change on one side affects the components on the other side. So there is risk, and that's why we slow down; that's just part of the nature of this problem. But with a central event broker in the middle, you're now decoupled. What happens then? Any team member within each of those components is free to innovate, and all they have to understand is the event broker itself and the events that they are actually working on. This is why, later on, when Lee explains why we need bounded contexts, it makes sense to understand what the events are within those bounded contexts.

Taco Bell is an amazing example of this developer agility story. They realized during the pandemic that they had a sudden surge of orders from online delivery partners, and those orders had to be integrated with the point-of-sale systems in their stores so that customer orders were delivered seamlessly. They were able to do this integration with the existing systems, without many changes to them, by taking an event-driven serverless approach and creating an API layer to integrate with their in-store point-of-sale applications. All of the POCs were done within a span of two weeks, and within two months the application that you see here was in production. They load tested it up to millions of orders per hour, and later on they added further integrations with other delivery partners.

The next benefit you see here is increased scalability and fault tolerance. As a systems engineer I've been up all night sometimes, burning the midnight oil, just to troubleshoot one small change that brought down the entire system. That happens a lot. We have all been there, and we want to avoid that. The event-driven approach is one mechanism, because its asynchronous way of communicating avoids this cascading failure effect. From the order service's point of view, let's say your shipping service has some errors and doesn't respond. It's okay: you still have your responses back from your inventory service, you get the shipping service fixed, and then you're fine. But the most important thing we realized is that serverless also has this fault tolerance built in. Lego is an excellent example of using this approach to rebuild their e-commerce platform from scratch, and they were able to successfully scale and meet their Black Friday needs.

The final benefit is lower TCO. We don't always think of cost as the first thing when we're solving a problem, but in this serverless and event-driven approach it's fundamentally important to think like that, primarily because you need to know which events have an impact on business value, and then you need to know how to architect those in a scalable manner. With the event-driven and serverless approach you have no infrastructure to manage, it's all pay-per-use, and you don't pay for idle time. Liberty Mutual is an amazing example: an early adopter of AWS serverless and event-driven architecture, they processed over 1 million transactions for just $60. That's amazing, right? I couldn't do that 10 or 14 years ago; it wasn't even thinkable.

Now, taking a step back: yes, we know it's a design problem, and we have solved it with the help of serverless and the ability to have things spin up just when we need them. That's important, but everything starts from the ground up. Your developers now have more freedom to unleash their creativity, and that's where it starts, because we as business enablers have provided them this platform, which enables you to succeed more for your customers. That's where your developers get to brainstorm; we even call it event storming, and that's super important. Lee and Jonathan are going to talk about it a little bit more as well. At the end of the day, when you have these loosely coupled teams and products coming together, it's going to enable greater business innovation. And generative AI tools like CodeWhisperer, for example, are going to play a significantly important role in accelerating this developer agility even further, enabling you to modernize for the future at a much better pace. This is all with the focus on customer experience at the center. With that, let me bring Jonathan Luke onto the stage, please.

Thank you, Bala. Thank you, everybody, for joining us today. I appreciate you all coming out and listening to our story. What I'm planning to do is give you the why, what and who, the backstory around what we did, what we achieved and why it was important. First off, just a little bit about me. I'm a software developer by trade. I started out as a developer on an ERP system, rose through the ranks within City and ended up being the Deputy Global Director of Software Engineering, and in my current role I'm responsible for the North American software engineering teams. I've been at City for about 20 years,

so I've got a lot of good experience within this market.

So who are we at City Electric? What are we? We are a full-line electrical wholesaler. We sell wire, pipe, breakers, all the wonderful things that keep the light and the energy flowing in your home and in your office. As for who we are as a business, we are privately owned, we've been in business for 40 years, we have about 705 branches in North America, and we did just shy of $3 billion of turnover last year. And we

have about 2.3 million lines of inventory.

So, our story. We want to talk a little bit today about moving into event-driven architecture and serverless, and a little bit about where we came from. Like many existing businesses, we have a big giant bowl of spaghetti that is our software ecosystem. Right now we have about seven bespoke products that cover the entirety of our operations within City, and they're in all different languages, with all different databases, integrations, release cycles, things like that. I do want to take a second to compliment them: these software platforms have driven and grown our business from where we were a few years ago, which is around $300 million, to $3 billion now. So while it worked, eventually you do have to move on and mature your architecture. Here's a little bit about the teams that help support this.

I don't know how many of you have similar challenges. We have teams that are spread across a good chunk of the US and some in Canada, so we're in two countries, 13 states and three different time zones, and we have 60-plus people supporting all the various platforms and infrastructure that allow these software systems to run.

With that being said, what are the challenges? What are the problems, and why are we here trying to find solutions? One of the biggest problems we have within City's architecture at the moment is that it's not uniform. We have seven different platforms and seven different IDEs with seven different databases, so it's all over the place, which brings in a lot of duplication of work because there are no shareable components. Some of the platforms have aged a little bit because, as with a lot of ERP systems, once they're in it's very tough to get them back out; they've got momentum. So you deal with aging platforms, no uniform architecture and duplication of work, and that drives up costs and technical debt, which is a challenge. That then leads into people. You've got people supporting these systems, and with systems that are very siloed it's very hard to move people around between them. So you have challenges with recruitment, you have challenges with onboarding, and then the other thing that we end

up having a problem with is just keeping up with technical trends.

So we obviously wanted to make a change, but we had to start somewhere. At City we do a yearly inventory, and we call it stock take. This is where we stop the business for about 12 hours and count all those 2.3 million lines of inventory. We do this so we can validate the counts in our system, and it allows things to get picked up: if there are human errors, physical errors or system errors, it allows us to reset that inventory. Why do we do this, and why is it important? Within City we have the luxury of being essentially a profit-sharing business, so our employees share in the success of City. This inventory number feeds into the profitability of the branch, and that then feeds into people's compensation. So it's very, very important that we do it right, and that it's accurate and efficient.

This is what we do today: this wonderful piece of hardware that is a million years old. It is an old Motorola scanner. If you have poor eyesight or big fingers, you are in a world of hurt trying to use this out in the warehouse. It's a very dumb device. It does batch downloading over serial ports, there's no feedback, there's no interaction, and it's a very cumbersome tool to use. So we said, you know what, we've got to move on from this, for many reasons: the ones I've given before, but also we can't source these things anymore. We have to move on to new technology.

So where do we go? What do we want, and how do we structure a solution for this problem, trying to embrace the new, which is why we're here and what Lee is going to talk about, while still feeding into the old? Here is a handful of things that we put together as an outline for a solution. We at City have a history of building things on our own, so we wanted it to be bespoke. We wanted it to be cloud-based. We wanted to make sure that it was a modern architecture, something reusable that we could use as a foundational piece for other projects that will come on the heels of this. We also didn't want to go wholly on our own. We wanted to leverage existing knowledge within City, and Lee will probably talk a little bit more about that as well. And we also wanted to utilize ProServe and the assistance that the AWS teams can bring.

On that point, I don't know if any of you have had the luxury or the privilege of working with the AWS account teams or the ProServe teams. They've been absolutely fantastic and amazing with us. As many of you know, AWS is a massive set of tools, suites and services that you can get totally lost in; it's like an absolute forest. What those teams do is help us navigate that forest, making sure that we are pulling

the right solutions for the right problem. They also bring in various types of architects and experts in those fields to really guide us. It's been amazing, and I can't say enough good things about the solutions architects and the technical architects that have come from AWS. They've been incredibly supportive and patient, and worked with us a lot. The other thing they've provided, which has been brilliant, is upskilling and training. We're coming from a very on-prem, legacy type of development environment, which you're probably all familiar with, and moving into cloud-based development, so there was a big effort around upskilling and bringing knowledge and expertise into the teams, which they've been amazing at.

We've mentioned CEF a couple of times here, so just to give you a quick background, for about 30 seconds: this is the founding member of City Electric. City Electric as a business was founded in Coventry, England, in 1951, and it expanded into the US in 1983. We had different businesses at that point, with different technology teams. They were running into problems several years back and started to embrace cloud and serverless, and that's where Lee and others joined us, so we were able to lean on their expertise to help guide our journey.

So this is what we did, and this is where I'm really most proud

of. We took all these different challenges, new technologies, new team philosophies, the whole nine yards, and we were able to build something incredibly quick, incredibly effective and incredibly cheap. We went from teams that had very little experience and no solution to, within nine weeks, a full production-ready application that worked exceedingly well. Being a little bit reserved and careful, we only did this within 16 locations, and we had 78 end users,

but it worked, and it worked fantastically. Here's a quick screenshot of what we did. It was all on Zebra scanners, with a Vue front end. It was very interactive, with guided journeys and a very simple UI/UX for our users. Another benefit was that, because it was all on Wi-Fi, it was able to interact with our existing ERP system dynamically, so we were able to get feedback back into the scanners. It was an incredible experience for

the users.

So in all of this, what did it cost? There's a little joke here: it cost us peanuts. The night to run all these systems cost $463, which was amazing. Granted, it wasn't the whole business, but it still shows how quickly you can develop on AWS and how cost-effective it really is.

Here's some general feedback that we got from our branch staff. The short version is they loved it. It saved them hours, it worked, it was clean, and it was just a really great experience for them. And with the syncing with our current ERP systems, it was a fantastic result. The team should be really, really proud.

So what's next? Where do we go from here? Well, first off, nothing is perfect, and neither are we, and neither was this. There was some feedback that we had to go back and address and correct, so that was a big thing that we were going to be working on. The other

thing that we wanted to tackle is scalability. We have 705 stores and we only did 16, so we've got to scale this out by 30x or more. There were some bits of the code that we took shortcuts on, as you do on a nine-week timeline. You've got to go quick, and you've all probably been there at some point, where you beg, borrow and steal to get through and hit the deliverables. But there are some things we wanted to go back and revisit. And then, like every other product, we want to make sure that we are continually improving it and making it better for our users, so we're obviously going to add more features as we move on. With that, I'd like to introduce Lee up onto the stage now.

Hi everyone. My name is Lee Gilmore, and I'm the Global Head of Architecture and Technology at City. What I'm going to talk about is the global tech strategy that allowed us to build this out in nine weeks. The teams previously hadn't used

serverless, AWS or TypeScript, and did really well to get this out production-ready. We're going to talk about these five key areas: serverless first, supported by AWS; domain-driven design and team topologies; EDA, so the event-driven architecture strategy; our evolutionary architecture; and then we're going to touch on the global tech radar.

So, serverless first, supported by AWS. Historically we come from data centers and VMs, with City, CEF and CES. This move to serverless allowed us to be really quick to market. We could innovate, get this out very quickly, and then iterate on these particular designs. We also had reduced operational complexity: we're not having to patch servers anymore, as that's part of the shared responsibility model with AWS. What this gave us is massive scalability and, obviously, high availability too. And now we've got a future-proof technology stack as we move away from Sybase and Delphi to TypeScript, CDK and serverless technologies. As Jonathan talked about earlier, we've obviously got lower running costs for anything that we run in the cloud. Just to note here, we built out our reusable reference architectures as well. We've got something called the City CDK, where we can package these parts of IaC, take the cognitive load off teams and allow them to utilize this through an npm package. And although we are serverless first as a mindset, it is still buy before build, so if there's no competitive advantage in building something out, we're just going to buy that off the shelf.

As Jonathan said earlier, we've been supported by AWS ProServe. 18 months ago we had some initial wins in the UK. Teams were very siloed. They'd done some small MVPs and got them out, and it was a learning curve, but there was no reuse and no standards really in place. So when ProServe and I first started 18 months ago, we did a gap analysis, and this allowed

us to move towards a cloud target operating model. We generated a backlog of work for that, and they also supported our reference architecture. That included being on site and doing whiteboarding sessions with us, and obviously training and upskilling, so a lot of immersion days and working with the teams to help upskill them. As I said before, this has allowed us to move to what we call the City CDK: taking the cognitive load off teams and giving them reusable building blocks of code. And running in the cloud is very different from running in a data center, so they also supported the security teams with best practices and standards.

Once we did all this, we then started looking at what we're actually going to build and who is going to build it, and this is where we looked at domain-driven design and team topologies. This is a great quote from Eric Evans, who wrote the original Blue Book, for anyone that's read it. The quote really resonated with us: the heart of software is its ability to solve domain-related problems for its users. That's what we need to do. We need to go back to the actual problem that we're solving through these serverless architectures. Domain-driven design aligns real-world business domains and real-world problems with architecture and software engineering, so we're actually building something that the customer wants. It places the key focus on the core domains: what makes us unique, where's the competitive advantage? That's where we want to focus our efforts. It also promotes continuous collaboration between technical people, so the engineering teams, and domain experts, i.e. the customer. They agree on these models and refine them, so we're all on the same wavelength about what we're actually building. And it allows us to understand the key domain events: what are the significant changes happening within the domains?

I said the word domain a lot of times there, but what actually is a domain? If we went to the Cambridge Dictionary definition, we'd see things like "an area of interest or an area over which a person has control", with examples such as "You'd better ask Paul; electronics is not my domain, I'm afraid" and "Boardroom decisions are the exclusive domain of company directors". If we look at what that means tangibly, we've got fuzzy interpretations of interest and meaning, we've got boundaries for language and concepts, and distinct boundaries are ideal for system design. Good fences make good neighbors; that's a common quote that most of us will have heard.

Then we've got this notion of our overall domain. This is City, and it's very complex: it spans manufacturing, warehousing, sale and customer. If we went to build this outright, we'd build a complex monolith, a big ball of mud. So what we typically do is look at things like event storming and context mapping to start to break this down into smaller subdomains, and we've got a few of those listed there. As we do this, we find there are three types that we need to look at. We've got core, which encapsulates what makes you competitive, what differentiates you from your competitors, and this is where you want to spend about 80% of your engineering effort. Then we've got supporting. These support the core domains but don't really have value on their own, and they don't really affect your competitiveness. And then we've got generic, which is typically very complex to build, and more often than not we're just going to buy it off the shelf.

To bring this to life a little bit, at City we would have core domains such as order fulfillment, product, customer and price. This is very unique to what we do at City; it makes us competitive. We've got supporting domains like data, BI, integrations and messaging, which underpin everything we do in the core domains. And then we've got generic domains like finance and HR. Everybody uses pretty much the same systems, so we're just going to buy those off the shelf and integrate them with the other systems.

As we start to break these down into different subdomains, what we typically do is look at the domain model. This is your structured knowledge of the problems within the domain, and it's made up of rules, language and concepts. That's key here: it's all about language. It should identify the relationships among all of the entities within the scope of that domain. It's typically modeled with things like Post-it notes if you're doing event storming, or it could be pseudocode and diagrams. This is to make sure the customer and the engineering teams are all on the same wavelength about what we're building, so it's just a construct to allow that.

As we start to build out these subdomains, we've then got the notion of a bounded context. A bounded context is the boundary of the subdomain model, and it's always desirable to have a one-to-one mapping between the domain model and the bounded context. The subdomain therefore equates to the problem space: the language, the concepts and the rules within that subdomain. The bounded context is the solution space, and this is where we're going to build microservices. One team should build and maintain within that bounded context, because we want to limit dependencies between teams and we want them to do a full slice of the work. One team could, however, own multiple bounded contexts.

Again, to bring this to life a little bit, we can see a simplified diagram here. We've got HR, which is generic; we're just going to buy that off the shelf. We've got data and BI, which is a supporting subdomain, so we're probably going to have a one-to-one ratio between build and buy. And then we've got price and customer, which are both core. These make us competitive, and we're going to spend a lot of engineering effort building them out. We've zoomed into customer a little bit there to see the domain model, which is in yellow. For the customer subdomain, this part of the overall system, this is where we've got language, processes and rules. You can see that red dotted line around the edge, and that's the bounded context. That's where we're going to start building out these microservices. Within that bounded context it's well encapsulated. We've got one or more microservices communicating in there, and they can communicate privately between themselves; they might have private domain events. But all of that is fully encapsulated, so the only way we talk to other domains is either by exposing certain functionality through REST APIs or by raising public domain events. One thing that's missing from that diagram is that at City we would typically have an EventBridge custom bus in there as well, and that allows us to also consume domain events.

Now we're looking at team topologies. This is a great quote from Matthew Skelton and Manuel Pais, who wrote the book on Team Topologies: choose software architectures that encourage team-scoped flow. So how do we allow teams to work very quickly? Team topologies is an approach to organizing business and technology teams for fast flow and limiting the dependencies between them. There are four fundamental team types that we need to align to, and this helps reduce the cognitive load on teams and streamlines the interactions between them.

These are the four fundamental team types that we've aligned to at City. We've got stream-aligned teams; for us that would be our domain teams, and in some companies these would be product teams. These are long-lived teams that typically work on a backlog of work. Then we've got the notion of enabling teams, which help support the stream-aligned teams. This could be, for example, architecture or UX practices, so we might have architects work with a team, help upskill them and overcome certain problems, and then step back and let the team work away. We've got complicated-subsystem teams, which we don't actually have at City; this is where significant mathematics, calculations or technical expertise is needed. And finally we've got platform teams, which build internal developer platforms. Again, this supports the stream-aligned teams and allows them to move quicker.

So how have DDD and team topologies affected City as an organization? We can see here that we've got our enabling teams in yellow. We've got enterprise architecture, who are typically looking at the North Star architecture and supporting the stream-aligned teams in moving towards it. We've got the architecture enablement team, the architecture practice, and the engineering, UX and QA practices, again supporting and upskilling those stream-aligned teams. And we can see the stream-aligned teams there running left to right, the purple ones, so these

are all long-lived domain teams, if we go back to DDD. Underpinning all of this, we've got our cloud engineering teams, who build out all of the platforms for us. If we overlay the DDD aspect onto this, we can see our core subdomains: we've got things like product and price, customer, and sale, and this is where we're going to spend a lot of engineering effort. We've got supporting ones, like the common component team and BI analytics, and then we've got the generic subdomains; these are things that we're just going to buy off the shelf, typically.

If we then zoom out and look at the high-level architecture at City, we call this serverless architecture layers, or SAL architecture. At the top we have channels; this could be EDI, a chatbot, web or mobile, and they only communicate through this experience layer. These are backends for frontends typically; they don't have a lot of business logic in them, and this is just the interface into the domain services. We can see there we've got integration APIs, B2B, and web and mobile, and they then talk to the domain services. These are private to the AWS network, so we use PrivateLink to talk between them, with SigV4 and IAM authentication.

What you can see here, going back to team topologies, is that we want the teams to do the full slice. With the Branch WMS project, the team did the front end, the backend for frontend, and the backend warehouse API changes. We don't want to have dependencies between teams; we want them to do the full vertical slice.

Then we've got our data layer. We can see we've got our enterprise service bus, and things like our data lake and data warehouse; you'd also have machine learning and AI in this area. Then we've got the platform layer; this is where we've got things like developer experience, landing zones and pipelines, everything that's supporting the teams above with undifferentiated heavy lifting. Again, we want the teams to move very quickly, so spinning up an AWS account or equivalent should be very quick.

On the left-hand side we've got our cross-cutting concerns; this is typically something that affects everything within an organization, things like logging and tracing. We've got our own SDK, and obviously we've got our City CDK that I talked about earlier. This is managed through an enablement team; this is our common component team, who build these out. So we saw an ESB on there, and

a lot of people will be thinking about SOA-style architectures, which we don't actually have at City, but we're now going to go on to the EDA side of it, our EDA strategy. Bala talked about this earlier, and this is a great quote from Werner Vogels as well: "everything fails all the time." This is why we want to decouple our serverless microservices. The EDA strategy that we've got has allowed us to have decoupled domain and experience services, so typically we only communicate

through events. There are always times we need to do synchronous calls through REST, but it is event-driven first where we can. This enables these services to scale independently, and the events are based on our domain models, going back to the DDD that we talked about earlier. Again, Bala talked about this earlier, but we can utilize error handling, so things like DLQs: when a service comes back online we can replay the events, and again the customer's not affected.

I said the word "event" there quite a few times, but what is an event? There are two types that we use at City. We've got a typical event; this is a domain event. It's something that's happened, it's immutable, it's in the past; this could be order created or invoice generated. Then we have the notion of a command, which we don't use a lot at City; this is where there's an intent aimed at another domain, something like send email or generate PDF.

Now, I won't read this verbatim; this is a quote from AWS about an ESB, but what this typically is is allowing different services to communicate through events in a standard way. Going back to serverless architecture layers, we can see in our data layer we've got our ESB, which is Amazon EventBridge, and we can see that we have events flowing between the experience layer and the domain layer there. To bring it to life a little bit, we have things like payment cancelled, order created, user logged in or product updated.

The reason we call it an ESB is that we use something called the single-bus, multi-account pattern. You can see on this diagram you've got a central event bus that everything flows through from an event perspective, and then we've got the blue, orange and purple teams here; they're publishing domain events to the central event bus but also consuming into their own local event bus where they need to, so we've got that asynchronous two-way communication. Again, zooming out a little bit and having a look at an example at City, we can

see our stream-aligned team has an experience layer, and you can see this is our customer web app. It's publishing domain events, which are going into our data layer, so our ESB, and this is our shared EventBridge account. Then we've got target rules that route to our domain layer, so these private domain services, and these again are looked after by our stream-aligned team. We can see the order domain might do some processing, and the customer service might do its own processing off the back of that.

Now you can see the customer service in the domain layer is then publishing back to the main event bus, and that's being consumed again in the front end. You might use something like AppSync subscriptions; you might want to do that in real time and give some kind of update to the user, or maybe cache some kind of read store in the experience layer.

The key learnings for us using Amazon EventBridge: you haven't got guaranteed ordering, so when you need that you need to look at SNS and SQS

FIFO patterns. There is a way of using the two in conjunction; if anybody's interested, you can grab me after the talk and we can talk that through. Publishers should always validate their events; you should be a good event citizen, because a lot of other domain services are going to consume that event. At the same time, you should always validate anything you consume. Amazon EventBridge supports at-least-once semantics, which means you could get the same event multiple times.
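Because the same event can arrive more than once, a consumer has to recognise repeats as well as validate what it receives. The following is a minimal TypeScript sketch of that idea, not City's implementation: the `DomainEvent` envelope, its field names, the `OrderCreated` payload and the `IdempotentConsumer` class are all hypothetical, and the in-memory `Set` stands in for whatever durable store a real service would need.

```typescript
// Hypothetical event envelope; the field names are assumptions for illustration.
interface DomainEvent<T> {
  id: string;         // unique per event, used for de-duplication
  type: string;       // a past-tense fact, e.g. "OrderCreated"
  occurredAt: string; // ISO 8601 timestamp
  detail: T;
}

interface OrderCreated {
  orderId: string;
  total: number;
}

// Consumer-side validation: check the shape before trusting the event.
function isOrderCreatedEvent(e: unknown): e is DomainEvent<OrderCreated> {
  if (typeof e !== "object" || e === null) return false;
  const ev = e as Record<string, unknown>;
  const d = ev.detail as Record<string, unknown> | null | undefined;
  return (
    typeof ev.id === "string" &&
    ev.type === "OrderCreated" &&
    typeof ev.occurredAt === "string" &&
    typeof d === "object" && d !== null &&
    typeof d.orderId === "string" &&
    typeof d.total === "number"
  );
}

type Outcome = "processed" | "duplicate" | "invalid";

class IdempotentConsumer {
  // In-memory stand-in only; it does not survive restarts or scale out.
  private readonly seen = new Set<string>();
  private readonly handle: (e: DomainEvent<OrderCreated>) => void;

  constructor(handle: (e: DomainEvent<OrderCreated>) => void) {
    this.handle = handle;
  }

  receive(raw: unknown): Outcome {
    if (!isOrderCreatedEvent(raw)) return "invalid";
    if (this.seen.has(raw.id)) return "duplicate"; // ignore the redelivery
    this.handle(raw);
    this.seen.add(raw.id); // recorded only after the handler succeeds
    return "processed";
  }
}

// Usage: the same event delivered twice results in one charge.
const charges: string[] = [];
const consumer = new IdempotentConsumer((e) => {
  charges.push(e.detail.orderId);
});
const event = {
  id: "evt-1",
  type: "OrderCreated",
  occurredAt: "2023-11-29T10:00:00Z",
  detail: { orderId: "o-42", total: 19.99 },
};
const first = consumer.receive(event);  // "processed"
const second = consumer.receive(event); // "duplicate"
```

Recording the event ID only after the handler returns means a failed handler can be retried; the trade-off is that two concurrent deliveries could both pass the check, which a durable store with a conditional write would have to guard against.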

Your downstream services have to be idempotent; they need to be able to ignore the subsequent events. What you don't want to do is charge a customer twice or maybe send an email three times.

We've gone through very quickly there the strategic side of DDD, and now we're going to look at the tactical side, so we're going to look at evolutionary architecture. Again, this is a quote that resonated with us; this was by Robert C. Martin, or Uncle Bob as some people know him, and this

is: "when any of the external parts of the system become obsolete, such as the database or the web framework, you can replace those obsolete elements with a minimum of fuss."

At City we knew this was transient. To start with we would call through to our monolithic API. We knew that over time we'd gather insights and might want to swap out certain AWS services for other ones, and we knew that as we broke down our monolithic API we'd break it down into domain services. That meant we'd have to start changing different HTTP calls, hitting different REST interfaces, and we'd have different DTO objects coming back too. Some of these services that we build might be more applicable to containers, some might be more applicable to functions, and vice versa, and we only find out some of this when they're in production and we've actually got some kind of scale going through them. For the same reason, we can't know all data access patterns up front, so we want to be able to swap out a certain database during development if we need to, without a full rewrite.

At City we've got two types: the lightweight version of hexagonal architecture and the full version. We can't talk through the full version today, because it covers things like repositories, use cases, aggregates and aggregate roots; that's a talk in its own right. But we'll talk about the lightweight version. We use this typically when it's more of a CRUD-style service; it could be in our experience layer or an integration. The full version we typically use when there's a lot of business logic in there, a very heavy domain model.

If we look at this example here, we've got API Gateway invoking a Lambda function; that's what the purple circle denotes. We've got primary adapters on the driving side, so this is some kind of input that's coming in. This would be taking the event from API Gateway, transposing it, maybe doing some kind of instrumentation and logging, and then calling the use case for the business logic; eventually that's going to return the status code and the body back to API Gateway. So this is completely devoid of any business logic; it's purely framework. The use case it calls is the reverse: it has no notion of frameworks, it is purely domain logic. This will have to persist or retrieve data, so we've got DynamoDB here, and it does that on the driven side through something called secondary adapters. We can also see that we're publishing a domain event there to Amazon EventBridge.

The beauty of this comes in where we might want to go with a storage-first pattern, so we might want to go from API Gateway to SQS and then Lambda. All of a sudden we've got this primary adapter that needs to understand SQS. What we typically do is write a new primary adapter for SQS and slot that in very easily, and nothing else changes from left to right. And then if you look at

the driven side, you can see that we might want to swap out Amazon EventBridge for SNS. Again, we can create a new secondary adapter for SNS, and with ports and adapters we can just unplug the EventBridge one and plug in SNS very easily. If we'd done this as one big Lambda handler, this would have been pretty much a rewrite, and if you extrapolate that out across maybe a hundred or a thousand Lambdas, that's a lot of work.

I just want to touch upon the global tech radar that we've got at City, because this has really underpinned what we've done. The tech radar is something that was created by Thoughtworks, and it allows us to look at these four quadrants: tools, techniques, platforms, and languages and frameworks. Then we've got these rings. We've got adopt; for us at City this would be things like TypeScript and the CDK, and this is what we want teams to go to any time they build out a new service. Then we've got the notion of on hold, so we don't want to do any new development with cbase or Deli. This makes sure that we're all on the same wavelength with what we're building globally. Things like gen AI might be in assess, so we might look at Amazon Bedrock, for example; as we start to use this a little bit more we might put it into trial, and eventually, as we use it in more and more services and think we get real value from it, we can move it to adopt. In that way, as a business, we're all aligned on what we're actually building.

This has really supported our common component team, who are building things out like our City CDK, the composable architecture. If we had teams using Terraform and Serverless Framework, they wouldn't be able to use this particular framework that we're creating, and that's exactly the same with our front-end components and our design language. We did this over 18 months in the UK, and it has really underpinned the first bit of work that we did in North America, which is our Branch WMS project.
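To make the lightweight hexagonal structure described earlier in the talk a bit more concrete, here is a small TypeScript sketch of swapping adapters at both edges while the use case stays untouched. It is an illustration under stated assumptions rather than City's code: every name (`OrderStore`, `EventPublisher`, `createOrderUseCase`, `httpAdapter`, `queueAdapter`) is hypothetical, the request and queue shapes are heavily simplified versions of what API Gateway and SQS deliver, and the secondary adapters are in-memory stand-ins labelled "eventbridge" and "sns" instead of real AWS SDK calls.

```typescript
// Domain types and driven-side ports; no framework types appear here.
interface Order { id: string; sku: string; quantity: number; }
interface OrderStore { save(order: Order): void; }
interface EventPublisher { publish(type: string, detail: unknown): void; }
type CreateOrder = (input: { sku: string; quantity: number }) => Order;

// Use case: pure domain logic that only knows the ports.
function createOrderUseCase(
  store: OrderStore,
  publisher: EventPublisher,
  newId: () => string,
): CreateOrder {
  return (input) => {
    if (!(input.quantity > 0)) throw new Error("quantity must be positive");
    const order: Order = { id: newId(), sku: input.sku, quantity: input.quantity };
    store.save(order);
    publisher.publish("OrderCreated", order); // public domain event
    return order;
  };
}

// Secondary (driven) adapters: in-memory stand-ins for DynamoDB and a bus or topic.
function inMemoryStore(): OrderStore & { orders: Order[] } {
  const orders: Order[] = [];
  return { orders, save: (o: Order) => { orders.push(o); } };
}
function recordingPublisher(channel: string, log: string[]): EventPublisher {
  return { publish: (type: string) => { log.push(`${channel}:${type}`); } };
}

// Primary (driving) adapters: translate transport input, then call the use case.
interface HttpRequest { body: string; }
interface HttpResponse { statusCode: number; body: string; }
interface QueueBatch { records: { body: string }[]; }

function httpAdapter(createOrder: CreateOrder) {
  return (req: HttpRequest): HttpResponse => {
    try {
      const order = createOrder(JSON.parse(req.body));
      return { statusCode: 201, body: JSON.stringify(order) };
    } catch (err) {
      return { statusCode: 400, body: JSON.stringify({ message: (err as Error).message }) };
    }
  };
}
function queueAdapter(createOrder: CreateOrder) {
  return (batch: QueueBatch): number => {
    let handled = 0;
    for (const record of batch.records) {
      createOrder(JSON.parse(record.body));
      handled++;
    }
    return handled;
  };
}

// Wiring: the first variant is HTTP in and "eventbridge" out.
const log: string[] = [];
const store = inMemoryStore();
let counter = 0;
const nextId = () => `order-${++counter}`;
const handleHttp = httpAdapter(
  createOrderUseCase(store, recordingPublisher("eventbridge", log), nextId),
);

// The second variant swaps both edges (queue in, "sns" out); the use case is unchanged.
const handleQueue = queueAdapter(
  createOrderUseCase(store, recordingPublisher("sns", log), nextId),
);

const res = handleHttp({ body: JSON.stringify({ sku: "cable-1", quantity: 2 }) });
const handled = handleQueue({
  records: [{ body: JSON.stringify({ sku: "fuse-9", quantity: 5 }) }],
});
```

The point of the sketch is the wiring at the bottom: moving to a storage-first input or a different publishing target only changes which adapter is passed in, which is what keeps the change far smaller than rewriting one large handler.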

So we needed to use the agility and speed of serverless and really thin slices. As Jonathan said, we had quite tight timescales to get this out, and we knew it would take time to strangle the monolith that we have in North America. The first iteration was going to call through from the experience layer to the domain layer, and future iterations over time would then call out to the domain services as we start to break those down. That meant we had to have evolutionary architecture; we wanted to make this really easy for ourselves. We used the patterns, best practices and reference architectures that we'd built out over the past 18 months to really underpin the work that we did in North America, and the North American team also aligned to the tech radar, which meant that they could start to utilize the common components that we had.

This is a very simple diagram; a lot of services have been removed, as there's only so much I can get on the slide,

but you can see here we've got a scanner device, so this is somebody doing the stock take. We've got AWS WAF, because we want to make sure that we restrict the use of this down to the actual branches. The Vue.js app is in S3 and we've got a CloudFront distribution there, and obviously there are other services at play, like Route 53. There's also authentication, so it's authenticated with Azure AD. Then we've got API Gateway, as you can see, calling through to these Lambda functions within a VPC.

The reason we've got the Lambda functions in a VPC is that typically, as I said earlier, the synchronous communication uses SigV4 and IAM, so we keep everything private to the AWS network. Again, going back to DDD, we can see these Lambda functions are raising public domain events, and they're going to our shared EDA account, so our shared Amazon EventBridge bus. We've got gateway endpoints and VPC endpoints there for talking out to Parameter Store

and DynamoDB, again keeping it private to the AWS network. We've got a NAT gateway there, because we want to make sure any egress is all on a static IP address, and this allows, on the right-hand side, the on-premises API to allow-list this particular experience layer. Again, that's fully authorized: you've got a machine-to-machine flow there, so a client credentials grant flow between the two.

I'm now going to invite Bala and Jonathan to come on stage and talk about what we actually learned as part of this project.

Thanks, Lee. Obviously there was a lot of work that went into this across the board, and, like everything else, there are wins and there are losses, things we did right and things we did wrong. The thing that I most want to focus on is the fact that we did it. We took a lot of these concepts, we took a lot of this new technology, and we found a way to deliver a solution in a very quick and efficient way that had very minimal cost

impact, which was impressive. The team did an amazing job working together, rallying around each other to deliver this solution, with all the different support coming from AWS and from our colleagues in Europe. But, like everything else, there are things here that didn't work as well. We did incur some technical debt, like we mentioned before, that we had to go back and correct and clean up. There was a little bit of a siloed approach; we just needed to do that in order to get it delivered. And a lot of the processes that we had were ad hoc, so we had to go back and restructure them to be a little bit more formalized in their ways of working.

At the end of the day, we delivered value to the business, we did it in a modern architecture utilizing cloud services, and the teams were able to hit their mark, which was an impressive result. I'm very proud of what everybody has done. Thank you all for coming; we appreciate the time and the effort.
