
AWS re:Invent 2023 - [LAUNCH] Introducing Amazon ElastiCache Serverless (DAT342)

Come learn about the recently announced Amazon ElastiCache Serverless. With ElastiCache Serverless, you no longer have to provision, plan for, or manage your cache cluster capacity. ElastiCache Serverless helps you get started in under a minute with a distributed in-memory cache and instantly scales in and out to support even the most unpredictable application traffic patterns. In this session, learn how ElastiCache Serverless works and how it can help you accelerate application performance. Learn more about AWS re:Invent at https://go.aws/46iuzGv.

AWS Events


Thanks for coming. We're going to talk about ElastiCache Serverless; hopefully that's why you're here. As you probably heard, we launched ElastiCache Serverless yesterday, Peter announced it in his keynote, so I'm excited to be here to talk about it with all of you. We'll go through a little bit about what caching is, in case you're not familiar with what a cache is at all, and then we'll talk about how ElastiCache Serverless works: what happens behind the scenes and what its features are. Then we'll go a little deeper into how we built it and how it works behind the scenes, so you get a better understanding of its capabilities and so on. My name is AB, I'm a product manager on the ElastiCache Serverless team, and I have Yaron with me, who is the software engineering manager on the team, and we're excited to walk you through all of this.
So before we dive into ElastiCache Serverless: who here is familiar with what a cache is? I'm assuming most of you are, given you're in an ElastiCache talk. That's good. For those of you who may not know, the basic definition is any hardware or software component that stores data so it can be accessed quickly, at high speed and with low latency. Customers usually use it to accelerate application performance: you put a cache in front of a database, or in front of some other service where reading the data is expensive, whether expensive from a cost standpoint or from a latency standpoint. So you put your most frequently accessed data into ElastiCache, or any cache for that matter, for high-performance, low-latency access.
Now, ElastiCache is AWS's premier caching service. Who here uses ElastiCache today? All right, almost half of the group here, that's good. As you're probably aware, we are compatible with two open source engines, Memcached and Redis, and hundreds of thousands of customers use ElastiCache today to optimize performance for their applications and to use Redis as a data store. Redis is a pretty capable engine, and we'll talk about that in a second, but the main reason customers use a cache is to increase performance. Any time you've got an application that is slow, or reads from your database or the underlying service are taking too long, that's when you would use a cache, and customers are using ElastiCache to store that frequently accessed data in front of other databases like RDS, Aurora, DynamoDB, and so on.

The other reason you would use ElastiCache is higher scale. Today's databases are pretty capable: you can scale vertically and horizontally, and you can add read replicas as your application scales and your throughput increases, but it can get expensive. If you're trying to scale, for example, a MySQL database, or any relational database, achieving the scale you want can get costly. Caching is a cost-effective way to achieve high scale: because all the data is stored in memory, it is very cheap in terms of compute to access it, so you can achieve higher scale with ElastiCache, and it can be more cost effective than trying to scale your database. So those are some of the main reasons customers use ElastiCache today. Redis—I'm pretty sure most of you are familiar with Redis. Who here uses Redis? Yeah, almost all of you, okay, that's good.
So I'll just touch on it briefly: it's a pretty popular database with a very simple yet flexible and capable API, and a lot of in-memory data structures like sorted sets, hashes, even streams, pub/sub, and so on. It's a pretty capable in-memory engine that you can use for a variety of use cases, so Redis can go beyond caching: you don't have to just use strings with SET and GET commands to cache your data, you can also use it as a data store. People are using it for leaderboards, for streaming and pub/sub type use cases, and a whole bunch of others. It has built-in support for replication and scaling; if you're using ElastiCache today, you're aware it lets you operate in cluster mode enabled, which is a multi-sharded cache environment, so you can scale out and add more shards as needed, add more read replicas as needed, and scale back in when the traffic is low. We touched on the in-memory data structures, so I'm not going to go deeper there.

Memcached is the second engine we support, probably the second most popular caching engine after Redis if you ask me, and a lot of our customers use Memcached for caching as well, simply because it has a simple-to-use API and it's really good at what it does: it's simple and it's fast. So if you want a cache that gives you ultra-fast performance, you would use Memcached.
Let's talk about some of the pain points customers have been telling us about that led us to invent and build ElastiCache Serverless. The main problem customers tell us is that capacity planning is hard. If you've done capacity planning, deciding how large a cluster you need, how many replicas, or even which instance type to use, you know how cumbersome it can be. Usually what customers do is look at the traffic patterns they've seen in the past and set up enough capacity, or maybe slightly more, than the peaks they've seen. What that means, though, is that if you see a peak higher than what you estimated, you see a performance impact, because you just don't have enough capacity. And in most other scenarios, at night or on weekends when your traffic is low, you're paying for capacity you're not using; it's just sitting there, so you're incurring costs you don't necessarily need to incur.

You can manually adjust capacity, as I mentioned earlier: you can decide that when traffic increases you'll scale up or add more shards or read replicas, and when traffic goes down you'll scale back down. Yes, you can do that, but doing so is also difficult, because you need to know exactly when to scale. What's the memory threshold on my cache at which I should initiate a scale-out? What's the CPU utilization at which I should initiate a scale-out? That's not easy to figure out. If you've got a Redis workload, say a leaderboard using a sorted set, knowing exactly when to scale in terms of a CPU threshold is hard, because as your requests per second or throughput increase, your CPU may not increase linearly. So customers generally have been telling us that capacity planning is hard.
Let's talk about Memcached for a second. If you use Memcached, you know that achieving high availability with it is pretty cumbersome; it's not something the Memcached engine supports out of the box. So what many of our customers do is essentially create two clusters and write to both at once, either using some homegrown software or sometimes using open source proxies, and if a cluster becomes unavailable, they just fail over to the other one. But it's not something the engine supports out of the box: if a node is lost, the data on it is lost, because there is no built-in support for replication in Memcached either. So achieving high availability, or even some semblance of data durability, is extremely hard with Memcached. It does support scale: you can always add more nodes, scale out, and reshard, but all of that resharding logic needs to be handled on the client side. The client needs to be aware of which keys live on which shard and redistribute the data when you add a new node. Some customers just let the new node hydrate from zero, so new keys that come in go to the new shard; that's another approach. But in general, achieving high availability with Memcached is difficult.
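To make that pattern concrete, here's a minimal sketch of the kind of homegrown dual-write setup described above, using the pymemcache client. The two cluster endpoints are placeholders, and a real implementation would add logging, retries, health checks, and background repair:

```python
from pymemcache.client.base import Client

# Sketch of a homegrown "write to two clusters, fail over on read" pattern.
# The endpoints below are placeholders for two independent Memcached clusters.
primary = Client(("memcached-a.example.internal", 11211), connect_timeout=0.1, timeout=0.1)
secondary = Client(("memcached-b.example.internal", 11211), connect_timeout=0.1, timeout=0.1)

def cache_set(key: str, value: bytes, ttl: int = 300) -> None:
    # Best-effort write to both clusters so either one can serve reads.
    for client in (primary, secondary):
        try:
            client.set(key, value, expire=ttl)
        except Exception:
            pass  # a real implementation would log and repair the gap later

def cache_get(key: str) -> bytes | None:
    # Read from the primary; fail over to the secondary if it is unavailable.
    for client in (primary, secondary):
        try:
            return client.get(key)
        except Exception:
            continue
    return None
```

This is exactly the operational burden the speaker is describing: the application owns replication, failover, and consistency between the two clusters.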
So we went back to the drawing board and asked ourselves what we could do about this. We do support auto scaling today on ElastiCache: you can set up rules and say, on Friday evening at 5:00 p.m. when I see a lot of traffic, scale out, and on Sunday afternoon scale back in; or you can say, at a certain CPU or memory threshold, scale out, and scale back in when it falls below a certain threshold. But as we just discussed, it's not easy to decide those thresholds either. So we went back and decided to do something about it, and I'm excited to introduce ElastiCache Serverless to you today—well, yesterday, and we're talking about it today. ElastiCache Serverless essentially gets rid of capacity planning for you completely, so you don't have to worry about capacity planning at all, and there is no infrastructure to manage.
You don't choose instance types, you don't choose the number of shards, you don't choose the number of replicas; there's very little configuration required to create a cache. Here are some of the features of ElastiCache Serverless. We've made it super simple to create a cache: you can create one in under a minute. There's no capacity management, as we talked about earlier; you don't have to decide up front how much capacity you need. You get the same performance semantics you see on ElastiCache today, and you get pay-per-use pricing. We'll go into each of these in a little more detail, but really what we're trying to do is take on more of the operational overhead and burden of running a cache, so that you are free to work on your application and you just have a cache that just works.

At launch, ElastiCache Serverless is compatible with Redis 7.1 and Memcached 1.6. It supports the same microsecond performance you're used to with ElastiCache, and there is a maximum limit of 5 terabytes per cache at GA, so every cache can grow up to 5 terabytes. Customers have been telling us they just want a simpler way to get started, so we made it extremely simple: all you need to give us is a name. You tell us the name of your cache and we'll create it for you.
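If you prefer to script that, a minimal sketch using boto3 and the CreateServerlessCache API looks roughly like this (the cache name and region are placeholders; check the current SDK documentation for exact fields):

```python
import boto3

# Minimal sketch: create a serverless cache with just a name and an engine.
# The name and region below are placeholders.
elasticache = boto3.client("elasticache", region_name="us-east-1")

response = elasticache.create_serverless_cache(
    ServerlessCacheName="my-serverless-cache",
    Engine="redis",
)
print(response["ServerlessCache"]["Status"])  # e.g. "creating" until it's ready

# Once the cache becomes available, describe_serverless_caches returns the
# single endpoint you plug into your Redis or Memcached client.
```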
You don't need to give us anything else; there are no other settings in terms of configuration or capacity and so on. What we did was take a step back and take a thoughtful approach to which levers and configuration controls we wanted to offer. As developers we like to tinker, and we like to have a lot of control over the underlying infrastructure, and you have that capability today with ElastiCache: you can configure engine parameters, how many nodes, how many replicas, how many shards, and which AZs they live in. You have a lot of control with ElastiCache today. But what if you wanted to get started much more easily? That's the design philosophy behind ElastiCache Serverless.

That doesn't mean there are no configuration options; we had to decide, for each option, whether it's something customers would want or need control over. Basic things like which VPC the cache operates in—from a security standpoint, that's obviously something you will want to control, so you can choose which VPC the cache is accessed from, and you can choose which Availability Zones you want to access it from. If you've got an application that lives in, say, AZ A and AZ B, you want to access the cache from those AZs; you don't want your cache to live in AZ C, because you'd incur latency with the cross-AZ hops. By default we will pick two Availability Zones in your default VPC if you don't specify them, but you always have the ability to specify exactly which Availability Zones you want to access the cache from. Encryption is always on by default—encryption in transit, and encryption at rest for backups—but you can choose which key you want to use for encryption: the service-managed key by default, or a customer-managed key.
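If you do want to pin down those network and encryption settings, the same create call accepts them as optional parameters. A hedged sketch (the subnet, security group, and KMS key identifiers are placeholders, and the parameter names should be double-checked against the current CreateServerlessCache documentation):

```python
import boto3

elasticache = boto3.client("elasticache", region_name="us-east-1")

# Sketch: the same create call, now choosing the VPC subnets (and therefore
# the Availability Zones), the security groups, and a customer-managed KMS key.
# Every identifier below is a placeholder.
elasticache.create_serverless_cache(
    ServerlessCacheName="my-serverless-cache",
    Engine="redis",
    SubnetIds=["subnet-0aaa1111", "subnet-0bbb2222"],   # e.g. subnets in AZ A and AZ B
    SecurityGroupIds=["sg-0ccc3333"],
    KmsKeyId="arn:aws:kms:us-east-1:111122223333:key/example-key-id",
)
```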
The biggest benefit with serverless, I think, is just no capacity planning. We debated how to simplify capacity management and which levers to give you, but where we landed in the end was that we just wanted to get rid of capacity planning completely. What this means is that ElastiCache Serverless scales vertically and horizontally for you, and it automatically meets the traffic demands you throw at it. At no point are you configuring how much capacity the cache has; you don't have to worry about that at all. ElastiCache Serverless constantly monitors the memory, network, and compute utilization of your cache, and it uses predictive algorithms to estimate where your traffic is going over the next few minutes so it can scale appropriately. The way it works is that we scale both vertically and horizontally. Say you're operating at 50,000 requests per second and suddenly you see a spike: we let the cache scale up in place, so it can absorb the spike, while in parallel we decide to scale out ahead of time. As you know, scale-out operations can take time, sometimes on the order of single-digit minutes, up to 10 minutes.
So we decide to scale out early while letting the cache scale up in place, so that you have just the right capacity when you need it. We also decided to give you pay-per-use pricing: you're not paying for the capacity behind the cache at any given time, you're paying for your actual use. You pay for the amount of data you store, billed in GB-hours, and you pay for the requests you execute on the cache, in a new unit called ElastiCache Processing Units (ECPUs). You can think of an ECPU as a measure of the amount of compute and network throughput your requests are consuming. The simplest way to think about it: say you're running a simple caching workload using Redis strings with SET and GET commands. Every SET or GET that transfers up to 1 KB of data consumes one ECPU. If a command transfers 2.3 KB of data, it consumes 2.3 ECPUs, so the ECPUs are calculated based on the amount of data transferred over the network.

Some Redis commands, as we discussed, can also take more vCPU time. Taking the same example, say you're running a leaderboard and you ask for the top three items in a sorted set: depending on the size of the sorted set and the amount of data being accessed in your cache, the vCPU time taken can vary. So what we do is measure the total vCPU time taken by your command, compare it to a simple SET or GET, and charge you in multiples of that. If it takes five times the vCPU time of a simple SET or GET, it will be charged five ECPUs. That's really the philosophy behind it: the idea is that you pay for the requests you execute on the cache.
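As a back-of-the-envelope illustration of that metering model — this is my simplified reading of the description above, not the official billing formula:

```python
def estimate_ecpus(kb_transferred: float, vcpu_time_us: float,
                   baseline_vcpu_time_us: float = 10.0) -> float:
    """Rough illustration of the ECPU model described in the talk.

    A simple GET/SET moving up to 1 KB costs 1 ECPU. Commands that transfer
    more data, or that take more vCPU time than a simple GET/SET, cost
    proportionally more. The 10-microsecond baseline here is a made-up
    placeholder, not a published number.
    """
    data_dimension = max(kb_transferred, 1.0)            # e.g. 2.3 KB -> 2.3
    compute_dimension = vcpu_time_us / baseline_vcpu_time_us
    return max(data_dimension, compute_dimension, 1.0)

# A simple GET returning 1 KB: 1 ECPU.
print(estimate_ecpus(kb_transferred=1.0, vcpu_time_us=10.0))   # 1.0
# A GET returning 2.3 KB: 2.3 ECPUs.
print(estimate_ecpus(kb_transferred=2.3, vcpu_time_us=10.0))   # 2.3
# A sorted-set query taking 5x the vCPU time of a simple GET: 5 ECPUs.
print(estimate_ecpus(kb_transferred=0.2, vcpu_time_us=50.0))   # 5.0
```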
A few more things to touch on here. By default you get a 99.99% (four nines) availability SLA on every cache. Behind the scenes we replicate your data to multiple AZs at once; it's not something you configure or have to worry about, it's just there, and you get the four nines availability SLA by default on every cache. As we discussed, encryption is always on, in transit as well as at rest, and we also support many of the compliance programs ElastiCache already has today, including PCI DSS, SOC, and HIPAA. So if you're running a workload that needs these compliances, it's ready to be used on serverless as well.

Let's talk about another benefit; I think this one is interesting. As we were talking about scaling earlier: when the cache scales, what happens behind the scenes, as you probably know, is that when you're scaling out we add new shards.
We add a new node, we might add new replicas if your shards have replicas, and then we rebalance data, so your data is moved from one shard to another, and when the new node becomes available, the Redis client, or the Memcached client for that matter, has to discover the new topology. Most open source Redis clients are able to discover changes in topology: they reconnect to the individual nodes, and when a new node becomes available they connect to it and start executing commands there. But depending on which client you're using and how it's configured, this can also result in availability issues. We've seen cases where clients keep connecting to the old node and you see an availability impact because the client did not correctly rediscover the cluster topology, and so on.

So what we've done is create a simple endpoint experience: you get a single endpoint to connect to. From the client's perspective, you're talking to a single node, not to multiple nodes at once, so the client doesn't need to worry about cluster topology discovery at all. The way we achieve this is with a new proxy layer, and we'll talk about that in a bit more detail: a network-load-balanced fleet of proxy nodes that handles all incoming connections from your clients. Your client connects to a proxy node, and the proxy executes the commands on the underlying cache nodes. What that means is that if a node becomes unavailable for whatever reason—whether it's scaling out, or the EC2 instance is lost and a new node is being brought in—the application or the client doesn't need to know or care about it at all, because you're talking to the cache proxy, and the proxy automatically rediscovers the cluster topology, establishes a new connection, and executes the commands there. So there's no availability impact, and you get a much smoother availability curve on ElastiCache Serverless versus an instance-based cache.

Some performance numbers to share: you'll see the same submillisecond performance for reads when using read-from-replica, which we support for both Redis and Memcached.
Writes are slightly higher, at around 1,100 and 1,900 microseconds at p50. You can scale up to millions of requests per second, and up to 5 terabytes of data per cache, as we discussed earlier. I'm sure you're curious how all of this works under the hood, so let's go a little deeper. We'll talk about the architecture, how we built it, and the trade-offs we made, and hopefully you'll come away with a better understanding of how it works and what its capabilities are. Yaron is going to walk us through all of that.

Thank you, AB. Hello everyone, I'm really happy to be here; thank you for joining us. I'm Yaron, and I lead the data plane team that actually built ElastiCache Serverless. This is a very special moment for me, because we finally announced ElastiCache Serverless: it was announced yesterday and today we are speaking about it, so it's going to be very exciting.
I will walk you through the technical problems we solved in this project, and I'm going to dive deep into multiple technical aspects, so I hope it's going to be interesting for you. Let's start. When we built ElastiCache Serverless, we had to think about how to build a system that can dynamically scale across multiple different dimensions. We wanted to provide the best customer experience, so we had to build a system that can scale when memory starts to grow or when CPU utilization reaches some threshold. We also wanted to make sure the network doesn't get saturated, and that we can scale before that starts to happen. When we designed the system we had to answer multiple questions, and one of them was: how can we serve a customer with spiky workloads, multiple times an hour, and still provide a seamless scaling experience?

So let's understand how the scaling actually works. Scaling has three main stages. The first one is detection, and detection happens within seconds, so we can meet the customer's demand. We also have the ability to project future demand by monitoring the usage pattern, so we can estimate what the usage is going to be over the next few minutes. This gives us the power to be more proactive and to scale our capacity ahead of time, so by the time the workload kicks in, we have already scaled out.
That way we have enough resources to ingest the new data. The second phase is provisioning, and provisioning is done within a minute. We can do that thanks to a warm pooling mechanism we built: the warm pool is a set of cache engine nodes waiting in standby mode, ready to be attached to an existing cluster. So once we decide we need to scale out, we take nodes from this warm pool and attach them to the cluster very quickly. The last phase is data rebalancing. Rebalancing happens in parallel across the slots of data spread over the shards, so we can move slots of data in parallel, at the same time, from different shards, and this helps us expedite the process. Now, how do we choose which slots need to be migrated? We monitor usage at the per-slot level and then determine which slots are candidates to be migrated to different shards; we can mark them as hot or cold, or use whatever signal tells us it's the right time to start moving them to a different shard. Eventually, we end up with a balanced workload across the cluster's shards.

We also took the decision to scale when we reach 50% of resource usage, and the reason is to allow us to handle an increase of up to two times the capacity and still complete the scaling process successfully. Now, horizontal scaling involves moving data from one shard to another shard.
Usually this is a long and heavy process, so we had to invest in improving it to get better scaling performance. We did multiple things. One of them was to build a batching mechanism so we can move data in bulk, which helps expedite the data migration, and to support that we increased the network buffers so we can process more data on the source and on the target with every system call we execute. In addition, on the networking layer, we are now able to reuse open connections, so we don't need to reopen a connection every time we move a new slot; this also significantly improved data replication. And last, we took the batching mechanism I mentioned and it now runs in parallel with the purge process: compared to the previous state, where we had to replicate all the data and then run the purge process, we can now do both in parallel. Overall, all these improvements significantly sped up the replication, slot migration, and repartitioning process, and we benefit from better scaling performance.

Now I want to talk about Caspian and ElastiCache Serverless. First, ElastiCache Serverless is built on top of the Caspian platform, and for those of you who missed Peter's keynote where he talked about Caspian, I'll give a brief recap. Caspian is a new type of Amazon EC2 instance that is purpose-built for dynamic resizing, and it can do that very quickly.
A Caspian instance does not have a fixed memory or CPU footprint and can be resized up and down very easily, so Caspian delivers very good performance without compromising on cost, and it does that thanks to a cooperative oversubscription technique. Cooperative oversubscription means tenants are assigned more resources than actually exist on the physical host. This largely works as long as actual resource usage stays below the level of the physical resources, but sometimes that's not the situation, and we can find that the tenants want to use more resources than we have on the physical host. Then we need to think about how to move them from one Caspian host to another, and I will speak about how we do that in the next few slides.

So how do we actually benefit from all the good things Caspian provides? First, Caspian allows us to instantly grow and shrink our compute in place, so we can get additional CPU, memory, and network bandwidth as more workload kicks in. And when the data needs to be repartitioned and rescaled, it gives us the capability to start with vertical scaling and follow with horizontal scaling, which is a big achievement. Now, with all these Caspian hosts, we still need to maintain and monitor them so we can manage them correctly.
We need to make sure that across our fleet we have enough resources to handle the current workloads and the ones still to come. So let's understand how we manage all these Caspian hosts. We utilize the underlying capacity of the Caspian platform according to customer demand, and the main challenge is to ensure that all the cache nodes running on a Caspian host have enough resources to handle the existing workload and whatever comes into the system in the future. We need to make sure we're not running out of resources, whether CPU or network bandwidth; otherwise the cache nodes will experience reduced throughput and increased latency, and that is something we definitely want to avoid. So we built a live aggregated view of all the Caspian hosts, using an asynchronous polling mechanism that runs every few seconds, giving us a live aggregated view of our entire fleet. With this view we can determine which Caspian hosts are considered hot and which are not, and then we can run heat balancing on a particular Caspian host and decide which of its cache nodes we want to migrate to a different Caspian host. This helps us keep the whole Caspian fleet healthy, with enough resources to handle our customers' workloads. I mentioned heat balancing very briefly, but let's understand more deeply how it actually works.
What is the algorithm we use to decide which cache nodes need to be migrated to a different Caspian host? We call it heat management, and the way it works is this: when we are searching for a cache node to migrate to a new Caspian host, we look for the second-hottest node in terms of resource consumption. The reason is that the hottest node is probably under heavy load, busy ingesting data, and there's also a good probability that it's in the middle of a scaling process, so it's better to leave it alone, let it complete that process, and not interrupt it in the middle. At the same time, we still need to pick a cache node that uses significant resources, so that once the migration finishes it actually frees up capacity on that Caspian host; otherwise we're just shuffling cache nodes between Caspian hosts for no benefit. That's why we choose the second-hottest one. When we are searching for a new placement for a node, we use the algorithm called "the power of two random choices": we pick two random Caspian hosts and choose the less loaded, the "second hottest", of the two. This helps us spread the load evenly across our fleet, and it also works really well with delayed data and in failover situations.
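Here's a tiny, purely illustrative sketch of those two heuristics as described; the real heat-management service obviously works with richer signals than a single load number per host:

```python
import random

def pick_target_host(hosts: dict[str, float]) -> str:
    """Power-of-two-random-choices placement, as described in the talk.

    `hosts` maps a Caspian host id to its current load (0.0 - 1.0).
    Sample two hosts at random and place the migrating cache node on the
    less loaded ("second hottest") of the two.
    """
    a, b = random.sample(list(hosts), 2)
    return a if hosts[a] <= hosts[b] else b

def pick_node_to_migrate(nodes: dict[str, float]) -> str:
    """Pick the second-hottest cache node on an overloaded host.

    The hottest node is likely busy ingesting data or already scaling, so
    it is left alone; the next hottest still frees up meaningful capacity
    once it is migrated away.
    """
    ranked = sorted(nodes, key=nodes.get, reverse=True)
    return ranked[1] if len(ranked) > 1 else ranked[0]

# Example: a hot Caspian host with three cache nodes, and a small fleet.
fleet = {"caspian-1": 0.82, "caspian-2": 0.35, "caspian-3": 0.55, "caspian-4": 0.20}
nodes_on_hot_host = {"cache-node-a": 0.60, "cache-node-b": 0.45, "cache-node-c": 0.10}

victim = pick_node_to_migrate(nodes_on_hot_host)   # cache-node-b
target = pick_target_host(fleet)                   # the cooler of two random hosts
print(victim, "->", target)
```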
On top of these two things, we also built noisy-neighbor isolation and smart resource management, so the tenants inside a Caspian host don't interrupt each other. We do that using Linux cgroups to isolate CPU usage, and for memory we use Linux memory hot-plug and hot-unplug to increase and decrease the physical memory allocated to each of the cache nodes.

Now I want to talk about a different angle of ElastiCache Serverless, and that is connectivity. As you can see, I'm speaking about a single endpoint: we decided to build a single logical entry point to ElastiCache Serverless, and for that we built a new ElastiCache Serverless proxy. That lets us encapsulate and track all the cluster topology changes behind the scenes, so anything related to scaling, failover, connections and disconnections, or patching that happens behind the scenes is hidden from the client, and that is a huge improvement compared to node-based provisioned clusters. We built the proxy to be highly performant, highly available, fast, and secure.
We decided to build it in Rust, because we wanted to leverage its inherent features such as memory safety, concurrency safety, and high speed. The proxy's main job is to route client requests to the correct shard, while supporting a huge scale of possibly hundreds of shards behind the scenes, and while requiring the client to maintain only a single connection. That's it: the client maintains a single connection, while the proxy can support thousands of connections behind the scenes, and this is a great benefit because it takes away from the client all the responsibility of dealing with connectivity to the cluster's shards. You might think that for every connection established to the proxy, the proxy establishes a connection to the cache nodes. That can work up to some number of connections, but at very high connection counts we had to think about a different solution. So we built a new multiplexing solution, which opens a single TCP channel to carry multiple different client connections. This multiplexing protocol increases command processing by over 3.5 times and reduces the number of open connections by collocating several commands and responses in the same network packets. That makes the proxy very fast, lightweight, and more reliable.
ElastiCache Serverless provides high availability by distributing both cache nodes and proxy nodes across multiple Availability Zones, so that even a full AZ failure will not impact availability. And to provide low latency, clients are routed to a proxy in their own AZ so they can achieve submillisecond latency; we do that thanks to Route 53 resolving the endpoint to the local AZ. Now, there is still a chance that the proxy will route a request to a different AZ, where the cache node for that key lives, and we have a solution for that: the read-from-replica option. If you are sensitive to latency, you can use it; the proxy will detect that you want to read from replica and will make sure all read requests stay within the AZ where the application is located, so the whole path is in one AZ and we can achieve submillisecond latency.

So let's talk about read-from-replica. Our recommendation and best practice is to use the read-from-replica option. The proxy, as I mentioned, also provides a single entry point for the replicas and manages all the connectivity to them; it encapsulates all the read-from-replica connectivity, and there is nothing to handle on the client side besides turning on the read-from-replica option. The proxy will always prioritize reading from the local AZ, regardless of whether that's a replica or the primary node.
And if the replica is busy syncing data, no worries: the proxy will choose a different node, for a seamless experience. As I mentioned, there is nothing to configure on the server side; everything is done automatically there. We just need to turn on the option on the client side, as you see here in the code. So let's go through the code and I'll explain what we see. We have a Python program that connects to ElastiCache Serverless for Redis. What we see is a single entry point, a single address, that connects to ElastiCache Serverless; we construct the Redis client with it and we ask for two options: SSL, and the read-from-replica option. Then we have two blocks of code, one a for loop and the other a pipeline: in the for loop we simply fetch the values one at a time, and in the pipeline we batch four keys at the same time.
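The slide itself isn't captured in this transcript, but the code being described would look roughly like this with redis-py's cluster client; the endpoint and key names are placeholders, and `read_from_replicas` is redis-py's name for the read-from-replica option:

```python
from redis.cluster import RedisCluster

# Sketch of the client code described above; the endpoint is a placeholder.
# ElastiCache Serverless exposes a single endpoint, with TLS always on.
cache = RedisCluster(
    host="my-serverless-cache-xxxxxx.serverless.use1.cache.amazonaws.com",
    port=6379,
    ssl=True,
    read_from_replicas=True,   # let reads be served from the local AZ
)

keys = ["user:1", "user:2", "user:3", "user:4"]

# Simple loop: one round trip per key (values are None if the keys aren't set).
for key in keys:
    print(cache.get(key))

# Pipeline: queue four GETs and send them together instead of one at a time.
pipe = cache.pipeline()
for key in keys:
    pipe.get(key)
print(pipe.execute())
```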
What's beautiful here is that we offload all the complexity of the connectivity to the proxy: it will make sure to connect to the right shard, it will choose the local Availability Zone to achieve submillisecond latency, and there is nothing we need to do on our side to handle that. Just like for Redis, ElastiCache Serverless for Memcached also offers the option to read from replica, using a read endpoint. All write operations continue to function as usual, but read operations go through the local AZ for better performance, to achieve submillisecond latency. Both examples I showed are very popular caching patterns where we want to achieve submillisecond latency.

So that's what we've seen so far: I walked you through several different technical problems we had to solve in this project. Of course, that's not the whole story; if you want to catch up with me, I'm here at re:Invent, and tomorrow we have a lightning talk where we're presenting a demo, so you can join us. I hope you enjoyed it; I'll hand it back to AB to continue. Thank you.
Thanks, Yaron. So hopefully that was useful and you got some insight into how things are built behind the scenes. ElastiCache Serverless is live, and we have customers using it today. Just to recap the core benefits: you get a pretty significant simplification of cache operations, so you can set up and operate a cache for high-scale applications without any configuration; you get fast scaling, so we can accommodate workloads as they grow and shrink; and you get pay-per-use pricing, so you don't have to worry about paying for capacity at any point; you're always paying for actual usage.

I think an interesting question some of our customers have asked is, "when does it scale back in?", and the answer is: it doesn't matter, because you're only ever paying for use, you're never paying for capacity. So it doesn't matter when we decide to scale in; that's purely a cost optimization for us internally, and we will always make sure there are enough resources available for your workload as it scales. So you can get started today. I encourage all of you: as we discussed, it's as easy as giving us a cache name. Go into the console, give us a name, get a cache endpoint, and plug it into your Redis or Memcached client. Here's a URL where you can learn more about serverless and all the features we discussed, and I hope you try it out today. I think we have about 15 minutes, so we can open it up to questions; I'm sure you have some. Maybe we'll start here.
