MCPA MuleSoft Certified Platform Architect Level 1 – Getting Production Ready Part 5

  1. API Performance – I

In this lecture, let us see how we can scale the application network to get better performance from your APIs. There are two main ways of scaling API performance: vertical scaling and horizontal scaling. I am sure you have heard these terms many times before, because they are agnostic to any product or technology. So, in the context of the Anypoint Platform, let us see how we can do vertical scaling and how we can do horizontal scaling. Vertical scaling is scaling the performance of each node on which a Mule runtime, and the deployed API implementation or API proxy, executes.

Okay? In CloudHub, this is supported through different worker sizes: 0.1 vCore, 0.2 vCore, 1 vCore, and so on ("vCore" stands for virtual core). The worker size is directly mapped to the size of an AWS EC2 instance. If you remember, we already discussed in the previous sections what each worker size maps to. For example, a 0.1 vCore worker maps to about 500 MB of memory and a share of one CPU on the underlying Amazon EC2 instance.

Correct. We have already shown that sizing sheet in the previous sections, and we have also given the link to the resource on the MuleSoft site where you can find how each worker size maps to an AWS EC2 instance. So vertical scaling is supported through the different worker sizes, and it is a disruptive process that requires provisioning new CloudHub workers with the desired size to replace the existing CloudHub workers. Why is it disruptive? Because when you change from, say, 0.1 vCore to 0.2 vCore, CloudHub has to go and provision a new EC2 instance of that size in the background.

Meaning, if it is 0.2 vCore, a new instance with, say, 1 GB of memory and the corresponding CPU share has to be spun up, and your application has to be redeployed onto it. That is why it is called disruptive. But remember, this does not mean downtime for your application. MuleSoft promises that it tries to ensure a zero-downtime deployment, because this happens side by side.

What the Anypoint Platform does is spin up a second instance with the new configuration, which is the larger vCore size. Once that is up, it deploys all your applications, makes sure everything is up and running (all green), and only then switches your traffic to the new worker. So you get zero downtime, but it is still disruptive because a new instance has to be provisioned, the application redeployed, and so on.

Okay? Now, horizontal scaling is scaling the number of nodes on which the Mule runtimes and the deployed API implementations or API proxies execute. In many Anypoint Platform runtime planes, including CloudHub, Pivotal Cloud Foundry, and Anypoint Runtime Fabric, which we have seen in the deployment models of the Anypoint Platform in the previous section, this is directly supported through scale-out and load balancing. The only caveat is that there is currently a limit of eight workers per API implementation. Remember, this was mentioned in the previous sections as well: the maximum number of workers we can allocate today, as we speak, is eight. Both types of scaling, vertical and horizontal, can be triggered in one of two ways.

One way is explicitly: a user goes into Anypoint Runtime Manager and changes the worker size from, say, 0.1 to 0.2 vCore, or adds a second or third worker. This can be done manually like that, or via a script or the Anypoint Platform APIs. A script here means something like the Anypoint CLI, and the Anypoint Platform APIs can be called over REST, from Postman for example. So in the explicit way, someone has to do it, either by writing a piece of script or manually from the web UI. The second way, which is interesting, is scaling automatically. To be frank, this builds on an AWS feature, Auto Scaling Groups, which the Anypoint Platform leverages to achieve this, but it is still good that they brought it all the way into the platform.
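To make the scripted route concrete, here is a minimal sketch of building the REST call that resizes a CloudHub application. The endpoint path, payload shape, and the worker-size name "Small" are assumptions based on the public CloudHub v2 API; verify them against the current MuleSoft documentation before use. The sketch only constructs the request rather than sending it.

```python
# Sketch: changing worker count/size via the CloudHub REST API.
# ASSUMPTIONS: the /cloudhub/api/v2/applications/{domain} path, the payload
# shape, and the size name "Small" are recalled from the public docs, not
# guaranteed. An HTTP client would send this as a PUT with an auth token.
import json

CLOUDHUB_BASE = "https://anypoint.mulesoft.com/cloudhub/api/v2"  # assumed base URL

def build_scale_request(app_domain: str, workers: int, worker_size: str):
    """Build the PUT request that would resize/scale a CloudHub application."""
    url = f"{CLOUDHUB_BASE}/applications/{app_domain}"
    payload = {"workers": {"amount": workers, "type": {"name": worker_size}}}
    return url, json.dumps(payload)

# Example: scale a hypothetical 'orders-api' app out to 2 workers of 0.2 vCore.
url, body = build_scale_request("orders-api", workers=2, worker_size="Small")
print(url)
```

The same change could equally be made through the Anypoint CLI or the Runtime Manager UI; the API route is what you would wire into automation.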

Okay? So how can this be done automatically? CloudHub itself can scale the application in response to a significant change in the runtime characteristics of the API implementation. This can be set up in CloudHub by configuring auto-scaling policies that scale up or down, horizontally or even vertically, based on actual CPU or memory usage over a given period of time. For example, you can set a policy saying that if, over a period of five minutes, CPU or memory usage exceeds a given threshold, then when that threshold breach happens the scaling occurs automatically, just like AWS Auto Scaling Groups, if you are aware of those.

  1. API Performance – II

What you see in front of you now is a visual representation of this auto-scaling and how you can configure the policies: configuring an auto-scaling policy in CloudHub to scale the number of CloudHub workers of a given Mule application between one and four. In the action section, a limit of one to four workers is given. What it means is that when the rules given above are met, CloudHub scales between one and four workers, but never beyond four, so that, for example, you do not exceed your licensing limits.

Okay, but what are the conditions we have given? It scales between one and four workers based on whether CPU usage over a ten-minute interval is above 80% or below 20%. And after a rescaling step is triggered, auto-scaling is suspended for 30 minutes. This is to ensure that scaling does not run away and keep increasing. Once an increment happens, say from one worker to two, the system gets some time to breathe and digest the change.

That is why there is a 30-minute suspension window: the system observes for at least 30 minutes whether any of the rules are breached again, instead of blindly scaling further immediately. So after it changes from one to two workers, even if CPU is still above 80%, it does not immediately change from two to three; it waits 30 minutes. Now, separate API proxies may be deployed in addition to your API implementations, because that is possible, right? You know why API proxies come into play.
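The policy described above can be sketched as a small decision function. This is purely illustrative, with names of my own choosing; the real evaluation happens inside CloudHub.

```python
# Sketch of the auto-scaling policy from the lecture: scale between 1 and 4
# workers on a 10-minute CPU average, with a 30-minute cooldown after any
# scaling action. Illustrative only, not CloudHub's actual implementation.
from dataclasses import dataclass

@dataclass
class ScalePolicy:
    min_workers: int = 1
    max_workers: int = 4
    scale_up_cpu: float = 80.0    # scale out above this 10-min average
    scale_down_cpu: float = 20.0  # scale in below this 10-min average
    cooldown_minutes: int = 30    # suspend further scaling after a change

def decide(policy: ScalePolicy, workers: int, avg_cpu_10min: float,
           minutes_since_last_scale: float) -> int:
    """Return the worker count the policy would choose next."""
    if minutes_since_last_scale < policy.cooldown_minutes:
        return workers  # still inside the 30-minute suspension window
    if avg_cpu_10min > policy.scale_up_cpu:
        return min(workers + 1, policy.max_workers)   # never exceed the cap
    if avg_cpu_10min < policy.scale_down_cpu:
        return max(workers - 1, policy.min_workers)   # never go below the floor
    return workers

p = ScalePolicy()
print(decide(p, workers=1, avg_cpu_10min=85.0, minutes_since_last_scale=60))  # 2
print(decide(p, workers=2, avg_cpu_10min=85.0, minutes_since_last_scale=10))  # 2 (cooldown)
print(decide(p, workers=4, avg_cpu_10min=90.0, minutes_since_last_scale=60))  # 4 (capped)
```

Note how the cooldown check comes first: even a sustained breach cannot trigger back-to-back scaling steps, which is exactly the "time to breathe" behavior described above.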

If you don’t want to use the API implementation runtime itself to enforce your policies, you can have a dedicated API proxy, which is also a Mule runtime in the background, and do your policy enforcement in that proxy. A second reason is when your API implementations are not in Mule but are third-party systems; then too you can front them with a proxy. Whatever the reason, if you want to scale the proxies separately in addition to the API implementations, the two types of nodes can be scaled independently, because, like I said, they ultimately run on the same kind of Mule runtime.

But in most real-world scenarios, the performance-limiting node is the API implementation, not the API proxy. A proxy is just a policy-enforcement point, so it rarely causes significant memory or CPU pressure. The place where you usually have to scale is the API implementation runtime, which is why those typically need to be more numerous, or larger, instances compared to the API proxies.

One more aspect of performance is the role of the C4E team. From the support or operations perspective, horizontal or vertical scaling can be done on an ad hoc basis by monitoring the behavior of the applications. That is one way, and it is fine, but it is more the operations side. From the application or practice side, there are performance considerations that need to be addressed, and the C4E is the best team to take ownership of them.

Okay, so what are those? The performance of an API should not just be tuned by randomly adding resources. There might be scenarios where you do that, but it should not become the norm: "this particular API is behaving slowly, so let us increase the worker size or count." If we keep doing that, we end up with a gold-plated workaround instead of a permanent solution. The C4E should take ownership here, review the code wherever applicable, and understand the significance of the API in its full domain context. An API running on a Mule runtime may be underperforming; increasing the workers is fine as a stopgap, but if it happens recurrently we have to find the root cause. If it recurs in a pattern, say every day at a particular time, that should be noted, and the C4E should analyze it and find the root cause, because most of the time the issue comes from the downstream systems.

It may not be in your code. Yes, there might be some bad code, but it is not always true that issues happen only because of your code. What people mostly do is spend time working around the issue by scaling vertically or horizontally, like we discussed, but the issue keeps coming back: however many workers you add, they too slowly get exhausted, memory usage grows, and they fail. So the C4E should step in and look deeper: maybe for this particular API, the downstream system we are interacting with is, for example, an ERP. The ERP has a "create sales order" function, and that might be a costly operation for the ERP. If we bombard it with a lot of load at a given point in time, the ERP responds slowly, the lag falls back on the system API for creating sales orders, and that is what causes the memory or CPU pressure.

So we have to tune the API with that domain limitation in mind, implementing things like we discussed in previous sections: accepting requests asynchronously, throttling the API throughput with rate limiting, slowly releasing the orders into the ERP, and having a caching mechanism or an event-driven event store, where retrieval requests get a temporary response from the event store until the orders are fully submitted to the ERP. This kind of design should be implemented to overcome such limitations and achieve better performance, instead of just blindly scaling the workers or the infrastructure.
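The accept-asynchronously-and-release-slowly pattern can be sketched in a few lines. All names here are my own; in a real Mule application this would be a flow with a queue or object store and a scheduled dispatcher, not hand-written Python.

```python
# Illustrative sketch of the pattern above: accept orders immediately, hold
# them in an event store, and release them to the slow ERP at a controlled
# rate instead of bombarding it. Names and structure are hypothetical.
from collections import deque

class OrderIntake:
    def __init__(self, erp_rate_per_tick: int):
        self.queue = deque()           # pending orders, acting as the event store
        self.status = {}               # order_id -> "ACCEPTED" | "SUBMITTED"
        self.rate = erp_rate_per_tick  # max orders pushed to the ERP per tick

    def accept(self, order_id: str) -> str:
        """Accept immediately (HTTP 202 style) without waiting for the ERP."""
        self.queue.append(order_id)
        self.status[order_id] = "ACCEPTED"
        return self.status[order_id]

    def release_tick(self, erp_submit) -> int:
        """Drain at most `rate` orders into the ERP; run this on a schedule."""
        sent = 0
        while self.queue and sent < self.rate:
            order_id = self.queue.popleft()
            erp_submit(order_id)
            self.status[order_id] = "SUBMITTED"
            sent += 1
        return sent

intake = OrderIntake(erp_rate_per_tick=2)
for i in range(5):
    intake.accept(f"order-{i}")        # clients get an immediate answer
submitted = []
intake.release_tick(submitted.append)  # only 2 orders reach the ERP this tick
print(submitted)                       # ['order-0', 'order-1']
print(intake.status["order-4"])        # ACCEPTED (still pending)
```

Until an order reaches `SUBMITTED`, status queries would be answered from the event store, which is exactly the temporary-response behavior described above.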

Okay, so this is where the C4E has to come in and assist the implementation teams, advising them toward the correct design. This may not happen in the early phase of the project, due to limited knowledge or the unexpected behavior of the domain systems; it may even come after going to production. Because we have the workaround of scaling workers vertically or horizontally, we can apply that as a stopgap in the meantime and, with the help of the C4E, redesign this particular API properly. So these things should also be considered when it comes to the performance of an API. All right, happy learning.

  1. Deprecating and Deleting an API

Let us now see how we can gracefully end the life of an API. When I talk about an API in this particular lecture, I do not mean the API implementation, because the API implementation is invisible to the API client. API clients do not see the API implementation; they only see the API interface that is exposed via API Manager. That is where they learn the API endpoints, where the policy enforcement kicks in, and where all their interactions happen. API Manager is the first place the client talks to, which is why the API implementation is irrelevant to the API clients. So I am not talking about the implementation, which is the actual code; I am talking about the API itself, as managed in API Manager.

The reason is that there is no special interest in terminating the lifecycle of an API implementation: it is just a Mule app running on a Mule runtime. If you have new logic implemented that respects the RAML spec, the request/response contract, and all the agreed policies of that API, you can seamlessly deploy a new application, point the API to it, and delete the old one, like we discussed in the API versioning lecture in the previous sections. Remember the major.minor.patch scheme of versioning: as long as the change is a patch or minor one, the clients need not worry. For a patch fix, clients need not even know we made it, because we anyway expose only the major version to them.

Minor changes may help customers who want some new feature, but they will not break the existing one, so there too you do not need to specially inform customers to change their endpoint or their code. Only a major change is a breaking one. If there is a major version change, clients may have to change their endpoint or code, depending on the impact of the change, and re-point their clients accordingly; they cannot keep working against the existing API.

So, just like we discussed, if your API implementation change is of the minor or patch kind, there is no impact on the customer or API-client side; we can simply delete or obsolete the old API implementation and replace it with a new one. Let us set that aside. Here, when we say gracefully ending the life of an API, we are talking about how to properly retire or delete an existing API interface. Why does this matter? Because it needs coordination with the API clients: they have to know that the API is being deprecated or terminated.
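The versioning rule above, which kinds of version bump require client action, can be captured in a tiny helper. The function and its wording are my own, assuming standard major.minor.patch version strings.

```python
# Illustrative helper: classify the client impact of an API version change
# under semantic versioning (major.minor.patch), per the rule in the lecture.
def client_impact(old: str, new: str) -> str:
    old_parts = [int(x) for x in old.split(".")]
    new_parts = [int(x) for x in new.split(".")]
    if new_parts[0] != old_parts[0]:
        return "breaking: clients may need to change endpoint or code"
    if new_parts[1] != old_parts[1]:
        return "additive: new optional features, existing clients unaffected"
    return "patch: clients need not even be informed"

print(client_impact("1.2.3", "1.2.4"))  # patch: clients need not even be informed
print(client_impact("1.2.3", "2.0.0"))  # breaking: clients may need to change endpoint or code
```

Only the first case, a major bump, forces the coordinated end-of-life process discussed in this lecture; the other two can be rolled out silently.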

Otherwise, one day their code suddenly stops working. So this is important and should be done in a proper, API-led fashion: just as we followed certain practices while developing API interfaces, the same kind of discipline should apply while deprecating or deleting an API version. Let us see how this can be done in the Anypoint Platform. The Anypoint Platform supports end-of-life management of APIs in the following ways. First, in Anypoint API Manager, an API instance, which exists in a particular environment (API Manager also has the environment concept, remember), can be set to deprecated. There is an option for this: you can go to a particular environment, select an API, and mark it as deprecated.

This prevents API consumers from requesting access to that API for their API clients. All the existing contracts on that API remain, and those clients can still invoke it, but new consumers cannot request access because we have marked it as deprecated. At a later stage, the API instance can then be deleted from the environment. So this is a step-by-step process: you deprecate first, so that no new API clients can request access while existing ones keep working, and meanwhile you inform them, formally or through an agreed process, that the API is coming to the end of its life.

You ask them to migrate to the new version of the API, sharing its link or its Anypoint Exchange entry, and then in due course you delete the instance from that environment. In addition, and independently, in Anypoint Exchange an individual asset version, meaning the full semantic version, major.minor.patch, of an API entry can be deprecated. This informs API consumers that that version of the API specification should no longer be used, but it does not prevent them from requesting access to API instances of that API version; it is purely information we are giving to the consumers. If all asset versions of an API's Exchange entry have been deprecated, then the entire Anypoint Exchange entry for that API is automatically marked as deprecated.
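The access rules of the deprecate-then-delete process can be summarized in a small model. This is my own simplification, not the API Manager API: a deprecated instance keeps existing contracts working but rejects new access requests, and deletion ends everything.

```python
# Minimal model (hypothetical, not the real API Manager API) of the lifecycle
# described above: active -> deprecated -> deleted.
class ApiInstance:
    def __init__(self, name: str):
        self.name = name
        self.state = "active"
        self.contracts = set()  # client IDs with approved access

    def request_access(self, client_id: str) -> bool:
        """New consumers can only obtain contracts while the API is active."""
        if self.state != "active":
            return False
        self.contracts.add(client_id)
        return True

    def invoke(self, client_id: str) -> bool:
        """Existing contracts keep working until the instance is deleted."""
        return self.state != "deleted" and client_id in self.contracts

api = ApiInstance("policyholder-search-api")
api.request_access("legacy-portal")   # granted while active
api.state = "deprecated"
print(api.request_access("new-app"))  # False: no new contracts
print(api.invoke("legacy-portal"))    # True: existing contract still works
api.state = "deleted"
print(api.invoke("legacy-portal"))    # False: everything stops
```

The gap between deprecation and deletion is the migration window in which you move the remaining clients, such as "legacy-portal" here, to the new API version.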

Okay, so these are the ways we can communicate to API consumers that an API is reaching its end of life, alongside your organization's official channels of communication. The screenshot in front of you shows the option to deprecate or delete an API instance of a policyholder search API in Anypoint API Manager. So I hope you now understand why deprecating and deleting an API is important and why it has to be done in this order: first deprecate, then in due course delete. Let us move on to the next lecture in this section. Happy learning.