Amazon AWS DevOps Engineer Professional – Configuration Management and Infrastructure Part 13

  1. ECS – Auto Scaling

So now let's talk about auto scaling for our ECS services. If we go to the ECS classic cluster, we have our demo service, and we can go to the Auto Scaling tab. Right now it says there are no auto scaling resources configured for this service. To configure auto scaling, we must first click Update. So we'll go here, click Next, the network settings are fine, and click Next again. Here we can configure the service's auto scaling, and this is optional. We can configure a minimum number of tasks, a maximum number of tasks, for example four, and a desired number of tasks, for example two. So this looks just like an Auto Scaling group. And here we give ECS permission to create and use the ECS autoscale role, so ECS will do the auto scaling for us. Then we define an auto scaling policy. We can add a policy, and we have two kinds of auto scaling policies: target tracking and step scaling.

They're very similar to the policies that we encounter in an Auto Scaling group. Target tracking says that we should track an ECS service metric, for example the average CPU utilization, and set a target value, for example 40%. If our ECS service is using more than 40%, it will add tasks, and if the service goes below 40%, it will remove tasks. Okay? So I can call this one "stabilize at 40%". We also have a scale-out cooldown period that specifies how long to wait between scale-out actions, and a scale-in cooldown period that specifies how long to wait between scale-in actions.
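If you prefer to script this instead of clicking through the console, the same target tracking setup can be expressed with the Application Auto Scaling API. This is a minimal boto3 sketch; the cluster name, service name, and capacity bounds are placeholder assumptions matching the walkthrough above.

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

# Placeholder names assumed for illustration.
resource_id = "service/demo-ecs-classic/demo-service"

# Register the ECS service's DesiredCount as a scalable target (min/max tasks).
autoscaling.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId=resource_id,
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=2,
    MaxCapacity=4,
)

# Target tracking policy: keep average service CPU around 40%.
autoscaling.put_scaling_policy(
    PolicyName="stabilize-at-40",
    ServiceNamespace="ecs",
    ResourceId=resource_id,
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 40.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "ScaleOutCooldown": 60,  # seconds to wait between scale-out actions
        "ScaleInCooldown": 60,   # seconds to wait between scale-in actions
    },
)
```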

So there is both a scale-out and a scale-in cooldown period. Or we could disable scale-in if we never want our service to remove tasks. So this is one kind of policy, and we could save it, and that would be our scaling policy. But we could have another scaling policy, which would be a step scaling policy, and this looks just like the one we had in the Auto Scaling group. So I'll call it "increase when the CPU is high," and we can use an existing alarm or create a new one. So we could create an alarm for our ECS service and call it "Dummy," and here we're saying: if the CPU is over 60% for one consecutive period of 5 minutes, then add two tasks, and so on. So this would be the kind of alarm we have. Then click the Save button. And this is the second kind of auto scaling policy we could have.
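The step scaling variant can be scripted too: you create a step scaling policy on the same scalable target and then point a CloudWatch alarm at it. A hedged sketch, with the threshold and adjustment taken from the example above and all names being placeholders:

```python
import boto3

autoscaling = boto3.client("application-autoscaling")
cloudwatch = boto3.client("cloudwatch")

resource_id = "service/demo-ecs-classic/demo-service"  # placeholder

# Step scaling policy: add 2 tasks whenever the attached alarm fires.
policy = autoscaling.put_scaling_policy(
    PolicyName="increase-when-cpu-high",
    ServiceNamespace="ecs",
    ResourceId=resource_id,
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="StepScaling",
    StepScalingPolicyConfiguration={
        "AdjustmentType": "ChangeInCapacity",
        "StepAdjustments": [
            {"MetricIntervalLowerBound": 0.0, "ScalingAdjustment": 2}
        ],
        "Cooldown": 300,
        "MetricAggregationType": "Average",
    },
)

# Alarm: service CPU above 60% for one 5-minute period triggers the policy.
cloudwatch.put_metric_alarm(
    AlarmName="Dummy",
    Namespace="AWS/ECS",
    MetricName="CPUUtilization",
    Dimensions=[
        {"Name": "ClusterName", "Value": "demo-ecs-classic"},
        {"Name": "ServiceName", "Value": "demo-service"},
    ],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=1,
    Threshold=60.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[policy["PolicyARN"]],
)
```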

The first one is very easy to reason about and easier to manage; the second one requires alarms to increase and decrease capacity when the alarms go off. But they're very similar to the kind of scaling policies we can have in Auto Scaling groups. So I'll just use the "stabilize at 40%" policy, click on Next Step, and click on Update Service. And now everything is going to get created for my service to be able to auto scale.

One thing I want you to notice is that, yes, we have defined auto scaling for our ECS service, but that doesn't mean that our EC2 instances will also auto scale. This is the tricky bit with auto scaling when you're using ECS Classic. If we view the service, we have auto scaling enabled, which means it will try to add a task when needed, tracking about 40% CPU utilization. But it doesn't mean that our Auto Scaling group will add instances if we need more instances to run these tasks. This is a major issue, because we would need to connect the ECS service scaling to the EC2 instance scaling, and that's really hard to do. So auto scaling for ECS Classic is possible but quite difficult to achieve, because we need to define two auto scaling policies: one for the service and one for the EC2 instances.

If you really think about it, Fargate is actually much easier because we don't rely on any EC2 instances. We can just take our Fargate service and update it, and here again we are able to define an auto scaling policy with the same parameters. When we need to scale, Fargate will figure out where to create our Docker containers, so we don't need to manage two auto scaling policies, just the one for our Fargate service. So it's a bit easier with Fargate, but what you need to remember is that both have the option to do auto scaling at the service level.

Finally, how could we do auto scaling simply for a classic cluster? An alternative would be to use Elastic Beanstalk. If we go to Elastic Beanstalk and create a new application, for example a multi-container Docker application, we're able to set a task definition and an auto scaling policy, and Elastic Beanstalk, when it adds EC2 instances, will also add task containers for our application. So the two things will scale together. This is an easier way of doing auto scaling with ECS backed by EC2 instances. You need to remember this going into the exam, because auto scaling is difficult for ECS Classic, but it does exist, and you need to understand the nuances of auto scaling within ECS. Auto scaling with Elastic Beanstalk or Fargate is another option. Alright, that's it. I will see you in the next lecture.

  1. ECS – CloudWatch Integrations

So now let's talk about the integrations between ECS and CloudWatch, because we know that CloudWatch is extremely important to understand for the DevOps exam. First, let's talk about CloudWatch Logs. In our task definition, and let's take task definition number two, we are able to define a log driver. So let's create a new revision. I will scroll down, click on this container right here, scroll down, and at the very bottom I have the storage and logging settings. Here we are able to define a log configuration, for example by selecting "Auto-configure CloudWatch Logs," and the log driver is going to be awslogs. But we have different options; we could use Splunk, for example. By defining awslogs as the log driver for our task definition, we're saying: send the container logs to this log group, in this region, with this stream prefix. So we can just auto-configure it.
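In a task definition registered through the API, the same setting is the logConfiguration block on the container. A hedged boto3 sketch, where the family, image, log group, role ARN, and region are all placeholders:

```python
import boto3

ecs = boto3.client("ecs")

# Register a revision whose container ships its stdout/stderr to CloudWatch Logs
# via the awslogs driver (names and values are placeholders).
ecs.register_task_definition(
    family="task-definition-2",
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    containerDefinitions=[
        {
            "name": "web",
            "image": "nginx:latest",
            "memory": 512,
            "essential": True,
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/task-definition-2",
                    "awslogs-region": "eu-west-1",
                    "awslogs-stream-prefix": "ecs",
                },
            },
        }
    ],
)
```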

Back in the console, we have the values in here being filled in. This will make our containers send their application logs directly into CloudWatch Logs, and that's really handy. For this, we don't need to install any CloudWatch Logs agent: the Docker container itself knows how to send logs to CloudWatch Logs, and it will use the IAM permissions that have, hopefully, been assigned correctly to the task execution role referenced in the task definition. So you need to make sure that this ECS task execution role has the necessary permissions to send logs to CloudWatch. Let's check that role; it's an ECS task execution role, and we may need to attach a policy. Maybe this one already has it, let's take a look. And yes, creating log streams and putting log events is allowed by the role. So the container now logs all the way through, using awslogs as the log configuration within the container.
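If you had to grant those permissions by hand rather than relying on the managed policy, an inline policy on the task execution role could look like this sketch (role and policy names are placeholders):

```python
import json
import boto3

iam = boto3.client("iam")

# Minimal CloudWatch Logs permissions for the task execution role so the
# awslogs driver can create streams and push log events.
log_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": "*",
        }
    ],
}

iam.put_role_policy(
    RoleName="ecsTaskExecutionRole",            # placeholder role name
    PolicyName="AllowCloudWatchLogsFromTasks",  # placeholder policy name
    PolicyDocument=json.dumps(log_policy),
)
```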

Then, if we are using ECS container instances backed by EC2 instances, we can also install the CloudWatch Logs agent, which allows us to send files from the OS of the EC2 instance into CloudWatch Logs. If I scroll down, we can send /var/log/dmesg, /var/log/messages, /var/log/docker, the ECS init log, as well as the ECS agent log, the audit log, and so on. So we would define a log configuration for the CloudWatch Logs agent, and all the way at the bottom you see a nice screenshot that they give you: log groups get created so that the CloudWatch Logs agent can send all those files from within the EC2 instance into CloudWatch Logs. So the difference here is that for logging what's happening on the instance with the EC2 launch type, we can use the CloudWatch Logs agent, and for sending the logs of the application itself, which is running in a Docker container, we use that task definition parameter to send the logs using the awslogs option. Okay, I think that's enough on logs; now let's move on to the next thing, which is CloudWatch metrics. If I go to CloudWatch and look at the metrics, we can see that the ECS service offers us a lot of different metrics. Let's go into Metrics, and here we are able to see ECS. We get metrics at the cluster level or at the cluster and service name level, which is more granular. If we look at CPU utilization for the demo ECS classic cluster name and the demo service name, we're able to see a graph of the CPU utilization. We could use this, for example, to define an alarm on top of it, and using that alarm we would be able to do scaling. This is how auto scaling was done behind the scenes, using these CloudWatch alarms and these CloudWatch metrics.
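To see those cluster-level and cluster-plus-service metrics programmatically, here is a small boto3 sketch; the cluster and service names are placeholder assumptions:

```python
import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client("cloudwatch")

# List the ECS metrics published for one cluster (names are placeholders).
metrics = cloudwatch.list_metrics(
    Namespace="AWS/ECS",
    Dimensions=[{"Name": "ClusterName", "Value": "demo-ecs-classic"}],
)
for metric in metrics["Metrics"]:
    print(metric["MetricName"], metric["Dimensions"])

# Pull the last hour of average CPU utilization at the cluster + service level,
# the same graph shown in the console.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/ECS",
    MetricName="CPUUtilization",
    Dimensions=[
        {"Name": "ClusterName", "Value": "demo-ecs-classic"},
        {"Name": "ServiceName", "Value": "demo-service"},
    ],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,
    Statistics=["Average"],
)
for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 2))
```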

Okay? And there is another thing that's really, really new, probably not at the exam just yet, but it's still good to show you. If I go to the account settings for ECS and scroll down, I have CloudWatch Container Insights; this is a new feature, and it's something you have to pay for. So if you tick that box, you will have to pay for it, but it will send per-container metrics into CloudWatch Logs. That means that if you go into CloudWatch Metrics, you will see a new category here that represents every single container that you have, and it provides you with the metrics of every single container. This is really helpful when you want to debug a single container and get more information about it. So this is a new feature, it's called Container Insights, and you have to pay extra for it. If you go to Clusters and look at the monitoring right now, it says default monitoring; these are the metrics that we see right here, the default monitoring settings. And if we enable CloudWatch Container Insights, these are the advanced metrics that we get.
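Enabling Container Insights can also be done outside the console; a hedged sketch of both the account-level opt-in and the per-cluster setting (the cluster name is a placeholder):

```python
import boto3

ecs = boto3.client("ecs")

# Opt the account in so new clusters get Container Insights by default.
ecs.put_account_setting(name="containerInsights", value="enabled")

# Or enable it on one existing cluster only.
ecs.update_cluster_settings(
    cluster="demo-ecs-classic",
    settings=[{"name": "containerInsights", "value": "enabled"}],
)
```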

Okay? This is also something you get when you create a cluster: for example, if I create this cluster right here and click on Next Step, at the very bottom of all the configuration I get CloudWatch Container Insights, and you could tick that box there to enable Container Insights. So we've done CloudWatch Logs and CloudWatch metrics, and the last thing is going to be CloudWatch Events. Let's go into CloudWatch and look at Events, and I'll get started by adding a service name. It's going to be ECS, and where is it? Here it is. We can look at state changes for specific detail types, for example an ECS container instance state change, and we could also look at an ECS task state change for specific clusters, for example these two clusters, which gives us this event pattern. Then, if we look at the sample events for this configuration, we don't get anything, but if I just remove the cluster filter, here we go: we get some information around, for example, when a task gets launched, when it gets terminated, when it fails, and so on.
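A rule like the one above can be sketched with boto3 and an event pattern; the cluster ARN and the SNS topic used as a target are placeholder assumptions:

```python
import json
import boto3

events = boto3.client("events")

# Match ECS task state changes for one cluster (ARN is a placeholder).
task_state_pattern = {
    "source": ["aws.ecs"],
    "detail-type": ["ECS Task State Change"],
    "detail": {
        "clusterArn": ["arn:aws:ecs:eu-west-1:123456789012:cluster/demo-ecs-classic"]
    },
}

events.put_rule(
    Name="ecs-task-state-change",
    EventPattern=json.dumps(task_state_pattern),
    State="ENABLED",
)

# Send matching events to an SNS topic (placeholder ARN), for example to
# drive a notification into Slack through a subscription.
events.put_targets(
    Rule="ecs-task-state-change",
    Targets=[
        {"Id": "notify", "Arn": "arn:aws:sns:eu-west-1:123456789012:ecs-events"}
    ],
)
```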

So these are the same events you would see if you went into a service and clicked on the Events tab, and all of these events would be available to you as CloudWatch Events. They can be really helpful to automate a lot of things, send notifications to Slack, for example, or get more information around the failures of your services. Okay? And then finally, we can always use a schedule, for example a schedule of one day, and every day we could add a target, and the target would be an ECS task. So it is possible for us to run this task definition every day in Fargate mode, and then we can define the task group and the platform version. For example, we could define some sort of scheduled batch job, or, if something happens that matches some event pattern, we could create an ECS task as the target to deal with it. So it could also be a really nice automation. ECS can be both an event source and an event target, and CloudWatch Events is going to be the centerpiece for building all the automation you need as a DevOps engineer, so it matters a lot for the exam. So that's it for this lecture. I hope you liked it, and I will see you in the next lecture.

  1. ECS – CodePipeline CICD

So this is a really long workshop, and I don't want us to do it live because it would take a lot of time, and it's not really necessary for the exam to know how to do it precisely. However, you must have a high-level understanding of how to implement a CI/CD pipeline for ECS. So here are four links that I have compiled for you. The first one is from ecsworkshop.com, and it shows how we can have a CI/CD pipeline for ECS. Developers push to the Git repository, which could be CodeCommit or GitHub, containing a Dockerfile. Then CodePipeline picks up the changes and invokes CodeBuild to build the Docker image from the Dockerfile found in the Git repository. When CodeBuild is done building the Docker image, it pushes it to Amazon ECR. Then a CloudFormation template is run by CodePipeline, and this CloudFormation template pushes a new task definition into Amazon ECS, and that task definition references the new Docker image that was pushed to ECR. Therefore Amazon ECS, whether on EC2 instances or Fargate containers, will pull the latest version of the Docker image directly from Amazon ECR. So this is quite a common setup, and if you're interested in building it out, you can go to this link: ecsworkshop.com/introductioncid.

There are also three different tutorials you can do, one on the ECS documentation page about continuous deployment with CodePipeline, which you could do on your own time. It shows you how to write a buildspec.yml file to build and push Docker images into ECR. At a high level, it sources from CodeCommit, builds with CodeBuild, and then deploys directly into ECS, which is pretty easy. The next thing you can see is that you can do deployments with CodeDeploy. So CodeDeploy is an option for ECS deployments and blue/green deployments. The same way we've seen that CodeDeploy works with EC2 and on-premises instances and with Lambda, it also works with ECS services. If you scroll all the way down to see what the pipeline looks like, we can see that when the source changes and an image is added to Amazon ECR, this pipeline will deploy to ECS using the ECS service and the CodeDeploy blue/green option.

So it's quite a nice option and a fun tutorial to do as well, but at a high level I want you to remember that CodeDeploy can be used to do blue/green deployments on Amazon ECS. Finally, there is a blog that appears to be very similar to the one on this page, which you can use to build a continuous delivery pipeline for your container images using Amazon ECR as a source. So overall, I want you to remember that there are different kinds of pipelines you can build for doing continuous deployment and continuous delivery on ECR and ECS, all the way from CodeCommit into CodeBuild into ECR, with some CloudFormation, and even some CodeDeploy if you wanted to. That's pretty cool. And then finally, I want you to look at ECR. When you go to ECR and go to repositories, for example, I've created a demo repository and we've pushed a demo image. We can see that there's an image tag, and that's "latest". And it's not an immutable image tag, which means that if we push a new image tagged "latest", it will override that image. Something we can do to make sure that the right image is pulled is to use either a specific version tag or the image digest.
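Here is a hedged sketch of how you could look up that digest with boto3 and build an immutable image reference for a task definition; the repository name, account ID, and region are placeholders:

```python
import boto3

ecr = boto3.client("ecr")

# Look up the image currently tagged "latest" in the demo repository
# and grab its immutable SHA-256 digest.
response = ecr.describe_images(
    repositoryName="demo",
    imageIds=[{"imageTag": "latest"}],
)
digest = response["imageDetails"][0]["imageDigest"]  # e.g. "sha256:abc123..."

# Reference the image by digest instead of by tag, so a later push of
# "latest" cannot silently change what the task definition runs.
image_uri = f"123456789012.dkr.ecr.eu-west-1.amazonaws.com/demo@{digest}"
print(image_uri)
```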

So we would tag the version with 123456, for example, and that would allow us, when we push a new task definition, to always reference the right Docker image in here. Or we could use the digest, the long SHA-256 digest that uniquely identifies the image. The way you would do this is with docker run, then the repository name and the image, using the @ sign: demo@sha256: followed by the digest. That uniquely identifies the Docker image, which makes you certain that the right deployment will happen in your task definition. So, instead of docker run demo:latest, you run it with the digest instead of using tags. So that's it. That's all I want you to remember about the CI/CD pipelines. I hope you liked this, and if you want to practice at your own pace, I definitely encourage you to do these tutorials. It's always a good idea to get your hands dirty with CodePipeline, CodeBuild, CodeCommit, and CloudFormation. I'll see you in the next lecture.

  1. OpsWorks – Getting Started Part 1

Okay, so let's get started with OpsWorks. OpsWorks is one service that I find quite hard to use on AWS because it requires some knowledge of Chef, and this is not a course about Chef, so I'll give you just as much knowledge as you need to understand it. Overall, OpsWorks is quite complicated and only comes up in a few questions at the exam, so I'll make sure to stress during these lectures what is important to know for OpsWorks. So let's get started. To make things easier, OpsWorks offers three distinct services: OpsWorks Stacks, OpsWorks for Chef Automate, and OpsWorks for Puppet Enterprise. Fortunately, OpsWorks Stacks is all we'll need to know for the exam. To put it simply, OpsWorks Stacks is about using Chef cookbooks to deploy your application. With OpsWorks, we're going to manage our infrastructure, our configuration, and our applications within one service. This is for people who already use Chef, which is open source, in their on-premises infrastructure, and who want to migrate to the cloud and keep using Chef there as well. So what we'll do is start fresh and add our first stack. Now, a stack is a set of layers, instances, and related AWS resources whose configuration you may want to manage together. If we go here in the documentation, it shows us how OpsWorks works.

So there are the users; they access the Internet, and they may access our application through a load balancer. There will be a bunch of app servers and maybe a database server; we're pretty used to this kind of architecture. For OpsWorks, it translates into this: the users access the Internet, and the Internet reaches our AWS environment. Within it, we have a VPC, and then we have the OpsWorks stack. Our stack is made of a couple of layers. Here we have an Elastic Load Balancing layer, here we have an application server layer, and here we have an Amazon RDS layer. So, as you can see, within the same OpsWorks stack, we've defined three different layers; you need to remember this. There's one load balancing layer, one application layer, and one RDS layer. The first and last ones are Amazon-managed, whereas the application layer is up to you to provision. For this, you can use something called cookbooks, and cookbooks contain Chef recipes, which are just code in a specific format that tells the instances how to provision themselves. And then, from the app repository, you also have an app that gets deployed as part of your layer. So let's go hands-on and see how that works. We are going to start with a sample stack, but first we'll click on "Chef 12 stack" to see what options we have. We have a region we can launch into, and I will choose EU Ireland, and then a stack name that we could define. Then we have to define a VPC and a default subnet, which is where our instances will be launched by default, and a default operating system; we'll use Amazon Linux 2. We could set an SSH key that we already have if we wanted to SSH into our instances, and we use version 12 of Chef. We could also define custom Chef cookbooks to provision our application; they could sit in Git, S3, or an HTTP archive. Git would be, for example, CodeCommit or GitHub, and so on. We could also set an SSH key to access the Git repository.
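Those same options map onto the CreateStack API. A minimal boto3 sketch, where every ARN, ID, OS name, and cookbook URL is a placeholder assumption:

```python
import boto3

# The client's region is the API endpoint region mentioned later in the lecture.
opsworks = boto3.client("opsworks", region_name="eu-west-1")

# Create a Chef 12 stack with custom cookbooks pulled from a Git repository.
opsworks.create_stack(
    Name="my-stack",
    Region="eu-west-1",                 # where the stack's instances run
    VpcId="vpc-0123456789abcdef0",
    DefaultSubnetId="subnet-0123456789abcdef0",
    DefaultOs="Amazon Linux 2",         # assumed OS name
    ServiceRoleArn="arn:aws:iam::123456789012:role/aws-opsworks-service-role",
    DefaultInstanceProfileArn="arn:aws:iam::123456789012:instance-profile/aws-opsworks-ec2-role",
    ConfigurationManager={"Name": "Chef", "Version": "12"},
    UseCustomCookbooks=True,
    CustomCookbooksSource={
        "Type": "git",
        "Url": "https://github.com/example/my-cookbooks.git",
    },
)
```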

So the important thing to remember here is that we have to provide the Chef cookbooks and indicate to OpsWorks where they are: they may be in Git, an HTTP archive, or an S3 archive. For now, we're not going to use any custom Chef cookbooks. In the advanced section, we could look at the default root device type: would it be EBS-backed or instance store? Then there's the IAM role for the OpsWorks service itself, and the default IAM instance profile that will be assigned to our instances once they are created. The API endpoint region will be eu-west-1, the same region as the one we chose up there. The rest is pretty advanced, and we won't look at it. Okay, so we'll cancel this because we don't want this stack, and we'll add our first stack; sorry, go to the sample stack and choose Linux. We'll choose a Linux Chef 12 sample stack that will deploy a Node.js application for us. So let's click on "create stack," and here we go: it's going to create the stack with its Chef cookbooks, add the layer, assign recipes to the lifecycle events, and add an instance to that layer. We'll explore the sample stack together now. Okay, so we are in the OpsWorks UI, and as you can see, there are a lot of different things to click on, so there's a lot of knowledge to have, but I will show you what is necessary for the exam. The first thing you want to look at is the stack settings, and these are the settings that we could have set on our own. Because we used a sample stack, we can see that the region that was defaulted for us was US West (Oregon).

So I think this is the same region you'll be using. It shows us that the cookbooks were custom and that they came from an HTTP archive at this location, so we could download this archive and poke around in it. Then, in terms of the advanced options, the instances are EBS-backed, the IAM role for the OpsWorks service is here, and the IAM instance profile is here, so we could open up both. And we are in us-west-2 for the API endpoint region. Okay, this is it for the settings, and we could always go ahead and edit them. Now back to our stack. Here's our stack, and it shows us that we have a layer that is a Node.js app server layer, and we also have an app that is a Node.js sample. In terms of instances, we have one, but currently it's stopped, so we'll look at the instances in a second. Why don't we go ahead and click on Layers? A layer defines a blueprint for a set of EC2 instances. It specifies the instance settings, the associated resources, the installed packages, profiles, and security groups. We can also add recipes to lifecycle events, and we'll have a lecture dedicated to lifecycle events. Let's go ahead and click on this layer; here is the name setting, and so on. We could change all of these things, and the instance shutdown timeout is 120 seconds. One important setting to notice is whether or not auto healing is enabled, and yes, it is.

That means that if an instance is unhealthy, then the OpsWorks stack will reprovision that instance for us using auto healing, and that's quite nice. Then, for recipes, these are the recipes pulled from the repository URL in this link that say how to set up, configure, deploy, undeploy, and shut down our application. Currently we just have one recipe assigned to the deploy event, and that's the Node.js demo one. For the network, we could add an Elastic Load Balancer here, but we haven't provisioned one, so that won't work. We could have EBS volumes and security groups, and look at CloudWatch Logs. If we tick that box and click on Upgrade, it will add the OpsWorks CloudWatch Logs managed policy to the IAM role that was created, allowing our instances to send logs into CloudWatch. So we'll click on Upgrade, and this is done. Next, we click to edit the layer settings to enable this CloudWatch Logs integration; I'll turn this on, yes, keep the default log stream settings, and just click on Save, and we'll be done. Okay, next, tags: we could tag this layer, and we're done. So our layer is now defined, there is one instance in it, and currently it's stopped. So let's click on Instances and see what happens. This instance is a Node.js server, and it's currently stopped.

The size is t2.medium. What I'm going to do is click on it and look at the configuration in greater detail. This instance is assigned to our layer, and since it's not launched yet, we don't have an EC2 instance ID, but we do have an OpsWorks ID because it was created through OpsWorks. The instance type is 24/7, which means that the instance is meant to be running all the time. The size is t2.medium, and I'm going to change this to t2.micro so we don't incur cost, and then the rest is just basic settings for the subnet, for Amazon Linux, and so on. So let's go ahead and click on Edit, and I want to change this to a t2.micro. This way, we don't get billed for it.

We could also set an SSH key if we wanted to. Okay, let's click on Save, and we're done. So now we have our instance, and we can get started by clicking on Start, which will take quite a long time. If we go to EC2 now from the services menu, in us-west-2, we should see the instance very soon; it's lagging a little bit for me because I'm far away from that region. Now the instance is being created and is in the pending state; this other one was from some experimenting that I did on my own. So here we go, here is our instance, pending and being created through OpsWorks. Back in the OpsWorks UI, we can see that our Node.js server is in the pending state and will go through some configuration. So I'm going to pause the video and get back to you when this is done. Our instance has now booted and is running the setup. Then we get a new status: the setup is done, and our instance is now online. So, if I go to the public IP, I should see "Congratulations, you have just deployed your first OpsWorks app." We didn't do much, but we went through the sample app, and that was enough. Our app is online, and there's a lot more I want to tell you, so why don't we pick this up in the next video? See you in the next lecture.

  1. OpsWorks – Getting Started Part 2

So we have our app running, but let's talk about these instances, because this is a bit different from what we've seen before. The instances are managed by OpsWorks, and OpsWorks was the one to provision the instances for us. We have one app server, and it's a 24/7 type of server, and we're able to stop it and start it from this UI. We could also do this from the EC2 management console, but it's much better to do it from the OpsWorks console.

Now we have other kinds of instances, though: we have time-based instances and load-based instances. Let's first take a look at time-based instances. OpsWorks will automatically start and stop time-based instances based on a specified schedule. So we'll add a time-based instance; for example, this is our Node Server 2, and the size is going to be a t2.micro as well. Let's find it, here we go, t2.micro, with the subnet in us-west-2b. In the advanced settings, we can see that it is a time-based type of instance. We could set an SSH key, the operating system, and everything else, but everything looks pretty standard. Okay, let's add this instance and see what happens. Our instance has been added, and as you can see, now I need to define a schedule for when I want my instance to run. So maybe I know that between 9:00 a.m. and 1:00 p.m. there is a need for that instance to come up. During this window, every day, OpsWorks will bring my instance up and then down. And perhaps I know that on Tuesday as well,

during a longer time frame, I really need it, so I can set it up for Tuesday to have this instance stay up longer. I'll include this: every day it will run between 9:00 a.m. and 1:00 p.m., but on Tuesday it will go all the way to 4:00 p.m. So here I'm really able to define a schedule for my instance, and when we enter that schedule, OpsWorks will bring up the instance and then bring it down. This is quite powerful, and we are able to add many instances of this type. So we could add another one like this, another t2.micro, and here we're able to define another schedule for that one instance. So we're really free to decide how we want our time-based instances to work; they're scheduled, and OpsWorks will bring them up and bring them down based on the schedule. Following that are load-based instances, which start and stop in response to changes in CPU, memory, and application load across all instances in the layer. We could edit the configuration, but let's just go ahead and add a load-based instance, and this is the one; maybe we want another t2.micro to be started as well.

And the subnet is us-west-2d; the advanced settings are the same as before, but this time the instance is load-based. So we'll add the instance, and here we go, we now have a load-based instance, but it isn't running because load-based auto scaling is turned off. To enable it, we simply turn it on and then specify the scaling configuration. We'll just leave it at the defaults, which means that above 80% CPU it adds instances and below 30% it removes instances. Okay, so now we have three kinds of instances running: we have the 24/7 kind, we have the time-based kind, and we have the load-based kind.
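Both of those instance types can be configured through the OpsWorks Stacks API as well. A hedged sketch, where the instance ID, layer ID, and thresholds are placeholders matching the walkthrough above:

```python
import boto3

opsworks = boto3.client("opsworks")

# Time-based instance: run every day from 9:00 to 12:59, and until 15:59 on Tuesdays.
# Schedule keys are hours of the day mapped to "on".
weekday_hours = {str(hour): "on" for hour in range(9, 13)}
tuesday_hours = {str(hour): "on" for hour in range(9, 16)}
opsworks.set_time_based_auto_scaling(
    InstanceId="instance-id-placeholder",
    AutoScalingSchedule={
        "Monday": weekday_hours,
        "Tuesday": tuesday_hours,
        "Wednesday": weekday_hours,
        "Thursday": weekday_hours,
        "Friday": weekday_hours,
        "Saturday": weekday_hours,
        "Sunday": weekday_hours,
    },
)

# Load-based scaling for the whole layer: add an instance above 80% CPU,
# remove one below 30%.
opsworks.set_load_based_auto_scaling(
    LayerId="layer-id-placeholder",
    Enable=True,
    UpScaling={"InstanceCount": 1, "CpuThreshold": 80.0, "ThresholdsWaitTime": 5},
    DownScaling={"InstanceCount": 1, "CpuThreshold": 30.0, "ThresholdsWaitTime": 5},
)
```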

You have to realize that each of these instances corresponds to one server. The idea is that OpsWorks needs to know in advance how many servers you want to run. It's not like an Auto Scaling group, where you just set a minimum and a maximum and instances get created for you. In OpsWorks, you assign these servers a type, 24/7, time-based, or load-based, and an AZ, and they are created ahead of time and then stopped and started as needed. This is really important to understand: it's not as flexible as auto scaling, even though there is a load-based type of instance. Okay, so here we have four instances, and this is quite good. We could start all instances if we wanted to, or stop all instances right here as well. Okay, so let's move on to our applications. Our apps define how the code stored in a repository should be installed on the application server instances. If we click on the sample app in here, we have the information that the repository URL for this app is coming from Git at this URL. And if we go to this URL and check it out, we should see that there is some code for us to look at, and this code contains the simple application that OpsWorks needs to deploy. Okay, so this is pretty good; we could deploy that app onto our instances.

So we could run a deploy command and select which instances to deploy it on. Right now, this would deploy to one of four instances because the others are not active. So you can trigger a manual deployment directly from OpsWorks, and again, all these things happen within OpsWorks. Talking about deployments, you could create a deployment here by deploying an app, or you could choose to run a command on all your instances, so you could execute recipes. The recipes are in Chef cookbooks, and you choose which ones you want to run. Or you can run the setup or configure commands, upgrade the operating system, or update the custom cookbooks. Then you would select all the instances on which you want this command to run. So again, from within OpsWorks, if you're very familiar with Chef, you are able to update and act on all your instances at once and manage them from there. That makes OpsWorks kind of an all-in-one integrated solution. You also have some monitoring, where you can look at the CPU average, the memory average, the load, the number of processes, and so on. And then you have permissions, if you want to create or import users to use your OpsWorks stack.

Overall, this is a lot of information. But I want you to remember that there are layers, and we could add another layer: an OpsWorks layer, which is what we have here, or an ECS layer if we needed one, or an RDS layer if we needed a database. This corresponds to the RDS layer we saw in the documentation. There is a lot of complexity, but for this, you would need to create an RDS DB instance in RDS first, and then you would register it under this layer in OpsWorks. Okay? So we can have as many layers as we want, and usually we have three: a load balancing layer, an application layer, and an RDS layer, and we're good to go. So what do we remember out of all this? Well, we remember that our stack has many layers, and each layer has instances. There are three types of instances: time-based, load-based, and 24/7, and they will come up and go down based on the rules that we define.

When an instance is launched, for example our Node.js server, some events and commands are executed. For example, the setup command ran for five minutes, and then the configure command ran for 47 seconds. We're able to view the logs if we want to; this is what a log looks like when you run some Chef commands, and it's pretty verbose, but it shows that the cookbooks were updated and then run. So far, so good. So our instances run some commands whenever they come up. And then we can deploy different apps, so we can create not just one app but many different apps as part of our OpsWorks stack to deploy onto the instances. So all of this is pretty convoluted, as you can tell, but you have to remember that OpsWorks does a lot of things at the same time: it is an instance manager, a layer manager, and an app manager; it does monitoring, and so on. If you're using Chef cookbooks, it's a fully integrated solution. So that's it for the overview. In the next lecture, I'll show you something very important about stacks in OpsWorks. So, until the next lecture.

  1. OpsWorks – Lifecycle Events

So the most important thing you need to remember for the DevOps exam with OpsWorks is the lifecycle events. This is the most important thing to remember. Each layer has a set of five lifecycle events, each of which has an associated set of recipes that are specific to the layer. When an event occurs on a layer's instance, for example an instance comes up, AWS OpsWorks Stacks automatically runs the appropriate set of recipes. And to provide a custom response to these events, you can implement custom recipes, which are code, basically Chef code, and assign them to the appropriate events for each layer.

And then OpsWorks Stacks will make sure that those recipes are run, in addition to the built-in ones, when the events occur. So let's go to our layer and see if that means anything to us. We'll go in here and click on Recipes, and here we see the five lifecycle events: setup, configure, deploy, undeploy, and shutdown. If we click on Edit, we are able to add some recipes from our cookbooks here to define what to do during setup, configure, deploy, undeploy, and shutdown. It is very important to remember these five events and when they run. So let's go to the documentation to learn about them. Setup occurs after an instance has finished booting; whenever the instance has booted, setup happens, and it will set up whatever you want. You can also run your custom cookbook to set up more things on the instance. Then we have the configure event, and this is by far the one you need to remember the most.

This event occurs on all of the stack's instances when one of the following happens: an instance enters or leaves the online state, we associate an Elastic IP, or we attach an Elastic Load Balancer to a layer or detach it from a layer. So this is super important to understand: the event occurs on all of the stack's instances at the same time. Let's talk about it. Say we have four instances, assume they're all online, and we have a fifth one coming up. The fifth one will first go through the setup phase and be set up. The configure event will then be triggered, but not just on the fifth instance; it will kick in on instances 1, 2, 3, 4, and 5. The key point here is that during this configure event, all of the stack's instances run their configure recipes. So if you have a distributed application, you can, at this stage, using this OpsWorks configure event, make sure to reconfigure all of the instances in response to one of them coming up or going down. This is really helpful, for example when you have an HAProxy or something similar; basically, when you have a distributed application, you want to make sure that all the instances are configured to know about each other. So this is one of the most important events to know about.

And remember, it runs on all instances if one instance enters or leaves the online state. Okay, back to our layer; let's go here and then to Recipes, edit. So that was the configure stage: the instance is set up, and then every instance that is up and running gets configured. Then we have deploy, and deploy is exactly what you'd expect. This event occurs when you run a deploy command, typically to deploy an application to a set of application server instances. The instances run recipes that deploy the application and any related files from its repository onto the layer's instances. So by clicking on the Deploy button, it would deploy the application to all the instances. And something to note is that the setup event also includes a deploy.

So when our instance is first set up, it will also deploy the application on it; that's good to know. Then there's undeploy, and this event occurs when we delete an app or run an undeploy command to remove an app from a set of application server instances. So when we run an undeploy, for example because we're deleting an app, the app is removed. Finally, there's shutdown, the fifth hook. This happens when we shut down an instance, but before the associated EC2 instance is actually terminated, so it's used to perform cleanup tasks, such as shutting down services. And remember, if an instance shuts down, it leaves the online state, and therefore the configure hook will run again on all the stack's instances. So what you need to remember from this lecture is that setup, deploy, undeploy, and shutdown are instance-specific and run individually, but configure runs on all the instances whenever one of them comes online or goes offline. This can be used to configure all the instances at once and make them aware of each other. You need to remember this going into the exam, because I think that's one of the most important things when OpsWorks comes up.
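Assigning your own recipes to those five events can be scripted as well; here is a hedged sketch using update_layer, where the layer ID and the cookbook and recipe names are entirely hypothetical:

```python
import boto3

opsworks = boto3.client("opsworks")

# Attach custom recipes to each of the five lifecycle events of a layer.
# "myapp" is a hypothetical cookbook; each entry is cookbook::recipe.
opsworks.update_layer(
    LayerId="layer-id-placeholder",
    CustomRecipes={
        "Setup": ["myapp::setup"],
        "Configure": ["myapp::configure"],  # runs on ALL instances when any comes up or goes down
        "Deploy": ["myapp::deploy"],
        "Undeploy": ["myapp::undeploy"],
        "Shutdown": ["myapp::shutdown"],
    },
)
```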

Okay, so one question you may ask me is: what happens during this configure event, and what data do we have access to? Well, this link on stack configuration and deployment attributes shows us the kind of data that is sent to all the instances during the configure event, and this is what it looks like. We have a large JSON document that contains the layers, and each layer contains the list of all its instances. For each instance, we have the instance name and, among other things, its IP address, the AZ, the instance ID, the private DNS name, when it was created, whether it's online, the backends, the public DNS name, and so on. Using this information, we are able to get the list of all the other instances, their IPs, their AZs, and so on, and maybe create a distributed configuration file. So the configure step and hook in OpsWorks Stacks is an excellent use case for this. Okay, so that's it for this lecture. Hopefully that makes sense. I know it was quite theoretical, but given how OpsWorks comes up at the exam, I don't want to go too deep and bother you with details. So try to remember the fact that we have five lifecycle hooks for instances, and I will see you in the next lecture.

  1. OpsWorks – Auto Healing & CloudWatch Events

So let's talk about the integration of OpsWorks and CloudWatch Events. We saw in the general settings that auto healing was enabled, and that means that if a layer's instance fails, it will be automatically healed. So what does it mean to be automatically healed? For an Amazon EBS-backed instance, the instance is stopped, the OpsWorks service verifies that it has stopped, and then it starts the EC2 instance again. Okay, and what does it mean to be unhealthy? Well, every instance has an OpsWorks agent that communicates regularly with the service, and if the agent has not communicated with OpsWorks for five minutes, the instance is considered to have failed, and then auto healing kicks in. Auto healing is something that's pretty important to watch, because you don't want it to happen too often; you don't want your instances to fail for whatever reason. So what if you want to be notified when it happens? Well, that's a very good question, and we know there is something called CloudWatch Events.

So, if we go to CloudWatch, we'll look at the rules on the left-hand side and then create a rule. We can select the service name OpsWorks there, and the event type would be, for example, OpsWorks instance state change. Then we can filter for specific statuses like connection lost, start failed, stop failed, and so on. So we could be looking at a lot of different things in here. And then, as a target, for example, we could send this to an SNS topic, and then, through the SNS topic, send ourselves an email when these things happen. So CloudWatch Events allows us to create rules to monitor our OpsWorks instances and get information about when they come online, when they get stopped, and that kind of thing. All these operational details allow us to build a lot of automation if we need to as a DevOps team. Okay, so this is all I want you to remember about auto healing: that it happens automatically, and if you want to get notified about it, you need to use CloudWatch Events rules. All right, that's it. I will see you in the next lecture.

  1. OpsWorks – Summary & Cleanup

So let's summarize and clean up everything. In OpsWorks, we need to remember that we create stacks, and each stack has a set of instances. These instances can be 24/7, time-based, or load-based, and we have to define them in advance; they will then be managed by OpsWorks automatically. When an instance comes up, it reacts to a set of lifecycle events, and these lifecycle events can be handled by recipes. We have the setup, configure, deploy, undeploy, and shutdown lifecycle events, and configure is the one that runs on all the instances if one of them comes up or goes down, allowing us to configure all the instances at once in a distributed fashion. We are able to define apps in here, and the apps must come from an application source, which could be Git or HTTP, alongside the Chef cookbooks we need to deploy our application.

We're then able to deploy an app onto our instances, and we could also run a command if we wanted to, using a custom cookbook. We have the option to monitor our instances, and we get user management through permissions. So that's it; that's all you need to remember for OpsWorks. I know that's a lot, but it should be enough for the exam. When it comes to cleaning up OpsWorks, you need to delete the layer, and you cannot do this until you delete all your instances. So we need to stop the instances first, and then we'll be able to delete them: we'll delete this one, that one, and also this one. Now we wait for this server to be stopped, and our instance can now be deleted; yes, I want to delete it. Then I go to Layers, and I can delete this layer. And then, finally, I can go into the stack and delete the stack itself, but for this we also need to delete the apps first. So let's go ahead and delete the app; here we go, the app has been deleted, and now we can delete the stack altogether, and we're done with OpsWorks. So thank you for watching, and I will see you in the next lecture.
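For reference, that same cleanup order (stop and delete the instances, then the layer, then the app, and finally the stack) could be scripted roughly like this; every ID is a placeholder, and the waiter name is my assumption:

```python
import boto3

opsworks = boto3.client("opsworks")

# 1. Stop the instance and wait until OpsWorks reports it stopped.
opsworks.stop_instance(InstanceId="instance-id-placeholder")
opsworks.get_waiter("instance_stopped").wait(InstanceIds=["instance-id-placeholder"])

# 2. Delete the instance, then the now-empty layer.
opsworks.delete_instance(InstanceId="instance-id-placeholder")
opsworks.delete_layer(LayerId="layer-id-placeholder")

# 3. Delete the app, then the stack itself.
opsworks.delete_app(AppId="app-id-placeholder")
opsworks.delete_stack(StackId="stack-id-placeholder")
```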
