Amazon AWS Certified Database Specialty – Amazon DynamoDB and DAX Part 5

  1. DynamoDB backup and restore – Hands on

Now, let’s quickly look at the backup and restore features in DynamoDB. So here is our gameplay table, and I’m going to go to the Backups tab. This is where you can create backups and restore from existing backups. Before we create a backup, let’s look at the items in the table.

So we have these two items here, right? What we’re going to do is create a backup, so go to the Backups tab and click Create backup. Let’s name it gameplay-manual-backup, and hit Create. And that’s about it. The backup is created and its status is Available. Now you can use this backup for restore purposes. And just so you know, you can also access backups from the left-hand side menu.

So if you click on this Backups option, you can see the backups of the different tables that have been created so far. You can select a backup here and restore it, or you can restore from the table’s own Backups tab; either way, you select the backup and carry out the same restore process. From here, you can delete the backup or restore the backup. So let’s see how restore works.

We’re going to restore the backup. You have to give a new table name, so let’s say gameplay-restored, right? And you can restore the entire table data with its secondary indexes, or restore it without the secondary indexes; you can choose whichever you want. You can restore in the same region or in another region. The encryption option from the time of the backup will be preselected here, and you can change it if you like. Then it shows you the backup details.

So it shows you the partition key, the sort key, and the pricing mode at the time of backup: if the table was provisioned it shows up as provisioned, and if it was on-demand it shows up as on-demand. If you want to change the pricing mode, you can do that after the restore process is done, right? And once you have made your choices, you can review the options that you have selected and simply hit the Restore table option to restore the table.

And now you see that a new table is being created, and it will take a while for this table to be created. So I’m going to pause the video here and come back when this table is ready. All right, I’m back. And you can see that the table has been restored with a new name. And if you look at the items, you can see that we have the same number of items with the exact same data. This is how easy it is to restore from a backup.
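
For reference, the same create-and-restore flow can be scripted with the AWS SDK. Here is a minimal boto3 sketch (not part of the lecture), using the table and backup names from this demo:

```python
import boto3

dynamodb = boto3.client('dynamodb')

# Create an on-demand backup of the gameplay table
backup = dynamodb.create_backup(
    TableName='gameplay',
    BackupName='gameplay-manual-backup',
)
backup_arn = backup['BackupDetails']['BackupArn']

# Restore the backup; restores always create a brand new table
dynamodb.restore_table_from_backup(
    TargetTableName='gameplay-restored',
    BackupArn=backup_arn,
)

# Wait until the restored table becomes ACTIVE
dynamodb.get_waiter('table_exists').wait(TableName='gameplay-restored')
```

Remember that stream settings, TTL, auto scaling configuration, alarms, and tags are not carried over, so a script like this would typically reapply them once the restore completes.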

  2. Continuous backup with PITR

All right, now let’s look at continuous backups with PITR. Continuous backups with PITR, or point-in-time recovery, allow you to restore your table data to any second in the past 35 days. All right? And this is priced per GB based on the table size. An important thing to remember here is that the restore window starts from the time you enable PITR.

So if you have just enabled PITR, you’re not going to be able to restore to any second in the last 35 days; you can only restore back as far as the moment you enabled PITR. And if you disable PITR and re-enable it, the 35-day clock gets reset. Continuous backups work with both unencrypted and encrypted tables, as well as with global tables, and they can also be enabled on each local replica of a global table.

Now, we’re going to look at what global tables are later in this section. If you restore a table that is part of a global table, the restored table will be an independent table; it won’t be part of the global table. And the restore process, whether it is a continuous backup with PITR or an on-demand backup, always restores to a new table.

What doesn’t get restored is the same as with on-demand backups: stream settings, TTL settings, auto scaling configuration, PITR settings, alarms, and tags. These things do not get restored, and you have to set them up manually if you want them on the new table. And all the PITR API calls do get logged in CloudTrail for auditing purposes. Now let’s go into a demo and find out how to enable continuous backups with PITR, and also take a look at how to run a restore using PITR.

  3. Continuous backup with PITR – Hands on

In this demo, let’s look at how to use continuous backups with PITR. So, this is our gameplay table, and let’s go to the Backups tab. Here you can enable the point-in-time recovery feature, or the PITR feature, which allows you to restore your table data to any second in the past 35 days. So, simply click on Enable, and from the dialog box, again click on Enable.

Now, this shows us that the status is Enabled. So once you have enabled PITR, DynamoDB is going to show you two timestamps: the earliest restore date and the latest restore date. Right now they are the same, and as time progresses this is going to change. If you check after some time, the earliest restore date will not change, but the latest restore date will. So basically, what it means is you can restore to any second between these two timestamps, the earliest restore date and the latest restore date. So what I’m going to do now is go to the Items tab and delete user two. Okay? All right, user two is gone, and if I now restore this data using the continuous backups, I’m going to get this data back. So let’s see how it works.

Again, go to the Backups tab, and you can restore to a point in time using this option. So click on Restore to point-in-time and give it a new table name: gameplay-pitr-restored. Right. And here you can choose the exact timestamp at which you want to restore your data. Of course, right now we can only restore to the moment we enabled the PITR feature, because we enabled it just a minute ago; otherwise, you would have up to 35 days of time period to restore your data. All right, the rest of the settings are the same as in the regular restore feature, so I’m not going to go into that. We can simply click on the Restore table option to see how the restore happens. Okay, and now you can see that a new table is being created. Again, this is going to take some time, so I’m going to pause the video here and come back once this table is restored.

And you will see that this new table also has two items, so we should be able to get back the item that we deleted. All right, our table has been restored to a point in time, so let’s look at the items. And there we go: we have the second user that we deleted. So if you look at the original table, you have only one item, but the restored table has two items, including the one that was deleted. All right, and before we close, let’s go back to the original table and go to the Backups tab. Here you can now see that the latest restore date has changed. So you have a period of time between the two timestamps, and you can restore to any point between them. And if you refresh the page, you’re going to see that this timestamp changes; refresh again, and it moves forward once more. Simply put, you can restore to any second between these two timestamps. All right, so that’s about it. Let’s continue.
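
For reference, here is what the same PITR workflow looks like with boto3; a minimal sketch, not from the lecture, using the table names from this demo:

```python
import boto3

dynamodb = boto3.client('dynamodb')

# Enable point-in-time recovery on the table
dynamodb.update_continuous_backups(
    TableName='gameplay',
    PointInTimeRecoverySpecification={'PointInTimeRecoveryEnabled': True},
)

# Inspect the restore window (the two timestamps shown in the console)
desc = dynamodb.describe_continuous_backups(TableName='gameplay')
pitr = desc['ContinuousBackupsDescription']['PointInTimeRecoveryDescription']
print(pitr['EarliestRestorableDateTime'], pitr['LatestRestorableDateTime'])

# Restore to a new table; pass RestoreDateTime for a specific second,
# or UseLatestRestorableTime to restore to the latest point available
dynamodb.restore_table_to_point_in_time(
    SourceTableName='gameplay',
    TargetTableName='gameplay-pitr-restored',
    UseLatestRestorableTime=True,
)
```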

  4. DynamoDB encryption

In this lecture, we are going to talk about DynamoDB encryption. DynamoDB encryption is integrated with the KMS service. DynamoDB provides server-side encryption at rest, and this is enabled by default; you can’t turn it off, though you can choose which key is used. And it’s transparent to the users: if you log into your AWS console, you will still see the data in plain text, but it is stored encrypted. This, of course, uses KMS, as I mentioned, and KMS uses 256-bit AES encryption for the encryption at rest. We can use any type of CMK; CMK stands for Customer Master Key, and that is the key used for encryption. So we can use the AWS owned CMK, the AWS managed CMK, or a customer managed CMK; all are supported. And what gets encrypted is the primary key, secondary indexes, streams, global tables, backups, and DAX clusters. So all of these things do get encrypted with server-side encryption.
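
As a quick illustration, here is a hedged boto3 sketch of creating a table with server-side encryption backed by a customer managed CMK; the key alias and the attribute names are placeholders, not from the lecture:

```python
import boto3

dynamodb = boto3.client('dynamodb')

dynamodb.create_table(
    TableName='gameplay',
    AttributeDefinitions=[
        {'AttributeName': 'user_id', 'AttributeType': 'S'},  # illustrative key schema
    ],
    KeySchema=[
        {'AttributeName': 'user_id', 'KeyType': 'HASH'},
    ],
    BillingMode='PAY_PER_REQUEST',
    # Omit SSESpecification for the default AWS owned CMK; SSEType 'KMS'
    # without KMSMasterKeyId uses the AWS managed CMK (aws/dynamodb).
    SSESpecification={
        'Enabled': True,
        'SSEType': 'KMS',
        'KMSMasterKeyId': 'alias/my-dynamodb-cmk',  # hypothetical customer managed key
    },
)
```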

Now, encryption in transit. DynamoDB always uses SSL/TLS endpoints, so your connections from the application are always encrypted. And if your application sits in a VPC, then you can also use VPC endpoints. As I mentioned, TLS endpoints are used for encrypting data in transit, and by default DynamoDB only provides these HTTPS endpoints, so your data is always encrypted in transit. Now, DynamoDB also provides something called the DynamoDB Encryption Client. You can use this if encryption is super important to you and you don’t want to transfer your data unencrypted, even over the TLS channel. When you use the DynamoDB Encryption Client, you encrypt your data on the client side and insert only the encrypted data into DynamoDB. So DynamoDB has no way to decrypt it; you have to decrypt it on your own through your application. And this is additional protection on top of the encryption in transit.

This results in a kind of end-to-end encryption. Remember, though, that this doesn’t encrypt the entire table: it encrypts attribute values, not the attribute names. And this is kind of logical, because otherwise DynamoDB would not know what the different attribute names or the primary keys are. It also doesn’t encrypt the primary key attribute values, because the primary key attribute values decide the partitions. And you can selectively encrypt the other attribute values: if you do not want all the attribute values to be encrypted, you can encrypt only some of them, and you can encrypt the attributes of some or all items in the table. It’s all up to you what you want to encrypt. DynamoDB does not know that you have encrypted the data when you use the Encryption Client.
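
Here is a rough sketch of that client-side pattern, assuming the DynamoDB Encryption Client for Python (the dynamodb-encryption-sdk package); the key ARN, table name, and attribute names are placeholders:

```python
import boto3
from dynamodb_encryption_sdk.encrypted.table import EncryptedTable
from dynamodb_encryption_sdk.identifiers import CryptoAction
from dynamodb_encryption_sdk.material_providers.aws_kms import AwsKmsCryptographicMaterialsProvider
from dynamodb_encryption_sdk.structures import AttributeActions

table = boto3.resource('dynamodb').Table('gameplay')

# Encryption materials come from a KMS CMK (placeholder ARN)
materials = AwsKmsCryptographicMaterialsProvider(
    key_id='arn:aws:kms:us-west-2:111122223333:key/EXAMPLE',
)

# Encrypt and sign everything by default, but leave 'score' unencrypted
# (signed only) to show selective attribute encryption
actions = AttributeActions(
    default_action=CryptoAction.ENCRYPT_AND_SIGN,
    attribute_actions={'score': CryptoAction.SIGN_ONLY},
)

# The EncryptedTable helper reads the table's key schema and leaves the
# primary key attributes unencrypted, as described above
encrypted = EncryptedTable(
    table=table,
    materials_provider=materials,
    attribute_actions=actions,
)

# Attribute values are encrypted client side; DynamoDB only ever sees ciphertext
encrypted.put_item(Item={'user_id': 'user1', 'score': 97, 'game': 'game1'})
```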

  5. DynamoDB streams

In this lecture we are going to look at DynamoDB streams. What are streams? Streams are a 24-hour, time-ordered log of all your table’s write activity. So, once you enable DynamoDB streams, whenever you insert a new item into the table, update an existing item, or delete an item, that event is captured and relayed in the DynamoDB stream. And you can read the stream to react to changes to your DynamoDB tables in real time. You can read the stream data using AWS Lambda or an application running on EC2, or you can feed it into Elasticsearch or Kinesis, and react to changes happening on your DynamoDB table in real time.

So when you enable DynamoDB streams, the log of all the write activity is pushed to the stream. Then you can use the AWS SDK or Lambda functions to process the stream, and you can push that data into Elasticsearch or Kinesis. Now, there are a number of use cases for streams. You can use them for replication of data, so replicating data from one table to another, or for archiving table data to another table. You can use them for notification purposes, or for log processing. There are numerous such applications of DynamoDB streams.

Now, DynamoDB streams are organized into what are called shards: stream records are pushed into the stream in the form of shards. And these records are not retroactively populated in a stream after you enable it. What that means is that from the point you enable streams onward, all the write activity on the table will be captured and pushed into the DynamoDB stream; anything that happened before is not in the stream.

You can simply enable streams from the DynamoDB console, and there are four views that you can choose: Keys only, New image, Old image, or New and old images. What this means is: if you choose Keys only, just the keys of the changed items are pushed into the stream. If you use New image, only the item data after the change is pushed to the stream. If you choose Old image, the state of the item before the change will be pushed to the stream. And if you use New and old images, then the data before the update as well as after the update will be pushed to the DynamoDB stream. These stream records simply contain the JSON structure of your new and old data, along with some metadata.
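
To make this concrete, here is a minimal sketch (not from the lecture) of enabling a stream via boto3 and the shape of the records a Lambda function receives from it:

```python
import boto3

# Enable a stream on an existing table with the 'new and old images' view
boto3.client('dynamodb').update_table(
    TableName='gameplay',
    StreamSpecification={
        'StreamEnabled': True,
        'StreamViewType': 'NEW_AND_OLD_IMAGES',  # or KEYS_ONLY, NEW_IMAGE, OLD_IMAGE
    },
)

# A Lambda function subscribed to the stream gets batches of records like this
def handler(event, context):
    for record in event['Records']:
        if record['eventName'] == 'INSERT':
            print('New item:', record['dynamodb']['NewImage'])
        elif record['eventName'] == 'MODIFY':
            print('Before:', record['dynamodb']['OldImage'])
            print('After:', record['dynamodb']['NewImage'])
        elif record['eventName'] == 'REMOVE':
            print('Deleted item:', record['dynamodb']['OldImage'])
```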

  6. DynamoDB TTL

Now let’s look at what TTL is. TTL, or time to live, is something that allows you to tell DynamoDB when to delete an item from the table. This is a very handy feature, and it helps you control the size of your table. So if you do not need certain data after a certain period of time, you can set a TTL value, and the corresponding table items will be automatically deleted once that time expires. So here we have our gameplay table, and let’s say we add a new attribute to the table; we can name it anything. Here I have named it Expires, and this Expires attribute simply stores a timestamp value. Okay? So we simply add a Unix timestamp in this attribute.

This timestamp will serve as the expiry timestamp for the particular item. So you simply create a TTL attribute, you can name it anything, and in that attribute you store a Unix timestamp of the time when that particular item should expire. When you enable TTL and tell DynamoDB that this is the attribute where you’re storing your TTL timestamp, DynamoDB is going to mark that item for deletion once that timestamp passes. The expired items then get removed from the table, as well as from the secondary indexes, automatically within about 48 hours.
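
In code, the setup amounts to naming the TTL attribute and writing epoch-seconds values into it; a minimal boto3 sketch, with the attribute name from this lecture and an illustrative item:

```python
import time
import boto3

# Tell DynamoDB which attribute holds the expiry timestamp
boto3.client('dynamodb').update_time_to_live(
    TableName='gameplay',
    TimeToLiveSpecification={'Enabled': True, 'AttributeName': 'Expires'},
)

# Write an item that should expire in 7 days (Unix epoch seconds, Number type)
boto3.resource('dynamodb').Table('gameplay').put_item(Item={
    'user_id': 'user1',  # illustrative key
    'Expires': int(time.time()) + 7 * 24 * 3600,
})
```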

It’s important to note that this might take up to 48 hours, so the expired items can show up in your API responses until they actually get deleted; even if they are marked for deletion, they might show up in your API responses. So your application should use filter operations to exclude the items that have expired. And whenever the TTL process deletes items, they do appear in DynamoDB streams, if you have streams enabled. Now let’s go into a demo and see how to enable the TTL feature.
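
Before the demo, here is a minimal sketch of that filtering pattern, excluding items whose expiry timestamp has already passed (attribute name as in this lecture):

```python
import time
import boto3
from boto3.dynamodb.conditions import Attr

table = boto3.resource('dynamodb').Table('gameplay')

# Expired-but-not-yet-deleted items can still appear in reads,
# so keep only items with no TTL or a TTL still in the future
now = int(time.time())
resp = table.scan(
    FilterExpression=Attr('Expires').not_exists() | Attr('Expires').gt(now),
)
live_items = resp['Items']
```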

  7. DynamoDB TTL – Hands on

Now let’s do a quick hands-on to see how to use TTL. So this is our gameplay table, and we have one item. What I’m going to do is add some more items by duplicating this item and changing some of the properties. Okay, so I’m just going to change a few properties to create more items. All right, so now we have four items here. And to use TTL, we need an attribute containing an expiry timestamp. So let’s add an attribute to each of these items, and remember that this attribute has to be a Number; you cannot use any other data type. So I’m going to use Number, and you can call it anything; I’m going to call it Expiry. And we have to get the timestamp, so let’s go to epochconverter.com and grab the current timestamp, all right, and save. I’m going to do the same thing with all the other items.

And for the last item, I’m going to change the timestamp, changing the first three digits to 888, so the timestamp will be way ahead in the future. So when the TTL routine runs, the first three items, whose timestamps are already in the past, should get deleted, while the last item should be retained in the table. To enable the TTL feature, you have to provide the name of the timestamp attribute; in our case it is Expiry. So go to the Overview tab and choose Manage TTL, and under TTL attribute, just add the name of our attribute, Expiry. And if you enable the DynamoDB Streams checkbox, then whenever items get deleted by the TTL routine, they will appear in the DynamoDB stream. You could use that, for example, to archive them to another table, or for any other purpose.

Okay? And if you click on this Run preview button, you will see that these three items are now due for expiry; you can see that these three items will be deleted whenever the TTL routine runs. That means our TTL configuration is correct, and we can continue. Now you can see that the TTL attribute name, Expiry, appears over here. Let’s go to the Items tab and refresh the page, and now you can see that the Expiry attribute has been marked as the TTL attribute. Whenever the TTL routine runs, the first three items will be removed from the table, and the last item will remain, as its timestamp is way ahead in the future.

It’s going to take some time for the items to be removed from the table; it can take up to 48 hours, but generally this should happen in a few minutes because our table is small, and if the TTL routine runs earlier, the items could be removed faster than that. All right, so I’m going to pause the video here and come back when the items get removed. And now you can see that the TTL routine has run and three of the items have been removed from the table. We can only see the one item that has its expiry timestamp in the future. That’s how TTL works. So that’s about it. Let’s continue to the next lecture.

  8. TTL use cases

TTL use cases. There are numerous use cases for TTL. First, you can use it for data archival to another table using DynamoDB streams. You simply designate an expiry timestamp on the source table, and once that data expires, it will be deleted automatically from the source table. Then you can use DynamoDB streams to copy the deleted records to a new table for archival purposes. And if you have time series data, you can use TTL to separate hot and cold data.

Let’s say you have time series data in your table, and the recent data is queried more frequently than the older data. What you can do is keep the recent data in the source table and move the older data to another table using TTL: you enable DynamoDB streams on the source table and use a Lambda function to archive the expired data to a new table, as sketched below. This is how you can separate the hot and cold data and improve the performance of your DynamoDB tables.
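
A sketch of what that archival function might look like: a Lambda handler on the source table’s stream that copies only TTL-deleted items into a hypothetical archive table. TTL deletions show up as REMOVE records attributed to the DynamoDB service principal:

```python
import boto3
from boto3.dynamodb.types import TypeDeserializer

archive = boto3.resource('dynamodb').Table('gameplay-archive')  # hypothetical archive table
deserializer = TypeDeserializer()

def handler(event, context):
    for record in event['Records']:
        identity = record.get('userIdentity', {})
        # TTL deletions are REMOVE events performed by the DynamoDB service itself
        if (record['eventName'] == 'REMOVE'
                and identity.get('principalId') == 'dynamodb.amazonaws.com'):
            old_image = record['dynamodb']['OldImage']
            # Convert from DynamoDB JSON to plain Python values before archiving
            item = {k: deserializer.deserialize(v) for k, v in old_image.items()}
            archive.put_item(Item=item)
```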

  9. DynamoDB global tables

In this lecture, let’s look at DynamoDB global tables. Global tables are the multi-region, multi-master version of DynamoDB tables. They offer automatic multi-master, active-active, cross-region replication. So you have multiple masters, which means you can write in any region, and this works automatically. This is very useful for low latency and for DR purposes, and it really works well when you have a global user base for your application: each user can read or write their data from the nearest DynamoDB region, which results in very low latency for local reads. And you can, of course, use it for disaster recovery purposes as well. It’s near-real-time replication; the replication lag is typically under one second.

So let’s say you have a table in us-east-1, and you enable global tables and create another replica table in the ap-southeast-2 region in Asia Pacific. When you write or update any data in one region, that will automatically be replicated to the second region. And if you do the same thing in the second region, like writing, updating, or deleting data, that operation will be replicated to the first region as well. So this is multi-master, active-active replication. An important thing to remember here is that global tables support eventual consistency when you do a cross-region read; if you read and write from the same region, then you can use strong consistency. And if there is a conflict, for example an item gets updated simultaneously in two regions, then DynamoDB uses the concept of last writer wins.

So whichever write happens last, that is what will be replicated across all the tables. And if you use transactions, then remember that they are ACID-compliant only in the region where the writes originally occur; only in the region in which you do the write operation are the transactions ACID-compliant. Now, there are some prerequisites when you want to use global tables. For example, if you want to enable global tables, the table must be empty across all the regions in which you want to use them. And you can have only one replica per region, so you cannot have two replica tables in the same region.

Global tables use DynamoDB streams with new and old images for replication, so you must enable DynamoDB streams with the new and old images view. Also, you must have the same table name and the same primary key across all the regions for any particular set of global tables. And it’s recommended that you use identical settings for your table and indexes across all regions; you can have different settings, but for consistent performance it’s always recommended to keep them identical.
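
With those prerequisites in place, a global table (the original, streams-based version that this lecture describes) can also be created from the API; a hedged sketch, assuming an empty gameplay table with new-and-old-images streams already exists in both regions:

```python
import boto3

dynamodb = boto3.client('dynamodb', region_name='us-west-2')

# Join the per-region replica tables into one global table
dynamodb.create_global_table(
    GlobalTableName='gameplay',  # must match the table name in every region
    ReplicationGroup=[
        {'RegionName': 'us-west-2'},
        {'RegionName': 'eu-west-2'},
    ],
)
```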

Now, let’s look at one of the use cases of global tables. Let’s say you have an application that uses a DynamoDB table, and you have users across the globe. So you have your application running in a US region, and the users in America will see very low latency, but the users in Europe, Australia, or Asia are going to see higher latencies. Now, if you add a replica table in a European region, then the users in Europe will also experience low latency. And if you add one more replica, let’s say in an Asian region, then all the users across the globe are going to see low latencies. This is going to improve your application experience by leaps and bounds; users are going to see very fast application performance. So that’s the use case for global tables. Now let’s go into a demo and see how to create global tables and also see them in action.

  10. DynamoDB global tables – Hands on

Now let’s quickly see how to create a global table. I’m going to use the same gameplay table that we have been using till now and create a global table out of it. Okay? So from the table screen, go to the Global Tables tab; here you can create your global tables. Before you can create global tables, you must enable DynamoDB streams, which you can do using this button, or you can also go to the Overview tab and enable it from the Manage Stream button. So just click on that, and you must choose the view type New and old images for global tables to work. So we choose that and click on the Enable button, and that’s about it. Now that the stream has been enabled, we can go back to the Global Tables tab and add additional regions to this table. We can see that currently this table is in the Oregon region, that is, us-west-2. Right. But before we can add regions, we must ensure that we either use on-demand capacity for this table or at least enable the auto scaling feature on the table. So let’s go to the Capacity tab and check what we have.

And here we are using provisioned capacity, but auto scaling has not been turned on. So let’s enable auto scaling, okay? Once that is done, just click on Save, and that’s it. Now you can go back to the Global Tables tab and add a region. Let’s select one of the regions; let’s go with EU (London). The region is ready, so click on the Create replica button to create the global table. This should take a few minutes, and the table will be ready in the target region, which in our case is the London region. Let’s go to the London region.

I’m going to open this in a new tab, and we don’t have any tables here yet. Once the replication process is complete, we should see our gameplay table in this target region. So I’m going to pause the video here and come back when this process is completed. All right, now we can see that the table is active in two regions, so we have created a global table. Let’s switch over to the London region. You can see this is the London region and we have one table. And let’s go to the Global Tables tab.

You should see two regions, and on the Items tab, you should see the items copied over, right? All right. So you have one item over here, and in the original region, that is the Oregon region, you will have one item as well. Now what I’m going to do is edit an item and also add one item. So I’m going to change the score from 97 to 98, for example, and save. So this has been changed. And now I’m going to duplicate this item to create one more item, let’s say with user three, a score of 99, and game three, okay, just an arbitrary item.

And now this item should get replicated to the London region. So let’s go over to the London region and see if we can find this item there. Let’s refresh, and here we go: you can see that we have the item here, and you can also see that the score has been updated from 97 to 98. Now what I’m going to do is remove user two from the London region. So I’m just going to delete it. It’s gone, and let’s go back to the Oregon region and refresh. And here we go: you can see that the item has been removed from here as well. So this is how global tables work. And before we close, I just want to show you the Capacity tab. Go to Capacity.

And here you can see that you have some new options for capacity: you have global tables auto scaling. AWS recommends that you keep consistent settings across all regions, but if you like, or if your use case demands it, you can manage settings for each region independently. So you can set the auto scaling rules for each region independent of the other regions. And if you go to the Global Tables tab here, you can add more regions or remove existing regions from the global tables. All right. Just below that, you will see the replication lag, and here you can see that the recent replication lag was about one second. Generally, the replication lag is under one second. So that’s about it, and let’s continue to the next lecture.

  11. Fine-grained access control and Web-identity federation in DynamoDB

In this lecture, we are going to talk about the fine-grained access control feature in DynamoDB. We can use IAM to control access to DynamoDB resources. DynamoDB doesn’t support tag-based conditions, but what it does support is the use of condition keys. So you can use conditions in your IAM policy to create fine-grained access control.

What I mean by fine-grained access control is that you can restrict access to certain items, or even certain attributes within your items, based on the user identity. So you can store the user’s identity, say the IAM username, in a table attribute, in the table’s key or in a secondary index, and use that in your IAM policy conditions to control who can access which table items or which table attributes. For example, you can allow users to access only the items that belong to them, based on certain primary key values. If you have a user profile table, you may want users to access only their own profiles and not the profiles of other users.

Or in the case of our gameplay table, we might want users to be able to access only their own gameplay records and not see the records of other users. So you can use fine-grained access control policies to control which items or which attributes a user can access. Here is an example of a fine-grained access control policy. You can see that under Resource we have mentioned the ARN of the table, and the table name is Messages. And we have specified a condition.

There are different condition keys, and you can follow the link on the screen to learn more about them. But one of the common conditions we see in fine-grained access control is ForAllValues:StringEquals, which compares the requested values with the values in the table. In our example, we are using dynamodb:LeadingKeys and matching it against the user ID. For all values in the incoming API request, DynamoDB is going to check the leading keys; the leading keys are nothing but the table’s partition key values. So in this example, DynamoDB compares the value of the partition key with the value of the user ID passed in the incoming request, and access will be granted only if the user’s user ID matches the partition key value stored in the table.
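
The policy itself isn’t reproduced in the transcript; below is a hedged reconstruction modeled on the AWS documentation example for a Messages table, with a placeholder account ID and a Login with Amazon identity variable. It combines dynamodb:LeadingKeys with the dynamodb:Attributes allow-list discussed next:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["dynamodb:GetItem", "dynamodb:Query"],
      "Resource": "arn:aws:dynamodb:us-west-2:123456789012:table/Messages",
      "Condition": {
        "ForAllValues:StringEquals": {
          "dynamodb:LeadingKeys": ["${www.amazon.com:user_id}"],
          "dynamodb:Attributes": ["UserId", "MessageId", "Content", "Timestamp"]
        }
      }
    }
  ]
}
```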

And you can also use dynamodb:Attributes in the condition to allow access to only certain attributes. So in this example, we are restricting users to accessing only their own data, and we are also restricting their access to only the four attributes mentioned here: UserId, MessageId, Content, and Timestamp. So you can really have fine-grained access control using condition keys like this. Now, let’s look at Web Identity Federation. This is also called DynamoDB federated identities, and you use it for authentication and authorization of your application users.

For example, say you have a mobile app that uses a DynamoDB table in the back end; you can use the Web Identity Federation approach to authenticate your users and authorize them to access different items in your DynamoDB table. There is no need to create individual IAM users. You store your application’s users in a DynamoDB table, and your users log in using any of the supported identity providers, like Google, Facebook, or Amazon; any OpenID provider can be used. Then you use Cognito to exchange the web identity token for temporary IAM credentials. So what you essentially do is get a web identity token from the identity provider.

For example, you use Login with Amazon here, and once you have the token, you pass that token to Cognito. Cognito then gets temporary IAM credentials, an STS token, from the AWS STS service and returns them to you. If you don’t want to use Cognito, you can directly use the STS API to generate your temporary credentials. You then use the temporary credentials that you received from STS or from Cognito to access DynamoDB, and you will use the role that’s associated with those credentials. You can, of course, use condition keys here for fine-grained access control. So that’s about Web Identity Federation.
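
To make that flow concrete, here is a minimal boto3 sketch of the non-Cognito path, calling STS directly; the role ARN is hypothetical, and the token is whatever the identity provider returned:

```python
import boto3

token = '<web-identity-token-from-provider>'  # returned by Login with Amazon, Google, etc.

# Exchange the web identity token for temporary credentials (no IAM user needed)
resp = boto3.client('sts').assume_role_with_web_identity(
    RoleArn='arn:aws:iam::123456789012:role/GameplayWebUser',  # hypothetical role
    RoleSessionName='web-user-session',
    WebIdentityToken=token,
)
creds = resp['Credentials']

# Access DynamoDB with the temporary credentials; the role's policy
# (e.g. dynamodb:LeadingKeys conditions) decides which items are visible
table = boto3.resource(
    'dynamodb',
    aws_access_key_id=creds['AccessKeyId'],
    aws_secret_access_key=creds['SecretAccessKey'],
    aws_session_token=creds['SessionToken'],
).Table('gameplay')
```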

  12. CloudWatch contributor insights for DynamoDB

Let’s look at CloudWatch Contributor Insights. This is a nifty diagnostic tool that shows you the most accessed and most throttled items in your DynamoDB table, and it helps you analyze time series data as well. This feature is supported for DynamoDB and also for CloudWatch Logs. What it does is help you identify the outliers, the contributors that are impacting your system or application performance.
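
For reference, Contributor Insights can be switched on per table (or per index) through the API as well; a minimal boto3 sketch, not from the lecture:

```python
import boto3

dynamodb = boto3.client('dynamodb')

# Enable CloudWatch Contributor Insights for the table
dynamodb.update_contributor_insights(
    TableName='gameplay',
    ContributorInsightsAction='ENABLE',
)

# Confirm the status
status = dynamodb.describe_contributor_insights(TableName='gameplay')
print(status['ContributorInsightsStatus'])
```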

You can identify the heaviest traffic patterns and use them to fine-tune your application performance, and you can use it to identify the top system processes. All this Contributor Insights data is displayed in the CloudWatch dashboard, and it’s also integrated with CloudWatch alarms, so you can set up alarms to alert you in case of certain events. All right, that’s about it. Thank you so much, and I’ll see you in the next section.
