1z0-062 Oracle Database 12c: Installation and Administration – Module 11: Data Concurrency

  1. Data Concurrency Part1

So Oracle is probably the largest database system in the world. They claim they are the largest database system in the world. We're not talking megabytes of data here; we are talking terabytes and terabytes and terabytes. I forget, off the top of my head, exactly what the storage capacity is as of Oracle 12c, but it's enormous even on an older 32-bit operating system, and most operating systems today, like Windows and most distributions of Unix and Linux, are 64-bit operating systems. Either way, we're talking about an unimaginable amount of storage that can be kept within the Oracle system. And of course, Oracle runs some of the largest database systems in the entire world. Oracle runs entire governments; entire nations run within its database system. Very, very large nations like the US run on several different database systems, and that includes, for instance, the Oracle database system. So we are talking great masses of data, terabytes and terabytes of it, especially on those 64-bit systems.

So that amount of data, as we know, has to be highly accessible. And with those giant amounts of data, we may have many, many users. The problem with many users is that they are all accessing, and performing DML on, lots and lots of tables. There are a couple of things we need to do about that. One is that the equipment has to be able to handle it, but that's not the subject of this course. The thing we have to worry about here is that the DML occurring on those tables doesn't just plain overwrite other DML. When one person is writing to a table, they have to be very careful not to overwrite the data of another person or user. We've talked a good bit about that when we've talked about DML and transactions. We know, for instance, that UPDATE is the real problem: when we update certain rows in the database, another user may come along and update those same rows. For instance, let's say we're in a large bookstore chain; I tend to pick on bookstores as I teach this course. Somebody in that bookstore updates the count of some book by Stephen King, adding copies to our inventory at, say, the New York branch. Now, they're not going to have 100 copies in a New York bookstore or anywhere, so let's keep the numbers small.

Let's say we have ten copies of a particular book by Stephen King. Two more get ordered and come in, so we need to add those two; now we've got twelve copies. But at the same time, one copy that was purchased last week gets returned, and that's one more. So what happens if the transaction that adds two copies to our inventory gets overwritten by another transaction recording the one book that just got returned? Three books actually came in: the first transaction added two to our inventory of ten, and then the transaction recording the single return overwrote it. In reality we know we have 13 copies of this Stephen King book, but what is our database count going to be? We had ten, two came in, that transaction got overwritten, so we recorded the one, lost the two, and the database says we have eleven copies when in reality there are 13. That is a serious problem, and that's the problem of data concurrency.
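
Here is a minimal sketch of that lost update in SQL, assuming a hypothetical book_inventory table with a copies column; the table, column, and title values are just for illustration, not from the course materials.

    -- Session 1 and Session 2 both read the current count of 10.

    -- Session 1: the shipment of two copies arrives.
    UPDATE book_inventory
       SET copies = 12   -- computed as 10 + 2 from the value it read
     WHERE title = 'Some Stephen King Book';
    COMMIT;

    -- Session 2: the returned copy, computed from the same stale read of 10.
    UPDATE book_inventory
       SET copies = 11   -- 10 + 1, silently overwriting Session 1's 12
     WHERE title = 'Some Stephen King Book';
    COMMIT;

    -- Reality: 13 copies.  Database: 11 copies.  That is the lost update.
    -- (In Oracle, Session 2's UPDATE would wait for Session 1's row lock,
    --  but the overwrite still happens because 11 was computed from the
    --  stale read of 10.)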

Multiple users, multiple systems, whatever, are writing to our database at the same time, and they're overwriting each other's data. We have to be very, very careful about that; we have to make sure that transactions cannot overwrite each other. So how do we fix this? Well, we could create just one user and make sure that no more than one human being can log in as that user, but that's really going to slow the system down. We could have a bunch of copies of the table that somehow merge together at 3:00 a.m., with all the transactions committing one at a time in the middle of the night when our stores are closed. Except that I don't know what we do about stores on the other side of the planet, if we have those, because they might be running transactions then, so we can't hold off our transactions until the middle of the night. And why can't we do that? Because let's say we have one copy of the Stephen King book in the store as of 9:00 a.m.

New York time. Another copy comes in later, but the transaction doesn't commit until the middle of the night, so we don't even know that we have that copy in inventory. The first book sells, so now our inventory shows zero even though we really have one. Somebody comes in looking for the book, the computer says we have zero, and now they have to go off someplace else to buy that book, which is exactly what they're going to do, and we're going to lose the money. This isn't going to work; this is a bad idea. And if you've thought about how this whole transactional business works as we've gone through this course, or at some other point in your life, you may be thinking that this really sounds like a bad idea. I mean, this doesn't happen with Access, where things aren't usually transactional, where we just enter data into the database and, poof, it shows up and everybody can see it. It makes you wonder. Oracle was one of the first transactional database systems, built when we came up with the idea of relational database management systems, and when Ted Codd, the man who came up with the whole idea, was thinking about this originally, these thoughts entered his mind.

At least I think he was thinking about it. He didn't write about it at the time, but I think he was asking: what happens when people overwrite each other? What happens while you have to wait for this collection of DML, which we're going to call a transaction, to make it to the database? How is this all going to work? It can sound like a really bad idea. But the relational database management system is, in my opinion, the best database system that we've got going. If you've got a better idea, I'm listening; everybody's listening, and some people have come up with some good ideas. But you can see that things have to be transactional. Transactions must commit, and transactions must commit in a certain order. The best way for that to happen, if possible, is for these transactions to happen in milliseconds, because if they happen in milliseconds, then things move along in order. Okay? So the goal is to make sure the database is accurate to the real world, for all users, at as close to all times as is humanly possible. And we humans don't operate in milliseconds. Computers do, but we don't.

So that's not that big of a deal. The idea is that every once in a while, and when I say every once in a while I mean within a couple of seconds, we've got to make sure the database is up to date. The database has always got to be current. That means when some worker in our bookstore pulls up the screen, they've got to be able to say, okay, look, we've got one copy of the book you're looking for, it's right here in our store, let me go pull it off the shelf for you. Or they look in there and say, oh, sorry, it looks like we don't have that book in stock. But if the book does come into stock, we can't wait overnight for it to show up in the system. It's got to pop up right away. The worker's got to say, oh, wait a second, my screen just changed, and now I can see that we have one copy; let me go check the back to see if that one copy is here.

That's how things need to be, and that's the idea of data concurrency. Multiple users must operate concurrently on the database, and they can't overwrite each other. And we have to do all of this fairly quickly, because we'll lose money if our data is inaccurate. So this is the idea that all users see the truth.

  1. Data Concurrency Part2A

So the larger problem that we have is that if users are going to be accessing the database at essentially the same time, we need to make sure that data isn't overwritten. Oracle has a whole set of solutions for that, and we're going to talk about the major ones: the ones that are on the exam, and the ones that are going to be the answer in most cases, almost all cases. The answer in most database systems is row locking. Because it's okay, for instance, for 1,000 people to be accessing a table if all they're doing is inserts. If I log into the database to insert ten new books that just came in this morning, and then a couple of seconds later you log in to add two more books that came in a little bit late on a late shipment or something like that, no problem. Because think about what's happening.

It's a whole series of inserts: my ten inserts go into the database, and then your two inserts go into the database. Fine. Think about a really large database system where all of the stores throughout, say, the Eastern time zone get their shipments at the same time during the day, and all of those store managers, or whoever takes care of logging the new shipment into the database, are inserting all those new books into the database system. No problem. Because the point is that eventually all that inventory is going to be entered into the tables and nobody is overwriting anybody. That's the secret: people are just adding more and more books. It's just inserts. And if the inserts have to wait a little while before they go into the system, that's all right. There's no transactional problem there, just the delay of the inserts waiting for one another. But there's no chance that my entry of ten books in, say, New York this morning is going to overwrite your insert of two books at the Newark store this morning. That's not going to happen, because inserts are just inserts. They're just new rows in the database.

If somebody deletes something, that may be a different story. But if we're just adding rows, I'm not making any changes to your rows and you're not making any changes to my rows. It may take a few seconds, or even a few minutes, theoretically a few hours (although that's not likely with the Oracle system), for my inserts to go in, or your inserts, or 50 other people's inserts. We don't want that delay in the database, but if it happens, it happens. We want those books to be available as soon as possible so that our store folks can see them when people are asking for them, but if there's a little delay, that's probably not a serious problem for our business. Users will be able to see those books pretty soon, and when people are looking up what we have in inventory, it's no big deal. It's probably good enough. Okay, so that all works for us; inserts are not a serious problem. Deletes, on the other hand: let's take a look at the problem of deletes. Somebody comes in, we have one copy of the book left, and that copy gets purchased at 9:00. Our delete has to wait 20 minutes while all the inserts happen.

That's probably a bad idea in terms of business rules. We can wait a little bit on the inserts, because the books still have to be shelved anyway, but deletes are a problem. Somebody comes in and asks for the book. The worker pulls it up on the computer, and it says, oh yes, we have a copy; it should be shelved right over there. They go over, they look for the book, and it's not there, because the delete hasn't occurred yet. Now they're looking all over the store. Darn it, it must have been mis-shelved. Maybe somebody took it to the back. Maybe somebody put it behind the desk and it's on reserve for someone else to buy. Twenty minutes later, the worker says, I'm sorry, ma'am, I can't find your book anywhere; let me check the database again. They check it, and now that book is missing. What the heck happened? It was here; I know I saw one copy in this store, and now there's zero. Did somebody sell it behind my back? Who sold that one copy of the book when really the book wasn't there? And now 20 minutes of a worker's time have been wasted, because that's how long it took for that DML to hit the database. We don't want that.

So we need concurrency. And by the way, when you've gone into one of these mega bookstores, has this not happened to you? Especially if you live in a larger metropolitan area like I do: they say, we don't have that copy here, but the computer is showing that it's in the other store halfway across town. Then you make that drive halfway across town through the traffic, it takes you an hour and a half to get there, and when you finally arrive they say, no, we don't have that; our computer is showing that we don't have any copies of it; I don't know what their computer is talking about. This happens, and it has happened to me a couple of times, maybe not many times, but a couple of times, when those computers are not concurrent. We need concurrency within our database systems.

So there are a couple of methods to make this happen. One of the things Oracle will do for you is that on updates and deletes, the affected rows become locked; not on inserts. Why not? Because inserts aren't a problem. Inserts can go ahead and commit; we can add things all we want. On updates and deletes, though, there is a serious problem. Think about why: I talked a little bit about inserts, but inserts don't make any changes to existing data. Inserts only add data. Deletes are serious business, though, because if I drive all the way across town in heavy traffic, and it takes me an hour and a half to get there, and I find out that book isn't there, that it's been deleted, that row has been deleted from the database, then as a customer I'm upset. Or I find out that the row that said they had two copies now says zero copies.

In other words, the update hadn't committed yet. That's a problem, and it's a problem we need to solve as DBAs: we need to make sure our data is both highly available and correct. So maybe when a transaction begins, those rows need to lock so that changes can't come in from other users. That may not solve every problem of data inaccuracy, because maybe it just freezes things the way that they are, but at least that's something. These solutions aren't perfect; database systems run on machines, and machines are what they are. But be aware of this: if two books come into, say, the Miami store, and one comes into the Orlando store three seconds later, we don't want the Orlando store overwriting the data from the Miami store. So while Miami is doing its update, adding its two books into the nationwide system, we probably want to lock that table.

We want to lock that table while it's in a transitional state, and I say transitional on purpose, because a transaction is occurring. Miami is in the middle of inserting two new books into the table, and now Orlando can't overwrite the two-book insert that Miami is doing. Actually, an insert is a bad example, because inserts don't need to lock. Let's say instead that Miami is updating the row that holds the count of the number of books; it's updating the book-count column in that particular row. We don't want Orlando then overwriting that book-count column. We can't have Miami being overwritten by Orlando. So we want to wait until the Miami transaction fully commits; nobody can touch that data until Miami fully commits. So what's going to happen to Orlando's transaction? Well, it's just going to sit there. Realistically speaking, it's going to sit there for milliseconds, hopefully milliseconds, maybe a few seconds. Is that bad? Yes, that's bad. But we can't have data inaccurately overwritten. Data concurrency is the idea that we want to make sure the data is entered correctly, and that means there will be some sacrifices to be made. The sacrifice might be in terms of time, but we can't have data overwritten. And as a matter of fact, I'm going to suggest to you that we should even lock our select statements. In other words, if someone is querying a table or tables, those rows should be locked during the query, and we do that with the FOR UPDATE clause.
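
Here is a minimal two-session sketch of that blocking behavior, again using the hypothetical book_inventory table; the names are just for illustration.

    -- Session "Miami": start a transaction that updates the count.
    UPDATE book_inventory
       SET copies = copies + 2
     WHERE title = 'Some Stephen King Book';
    -- No COMMIT yet: Miami now holds a row lock on this row.

    -- Session "Orlando": this statement waits (it does not fail) ...
    UPDATE book_inventory
       SET copies = copies + 1
     WHERE title = 'Some Stephen King Book';

    -- ... until Miami issues COMMIT (or ROLLBACK), at which point
    -- Orlando's update proceeds against the committed value.
    COMMIT;   -- issued in the Miami session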

So if I SELECT * from the book inventory table, or if I select the title and inventory count from the book inventory table, I can select it FOR UPDATE. FOR UPDATE would be the last clause; it goes after all other clauses in the select statement, in the query. If I select FOR UPDATE, I have locked the rows that the query returns. No other DML will occur on those rows, and I shouldn't say other DML, because a plain select is not a DML statement: no DML will occur on those rows until my transaction ends, so we know that that selection will stay accurate. So there are a couple of ways locking happens automatically, if you want to use that term, within the Oracle system. One: updates will lock the rows they change for transaction purposes, just to make sure that users cannot overwrite each other with their updates. The same thing happens with deletes. Remember, though, it doesn't happen with inserts. So updates and deletes will automatically lock the affected rows. And lastly, select statements will also lock rows, but only if you select with the FOR UPDATE clause, only if you add the FOR UPDATE clause at the end of your select statements.
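
Here is what that might look like from SQL*Plus; this is only a sketch, with hypothetical table and column names.

    -- Lock the rows this query returns until COMMIT or ROLLBACK.
    SELECT title, copies
      FROM book_inventory
     WHERE store = 'NEW YORK'
       FOR UPDATE;

    -- Other sessions can still query these rows, but their UPDATEs and
    -- DELETEs against them will wait until this transaction ends.
    UPDATE book_inventory
       SET copies = copies - 1
     WHERE title = 'Some Stephen King Book'
       AND store = 'NEW YORK';

    COMMIT;   -- releases the locks taken by FOR UPDATE and the UPDATE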

So the question to you is: do you want to run out there, run over to the developer floor of your building, and say, hey everybody, new rule; from now on, every SQL select statement that you issue will be issued with FOR UPDATE? Is this something that you want to do? Sure, why not? Let's make sure that when users select data from the table, everyone's view of the data agrees, because that's something that probably should have bothered you for a while now. Of course, it also might bother you that other sessions can't change those rows while your transaction is running. All I can say is, make sure that those transactions are tuned. Make sure the transactions can occur within a very short amount of time. Because if you select FOR UPDATE, no one else can update or delete those rows except the person who ran the FOR UPDATE query, although other people can still run ordinary queries against them. Tables can also be locked explicitly, by the way, in exclusive mode, and even when a table is locked in exclusive mode, queries can still happen, so people are still able to query against that particular table. Now, if rows are locked FOR UPDATE, when does the unlock happen? There has to be a point at which those rows unlock, because other users eventually need to make their own changes. And that's easy: the locks are released automatically once a commit or a rollback happens. So we're locking rows for an update or a delete, that is, we're erasing rows or we're making changes to rows, or we're locking them with FOR UPDATE. But once we get a commit, which means our data is saved, or once we get a rollback, which means we're undoing whatever we wanted to do, the FOR UPDATE no longer matters. The locks are released and other users will have full access.

Okay, a little bit complicated, but it makes total sense. This is how we make sure that users can't see different data at the same time, because that is a bad thing. Talk about losing faith in your database: nothing is worse than two users where one logs in, sees that there are two books on the shelf, and goes off to look for them, while the second logs in, sees that there are zero books on the shelf, doesn't bother, and sends the customer away to some other bookstore down the road. Your company loses money, and who even knows what the truth is? That's a sure way to lose faith and to cost you customers. We want to make sure that can't possibly happen. Data must be accurate and data must be concurrent. All users must see concurrent data; all users must see the same data.

  1. Data Concurrency Part2B

Concurrent transactions. By default, concurrent transactions can happen; Oracle does not, within its own system, try to prevent them. I may try to insert two new books into that inventory. I shouldn't say insert: I may change the book count from ten to twelve at the same time you're trying to change the book count from ten to eleven. Whoever writes last wins. The book count should be 13, but let's say you go last and your transaction commits last: the book count is eleven. So obviously that's a very, very serious problem. How do we fix that? As we've said before, if we're viewing the data, we can view it FOR UPDATE. Personally, I think that's almost always a good idea. Will it slow down your SQL statements? Yes. That may be something that isn't possible for you, because you just need the speed on your SQL statements. I get that; that's life. Sometimes you just can't do it, because it's going to ruin the tuning on your database. Okay, that's fine. This should not drag the database down that much, though. The only thing you can do is try it and see what happens.
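
Here is a sketch of how FOR UPDATE avoids that lost update, with hypothetical names again rather than anything from the course.

    -- Read the current count and lock the row at the same time.
    SELECT copies
      FROM book_inventory
     WHERE title = 'Some Stephen King Book'
       FOR UPDATE;          -- suppose this returns 10

    -- Anyone else doing the same SELECT ... FOR UPDATE (or an UPDATE)
    -- on this row now waits until we commit or roll back.

    UPDATE book_inventory
       SET copies = 12      -- 10 + 2, computed from a value nobody can change
     WHERE title = 'Some Stephen King Book';

    COMMIT;                 -- release the lock; the next session now reads 12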

Hopefully it won't do too much damage to the speed of the database. You can also lock the tables. You can lock a table in two modes, share or exclusive. If you don't lock anything explicitly, everyone shares the table: they can all run select statements and DML against it, with Oracle locking individual rows as the DML touches them. If you lock a table in share mode, other sessions can still query it, but their DML has to wait until your transaction ends. If you lock it in exclusive mode, other sessions can still query the table, but they cannot perform DML on it at all until you're done. And how do we do this? LOCK TABLE my_table IN SHARE MODE. LOCK TABLE my_table IN EXCLUSIVE MODE. You just run that statement, and this is how you'd apply that lock to that particular table.
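
From SQL*Plus, that might look like the following; my_table is just a placeholder name.

    -- Share mode: other sessions can query, but their DML waits.
    LOCK TABLE my_table IN SHARE MODE;

    -- Exclusive mode: other sessions can query, but nothing else.
    LOCK TABLE my_table IN EXCLUSIVE MODE;

    -- Optional: give up immediately instead of waiting if someone else
    -- already holds a conflicting lock.
    LOCK TABLE my_table IN EXCLUSIVE MODE NOWAIT;

    -- Either way, the lock is released by COMMIT or ROLLBACK.
    COMMIT;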

You can just run it in SQL*Plus if you want to, but typically it would be run in code. A developer might issue LOCK TABLE my_table IN EXCLUSIVE MODE, run the transaction, and commit it; the commit itself releases the lock, so there's no need to lock the table back into share mode afterward. That's a perfectly legitimate way to make sure that transactions don't overwrite each other, that users aren't overwriting each other in transactions. Nothing wrong with that, and like I say, it might be a good way to handle concurrency in your database system. A sketch of that pattern is below.
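
Here is a minimal sketch of that developer pattern, assuming the same hypothetical book_inventory table.

    -- Take the whole table exclusively for the duration of this transaction.
    LOCK TABLE book_inventory IN EXCLUSIVE MODE;

    -- Do the work; no other session can run DML on the table right now.
    UPDATE book_inventory
       SET copies = copies + 2
     WHERE title = 'Some Stephen King Book'
       AND store = 'MIAMI';

    -- COMMIT (or ROLLBACK) ends the transaction and releases the table lock.
    COMMIT;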

Now, undo management. We'll talk more about undo later, but it's something we want to talk about in terms of concurrency, and that's why I've placed it in this section. Undo management is the idea that sometimes you obviously want to undo what's happened in a database; you want to reverse the actions that have occurred against a particular table or set of tables. Probably you, as the DBA, don't want to mess with this, because Oracle does a pretty darn good job of taking care of undo; it sets things up in such a way that you can use undo without worrying about tuning it. But there are some things you may want to do in terms of taking care of undo. For instance, by default, Oracle retains undo for 900 seconds. The ability to undo exists for 900 seconds, or 15 minutes, which is not a long time. But then, how much do you want to keep? And I know you're thinking, well, I may want to keep days and days of data, weeks, months of data that I may want to go back to and see what things looked like, or take data that just got entered today and get rid of it. For that, there are better methods than undo management.

There are backups and restores, and Flashback, and things like that that we can use to take data back to a prior moment in time. If we want to configure undo management ourselves, maybe make it manual instead of letting it happen automatically (which means the undo is kept for 15 minutes), here's our command: ALTER SYSTEM SET UNDO_MANAGEMENT equals either MANUAL or AUTO. Those are your two choices. You can either set it to MANUAL, in which case you have the ability to control the different parameters within undo management, or AUTO, which means you'll let Oracle handle the whole thing. Here are some of the parameters you may set for undo management, and here are the commands. These commands, if you want to, can be issued in SQL*Plus from the SQL prompt. ALTER SYSTEM SET UNDO_MANAGEMENT = MANUAL.
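
As a sketch of that command, assuming you're connected as a suitably privileged user; note that UNDO_MANAGEMENT is a static parameter, so changing it normally requires SCOPE=SPFILE and an instance restart.

    -- Switch undo to manual management (takes effect after a restart).
    ALTER SYSTEM SET UNDO_MANAGEMENT = MANUAL SCOPE = SPFILE;

    -- Or hand it back to Oracle to manage automatically.
    ALTER SYSTEM SET UNDO_MANAGEMENT = AUTO SCOPE = SPFILE;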

Now it's in your control. Or, if you don't want it to be and you want it to go back, set UNDO_MANAGEMENT = AUTO, and the database will handle undo management itself. Again: ALTER SYSTEM SET UNDO_TABLESPACE equals, and here's a name that I just made up, undo_main_ts, the "undo main" tablespace. The undo tablespace is where it's going to store all this undo data so that it can undo whatever is done. Or rather, it doesn't really store the undo data itself; it stores the information needed to reverse the DML, so we can go back and undo the DML that has occurred on a particular table, essentially reversing those changes. If 15 minutes isn't good enough for you: ALTER SYSTEM SET UNDO_RETENTION = 1800. Eighteen hundred seconds is basically 30 minutes, so now it's going to keep 30 minutes' worth of undo for the DML against the table.
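
Here is a sketch of those statements. The undo_main_ts name is the made-up one from above, the datafile path and size are purely illustrative, and the tablespace is created first so the ALTER SYSTEM has something to point at; the CREATE UNDO TABLESPACE statement also previews the RETENTION GUARANTEE clause mentioned next.

    -- Create an undo tablespace whose retention is guaranteed, meaning
    -- Oracle will not overwrite unexpired undo even under space pressure.
    CREATE UNDO TABLESPACE undo_main_ts
        DATAFILE '/u01/app/oracle/oradata/orcl/undo_main_ts01.dbf' SIZE 500M
        RETENTION GUARANTEE;

    -- Point the instance at that undo tablespace (a dynamic parameter).
    ALTER SYSTEM SET UNDO_TABLESPACE = undo_main_ts;

    -- Keep undo for 1,800 seconds (30 minutes) instead of the default 900.
    ALTER SYSTEM SET UNDO_RETENTION = 1800;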

And if you want to create an undo tablespace of your own, you can do that with CREATE UNDO TABLESPACE ... RETENTION GUARANTEE, which means it's going to make sure it keeps that undo information for the full retention period, as sketched above. So that's how we do undo management manually. Okay, those are just some of the parameters, by the way; there are others. Off the top of my head, I think in 12c there are around 15 different parameters related to undo management, but again, like I always say when it comes to those kinds of things, don't hold me to it. Okay, so that is this section. Thank you for listening. I'll see you in the next section.
