1 00:00:00,280 --> 00:00:05,320 If the necessary synchronization requires acquiring more than one lock, there are some 2 00:00:05,320 --> 00:00:09,349 special considerations that need to be taken into account. 3 00:00:09,349 --> 00:00:14,190 For example, the code below implements the transfer of funds from one bank account to 4 00:00:14,190 --> 00:00:15,610 another. 5 00:00:15,610 --> 00:00:20,130 The code assumes there is a separate semaphore lock for each account and since it needs to 6 00:00:20,130 --> 00:00:26,000 adjust the balance of two accounts, it acquires the lock for each account. 7 00:00:26,000 --> 00:00:31,410 Consider what happens if two customers try simultaneous transfers between their two accounts. 8 00:00:31,410 --> 00:00:36,030 The top customer will try to acquire the locks for accounts 6005 and 6004. 9 00:00:36,030 --> 00:00:43,050 The bottom customer tries to acquire the same locks, but in the opposite order. 10 00:00:43,050 --> 00:00:48,370 Once a customer has acquired both locks, the transfer code will complete, releasing the 11 00:00:48,370 --> 00:00:49,370 locks. 12 00:00:49,370 --> 00:00:55,289 But what happens if the top customer acquires his first lock (for account 6005) and the 13 00:00:55,289 --> 00:01:00,269 bottom customer simultaneously acquires his first lock (for account 6004)? 14 00:01:00,269 --> 00:01:06,920 So far, so good, but now each customer will not be successful in acquiring their second 15 00:01:06,920 --> 00:01:12,039 lock, since those locks are already held by the other customer! 16 00:01:12,039 --> 00:01:17,500 This situation is called a "deadlock" or "deadly embrace" because there is no way execution 17 00:01:17,500 --> 00:01:19,950 for either process will resume. 18 00:01:19,950 --> 00:01:25,360 Both will wait indefinitely to acquire a lock that will never be available. 19 00:01:25,360 --> 00:01:30,960 Obviously, synchronization involving multiple resources requires a bit more thought. 
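The unlucky interleaving described above can be reproduced in a small sketch. This is not the lecture's transfer code: it assumes Python's `threading.Lock` as a stand-in for the per-account semaphore locks, and the names `locks`, `results`, and `customer` are hypothetical. Two barriers force both customers to hold their first lock before either tries for the second, and a timeout is used so the demonstration ends instead of waiting forever as a real deadlock would.

```python
import threading

# One lock per account (account numbers taken from the narration).
locks = {6004: threading.Lock(), 6005: threading.Lock()}
results = {}  # customer name -> whether the second lock was obtained

# Barriers are only here to force the unlucky interleaving every time.
both_hold_first = threading.Barrier(2)
both_tried_second = threading.Barrier(2)

def customer(name, first, second):
    with locks[first]:                 # acquire the first lock
        both_hold_first.wait()         # wait until the other customer holds theirs
        # A real transfer would block here forever; the timeout is an
        # assumption added so this sketch terminates.
        got = locks[second].acquire(timeout=0.1)
        results[name] = got
        both_tried_second.wait()       # keep holding until both have tried
        if got:
            locks[second].release()

top = threading.Thread(target=customer, args=("top", 6005, 6004))
bottom = threading.Thread(target=customer, args=("bottom", 6004, 6005))
top.start()
bottom.start()
top.join()
bottom.join()

print(results)  # neither customer obtained a second lock
```

With the timeout removed, both threads would block on their second `acquire` and never finish, which is the deadlock the narration describes.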
20 00:01:30,960 --> 00:01:36,070 The problem of deadlock is elegantly illustrated by the Dining Philosophers problem. 21 00:01:36,070 --> 00:01:39,859 Here there are, say, 5 philosophers waiting to eat. 22 00:01:39,859 --> 00:01:45,590 Each requires two chopsticks in order to proceed, and there are 5 chopsticks on the table. 23 00:01:45,590 --> 00:01:49,000 The philosophers follow a simple algorithm. 24 00:01:49,000 --> 00:01:53,750 First they pick up the chopstick on their left, then the chopstick on their right. 25 00:01:53,750 --> 00:01:57,890 When they have both chopsticks they eat until they're done, at which point they return both 26 00:01:57,890 --> 00:02:01,969 chopsticks to the table, perhaps enabling one of their neighbors to pick them up and 27 00:02:01,969 --> 00:02:03,210 begin eating. 28 00:02:03,210 --> 00:02:10,489 Again, we see the basic setup of needing two (or more) resources before the task can complete. 29 00:02:10,489 --> 00:02:13,220 Hopefully you can see the problem that may arise. 30 00:02:13,220 --> 00:02:17,780 If all philosophers pick up the chopstick on their left, then all the chopsticks have 31 00:02:17,780 --> 00:02:22,629 been acquired, and none of the philosophers will be able to acquire their second chopstick 32 00:02:22,629 --> 00:02:24,050 and eat. 33 00:02:24,050 --> 00:02:25,640 Another deadlock! 34 00:02:25,640 --> 00:02:29,170 Here are the conditions required for a deadlock: 1. 35 00:02:29,170 --> 00:02:34,410 Mutual exclusion, where a particular resource can only be acquired by one process at a time. 36 00:02:34,410 --> 00:02:35,410 2. 37 00:02:35,410 --> 00:02:40,790 Hold-and-wait, where a process holds allocated resources while waiting to acquire the next 38 00:02:40,790 --> 00:02:41,860 resource. 39 00:02:41,860 --> 00:02:42,940 3. 40 00:02:42,940 --> 00:02:48,500 No preemption, where a resource cannot be removed from the process which acquired it. 
41 00:02:48,500 --> 00:02:52,909 Resources are only released after the process has completed its transaction. 42 00:02:52,909 --> 00:02:54,260 4. 43 00:02:54,260 --> 00:03:00,069 Circular wait, where resources needed by one process are held by another, and vice versa. 44 00:03:00,069 --> 00:03:04,680 How can we solve the problem of deadlocks when acquiring multiple resources? 45 00:03:04,680 --> 00:03:09,460 Either we avoid the problem to begin with, or we detect that deadlock has occurred and 46 00:03:09,460 --> 00:03:11,800 implement a recovery strategy. 47 00:03:11,800 --> 00:03:14,330 Both techniques are used in practice. 48 00:03:14,330 --> 00:03:18,960 In the Dining Philosophers problem, deadlock can be avoided with a small modification to 49 00:03:18,960 --> 00:03:20,540 the algorithm. 50 00:03:20,540 --> 00:03:25,700 We start by assigning a unique number to each chopstick to establish a global ordering of 51 00:03:25,700 --> 00:03:29,579 all the resources, then rewrite the code to acquire resources 52 00:03:29,579 --> 00:03:34,599 using the global ordering to determine which resource to acquire first, which second, and 53 00:03:34,599 --> 00:03:36,470 so on. 54 00:03:36,470 --> 00:03:40,720 With the chopsticks numbered, the philosophers pick up the lowest-numbered chopstick from 55 00:03:40,720 --> 00:03:42,939 either their left or right. 56 00:03:42,939 --> 00:03:47,840 Then they pick up the other, higher-numbered chopstick, eat, and then return the chopsticks 57 00:03:47,840 --> 00:03:49,530 to the table. 58 00:03:49,530 --> 00:03:52,120 How does this avoid deadlock? 59 00:03:52,120 --> 00:03:57,590 Deadlock happens when all the chopsticks have been picked up but no philosopher can eat. 
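The numbered-chopstick rule just described might look like the following sketch. It assumes Python threads for the philosophers and `threading.Lock` objects for the chopsticks; the names `N`, `chopsticks`, `meals`, and `philosopher`, and the choice of 100 rounds, are hypothetical and not from the lecture.

```python
import threading

N = 5
# Chopstick i gets the unique number i; that numbering is the global order.
chopsticks = [threading.Lock() for _ in range(N)]
meals = [0] * N  # meals[i] is only ever written by philosopher i

def philosopher(i, rounds):
    left, right = i, (i + 1) % N
    # Always pick up the lower-numbered chopstick first, then the higher one.
    low, high = min(left, right), max(left, right)
    for _ in range(rounds):
        with chopsticks[low]:
            with chopsticks[high]:
                meals[i] += 1  # "eat"
        # Leaving the with-blocks returns both chopsticks to the table.

threads = [threading.Thread(target=philosopher, args=(i, 100)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(meals)  # every philosopher finishes all 100 meals
```

Only the last philosopher's order changes compared with left-then-right: chopstick 0 is on that philosopher's right, so the right chopstick is picked up first.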
60 00:03:57,590 --> 00:04:01,650 If all the chopsticks have been picked up, that means some philosopher has picked 61 00:04:01,650 --> 00:04:06,430 up the highest-numbered chopstick and so must have earlier picked up the lower-numbered 62 00:04:06,430 --> 00:04:09,069 chopstick on his other side. 63 00:04:09,069 --> 00:04:15,190 So that philosopher can eat and then return both chopsticks to the table, breaking the hold-and-wait 64 00:04:15,190 --> 00:04:16,940 cycle. 65 00:04:16,940 --> 00:04:21,500 So if all the processes in the system can agree upon a global ordering for the resources 66 00:04:21,500 --> 00:04:26,650 they require, then acquire them in order, there will be no possibility of a deadlock 67 00:04:26,650 --> 00:04:29,250 caused by a hold-and-wait cycle. 68 00:04:29,250 --> 00:04:33,940 A global ordering is easy to arrange in our banking code for the transfer transaction. 69 00:04:33,940 --> 00:04:39,031 We'll modify the code to first acquire the lock for the lower-numbered account, then 70 00:04:39,031 --> 00:04:42,169 acquire the lock for the higher-numbered account. 71 00:04:42,169 --> 00:04:48,260 Now, both customers will first try to acquire the lock for the 6004 account. 72 00:04:48,260 --> 00:04:53,030 The customer that succeeds then can acquire the lock for the 6005 account and complete 73 00:04:53,030 --> 00:04:55,180 the transaction. 74 00:04:55,180 --> 00:05:01,419 The key to deadlock avoidance was that customers contended for the lock for the *first* resource 75 00:05:01,419 --> 00:05:03,310 they both needed. 76 00:05:03,310 --> 00:05:07,310 Acquiring that lock ensured they would be able to acquire the remainder of the shared 77 00:05:07,310 --> 00:05:12,639 resources without fear that they would already be allocated to another process in a way that 78 00:05:12,639 --> 00:05:15,750 could cause a hold-and-wait cycle. 
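A minimal sketch of the reordered transfer, assuming Python's `threading.Lock` in place of the lecture's semaphore locks. The names `balances`, `locks`, `transfer`, and `run_many`, along with the starting balances and amounts, are hypothetical and chosen only for illustration.

```python
import threading

# Illustrative starting balances for the two accounts in the narration.
balances = {6004: 5000, 6005: 5000}
locks = {acct: threading.Lock() for acct in balances}

def transfer(src, dst, amount):
    if src == dst:
        return  # nothing to move; also avoids taking the same lock twice
    # Global ordering: lock the lower-numbered account first, whichever
    # direction the money is moving.
    first, second = sorted((src, dst))
    with locks[first]:
        with locks[second]:
            balances[src] -= amount
            balances[dst] += amount

def run_many(src, dst, amount, times):
    for _ in range(times):
        transfer(src, dst, amount)

# Two customers transferring in opposite directions at the same time.
top = threading.Thread(target=run_many, args=(6005, 6004, 1, 1000))
bottom = threading.Thread(target=run_many, args=(6004, 6005, 2, 1000))
top.start()
bottom.start()
top.join()
bottom.join()

print(balances)  # {6004: 4000, 6005: 6000}
```

Both threads contend for the 6004 lock first, so whichever thread holds it is guaranteed to get the 6005 lock as well, and the total across the two accounts stays at 10000.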
79 00:05:15,750 --> 00:05:20,960 Establishing and using a global order for shared resources is possible when we can modify 80 00:05:20,960 --> 00:05:23,150 all processes to cooperate. 81 00:05:23,150 --> 00:05:28,090 Avoiding deadlock without changing the processes is a harder problem. 82 00:05:28,090 --> 00:05:32,271 For example, at the operating system level, it would be possible to modify the WAIT SVC 83 00:05:32,271 --> 00:05:39,100 to detect circular wait and terminate one of the WAITing processes, releasing its resources 84 00:05:39,100 --> 00:05:42,319 and breaking the deadlock. 85 00:05:42,319 --> 00:05:46,810 The other strategy we mentioned was detection and recovery. 86 00:05:46,810 --> 00:05:51,169 Database systems detect when there's been an external access to the shared data used 87 00:05:51,169 --> 00:05:57,250 by a particular transaction, which causes the database to abort the transaction. 88 00:05:57,250 --> 00:06:02,340 When issuing a transaction to a database, the programmer specifies what should happen 89 00:06:02,340 --> 00:06:09,479 if the transaction is aborted, e.g., she can specify that the transaction be retried. 90 00:06:09,479 --> 00:06:13,960 The database remembers all the changes to shared data that happen during a transaction 91 00:06:13,960 --> 00:06:18,990 and only changes the master copy of the shared data when it is sure that the transaction 92 00:06:18,990 --> 00:06:22,979 will not be aborted, at which point the changes are committed to 93 00:06:22,979 --> 00:06:24,990 the database. 94 00:06:24,990 --> 00:06:30,200 In summary, we saw that organizing an application as communicating processes is often a convenient 95 00:06:30,200 --> 00:06:31,770 way to go. 
96 00:06:31,770 --> 00:06:37,300 We used semaphores to synchronize the execution of the different processes, providing guarantees 97 00:06:37,300 --> 00:06:42,620 that certain precedence constraints would be met, even between statements in different 98 00:06:42,620 --> 00:06:43,620 processes. 99 00:06:43,620 --> 00:06:48,800 We also introduced the notion of critical code sections and mutual exclusion constraints 100 00:06:48,800 --> 00:06:53,370 that guaranteed that a code sequence would be executed without interruption by another 101 00:06:53,370 --> 00:06:55,110 process. 102 00:06:55,110 --> 00:07:01,550 We saw that semaphores could also be used to implement those mutual exclusion constraints. 103 00:07:01,550 --> 00:07:06,210 Finally we discussed the problem of deadlock that can occur when multiple processes must 104 00:07:06,210 --> 00:07:11,199 acquire multiple shared resources, and we proposed several solutions based on 105 00:07:11,199 --> 00:07:17,260 a global ordering of resources or the ability to restart a transaction. 106 00:07:17,260 --> 00:07:21,220 Synchronization primitives play a key role in the world of "big data" where there are 107 00:07:21,220 --> 00:07:26,319 vast amounts of shared data, or when trying to coordinate the execution of thousands of 108 00:07:26,319 --> 00:07:29,090 processes in the cloud. 109 00:07:29,090 --> 00:07:34,250 Understanding synchronization issues and their solutions is a key skill when writing most 110 00:07:34,250 --> 00:07:35,539 modern applications.