Читать книгу Distributed Computing Pearls - Gadi Taubenfeld - Страница 11
ОглавлениеCHAPTER 1
Distributed Computing
Two major developments that have happened in recent years are dramatically and irrevocably changing the computer landscape: the dramatic expansion in the use of the Internet, and the design of multicore computers where a single computer has multiple processors. A processor is the “brain” of the computer, and modern computers are now built with multiple “brains.” In each one of these cases, processors on the same or different computers have to work together as a team to achieve common goals.
In addition to personal computers, smartphones, laptops, tablets, and webcams, the Internet offers connectivity between many other computing devices. A few examples are sensors which enable environmental monitoring applications; sensors which are used for monitoring changes in structural conditions of infrastructures like bridges; wearable sensors that can remotely read a patient’s biometrics and send them to the patient’s physician; infant monitors that provide parents with real-time information about their babies; and much more.
1.1 WINDS OF CHANGE
Computers and computer networks are one of the most incredible inventions of the 20th century, having an ever-expanding role in our daily lives by enabling complex human activities in areas such as entertainment, education, and commerce. One of the most challenging problems in computer science for the 21st century is to improve the design of systems where communicating devices interact with one another, and in particular, to invent new applications that will take advantage of these new capacities.
The fundamental changes in computing architectures, where the whole hardware industry shifted to both mobile and multicore platforms expanding the arena of communicating devices to an unprecedented scale, require a fundamental change in how applications are written for computer networks and multicore computers. Concurrency and synchronization are fundamental issues that are critical for the design of such applications—the Internet could not work without synchronized protocols, database systems would be unthinkable without them, operating systems would crash even more frequently, and concurrent programming would be an impossibility.
Concurrency exists, of course, not only in computer systems but also in nature and living organisms. We can find concurrency at the molecular level as well as at those of cells, organs, individuals, communities, and ecological systems. Much of the future of computers in the 21st century will be told by how well programmers can take advantage of concurrency.
This book gently introduces the general reader to some of the basic principles and some of the fascinating results of computer science which involve interactions between computing devices (or processors on the same or different devices), so that the reader can get a feel of the nature of this exciting and interesting field called distributed computing. These fundamental principles and results, underlying the design of algorithms for distributed systems, are presented by comparing issues that arise when dealing with groups of computing devices to those that arise when a group of people has to share their resources and work as a team to solve various problems. The book is self-contained; no reliance is made on previous knowledge in computer science. In particular, all the figures in the book which include code segments can be skipped without loss of continuity.
1.2 THE INTERNET
The Internet is a vast collection of computer networks which are connected into a single huge network. It is used for the transport of data and messages across distances which can be anywhere from the same room to anywhere in the world. It provides an infrastructure for the use of E-mail, banking, commerce, education, entertainment, telecommunications, manufacturing, and social networks and enables the sharing of remote resources such as databases, files, discs, printers, and computational resources. It also enables several computers to solve problems together much faster than any computer can do alone.
Historically, the Internet was designed as a research network without the expectation that it would have a significant role in our daily lives. The scale and heterogeneity of the Internet have far surpassed all expectations. As the Internet continues to grow, more and more business-critical functions rely on its availability. The vast majority of communications traffic, including telephone, television, radio, business data, and government data, rely on an Internet infrastructure that is available and secure.
Many applications of large-scale computer networks, such as the Internet, require a high level of reliability and security, for example, an air traffic control system, spacecraft navigation systems, or an integrated corporate management system. As the system size increases, so does the need to design fault-tolerant applications, for the probability that the entire system functions correctly at any given time rapidly approaches zero. Such applications enable the system as a whole to continue to function despite the failure of a limited number of components.
Today’s economy involves manufacturing, distributing, and retailing of goods. However, it also has to do with creating and disseminating information, for example, publishing books, filmmaking, etc. Future economy is likely to be dominated by information. Information is a representation of knowledge, and can be represented in two ways: analog—a book that you can hold in your hand; or digital—a book that is stored as a file in your computer. The digital revolution is about converting analog information to digital information and use computer networks such as the Internet to move the digital information around. Such networks are required to be able to move the information in large quantities, everywhere, cheaply, securely, and as fast as possible.
1.3 COMPUTERS WITH MULTIPLE PROCESSORS
A processor is the brain of the computer. It is a component in a computer that interprets and execute computer program instructions and processes data. Throughout the history of modern computing, application developers have been able to rely on improved processor design to deliver significant performance improvements while reducing costs at the same time. That is, as processors got faster so did the applications. Unfortunately, increasing difficulty with heat and power consumption of the new smaller and faster processors along with the limits imposed by quantum physics has made this progression increasingly more difficult.
Until a few years ago mainstream computers were built with a single processor only. This situation has changed, as it became more difficult to increase the (clock) speed of uniprocessor computers. Hence, all microprocessor companies have been forced to bet their futures on multiple processors (also called multicores1) which resides inside a single computer.
Essentially, all computer manufacturers are now offering a new generation of multiprocessor computers where a single computer includes several processors all executing concurrently, and interact and collaborate with one another. Several computer manufacturers have been building, for many years now, costly high-end computers where each computer includes many processors,2 however relatively cheap multiprocessor computers are available today as mainstream computers and can be found in many homes.3 The switch from uniprocessors to multiprocessors is a milestone in the history of computing, and researchers have the rare opportunity to re-invent some of the cornerstones of computing.
This fundamental change in computing architecture requires a fundamental change in how such computers are programmed. Writing a concurrent application for a multiprocessor computer that takes advantage of having multiple processors to increase speed and get better performance is much more challenging and complex than programming a uniprocessor computer, and requires an understanding of new basic principles. Much of the future of computers with multiple processors will be told by how well programmers can take advantage of the new concurrent computing architecture.
1.4 SYNCHRONIZATION
Computation on computer networks like the Internet and computation on a single multiprocessor computer have many aspects in common. The key issue in both cases is the need to understand how separate computers on the Internet or, similarly, separate processors within a single computer, interact and synchronize with one another. Synchronization techniques are perceived as essential to design and support the working activities of groups of computers and processors.
Many of our daily interactions with other people involve synchronization. You and your spouse may have to synchronize on who will buy the groceries, empty the garbage can, take the kids to school, which one of you will be the first to take a shower (assuming you only have one shower at home), will take the car, or use the single computer you have. Assume that you have a cat and your neighbor has a dog and you and your neighbor are sharing a yard, then you and your neighbor might want to coordinate to make sure that both pets are never in the yard at the same time.
In these examples, synchronization is used to ensure that only one participant (and not both) will take a specific action at a given time. Another type of synchronization has to do with cooperation. You and your spouse might need to move a heavy table together to its new location (it is too heavy for just one person). A classical example of cooperation is for two camps of the same army to decide on the exact time for a coordinated attack on the enemy camp.
We point out that the use of the term synchronization in computer science is slightly more general than its use in standard English. The following quote from the Oxford dictionary explains this point, “The use of synchronize to mean coordinate or combine as in ‘We must synchronize our efforts’ is considered incorrect by some people and should be avoided in standard English.” In computer science, synchronization also means coordination. That is, synchronization between processors is classified as either contention or coordination.
1.5 WHY IS SYNCHRONIZATION DIFFICULT?
All the above examples for synchronization between people have similar examples for synchronization between computers. Synchronization is needed in all systems and environments where several processors can be active at the same time. Without proper synchronization, the integrity of the data may be destroyed if two computers update a common file at the same time, and as a result, deposits and withdrawals could be lost, confirmed reservations might have disappeared, etc. However, while achieving synchronization between humans is sometimes relatively easy, achieving synchronization between computers is challenging and difficult. The reason is that most computers communicate with each other in a very restricted way.
While humans can see and hear each other, computers, and computing devices, in general, can in most cases only read and write. So, one computer can write a note (or send a message) that the other computer will later read, but they cannot see each other. To understand the difficulty with this type of restricted communication, the next two chapters examine several simple two-person interactions where communication is restricted either to writing and reading of notes or to sending and receiving of messages.
1.6 ALGORITHMS AND PROGRAMS
The notion of an algorithm is a central notion in computer science. An algorithm is just the recipe upon which a problem is solved. It was originally used in the context of solving mathematical problems. Euclid, the famous Greek mathematician, invented sometime between 400 and 300 B.C., an algorithm for finding the greatest common divisor of two possible integers. For example, the greatest common divisor of 18 and 27 is 9. This algorithm is considered to be the first nontrivial mathematical algorithm ever devised.
The word algorithm is derived from the name of the Persian mathematician Mohammed al-Khowârizmî, who lived in Baghdad during the 9th century. Al-Khowârizmî laid out the basic algorithms for adding, multiplying, and dividing numbers, and for extracting square roots. On a computer, an algorithm is expressed as a computer program which specifies, in the exact syntax of some programming language, the computation one expects a computer to perform. A recipe for preparing a cake, which prescribes the activities needed for preparing the cake, is also an algorithm. Such a recipe can be expressed in many different natural languages.
A plan or a strategy for winning in a game or solving a puzzle is also an algorithm. Thus, throughout the book, we shall use the terms, an algorithm, a plan, a strategy, or a solution, interchangeably. In most chapters of the book, we explain fundamental concepts which involve concurrency and synchronization between computers, by examining situations and solving problems which relate to interactions between people where communication is restricted in various ways. Thus, we shall use the terms a plan or a solution, more often than we shall use the term an algorithm.
1.7 CONCURRENT AND DISTRIBUTED ALGORITHMS
A concurrent or distributed algorithm is the recipe upon which a problem is solved by more than just one computing element. Finding the largest number in a set of numbers by first dividing the set into two subsets, using two processors to find the maximum number in each subset, and then comparing these two numbers, is an example of a simple concurrent algorithm. (Such an algorithm should also specify how to find the maximum in each subset.)
The term distributed algorithms refers to algorithms where the computing elements are physically far away from each other and communicate by sending and receiving messages (as done on the Internet). The term concurrent algorithms refers to algorithms where the computing elements are physically very close to each other and communicate by reading from and writing to shared memory locations (as done inside a multiprocessor computer). In the field of distributed computing, both types of algorithms are studied.
When a processor executes a computer program (such as a web browser), the execution itself is called a process. A process runs on a processor, which is the physical hardware. The physical location of the different processes or processors—we shall use the terms interchangeably—involved in a single concurrent activity can be anywhere from the same computer to different computers anywhere in the world.
There are two main technological underpinnings of the fascinating rapid developments of computer networks and multiprocessor computers. The first is, of course, the advances in the design of faster hardware. The second is the development of efficient concurrent and distributed algorithms for supporting complex interactions between processors and computers. This book tells the story of the development of such algorithms. It is a fascinating story!
1.8 IMPOSSIBILITY RESULTS
In addition to the study of concurrent and distributed algorithms, the field of distributed computing also explores the inherent limitations of distributed systems: what problems cannot be solved in particular systems. Identifying features of a specific distributed system architecture that make it inadequate for solving certain problems is crucial for the design of better systems which can overcome such limitations.
Impossibility results help us understand the crucial limitations of real systems, why certain systems are (computationally) weak while others are powerful, when should we stop looking for a solution for a given problem, and how to adjust a problem statement or a system model to overcome an impossibility result.
Impossibility results usually depend on assumptions about: how the computing elements communicate with one another, what kinds of failures may occur, or whether randomization can be used. Such results are usually hard to prove.
Through the book, we will introduce and discuss some of the most fundamental impossibility results of distributed computing. We will highlight the insights gained from these results so that the reader can understand and appreciate their utmost importance.
1.9 CHAPTER NOTES
In 1968, Edsger Wybe Dijkstra, one of the most influential members of computing science’s founding generation, published his famous paper “Co-operating Sequential Processes” [14], that originated the field of concurrent programming. A few concepts and results from Dijkstra’s papers are covered in Chapters 2 and 7.
The Internet traces its beginning back to the early 1960s. At that time several research groups had invented a new technique, called packet switching, to transmit information as an efficient and robust alternative for circuit switching which is the transmission method used by the telephone network. Packet switching is the transmission method used today by the Internet. James F. Kurose and Keith W. Ross’ excellent book Computer Networking: A Top-Down Approach explains and uses the Internet’s architecture and protocols as primary vehicles for studying fundamental computer networking concepts [31].
In 1995, Gordon Moore published an article predicting exponential growth in the density of transistors in integrated circuits [37]. Since then, this prediction, known as “Moore’s Law,” went on to become a self-fulfilling prophecy. Moore’s Law has become actual fact, with the doubling of transistor density every 18 months, as well as exponential growth in clock rates. However, due to inherent limitations imposed by the laws of physics, this exponential growth of the computing power of uniprocessors has to decline. The constraints of heat dissipation due to high-power consumption by fast uniprocessors have forced chip manufacturers to develop multicore architectures, where increases in throughput are achieved by packaging multiple cores embedded on the same chip which resides inside a single multiprocessor computer.
1Multicore means multiple processors embedded on the same chip (i.e., on the same piece of semiconducting material).
2In mid-2007, IBM unveiled Blue Gene/P, the second generation of one of the most powerful supercomputers in the world. The Blue Gene/P supercomputer configuration is a 294,912-processor, 72-rack system harnessed to a high-speed, optical network. It is used for executing applications like hydrodynamics, quantum chemistry, molecular dynamics, climate modeling, and financial modeling.
3The multicore era for mainstream computers began in spring 2005 when Intel and AMD (followed the lead of IBM and Sun Microsystems) announced that their microprocessors would rely on multiple processors or cores, and introduced computers with two processors each (dual-core processors) which enable the execution of two programs in parallel.
