Power Users Demand More Power
There are definite limits to the amount of work a single Central Processing Unit (CPU) can accomplish. As the demand for processing power grows, the people who make these CPUs endeavour to make them ever faster by employing more exotic technologies; in the largest computers this has increasingly meant the use of Emitter Coupled Logic (ECL). Even this technology has its limits, and eventually it becomes necessary to organize CPUs so that they can co-operate and share the load. Such co-operating computers can be more cost-effective, particularly when cheaper technologies such as Complementary Metal Oxide Semiconductor (CMOS) are used.
The first form of co-operation uses several CPUs in an arrangement where they share the main memory and the other communications and storage resources of the computer. This approach is called Symmetric Multi Processing (SMP), since all the CPUs in a machine have equal status, allowing the operating system to run several threads of control simultaneously. In reality the CPUs do not share everything, since the architecture of the modern CPU allows for a cache of private memory. This is particularly true of modern RISC CPUs, where the circuitry required to decode complex addressing modes has been sacrificed to provide more registers and bigger caches. It is the need to keep these caches synchronized that limits the architecture: the more CPUs that are added to the shared-memory model, the more time they must spend maintaining their caches, so each additional CPU gives less than a linear increase in performance. Eventually a point is reached where additional CPUs give no increase at all. Where this limit lies is a matter of some debate, since it depends upon the efficiency of the cache-updating strategy. Until recently eight CPUs was considered the limit, but with the advent of Non-Uniform Memory Architecture (NUMA) schemes some computer manufacturers believe that the limit on the number of CPUs in SMP architectures is significantly higher.
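The programming model that SMP presents to software can be sketched in miniature (a hypothetical illustration, not drawn from any particular system): several threads of control operate on a single shared memory image, and a lock serializes conflicting updates, playing in software a role loosely analogous to the one cache-coherence hardware plays for the private caches.

```python
import threading

# Minimal sketch of the SMP model: several threads of control share
# one memory image. The thread and iteration counts are illustrative.
counter = 0
lock = threading.Lock()

def worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:          # without this, concurrent increments can be lost
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000 -- every increment survives because updates are serialized
```

The cost of that serialization is the software analogue of the coherence traffic described above: the more threads contend for the same shared data, the more time is spent waiting rather than working.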
The majority of computers in the world are used for data-manipulation tasks. These tasks involve moving data between the CPUs and some secondary storage system, usually disks. Exploiting the fact that data is seldom uniform, and that the secondary storage system can be divided between several computers, leads to a clustered, shared-nothing approach. In the past this has been easiest to achieve with hierarchical databases such as DL/1, but increasingly it has been accomplished with more modern relational database models.
In a relational model the data is grouped in tables which cross-reference each other. These tables can naturally be very large, but they can be split horizontally, so that a portion of each table lies on a different computer. When a datum is added to or retrieved from a table, the `index' key determines not only its physical address on disk but also on which computer it lies. Relational databases employ a large number of such tables which interact with each other. Making them parallel across a number of computers in this fashion requires two key programs: a parallel query decomposer, which acknowledges the distributed nature of the database and produces an optimal data-retrieval method, and a parallel lock manager, which likewise acknowledges the distributed nature of the database and allows the tables to be updated safely.
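Horizontal splitting of this kind can be sketched as follows. This is a minimal illustration assuming a simple hash-based partitioning function; real systems may instead partition by key ranges or directories, and all names here are invented for the example.

```python
import hashlib

# Hypothetical cluster of four shared-nothing nodes; each node holds
# only its own horizontal slice of the table.
NUM_NODES = 4
partitions = {n: {} for n in range(NUM_NODES)}

def node_for_key(key: str) -> int:
    """Map an index key to the computer holding that row (hash scheme assumed)."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_NODES

def insert(key: str, row: dict) -> None:
    partitions[node_for_key(key)][key] = row

def lookup(key: str) -> dict:
    # The key alone tells us which computer to ask, so a single-row
    # retrieval involves only one node, not the whole cluster.
    return partitions[node_for_key(key)][key]

insert("cust-1001", {"name": "Smith"})
print(lookup("cust-1001"))  # {'name': 'Smith'}
```

A parallel query decomposer generalizes the `lookup` step: for a query touching many keys it works out which slices, on which nodes, must be consulted, and combines the results.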
The computers that participate in parallel relational databases can of course have several CPUs and be, individually, SMP machines. However, since relational technology produces large data flows between related tables, to be effective the computers must be clustered tightly together. To keep transmission times to a minimum the network they share must be very high speed, and to keep latency to a minimum the computers must be located close together. When such computers are all effectively kept in the same box and linked by networks whose speed approaches that of an individual computer's main data bus, the collection is known as a Massively Parallel Processing (MPP) computer.
MPP machines by their very nature lend themselves to redundancy and duplication, allowing highly reliable computing to scale without foreseeable limit. They do, however, require additional software to control the allocation of work between the various components. This is a vital task for a Transaction Processing Monitor.
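The work-allocation role can be sketched with a toy dispatcher. This is a hypothetical illustration of the idea only: requests are routed to the least-loaded component, and a failed component is dropped from the rotation so the machine's redundancy can be exploited. The class and method names are invented for the example.

```python
# Hypothetical sketch of work allocation across MPP components:
# route each request to the least-loaded node, and drop failed
# nodes from the rotation.
class Dispatcher:
    def __init__(self, nodes):
        self.load = {n: 0 for n in nodes}   # outstanding requests per node

    def assign(self, request):
        # Pick the node with the fewest outstanding requests.
        node = min(self.load, key=self.load.get)
        self.load[node] += 1
        return node

    def complete(self, node):
        self.load[node] -= 1

    def fail(self, node):
        # Redundancy in action: a failed node leaves the rotation,
        # and its work can be re-queued on the survivors.
        del self.load[node]

d = Dispatcher(["node0", "node1", "node2"])
first = d.assign("txn-1")
second = d.assign("txn-2")
print(first != second)  # True -- consecutive requests spread across nodes
```

A real Transaction Processing Monitor does far more (transactional context, recovery, re-routing of in-flight work), but balancing and failure handling of this shape sit at its core.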