
10 Nov 1995

The I/O Bus

A personal computer may transfer data from disk to CPU, from CPU to memory, or from memory to the display adapter. A PC cannot afford to have separate circuits between every pair of devices. A mechanical switch, like the old phone systems used, would be too slow.

The solution is a Bus. The Bus is simply a common set of wires that connect all the computer devices and chips together. Some of these wires are used to transmit data. Some send housekeeping signals, like the clock pulse. Some transmit a number (the "address") that identifies a particular device or memory location. The computer chips watch the address wires and respond when their identifying number is transmitted. They then transfer data on the other wires.

From the first IBM PC up through the first PS/2 computers (introduced in 1987) a computer had one bus and all of its devices and chips ran at the same speed. On those systems, additional computer memory was often added by plugging an adapter card into the same slots that held I/O adapters. Starting with machines that used the 386 CPU, the memory and CPU of the system ran faster than the I/O devices. The solution was to separate the CPU and memory from all the I/O. Today, memory is only added by plugging it into special sockets on the main computer board.

Whether there is more than one Bus, or one Bus with different speeds, is a matter of perspective. A car drives down the local streets at 25 miles per hour. Then it turns onto a highway ramp and accelerates to 55. Is there one road system, or two? The important thing is that there is a connection that allows a flow of traffic between the two speed zones. Within the PC, data can flow from any chip to any other chip, but different parts of the path run at different speeds.

Analogy Alert: Electricity flows through wire at the same speed everywhere (at about the speed of light). So a "faster" bus is not one where the electrons move faster, but rather one in which the time between meaningful events (the "clock speed") is faster. On a faster bus the chips have to react more quickly, and the engineering has to be more rigorous so that voltage signals can be clean and tight.

In a modern PC, there may be a half dozen different Bus areas. There is certainly a "CPU area" that still contains the CPU, memory, and basic control logic. There is a "High Speed I/O Device" area that is either a VESA Local Bus (VLB) or a PCI Bus. A very low cost home computer may have no high speed devices. A more typical desktop system connects the high speed bus on the mainboard to the display adapter and IDE disk interface chip. Then one or two extra I/O slots may allow adapter cards to connect to the high speed bus. The remaining I/O device slots support standard "ISA" bus cards. Some computers will also provide sockets for a number of PCMCIA "credit card" adapters commonly found in laptop computers.

History

In 1984 IBM was shipping its PC AT model. The CPU, memory, and I/O bus all shared a common 8MHz clock. This became the basis for all subsequent clone computers. The term "AT" is a registered trademark of IBM, so this I/O bus became known as the ISA (Industry Standard Architecture) bus.

Every currently marketed PC supports some ISA interface slots. The bus and matching adapter cards are simple and cheap. ISA is a 16-bit interface, which means that data can be transferred only two bytes at a time. More importantly, the ISA bus runs at only 8 MHz and it typically requires two or three clock ticks to transfer those two bytes of data. This is not a problem for devices that are inherently slow like the COM port (modem), the printer port, the sound card, or the CD-ROM. However, the ISA bus is too slow for high performance disk access and therefore is not acceptable in Servers. It is also too slow for modern Windows display adapters.
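The arithmetic behind "too slow" can be sketched in a few lines. This is only a back-of-the-envelope calculation using the figures quoted above (8 MHz, two bytes, two or three ticks per transfer); the function name is made up for the illustration, and real cards often do worse than these peak numbers.

```python
def bus_throughput_mb_per_s(clock_mhz, bytes_per_transfer, ticks_per_transfer):
    """Peak transfer rate in millions of bytes per second."""
    transfers_per_second = clock_mhz * 1_000_000 / ticks_per_transfer
    return transfers_per_second * bytes_per_transfer / 1_000_000


# ISA: 8 MHz clock, two bytes per transfer, two or three ticks per transfer.
isa_best = bus_throughput_mb_per_s(8, 2, 2)
isa_worst = bus_throughput_mb_per_s(8, 2, 3)

print(f"ISA peak: {isa_worst:.1f} to {isa_best:.1f} million bytes per second")
```

So even under ideal conditions the ISA bus tops out at somewhere between five and eight million bytes per second.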

In 1987 IBM introduced a new Microchannel (MCA) bus. It had clear advantages over the previous PC bus. Its 10 MHz clock was slightly faster. The cards could be automatically configured with a utility program instead of setting physical switches and jumpers. The bus can transfer four bytes of data at a time and, in some configurations and with some cards, it can transfer data every clock tick. However, the Microchannel itself was expensive, the adapter cards were more expensive, and the technology remained encumbered by IBM licensing.

The other vendors developed an extension of the older ISA interface called EISA. An EISA slot contained the older ISA interface, and then an extra socket with additional connections. The user could plug either an old ISA card or a new EISA card into the slot. The newer cards supported a 32-bit data interface and could therefore transfer four bytes of data per operation. However, to remain compatible with the old card, EISA still ran at 8MHz. And the extra logic pushed up the cost of both the EISA system and each adapter card.

As the 486 CPU chip became popular, the idea of running I/O devices at 8 or 10 MHz collided with a mainboard that ran everything else at 33 MHz. It was also clear that it should not require an extra $500 to transfer data 32 bits at a time.

The first solution was the VESA Local Bus (VLB), which became popular at the start of 1993. VESA is a consortium of companies making displays and display adapters. Desktop machines began to include one or two Local Bus slots to support a high speed video card and, perhaps, one other high speed device. A few vendors produced VESA SCSI adapter cards, or Local Bus LAN adapters. Nevertheless, VESA remained largely a display standard.

PCI - The Current Standard

The PCI bus was developed by Intel. Although it is mostly known for its CPUs, Intel also has a historical association with Ethernet, multimedia, and some disk interfaces. So Intel was unhappy with the VLB concentration on just the video interface and wanted to develop a general purpose bus. The objective was an interface that was fast and inexpensive. It did not have to be simple (advances in chip technology took care of that) and could achieve a low cost by high volume production.

PCI is a 64 bit interface in a 32 bit package. Figuring this out requires a bit of arithmetic. The PCI bus runs at 33 MHz and can transfer 32 bits of data (four bytes) every clock tick. That sounds like a 32-bit bus. However, a clock tick at 33 MHz is 30 nanoseconds, and memory only has a speed of 70 nanoseconds. When the CPU fetches data from RAM, it has to wait at least three clock ticks for the data. By transferring data every clock tick, the PCI bus can deliver the same throughput on a 32 bit interface that other parts of the machine deliver through a 64 bit path.
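The "bit of arithmetic" can be written out as follows. This is a rough sketch using only the numbers in the paragraph above; the assumption that the 64-bit path delivers one eight-byte word per memory access is mine, made to keep the comparison simple.

```python
import math

PCI_CLOCK_MHZ = 33
MEMORY_ACCESS_NS = 70

# One clock tick at 33 MHz lasts about 30 nanoseconds.
tick_ns = 1000 / PCI_CLOCK_MHZ

# A 70 ns memory access does not fit in two ticks, so it costs three.
ticks_per_memory_access = math.ceil(MEMORY_ACCESS_NS / tick_ns)

# PCI: four bytes on every tick, in millions of bytes per second.
pci_peak = PCI_CLOCK_MHZ * 4

# Assumed 64-bit path: eight bytes, but only once per memory access.
wide_path = PCI_CLOCK_MHZ * 8 / ticks_per_memory_access

print(f"tick = {tick_ns:.1f} ns, memory access = {ticks_per_memory_access} ticks")
print(f"PCI peak = {pci_peak}, 64-bit path = {wide_path:.0f} (millions of bytes/sec)")
```

Under these assumptions the narrower bus that moves data on every tick keeps up with, and can exceed, the wider path that has to wait on memory.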

The PCI bus has all the signals of the old ISA bus. This allows a PCI adapter card to emulate older equipment. For example, a PCI disk controller can respond to the same addresses and generate the same interrupts as the older disk controllers that the BIOS understands. However, PCI devices can also be self-configuring and operate in a Plug and Play mode.

The PCI bus connects at one end to the CPU/memory bus and at the other end to a more traditional I/O bus. The PCI interface chip may support the video adapter, the EIDE disk controller chip, and maybe two external adapter cards. A desktop machine will have only one PCI chip, and so it will add a number of extra ISA only slots. A server may add additional PCI chips, and extra server slots will usually be EISA.

While ISA and EISA are exclusively PC interfaces, the PCI bus is now used in Power Macintosh systems and PowerPC machines. It may be attractive for minicomputers and other RISC workstations.

Current Status

Interrupts

An I/O bus is similar to the bus between the CPU, mainboard control logic, and memory. Both types of bus structure have address wires, data wires, and a similar set of housekeeping wires. Both bus structures must determine if an operation refers to memory or an I/O address. Both must distinguish between 8-bit, 16-bit, and 32-bit operations. Both must be able to introduce "Wait States" to slow down the CPU when a device needs more time to complete an operation.

The most important difference between the CPU-memory local bus and the I/O bus is the presence of Interrupt Request (IRQ) wires. The I/O bus has 15 separate IRQ wires. The CPU has only one interrupt pin. The chip set on the mainboard has to provide a translation between the two.

Without interrupts, the CPU must start an operation to a device and then spin in a loop asking, "Is it done yet? Is it done yet? Is it done yet?" After a few hundred thousand tests, the device will signal that the operation is complete. Interrupts allow the CPU (particularly on a more advanced operating system like Windows 95, OS/2, or NT) to do some other work until the operation is complete.

When a device generates an interrupt, the CPU hardware stops running an ordinary program and jumps to an interrupt handling routine in the Device Driver. The interrupt may signal that:

Any device on the I/O bus can request an interrupt by placing a signal on one of the 15 IRQ wires. If more than one IRQ signal is received at the same time, the chip set on the mainboard has to select the one with highest priority to process first. The CPU is interrupted (by sending a signal on its one wire) and the chip set then transfers the identity of the IRQ level to be processed.
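The selection step can be pictured with a small sketch. The priority order used here (IRQ 0, 1, then 8 through 15, then 3 through 7) is the conventional arrangement on AT-class machines with two cascaded interrupt controllers; it is not stated in the text above, so treat it as an assumption. The names `AT_PRIORITY` and `select_irq` are invented for the example.

```python
# Assumed AT-style priority order, highest first. IRQ 2 is absent because
# it carries the cascade from the second controller, leaving 15 usable lines.
AT_PRIORITY = (0, 1, 8, 9, 10, 11, 12, 13, 14, 15, 3, 4, 5, 6, 7)


def select_irq(pending, priority_order=AT_PRIORITY):
    """Return the highest-priority pending IRQ, or None if nothing is pending."""
    for irq in priority_order:
        if irq in pending:
            return irq
    return None


# A COM port (IRQ 4) and a mouse (IRQ 12) raise requests at the same moment.
first = select_irq({4, 12})
print("serviced first:", first)
```

In this model the chip set would signal the CPU on its single interrupt pin and then report the value of `first` as the level to be processed.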

Each IRQ wire goes to every slot in the I/O bus. An adapter card is configured (physically with switches or logically with a utility) to use a specific IRQ value. The I/O bus on the first PC assumed that each device would have its own IRQ line, so the circuit to drive an interrupt request was made very simple. Unfortunately, when two adapter cards are incorrectly configured with the same IRQ value, and if both try to generate IRQs at the same time, the result is to produce a short in the I/O bus. Usually there is no damage, but the effect can be to burn out either card or to trash the mainboard.

Later bus architectures (MCA, EISA, PCI) use safer circuitry that allows two devices to share the same interrupt. When an interrupt is shared, the system responds to an interrupt by calling the device driver for each device associated with that IRQ. The drivers poll their respective adapter cards to determine if there is any pending activity which requires a response. There is a slight loss of efficiency when interrupts are shared, but not enough to cause worry.
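A toy model of shared interrupt handling might look like this. The `Card` class, the `dispatch` function, and the choice of IRQ 11 are all hypothetical; a real operating system keeps a similar per-IRQ list of driver entry points but with far more bookkeeping.

```python
class Card:
    """Stand-in for an adapter card plus its device driver."""

    def __init__(self, name):
        self.name = name
        self.pending = False

    def service(self):
        """Driver poll: handle pending work and report whether there was any."""
        if self.pending:
            self.pending = False
            return True
        return False


def dispatch(irq, handlers):
    """Call every driver registered on this IRQ; return names of cards serviced."""
    return [card.name for card in handlers.get(irq, []) if card.service()]


lan = Card("lan")
scsi = Card("scsi")
handlers = {11: [lan, scsi]}   # two cards sharing one IRQ (number is arbitrary)

scsi.pending = True            # only the SCSI card actually needs attention
serviced = dispatch(11, handlers)
print(serviced)
```

The LAN driver is called, finds nothing to do, and returns immediately. That wasted call is the "slight loss of efficiency" mentioned above.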

Plug and Play

An adapter card designed for Microchannel, EISA, or PCI can handle the IRQ problem automatically. However, the rest of the interface to advanced bus structures is expensive. The ISA bus interface is simple and cheap, but it requires the user to set the IRQ, either with physical switches on the card or through some kind of setup configuration utility. The solution to this problem is a new arrangement called "Plug and Play."

To use Plug and Play you need three things:

  1. The PC has to be ready to do Plug and Play. Most new computers have this ability, and most old ones don't.
  2. The adapter card has to be enabled for Plug and Play. It must minimally be able to report to the computer what I/O address and IRQ it is using or is able to use, and it must accept commands to use the values that the computer selects for it.
  3. The operating system must be able to support Plug and Play. Currently, this limits you to Windows 95. Windows NT doesn't support it, based presumably on the theory that anyone running NT better be sophisticated enough that IRQ issues are no problem. OS/2 doesn't support it yet.
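The negotiation in step 2 amounts to a small assignment puzzle: each card reports the IRQ values it is able to use, and the computer picks one for each card so that none collide. The sketch below is one simple way to solve it (try the most constrained card first, back up on a dead end); it is not the algorithm any particular BIOS or operating system uses, and the card names and IRQ lists are invented.

```python
def assign_irqs(cards, free_irqs):
    """cards maps a card name to the IRQs it can use.

    Returns a dict of card name to assigned IRQ, or None if no
    conflict-free assignment exists.
    """
    # Handle the cards with the fewest choices first.
    names = sorted(cards, key=lambda name: len(cards[name]))

    def solve(index, used):
        if index == len(names):
            return {}
        name = names[index]
        for irq in cards[name]:
            if irq in free_irqs and irq not in used:
                rest = solve(index + 1, used | {irq})
                if rest is not None:
                    return {name: irq, **rest}
        return None  # every choice for this card leads to a conflict

    return solve(0, frozenset())


cards = {"sound": [5, 7], "lan": [5, 10, 11], "modem": [5]}
plan = assign_irqs(cards, free_irqs={5, 7, 10, 11})
print(plan)
```

The modem can only live on IRQ 5, so it gets that value and the more flexible cards are moved out of its way. This is the juggling that a user otherwise does by hand with switches and jumpers.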

A more expensive but more widely supported alternative to Plug and Play is provided by the PCMCIA interface to "Credit Card Adapters" used in most laptop computers. This interface was designed to be self-configuring. Technically it uses software called "socket services" and "card services," but these have the same effect as Plug and Play. The system dynamically assigns I/O addresses and IRQ values to the cards during boot processing or when the cards are inserted. Windows NT and OS/2 both support a large number of PCMCIA cards even though they do not support Plug and Play.

The ISA Bus is SLOW!

The CPU uses the OUT instruction to send data or commands to an I/O device, and it uses the IN instruction to read data or status from the device. These instructions cause the address of the I/O device to be placed on the bus, and they flag one of the housekeeping wires in the bus to indicate that this is an I/O address and not a memory address.

Each device is configured to respond to a range of addresses (the "ports"). Generally, a device will respond to a range of eight addresses. For example, the COM1 port responds to addresses 03F8 to 03FF. When the device sees an I/O address on the bus that matches a value in its range, it responds.

A device uses each address to process a different type of command or generate a different type of status. COM1, for example, uses the first address of 03F8 to handle all the data. The remaining addresses are used to configure the line (speed, parity), to control the phone (hang-up, begin), to check modem status, and perform other housekeeping.
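The address matching a device performs can be sketched as a range check. The base address and range of eight come from the COM1 example above; the function name `decode` is made up, and real cards do this comparison in hardware rather than software.

```python
COM1_BASE = 0x3F8   # first address of the COM1 range
COM1_SIZE = 8       # COM1 answers to eight consecutive addresses


def decode(address, base=COM1_BASE, size=COM1_SIZE):
    """Return the register offset if the address is in range, else None."""
    if base <= address < base + size:
        return address - base
    return None


print(decode(0x3F8))   # offset 0: the data register
print(decode(0x3FF))   # offset 7: last address in the range
print(decode(0x2F8))   # None: some other device's address, COM1 stays silent
```

The offset that comes out of the check is what selects which command or status register inside the device is being addressed.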

Unfortunately, it is still possible to plug a card built in 1984 into a modern PC. The I/O bus cannot know how fast a device is able to operate, so it handles the worst case and slows everything down to match the speeds used ten years ago. The chip set on the main board generates a long stream of Wait State signals, forcing the CPU to wait for 250 nanoseconds. The device itself can respond to request more Wait States to give itself even longer to respond. All this time, the CPU is stopped dead waiting for the IN or OUT instruction to end.

Running with the Brake

A department is having performance problems with a LAN database server running on a workstation. They call in a performance specialist who normally works with mainframes and minicomputers.

The CPU is running 100% busy. Since the machine has a socket for upgrading the CPU, the specialist suggests buying a Pentium chip and replacing the old 33 MHz 486. In theory, the system would then be 10 times faster.

They spend $600, get the new chip, plug it in, and the server is instantly 100% busy. Worse, it is not actually getting any more work done. What went wrong?

A mainframe or minicomputer has an I/O bus which is separate from the CPU. However, on an ISA bus the two are logically interconnected. An I/O device such as a disk controller or LAN adapter runs at some speed and inserts enough wait states into the CPU bus to slow the processor down to match the device speed.

Every time the Pentium tries to read or write data from a slow adapter card, the I/O device adds Wait States to the bus to halt the CPU until the device can complete its operation. Unlike the mainframe definition of "Wait," a PC Wait State causes the CPU to appear busy. However, the operation proceeds at the speed of the device and not that of the CPU. Upgrading the CPU chip did not make the device any faster, so the system still appears to be busy all the time. The correct response would have been to upgrade the old dumb slow devices.
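A little arithmetic shows why the new chip bought so little. The figures here are invented purely for illustration (10 microseconds of real computation and 90 microseconds stalled in Wait States per request); only the shape of the result matters.

```python
def request_time_us(cpu_us, device_wait_us, cpu_speedup=1.0):
    """Time for one request: computation shrinks with a faster CPU,
    but time spent stalled on the device does not."""
    return cpu_us / cpu_speedup + device_wait_us


before = request_time_us(cpu_us=10, device_wait_us=90)
after = request_time_us(cpu_us=10, device_wait_us=90, cpu_speedup=10)
speedup = before / after

print(f"before: {before:.0f} us, after: {after:.0f} us, speedup: {speedup:.2f}x")
```

With these made-up numbers a CPU that is ten times faster improves the server by less than ten percent, and since the processor is stalled rather than idle during the wait, it still reports 100% busy.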

The Bleeding Edge

A modern PC will support almost any adapter card ever built for any earlier model of machine. However, just because it runs doesn't mean that you should use it. This is especially true of old LAN adapters. Pull out an old 3C503 Ethernet card or an IBM Token Ring card and you will frequently find that the card plugs into only one socket of the ISA bus instead of two sockets. This card has an "eight bit edge" that transfers one byte of data per bus cycle. Such cards may be reused in lightly used desktop systems, but they should never be used in a Server or in a higher performance machine.

There are also some eight bit interfaces on old CD-ROM interface cards. Vendors even built eight bit SCSI adapters. Such cards should not be used in a Server, or any machine where performance is critical. Even if the device connected to the card is not important, the use of the card will slow down the machine for everyone. The ISA bus needs 375 nanoseconds to read any data from any device. If the LAN adapter presents only one byte, then the CPU has to wait another 375 nanoseconds to get the second byte of data. In contrast, the PCI bus can deliver four bytes of data in 30 nanoseconds.
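To put those numbers side by side, the following sketch computes how long the CPU is tied up moving a block of data through each kind of interface. The 375 and 30 nanosecond figures come from the paragraph above; the 1024-byte block size and the function name are arbitrary choices for the example.

```python
ISA_READ_NS = 375   # one ISA bus read, whatever the card's width
PCI_TICK_NS = 30    # one PCI clock tick


def transfer_time_ns(total_bytes, bytes_per_cycle, ns_per_cycle):
    """Time to move a block, rounding up to whole bus cycles."""
    cycles = -(-total_bytes // bytes_per_cycle)   # ceiling division
    return cycles * ns_per_cycle


BLOCK = 1024
isa_8bit = transfer_time_ns(BLOCK, 1, ISA_READ_NS)
isa_16bit = transfer_time_ns(BLOCK, 2, ISA_READ_NS)
pci = transfer_time_ns(BLOCK, 4, PCI_TICK_NS)

print(f"8-bit ISA:  {isa_8bit / 1000:.1f} microseconds")
print(f"16-bit ISA: {isa_16bit / 1000:.1f} microseconds")
print(f"PCI:        {pci / 1000:.1f} microseconds")
```

On these figures the eight bit card holds the bus about fifty times longer than a PCI card would for the same data.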

Direct Memory Access

All PC devices support Programmed I/O (PIO). The operating system executes IN and OUT instructions to read or write data one, two, or four bytes at a time to the device. For simple devices like the keyboard or display this may be perfectly reasonable. DOS and Windows 3.x do not support multitasking, so the CPU isn't doing anything useful while any device is busy.

A smarter device can transfer data directly to memory without the use of the CPU. This is called Direct Memory Access or DMA. Some DMA capability was designed into the first IBM PC. However, DMA dropped out of favor during the 1980's. The problem is that the original DMA design required additional bus cycles, and was therefore slower than Programmed I/O. As long as DOS was the primary operating system, it was better to disable or bypass DMA and let the CPU do the work.

One special case was the Multimedia Sound Card. A DOS or Windows game stores data representing some music or speech in a buffer. It then passes the sound card the address of the data and its length. The card uses DMA to fetch bytes of sound and play them while the CPU proceeds to show video or run the game. When the buffer is used up, the sound card generates an interrupt and the program provides a new buffer of data.

There is a set of DMA controller chips on the Mainboard. Each DMA circuit has a level number and can support one device. When installing new adapters, it is important to avoid conflicts for the DMA number just as one avoids conflicts for the I/O address or interrupt level.

A program stores into the DMA circuit a starting memory buffer address and length. When the device is ready for more data, it uses one bus cycle to send a request to the DMA chip. The chip then substitutes for the CPU in generating the next buffer address to the memory circuits to fetch the next chunk of data for the device. The CPU can be running other programs. However, that first signal from the device to the DMA chip takes one more bus cycle than ordinary Programmed I/O. Thus DMA has not been attractive for disk, LAN, and other performance critical I/O.
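The bookkeeping a DMA circuit does can be modeled in a few lines. This is a simplified sketch, not the register layout of any real DMA controller: the `DmaChannel` class and its method names are invented, and memory is represented by a Python `bytearray`.

```python
class DmaChannel:
    """Toy model of one DMA circuit: a current address and a remaining count."""

    def __init__(self, memory):
        self.memory = memory
        self.address = 0
        self.remaining = 0

    def program(self, address, length):
        """What the CPU does once: store the buffer address and length."""
        self.address = address
        self.remaining = length

    def request(self):
        """What happens on each device request: fetch the next byte and
        advance. Returns None when the buffer is used up."""
        if self.remaining == 0:
            return None
        value = self.memory[self.address]
        self.address += 1
        self.remaining -= 1
        return value


memory = bytearray(b"....hello....")
channel = DmaChannel(memory)
channel.program(address=4, length=5)

# The device keeps asking until the channel reports the buffer is exhausted.
played = bytes(iter(channel.request, None))
print(played)
```

The CPU is involved only in the call to `program`. Every later fetch is driven by the device and the DMA circuit, which is what leaves the processor free to run other programs.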

Busmaster

On a Microchannel, EISA, or PCI machine, the I/O bus presents a full set of 32 address wires to every adapter card. The adapter can then generate a memory address to reference any location in RAM. A Busmaster adapter card has its own Direct Memory Access control chip on the adapter board. That chip can generate a sequence of memory addresses to read data from memory to the adapter, or to write data from the adapter to memory.

Since DMA and I/O functions are combined on the same card, they can be tightly coordinated. A Busmaster EISA device can transfer data every other cycle, and PCI Busmaster cards can move data in every I/O cycle.

The software on the PC builds a high level request to READ or WRITE an entire buffer of data. It passes the address and size of the buffer to the Busmaster card. Then the PC software does something else.

Internally the Busmaster card moves data between itself and the buffer in memory. When the entire buffer has been processed, it generates an interrupt to the CPU indicating that the request is complete.
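The division of labor might be sketched like this. Everything in the example is hypothetical: the `BusmasterCard` class, its method names, and the use of an ordinary callback to stand in for the completion interrupt. A real card works concurrently with the CPU, which the explicit `run` call here only imitates.

```python
class BusmasterCard:
    """Toy Busmaster adapter: accepts whole-buffer requests, works through
    them on its own, and signals completion for each one."""

    def __init__(self, memory):
        self.memory = memory
        self.queue = []

    def submit_write(self, address, length, on_complete):
        """PC software hands over a buffer address and size, then moves on."""
        self.queue.append((address, length, on_complete))

    def run(self):
        """Stand-in for the card working in the background."""
        sent = bytearray()
        for address, length, on_complete in self.queue:
            sent += self.memory[address:address + length]
            on_complete(length)   # stands in for the completion interrupt
        self.queue.clear()
        return bytes(sent)


memory = bytearray(b"HEADpayload")
card = BusmasterCard(memory)
completed = []

card.submit_write(0, 4, completed.append)
card.submit_write(4, 7, completed.append)
# ... the PC software is free to do something else here ...
wire_data = card.run()
print(wire_data, completed)
```

The software issues two high level requests and is interrupted twice, once per buffer, rather than once per byte or word as with Programmed I/O.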

In order to make effective use of a Busmaster adapter you need two things. First, you need to have something else to do while the adapter performs the I/O. Secondly, you need an operating system that will allow I/O to run in the background. Generally Busmaster adapters make sense only on Server machines running a multitasking operating system (Windows NT, OS/2, UNIX, or Netware).


Copyright 1995 PCLT -- Introduction to PC Hardware -- H. Gilbert
