VMEbus Live Insertion (Hot Swap)

By John Rynearson, Technical Director, VITA
July 1998

Question: Can VMEbus support live insertion (hot swap) systems?

Live insertion usually means that an electronic module can be removed and then reinserted into a system while the system remains under power. Another popular term for "live insertion" is "hot swap". The assumption is that removal and reinsertion of the module will cause no electrical harm to the system. Why is this capability desirable? Systems are usually powered down for either repair or reconfiguration. However, some systems are configured in such a manner that it is not acceptable to power down all or part of the system. Telecommunications systems fall into this category. Other systems may control industrial equipment, such as robots and sorters, that must remain powered up. Whatever the reason, live insertion is becoming a desirable feature for many systems.

A Design Issue

Live insertion is really a system design issue. What are the application requirements that make live insertion a desired feature? Including live insertion support in a system costs money. If your application doesn't require live insertion, then save money and don't put it in. If your application does require live insertion, then a careful analysis must follow.

System Operation

First, must your system continue to function during live insertion? If the only requirement is that the system remain powered, and its functions may cease temporarily, then a simple method for "safing" the software and the hardware that it controls is all that is required.

On the other hand, if the system must continue to function, then a careful analysis of the system at a modular level is required. Each module must be examined to determine how its possible failure and replacement can be handled.

Conventional, High Availability, and Fault Tolerance

A conventional computer system will have many single points of failure. That is, a failure in a specific component will cause the entire system to cease functioning. Systems that fall into this category cannot really maintain functionality when a component fails. At the other end of the spectrum are fault tolerant systems. A system is fault tolerant if it has no single point of failure. Fault tolerant systems are constructed with redundant components so that if one component fails another is available to take its place. However, fault tolerant systems are expensive and may not always be necessary. A less expensive approach is a "high availability" system. High availability systems don't meet the definition of fault tolerant systems in that they may have one or more single points of failure. However, they can be used in applications where certain functions need to be redundant and others don't.
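The three categories above can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which each system function is provided by some number of interchangeable modules; the Function type, its instances field, and the classify helper are hypothetical names introduced only for this illustration and are not part of any VITA standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    """One system function and how many interchangeable modules provide it.

    Hypothetical model for illustration only.
    """
    name: str
    instances: int


def single_points_of_failure(functions):
    """Return the names of functions that have no redundant module."""
    return [f.name for f in functions if f.instances < 2]


def classify(functions):
    """Classify a system by its single points of failure (simplified)."""
    if not functions:
        raise ValueError("a system needs at least one function")
    spofs = single_points_of_failure(functions)
    if not spofs:
        # No single point of failure anywhere.
        return "fault tolerant"
    if len(spofs) < len(functions):
        # Some functions are redundant, others are not.
        return "high availability"
    # Nothing is redundant.
    return "conventional"


if __name__ == "__main__":
    system = [Function("processor", 2), Function("power", 2), Function("I/O", 1)]
    print(classify(system), single_points_of_failure(system))
```

A real analysis would also have to consider shared resources such as the backplane, cooling, and software, which this sketch ignores.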

Repair, Reconfigure, or Both

Is the purpose of supporting live insertion for repair, reconfiguration, or both? Supporting live insertion for repair assumes that when a module goes bad the system can detect the failure, take the affected module off-line, and notify the system operator. When the module is replaced the system must detect the replacement, qualify the new module for system operation, and then bring the module smoothly back on-line. Reconfiguration, on the other hand, can be done as either replacement, enhancement, or both. For example, a trivial case would be to install additional memory into a system. When the module is installed, the system must be notified that a new resource has been added. The system must then qualify the new resource and incorporate it into system operation. On the other hand, imagine a multiprocessing system that is to be upgraded with new enhanced processor boards. If the system is dynamically reconfigurable, then the older module can be removed and replaced with a newer module with perhaps more memory, a faster processor, or additional I/O ports.
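The repair and reconfiguration steps described above can be sketched as a small per-slot state machine. This is only one possible way to organize system software; the state names, event names, and the Slot class are assumptions made for this illustration and do not come from VITA 1.4.

```python
from enum import Enum, auto


class SlotState(Enum):
    ONLINE = auto()      # module is part of normal system operation
    FAILED = auto()      # failure detected, module not yet isolated
    OFFLINE = auto()     # module isolated, waiting for the operator
    EMPTY = auto()       # no module in the slot
    QUALIFYING = auto()  # new module present, being tested


# Allowed (state, event) -> next state transitions (hypothetical event names).
TRANSITIONS = {
    (SlotState.ONLINE, "failure_detected"): SlotState.FAILED,
    (SlotState.FAILED, "taken_offline"): SlotState.OFFLINE,
    (SlotState.OFFLINE, "module_removed"): SlotState.EMPTY,
    (SlotState.EMPTY, "module_inserted"): SlotState.QUALIFYING,
    (SlotState.QUALIFYING, "qualified"): SlotState.ONLINE,
    (SlotState.QUALIFYING, "rejected"): SlotState.OFFLINE,
}


class Slot:
    def __init__(self, state=SlotState.ONLINE):
        self.state = state
        self.notices = []  # messages that would be sent to the system operator

    def handle(self, event):
        """Apply one event, rejecting anything out of sequence."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not valid in state {self.state.name}")
        self.state = TRANSITIONS[key]
        if self.state is SlotState.OFFLINE:
            self.notices.append("module off-line: replacement required")
        return self.state


if __name__ == "__main__":
    slot = Slot()
    for event in ("failure_detected", "taken_offline", "module_removed",
                  "module_inserted", "qualified"):
        print(event, "->", slot.handle(event).name)
```

Adding a new resource, such as the extra memory in the example above, follows the same path starting from an EMPTY slot: the module is inserted, qualified, and only then brought on-line.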

VITA 1.4, VME64x Live Insertion System Requirements

The VITA 1.4 standard for VME64x live insertion provides a basic framework for supporting live insertion. For example, the sequence that a board must go through when it is removed is shown in the list below.

Typical Board De-allocation Process

System administration software disables new connections to board's device driver.
System waits for all connections to terminate or forces existing connections to terminate.
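The two de-allocation steps above can be sketched as follows. This is a minimal illustration, not the procedure defined by VITA 1.4: the BoardDriver class, the deallocate function, and the timeout and polling parameters are all hypothetical, and a real driver would track connections through its operating system's own interfaces.

```python
import time


class BoardDriver:
    """Hypothetical stand-in for a board's device driver."""

    def __init__(self):
        self.accepting = True
        self.connections = set()

    def open(self, conn_id):
        if not self.accepting:
            raise RuntimeError("board is being de-allocated")
        self.connections.add(conn_id)

    def close(self, conn_id):
        self.connections.discard(conn_id)


def deallocate(driver, timeout_s=5.0, poll_s=0.05):
    """Stop new connections, wait for existing ones, then force the rest.

    Returns the connections that had to be terminated by force.
    """
    # Step 1: disable new connections to the board's device driver.
    driver.accepting = False

    # Step 2: wait for connections to terminate on their own, up to a deadline.
    deadline = time.monotonic() + timeout_s
    while driver.connections and time.monotonic() < deadline:
        time.sleep(poll_s)

    # Anything still open after the deadline is terminated by force.
    forced = sorted(driver.connections)
    for conn_id in forced:
        driver.close(conn_id)
    return forced


if __name__ == "__main__":
    driver = BoardDriver()
    driver.open("session-1")
    print("forced:", deallocate(driver, timeout_s=0.2))
```

Whether to wait or to force termination immediately is an application decision; the timeout simply makes that choice explicit.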

Extraction Process

BREQ* and INTn* are inhibited.
Board drives LI/O* high. Daisy-chain switch on backplane is enabled.
VMEbus interface ASICs and ETL transceivers are disabled.
Board waits for LI/I* to go high or ejector handle switch to open.
Craft Person grasps ejector handle and is discharged through the front panel to FRAME ground.
Craft Person unseats the ejector handles and the ejector handle switches open.
Board's logic power is removed.
Blue LED is turned ON.
VMEbus and primary power pins break contact with the backplane.
Ground and Vpc pins break contact with the backplane.
Board front panel low resistance contact with FRAME ground disconnects.
Bleed resistor B makes contact with FRAME ground.
Bleed resistor A makes contact with FRAME ground.
Board is removed from the guide rail and breaks contact with the ESD contact.

A similar sequence is involved when a module is reinserted.

Summary

Live insertion, or "hot swap", is becoming a desired feature on many new embedded systems. The spectrum of live insertability ranges from simple systems that require only that power remain on to more complex systems that must continue to function. Added to this may be the requirement to carry out failure detection on a dynamic basis. On the surface, live insertion may seem like just a board extraction/insertion issue, but it is not. A comprehensive system design must be done if live insertion is to operate properly. VITA 1.4, VME64x Live Insertion System Requirements, provides a basic framework for live insertion in VME-based systems. It is currently in task group ballot within the VITA Standards Organization (VSO) and should be submitted to ANSI canvass later this year.

This FAQ page last updated: Sep 15, 1999

Reprinted from the VITA Journal with permission from VITA.