By John Rynearson, Technical Director, VITA
Live insertion usually means that an electronic module can be removed from a system and then reinserted while the system remains under power. Another popular term for "live insertion" is "hot swap". The assumption is that removing and reinserting the module will cause no electrical harm to the system. Why is this capability desirable? Systems are usually powered down for either repair or reconfiguration. However, some systems are configured in such a manner that it is not acceptable to power down all or part of the system. Telecommunications systems fall into this category. Other systems may control industrial equipment such as robots or sorters that must remain powered up. Whatever the reason, live insertion is becoming a desirable feature for many systems.
Live insertion is really a system design issue. What are the application requirements that make live insertion a desired feature? Including live insertion support in a system costs money. If your application doesn't require live insertion, then save money and don't put it in. If your application does require live insertion, then a careful analysis must follow.
First, must your system continue to function during live insertion? If the only requirement is that the system remain powered, and its functions may cease temporarily, then a simple method for "safing" the software and the hardware that it controls is all that is required.
On the other hand, if the system must continue to function, then a careful analysis of the system at a modular level is required. Each module must be examined to determine how its possible failure and replacement can be handled.
A conventional computer system will have many single points of failure. That is, a failure in a specific component will cause the entire system to cease functioning. Systems that fall into this category cannot really maintain functionality when a component fails. At the other end of the spectrum are fault tolerant systems. A system is fault tolerant if it has no single point of failure. Fault tolerant systems are constructed with redundant components so that if one component fails, another is available to take its place. However, fault tolerant systems are expensive and may not always be necessary. A less expensive approach is a "high availability" system. High availability systems don't meet the definition of fault tolerant systems in that they may have one or more single points of failure. However, they can be used in applications where certain functions need to be redundant and others don't.
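The distinction above can be illustrated with a small Python sketch. It is only a rough model, not part of any VITA standard: the function names, the example inventory, and the rule that "fewer than two components backing a function" counts as a single point of failure are all assumptions made for illustration.

```python
def single_points_of_failure(redundancy):
    """Return the functions backed by fewer than two components.

    `redundancy` maps a (hypothetical) function name to the number of
    interchangeable components that can provide it.
    """
    return sorted(name for name, count in redundancy.items() if count < 2)


def classify(redundancy):
    """Rough classification following the definitions in the text."""
    spofs = single_points_of_failure(redundancy)
    if not spofs:
        # No single point of failure anywhere in the system.
        return "fault tolerant"
    if len(spofs) < len(redundancy):
        # Some functions are redundant, others are not.
        return "high availability"
    # Nothing is redundant: any component failure stops the system.
    return "conventional"


# Hypothetical inventory: two processor boards, two power supplies,
# but only one backplane.
system = {"processor": 2, "power": 2, "backplane": 1}
print(single_points_of_failure(system))
print(classify(system))
```

An inventory like this is a useful starting point for the module-level analysis, because each entry it flags is a component whose failure and replacement has to be handled explicitly or accepted as downtime.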
Is the purpose of supporting live insertion for repair, reconfiguration, or both? Supporting live insertion for repair assumes that when a module goes bad the system can detect the failure, take the affected module off-line, and notify the system operator. When the module is replaced the system must detect the replacement, qualify the new module for system operation, and then bring the module smoothly back on-line. Reconfiguration, on the other hand, can be done as either replacement, enhancement, or both. For example, a trivial case would be to install additional memory into a system. When the module is installed, the system must be notified that a new resource has been added. The system must then qualify the new resource and incorporate it into system operation. On the other hand, imagine a multiprocessing system that is to be upgraded with new enhanced processor boards. If the system is dynamically reconfigurable, then the older module can be removed and replaced with a newer module with perhaps more memory, a faster processor, or additional I/O ports.
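The repair cycle described above (detect the failure, take the module off-line, notify the operator, then detect, qualify, and bring the replacement on-line) can be sketched as a small state machine. The Python below is a minimal illustration under stated assumptions: the state names, event names, and the `Slot` class are hypothetical and are not taken from VITA 1.4.

```python
from enum import Enum, auto


class ModuleState(Enum):
    ONLINE = auto()
    FAULT_DETECTED = auto()
    OFFLINE = auto()
    REMOVED = auto()
    INSERTED = auto()
    QUALIFIED = auto()


# (current state, event) -> next state. Event names are illustrative only.
TRANSITIONS = {
    (ModuleState.ONLINE, "fault"): ModuleState.FAULT_DETECTED,
    (ModuleState.FAULT_DETECTED, "take_offline"): ModuleState.OFFLINE,
    (ModuleState.OFFLINE, "extract"): ModuleState.REMOVED,
    (ModuleState.REMOVED, "insert"): ModuleState.INSERTED,
    (ModuleState.INSERTED, "qualify_ok"): ModuleState.QUALIFIED,
    (ModuleState.INSERTED, "qualify_fail"): ModuleState.OFFLINE,
    (ModuleState.QUALIFIED, "bring_online"): ModuleState.ONLINE,
}


class Slot:
    """Tracks the module in one backplane slot (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.state = ModuleState.ONLINE
        self.log = []

    def handle(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not allowed in state {self.state.name}")
        self.state = TRANSITIONS[key]
        if event == "take_offline":
            # The text requires the operator to be told about the failure.
            self.log.append(f"operator notified: {self.name} is off-line")
        return self.state


slot = Slot("cpu-3")
for event in ["fault", "take_offline", "extract", "insert", "qualify_ok", "bring_online"]:
    slot.handle(event)
print(slot.state.name, slot.log)
```

Rejecting any event that is not in the table is a deliberate choice in this sketch: a board that is pulled before it has been taken off-line, for example, shows up as an error instead of being silently accepted.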
The VITA 1.4 standard for VME64x live insertion provides a basic framework for supporting live insertion. For example, the sequence that a board must go through when it is removed is shown in the list below.
Typical Board De-allocation Process
A similar sequence is involved when a module is reinserted.
Live insertion, or "hot swap", is becoming a desired feature on many new embedded systems. The spectrum of live insertability ranges from simple systems that require only that power remain on to more complex systems that must continue to function. Added to this may be the requirement to carry out failure detection on a dynamic basis. On the surface, live insertion may seem like just a board extraction/insertion issue, but it is not. A comprehensive system design must be done if live insertion is to operate properly. VITA 1.4, VME64x Live Insertion System Requirements, provides a basic framework for live insertion in VME-based systems. It is currently in task group ballot within the VITA Standards Organization (VSO) and should be submitted to ANSI canvass later this year.
This FAQ page last updated: Sep 15, 1999
Reprinted from the VITA Journal with permission from VITA.