Basic Concept of Intermittent Fault Detection
Intermittent Fault Detection (IFD™) using the IFD™ TE is a highly effective testing approach which, when applied methodically and correctly, provides outstanding results for the User in detecting and isolating intermittent events. The better the User understands the concept of IFD™ Technology and, in particular, the causes of intermittent faults in electrical and electronic equipment, the easier it will be to apply the IFD™ TE effectively. The purpose of this section is therefore to provide the User with the relevant background knowledge to exploit the extensive capabilities of the IFD™ TE to the full.
Intermittency Characteristics & Causes
A system with theoretically perfect Integrity would perform uninterrupted and as intended for the entirety of its service life. In theory, electronic systems should not wear out with use, and yet they obviously do. A short-duration deviation from the normal operating conditions of a system is known as intermittency, a well-documented phenomenon in electrical and electronic circuitry. Intermittency has been shown to be influenced by mechanical stress (fretting corrosion, for example), which leads to transient variations in degraded contacts. These intermittent events can last for mere nanoseconds, but this contact intermittency can be enough to result in system failure or loss of information. Not only are these intermittent events extremely short in duration, they are also, by definition, random. With the probability of detecting a random, nanosecond-duration root cause event being marginal at best, the temptation to replace a component speculatively, in the hope of removing the fault’s root cause from the system, is great. Replacing the component, however, changes the electrical contact characteristics of the system but leaves the susceptible components, such as cables and connectors, unchanged. Connectors in particular cannot be permanently sealed, so they are susceptible to corrosion and debris ingress; they also experience wear in use and as a consequence of maintenance.
Therefore, just like machinery, these particular components will wear out gradually. Over time, if left undetected, the physical mechanisms that cause contact intermittency, and which precipitate the fault’s root cause, will worsen as a consequence of ageing, usage, environmental factors and maintenance factors. Given the mechanical character of this failure mechanism, it is found extensively in ‘the 3Cs’: Connectors, Cables and Chassis. These 3Cs are the interfaces within systems and circuits, and they experience micro-changes in circuit characteristics due to operating environments (vibration, debris ingress etc.) and also, but not limited to, the aforementioned ageing effects such as fretting corrosion. These minute changes in circuit behaviour are enough to cause systems to fail randomly, either temporarily or for the remainder of the duty cycle; without intervention, they may then reset themselves. The intermittent events will become greater in duration and amplitude until the root cause is detected and diagnosed and/or the fault becomes permanent: a ‘hard’ fault. Given the massive variation possible in the degrading factors mentioned, the evolution of the fault’s root cause from initial intermittency to hard fault could take place over a life-cycle ranging from seconds to years. The longer and more gradual this fault degradation life-cycle, the harder the fault is to detect. An extremely common outcome is the inability to detect the root cause of the intermittent fault and/or to duplicate the previously observed failure: in other words, ‘No Fault Found’ (NFF), otherwise known as ‘Fault Not Found’ (FNF).
No Fault Found: The Testing Problem
By their very definition, intermittent faults and their symptoms occur randomly in time, place, amplitude and duration. Conventional test equipment that functions adequately for detecting hard faults is extremely limited when applied to detecting and isolating intermittent problems because of the ‘stimulate-measure-compare one circuit at a time’ approach that most use. Intermittent fault events can therefore be missed altogether by the conventional testing time window (illustrated below) imposed by digital sampling rates. This testing blind spot, which is compounded further by digital averaging of results, means that conventional equipment does not provide effective test coverage for detecting intermittent faults, and so technicians often resort to ‘shotgun maintenance’ and speculative component replacements. The outcome is an inability to detect the fault root cause, leading to NFFs, repeat arisings and repeat maintenance.
An alternative school of thought considers that NFF/intermittency problems can be addressed by using traditional measurements, such as tracking circuit values down to fractions of a milliohm, one circuit at a time, and comparing them against long-running records of similar measurements. However, there are major limitations to this approach. When an intermittent circuit is in a temporary 'working' state it will generally pass such tests, and only those circuits approaching hard-failure status will be detected this way. Also, measuring ‘fractions of a milliohm’ and attempting to take meaningful action based on these values is extremely difficult and time-consuming, and requires precise control of the test set-up and test environment. These approaches are not effective or pragmatic enough for the majority of intermittency testing scenarios.
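The scanning blind spot described in this section can be put into rough numbers. The sketch below is illustrative only: the function name, the round-robin scanning model and the example figures (256 circuits, 1 ms per circuit, one 100 ns event) are assumptions, not values taken from any particular tester. It also assumes, generously, that any overlap between the event and the moment the tester is connected to the affected circuit counts as a detection.

```python
def scan_detection_probability(n_circuits: int, dwell_s: float, event_s: float) -> float:
    """Upper bound on the chance that a one-circuit-at-a-time scanner
    is connected to the affected circuit during a single random event.

    Assumed model: the scanner visits each circuit in turn for dwell_s
    seconds, and the event starts at a uniformly random time.
    """
    if n_circuits < 1 or dwell_s <= 0 or event_s < 0:
        raise ValueError("need n_circuits >= 1, dwell_s > 0, event_s >= 0")
    cycle_s = n_circuits * dwell_s  # time to visit every circuit once
    # The event overlaps the dwell window if it starts up to event_s
    # before the window opens, or at any time while the window is open.
    return min(1.0, (dwell_s + event_s) / cycle_s)


# Hypothetical figures, chosen only to show the order of magnitude.
p_scan = scan_detection_probability(n_circuits=256, dwell_s=1e-3, event_s=100e-9)
print(f"Scanning tester, single event: {p_scan:.4%}")
```

With these assumed figures the scanner is on the right circuit for a little under 0.4% of events, roughly 1 in 256, before sampling rate and averaging within the dwell window reduce the chance further.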
The Testing Solution
Overcoming the aforementioned testing challenges posed by NFF/intermittency problems requires a different approach to that of using conventional digital equipment predicated on accuracy of measurements and time-consuming results analysis. Truly effective and practical detection of intermittency requires improved test coverage and, consequently, a vastly improved probability of detection. The IFD™ TE is designed specifically to address this problem:
· Its simultaneous analogue sensing technology is able to detect extremely low amplitude, high-speed (nanosecond) impedance changes.
· Its neural-network architecture monitors all of the potential failure points simultaneously, in parallel.
· Its digital processing provides for fast, precise data handling and generates custom fault-graphics to help quickly isolate the failing source to its root cause and location.
In short, the IFD™ Technology has been designed to test all of the test points all of the time, in a simultaneous and continuous manner. This means that the overall test coverage is several orders of magnitude more effective than conventional test technologies at detecting intermittency. When an intermittent event occurs, it is detected.
Neural Networks
The artificial neural network, which is the key to the IFD™ Technology, is used to resolve what is, as outlined previously, one of the most prevalent and difficult testing problems in the electronics industry. The theory of operation of the IFD™ Technology is based on a hardware neural network whose origins can be traced to concepts normally associated with biological systems. The way this network functions is best described by analogy with the human brain and the way it monitors the body’s nervous system.
Once you have put your shoes on in the morning, your brain does not wish to be reminded every second of the day that your shoe is still on your foot. All you really care about is when the situation changes. You may wish to know when the laces untie, when a stone gets inside the shoe or when some other change from the steady state occurs. In accomplishing this task the brain is functioning using what has been described as ‘parallel distributed processing’. Biologically, this is accomplished using what are termed neural systems or networks, with the human neural system monitoring all sensors (nerves) in all parts of the body, continuously. It is not using on-off digital sampling techniques, which would take far too much processing time. Rather, it is conducting more efficient continuous, analogue monitoring of all sensors all of the time. However, it is only interested in change, or the differential information.
The ability of the analogue and parallel neural network to monitor all points continuously is very important because any change in an electronic circuit may occur randomly at any point, at any time. Therefore, predicting when, where or at what severity the event will occur is impossible. This is why the scanning, sampling or interval-related testing and measurement methodologies currently employed in virtually all digital-based test equipment cannot reliably and efficiently detect random intermittent faults.
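The ‘only interested in change’ principle from the shoe analogy can be illustrated in software. The sketch below is an analogy only: the IFD™ hardware is described as analogue and continuous, whereas this example works on discrete frames of readings and so does not model the hardware itself. The function name, the frame layout and the threshold are all hypothetical.

```python
from typing import Iterable, Iterator, Sequence, Tuple


def change_events(
    frames: Iterable[Sequence[float]], threshold: float
) -> Iterator[Tuple[int, int, float]]:
    """Yield (frame_index, channel, delta) for every channel whose reading
    moves by more than `threshold` since the previous frame.

    Every channel is examined in every frame; steady readings yield nothing.
    """
    previous = None
    for index, frame in enumerate(frames):
        if previous is not None:
            for channel, (old, new) in enumerate(zip(previous, frame)):
                delta = new - old
                if abs(delta) > threshold:
                    yield index, channel, delta
        previous = frame


# Three channels; channel 1 glitches in frame 2 and recovers in frame 3.
frames = [
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 1.0],
    [1.0, 1.0, 1.0],
]
for event in change_events(frames, threshold=0.5):
    print(event)
```

Only the two transitions on channel 1 are reported; the steady channels and steady frames produce no output, which mirrors the brain ignoring the shoe until something changes.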
A Unique Set of Test Capabilities
This chapter has outlined how intermittency, which causes the major proportion of NFF problems, can be detected with an immensely higher probability if analogue neural network technology is used instead of conventional digital equipment. That capability is embodied in the IFD™ TE series. However, whilst intermittent fault detection is the outstanding feature of this equipment, the IFD™ TE has a multitude of functions available to the User for testing and analysing circuits.
Its internal switching matrix can divert the selected circuit(s) to a traditional equipment measurement bus to conduct standard continuity, shorts and miswire testing, for example; additional instrumentation can also be used, depending on the system's options. This dynamic switching ability also enables automated connection mapping of Units Under Test (UUT). UUT degradation can be characterised and trended proactively and reactively using ‘Gold Standards’. In short, the IFD™ TE provides the best of all the testing worlds: full intermittency test coverage, plus all the traditional and effective point to point (P2P) testing capabilities to which the User will already be fully accustomed.
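As a rough illustration of how a mapped UUT might be checked against a ‘Gold Standard’, the sketch below compares two lists of point to point connections. It assumes that a Gold Standard can be represented as a simple list of known-good pin pairs; the actual data format used by the IFD™ TE is not described here, and the function names and pin labels are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Link = FrozenSet[str]


def to_links(pairs: Iterable[Tuple[str, str]]) -> Set[Link]:
    """Normalise (a, b) pin pairs so that A-B and B-A are the same link."""
    return {frozenset(pair) for pair in pairs}


def compare_to_gold(
    gold: Iterable[Tuple[str, str]], measured: Iterable[Tuple[str, str]]
) -> Dict[str, List[List[str]]]:
    """Report links missing from the UUT (possible opens) and links that
    are present but not expected (possible shorts or miswires)."""
    gold_links, measured_links = to_links(gold), to_links(measured)
    return {
        "missing": sorted(sorted(link) for link in gold_links - measured_links),
        "unexpected": sorted(sorted(link) for link in measured_links - gold_links),
    }


# Hypothetical harness: J1-2 has been wired to P1-3 instead of P1-2.
gold = [("J1-1", "P1-1"), ("J1-2", "P1-2"), ("J1-3", "P1-3")]
measured = [("P1-1", "J1-1"), ("J1-2", "P1-3")]
print(compare_to_gold(gold, measured))
```

Set differences keep the comparison independent of the order in which connections were mapped and of which end of a link was recorded first.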