Causes of UPS failure: 10 common man-made faults

Time of issue: 2021-07-23



The reliability of the data center power supply system is of paramount importance. However sophisticated the IT equipment, however capable the system and however high its reliability, none of it works once the power goes out. Equipment maintenance during operation therefore cannot be neglected, and the maintenance staff carry a heavy responsibility.

Many sites have drawn up sound measures to keep the power supply system running reliably, yet many loopholes remain. The reliability of a piece of equipment is fixed once it leaves the factory, and some equipment has inborn defects. For example, the windings of some output isolation transformers use enameled aluminum wire instead of enameled copper wire, and such a unit will in all likelihood fail when run at full load. According to statistics, however, fewer than 30% of failures are caused by the quality of the equipment itself; 70% are man-made. They show up in the following ways:

1. Failure caused by improper selection

(1) The basic concepts are unclear, and the buyer is easily misled by manufacturers. For example, a highway project's UPS tender required the UPS to keep supplying power without discharging the battery after one or two input phases are lost. The requirement came from manufacturer advertising: when one input phase is lost, the battery does not discharge and the UPS still has 50% of its supply capacity; when two input phases are lost, the battery still does not discharge and the UPS still has 25% of its supply capacity, which extends the battery's service life. Users think this is a good feature, but a little thought reveals its shortcoming: to benefit from it, you must buy a UPS with 4 times the load capacity, otherwise the existing load can no longer be carried once a phase is lost. Then again, what if two of the wires behind the UPS input switch are broken? Should they be repaired or not? When? Can they be repaired only after the power has been cut off completely? A whole series of such questions has to be answered. If the user buys such a UPS sized to the actual load, it becomes a major hidden danger, and one that operation and maintenance cannot solve.
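The "4 times" figure in item (1) follows directly from the advertised percentages. A minimal sketch of the arithmetic, assuming the retained capacity is a simple fraction of the nameplate rating (the function name and the 100 kVA load are hypothetical, chosen only for illustration):

```python
def required_rating_kva(load_kva: float, retained_fraction: float) -> float:
    """Nameplate rating needed so the derated UPS can still carry the load.

    retained_fraction is the share of rated capacity the UPS keeps after
    losing input phases (assumption: it scales linearly with the rating).
    """
    if not 0 < retained_fraction <= 1:
        raise ValueError("retained_fraction must be in (0, 1]")
    return load_kva / retained_fraction


# Retained capacity as claimed in the advertisement quoted in the text.
ONE_PHASE_LOST = 0.50
TWO_PHASES_LOST = 0.25

load_kva = 100.0  # hypothetical load
print(required_rating_kva(load_kva, ONE_PHASE_LOST))   # 200.0
print(required_rating_kva(load_kva, TWO_PHASES_LOST))  # 400.0
```

In other words, a UPS sized to the actual load has no margin at all for the advertised feature, while a unit that can ride through the loss of two phases has to be four times larger than the load.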

(2) Reasons that are inconvenient to state. For example, some users have been using a certain brand of machine since the last century. At that time, for objective reasons, nothing could be done about its low input power factor, low efficiency, large size, high power consumption and high price. Today, new models far superior to the original ones are available. For example, a UPS with the new high-frequency structure saves 50,000 kWh of electricity per 100 kW per year compared with the original power-frequency structure, so a computer room with a capacity of several megawatts can save millions of kilowatt-hours every year. Yet for some reason the tender still specified the energy-hungry machine rather than the energy-saving equipment, and, for fear that this alone was not secure enough, the structural characteristics of that machine were written into the tender as well. This not only increases the investment in air-conditioning equipment and the floor space it needs, but also lays down hidden dangers for future operation. This is another problem that operation and maintenance cannot solve.
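The savings claim in item (2) can be checked with a quick calculation. The sketch below reads the quoted figure as 50,000 kWh per 100 kW of UPS capacity per year and assumes the saving scales linearly with capacity; the 4 MW room is a hypothetical stand-in for "several megawatts".

```python
# Figure quoted in the text, read here as kWh per 100 kW of capacity per year.
SAVING_KWH_PER_100_KW_YEAR = 50_000


def annual_saving_kwh(capacity_kw: float) -> float:
    """Yearly saving, assuming it scales linearly with UPS capacity."""
    return capacity_kw / 100 * SAVING_KWH_PER_100_KW_YEAR


# Hypothetical 4 MW computer room.
print(annual_saving_kwh(4_000))  # 2000000.0
```

A 4 MW room would save about 2 million kWh a year on this reading, which agrees with the "millions of kilowatt-hours" stated in the text.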

(3) Pursuit of low prices. Some users think all UPSs are the same and simply go for the lowest price, which leads to failures. For example, a highway headquarters bought on price alone; the machine was installed on the first day and caught fire on the second. A life insurance company purchased a low-priced machine, and in less than half a year a UPS failure burned out almost all the input circuits of its IT equipment and paralyzed the system. In another case, a megawatt-class data center ran multiple UPSs in parallel; within a few months of installation, an inverter power transistor in one of the UPSs broke down and all the UPSs tripped.

2. Failure caused by an unsuitable operating environment

The machine is not installed in the environment required by the manual; some users even put the UPS in corridors that people walk through freely or in basements with dripping water. For example, several 200 kVA UPSs were placed in a single-story building whose roof was only one layer of prefabricated panels, cooled by just two 5P comfort air conditioners. In another case, a glass factory placed its UPS in a dusty workshop, which caused frequent failures.

3. Failure caused by inadequate rules and management

For example, some staff on duty plug electric stoves, rice cookers and vacuum cleaners into the UPS at will, causing overload and tripping; food left by staff on duty attracts rats, which get into the machine and cause fires.

4. Handover failure

This type of failure arises mainly because the staff before and after a handover are different people, or because they cooperate poorly. For example, in a train station ticketing system, the previous shift disconnected the UPS's external battery pack while relocating the machine and did not tell their successors afterwards. As a result, when the mains later failed, the UPS output was lost at the same time.

5. Experience failure

Experience is indispensable and a rare treasure. But experience is relative: what was learned on one kind of UPS may not fully apply to another, and applying it blindly leads to failure. A telecommunications bureau started a machine of a different brand using the method it was used to, without reading the manual, and burned out the inverter.

6. Oversight failure

Some components age or fail early during operation, and a fault results if they are not checked in time. Automatic monitoring cannot detect these conditions. Examples include fuses that begin to bend with age, loose screws in the battery structure, and tiny cracks in the battery case after long-term discharge. All of these cause failures if they are not discovered in time, or not dealt with promptly once discovered.

7. Failure caused by rushing in unprepared

Maintenance must never be done in haste; think everything through before acting. An engineer from one company was to overhaul a user's running UPS. According to the regulations, the UPS must first be taken out of circuit with the maintenance bypass switch and only then overhauled. The procedure is to switch to the automatic bypass first and then close the maintenance bypass knife switch. Perhaps the engineer had other urgent matters to attend to: on entering the computer room he closed the maintenance bypass switch without thinking, which caused the inverter power transistors to explode.
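The switching order described above (automatic bypass first, then the maintenance bypass switch) can be expressed as a simple interlock. The sketch below is only an illustration of that ordering rule; the class, mode names and methods are hypothetical and do not correspond to any particular UPS or vendor software.

```python
from enum import Enum


class Mode(Enum):
    INVERTER = "inverter"
    AUTOMATIC_BYPASS = "automatic bypass"
    MAINTENANCE_BYPASS = "maintenance bypass"


class Ups:
    """Toy model of the bypass sequence (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.mode = Mode.INVERTER  # load is fed by the inverter

    def transfer_to_automatic_bypass(self) -> None:
        self.mode = Mode.AUTOMATIC_BYPASS

    def close_maintenance_bypass(self) -> None:
        # Interlock: refuse to close the maintenance bypass while the
        # inverter is still feeding the load.
        if self.mode is not Mode.AUTOMATIC_BYPASS:
            raise RuntimeError("transfer to automatic bypass first")
        self.mode = Mode.MAINTENANCE_BYPASS


ups = Ups()
ups.transfer_to_automatic_bypass()  # step 1
ups.close_maintenance_bypass()      # step 2, allowed only after step 1
print(ups.mode.value)               # maintenance bypass
```

Calling `close_maintenance_bypass()` on a fresh `Ups()` raises `RuntimeError`, which is the software analogue of the step the engineer skipped.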

8. Secondary failures caused by improper maintenance

Regular maintenance of the UPS is necessary, but it must follow a strict set of management procedures. Irresponsible staff who skip the scheduled or unscheduled maintenance the regulations require are an important cause of machine failure. Maintenance itself can also cause faults. For example, when the potential on a circuit board is measured with a multimeter, the probe can short two points together and cause a fault. One user removed the battery from the UPS to discharge it; after the discharge, a mistake was made when the battery was connected back, and the resulting current surge caused an explosion. In another case, an engineer replacing a centrifugal fan accidentally let an adjustable wrench slip onto the control board. He thought nothing of it at the time, but after the fan was replaced the machine would not start. Inspection showed that a pin of one of the components had been broken...

9. Failure caused by static electricity

A computer room was shut down for routine maintenance, but the machine could not be turned on afterwards. Inspection showed that a component had suffered a voltage breakdown. Looking back over the maintenance process, it turned out that the dust on the control board had been brushed off with a plastic toothbrush. Plastic rubbed against the surface of a dry component can generate a frictional electrostatic voltage of several thousand volts. The small-signal circuits in the machine use MOS devices, which have a low withstand voltage and are especially vulnerable to static electricity. Measurements show that rubbing an ordinary plastic bag against a circuit board can generate an electrostatic voltage of 3000 V. It is therefore best to wear a grounded wrist strap when inspecting these circuit boards.

10. Failure caused by overconfidence

Confidence is the foundation of success, but overconfidence can lead to mistakes. For example, an international bank's UPS had been in operation for 8 years and was due for replacement, and the manufacturer had reminded the bank repeatedly. Because the UPS had rarely had problems in those eight years, the person in charge at the bank answered each time, "No need to update". A few months later the UPS stopped supplying power because of an aging-related failure, interrupting the bank's global business for two hours and causing a great loss. In fact, according to international statistics, a battery with a nominal service life of 5 years lasts no more than 3 years in practice, and if it is not maintained it usually has to be replaced within 2 years.
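The battery figures above can be turned into a simple replacement check. This is a rough sketch that applies only to the nominal 5-year battery mentioned in the text; the function names are hypothetical, and the thresholds are taken directly from the paragraph rather than from any battery standard.

```python
NOMINAL_LIFE_YEARS = 5  # nameplate life of the battery discussed in the text


def practical_life_years(maintained: bool) -> int:
    """Practical life of the nominal 5-year battery, per the text:
    at most 3 years if maintained, about 2 years if not."""
    return 3 if maintained else 2


def replacement_due(age_years: float, maintained: bool) -> bool:
    """True once the battery has reached its practical life."""
    return age_years >= practical_life_years(maintained)


print(replacement_due(2.5, maintained=True))   # False
print(replacement_due(2.5, maintained=False))  # True
print(replacement_due(8, maintained=True))     # True
```

A check like this is deliberately conservative: it flags the battery on age alone, whatever the nameplate says and however few problems have been seen so far.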

There are many similar man-made faults. In the final analysis, selecting the power supply system is the first checkpoint: if it is not controlled, the seeds of hidden danger are planted from the start. Connecting the power system is the second checkpoint: even with good equipment, a poor connection scheme buries hidden dangers.