There are arguments for both sides, so you could decide either way. There is occasionally quite a bit of discussion about this topic on the mailing list, and the following are some of the observations that have been made.
| |
The mean time between failures of hard drives is about 10 years of 24/7 operation, and normally drives either die after a few months or they keep going essentially forever. On most hard drives, turning them off and on causes more damage than leaving them running 24/7. I know of many bits of equipment (10-year-old fileservers) that were turned off for Y2K, and the increased load when turning the drives back on resulted in drive failures.
Disk access frequency is not really a factor in HDD failure. Most of the drives that failed on my machines were backup drives which were used relatively infrequently.
Most people now accept that the easiest way to achieve hardware reliability is to put the machine in a cool, low-humidity environment, turn it on, and leave it permanently on. Start-up loads are far more damaging than continuous use.
In my opinion, the biggest factors in how long drives last are background vibration, humidity, and how well they are cooled.
| |
A computer that's on 24/7 is actually at slightly LESS risk of failure (exactly HOW MUCH less is still being debated among the experts), since it isn't experiencing the jolt of power at startup time and isn't having to start the drives spinning. Spinning up from a dead stop is the most stressful time in a hard drive's life: depending on the specific drive, the power required to spin up from a cold start can be as much as *100 times* (that's a worst-case figure; usually it's closer to 3-5 times) what is required to keep it spinning while idle! The sudden draw on the power supply to handle that can cause failure of the system all by itself.
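To put the quoted 3-5x figure in perspective, here is a small back-of-the-envelope calculation. The per-drive wattage and the drive count are illustrative assumptions, not numbers from the original post.

```c
/* Back-of-the-envelope illustration of the spin-up transient described
 * above.  The per-drive idle draw and the drive count are assumptions
 * chosen only to show the scale of the effect. */
#include <stdio.h>

int main(void)
{
    const double idle_watts    = 7.0;  /* assumed steady-state draw per drive (W) */
    const double spinup_factor = 4.0;  /* middle of the quoted 3-5x range          */
    const int    drives        = 4;    /* assumed number of drives in the box      */

    double idle_total   = drives * idle_watts;
    double spinup_total = drives * idle_watts * spinup_factor;

    printf("steady-state draw:    %.0f W\n", idle_total);    /* 28 W  */
    printf("simultaneous spin-up: %.0f W\n", spinup_total);  /* 112 W */
    return 0;
}
```

Even with modest assumed numbers, starting every drive at once multiplies the transient load several times over, which is one reason server-class controllers often stagger drive spin-up.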
| |
Running a laptop 24/7 is just not a good idea; they're not designed to be run continuously. Most don't have adequate cooling, and they're much more fragile than standard computers.
| |
Most semiconductor failures come from the slow dispersion
of the "doping" (impurity) atoms through the crystal matrix of the substrate
(silicon/germanium). The dispersion rate increases with temperature, and
even with cooling fans, it's going to be faster when the machine is running than when it is not.
I also have some good news: this effect is usually minimal, and chances are the
machine will be hopelessly obsolete before it breaks even under 24/7 conditions.
| |
As someone else mentioned, the extra current knocks dopants from the N-type silicon into the P-type silicon at the junction, leading to failure of a transistor. I seem to recall reading that this can happen within a couple of years of use if you seriously overclock and keep the CPU busy, or in ten years or more if you don't overclock. (If you overclock but only occasionally use the machine at full power, it'll live longer. Linux, and probably some other OSes, use the halt instruction in their idle loop to put the CPU into a low-power, wait-for-an-interrupt mode.)
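For the curious, here is a minimal sketch of what such a halt-based idle loop looks like. This is conceptual kernel-style x86 code, not actual Linux source: hlt is a privileged instruction, so it cannot run as an ordinary user program.

```c
/* Conceptual sketch of a hlt-based idle loop, in the spirit of what the
 * poster describes.  Kernel-style x86 code only: "hlt" is privileged and
 * will fault if executed from user space. */
static void idle_loop(void)
{
    for (;;) {
        /* Re-enable interrupts, then halt: the CPU drops into a low-power
         * state and wakes only when the next interrupt arrives. */
        __asm__ volatile ("sti; hlt");
    }
}
```

| |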
Running any client 24x7 doesn't contribute to failure any more quickly than an idling machine does. Idling means only that the CPU is passing mostly zeros through its registers; the CPU is still executing instructions, most of which simply do nothing.
| |
Has anyone bothered to look at the additional stress on a CPU that is not running the client? As an example, consider a server that usually sits idle but periodically gets a burst of work. The CPU on this server is going to be cool while idle between jobs, then quickly heats up when a job starts and cools down again when the job is done. The temperature swings during these heating and cooling cycles create thermal stresses in the chip, which can cause minor flaws to grow until a critical circuit is broken and the CPU fails. By running the client, the CPU is always busy, so the thermal variations are minimized.
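As a rough aside not from the original post, the reason the size of the temperature swing matters can be seen from the first-order thermal-stress estimate for a layer constrained by the material it is bonded to:

$$\sigma \;\approx\; E\,\Delta\alpha\,\Delta T$$

where $E$ is the elastic modulus, $\Delta\alpha$ is the mismatch in thermal-expansion coefficients between the bonded materials, and $\Delta T$ is the temperature swing. Keeping the load, and hence the temperature, steady keeps $\Delta T$ small, which keeps the cyclic stress small.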
| |
In my work experience, semiconductor chip failures were most often at the bonding-pad level, which is stressed by thermal fatigue rather than by constant high temperature. Maybe interconnects have improved over the years - there are many fewer of them these days. Any equipment that I want to keep running simply stays on all the time. The stress of powering on and off is where and when most electronics failures happen for me.
| |
I would suspect that the first failures in a computer system would be observed in the mechanical systems, such as the hard drive, or in the power supplies. Integrated circuits which have survived the infant-mortality period (usually 48 hours) should last 10-20 years, no matter what you do to them. They should not be adversely affected by the stress of thermal cycling, and dopants should not diffuse at typical junction temperatures on the order of 100 °C.

Two operating-life failure mechanisms of integrated circuits are hot electron injection and electromigration. Hot electrons are produced when a transistor switches states. They cause charge to be trapped in the gate oxide of the transistor, eventually (after many years) changing the behavior of the transistor and causing it to fail. Leaving your system on will hasten its death due to hot electron injection. However, as I stated above, I believe components other than the ICs are likely to fail first. The rate of hot electron injection is also proportional to the voltage and clock speed of the chip, and so can be affected by overclocking.

Electromigration occurs when the current density in the wire traces on the chip is too high. The flow of electrons can actually begin to move the metal in the wires until it causes an open circuit. This generally only occurs in a poorly designed circuit and should not be a concern. I have also seen operating-life failures due to random particulate defects on the chip. However, it is not the thermal cycling but the electric fields on the chip which cause these defects to kill a circuit. Most defects of this type are weeded out during the infant-mortality stage.
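As an aside not in the original post, the standard first-order model for electromigration lifetime is Black's equation, which makes the poster's point quantitative: it is the design's current density, far more than accumulated run time, that sets the lifetime:

$$\mathrm{MTTF} \;=\; A\,J^{-n}\exp\!\left(\frac{E_a}{kT}\right)$$

where $J$ is the current density in the metal trace, $n$ is an empirical exponent (typically around 2), $E_a$ is an activation energy for the metal, $k$ is Boltzmann's constant, and $T$ is the absolute temperature. A sanely designed chip keeps $J$ low enough that the predicted MTTF comfortably exceeds the machine's useful life.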
© Copyright distributed.net 1997-2013 - All rights reserved