A few words on stress testing computers

2024-06-11

I've recently been researching stress testing for my workstation. Since I run many background services, I need it to be highly reliable, capable of running 24/7 without issues even under load.

Servers use specialized, high-quality, and redundant hardware to achieve high uptimes. While consumer parts can't match this level of reliability, we can come close by choosing high-quality components. I'll soon write an article about my workstation hardware (hint: it's a Ryzen with ECC RAM).

But how can we ensure our machine's reliability? Stress testing is part of the answer, but there's a lot of misinformation about it. After extensive research, I've summarized my findings for those who want to push their hardware to the limits.

Many guides are available for various stress testing software. Some focus on RAM, others on the CPU, and their effectiveness varies. You might run a stress test for days without issues, then switch to another and encounter problems within an hour.

Before starting, always monitor your system's temperatures, and keep watching them while the test runs. You'll be surprised at how quickly a CPU can overheat under these tools.
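As a minimal sketch of what monitoring can look like on Linux, the snippet below reads the kernel's hwmon sensors directly from sysfs. It assumes the standard /sys/class/hwmon layout, where each temp*_input file holds a value in millidegrees Celsius. The function name read_temps and the label format are my own choices for illustration, and which sensors show up depends on your hardware and loaded drivers.

```python
from pathlib import Path


def read_temps(base="/sys/class/hwmon"):
    """Return {label: degrees Celsius} for every temp*_input found under base.

    Assumes the Linux hwmon sysfs layout; values are in millidegrees Celsius.
    """
    temps = {}
    for hwmon in sorted(Path(base).glob("hwmon*")):
        name_file = hwmon / "name"
        # Fall back to the directory name if the chip exposes no name file.
        chip = name_file.read_text().strip() if name_file.exists() else hwmon.name
        for sensor in sorted(hwmon.glob("temp*_input")):
            try:
                millideg = int(sensor.read_text().strip())
            except (OSError, ValueError):
                # Some sensors fail to read or report garbage; skip them.
                continue
            label = sensor.name[: -len("_input")]
            temps[f"{hwmon.name}:{chip}/{label}"] = millideg / 1000.0
    return temps
```

Calling read_temps() every few seconds from a second terminal while the stress test runs, and printing the highest value, is enough to spot a cooling problem early.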

Here's what my research led to:

• CPU: Use the Prime95[1] torture test with small FFTs. Small FFTs fit in the CPU caches, so the load lands almost entirely on the cores rather than on RAM. It is the most demanding CPU stress test I've found, and it runs on Linux, macOS, and Windows. Be cautious: it will severely tax your CPU, and machines that pass other stress tests may still crash under it. Let it run overnight. If it passes, your CPU is likely reliable.

• RAM: This is trickier. The most reliable RAM test software is TestMem5 with the anta777 extreme profile[2], but it doesn't run on Linux. On Linux, your best option is the Prime95 torture test with large FFTs. Again, let it run overnight. If it shows no errors in the morning, your RAM is probably stable.

• GPU: I don't have a particularly good GPU, and I don't stress test it, since it mostly sits idle and only drives my displays.
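For the overnight runs, a small wrapper can at least tell you whether the stress test was still alive in the morning. The following is a sketch under my own assumptions, not part of Prime95: run_for is a hypothetical helper that starts any command, waits up to a given number of seconds, and reports whether the process exited on its own before the deadline. The mprime binary name and its -t (torture test) option in the comment are how I'd expect to launch Prime95 on Linux; check your own install.

```python
import subprocess
import time


def run_for(cmd, seconds, poll=1.0):
    """Run cmd for at most `seconds`.

    Returns (exited_early, returncode). exited_early is True when the
    process stopped by itself before the deadline, which for a stress test
    that is meant to run forever usually means a crash or an abort.
    """
    proc = subprocess.Popen(cmd)
    deadline = time.monotonic() + seconds
    try:
        while time.monotonic() < deadline:
            code = proc.poll()
            if code is not None:
                return True, code
            time.sleep(poll)
        return False, None
    finally:
        # Stop the stress test if it is still running at the deadline.
        if proc.poll() is None:
            proc.terminate()
            try:
                proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()


# Hypothetical usage for an 8-hour run (path and flag are assumptions):
#   exited_early, code = run_for(["./mprime", "-t"], 8 * 3600)
#   print("stopped early" if exited_early else "still running at deadline")
```

This only catches the process dying. A test that keeps running can still have logged errors, so Prime95's own output needs a look in the morning either way.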

Other software is available, such as Google's SAT (stressapptest)[3]. However, in my experience most alternatives aren't as effective at quickly identifying problems as the tools mentioned above.

[1] Prime95 official home.

[2] TestMem5 anta777 configuration.

[3] Google's SAT.