Building Titan: The world's fastest supercomputer

2012-10-29 09:30:01

An exclusive, behind-the-scenes look at the US bid to build a radical new

machine, capable of solving some of the most complex questions in science

today. Its secret: video game technology.

The sound of 20 quadrillion calculations happening every second is dangerously

loud. Anyone spending more than 15 minutes in the same room with the Titan

supercomputer must wear earplugs or risk permanent hearing damage. The din in

the room will not come from the computer's 40,000 whirring processors, but from

the fans and water pipes cooling them. If the dull roar surrounding Titan were

to fall silent, those tens of thousands of processors doing those thousands of

trillions of calculations would melt right down into their racks.

Titan is expected to become the world's most powerful supercomputer when it

comes fully online at the US Oak Ridge National Laboratory in Tennessee, in

late 2012 or early 2013. But on this afternoon in mid-October, Titan isn't

technically Titan yet. It's still a less-powerful supercomputer called Jaguar,

which the US Department of Energy (DoE) has operated and continuously upgraded

since 2005. Supercomputing power is measured in Flops (floating point

operations per second), and Jaguar was the first civilian supercomputer to

break the "petaflop barrier" of one quadrillion operations per second (a

quadrillion is a one followed by 15 zeroes). In June 2010 it was the fastest

supercomputer on Earth.

But high-performance computing records don't last long: a Chinese machine

pushed Jaguar into second place just six months later. Then in October 2011,

the supercomputer design firm Cray announced that it would transform Jaguar

into a new machine that could retake the number-one spot, with an estimated

peak performance of 20 petaflops.

Cray's blue-jacketed technicians have been pacing up and down Jaguar's

catacomb-like aisles for months, opening its 200 monolithic black cabinets and

sliding out its processor blades like enormous safe-deposit boxes. Jaguar's

brain surgery takes place on spartan worktables that wouldn't look out of place

in a hobbyist's garage. A technician fits a paperback-sized ingot of metal and

silicon into an empty space in the blade and fastens it into place with a

battery-powered screwdriver. The ingot contains a graphics processing unit, or

GPU. Cray has installed one of these GPUs alongside every one of Jaguar's

18,688 CPU chips. It's this "hybrid architecture" that will turn Jaguar into

Titan, packing an order of magnitude more computing horsepower into the same

amount of physical space.

Turbo-charged

GPU-accelerated supercomputers burst onto the world stage in 2010, when China's

Tianhe-1A machine overtook Jaguar as the fastest supercomputer on Earth. "It

came out of nowhere," says Wu-chun Feng, a high-performance computing expert at

Virginia Tech. "China didn't even have a high-performance computing program."

Instead of relying solely on expensive, highly customized, multicore

microprocessors, Tianhe-1A got a speed boost by using "off the shelf" GPUs made

by Nvidia, whose chips power the displays of video-game consoles and consumer

laptops. Titan takes the same approach using the same chip design that powers

the ultra-high resolution Retina display on Apple's MacBook Pro. These

intricate squares of silicon will provide 90% of Titan's peak supercomputing

performance.

So, what do video-game graphics have in common with high-end scientific

computing? Simulation. "About ten years ago, we observed that the chips we

designed for gaming were starting to look more like general purpose processors

for simulating physics," says Sumit Gupta, Nvidia's senior director of high

performance GPU computing. "When you'd shoot a tree in a video game and it

would fall, you'd want it to look natural, so the simulations became more and

more complex."

At the same time, redrawing every pixel on an HD laptop screen 60 times per

second also requires so-called parallel computation. "This is why GPUs are

designed to run hundreds of calculations at the same time very efficiently,"

says Steve Scott, Tesla chief technology officer at Nvidia. "It turns out that

this is very similar to the way high performance scientific computing is done,

where you're simulating the climate, or the interactions between drug

molecules, or the airflow over a wing."
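
To make the idea of parallel computation concrete, here is a minimal Python sketch. It is not taken from Nvidia or Oak Ridge: the tiny 64x36 "screen" and the shade function are made up for illustration. The point is only that each pixel can be worked out without reference to any other, so the job can be split among several workers and still produce the same picture. Threads stand in for GPU cores here.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 64, 36  # hypothetical tiny screen, not a real HD resolution


def shade(x, y):
    """Made-up per-pixel calculation; depends only on this pixel's coordinates."""
    return (x * x + y * y) % 256


def render_row(y):
    """Compute one row of pixels; rows do not depend on each other."""
    return [shade(x, y) for x in range(WIDTH)]


def render_sequential():
    """One worker computes every row in order."""
    return [render_row(y) for y in range(HEIGHT)]


def render_parallel(workers=4):
    """Several workers share the rows; map() returns results in row order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_row, range(HEIGHT)))


frame_seq = render_sequential()
frame_par = render_parallel()
print(frame_seq == frame_par)  # both approaches should yield the same frame
```

Python threads will not make this toy example faster; they only show the shape of the problem. A GPU runs hundreds of such independent calculations at once in hardware.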

But where video game physics only have to look real enough to a distracted

teenager, supercomputer simulations have to be scientifically accurate down to

the level of individual atoms - which is why Titan needs tens of thousands of

GPUs all working together on the same problem, not to mention enough Random

Access Memory (RAM) to hold the entire simulation in memory at once. (Titan has

710 terabytes of RAM, about as much as a stack of iPads 7km high.)

But supercomputers have been getting along without GPUs for decades. A CPU chip

- the same general-purpose silicon "brain" inside your laptop, your smartphone,

and every computer at Google or Facebook - can run high-performance scientific

calculations, too, if you chain enough of them together. The current fastest

supercomputer, IBM's "Sequoia" system at Lawrence Livermore National Laboratory

in California, contains over 98,000 CPUs, each with 18 cores.

What GPUs offer that CPUs can't is a blast of relatively cheap,

energy-efficient horsepower. Scaling up the Jaguar supercomputer from 1.75

petaflops to 20 could have been done by adding more cabinets stuffed full of

CPUs. But those take up space, and more importantly, suck up power.

Off-the-shelf GPUs, meanwhile, aren't designed to act self-sufficiently like

normal chips - they're add-ons "that accelerate a CPU like a turbo engine,"

says Gupta - so they consume much less energy than a CPU would to do the same

amount of calculating. By bolting a GPU onto each one of the 18,688 AMD Opteron

CPU chips already in Jaguar, the DoE was able to create a next-generation

supercomputer without scrapping the one they already had - or blowing up their

electric bill.
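
The figures quoted so far allow a rough back-of-the-envelope check, sketched below in Python. It assumes that the 90% GPU share and the remaining 10% are spread evenly across all 18,688 chip pairs, which the article does not state, and it compares Jaguar's 1.75 petaflops with Titan's estimated peak, so treat the results as ballpark numbers only.

```python
# Figures quoted in the article
JAGUAR_PETAFLOPS = 1.75
TITAN_PETAFLOPS = 20.0   # estimated peak
NODES = 18_688           # one CPU chip paired with one GPU
GPU_SHARE = 0.90         # share of peak attributed to the GPUs

PETA = 10**15
TERA = 10**12
GIGA = 10**9

# Overall jump from Jaguar to Titan
speedup = TITAN_PETAFLOPS / JAGUAR_PETAFLOPS

# Assumption: each share of peak is divided evenly among the chips
gpu_flops_each = TITAN_PETAFLOPS * PETA * GPU_SHARE / NODES
cpu_flops_each = TITAN_PETAFLOPS * PETA * (1 - GPU_SHARE) / NODES

print(f"Titan vs Jaguar: about {speedup:.1f}x")
print(f"Per GPU: about {gpu_flops_each / TERA:.2f} teraflops")
print(f"Per CPU chip: about {cpu_flops_each / GIGA:.0f} gigaflops")
```

Under those assumptions the upgrade works out to roughly an elevenfold jump, with each GPU contributing just under one teraflop and each CPU chip a little over 100 gigaflops.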

Bigger is better

The new machine, like any supercomputer, is all about speed: "time to

solution," as Jack Wells, director of science for Oak Ridge's computing

facility, puts it. "It's about solving problems that are so important that you

can't wait," he says. "If you can afford to wait, you're not doing

supercomputing." Competition among research projects for "core hours" on Titan

is intense. Of the 79 new-project proposals received by Oak Ridge's selection

panel, only 19 will run on Titan in 2013.

Winning proposals will apply Titan's computational might to problems in areas

such as astrophysics (simulating Type Ia supernovae and core collapses),

biology (modeling human skin and blood flow at a molecular level), earth

science (global climate simulations and seismic hazard analysis of the San

Andreas fault in California), and chemistry (optimizing biofuels and engine

combustion turbulence). According to Buddy Bland, project director of the Oak

Ridge computing facility, Titan will typically run four or five of these

supercomputing "jobs" at once.

But some jobs are so complex that they'll take over Titan entirely. The

Princeton Plasma Physics Laboratory, for example, will use all of Titan's

computing cores to help design components for the International Thermonuclear

Experimental Reactor (Iter), a prototype nuclear fusion project in France.

"Their goal is to have this reactor online by 2017," Bland says. "It'll use

magnetic fields to circulate plasma through a big donut-shaped reactor at 100

million degrees Fahrenheit. How do you contain that kind of energy? That's what

they need Titan to help them figure out."

As fast as Titan is, these simulations can still take days, weeks, or even

months to complete. And the very idea of "fast" has a different meaning to

computational scientists than it does to users of consumer apps like Photoshop

or Final Cut Pro. "It's not so much about running our applications and

calculations faster - we want to run them bigger," says Tom Evans, a scientist

at Oak Ridge who uses the supercomputer to model nuclear reactor systems.

"Maybe that means adding four times more spatial resolution in our simulations,

or replacing approximations with more accurate physics. Of course we always

like to go faster. But it's less interesting to do the same science faster than

it is to do something new that you couldn't even do before."

In other words, bigger is better - and not just for the scientific bragging

rights. Having a top-ranked supercomputer on American soil "demonstrates global

competitiveness and attracts brainpower," says Jack Wells. Take Jeremy Smith,

director of Oak Ridge's Center for Molecular Biophysics, who used to work at

the University of Heidelberg in Germany. "I found out that Oak Ridge would have

this nice toy to play with," he says, "so I nipped across the pond." (Smith's

research on biofuels began on Jaguar and will continue on Titan.)

Power play

Many of the smart people that Titan attracts will use the supercomputer to

chart the future of supercomputing itself. So-called petascale machines like

Titan and Sequoia can accomplish amazing feats of simulation, like screening

millions of potential drug compounds against a target molecule in a single day.

But researchers like Jeremy Smith want to do even more.

They envisage an "exascale" computer - a thousand times faster than a

one-petaflop machine, or 50 times Titan's planned 20-petaflop peak - able to do

one quintillion calculations per second (a quintillion is

a one with 18 zeroes after it). A machine like this "would have enough

computing power to screen tens of millions of drug compounds against all known

living protein classes," Smith says. "That means we'll be able to predict if

the drug will work and what all the side effects will be - not only

generically, but for individual people, based on their own genetic sequences.

This is amazing potential."

The trouble with building an exascale machine, however, is the amount of energy

required to get there. "If we just scaled up what we're doing today, it would

take a couple of nuclear power plants to power," says Buddy Bland. But Wu Feng,

who curates an annual list of the world's most energy-efficient supercomputers,

is less pessimistic. "The trends indicate that we'll be able to get to the

exascale for 50 megawatts," he says. That's about half as much power as Apple

and Google's data centers in North Carolina are estimated to use.

But government-funded scientific institutions don't have tech companies'

bottomless bank accounts. The DoE wants an exascale computer by 2020 that can

run on 20 megawatts of electricity or less. Reaching that goal will require

entirely new chip designs that draw even less power than the GPU-accelerated

systems like Titan do.
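
The gap between those two power budgets is easier to see as efficiency, in operations per second for every watt drawn. The short Python sketch below does the arithmetic using only the figures quoted above; the function name is invented for illustration.

```python
EXAFLOP = 10**18  # one quintillion operations per second
MEGAWATT = 10**6  # watts
GIGA = 10**9


def gigaflops_per_watt(flops, megawatts):
    """Operations per second delivered for each watt of power drawn."""
    return flops / (megawatts * MEGAWATT) / GIGA


feng_estimate = gigaflops_per_watt(EXAFLOP, 50)  # Feng's 50-megawatt trend
doe_target = gigaflops_per_watt(EXAFLOP, 20)     # DoE's 20-megawatt goal

print(f"Exascale at 50 MW: {feng_estimate:.0f} gigaflops per watt")
print(f"Exascale at 20 MW: {doe_target:.0f} gigaflops per watt")
print(f"DoE goal is {doe_target / feng_estimate:.1f}x more efficient")
```

In other words, the DoE's target asks for two and a half times the efficiency that Feng's trend line would deliver.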

Mobile devices, most of which use chip designs from the UK firm Arm, could

offer a way forward. "You've probably noticed that when you put a smartphone in

your pocket it doesn't burn through your pants," says Jack Wells. "The same

design principle is going to be used in high-performance computing to get to

the exascale." Jack Dongarra, a computer scientist at the University of

Tennessee whose Top500 list ranks the world's fastest supercomputers, ran

benchmarking software on an iPad 2 and found that the tablet was equivalent to

some of the fastest supercomputers of the mid-1990s. "That's incredible

computing power in your hand," he says. "The Arm processor is clearly capable."

Still, simply lashing together thousands of low-power processors - whether they

come from smartphones, gaming consoles, or laptops - does not a supercomputer

make. Passing data between all those chips creates bandwidth bottlenecks that

limit the total speed of the system. "It's like having two hemispheres of your

brain on opposite sides of the room connected by a wire," says Feng. An

exascale computer will have to speed up its entire internal network - perhaps

by using fibre optic connections between racks of chips, accelerators on every

piece of silicon, or both.

Meanwhile, says Buddy Bland, jockeying for the title of "world's fastest

supercomputer" will continue, and no single interconnect design or chip

architecture is "best." "Whoever has the biggest budget is likely to be in the

top spot," he says wryly. "But a healthy diversity in architectures is a

wonderful thing because certain applications can run well on one, and others

well on another."

What's indisputable is that supercomputing has become the "third pillar" of

doing science, alongside theory and experimentation. The best way to grasp the

power of Titan, says Bronson Messer, a computational astrophysicist at Oak

Ridge, is not to compare it to a Formula 1 racing car or a turbocharged engine,

but to the Large Hadron Collider. "Titan is like the particle accelerator, and

the simulations and applications that we run on Titan are like the detectors

that discovered the Higgs boson," Messer says. "The size or power of these

machines isn't what pushes science forward. It's the people using them, who

know what to look for."