HPC in the News

New ReRAM Memory Can Process Data Where it Lives

An international team of scientists has found a way to make memory chips perform computing tasks, which are traditionally handled by processors like those made by Intel and Qualcomm.

This means data could now be processed in the same spot where it is stored, leading to much faster and thinner mobile devices and computers.


OpenMP Tutorial

This tutorial from Lawrence Livermore National Laboratory (LLNL) provides a detailed review of OpenMP programming.


MPI Tutorial

While the exercises contained in this tutorial are designed to be run on LLNL’s clusters, the information regarding MPI programming is universal. Advanced MPI tutorial slides can be found here: https://computing.llnl.gov/tutorials/mpi_advanced/DavidCronkSlides.pdf


Go Parallel Online Magazine/Blog

Useful articles and tutorials on parallel programming can be found at this site, sponsored by Intel.


“Google Docs” for life sciences accelerates discovery

MIT spinout Benchling is bringing life science researchers a cloud-based platform that brings different types of lab software together in one place, with the aim of making research and development easier, quicker, and more collaborative.

Using Benchling, researchers can design, record, analyze, share, and constantly update data in the cloud. Generated data can be shared with members of a team and with other scientists around the world. According to feedback from users, Benchling has, on average, increased scientist productivity by 30 to 50 percent by automating organizational tasks and reducing experiment duplication, says co-founder Saji Wickramasekara. “It really impacts the productivity and day-to-day life of researchers,” he says.


Researchers Develop New Parallel Computing Method

(HPCwire – November 28, 2016)

Researchers from Julia Computing, UC Berkeley, Intel, the National Energy Research Scientific Computing Center (NERSC), Lawrence Berkeley National Laboratory, and JuliaLabs@MIT have developed a new parallel computing method to dramatically scale up the process of cataloging astronomical objects. This major improvement leverages 8,192 Intel Xeon processors in Berkeley Lab’s new Cori supercomputer and Julia, the high-performance, open-source scientific computing language, to deliver a 225x increase in the speed of astronomical image analysis.

The code used for this analysis is called Celeste. It was developed at Berkeley Lab and uses statistical inference to mathematically locate and characterize light sources in the sky. When it was first released in 2015, Celeste was limited to single-node execution on at most hundreds of megabytes of astronomical images. In the case of the Sloan Digital Sky Survey (SDSS), which is the dataset used for this research, this analysis is conducted by identifying points of light in nearly 5 million images of approximately 12 megabytes each – a dataset of 55 terabytes.

Using the new parallel implementation, the research team dramatically increased the speed of its analysis by an estimated 225x. This enabled the processing of more than 20 thousand images, or 250 gigabytes – an increase of more than 3 orders of magnitude compared with previous iterations. The article is available here.
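The data volumes quoted above can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check, not part of Celeste: it assumes decimal units, and because "hundreds of megabytes" is not a precise figure, the 250 MB baseline is an assumed stand-in chosen for illustration.

```python
import math

# Decimal units are assumed: 1 GB = 1,000 MB and 1 TB = 1,000,000 MB.
MB_PER_GB = 1_000
MB_PER_TB = 1_000_000

IMAGE_MB = 12  # "approximately 12 megabytes each"


def dataset_size_mb(num_images, image_mb=IMAGE_MB):
    """Total size in megabytes of num_images images of image_mb each."""
    return num_images * image_mb


def orders_of_magnitude(new_mb, old_mb):
    """Base-10 logarithm of the ratio between two data volumes."""
    return math.log10(new_mb / old_mb)


# Upper bound for the full survey ("nearly 5 million images").
full_survey_tb = dataset_size_mb(5_000_000) / MB_PER_TB

# Volume handled by the parallel run ("more than 20 thousand images").
parallel_run_gb = dataset_size_mb(20_000) / MB_PER_GB

# Assumed stand-in for "hundreds of megabytes"; not a figure from the article.
ASSUMED_BASELINE_MB = 250
scale_up = orders_of_magnitude(250 * MB_PER_GB, ASSUMED_BASELINE_MB)

print(f"Full survey, upper bound: {full_survey_tb:.0f} TB")
print(f"Parallel run: {parallel_run_gb:.0f} GB")
print(f"Scale-up vs. assumed baseline: {scale_up:.1f} orders of magnitude")
```

The 60 TB upper bound sits close to the quoted 55 terabytes once "nearly" and "approximately" are taken into account, and 20,000 images at 12 MB each come to 240 GB, in line with the quoted 250 gigabytes. A smaller assumed baseline would give a scale-up above three orders of magnitude.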


Chinese Research Team That Employs HPC to Understand Weather Patterns Wins 2016 ACM Gordon Bell Prize

(HPCwire – November 23, 2016)

A 12-member Chinese team is the recipient of the 2016 ACM Gordon Bell Prize for their research project, “10M-Core Scalable Fully-Implicit Solver for Nonhydrostatic Atmospheric Dynamics.” The winning team presented a solver (a method of calculation) for atmospheric dynamics. The ACM Gordon Bell Prize tracks the progress of parallel computing and rewards innovation in applying high performance computing to challenges in science, engineering, and large-scale data analytics.

In the abstract of their presentation, the winning team writes, “On the road to the seamless weather-climate prediction, a major obstacle is the difficulty of dealing with various spatial and temporal scales. The atmosphere contains time-dependent multi-scale dynamics that support a variety of wave motions.”

To simulate the vast number of variables inherent in a weather system developing in the atmosphere, the winning group presents a highly scalable, fully implicit solver for three-dimensional nonhydrostatic atmospheric simulations governed by the fully compressible Euler equations. The Euler equations are a set of equations frequently used to understand fluid dynamics (liquids and gases in motion).

Elaborating further, they add, “In the solver, we propose a highly efficient domain-decomposed multigrid preconditioner that can greatly accelerate the convergence rate at the extreme scale. For solving the overlapped subdomain problems, a geometry-based pipelined incomplete LU factorization method is designed to further exploit the on-chip fine-grained concurrency.”

The fully-implicit solver successfully scales to the entire system of the Sunway TaihuLight, a Chinese supercomputer with over 10.5 million heterogeneous cores, allowing for a performance of 7.95 PFLOPS in double precision. The Chinese team contends that this is the largest fully-implicit simulation to date. The Sunway TaihuLight is ranked as the fastest supercomputer in the world. It is nearly three times as fast as the Tianhe-2, the supercomputer that previously held the world record for speed.
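For readers unfamiliar with the compressible Euler equations mentioned above, one common textbook form (conservation of mass, momentum, and energy, with gravity as a source term) is sketched below. This is a general reference form under stated assumptions, not necessarily the formulation or the variables used by the winning team. The symbols are assumptions introduced here: \rho is density, \mathbf{u} velocity, p pressure, g gravitational acceleration, \mathbf{k} the upward unit vector, and E = \rho e + \tfrac{1}{2}\rho|\mathbf{u}|^2 the total (internal plus kinetic) energy per unit volume.

```latex
\begin{aligned}
\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{u}) &= 0, \\
\frac{\partial (\rho \mathbf{u})}{\partial t}
  + \nabla \cdot (\rho\, \mathbf{u} \otimes \mathbf{u}) + \nabla p &= -\rho g\, \mathbf{k}, \\
\frac{\partial E}{\partial t} + \nabla \cdot \big( (E + p)\, \mathbf{u} \big) &= -\rho g\, \mathbf{u} \cdot \mathbf{k}.
\end{aligned}
```

The system is typically closed with an equation of state such as the ideal gas law, p = (\gamma - 1)\rho e. "Nonhydrostatic" means the vertical momentum equation is kept in full rather than replaced by a balance between the vertical pressure gradient and gravity.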