1、Challenges in Programming the Next Generation of HPC Systems

William Gropp
University of Illinois at Urbana-Champaign

After over two decades of relative architectural stability for distributed memory parallel computers, the end of Dennard scaling and the looming end of Moore's "law" are forcing major changes in computing systems. Can the community continue to use programming systems such as MPI for extreme scale systems? Does the growing complexity of compute nodes require new programming approaches? Can performance portability be achieved? Are new I/O models required? This talk will discuss some of these issues, with emphasis on internode and intranode programming systems and the connections between them.

2、Efficient Optimization Algorithms For Large-scale Data Analysis

Ya-xiang Yuan
AMSS, Chinese Academy of Sciences

Optimization models are ubiquitous in data analysis. In this talk, we first review efficient Newton-type methods based on problem structures: 1) a stochastic trust region method for reinforcement learning; 2) semi-smooth Newton methods for composite convex programs and their applications to large-scale semidefinite programming problems and machine learning; 3) an adaptive regularized Newton method for Riemannian optimization; 4) a globally convergent Levenberg-Marquardt method for phase retrieval. Secondly, we explain efficient algorithms based on matrix factorization: 1) state aggregation of Markov chains; 2) a sparse and low-rank completely positive relaxation for the modularity maximization problem arising in community detection. Thirdly, we discuss a few parallel optimization approaches: 1) a parallel subspace correction method for a class of composite convex programs; 2) a parallelizable approach for linear eigenvalue problems; 3) a parallelizable approach for optimization problems with orthogonality constraints. Finally, we introduce continuous optimization models for spectral clustering.
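To make the flavor of these methods concrete, the following is a minimal sketch of a classical Levenberg-Marquardt iteration for nonlinear least squares, applied to a toy exponential-fitting problem. This is an illustrative textbook version only, not the globally convergent phase-retrieval variant discussed in the talk; the damping schedule (multiply/divide by 10) and the test problem are arbitrary choices for the sketch.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, x0, max_iter=100, lam=1e-3, tol=1e-10):
    """Basic Levenberg-Marquardt for min ||r(x)||^2 (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = residual(x)
        J = jacobian(x)
        g = J.T @ r                       # gradient of 0.5*||r||^2
        if np.linalg.norm(g) < tol:
            break
        # Damped normal equations: (J^T J + lam*I) delta = -J^T r
        A = J.T @ J + lam * np.eye(x.size)
        delta = np.linalg.solve(A, -g)
        if np.sum(residual(x + delta) ** 2) < np.sum(r ** 2):
            x = x + delta                 # accept step, reduce damping
            lam = max(lam / 10.0, 1e-12)
        else:
            lam *= 10.0                   # reject step, increase damping
    return x

# Toy problem: fit y = a * exp(b * t) with true parameters a=2.0, b=1.5
t = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(1.5 * t)

def residual(p):
    a, b = p
    return a * np.exp(b * t) - y

def jacobian(p):
    a, b = p
    e = np.exp(b * t)
    return np.column_stack([e, a * t * e])  # d r / d a, d r / d b

p = levenberg_marquardt(residual, jacobian, np.array([1.0, 1.0]))
```

The damping parameter `lam` interpolates between a Gauss-Newton step (small `lam`) and a short gradient-descent step (large `lam`), which is what makes the method robust far from the solution.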

3、Supercomputing Infrastructures for Convergence of HPC and Big Data/AI

Satoshi Matsuoka
Director, RIKEN-CCS /
Professor, Tokyo Institute of Technology

With the rapid rise of Big Data and AI as a new breed of high-performance workloads on supercomputers, we need to accommodate them at scale, and thus need R&D on hardware and software infrastructures where traditional simulation-based HPC and BD/AI converge in a BYTES-oriented fashion. The TSUBAME3 supercomputer at Tokyo Institute of Technology, which came online in August 2017, embodies various BYTES-oriented features to allow such convergence to happen at scale, including significant scalable horizontal bandwidth, support for a deep memory hierarchy and capacity, and high flops in low-precision arithmetic for deep learning. TSUBAME3's technologies have been commoditized to construct one of the world's largest BD/AI-focused open and public computing infrastructures, ABCI (AI Bridging Cloud Infrastructure), hosted by AIST-AIRC (AI Research Center), the largest publicly funded AI research center in Japan. Although not a supercomputer intended for HPC, its Linpack ranking is No. 1 in Japan and No. 5 in the world, and it embodies 550 AI-petaflops for AI while being extremely energy efficient with a novel warm-water-cooled pod design. Finally, Post-K is the flagship next-generation national supercomputer being developed by RIKEN and Fujitsu in collaboration. Post-K will have hyperscale-class resources in one exascale machine, with well more than 100,000 nodes of server-class A64FX many-core Arm CPUs, realized through an extensive co-design process involving the entire Japanese HPC community. Post-K is slated to perform 100 times faster on some key applications compared to its predecessor, the K computer, and will also likely be the premier big data and AI/ML infrastructure. Currently, we are conducting research to scale deep learning to more than 100,000 nodes on Post-K, where we would obtain near top GPU-class performance on each node.

4、European Supercomputing in Perspective

Michael Resch
High Performance Computing Center Stuttgart

In the everlasting race for the fastest systems in the world, Europe has decided to establish a strategy to position itself as a leader. Over the last 10 years, the Partnership for Advanced Computing in Europe (PRACE) has provided access for European users based on a federated model. With the creation of the EuroHPC Joint Undertaking, the situation has changed. EuroHPC is intended to set up pre-exascale and exascale systems in the coming years. All this is accompanied by a number of further measures that will help European users gain access to best-in-class supercomputing systems. In addition, the European Processor Initiative has been launched to provide European research and industry with European processor technology and hence reduce the dependence on non-European technology. The talk will present an overview of these developments and will elaborate on how this is going to change the role of Europe in supercomputing.