Workshops and Sessions

1. High Performance Numerical Algorithms and Software for Large-scale Science & Engineering Applications

Time: 08:25-17:40, 14/01/2019

Location: Function Room 1 (鸿云厅)

Time | Title | Speaker | Chair
8:30-8:35 | Opening Remarks
8:35-9:00 | Development of a high-performance multi-threaded ILU-GMRES solver based on a novel parallel ordering technique | Takeshi Iwashita, Hokkaido University | Xiaowen Xu, IAPCM
9:00-9:30 | Numerical Simulations of Fluid Flows in Carbonate Oil Reservoirs | Chensong Zhang, AMSS, CAS
9:30-10:00 | Numerical Simulations of Thermal Reservoir Model using Parallel Computers | Hui Liu, University of Calgary
10:00-10:30 | Coffee Break & Group Photo
10:30-11:00 | TBA | Gabriel Wittum, KAUST | Linbo Zhang, AMSS, CAS
11:00-11:30 | Scalable Parallel Methods for Patient-specific Blood Flow Simulations | Rongliang Chen, Chinese Academy of Sciences
11:30-12:00 | Fast Nonoverlapping Block Jacobi Method for the Dual Rudin-Osher-Fatemi Model | Chang-Ock Lee, KAIST
12:00-13:30 | Lunch Break
13:30-14:00 | Introduction to the Algebraic Multiscale Method | Dongwoo Sheen, Seoul National University | Chensong Zhang, LSEC, CAS
14:00-14:30 | Simulation of density-driven groundwater flow with a phreatic surface in complicated geometries | Dmitry Logashenko, KAUST
14:30-15:00 | A Framework of Non-Convex Methods for Low-Rank Matrix Reconstruction: Algorithms and Theory | Jianfeng Cai, HKUST
15:00-15:30 | Decoupling Techniques for Coupled Models in Multi-Physics Supercomputing | Lian Zhang, HKUST
15:30-16:00 | Coffee Break
16:00-16:30 | Implementation of Immersed Finite Element Methods on Tetrahedral Meshes | Linbo Zhang, AMSS, CAS | Hui Liu, University of Calgary
16:30-17:00 | Co-designing Computational Fluid Dynamics and Numerical Linear Algebra | Xin He, Chinese Academy of Sciences
17:00-17:30 | JPSOL, an App-oriented Parallel Suite for Numerical Algebraic Problems Arising from PDEs | Ran Xu, IAPCM
17:30-17:40 | Closing

1. High Performance Numerical Algorithms and Software for Large-scale Science & Engineering Applications

14/01/2019, 08:30-17:20, Xiangyun Room

(1) Development of a high-performance multi-threaded ILU-GMRES solver based on a novel parallel ordering technique

Speaker:

Takeshi Iwashita
Vice Director and Professor, Information Initiative Center, Hokkaido University

Abstract

A high-performance multi-threaded ILU-GMRES solver was developed. The ILU-GMRES method is one of the most popular solvers for linear systems of equations with an unsymmetric coefficient matrix, and it has been used in various applications. When parallelizing the ILU-GMRES solver, the sparse triangular solver involved in the preconditioning step is often problematic. One of the well-known techniques for parallelizing the sparse triangular solver is parallel ordering. Among the various parallel ordering techniques, we focused on algebraic block multi-color ordering and developed an enhanced version of it that is applicable to linear systems with an unsymmetric coefficient matrix. The effectiveness of the developed solver was confirmed in numerical tests conducted on the latest multi-core and many-core processors.
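As an illustration of the parallel ordering idea mentioned in the abstract, the following is a minimal sketch of plain node-wise greedy multi-color ordering. It is not the speaker's algebraic block multi-color method; the function names and the small grid example are assumptions made purely for illustration. Unknowns that share a color are not coupled to each other, so within one color the substitution steps of a triangular solve could be carried out in parallel.

```python
def grid_adjacency(nx, ny):
    """Adjacency of a 5-point stencil on an nx-by-ny grid (hypothetical test problem)."""
    adj = {}
    for j in range(ny):
        for i in range(nx):
            k = j * nx + i
            nbrs = []
            if i > 0:
                nbrs.append(k - 1)
            if i < nx - 1:
                nbrs.append(k + 1)
            if j > 0:
                nbrs.append(k - nx)
            if j < ny - 1:
                nbrs.append(k + nx)
            adj[k] = nbrs
    return adj


def greedy_multicolor(adjacency):
    """Give each node the smallest color not used by an already-colored neighbor."""
    colors = {}
    for node in sorted(adjacency):
        used = {colors[nb] for nb in adjacency[node] if nb in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


def color_ordering(colors):
    """Group nodes by color; nodes inside one group are mutually independent."""
    groups = {}
    for node, color in colors.items():
        groups.setdefault(color, []).append(node)
    return [sorted(groups[c]) for c in sorted(groups)]


adjacency = grid_adjacency(3, 3)
colors = greedy_multicolor(adjacency)
ordering = color_ordering(colors)
print(ordering)
```

On this small grid the greedy pass reproduces the classical red-black ordering. A block variant would color groups of unknowns rather than single unknowns, which is beyond this sketch.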

About the Speaker

Takeshi Iwashita received a B.E., an M.E., and a Ph.D. from Kyoto University in 1992, 1995, and 1998, respectively. In 1998-1999, he worked as a post-doctoral fellow of the JSPS project in the Graduate School of Engineering, Kyoto University. He moved to the Data Processing Center of the same university in 2000. In 2003-2014, he worked as an associate professor in the Academic Center for Computing and Media Studies, Kyoto University. He currently works as the vice director and a professor in the Information Initiative Center, Hokkaido University. His research interests include high performance computing, linear iterative solvers, and electromagnetic field analysis.

(2) Numerical Simulation of Fluid Flows in Carbonate Oil Reservoirs

Speaker:

Chensong Zhang
Academy of Mathematics and Systems Science

Abstract

Mineral dissolution plays an important role in many subsurface transport processes including water-flooding, geological CO2 sequestration (GCS), and matrix acidizing in carbonate formations. Such a dissolution process could dramatically affect the efficiency of the oil/gas production. Hence, a numerical model that accurately describes the dynamic behavior of fracture evolution is essential. In this talk, we introduce a 3-D mathematical model that combines the Stokes-Brinkman equation and reactive-transport equations to describe the coupled processes of fluid flow, solute transport, and chemical reactions in both single and multiple mineral systems. The proposed numerical model can be applied to describe 3-D linear flow and radial flow for different scenarios. We employ a numerical procedure that solves the Stokes-Brinkman equation and the reactive-transport equations by the staggered-grid finite difference method and the control volume finite difference method, respectively, in a sequential fashion. We will also introduce a multiscale hybrid-mixed method for discrete fracture models. The three-dimensional fluid flow in the reservoir and the two-dimensional flow in the discrete fractures are approximated using mixed finite elements. A general system was developed where fracture geometries and conductivities are specified in an input file and meshes are generated using the public domain software GMSH. Several test cases illustrate the effectiveness of the proposed approach by comparing the multiscale results with direct simulations.

About the Speaker

Chen-Song Zhang, PhD, graduated from the Applied Mathematics & Scientific Computing program at the University of Maryland, College Park, US, and worked as a postdoctoral fellow at Penn State University, University Park, US. He is currently working at the Academy of Mathematics and Systems Science, CAS, China. His main research interests include numerical analysis, adaptive methods, petroleum reservoir simulation, complex fluid/flow simulation, and parallel computing.

(3) Numerical Simulations of Thermal Reservoir Model using Parallel Computers

Speaker:

Hui Liu
Researcher at the University of Calgary

Abstract

Thermal technologies are the main production processes in Canada due to the high viscosity of heavy oil: steam is injected into reservoirs to heat them and to reduce the viscosity of the heavy oil. The cost is high compared with conventional production technologies. Therefore, it is important to study the production plan before applying it to actual production. The simulation time can be hours, days, or even weeks if the model size is large or the geological model is complicated. In our work, parallel computers are employed to accelerate the simulation. As we know, the performance of parallel computers is proportional to the number of CPUs. As a result, thermal problems can be solved thousands of times faster using parallel computers. In this talk, numerical methods for the thermal model and their parallelization are presented.

About the Speaker

Hui Liu is a researcher at the University of Calgary, Canada. He works on numerical methods for reservoir simulations, preconditioning, algorithms, and the development of parallel reservoir simulators. He received his PhD in computational mathematics and parallel computing in 2010 from the Academy of Mathematics and Systems Science, Chinese Academy of Sciences.

(4) TBA

Speaker:

Gabriel Wittum
Professor of Applied Mathematics and Computational Science, Extreme Computing Research Center, King Abdullah University of Science and Technology

About the Speaker

Gabriel Wittum is a professor of Applied Mathematics and Computational Science at the Extreme Computing Research Center of the King Abdullah University of Science and Technology, Saudi Arabia. He received his PhD from the University of Heidelberg in 1991. Before he joined KAUST, he was director of the Goethe Center for Scientific Computing, Frankfurt University. His research focuses on a general approach to modelling and simulation of problems from empirical sciences, in particular using high performance computing (HPC). Particular areas of focus include: the development of advanced numerical methods for modelling and simulation, such as fast solvers like parallel adaptive multi-grid methods, allowing for application to complex realistic models; the development of corresponding simulation frameworks and tools; and the efficient use of top-level supercomputers for that purpose. These methods and tools are applied towards problem-solving in fields including computational fluid dynamics, environmental research, energy research, finance, neuroscience, pharmaceutical technology and beyond.

(5) Scalable Parallel Methods for Patient-specific Blood Flow Simulations

Speaker:

Rongliang Chen
Associate Professor, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences

Abstract

Numerical simulation of blood flows in compliant arteries based on patient-specific geometry and parameters can be clinically helpful for physicians or researchers to study vascular diseases, to enhance diagnoses, and to plan surgical procedures. In this talk, we will discuss some scalable parallel methods for the simulation of blood flow in compliant arteries on large-scale supercomputers. The blood flow is modeled by the 3D unsteady incompressible Navier-Stokes equations with a lumped parameter boundary condition, which are discretized with a stabilized finite element method on unstructured meshes in space and a fully implicit method in time. The large-scale discretized nonlinear systems are solved by a parallel Newton-Krylov-Schwarz method. Several mathematical, biomechanical, and supercomputing issues will be discussed in detail, and some numerical experiments for the cerebral and coronary arteries will be presented. We will also report the parallel performance of the methods on a supercomputer with a large number of processors.

About the Speaker

Rongliang Chen holds an associate professor position in the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences. He received his PhD in computational mathematics from Hunan University in 2012. He visited the University of Colorado Boulder as a joint PhD student from 09/2009 to 04/2012 and as a visiting scholar from 09/2014 to 02/2015 and from 07/2017 to 08/2018. His research interests include high performance computing, computational fluid dynamics, domain decomposition methods, and fluid-structure interaction computation.

(6) Fast Nonoverlapping Block Jacobi Method for the Dual Rudin-Osher-Fatemi Model

Speaker:

Chang-Ock Lee
Professor, Department of Mathematical Sciences, KAIST

Abstract

In this talk, we consider nonoverlapping domain decomposition methods for total variation minimization. We show that the nonoverlapping relaxed block Jacobi method for the dual Rudin-Osher-Fatemi model has an O(1/n) convergence rate of the energy functional, where n is the number of iterations. Moreover, by exploiting the forward-backward splitting structure of the method, we propose an accelerated version whose convergence rate is O(1/n^2). Numerical results for comparison with existing methods are presented.

About the Speaker

Chang-Ock Lee is a Professor in the Department of Mathematical Sciences, KAIST. He received his Ph.D. from the University of Wisconsin-Madison in 1995. His research areas include domain decomposition methods (DDM for optimization problems, dual iterative substructuring method with a penalty term) and isogeometric analysis. He is Vice President of the Korean Society of Computational Mechanics (since 2016) and Chair of the Committee for International Exchanges, Korean Society for Industrial and Applied Mathematics (KSIAM) (since 2017).

(7) Introduction to the Algebraic Multiscale Method

Speaker:

Dongwoo Sheen
President of the Korean Society of Computational Sciences

Abstract

We introduce an algebraic multiscale method. The idea is motivated by the algebraic multigrid method (AMG), which is an iterative scheme to accurately approximate the solution of a linear system. Our approach differs in investigating how to obtain a rough approximation to the original algebraic system arising from modeling heterogeneous materials. We discuss in detail how macroscopic basis functions can be formulated and result in a reduced macroscopic linear system based on knowledge of the microscopic linear system, with a significant reduction of dimension.

About the Speaker

Dongwoo Sheen graduated from Seoul National University (SNU) with Bachelor's and Master's degrees in Mathematics in 1981 and 1983. He received his PhD under the guidance of Prof. Jim Douglas, Jr. He was appointed as a postdoctoral research fellow at the University of Pavia, Italy, and at Purdue University until he joined SNU in 1993 as an assistant professor. Serving as a professor of Mathematics, he also founded and chaired the Interdisciplinary Program in Computational Science and Technology at SNU for many years. His research interests include numerical analysis and scientific computation in several application areas including fluid and solid mechanics, electrodynamics, mathematical finance, and mathematical biology. Specifically, he has contributed to developing several fundamental nonconforming finite element methods and parallel algorithms based on Laplace transform methods. He is currently the President of the Korean Society of Computational Sciences. He was a Plenary Speaker at the 5th Asian Mathematical Conference, Kuala Lumpur, Malaysia (2009). He received a Certificate of Commendation from the Minister of Science and Technology of Korea in 2012 for his contribution to supercomputing in Korea.

(8) Simulation of density-driven groundwater flow with a phreatic surface in complicated geometries

Speaker:

Dmitry Logashenko
Research scientist at the Extreme Computing Research Center of the King Abdullah University of Science and Technology

Abstract

The presence of a phreatic surface separating the saturated and unsaturated parts of an aquifer substantially influences the groundwater flow. Macroscopically, this interface can be modeled as a moving boundary which is tracked with the flow. We represent it by the level-set method and use the ghost-fluid method to impose the boundary conditions for the flow model in the saturated part. This technique is applied to the density-driven haline groundwater flow in real hydrogeological formations. These 3D domains have complicated, anisotropic, layered geometries and curved boundaries that should be accurately resolved by unstructured grids consisting mainly of prisms. The non-linear model is discretized by a finite volume method. The linearized systems are solved by the geometric multigrid method with ILU smoothing. The high resolution of the grid motivates the parallelization of the computations. In the talk, we present examples of these simulations.

About the Speaker

Dmitry Logashenko is a research scientist at the Extreme Computing Research Center of the King Abdullah University of Science and Technology, Saudi Arabia. He received his PhD from the Department of Mathematics and Computer Science of Heidelberg University in 2004. His research interests include the development, analysis, and implementation of efficient numerical methods for solving partial differential equations on parallel architectures. His main field of application is density-driven groundwater flow in fractured porous media and with free surfaces.

(9) A Framework of Non-Convex Methods for Low-Rank Matrix Reconstruction: Algorithms and Theory

Speaker:

Jianfeng Cai
Associate Professor, Department of Mathematics, Hong Kong University of Science and Technology

Abstract

The low-rank matrix is a versatile model that describes the structure of many datasets of practical interest arising from machine learning, bioinformatics, computer vision, etc. Under this model, a fundamental problem is how to recover a low-rank matrix from a small number of linear samples. We present a framework of non-convex methods for low-rank matrix recovery. Our methods will be applied to several concrete example problems such as matrix completion, phase retrieval, and robust principal component analysis. We will also provide theoretical guarantees for the convergence of our methods to the correct low-rank matrix.

About the Speaker

Jian-Feng Cai is an associate professor in the Department of Mathematics, Hong Kong University of Science and Technology (HKUST). He obtained his Bachelor's degree in Computational Mathematics from Fudan University and his PhD degree in Mathematics from the Chinese University of Hong Kong. Before joining HKUST in 2015, he worked at the National University of Singapore, UCLA, and the University of Iowa. His research focuses on the design and analysis of algorithms for problems in imaging and data sciences, using tools from computational harmonic analysis, numerical linear algebra, optimization, and high-dimensional probability.

(10) Decoupling Techniques for Coupled Models in Multi-Physics Supercomputing

Speaker:

Lian Zhang
Ph.D. student, Department of Mathematics, HKUST

Abstract

We discuss decoupling issues for numerical computation with coupled PDE models in large-scale simulation of multi-physics systems. An abstract mathematical framework is presented for devising effective and efficient decoupled numerical methods. Applications to fluid-structure interaction (FSI) will be examined. Approximation and stability issues will be addressed, with special attention to the added-mass effect in decoupling FSI computation.

About the Speaker

Lian Zhang is a Ph.D. student in the Department of Mathematics, HKUST. He received his B.S. from Zhejiang University in 2015. His current research interests are in the areas of numerical PDEs, numerical analysis, and multi-physics problems. He mainly works on numerical simulation and numerical schemes for fluid-structure interaction (FSI) problems.

(11) Implementation of Immersed Finite Element Methods on Tetrahedral Meshes

Speaker:

Linbo Zhang
Professor of Academy of Mathematics and Systems Science of Chinese Academy of Sciences, Director of State Key Laboratory of Scientific and Engineering Computing

Abstract

In this talk I will present algorithms and user interfaces in the open source parallel adaptive finite element toolbox PHG for immersed interface methods, especially a robust numerical quadrature algorithm for high order extended finite element methods, and demonstrate their applications with some benchmark problems. The codes are freely available in the distributions of PHG and can be used in implementations of immersed boundary/interface algorithms on tetrahedral meshes.

About the Speaker

Linbo Zhang was born in 1962. He graduated from the Mathematics Department of Beijing University in 1982 and received his Ph.D. degree in Mathematics from Université de Paris-Sud, France, in 1987. Currently, he is a professor at the Academy of Mathematics and Systems Science of the Chinese Academy of Sciences, and the director of the State Key Laboratory of Scientific and Engineering Computing. His research interests include numerical algorithms and high performance computing.

(12) TBA

About the Speaker

Wei Xue (Tsinghua, China). Dr. Xue is an associate professor at Tsinghua University, China. His research interests include numerical algorithms and performance optimization on modern supercomputers.

(13) A Kouhia Stenberg Type Immersed Finite Element Method for Elasticity Problems

Speaker:

Do-Young Kwak
Professor, Department of Mathematical Sciences, KAIST

About the Speaker

Do-Young Kwak is a professor in the Department of Mathematical Sciences, KAIST. He received his PhD in Mathematics from the University of Pittsburgh in 1985. His research interest is numerical analysis.

(14) Co-designing Computational Fluid Dynamics and Numerical Linear Algebra

Speaker:

Xin He
Associate professor in the Institute of Computing Technology, Chinese Academy of Sciences

Abstract

Computational fluid dynamics (CFD) is of key importance in many academic and industrial applications, e.g., the maritime industry. In this presentation, I introduce how to use numerical linear algebra and high performance computing to accelerate CFD, in particular the fast and efficient solution of sparse linear systems arising from CFD.

About the Speaker

Dr. Xin He is an associate professor in the Institute of Computing Technology, Chinese Academy of Sciences. Dr. He received his PhD degree from Uppsala University, Sweden, and then worked as a post-doctoral researcher at Delft University of Technology, the Netherlands. His main research interests are computational fluid dynamics, numerical linear algebra, and high performance computing.

(15) JPSOL, an App-oriented Parallel Suite for Numerical Algebraic Problems Arising from PDEs

Speaker:

Ran Xu
Associate Professor with the Institute of Applied Physics and Computational Mathematics, Beijing, China

Abstract

A new parallel numerical solver suite, JPSOL, will be introduced in this presentation. JPSOL was founded by the Institute of Applied Physics and Computational Mathematics in 2015 and aims to act as a powerful tool for huge numerical algebraic problems arising from scientific and realistic engineering applications. Unlike well-known packages such as hypre and PETSc, JPSOL is written in modern C++ and provides a more convenient and more natural way of use. This means that users can define and solve linear, non-linear, and eigenvalue problems through semantically uniform interfaces. With the matrices and vectors in JPSOL, which are defined as combinations of the necessary data and traits, it is easy to build new algorithms in the natural matrix-vector form of expression. Furthermore, an application can define its own characteristic matrix or preconditioner by obeying a few behavioral standards. For example, a new matrix type only needs to implement the SpMV operator. At the end of the talk, some numerical results for realistic engineering cases achieved with JPSOL will be shown, such as the Sanxia Dam and nuclear plants.
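The remark that a new matrix type only has to supply a matrix-vector product can be illustrated with a short matrix-free sketch. This is not JPSOL's interface (JPSOL is a C++ package whose API is not described here); the `Laplacian1D` class, the `matvec` method name, and the conjugate gradient routine below are hypothetical and chosen only to show the general idea that a Krylov solver needs nothing from the operator beyond its action on a vector.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the zero initial guess
    p = list(r)          # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


class Laplacian1D:
    """Hypothetical user-defined operator: tridiagonal (-1, 2, -1), never stored."""

    def __init__(self, n):
        self.n = n

    def matvec(self, v):
        out = []
        for i in range(self.n):
            left = v[i - 1] if i > 0 else 0.0
            right = v[i + 1] if i < self.n - 1 else 0.0
            out.append(2.0 * v[i] - left - right)
        return out


op = Laplacian1D(8)
b = [1.0] * 8
x = conjugate_gradient(op.matvec, b)
residual = max(abs(bi - ai) for bi, ai in zip(b, op.matvec(x)))
print(residual)
```

The solver never inspects matrix entries, so any object that provides the same product could be substituted without changing the solver code.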

About the Speaker

Ran Xu received the B.S. and Ph.D. degrees in computational mechanics from Tsinghua University, Beijing, China, in 2006 and 2012, respectively. He is currently an Associate Professor with the Institute of Applied Physics and Computational Mathematics, Beijing, China. His current interests include fast linear system solvers, computational mechanics, massively parallel computation methods, and their applications.

2. Open Source HPC Collaboration on Arm Architecture: Linaro Workshop

Time: 08:55-18:00, 14/01/2019

Location: Function Room 2 (祥云厅)

Time | Title | Speaker | Chair
08:55-09:00 | Workshop Introduction | Jill Guo, Linaro EVP | Victor Duan, Linaro
09:00-09:25 | System Software for Armv8-A with SVE | Yutaka Ishikawa, RIKEN-CCS
09:25-09:50 | Deep Learning Frameworks Portability Survey: Post-K perspective | Mohamed Wahib, RIKEN-CCS
09:50-10:05 | AM Coffee Break
10:05-10:30 | Arm SVE and ML acceleration | Francesco Petrogalli, Arm
10:30-10:55 | The First SVE Enabled Arm Processor: A64FX and Building up Arm HPC Ecosystem | Shinji Sumimoto, Fujitsu
10:55-11:20 | Benchmarking Huawei ARM Multi-Core Processors for HPC workloads | Lin Xinhua, Shanghai Jiao Tong University
11:20-11:45 | Science Cloud services for Computational Chemistry with Arm HPC | Teppei Ono, HPC Systems Inc.
11:45-12:10 | Multi-scale Application Software Development Ecosystem on ARM | Xiaohu Guo, UK National HPC Centre
12:10-13:30 | Buffet Lunch and Networking
13:30-14:00 | Transforming HPC with Huawei ARM HPC Solution | Pak Lui, Huawei
14:00-14:30 | The New Generation of Phytium's 64-Core Processor and Ecosystem | Guo Yufeng, Phytium
14:30-14:50 | Arm Neoverse | Dong Wei, Arm
14:50-15:10 | PM Coffee Break
15:10-15:30 | End to End Deep Learning Solution on Arm | Jammy Zhou, Linaro
15:30-15:50 | GIGABYTE Position in ARM Server Market - Leading Pioneer | Akira Hoshino, Gigabyte
15:50-16:10 | Scale-out AI Training on Massive Core System: from HPC to Fabric-based SOC | Fu Li, Quantum Cloud
16:10-16:50 | Panel: Frontiers of AI Deployments in HPC on Arm | Elsie Wahlig, Linaro
17:00-18:00 | Welcome Banquet

2. Open Source HPC Collaboration on Arm Architecture: Linaro Workshop

14/01/2019, 08:45-16:50, Hongyun Room

(1) System Software for Armv8-A with SVE

Speaker

Yutaka Ishikawa
RIKEN-CCS

About the Speaker

Yutaka Ishikawa is in charge of developing the post-K computer. Ishikawa received the BS, MS, and PhD degrees in electrical engineering from Keio University. From 1987 to 2001, he was a member of AIST (former Electrotechnical Laboratory), METI. From 1993 to 2001, he was the chief of the Parallel and Distributed System Software Laboratory at Real World Computing Partnership. He led the development of cluster system software called SCore, which was used in several large PC cluster systems around 2004. From 2002 to 2014, he was a professor at the University of Tokyo. He led the project to design a commodity-based supercomputer called the T2K open supercomputer. As a result, three universities, Tsukuba, Tokyo, and Kyoto, each obtained a supercomputer based on the specification.

(2) Deep Learning Frameworks Portability Survey: Post-K perspective

Speaker

Mohamed Wahib
Senior scientist at AIST/TokyoTech Open Innovation Laboratory, Tokyo, Japan

About the Speaker

Mohamed Wahib is currently a senior scientist at AIST/TokyoTech Open Innovation Laboratory, Tokyo, Japan. Prior to that he worked as a researcher in RIKEN Center for Computational Science (RIKEN-CCS). He received his Ph.D. in Computer Science in 2012 from Hokkaido University, Japan. Prior to his graduate studies, he worked as a researcher at Texas Instruments (TI) R&D labs in Dallas, TX for four years. His research interests revolve around the central topic of "Performance-centric Software Development" in the context of HPC. He is actively working on several projects including high-level frameworks for programming traditional scientific applications, as well as high-performance AI and data analytics.

(3) Arm SVE and ML acceleration

Speaker

Francesco Petrogalli
Software engineer working on the development of Arm Compiler for HPC

About the Speaker

Francesco Petrogalli is a software engineer working on the development of Arm Compiler for HPC. He contributed to the implementation of the Vector Length Agnostic (VLA) vectorizer for the Scalable Vector Extension (SVE) of Arm, to ABI specifications for the vector functions, and to the open source library SLEEF. Francesco has also worked on optimizing a variety of computational kernels that are core to Machine Learning algorithms.

(4) The First SVE Enabled Arm Processor: A64FX and Building up Arm HPC Ecosystem

Speaker

Shinji Sumimoto
Fujitsu

About the Speaker

Shinji Sumimoto is a Senior Architect in the Software Development division at Fujitsu. He is in charge of technical development for the Post-K computer and is an HPC system software specialist who has worked for over 20 years on the research and development of ultra-large-scale, high-performance communication libraries and cluster file systems. He is also working on ARM HPC ecosystem development.

(5) Benchmarking Huawei ARM Multi-Core Processors for HPC workloads

Speaker

Lin Xinhua
Vice Director of High Performance Computing (HPC) Center at Shanghai Jiao Tong University

About the Speaker

Dr. James Lin is the Vice Director of the High Performance Computing (HPC) Center at Shanghai Jiao Tong University (SJTU). He received his M.Sc. degree in Computer Science from SJTU in 2005 and his Ph.D. degree in Mathematical and Computing Science from Tokyo Institute of Technology in 2018. He worked in the Department of Computer Science at SJTU from 2005 to 2012, and then moved to the HPC Center at the same university. In the HPC Center, he built 'π', the first GPU-accelerated supercomputer in China's universities. This 350 TFlops supercomputer ranked No. 158 in the TOP500 List in June 2013, and was the fastest supercomputer in China's universities from April 2013 to October 2015. His research focuses on performance optimizations on emerging many-core processors, such as NVIDIA GPU and Intel Xeon Phi. He was the PI of the NVIDIA CUDA Center of Excellence (CCoE) in 2011 and the Intel Parallel Computing Center (IPCC) in 2017 in recognition of his ongoing research on parallel computing. He serves on the steering committee of the Intel eXtreme Performance Users Group (IXPUG). He has also served on the organizing committees of several conferences, for example as a proceedings co-chair of Supercomputing19, a co-chair of International Scalable Computing Challenge15 (co-located with CCGrid15), and a publicity co-chair for Computing Frontier19, ICPP17, and eScience17.

(6) Science Cloud services for Computational Chemistry with Arm HPC

Speaker

Teppei Ono
President & CEO at HPC Systems Inc.

About the Speaker

Teppei Ono holds a BSBA from Northeastern University, MA, USA. In 2004, he founded an industrial computer and high-end server development and small-to-medium manufacturing service company in Taiwan. He has experience developing a low-power microblade server with the Transmeta Efficeon processor for a Japanese tier-one IT company, as well as Electronic Design Automation (EDA) workstations and high-end servers with the AMD Opteron processor for semiconductor companies. Since 2007, he has been President & CEO at HPC Systems Inc., a customer-oriented HPC solution company providing hardware and simulation software, system integration services, cloud services for High Performance Computing, AI/Deep Learning, and Computational Chemistry consulting services for technology and manufacturing companies in Life Science and Materials Science.

(7) Multi-scale Application Software Development Ecosystem on ARM

Speaker

Xiaohu Guo
Principal computational scientist at STFC Hartree Centre (UK National HPC Centre)

About the Speaker

Dr. Xiaohu Guo holds a Ph.D. in parallel computing. He is currently a principal computational scientist at STFC Hartree Centre (UK National HPC Centre), leading the development of computational methods using particle-based and unstructured-mesh-based computation for a range of application fields including CFD, materials science, and medical image processing and reconstruction. His work has provided enabling technologies for a wide range of engineering and science applications spanning the brief of the UK STFC, EPSRC, and NERC research councils. One of Guo's research commitments at the moment is to address the common computational challenges for large-scale particle-method-based simulations and unstructured-mesh-based applications with novel computing technologies. Dr. Guo is the leading developer of several software packages which have wide applications in the areas of nuclear thermal hydraulics, offshore and marine energy industries, offshore oil and gas industries, and coastal engineering. Dr. Guo is also a member of the SC/ISC technical program committee, Specialist Editor in HPC, Grid and Novel Computing for Computer Physics Communications, and a visiting Professor of Harbin Engineering University.

(8) Transforming HPC with Huawei ARM HPC Solution

Speaker

Pak Lui
Principal Architect at Silicon Valley Computing Lab at Futurewei Technologies, the Huawei Technologies R&D center in the USA

About the Speaker

Pak Lui works as a Principal Architect at Silicon Valley Computing Lab at Futurewei Technologies, the Huawei Technologies R&D center in the USA. He serves as the HPC|Works Special Interest Group Co-Chair and Student Cluster Competition Manager for the HPC Advisory Council. He has been involved in demonstrating application performance on various open source and commercial applications. His main responsibilities involve characterizing HPC workloads, analyzing MPI profiles to optimize HPC applications, and exploring new technologies, solutions, and their effectiveness on real HPC workloads. Previously, Pak worked as a Senior Manager at Mellanox Technologies in Silicon Valley, where his main focus was to optimize HPC applications on products and to explore new technologies and solutions and their effect on real workloads. Pak has been working in the HPC industry for over 17 years. Pak worked as a Cluster Engineer at Penguin Computing, responsible for building and testing HPC cluster configurations from different OEMs' hardware and ISVs' software. Pak also worked at Sun Microsystems for over 7 years in Sun's High Performance Computing (HPC) group, where he worked as a Software Engineer on Sun's own MPI implementation (Sun HPC ClusterTools) as well as Open MPI, and contributed code and performed scalability analysis with MPI applications on large clusters and SMPs. Pak holds a B.Sc. in Computer Systems Engineering and an M.Sc. in Computer Science from Boston University in the USA. Pak also helped organize the last Linaro Arm Architecture HPC Workshop in Santa Clara, CA (Silicon Valley) in July last year.

(9) The New Generation of Phytium’s 64-Core Processor and Ecosystem

Speaker

Guo Yufeng
Deputy general manager of Tianjin Phytium Information Technology Co., Ltd.

About the Speaker

Guo Yufeng, Ph.D., is deputy general manager of Tianjin Phytium Information Technology Co., Ltd. He has long been engaged in the research and development of high-performance computers and microprocessor chips, and has led the development of many Phytium CPU chips.

(10) Arm Neoverse

Speaker

Dong Wei
Senior director and lead architect, distinguished engineer at Arm

About the Speaker

Dong Wei is a senior director, lead architect and distinguished engineer at Arm. He is responsible for the ServerReady certification program and the related SBSA, SBBR, EBBR and SBMG standards. He is the Vice President (Chief Executive) of the UEFI Forum, co-chairs its ACPI Spec Working Group and chairs its UEFI Test Working Group. He chairs the PCI Firmware Working Group at the PCI SIG. He is also the vice-chair of the Software Working Group at the CCIX Consortium. He represents Arm at DMTF and OCP. Before joining Arm in 2016, he was a VP and Fellow at HP responsible for the system architecture definitions for PA-RISC, Itanium, x86, and RISC-V systems, and co-founded the UEFI technology with Intel.

(11) End to End Deep Learning Solution on Arm

Speaker

Jammy Zhou
Solution Director of Linaro China

About the Speaker

Jammy Zhou is the Solution Director of Linaro China, driving technical collaboration with regional members in various areas including Arm servers, Artificial Intelligence and IoT. Before that, he worked at AMD as the lead architect of the AMDGPU-Pro Linux graphics driver stack and contributed many patches upstream. He also worked at Freescale for some time, where he gained extensive experience with Linux/Android BSPs for Arm embedded platforms.

(12) GIGABYTE Position in ARM Server Market - Leading Pioneer

Speaker

Akira Hoshino
Product Strategy & Planning Head of GIGABYTE’s NCBU (Network & Communications Business Unit)

About the Speaker

Akira Hoshino is the Product Strategy & Planning Head of GIGABYTE’s NCBU (Network & Communications Business Unit), with responsibility for the planning and roadmap of GIGABYTE’s worldwide server products, leading the company’s development of ARM server products, and managing business with selected ARM processor vendors.

(13) Scale-out AI Training on Massive Core System: from HPC to Fabric-based SOC

Speaker

Fu Li
Founder and CEO of Quantum Cloud Future (Beijing) Technologies Co., Ltd.

About the Speaker

Dr. Fu Li is the founder and CEO of Quantum Cloud Future (Beijing) Technologies Co., Ltd., which focuses on cutting-edge, application-centric hyper-converged computing and storage products for the media, AI, and blockchain industries. He is also the founder and CEO of Lumi Media Limited and Quantum Cloud Inc. He has a wealth of experience in quantum statistics, molecular simulation and future networks. Recognized for his contributions to the cloud computing and media application industries, he serves as vice-minister of the China Society of Motion Picture and Television Engineers and as an industry professor at the School of Digital Media of Jiangnan University, and has been named an innovative and entrepreneurial talent of Jiangsu province.

3.Deep Learning for Science DL4Sci

Time: 09:30-16:30, 14/01/2019

Location: Ball Room B (宴会厅B厅)

Time Title Speaker Chair
9:30-9:40 Opening Remarks James Lin
SJTU
James Lin
SJTU
9:40-10:10 Reinforcement learning based image restoration and semantic image super-resolution Chao Dong
Shenzhen Institute of Advanced Technology
10:10-10:30 Privacy-Aware process mapping in geo-distributed cloud data centers Amelie Chi Zhou
Shenzhen University
10:30-10:50 Coffee Break
10:50-11:20 Machine Learning for Simulation of Molecular Dynamics Simon See
Nvidia
11:30-13:30 Lunch Break
13:30-14:30 Broad Learning System: An effective and efficient incremental learning system without the need for deep architecture C. L. Philip Chen
University of Macau
Yanjie Wei
Shenzhen Institute of Advanced Technology
14:30-15:00 Deep learning in protein structure prediction Zhen Li
Chinese University of Hong Kong (Shenzhen)
15:00-15:20 Coffee Break
15:20-15:50 Videl: A vision-based AI diagnoser for early leukemia Jianwen Wei
SJTU
15:50-16:20 Overview of Solving Differential Equations with AI Charles Cheung
NVIDIA
16:20-16:30 Closing Remarks Yanjie Wei
Shenzhen Institute of Advanced Technology

3.Deep Learning for Science DL4Sci

14/01/2019, 09:00-16:30, Yanhui Hall Room A

(1) Reinforcement learning based image restoration and semantic image super-resolution

Speaker

Chao Dong
Associate professor at Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences

Abstract

This talk introduces two of our recent works (published at CVPR 2018) on low-level vision problems. In the first paper, we investigate a novel approach to image restoration by reinforcement learning. Unlike existing studies that mostly train a single large network for a specialized task, we prepare a toolbox consisting of small-scale convolutional networks of different complexities, each specialized in a different task. Our method, RL-Restore, then learns a policy to select appropriate tools from the toolbox to progressively restore the quality of a corrupted image. In comparison to conventional human-designed networks, RL-Restore is capable of restoring images corrupted with complex and unknown distortions in a more parameter-efficient manner using the dynamically formed toolchain. In the second paper, we solve the problem of semantic super-resolution and show that it is possible to recover textures faithful to semantic classes. In particular, we only need to modulate the features of a few intermediate layers in a single network, conditioned on semantic segmentation probability maps. This is made possible through a novel Spatial Feature Modulation (SFM) layer that generates affine transformation parameters for spatial-wise feature modulation. Our final results show that an SR network equipped with SFM can generate more realistic and visually pleasing textures than state-of-the-art methods.
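
A minimal sketch of the kind of spatial-wise affine modulation the abstract describes, written in plain Python for illustration only. The function name `spatial_feature_modulation` and the fixed linear weights `w_gamma` and `w_beta` are hypothetical; in the method presented in the talk the affine parameters are generated by a learned layer, not by a fixed linear map.

```python
def spatial_feature_modulation(features, seg_probs, w_gamma, w_beta):
    """Apply a per-pixel affine transform: out = gamma * feature + beta.

    features:  H x W nested list of feature values (one channel, for brevity).
    seg_probs: H x W x K nested list of semantic class probabilities.
    w_gamma, w_beta: length-K weights (hypothetical stand-ins for a learned
                     mapping from class probabilities to gamma and beta).
    """
    out = []
    for feat_row, prob_row in zip(features, seg_probs):
        out_row = []
        for f, p in zip(feat_row, prob_row):
            # The scale and shift depend on the semantic class at this pixel.
            gamma = sum(w * pk for w, pk in zip(w_gamma, p))
            beta = sum(w * pk for w, pk in zip(w_beta, p))
            out_row.append(gamma * f + beta)
        out.append(out_row)
    return out


# Two pixels: the first is surely class 0, the second surely class 1.
features = [[1.0, 2.0]]
seg_probs = [[[1.0, 0.0], [0.0, 1.0]]]
modulated = spatial_feature_modulation(features, seg_probs, [2.0, 0.5], [0.0, 1.0])
print(modulated)  # [[2.0, 2.0]]
```

The point of the sketch is that the same feature value is transformed differently depending on the semantic class predicted at its location.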

(2) Privacy-Aware process mapping in geo-distributed cloud data centers

Speaker

Amelie Chi Zhou
Assistant Professor at Shenzhen University

Abstract

Recently, various applications including data analytics and machine learning have been developed for geo-distributed cloud data centers. For those applications, the way parallel processes are mapped to physical nodes (i.e., “process mapping”) can significantly impact performance because of the non-uniform communication cost in geo-distributed environments. Moreover, the different data privacy requirements in geo-distributed data centers pose additional constraints on process mapping solutions. While process mapping has been widely studied in grid/cluster environments, few existing studies have considered the problem in geo-distributed cloud environments, where it is challenging due to heterogeneous network performance, multi-level data privacy constraints and process failures. In this paper, we introduce the special privacy requirements in geo-distributed data centers and formulate the geo-distributed process mapping problem as an optimization problem with multiple constraints. We develop a new method to efficiently find good process mapping solutions to the problem. Experimental results on real clouds (including Amazon EC2 and Windows Azure) and simulations demonstrate that our proposed approach can achieve significant performance improvement compared to state-of-the-art algorithms.

About the Speaker

Amelie Chi Zhou is currently an Assistant Professor at Shenzhen University, China. Before joining Shenzhen University, she was a postdoctoral fellow at the Inria-Bretagne research center, France. She received her PhD degree in 2016 from the School of Computer Engineering, Nanyang Technological University, Singapore. Her research interests lie in cloud computing, big data processing, distributed systems and resource management.

(3) Machine Learning for Simulation of Molecular Dynamics

Speaker

Simon See
Solution Architecture and Engineering Director and Chief Solution Architect for Nvidia AI Technology Center.

Abstract

Molecular dynamics (MD) simulation is an important tool in materials science, physics, biochemistry, drug design and many other areas of scientific research. However, long MD simulations are mathematically ill-conditioned, generating cumulative errors in numerical integration that can be minimized with proper selection of algorithms and parameters, but not eliminated entirely. Moreover, the predictive power of these simulations is only as good as the underlying interatomic potential. MD simulations employing classical force fields constitute the cornerstone of contemporary atomistic modelling, yet classical potentials often fail to faithfully capture key quantum effects in molecules and materials. In this talk, the author gives an overview of how machine learning and deep learning are being applied to MD and QMD.

About the Speaker

Dr. Simon See is currently the Solution Architecture and Engineering Director and Chief Solution Architect for Nvidia AI Technology Center. He is also a Professor and Chief Scientific Computing Officer at Shanghai Jiao Tong University and a Professor at Beijing University of Posts and Telecommunications (BUPT). He was conferred the title of Distinguished Fudan Scholar in September 2018 by Fudan University, Shanghai, China. Previously, Professor See was also the Chief Scientific Computing Advisor for BGI (China) and held positions at Nanyang Technological University (Singapore) and King Mongkut’s University of Technology (Thailand). Professor See is currently involved in a number of smart city projects, especially in Singapore and China. His research interests are in the areas of High-Performance Computing, Big Data, Artificial Intelligence, Machine Learning, Computational Science, Applied Mathematics and Simulation Methodology. Professor See is also leading some of the AI initiatives in the Asia Pacific. He has been a Steering Committee member of NSCC’s flagship High Performance Computing conference, Supercomputing Asia (SCA), since March 2018. He has published over 200 papers in these areas and has won various awards.

(4) Broad Learning System: An effective and efficient incremental learning system without the need for deep architecture

Speaker

C. L. Philip Chen
Chair professor of the University of Macau (UM) Faculty of Science and Technology (FST), Member of Academia Europaea, American Association for the Advancement of Science (AAAS) fellow, Fellow of the IEEE, Chinese Association of Automation, Academician of the International Academy of Systems and Cybernetics Sciences (IASCYS).

Abstract

In recent years, deep learning has set off a wave of research in machine learning. Thanks to its outstanding performance, more and more applications of deep learning in pattern recognition, image recognition, speech recognition, and video processing have been developed. This talk introduces “Broad Learning”, a very fast and accurate learning approach that does not need a deep structure. Instead of stacking layers, the designed neural networks expand their neural nodes laterally and update their weights incrementally when additional nodes are needed and when input data enter the networks continuously. The designed network structure and learning algorithm are well suited to modeling and learning in big-data environments. Experiments indicate that the designed structure and algorithm outperform existing structures and learning algorithms. If time permits, a Fuzzy Restricted Boltzmann Machine to enhance deep learning will be introduced.

About the Speaker

Prof. C. L. Philip Chen is a chair professor of the University of Macau (UM) Faculty of Science and Technology (FST), a member of Academia Europaea, American Association for the Advancement of Science (AAAS) fellow, a Fellow of the IEEE, Chinese Association of Automation, and an Academician of the International Academy of Systems and Cybernetics Sciences (IASCYS). He was President of the IEEE Systems, Man, and Cybernetics Society from 2012 to 2013. Currently, he is editor-in-chief of the IEEE Transactions on Systems, Man, and Cybernetics: Systems and an associate editor of several IEEE Transactions. He has served as general program chair of the IEEE SMC Society flagship conference and many international conferences in different capacities; has given keynote talks and served as a visiting chair professor; and has received seven best paper and research awards. In addition, he is an Accreditation Board of Engineering and Technology Education (ABET) program evaluator for computer engineering, electrical engineering, and software engineering programs.

(5) Deep learning in protein structure prediction

Speaker

Zhen Li
Research assistant professor of School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen); Research scientist of Shenzhen Research Institute of Big Data

Abstract

Accurately and reliably predicting protein structures is one of the most challenging tasks in computational biology. With the help of Next Generation Sequencing (NGS) techniques, protein sequences are accumulating in the UniProt database at an exponential rate, while the number of solved protein structures in the Protein Data Bank (PDB) grows at a linear rate. Starting from the original amino acid sequence, I focus my research on protein structure prediction using deep learning methods based on big data, i.e., protein secondary structure prediction, protein contact map prediction and de novo protein tertiary structure folding based on previously predicted structure.

About the Speaker

Zhen Li is currently a research assistant professor in the School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), and a research scientist at the Shenzhen Research Institute of Big Data. His research interests include protein structure prediction and 3D vision tasks using deep learning. He was previously a research intern at the Technological Institute at Chicago and the Computational Institute at the University of Chicago. He has published papers as (co-)first author at top conferences including IJCAI, ECCV, ICCV and RECOMB and in the journal Cell Systems (Cell Press). He was also a core member of the team that won the CASP12 competition in 2016 (a highly cited work according to Web of Science).

(6) Videl: A vision-based AI diagnoser for early leukemia

Speaker

Jianwen Wei
Engineer in High Performance Computing (HPC) Center at Shanghai Jiao Tong University

Abstract

Leukemia is one of the top 10 cancers in China, affecting especially young children and the elderly. Treating leukemia at an early stage can significantly increase the cure rate. However, detecting early-stage leukemia, i.e., finding a small number of abnormal blood cells among a mass of normal cells, is both technically challenging and labor-intensive. Therefore, we designed an AI-powered diagnoser for precise identification of white blood cells. The precise identification has two steps, blood image collection and cell classification. First, with the help of doctors in Ruijin Hospital, we have been collecting about 1,000 high-quality labeled blood cell samples each week and have accumulated more than 200,000 samples in total. This is the largest blood cell image dataset for leukemia in China. Second, based on these blood cell samples, we adopted a ResNet-variant classification method and achieved above 95% accuracy on 17 subtypes of white blood cells. The classification accuracy is comparable to that of experienced doctors in Ruijin Hospital. Third, models are trained on the latest Intel Knights Landing 7210 platform with Intel Omni-Path interconnects. Applying MKL-optimized matrix operators and an RDMA-enabled communication library (PSM) shortens the training time from more than 6 hours to less than an hour. Our AI-powered diagnoser can reduce the time and cost of early leukemia diagnosis from one week to one day and from 1,000 USD to 100 USD, respectively.

About the Speaker

Mr. Jianwen Wei is an engineer in the High Performance Computing (HPC) Center at Shanghai Jiao Tong University (SJTU). He received his M.Sc. degree in Electronic Engineering from SJTU in 2005. His research focuses on performance optimization on high-speed, low-latency networks and AI applications in precision medicine. He actively contributes to several HPC open source projects, including the Spack package manager and the Puppet configuration management system.

(7) Overview of Solving Differential Equations with AI

Speaker

Charles Cheung
Deep Learning Solution Architect in NVIDIA AI Technology Center

Abstract

Differential equations play a very important role in many areas including engineering, physics, economics and biology. Many numerical solvers, such as the finite difference method, finite element method, finite volume method and meshless methods, have been developed over the years to tackle different types of equations with different properties. Apart from these traditional numerical methods, researchers have started to solve differential equations with AI algorithms such as machine learning and deep learning. With this new tool in the field, we can develop new solvers for inverse problems, high-dimensional equations and fast simulation. In this talk, an overview of AI solvers for partial differential equations will be given.

About the Speaker

Dr. Charles Cheung is currently a Deep Learning Solution Architect at the NVIDIA AI Technology Center. He obtained his bachelor’s and Ph.D. degrees in applied and computational mathematics from Hong Kong Baptist University. His research focuses on meshless methods, in particular the asymmetric collocation method, for solving partial differential equations, fractional differential equations and partial differential equations on surfaces. Besides academic research, he worked at the Hong Kong Applied Science and Technology Research Institute on applied research, where he was a project manager leading a machine vision group developing deep learning algorithms for industrial defect inspection.

4.IXPUG Asia Workshop

Time: 13:30-17:00, 14/01/2019

Location: Ballroom C (宴会厅C厅)

Time Title Speaker Chair
13:30-13:35 Opening Remarks James Lin
Shanghai Jiao Tong University
James Lin
Shanghai Jiao Tong University
13:35-14:20 Keynote: Massively scalable computing method for handling large eigenvalue problems for nanoelectronics modeling Hoon Ryu
KISTI
14:20-14:50 Videl: A vision-based AI diagnoser for early leukemia Jianwen Wei
(co-authored by Shenggan Chen, Ming Zhao, Zhangyu Jin, Jie Wang, Yichao Wang, and James Lin)
SJTU
Invited Talk: Scheme of the code modernization optimization for ocean numerical model: A case study of MASNUM wave model
Zhenya Song
FIO, MNR
15:20-15:40 Coffee Break
15:40-16:00 OpenCL-enabled high performance direct memory access for GPU-FPGA cooperative computation Ryohei Kobayashi
(co-authored by Norihisa Fujita, Yoshiki Yamaguchi, Taisuke Boku)
CCS at University of Tsukuba
Taisuke Boku
University of Tsukuba
Invited Talk: Introduction for benchmark test for multi-scale computational materials software Shun Xu
CAS-CNIC
16:30-16:50 Optimization of parallel mesh generation via file for multigrid method Toshihiro Hanawa
(co-authored by Kengo Nakajima)
University of Tokyo
16:50-17:00 Closing Remarks Taisuke Boku
University of Tsukuba

4.IXPUG Asia Workshop

14/01/2019, 13:30-17:00, Yanhui Hall Room B

(1) Massively scalable computing method for handling large eigenvalue problems for nanoelectronics modeling

Speaker

Hoon Ryu
Principal researcher at Korea Institute of Science and Technology Information

Abstract

This talk will help you learn how the Lanczos iterative algorithm can be extended with parallel computing to solve highly degenerate systems. The talk will address the performance benefits of the core numerical operations in the Lanczos iteration when driven by manycore processors (KNL), compared to heterogeneous systems containing PCI-E add-in devices. This work will also demonstrate an extremely large-scale benchmark (~2500 KNL computing nodes) that was recently performed on the KISTI-5 (NURION) HPC resource. As this talk covers the numerical details of the algorithm, it should also be quite instructive to those who are considering KNL systems for solving large-scale eigenvalue problems.
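
As a rough illustration of the Lanczos iteration mentioned in the abstract, the following plain-Python sketch reduces a small symmetric matrix to tridiagonal form. It is a serial toy under stated assumptions, not the speaker's parallel implementation: the names `lanczos` and `matvec` and the 3x3 test matrix are hypothetical, and no reorthogonalization or handling of degenerate eigenvalues is attempted.

```python
import math


def lanczos(matvec, v0, steps):
    """Run `steps` Lanczos iterations for a symmetric operator.

    matvec: function returning A @ v for a list v (the dominant kernel).
    v0:     nonzero starting vector.
    Returns (alphas, betas): diagonal and off-diagonal of the tridiagonal T.
    """
    norm = math.sqrt(sum(x * x for x in v0))
    v = [x / norm for x in v0]
    v_prev = [0.0] * len(v)
    beta = 0.0
    alphas, betas = [], []
    for j in range(steps):
        w = matvec(v)
        # Three-term recurrence: remove the components along v_prev and v.
        w = [wi - beta * pv for wi, pv in zip(w, v_prev)]
        alpha = sum(wi * vi for wi, vi in zip(w, v))
        w = [wi - alpha * vi for wi, vi in zip(w, v)]
        alphas.append(alpha)
        beta = math.sqrt(sum(wi * wi for wi in w))
        if j == steps - 1 or beta < 1e-12:  # finished, or invariant subspace found
            break
        betas.append(beta)
        v_prev, v = v, [wi / beta for wi in w]
    return alphas, betas


# Small symmetric test matrix (already tridiagonal, so T should reproduce it).
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]


def matvec(v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


alphas, betas = lanczos(matvec, [1.0, 0.0, 0.0], 3)
print(alphas, betas)  # [2.0, 3.0, 4.0] [1.0, 1.0]
```

The eigenvalues of the small tridiagonal matrix built from `alphas` and `betas` approximate those of the original operator. In a large-scale setting the matrix-vector product and the inner products are the operations one would expect to distribute across nodes.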

About the Speaker

Dr. Hoon Ryu received his B.S., M.S. and Ph.D. degrees from the School of Electrical Engineering, Seoul National University; the Department of Electrical Engineering, Stanford University; and the School of Electrical and Computer Engineering, Purdue University, respectively. He was with the System LSI Division, Samsung Electronics, and is currently with the Korea Institute of Science and Technology Information, where he works as a principal researcher leading the Intel Parallel Computing Center. His specialty and main research interests are in the simulation of advanced semiconductor devices with the aid of numerical analysis coupled with high performance computing.

(2) Videl: A Vision-based AI Diagnoser for Early Leukemia

Speaker

Jianwen Wei
Engineer in High Performance Computing (HPC) Center at Shanghai Jiao Tong University

Abstract

Leukemia is one of the top 10 cancers in China, affecting especially young children and the elderly. Treating leukemia at an early stage can significantly increase the cure rate. However, detecting early-stage leukemia, i.e., finding a small number of abnormal blood cells among a mass of normal cells, is both technically challenging and labor-intensive. Therefore, we designed an AI-powered diagnoser for precise identification of white blood cells. The precise identification has two steps, blood image collection and cell classification. First, with the help of doctors in Ruijin Hospital, we have been collecting about 1,000 high-quality labeled blood cell samples each week and have accumulated more than 200,000 samples in total. This is the largest blood cell image dataset for leukemia in China. Second, based on these blood cell samples, we adopted a ResNet-variant classification method and achieved above 95% accuracy on 17 subtypes of white blood cells. The classification accuracy is comparable to that of experienced doctors in Ruijin Hospital. Third, models are trained on the latest Intel Knights Landing 7210 platform with Intel Omni-Path interconnects. Applying MKL-optimized matrix operators and an RDMA-enabled communication library (PSM) shortens the training time from more than 6 hours to less than an hour. Our AI-powered diagnoser can reduce the time and cost of early leukemia diagnosis from one week to one day and from 1,000 USD to 100 USD, respectively.

About the Speaker

Mr. Jianwen Wei is an engineer in the High Performance Computing (HPC) Center at Shanghai Jiao Tong University (SJTU). He received his M.Sc. degree in Electronic Engineering from SJTU in 2005. His research focuses on performance optimization on high-speed, low-latency networks and AI applications in precision medicine. He actively contributes to several HPC open source projects, including the Spack package manager and the Puppet configuration management system.

(3) Scheme of the Code Modernization Optimization for ocean numerical model: A case study of MASNUM wave model

Speaker

Zhenya Song
Professor at the First Institute of Oceanography (FIO), MNR, China

Abstract

Numerical models have become one of the key tools for ocean research and forecasting, and increasing their computational efficiency is now an urgent need. In order to make full use of the characteristics of modern computer architectures and improve the ocean model’s computational efficiency, this paper proposes a code optimization scheme, demonstrated using the MASNUM wave model as an example. First, Intel VTune Amplifier XE and Intel Trace Analyzer and Collector were used to evaluate the performance and load balancing of the MASNUM wave model. Then four optimization steps, namely compiler options, serial and scalar optimization, vectorization, and MPI/OpenMP parallelization, are proposed for the hotspot function located by Intel VTune Amplifier XE. The results show that after optimization the speedup reaches 1.95 times on a single node, and the strong scalability of the model is almost linear when extended to multiple nodes. This indicates that the code modernization scheme is very effective.

About the Speaker

Zhenya Song is a professor at the First Institute of Oceanography (FIO), MNR, China. He received his Ph.D. in Physical Oceanography from Ocean University of China. His research focuses on ocean and climate numerical simulation, HPC, and short-term climate prediction.

(4) OpenCL-enabled High Performance Direct Memory Access for GPU-FPGA Cooperative Computation

Speaker

Ryohei Kobayashi
Assistant professor of Center for Computational Sciences, University of Tsukuba, Japan

Abstract

We propose a high-performance GPU-FPGA data communication method using OpenCL and Verilog HDL mixed programming in order to make both devices work together smoothly. OpenCL is used to program application algorithms and data movement control, while Verilog HDL is used to implement low-level components for memory copies between the two devices. Experimental results using toy programs showed that our proposed method achieves remarkable communication performance, confirming that it is effective at realizing high-performance GPU-FPGA cooperative computation.

About the Speaker

Ryohei Kobayashi is an assistant professor at the Center for Computational Sciences, University of Tsukuba, Japan. His research interests include FPGA systems for high performance computing. He received his Ph.D. degree from Tokyo Institute of Technology, Japan, in 2016.

(5) Introduction for Benchmark Test for Multi-Scale Computational Materials Software

Speaker

Shun Xu
Associate Professor at High Performance Computing department at Computer Network Information Center of Chinese Academy of Sciences

Abstract

Software benchmark tests can detect not only inherent problems in software but also configuration problems in hardware. We choose mainstream computational materials software, including LAMMPS, NWChem, ABINIT, MICRESS and others, to carry out specific software performance benchmarks and analysis. The goal is to extract common problems of software parallelism in the domain of computational materials, such as the parallel computing model, hierarchical storage management, network communication and parallel I/O. Based on the benchmark results, we will present the critical bottlenecks that limit performance improvement. For the configuration of Intel platforms in particular, corresponding solutions are suggested, and some results of performance prediction and analysis are also given. We try to establish a standard for performance benchmarking and analysis, covering the design model, data format and so on, which will help us better understand the combined performance of software and hardware.

About the Speaker

Dr. Shun Xu joined the High Performance Computing department at the Computer Network Information Center of the Chinese Academy of Sciences in August 2015. He received his PhD in Computer Science in 2012 and is currently an Associate Professor studying high performance computing technologies and the parallel optimization of scientific computing applications. He has focused on the performance optimization of computational simulation software in computational materials and computational chemistry, and has worked on optimizing the LAMMPS package on Intel platforms. He is currently leading the R&D of the eMD software, which is oriented to large-scale molecular dynamics simulation in high-performance computing environments.

(6) Optimization of Parallel Mesh Generation via File for Multigrid Method

Speaker

Toshihiro Hanawa
Associate professor of Information Technology Center, The University of Tokyo

Abstract

We consider the optimization of the parallel mesh generation process on the Oakforest-PACS system (OFP), a large-scale cluster using Intel Xeon Phi and the Omni-Path Architecture. OFP has not only the Lustre file system but also the Infinite Memory Engine (IME) as a fast file cache system. MPI-IO enables single shared file access, optimal file access on the Lustre file system, and native access to IME. As a result, we obtained up to 243 times better performance than the original File Per Process version. Moreover, the Single Shared File case with IME is 31 times faster than the original File Per Process version.
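
To make the contrast between File Per Process and Single Shared File concrete, here is a minimal sketch of the shared-file layout in plain Python. It only emulates the idea serially: each "rank" writes its block at a disjoint byte offset obtained from an exclusive prefix sum of block sizes. It does not use MPI-IO, Lustre or IME, and the helper names `shared_file_offsets` and `write_single_shared_file` are hypothetical.

```python
import os
import tempfile


def shared_file_offsets(sizes):
    """Exclusive prefix sum: byte offset at which each rank's block starts."""
    offsets, total = [], 0
    for size in sizes:
        offsets.append(total)
        total += size
    return offsets


def write_single_shared_file(path, chunks):
    """Write every rank's chunk into one file at its own offset.

    Ranks are emulated one after another here; with real MPI-IO they would
    write to their disjoint regions of the shared file concurrently.
    """
    offsets = shared_file_offsets([len(c) for c in chunks])
    with open(path, "wb") as f:
        for offset, chunk in zip(offsets, chunks):
            f.seek(offset)
            f.write(chunk)
    return offsets


# Three emulated ranks with blocks of different sizes.
chunks = [b"rank0;", b"rank-one;", b"r2;"]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mesh.bin")
    offsets = write_single_shared_file(path, chunks)
    with open(path, "rb") as f:
        contents = f.read()

print(offsets)   # [0, 6, 15]
print(contents)  # b'rank0;rank-one;r2;'
```

With File Per Process, the same data would instead end up in one file per rank, which multiplies the number of files the file system has to manage.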

About the Speaker

Toshihiro Hanawa received his M.E. and Ph.D. degrees in computer science from Keio University in 1995 and 1998, respectively. He was an assistant professor at Tokyo University of Technology, Japan, from 1998 to 2007; a research fellow at the Center for Computational Sciences (CCS), University of Tsukuba, from 2007 to 2008; an associate professor at CCS from 2008 to 2013; and a project associate professor at the Information Technology Center, The University of Tokyo, from 2013 to 2015. Since Dec. 2015, he has been an associate professor at the Information Technology Center, The University of Tokyo. His research interests include computer architecture, interconnection networks, and accelerated computing. Dr. Hanawa is a member of IEEE CS and IPSJ.

5.Intelligent Computing Inspires HPC Potential

Time: 13:00-17:10, 15/01/2019

Location: Function Room 1 (鸿运厅)

Time Topic Speaker Chair
13:00-13:30 Huawei Intelligent Full-Stack HPC Platform Accelerates Research Progress Xie Haibo
Huawei
Pak Lui
Huawei
13:30-14:00 Transforming HPC with Huawei ARM HPC Solution Francis Lam
Huawei
14:00-14:30 Large-Scale Structural Analysis Software ADVENTURECluster Akira Nakayama
SCSK
14:30-14:50 Lucky Draw
14:50-15:20 Science Cloud ARM HPC Teppei Ono
HPC Systems
15:20-15:50 Huawei High Performance Computing Public Cloud Solution Liu Chengyang
Huawei
16:00-16:20 Best Practice of HPC Hybrid Cloud Platform with AI Yunsheng Duan
Anhui University
16:20-16:30 Lucky Draw
16:30-17:00 Panel
17:00-17:10 Lucky Draw

5.Intelligent Computing Inspires HPC Potential

15/01/2019, 13:00-17:10, Yongyun Room

(1) Huawei Intelligent Full-Stack HPC Platform Accelerates Research Progress

Speaker

Xie Haibo
Head of HPC Solution, Huawei

(2) Transforming HPC with Huawei ARM HPC Solution

Speaker

Francis Lam
Director of HPC Product Management, Huawei

(3) Large-Scale Structural Analysis Software ADVENTURECluster

Speaker

Akira Nakayama
General Manager, Analysis Solutions Dept. III, SCSK

Abstract

SCSK, one of the largest full line ICT companies in Japan, will reveal why the self-developed ADVENTURECluster has incredible strength when applied to highly complicated and nonlinear models, such as IC engine assembly and more.

About the Speaker

Akira Nakayama, General Manager, Analysis Solutions Dept. III, SCSK

(4) Science Cloud ARM HPC

Speaker

Teppei Ono
CEO, HPC Systems, Japan

Abstract

HPC Systems, a leading HPC solution vendor in Japan, provides high-performance computing cloud services in the Japanese and Chinese markets using state-of-the-art ARM servers. The CEO of HPC Systems will personally introduce what these services are and how to use them.

About the Speaker

Teppei Ono, CEO, HPC Systems, Japan

(5) Huawei High Performance Computing Public Cloud Solution

Speaker

Liu Chengyang
Huawei

(6) Best Practice of HPC Hybrid Cloud Platform with AI

Speaker

Yunsheng Duan
Director of Modern Education Technology Center, Anhui University

Abstract

Huawei works with PARATERA to build a one-stop HPC+AI computing platform for Anhui University. The platform supports unified resource planning and management to improve device utilization, reduce O&M complexity, and avoid repetitive construction. The hybrid cloud supercomputing platform of Anhui University has provided high-quality services for academic research and scientific research innovation in all disciplines.

About the Speaker

Yunsheng Duan, Director of Modern Education Technology Center, Anhui University

6.Visualization for Insights in HPC

Time: 13:00-17:10, 15/01/2019

Location: Function Room 3 (瑞云厅)

Time Title Speaker Chair
13:00-13:30 Data and Task Management for Visualization in HPC Xiaoru Yuan
Peking University
Xiao Li
Southwest University of Science and Technology
13:30-14:00 Topology Visualization in Large-Scale Vector Field Wenke Wang
NUDT
14:00-14:30 Exploration of Time-varying Multivariate Volume Data Based on Isosurface Similarities Jun Tao
Sun Yat-sen University
14:30-15:00 Coffee Break
15:00-15:30 TeraVAP: The Visual Analysis Engine for Scientific Computing Li Xiao
Institute of Applied Physics and Computational Mathematics
Xiaoru Yuan
Peking University
15:30-16:00 Perception Enhanced Flow Visualization and Interaction Yadong Wu
Southwest University of Science and Technology
16:00-16:30 Uncertainty Analysis of Association Patterns and Visualization of Feature Exploration Huijie Zhang
Northeast Normal University
16:30-17:10 Biclusters Based Visual Exploration of Multivariate Spatial Data Xiangyang He
Zhejiang University

6.Visualization for Insights in HPC

15/01/2019, 13:00-17:10, Ruiyun Room

(1) Topology Visualization in Large-Scale Vector Field

Speaker

Wenke Wang

Abstract

Critical points and periodic orbits are the two most important topological features of a vector field. This talk will introduce efficient visual computing techniques for these two features. For critical points, the redundant computation of existing detection methods is reduced. Moreover, the Poincaré index of each critical point can be calculated during detection without much additional effort. Then two efficient streamline visualization methods will be introduced. One method replaces numerical integration with Newton iteration and achieves higher speed with the same accuracy. The other method targets the computation of large numbers of integral curves and speeds up the visualization by reusing intermediate results. Based on these techniques, an efficient visualization method for periodic orbits in two-dimensional vector fields is introduced. This method is based on the Poincaré-Bendixson theorem and reduces the redundant computation during the extraction of periodic orbits.
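
For readers unfamiliar with the Poincaré index mentioned in the abstract, the following is a minimal sketch of the standard definition (the winding number of the field direction along a small closed loop around a point). It is not the speaker's detection method; the function name `poincare_index`, the circular sampling loop, and the `radius` and `samples` parameters are illustrative assumptions.

```python
import math

def poincare_index(field, center, radius=0.1, samples=256):
    """Estimate the Poincare index of a 2D vector field at `center`.

    `field` is a hypothetical callable (x, y) -> (u, v). The index is the
    number of full turns the field direction makes while walking once,
    counter-clockwise, around a small circle centred on the point.
    """
    cx, cy = center
    total = 0.0
    prev = None
    for k in range(samples + 1):
        t = 2.0 * math.pi * k / samples
        u, v = field(cx + radius * math.cos(t), cy + radius * math.sin(t))
        angle = math.atan2(v, u)
        if prev is not None:
            # Wrap the angle change into [-pi, pi) so jumps of atan2 cancel.
            d = (angle - prev + math.pi) % (2.0 * math.pi) - math.pi
            total += d
        prev = angle
    return round(total / (2.0 * math.pi))

# Simple analytic fields used only as examples.
source = lambda x, y: (x, y)      # index +1
saddle = lambda x, y: (x, -y)     # index -1
uniform = lambda x, y: (1.0, 0.0) # no critical point, index 0

print(poincare_index(source, (0.0, 0.0)))
print(poincare_index(saddle, (0.0, 0.0)))
print(poincare_index(uniform, (0.0, 0.0)))
```

The sketch assumes the loop encloses at most one isolated critical point and that the field does not vanish on the loop itself; on gridded data, the sampling would instead follow the edges of a cell.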

About the Speaker

Wenke Wang is an associate professor at the National University of Defense Technology. He received his BS and PhD degrees in computer science and technology from Tsinghua University in 2003 and 2009, respectively. His primary research interest is scientific visualization. He has long been engaged in the research and development of large-scale scientific visualization systems for the Tianhe series of supercomputers. He won a second prize of the Military Science and Technology Progress Award. He has published more than 40 papers on scientific visualization and holds 5 granted national invention patents. He served on the Program Committee of ChinaVis in 2015, 2016, 2017 and 2018.

(2) Exploration of Time-varying Multivariate Volume Data Based on Isosurface Similarities

Speaker

Jun Tao

Abstract

Many scientific simulations produce time-varying multivariate volume data that can span hundreds or thousands of time steps and consist of tens of variables. Understanding this kind of data is essential to study the underlying physical phenomena. However, the scale of data often poses great challenges in both the analysis to discover important features and their relationships and the visualization to convey this information to scientists. This talk will cover both aspects. We will first introduce an efficient computation strategy to measure similarities among isosurfaces and then discuss an interactive approach for exploring the similarities and understanding the relationships among variables and their temporal development.

About the Speaker

Jun Tao is an associate professor at the School of Data and Computer Science at Sun Yat-sen University and the National Supercomputer Center in Guangzhou. He received a PhD degree in computer science from Michigan Technological University in 2015 and worked as a postdoctoral researcher at the University of Notre Dame from 2015 to 2018. His major research interest is scientific visualization, especially developing expressive exploration tools for understanding flow fields and time-varying multivariate data sets. He is also interested in graph-based visualization, image collection visualization, and software visualization. He received the Dean's Award for Outstanding Scholarship and the Finishing Fellowship at Michigan Tech in 2015, and a Best Paper Award at IS&T/SPIE VDA 2013.

(3) Uncertainty Analysis of Association Patterns and Visualization of Feature Exploration

Speaker

Huijie Zhang

Abstract

Multivariable volume data are commonly high-dimensional, time-varying and large-scale. This complexity makes it difficult to quantify the correlation between voxels effectively. To solve this problem, we propose comprehensive evaluation methods for voxel similarity and spatial neighbourhood difference that consider the multivariable time-varying patterns of voxels and the information in their spatial neighbourhood, so that the correlation between voxels can be analyzed effectively. To address the inevitable uncertainties in scientific simulation experiments, we show an overview of the association patterns among different variables, together with their credibility, by extracting the uncertainty isosurface from the volume data and applying a fusion coloring technique. Building on existing work, I will discuss how information visualization methods extend and inspire research ideas in scientific visualization, and introduce our preliminary exploratory work, which combines a subspace clustering method with RadViz projection to help analysts explore complex features that are difficult to find in multivariate volume data.

About the Speaker

Huijie Zhang is a full professor in the School of Information Science and Technology, Northeast Normal University, where she heads the Data Visualization and Visual Analysis Research Group. She received her PhD degree in computer science from Jilin University. Her main research interests include data visualization, visual analysis, computer graphics and optimization algorithms. She has published more than 50 academic papers in journals including IEEE Access, The Visual Computer, Pattern Recognition and Neurocomputing, and at international conferences including ACM SIGSPATIAL GIS, EuroVis and PacificVis. She has also published an academic monograph on terrain visualization. She has received grants from the National Natural Science Foundation of China, the National Natural Science Youth Foundation of China and the Human Resources and Social Security Department. She is a senior member of the China Computer Federation (CCF), was vice chair of the CCF Young Scholars Forum (YOCSEF) at Changchun in 2018, and is a standing committee member of the Visualization and Visual Analysis Committee of the China Society of Image and Graphics (CSIG). She has also won the Jilin Talent Development Fund and a second prize of the Natural Science Academic Achievement Award in Jilin Province.

(4) Biclusters Based Visual Exploration of Multivariate Spatial Data

Speaker

Xiangyang He

7.Vendor Vision

Time: 13:00-17:10, 15/01/2019

Location: Function Room 2 (祥云厅)

Time Talk Speaker Chair
13:00-13:05 Opening Ruibo Wang
NUDT
13:05-13:35 Transforming HPC: Huawei Embraces the Era of Intelligent Computing Francis Lam
Huawei
13:35-14:05 Data Defines the Future Liu Yinglei
Intel
14:05-14:25 A Development Way for China Domestic Application Jian Chen
Paratera
14:25-14:45 Enabling the Arm HPC Ecosystem using Open Source Software Renato Golin
Linaro
14:45-15:10 Coffee Break
15:10-15:30 Ali Cloud-Supercomputing Architecture: Practices and Cases Wanqing He
Alibaba
15:30-15:50 New Generation Fabric Manager & Advanced Monitoring System Haakon Bryhni
Fabriscale
15:50-16:10 How performance ensures AI success in Future Ying Liu
DDN
16:10-16:30 From Competition to the Next HPC Generation — ASC Student Supercomputer Challenge Aska Huang
Inspur
16:30-16:50 Network Computing - The Way to Exa-Scale HPC & AI System Qingchun Song
Mellanox
16:50-17:00 Closing

7.Vendor Vision

15/01/2019, 13:00-17:10, Xiangyun Room

(1) Transforming HPC: Huawei Embraces the Era of Intelligent Computing

Speaker

Francis Lam
Huawei

Abstract

TBA

About the Speaker

Francis brings 20+ years of HPC and IT industry experience, specializing in server systems design and HPC solution architecture. Before joining Huawei Enterprise USA as Director of HPC Product Management, Francis had served in the Huawei US R&D Center since 2011 as an HPC System Architect. Francis is responsible for driving the future direction of Huawei HPC products and solutions. Prior to joining Huawei, Francis spent 10+ years with Hewlett-Packard, 8 years with Oracle/Sun Microsystems and 2 years with Super Micro.

(2) Data Defines the Future

Speaker

Liu Yinglei
Intel

Abstract

The digital economy is transforming the way we live and work. It is driving a virtuous cycle of growth from IoT to cloud infrastructure that is tightly coupled with HPC, high performance data analytics and AI techniques to extract actionable intelligence. This new dynamic is creating a whole new category of value creation: data is the new oil. In this talk, we explore methodologies for integrating HPC simulation into the digital economy.

About the Speaker

Liu Yinglei (Sam) has a long history in the server industry, with 19 years of experience in the most advanced and dynamic server technologies, marketing and business development, and ecosystem enablement. Most recently, Yinglei has worked as HPC Sales Director for the HPC Sales Enablement team in the Intel Data Center Sales Group, focusing on marketing Intel products to support key local HPC and data analytics deals. Yinglei received his Bachelor of Science in Physics from Fudan University. He lives in Shanghai.

(3) A Development Way for China Domestic Application

Speaker

Jian Chen
Paratera

Abstract

In recent years, China has developed rapidly in the field of high-performance computing. On the list of the world's 500 fastest supercomputers, Chinese supercomputers ranked first for ten consecutive editions; on the latest list, China has 227 systems against 109 for the United States, exceeding the United States in total, and the gap is tending to widen further. In high-performance applications, China has won the Gordon Bell Prize, the most influential award in high-performance computing, two consecutive times. China has an extraordinary influence in the world of high-performance computing. However, despite these achievements, the development of high-performance computing applications in China lags behind. The core high-performance applications come from advanced research institutions and companies abroad, for example in the United States and Europe. Core industrial software is monopolized by foreign vendors, which seriously restricts the development of the relevant industry sectors in China. China not only needs to develop its own chips; it also urgently needs to develop independent HPC applications.

About the Speaker

Jian Chen, Ph.D., is CEO of Beijing PARATERA Tech Co., Ltd., a CCF Director, a member of the CCF High Performance Computing Commission, a member of the CCF YOCSEF Academic Committee, and a member of the TEEC Tsinghua Entrepreneur Association. From 2005 to 2010, he served as Intel China High Performance Computing Architect and Senior Performance Optimization Engineer, responsible for system architecture design, HPC system optimization and high-performance computing technology promotion for China's large and cutting-edge HPC projects. From 2002 to 2005, he served as manager of the solution department of Lenovo's high-performance computing server division and participated in the development of Lenovo's 1T-flops and 4T-flops supercomputers. He graduated from the Department of Engineering Mechanics of Tsinghua University in 2002 with a Ph.D. in fluid mechanics. He was a visiting scholar at TU Delft in the Netherlands for one year.

(4) Enabling the Arm HPC Ecosystem using Open Source Software

Speaker

Renato Golin
Linaro

Abstract

The move from mobile to server markets presents a unique challenge to Arm vendors, which were used to custom solutions and now need to create a consolidated ecosystem. Unlike other architectures, Arm has a diverse hardware ecosystem. In collaboration with many Arm-based companies, Linaro builds software for a robust, open source ecosystem that all vendors can participate in, improve and profit from. In this talk, we will introduce the Linaro company, its culture and its collaborative work to enable Arm-based platforms for a new era of computing.

About the Speaker

Renato is the Tech-Lead for the HPC-SIG at Linaro, working with vendors and users to foster the open source ecosystem on Arm hardware. A toolchain engineer for the past 10 years, he previously worked as LLVM Team-Lead at Linaro, on HPC compilers for HPCC, and on embedded compilers and debuggers for Arm. Before that, he worked in bioinformatics, web infrastructure and some consultancy, adding up to another 10 years.

(5) Ali Cloud-Supercomputing Architecture: Practices and Cases

Speaker

Wanqing He
Alibaba

Abstract

TBA

About the Speaker

TBA

(6) New Generation Fabric Manager & Advanced Monitoring System

Speaker

Haakon Bryhni
Fabriscale

Abstract

As HPC systems are built with more powerful compute nodes, GPU and FPGA accelerators and network links with higher bandwidth, there is an increasing need for optimal management of network resources. We will show how many applications in InfiniBand-based HPC clusters can greatly benefit from improved network management. For communication-intensive workloads, Fabriscale has demonstrated up to 40% improved performance of a standard HPC system with an InfiniBand network using our “Wingman” software fabric manager instead of traditional OpenSM. The presentation will explain how improved InfiniBand routing by smart software can reduce runtime and enable HPC systems to increase job throughput and system reliability. Examples of how the performance of typical HPC applications can be improved by high-end InfiniBand routing will be presented, using benchmarks from leading HPC systems in China and the USA. Another challenge facing HPC owners and operators is how to understand the interplay between jobs and how they affect the performance of the overall HPC system. We will present the network intelligence and analytics platform “Hawk-Eye” and how it is used by HPC owners to simplify management and provide analytics that can be used for troubleshooting and performance improvement. Hawk-Eye collects and analyses performance statistics gathered from the entire HPC system, and the data is visualized directly in the HPC topology. The monitoring system is closely integrated with Slurm, Torque and other job management systems to leverage job scheduling information to visualize jobs in the cluster, identify potential job-specific network bottlenecks and conduct job management. All performance data is stored in a scalable database so the operator can compare job performance as a function of time, including information on how each job is using the network. Hawk-Eye automatically monitors the HPC system and raises alarms (link failures, port error rates, congestion notification etc.) only when the operator’s attention is required. The presentation will demonstrate how improved fabric management software and advanced network analytics can improve performance and simplify operation of state-of-the-art HPC systems.

About the Speaker

TBA

(7) How performance ensures AI success in Future

Speaker

Ying Liu
DDN

Abstract

TBA

About the Speaker

TBA

(8) From Competition to the Next HPC Generation — ASC Student Supercomputer Challenge

Speaker

Aska Huang
Inspur

Abstract

HPC is now facing major challenges, including the gap between applications and HPC systems, the increasing convergence of AI and HPC, and the cultivation of HPC talent to unleash its potential and accelerate the development of the HPC industry. In this talk, we will introduce the ASC Student Supercomputer Challenge and how Inspur is approaching these challenges through the world's largest supercomputer hackathon, with continuing efforts to inspire innovation, boost talent cultivation and advance the development of the HPC industry.

About the Speaker

TBA

(9) Network Computing - The Way to Exa-Scale HPC & AI System

Speaker

Qingchun Song
Mellanox

Abstract

TBA

About the Speaker

Qingchun Song holds a master's degree in computing science from Tsinghua University, China, and has 18 years of experience in the HPC and storage industry. Song has served as technical director of Mellanox Asia, general manager of Mellanox Taiwan and principal architect of AI solutions at Mellanox China, and is currently senior director of market development for Mellanox APAC and chair of the HPC-AI Advisory Council. Song is a member of the ODCC Technical Expert Group and successfully built the RDMA ecosystem around China's mainstream machine learning/deep learning and big data frameworks, as well as the new storage market.

8.Benchmarking in Data Center

Time: 13:00-17:30, 16/01/2019

Location: Function Room 2 (祥云厅)

Time Title Speaker Chair
13:00-13:30 An overview of data centers in China Guo Liang Samar Aseeri
KAUST
13:30-14:00 Using the SPEC HPG Benchmarks for Better Analysis and Evaluation of Current and Future HPC Systems Robert Henschel
Indiana University
14:00-14:10 Liquid cooling technology and performance efficiency in data center Vikcy Xie
CAICT
14:10-14:30 The Role of Privacy in Benchmarking Geo-Distributed Data Centers Yao Xiao
Shenzhen University
14:30-15:00 The Green500: Metrics, Methodologies, and Workloads Wu Feng
Virginia Tech
15:00-15:30 Coffee Break  
15:30-16:00 Benchmarking Java as a Possibility to Merge HPC and Big Data Processing Piotr Bala
Warsaw University
Benson Muite
University of Tartu
16:00-16:30 Benchmarking and Accelerating Big Data and Deep Learning Systems on Modern HPC and Cloud Architectures Xiaoyi Lu
Ohio State University
16:30-17:00 Intel MPI benchmarks Zhuowei Si
Intel
17:00-17:25  Discussion Juan Chen
NUDT
17:25-17:30 Closing  

8.Benchmarking in Data Center

16/01/2019, 13:00-17:30, Hongyun Room

(1) An overview of data centers in China

Speaker

Guo Liang

Abstract

In recent years, large data centers have developed rapidly in China, with major internet companies and telecom operators playing a very important role. In the process of this rapid development, IDCs face many problems, such as policy support and industrial collaboration. We need to do more to promote the healthy and rapid development of IDCs in China.

About the Speaker

Guo Liang has been engaged in policy support, standard formulation, technical research and evaluation of data centers for more than ten years. He has undertaken national 863 and related major projects, published more than 10 papers and holds 6 patents.

(2) Using the SPEC HPG Benchmarks for Better Analysis and Evaluation of Current and Future HPC Systems

Speaker

Robert Henschel
Director for Research Software and Solutions at Indiana University, Chair of the Standard Performance Evaluation Corporation (SPEC) High Performance Group, Treasurer of the OpenACC organization

Abstract

The High Performance Group (HPG) of the Standard Performance Evaluation Corporation (SPEC) is a forum for discussing and developing benchmark methodologies for High Performance Computing (HPC) systems. The group is comprised of HPC vendors and research institutions from all over the world and has been developing production quality benchmark suites, like SPEC MPI2007, SPEC OMP2012 and SPEC ACCEL, for over 20 years.
SPEC HPG benchmark suites are based on real world parallel scientific applications and go beyond synthetic kernel benchmarks to allow users to better understand real world system performance. An important part of the work is peer reviewing the results and publishing them in a repository on the SPEC web page. This curated result repository is freely available and can be used to model and estimate performance of a wide range of HPC systems.
In this talk, I will present an overview of the High Performance Group as well as SPEC’s benchmarking philosophy in general. Almost everyone knows SPEC for the SPEC CPU benchmarks that are heavily used when comparing processor performance, but the High Performance Group specifically focuses on whole-system benchmarking utilizing the parallelization paradigms common in HPC, like MPI, OpenMP and OpenACC. I will present use cases for the benchmarks, such as comparing compiler performance and the performance of programming paradigms like OpenMP and OpenACC. I will show how the benchmarks are used at Indiana University as a selection criterion during procurement of large HPC systems, to track system performance over time and to evaluate the overhead of various virtualization solutions. At the end of my talk, I will invite the audience to participate in the development of HPG’s next benchmark suite, a hybrid benchmark that will utilize multiple data sets to scale from a single compute node all the way to several thousand compute nodes with accelerators.

About the Speaker

Robert Henschel received his M.Sc. from Technische Universität Dresden, Germany. He joined Indiana University in 2008 as the manager of the Scientific Applications Group and today serves as the Director for Research Software and Solutions. He is responsible for providing advanced scientific applications to researchers at Indiana University and national partners as well as providing support for computational research to the Indiana University School of Medicine. Henschel serves as the chair of the Standard Performance Evaluation Corporation (SPEC) High Performance Group and in this role leads the development of production quality benchmarks for HPC systems. He also serves as the treasurer of the OpenACC organization. Henschel has a deep background in High Performance Computing and his research interests focus on performance analysis of parallel applications.

(3) Liquid cooling technology and performance efficiency in data center

Speaker

Vikcy Xie
China Academy of Information and Communications Technology

Abstract

In the 5G era, increased data transmission and network capacity have stimulated an explosive growth of data traffic. The underlying infrastructure, in particular data centers, faces the challenge of improving cooling, due to higher energy consumption, while keeping operating efficiency high. Air-cooled heat dissipation methods have traditionally been used in data centers. However, for high-density, large-scale data centers, liquid cooling technology has some advantages. For liquid cooling, a performance efficiency theory is needed to enable comparison between different systems. As computing performance and density increase, data center and server restructuring will be required, part of which will be guided by performance metrics.

About the Speaker

Vikcy Xie is affiliated with the China Academy of Information and Communications Technology.

(4) The Role of Privacy in Benchmarking Geo-Distributed Data Centers

Speaker

Yao Xiao
Graduate student at Shenzhen University

Abstract

Many recent big data applications involve analyzing a large volume of data generated in geo-distributed data centers. To analyze this data, input or intermediate data need to be transferred across data centers. This raises concerns about data privacy, owing to the different privacy requirements of data owners as well as the various data privacy regulations and laws that apply to those data centers. Although many studies have proposed ways to preserve the privacy of big data applications using techniques such as data encryption, differential privacy and constrained optimization, it is hard to compare their privacy-preserving effectiveness across studies because of the diverse problem assumptions and attack models. A privacy benchmark that provides a standard process for privacy-related evaluations and comparisons is needed. In this paper, we propose PrivBench, which includes four privacy-related metrics derived from a state-of-the-art data privacy regulation, namely the General Data Protection Regulation (GDPR) of the European Union. PrivBench can be used by data owners to choose an appropriately private data center to store data, or by data center managers to learn the privacy-preserving level of other data centers to build trust and seek collaborations. As the first step towards benchmarking privacy in geo-distributed data centers, we hope our work can encourage more studies on this topic.

About the Speaker

Yao Xiao is a graduate student at Shenzhen University.

(5) The Green500: Metrics, Methodologies, and Workloads

Speaker

Wu Feng
Professor of computer science and electrical & computer engineering at Virginia Tech

Abstract

To be announced

About the Speaker

Dr. Wu-chun Feng, or more simply "Wu", is a professor of computer science and electrical & computer engineering at Virginia Tech, where he directs the Systems, Networking, and Renaissance Grokking (SyNeRGy) Laboratory. His research interests lie broadly at the synergistic intersection of computer architecture, systems software and middleware, and applications software. Most recently, his research has dealt with high-performance networking protocols, dynamic multicore scheduling, accelerator-based computing for bioinformatics, virtual computing, power-aware computing, and bioinformatics in general. Wu joined Virginia Tech in 2006 after spending seven years at Los Alamos National Laboratory. He is the recipient of three Best Paper Awards in human-computer interaction, high-performance networking, and bioinformatics, respectively, and three R&D 100 Awards in green supercomputing, high-speed networking, and bioinformatics, respectively. He also leads four grassroots projects:
1) The Green 500 list: A ranking of the most energy-efficient supercomputers in the world.
2) mpiBLAST: An open-source, parallel implementation of NCBI BLAST.
3) Supercomputing in small spaces: Low-power and power-aware supercomputing.
4) MyVICE: A platform to simplify delivery of education and create engaging and "kid-friendly" curriculum in support of STEM.

(6) Benchmarking Java as a possibility to merge HPC and Big Data processing

Speaker

Piotr Bala
Professor, Interdisciplinary Centre for Mathematical and Computational Modeling, Warsaw University

Abstract

Benchmarking of HPC systems has a long history, documented by a large number of scientific papers and benchmarks. There are micro-benchmarks that test the physical capabilities of a computational system, such as bandwidth and latency. On the application side, there are widely used benchmarks such as Linpack, High Performance Conjugate Gradients (HPCG), Graph500 and others. Most of them, however, are based on a single numerical algorithm selected from widely used numerical applications.
Recently we have observed strong interest in running Big Data processing and Artificial Intelligence applications on HPC systems. Because the two areas use different tools and algorithms of a different nature, achieving good performance is difficult. Big Data and AI applications are implemented in Java, Scala, Python and other languages that are not widely used in HPC. Vendors are putting a lot of effort into rewriting the most time-consuming parts in C/MPI, but this is a difficult and time-consuming task, and success has been limited.
To solve this problem, we have focused on the usability of Java for HPC applications, which opens the way to easy integration with Big Data and AI applications. We have developed PCJ (Parallel Computing in Java), a novel tool for scalable high-performance computing and big data processing in Java. PCJ is a Java library implementing the PGAS (Partitioned Global Address Space) programming paradigm. It allows for the easy and feasible development of computational applications as well as Big Data and AI processing. The use of Java brings HPC, Big Data and AI processing together and enables running on different types of hardware.
PCJ applications can run on PCs, x86 clusters and supercomputers, including Cray XC40 systems. PCJ has been tested on Intel KNL processors as well as Power8/9 systems. Applications implemented with PCJ and Java scale up to hundreds of thousands of cores. A good example is a 2D stencil code running on 196k cores of the Cray XC40 at HLRS.
We will present the performance and scalability of the PCJ library measured on Cray XC40 systems with standard micro-benchmarks such as ping-pong, broadcast, and random access. We describe the parallelization of example applications with different characteristics, including FFT and 2D stencil. Results for standard Big Data benchmarks such as word count are presented. In all cases, measured performance and scalability confirm that PCJ is a good tool for developing parallel applications of different types.

About the Speaker

Piotr Bala graduated and received a Ph.D. degree in physics, in 1988 and 1993, respectively, from N. Copernicus University (Torun, Poland). He works as a Professor with the Interdisciplinary Centre for Mathematical and Computational Modeling, Warsaw University. Since 2000, he has been the leader of the team at ICM, Warsaw University, which develops grid tools for molecular biology. His group has created the PCJ library, which allows for effective parallelization of calculations in Java. He is a co-founder of the UNICORE software, which allows distributed computations in molecular biology and quantum chemistry. His main research focus is on developing new methods for biomolecular simulations as well as on parallel and distributed computing. He has published over 120 scientific papers.

(7) Benchmarking and Accelerating Big Data and Deep Learning Systems on Modern HPC and Cloud Architectures

Speaker

Xiaoyi Lu
Research Assistant Professor in the Department of Computer Science and Engineering at the Ohio State University

Abstract

With the convergence among HPC, Big Data, Deep Learning, and Cloud Computing technologies, Big Data and Deep Learning systems and applications have started taking advantage of advanced technologies, such as multi-/many-core based CPUs/GPUs, RDMA-enabled interconnects, PCIe-/NVMe-SSDs, etc., which are widely adopted on modern HPC and Cloud architectures. However, fully exploiting the benefits of these advanced features to efficiently process data-intensive workloads remains challenging. This talk will first provide an overview of challenges in accelerating Big Data and Deep Learning systems on modern HPC and Cloud environments. Then, an in-depth overview of advanced designs based on RDMA and heterogeneous storage architecture for Hadoop, Spark, Memcached, and TensorFlow will be presented. Benefits of these designs on various cluster configurations will be shown. The talk will also address the need for designing benchmarks using a multi-layered, isolated, and systematic approach, which can be used to guide the performance acceleration of these frameworks.

About the Speaker

Dr. Xiaoyi Lu is a Research Assistant Professor in the Department of Computer Science and Engineering at the Ohio State University, USA. His current research interests include high performance interconnects and protocols, Big Data Analytics, Parallel Computing Models, Virtualization, Cloud Computing, and Deep Learning system software. He has published more than 100 papers in major international conferences, workshops, and journals with multiple Best (Student) Paper Awards or Nominations. He has delivered more than 100 invited talks, tutorials, and presentations worldwide. He has been actively involved in various professional activities in academic journals and conferences. Dr. Lu currently leads the research and development of RDMA-based accelerations for Apache Hadoop, Spark, HBase, Memcached, and TensorFlow, and of the OSU HiBD micro-benchmarks, which are publicly available from http://hibd.cse.ohio-state.edu. These libraries are currently being used by more than 300 organizations from 35 countries. More than 28,650 downloads of these libraries have taken place from the project site. He is also leading other projects, such as MVAPICH2-Virt (high-performance and scalable MPI for HPC cloud), DataMPI (extended MPI for data-intensive applications), etc. He is a member of IEEE and ACM. More details about Dr. Lu are available at http://www.cse.ohio-state.edu/~luxi.

(8) Intel MPI Benchmarks

Speaker

Zhuowei Si
Intel Technical Consulting Engineer

Abstract

Fully measure the performance of your cluster systems with Intel MPI Benchmarks.

About the Speaker

Zhuowei Si is an Intel Technical Consulting Engineer who has supported cluster tools.

9.Challenges and Opportunities Facing Supercomputing Centers around the World

Time: 13:30-17:20, 16/01/2019

Location: Grand Ballroom (宴会厅)

Time Title Speaker Chair
13:30-13:35 Opening Hui Yan
NSCC-GZ
13:35-14:00 The Next 30 Years for NCSA William Gropp
NCSA
14:00-14:25 Challenges and Opportunities AI Brings to Supercomputing Kenli Li
NSCC-CS
14:25-14:50 TBA Michael Resch
Stuttgart HPC Center
14:50-15:20 Coffee Break
15:20-15:45 Super-AI on Supercomputers: Cases on Earth Science Haohuan Fu
NSCC-WX
15:45-16:10 Merging Data, HPC, and the Endless Frontier of Science Niall Gaffney
TACC
16:10-16:35 Starlight: the Next Generation Service Platform for Supercomputers Yunfei Du
NSCC-GZ
16:35-17:15 Panel
17:15-17:20 Closing

9.Challenges and Opportunities Facing Supercomputing Centers around the World

16/01/2019, 13:00-17:30, Xiangyun Room

(1) The Next 30 Years for NCSA

Speaker

William Gropp
Director and Chief Scientist, National Center for Supercomputing Applications
Thomas M. Siebel Chair, Department of Computer Science, University of Illinois at Urbana-Champaign

Abstract

The nature of computing has changed dramatically in the last decade. Cloud computing has provided an alternative way to access computing and has created a dynamic software ecosystem. The availability of data and new methods to work with data are transforming the world. And the end of Dennard scaling has led to a new era of innovation in computer architectures, along with the challenges of programming those new architectures. NCSA has a more than 30-year history of being part of the computing revolution; this talk will discuss some of the ways that NCSA is looking forward to the next 30 years.

About the Speaker

William Gropp is Director and Chief Scientist of the National Center for Supercomputing Applications and holds the Thomas M. Siebel Chair in the Department of Computer Science at the University of Illinois at Urbana-Champaign. He received his Ph.D. in Computer Science from Stanford University in 1982 and worked at Yale University and Argonne National Laboratory. His research interests are in parallel computing, software for scientific computing, and numerical methods for partial differential equations. He is a Fellow of ACM, IEEE, and SIAM and a member of the National Academy of Engineering.

(2) Challenges and Opportunities AI Brings to Supercomputing

Speaker

Kenli Li

Abstract

TBA

About the Speaker

TBA

(3) TBA

Speaker

Michael Resch
Director, High Performance Computing Center Stuttgart
Director, Institute for High Performance Computing
Full Professor, University of Stuttgart

Abstract

TBA

About the Speaker

Prof. Resch's research currently focuses on the application of supercomputers in engineering and industrial research as well as on the scientific theory of simulation. He is leading projects in the fields of High Performance Computing, Cloud Computing, Visualization, Scalable Parallel Algorithms and Programming, and Philosophy of Simulation. At the center of his research is the applicability of mathematical methods and computer science to real-world problems.
Prof. Resch has a track record of more than 30 years in high performance computing. In 2007 his activities in supercomputing were honored with an invitation to give a plenary keynote at SC’07 in Reno, USA. He won the HPC Challenge in 2003 at SC’03 in Phoenix, USA, and led the group that won the US NSF award for real distributed supercomputing in 1999.
Prof. Resch received an honorary professorship (Hon. Prof.) from the Russian Academy of Sciences in 2014, an honorary doctoral degree (Dr. h.c.) from the Donetsk National Technical University in 2009, and an honorary doctoral degree (Dr. h.c.) from the Siberian Branch of the Russian Academy of Sciences in 2011.
Prof. Resch is a Principal Investigator (PI) in the national cluster of excellence for “Data-Integrated Simulation Science (SimTech)” funded by the German DFG as part of the German Initiative for Excellence in Research.
Prof. Resch holds a Dipl.-Ing. degree (MSc) in Technical Mathematics from the University of Graz, Austria, and a PhD (summa cum laude / with honors) in Engineering from the University of Stuttgart, Germany. In 2002 he held the position of Assistant Professor at the University of Houston, USA.

(4) Super-AI on Supercomputers: Cases on Earth Science

Speaker

Haohuan Fu
NSCC-WX

Abstract

TBA

About the Speaker

TBA

(5) Merging Data, HPC, and the Endless Frontier of Science

Speaker

Niall Gaffney
Texas Advanced Computing Center

Abstract

The rapid evolution in computing is perhaps best reflected by the advances over the past decade in areas of research in engineering and science, sometimes referred to as the Endless Frontier. The diversity of technologies has produced great discoveries in fields ranging across all aspects of research today, using techniques developed for HPC and Machine Learning. Hardware has evolved nearly as quickly, leading to an even more complex set of challenges facing researchers. This talk will review these topics as supported at the Texas Advanced Computing Center (TACC) and the changes we have seen in supporting research. I will also discuss our near-future system, Frontera, and the new ways we will continue to advance computing in support of research and discovery in the coming decade.

About the Speaker

TBA

(6) Starlight: the Next Generation Service Platform for Supercomputers

Speaker

Yunfei Du
National Supercomputer Center in Guangzhou

Abstract

TBA

About the Speaker

TBA