EPCC News
The newsletter of EPCC, the supercomputing centre at the University of Edinburgh. Issue 74, Autumn 2013.

In this issue: Managing research data • HPC for business • Simulating soft materials • Energy efficiency in HPC • Intel Xeon Phi • ARCHER: a new UK service for academic research. ARCHER is a 1.56 Petaflop Cray XC30 supercomputer that will provide the next UK national HPC service for academic research. Also in this issue: Dinosaurs!

From the Directors
Autumn marks the start of the new ARCHER service: a 1.56 Petaflop Cray XC30 supercomputer that will provide the next UK national HPC service for academic research. We have been involved in running national HPC services for over 20 years and ARCHER will continue our tradition of supporting science in the UK and in Europe. Our engagement with supercomputing takes many forms - from racing dinosaurs at science festivals, to helping researchers get more from HPC by improving algorithms and creating new software tools. EPCC staff design and deliver high-quality training, we equip scientists with the skills they need to produce ground-breaking computer simulations, and we undertake research into the software, tools and technologies that will make possible a new generation of exascale supercomputers which will be many, many times more powerful than ARCHER. Big Data activities are also playing an increasingly important part in our academic and industry work. We have always prided ourselves on the diversity of our activities. This issue of EPCC News showcases just a fraction of them. You can find out more about our work on our website and blog: www.epcc.ed.ac.uk/blog
Alison Kennedy & Mark Parsons, EPCC Executive Directors
a.kennedy@epcc.ed.ac.uk m.parsons@epcc.ed.ac.uk

Contents
• PGAS programming: 7th International Conference
• Profile: Meet the people at EPCC
• New national HPC service: Introducing ARCHER
• Big Data: Data preservation and infrastructure
• HPC for industry: Making business better
• Simulation: Better synthesised sounds; Improving soft matter design
• Support for science: Advancing molecular dynamics
• Future of HPC: A roadmap to the next generation
• Numerical simulations: NAIS's new GPGPUs
• Energy efficiency in HPC: More efficient parallel and cloud computing
• Legacy code: Parallelising commercial codes
• Exascale: European research projects
• Intel Xeon Phi: Our first impressions
• Training and education: MSc in HPC; research software; DiRAC; Summer of HPC
• Outreach: Spreading the word about supercomputers
• MSc in HPC: Study with us

EPCC joins the OpenACC consortium
EPCC has joined the OpenACC consortium. OpenACC is a directives-based parallel programming model, in the vein of OpenMP, designed to enable C, C++ and Fortran codes to effectively utilise accelerator technology such as GPGPUs. The consortium works on the OpenACC standard, OpenACC tools, and OpenACC benchmarks and example applications. It brings together hardware vendors (Cray and NVIDIA); software providers (CAPS, Allinea, PGI and Rogue Wave); and research establishments (including Georgia Tech, Oak Ridge and Sandia National Labs, and the Tokyo Institute of Technology). EPCC has a strong engagement with OpenACC, including OpenACC compiler developers amongst our staff, and we have created a set of OpenACC benchmarks to enable users to evaluate OpenACC implementations on a range of different hardware. We are also porting a number of applications to GPGPUs using OpenACC, including CASTEP (a DFT-based materials modelling code) and COSA (a frequency-domain CFD code).
Adrian Jackson a.jackson@epcc.ed.ac.uk
www.openacc-standard.org
www.castep.org
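To give a flavour of the directive-based approach, the sketch below offloads a simple loop with an OpenACC directive. It is a generic, minimal illustration (array sizes and values are invented), not code from the CASTEP or COSA ports or from our benchmark suite.

```c
#include <stdio.h>

#define N 1000000

/* Minimal OpenACC sketch: offload a vector update to an accelerator.
 * Illustrative only; values and sizes are made up. */
int main(void)
{
    static double x[N], y[N];

    for (int i = 0; i < N; i++) {      /* initialise on the host */
        x[i] = 1.0;
        y[i] = 2.0;
    }

    /* The compiler generates device code for this loop and manages the
     * data transfers implied by the copy/copyin clauses. */
    #pragma acc parallel loop copy(y[0:N]) copyin(x[0:N])
    for (int i = 0; i < N; i++) {
        y[i] += 3.0 * x[i];
    }

    printf("y[0] = %f\n", y[0]);
    return 0;
}
```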
Contact us
www.epcc.ed.ac.uk
info@epcc.ed.ac.uk
+44 (0)131 650 5030
EPCC is a supercomputing centre based at The University of Edinburgh, which is a charitable body registered in Scotland with registration number SC005336.

PGAS2013 in Edinburgh
The 7th International Conference on PGAS Programming Models visited Edinburgh on the 3rd and 4th of October, making its first ever appearance outside the United States! The PGAS conference is the premier forum to present and discuss ideas and research developments in the area of PGAS models, languages, compilers, runtimes, applications and tools, PGAS architectures and hardware features. The keynote talks were given by two highly regarded experts in the field: Dr Duncan Roweth, a senior principal engineer at Cray, who focussed his talk on hardware support for PGAS-type languages in current (and future) HPC systems; and Professor Mitsuhisa Sato from the University of Tsukuba in Japan, who took the opportunity to discuss how PGAS may play a role in the race to the exascale. The conference, which attracted over 60 attendees from across the globe, had a varied programme of research papers, "hot" sessions where speakers introduced work in progress, as well as a poster reception. More information, including links to the papers and proceedings, can be found on the conference website: www.pgas2013.org.uk
Michele Weiland m.weiland@epcc.ed.ac.uk

Staff profile
Applications Consultant Eilidh Troup talks about her work here at EPCC.
I work as an Applications Consultant on a project called SPRINT (Simple Parallel R INTerface), which allows users of the statistical language R to make use of multi-core and HPC machines without needing any parallel programming skills. We provide parallelised versions of standard R functions which can just be slotted in to replace the usual R function and will then run on many processors behind the scenes. I particularly enjoy working on SPRINT as it is mostly used by biologists. I studied genetics before I became a programmer, and love this opportunity to keep up to date with the latest biology technology. Next Generation Sequencing can rapidly produce terabytes of data that must then be analysed. This amount of data needs a lot of computational power to process, and EPCC is well placed to work on this. Next Generation Sequencing can be used for measuring gene expression to diagnose and understand diseases, or to sequence genomes, for example to find out which microorganisms are present in a habitat. I am also involved in EPCC's public outreach events and love the enthusiasm of children pretending to be part of a computer and working together to sort coloured balls or numbers. People are very interested in the science we support at EPCC, and the real hardware that makes a supercomputer is always popular too.
Eilidh Troup e.troup@epcc.ed.ac.uk

ARCHER: On target for a bullseye
Autumn ushers in a new era for UK supercomputing with the start of the ARCHER (Advanced Research Computing High End Resource) service in Edinburgh.
ARCHER is the next national HPC service for academic research and comprises a number of components: accommodation provided by the University of Edinburgh; hardware from Cray; systems support from EPCC and Daresbury; and user support from EPCC. In autumn 2011, the Minister for Science announced a new capital investment in e-infrastructure, which included £43m for ARCHER, the next national HPC facility for academic research. After a brief overlap, ARCHER will take over from HECToR as the UK's primary academic research supercomputer. HECToR has been in Edinburgh since 2007.

What is ARCHER?
The new Cray XC30 architecture is the latest development in Cray's long history of MPP architectures, which have been supporting fundamental global scientific research for over two decades. The Cray XC30 incorporates two major upgrades to the fundamental components of any MPP supercomputer: the introduction of Cray's newest network interconnect, Aries, and the use of Intel's Xeon series of multi-core processors. Each has enhanced capabilities over previous architectures. Aries incorporates the novel dragonfly network topology, which provides multi-tier all-to-all connectivity. This new network gives all applications, even those that perform all-to-all style communications, the potential to scale to the full size of the system, allowing users to tackle problems that might have been considered impossible on previous systems.

The latest Intel Xeon Ivy Bridge processors used in ARCHER provide the next generation of computational muscle, with best-in-class floating-point performance, memory bandwidth and energy efficiency. Each ARCHER node comprises two 12-core 2.7 GHz Ivy Bridge multi-core processors and at least 64 GB of DDR3 1833 MHz main memory, and all compute nodes are interconnected via an Aries Network Interface Card. ARCHER has 3008 such nodes, ie 72,192 cores, in only 16 cabinets, providing a total of 1.56 Petaflops of theoretical peak performance. Scratch storage is provided by 20 Cray Sonexion Scalable Storage Units, giving 4.4 PB of accessible space with a sustained read-write bandwidth of over 100 GB per second. ARCHER is also directly connected to the UK Research Data Facility, easing the transition of large data sets between high-performance scratch space and long-term archival storage, and between successive HPC services.

Updates included in the newest versions of the Cray Compilation Environment provide full support for generating highly optimised executables that fully exploit the Ivy Bridge processors. Users will also have access to the latest Intel Composer Suite of compilers, and the industry-standard GNU Compiler Collection, all of which are fully integrated with the feature-rich Cray Programming Environment suite that is familiar to existing HECToR users.
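Returning to the Aries interconnect mentioned above: all-to-all exchanges are exactly the communication pattern that the dragonfly topology is designed to support at scale. The sketch below is a minimal, generic MPI illustration of such a pattern, not an ARCHER-specific code.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative all-to-all exchange: every rank sends one integer to every
 * other rank. Patterns like this stress the interconnect and benefit from
 * the multi-tier all-to-all connectivity described in the text. */
int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int *sendbuf = malloc(size * sizeof(int));
    int *recvbuf = malloc(size * sizeof(int));
    for (int i = 0; i < size; i++)
        sendbuf[i] = rank * 100 + i;   /* something unique per destination */

    /* Each rank sends one int to, and receives one int from, every rank. */
    MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Rank 0 received %d values, one from each rank\n", size);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}
```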
At your service
The Service Provision function for ARCHER is provided by UoE HPCX Ltd. This includes responsibilities such as systems administration, helpdesk and website provision, and administrative support. The work is subcontracted to EPCC at the University of Edinburgh and the STFC's Daresbury Laboratory. Service Provision will be delivered by two sub-teams: the Operations and Systems Group led by Mr Michael Brown, and the User Support and Liaison Team led by Dr Alan Simpson. Enabling a smooth transition for users from the HECToR to ARCHER services is one of our key aims. For ARCHER, we will utilise SAFE (Service Administration from EPCC) for both the ARCHER Helpdesk and the Project Administration & Reporting functions. The ARCHER website provided by EPCC contains supporting documentation for the service and will also showcase the research that uses the system. The configuration of the ARCHER service will evolve over time to stay in line with users' needs. Continual Service Improvement will be a key goal, and as such the service will be delivered following the ITIL Best Practice Framework.

ACF: building for the future
ARCHER's Accommodation and Management function is provided by the University of Edinburgh. ARCHER is housed at the University's Advanced Computing Facility (ACF). The University has a long-term commitment to ensure the ACF is capable of hosting top-end facilities and delivering excellent levels of energy efficiency. In readiness for ARCHER, the ACF was extended with the addition of 500m² of computer-room floor space and an additional 760m² plant room to contain the extra electrical and mechanical infrastructure. This included a new high-efficiency 4MW-capacity cooling system and an upgrade to the site's private high-voltage network that increases its capacity to around 7MW. The project to extend the facility commenced in May 2012, with the building ready for the installation of plant in September (despite the wettest summer on record). The HV switch-room was commissioned in November 2012 and the full capability of the plant was commissioned by year end. The facility was fit for purpose two months ahead of schedule and was delivered in excess of specification while under budget. The Cray XC30 and associated storage systems were delivered in September 2013. The installation went very smoothly, with all power and cooling connections made and the system powered up within a few days.

Ready for business
The acceptance tests of the ARCHER hardware were successfully completed in late October. Usage of ARCHER will ramp up in mid-November, with core research consortia online on November 13th. Remaining grant holders will begin to transfer from HECToR to ARCHER in December 2013. The HECToR service will cease in March 2014.

Computational Science & Engineering Support
Computational Science and Engineering (CSE) support on ARCHER is provided by EPCC and includes responsibility for helping users with porting, optimising and developing their codes for ARCHER; ensuring that the correct scientific software is installed on the system to meet user requirements; providing advice and best practice to enable users to exploit ARCHER resources; and training and developing scientific software development expertise in the UK research community. Our goal for the CSE support is to be as open and inclusive as possible, allowing ARCHER users to draw on the full wealth of expertise available in the UK HPC and computational science community. We will use a mix of established, successful activities and innovative ideas to realise this goal.

Get in touch!
We welcome ideas and opportunities for enhancing the CSE support. If you want to be involved or have any thoughts, please do not hesitate to get in touch via the ARCHER Helpdesk.

Access
The first EPSRC call for access to ARCHER has already opened, and details can be found on both the EPSRC and ARCHER websites. The first call for eCSE projects has also opened; details can be found on the ARCHER website. EPSRC is the managing agent for the HPC facility on behalf of all of the Research Councils.

Images: the Aries interconnect; ARCHER's water cooling system; the new high-voltage switch room.
Embedded CSE programme
The Embedded CSE (eCSE) programme expands and refines the successful HECToR dCSE programme to allow software development expertise to be placed in academic research groups where it can provide the most benefit and have the greatest impact. The first eCSE call has already opened (deadline: 14 January, 4pm); details can be found on the ARCHER website. The in-depth CSE support will be fully integrated into the SAFE Helpdesk, providing a seamless service to users that gives direct access to ARCHER expertise and a rapid response to any queries.

Technical Forum
The ARCHER Technical Forum is open to all users (and external people who are interested in technical discussion around HPC). The Forum consists of a series of monthly meetings conducted using webinar technology, with a wide range of technical experts invited to speak and attend, and a public mailing list for technical discussion.

Consortium Contacts
We have established a set of Consortium Contacts: HPC experts who will provide a direct link between the research communities using ARCHER and the service itself. These Contacts will allow the research communities to use ARCHER more effectively, have a role in driving the development of the service to meet their needs, and have a simple way to provide feedback to the CSE support team and the service in general.
Tom Edwards, Cray tedwards@cray.com
Mike Brown, EPCC m.w.brown@ed.ac.uk
Liz Sim, EPCC e.sim@epcc.ed.ac.uk
Alan Simpson, EPCC a.simpson@epcc.ed.ac.uk

Training
Training will be provided all over the UK through links with the HPC-SIG, HPC Wales, and the STFC's Daresbury Laboratory. We are consulting people around the UK about the training requirements of different research communities. The lectures from the first course have been recorded and will be publicly available on the ARCHER website in the near future.

ARCHER website: www.archer.ac.uk
Support: for any questions about the ARCHER service, please contact support@archer.ac.uk
Cray: www.cray.com

Managing research data
Publishing your research used to be straightforward. You'd note your hypothesis, describe your methods, pop in a table of measurement results and a few graphs, then conclude with your analysis, and bingo: another paper published, with everything necessary included for others to verify and build upon your work. But this model broke during the last couple of decades, and it is becoming more broken as research becomes increasingly – exponentially – digital. So many modern data sets simply do not fit in the paper. Long gone is the single table of measurements; analyses in data-driven science are based on datasets of gigabytes, terabytes and – soon – petabytes. This raises the question: should researchers publish these datasets, and if so, how? For science to remain verifiable and reusable, the answer to the first part must be yes. The answer to the second part has given rise to increasing efforts to create better ways to manage research data. (For a review of the arguments, see the Royal Society report of June 2012, Science as an Open Enterprise.) One of the biggest challenges in effective research data management is dealing with this separation of data from its context. If data is to be stored away from the pages of the publication, we must ensure it is findable, persistent and sufficiently well described – the intelligent openness of Science as an Open Enterprise.
And, for credit to be given where it's due, research data need to be citable.

EPCC's research data archive
These criteria are feeding in to the design of a long-term research data archive here at EPCC. We are delighted to host and manage, on behalf of the Research Councils, the UK's Research Data Facility (RDF), a 26-petabyte combined disk and tape storage system. HPC systems users typically want a big, fast file system, and the current GPFS deployment on the RDF gives just that. But, as projects come to an end, for all the reasons noted above, a big file system is no longer the right environment in which to archive data for the longer term. So, over the next six months we'll put a long-term data archive service in place. The technology we'll use to deliver this is currently open, but we hope to leverage results from projects like EUDAT and PERICLES (see opposite). With luck we'll be able to report a running service in the next issue of EPCC News!
Rob Baxter r.baxter@epcc.ed.ac.uk

Design aspects of EPCC's research data archive
• Allow remote access clients to add and retrieve data via network services, including, but not limited to, web browsers
• Allow flexible authentication, without requiring local Unix login credentials
• Use flexible authorisation controls defined by external data sources
• Make providing metadata straightforward for depositors
• Provide persistent identifiers for data deposits, based on the Handle system (the system behind DOIs, already in use at EPCC through EUDAT and more broadly within the University).

PERICLES: archiving digital data
Preserving art, records and other items has been a challenge throughout history: not just how to store them, but how to help future generations to understand them. Even in the short time digital art and records have been around, this problem has become increasingly apparent, and it is exacerbated by technology's rapid cycles. The PERICLES project is attempting to define and develop a framework for managing how digital data is stored and kept relevant and accessible. A small challenge it is not. Although the project has two case studies to look at (Art & Media and Space Science Data), PERICLES does not intend to create solutions just for these areas, or for this moment in time. PERICLES is to consider how to build a framework that will last and adapt through different types of change, including policy and technological changes. At the latest meeting in Thessaloniki, the project scoped out detailed scenarios about how a long-term preservation system may be used and what will be required. Within the two case studies, groups of project members went into depth about what happens to new material up to the point of ingest to an archive, and about the process of ingesting a new object into an archive. This meeting has been highly informative about how to approach the concept of a long-lived digital preservation system.
Alistair Grant a.grant@epcc.ed.ac.uk
Images show Rafael Lozano-Hemmer's "Surface Tension" (1992), which is in the Tate's collection.

EUDAT: towards a collaborative data infrastructure
The European Data project EUDAT is two years old this month. Over this time EUDAT has got to grips with the challenge of integrating five established research infrastructures into the beginnings of a true pan-European collaborative data infrastructure. And it really is getting there.
EUDAT brings together some of Europe's largest HPC centres with five of its discipline-specific "research infrastructures", covering linguistics, climate science, seismology, biodiversity and integrative medicine. Its goal is to fashion common data services – standardised ways of managing data and metadata – to realise economies of scale and to create a basis for the preservation, sharing and recombination of research data. EUDAT makes use of common core technologies and is federating and connecting them across Europe, creating "islands" of safely replicated and discoverable data which will, over time, merge together into a single resource. There are currently seven such islands in operation, from Edinburgh to Bologna, Barcelona to Helsinki, with another four planned for the coming months. A Joint Metadata Catalogue is harvesting metadata records from five disciplines (and counting), and the Simple Store service for smaller-scale data and individual researchers will be rolled out this autumn. With one year to go of the core project, EUDAT has made remarkable steps from such a complex starting point. We've had to take an incremental approach to systems integration, but a core of common data services are now managing data across 20 sites, with more to come. Our final year will be one of stabilisation and consolidation. EUDAT has laid excellent foundations for a truly pan-European haven for today's – and tomorrow's – research data.
Rob Baxter r.baxter@epcc.ed.ac.uk

EUDAT common core technologies
• iRODS, the Integrated Rule-Oriented Data System from RENCI: http://www.renci.org
• Global Handle System (http://www.handle.net), through the European Persistent Identifier Consortium (EPIC): http://pidconsortium.eu
• GridFTP and Globus Online: http://www.globusonline.org
• CERN's Invenio: http://invenio-software.org
• CKAN metadata repository: ckan.org

HPC makes better business
Supercomputing Scotland: business innovation through high performance computing
Integrated Environmental Solutions (IES) is the world's leading provider of software and consultancy services focused on making the built environment more energy-efficient, so reducing overheads and CO2 emissions. We worked with IES to improve the performance of its SunCast software, which is used by architects and designers to analyse sun shadows, solar penetration and the effects of solar gains on the thermal performance and comfort of buildings. SunCast processes each hour of a design day in series. In all there are 448 separate calculation tasks (160 diffuse and 288 direct) to be performed for any given model. EPCC ported SunCast to run over Microsoft MPI, allowing the parallel processing of tasks, one per processor. When a processor has completed a calculation, it uses MPI to notify the controlling processor of its results and that it is ready to be assigned another task.

Reduced analysis times
Now MPI-compliant, SunCast can run on a supercomputer, creating huge time savings. In one extremely complex analysis, the run-time was reduced from an estimated 30 days to 24 hours. Reduced analysis times allow IES to keep ahead of the competition by delivering faster turnaround times for clients, and also allow IES to perform more detailed analysis than previously.
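The task distribution described above is a classic controller/worker (task-farm) pattern in MPI. The sketch below is a generic illustration of that pattern, with a dummy calculation standing in for a SunCast task; it is not IES or EPCC code, and the constants are taken only from the task counts quoted in the text.

```c
#include <mpi.h>
#include <stdio.h>

#define NTASKS   448     /* e.g. 160 diffuse + 288 direct calculations */
#define TAG_WORK 1
#define TAG_STOP 2

/* Dummy stand-in for one calculation task. */
static double do_task(int task_id) { return (double)task_id; }

/* Controller/worker task farm; assumes at least two MPI processes. */
int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                       /* controller: hand out tasks */
        int next = 0, active = 0, stop = -1;
        double result;
        MPI_Status status;

        for (int w = 1; w < size; w++) {   /* initial round of work */
            if (next < NTASKS) {
                MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
                next++; active++;
            } else {
                MPI_Send(&stop, 1, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD);
            }
        }
        while (active > 0) {               /* collect a result, send more */
            MPI_Recv(&result, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            active--;
            if (next < NTASKS) {
                MPI_Send(&next, 1, MPI_INT, status.MPI_SOURCE, TAG_WORK,
                         MPI_COMM_WORLD);
                next++; active++;
            } else {
                MPI_Send(&stop, 1, MPI_INT, status.MPI_SOURCE, TAG_STOP,
                         MPI_COMM_WORLD);
            }
        }
        printf("All %d tasks completed\n", NTASKS);
    } else {                               /* worker: compute until told to stop */
        int task;
        MPI_Status status;
        while (1) {
            MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
            if (status.MPI_TAG == TAG_STOP) break;
            double result = do_task(task);
            MPI_Send(&result, 1, MPI_DOUBLE, 0, TAG_WORK, MPI_COMM_WORLD);
        }
    }

    MPI_Finalize();
    return 0;
}
```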
New business model
MPI-enabled SunCast has opened up a new business model for IES: selling SunCast through the cloud on a pay-per-use basis. This offering is open to IES end-users either as a managed service, where IES provides additional reportage, or as a self-managed service. Both routes will provide additional revenue streams to IES. Using the MPI version within IES's own consultancy has also increased the efficiency – and therefore profitability – of the company's consultancy offering. At the time of writing, it had been used on four live projects with an average analysis time of under 12 hours. These particular projects were very large and complex and would previously have taken several weeks.

FORTISSIMO
IES is also engaged with EPCC on the FORTISSIMO programme: see opposite.
Ronnie Galloway r.galloway@epcc.ed.ac.uk
This work was carried out as part of Supercomputing Scotland, a joint EPCC-Scottish Enterprise programme.
IES: www.iesve.com
Supercomputing Scotland: www.supercomputingscotland.org

EPCC offers faster MATLAB® programs and Simulink® models
EPCC's Accelerator service provides instant, on-demand access to the full suite of MathWorks' software products. MATLAB® and Simulink® users can now gain significant performance advantages by running their computations and models on large-memory, multi-core HPC platforms without the need for costly investment in new computing infrastructure. Our unique service lets you launch computations and simulations on EPCC's facilities directly from your desktop using the MATLAB® Parallel Computing Toolbox (PCT). Parallel up-scaling can be achieved through MATLAB® Distributed Computing Server (DCS), providing access to even greater performance improvements. Full pay-per-use access is provided using on-demand, hourly access to DCS and HPC cycles. DCS product use is billed by MathWorks, whilst use of the HPC infrastructure is billed by EPCC. As an alternative, users can run MATLAB® and Simulink® on EPCC's platforms within the context of their current perpetual or annual PCT and DCS licences.
George Graham g.graham@epcc.ed.ac.uk
Find out more: to set up a secure Accelerator account, or to request a trial, visit www.epcc.ed.ac.uk/facilities/demand-computing

Fortissimo! Digital simulation and modelling for European industry
EPCC has worked with companies – large and small – since it was set up in 1990. Despite our best efforts, we know there are many companies across Europe who could benefit from HPC-enabled modelling and simulation but who don't use it, either through a lack of knowledge or fear of its costs or complexity. In Scotland, our most recent programme of support for smaller enterprises, Supercomputing Scotland, has been running successfully for almost two years. The pan-European Fortissimo project will complement our existing activities in this area. The PlanetHPC project made the case for greater European Commission investment in support for modelling and simulation for Europe's companies. Building on this work, EPCC led the development of the Fortissimo project as part of the European Commission's Framework 7 Factories of the Future initiative.
Project structure
Fortissimo is split into four equal parts:
• A core team of partners, mainly HPC service providers, who will create and manage the Cloud of HPC resources
• An initial tranche of 20 experiments, each involving a company with a modelling and simulation challenge and some experts
• Two further Open Calls for experiments, which will start at Months 12 and 18 of the project.
With total costs of €22 million and funding from the European Commission of €16 million, this is a major project for us. The initial consortium consists of 45 partners and 20 experiments. We expect this to grow to around 90 partners and 50-60 experiments by the end of the project.
Mark Parsons m.parsons@epcc.ed.ac.uk
Fortissimo: www.fortissimo-project.eu
PlanetHPC: www.planethpc.eu
Supercomputing Scotland: www.supercomputingscotland.org

Next Generation sound synthesis
Image: the effects of a timpani drum strike modelled over time (Adrian Mouat & Stefan Bilbao).
When you think about applications for high performance computing and large-scale simulations, you probably don't think of music. But the Next Generation Sound Synthesis project (NESS) may change that. Until now, most digital sound synthesis has either used primitive abstract methods (such as additive synthesis and FM synthesis) or used combinations of pre-recorded samples to generate music. These methods are computationally cheap but they have their limitations: notably, they don't always sound realistic, they can be hard for musicians to control, and they lack flexibility. A newer method, physical modelling synthesis, promises to overcome these limitations - but at the cost of being much more computationally intensive. In the NESS project, researchers from the Acoustics Group at the University of Edinburgh have teamed up with EPCC to further develop physical modelling synthesis, using HPC techniques to overcome the computational barriers. The goal is to generate the highest quality synthetic sounds possible, with GPU (graphics processing unit) acceleration to help keep run times manageable.
The computational difficulty of these problems varies widely, from simple linear 1-dimensional models that can easily run in real time on a single processor, to 3D models of large spaces that are not feasible to run at all on current hardware due to memory constraints. However, the large problems are very well suited to GPU acceleration as they mostly involve performing the same simple operations over and over again on many different data items - exactly what GPUs are good at.
The NESS project started in January 2012 and will run for a total of five years. Several acoustic models, including plates, timpanis and whole rooms, are under development and more are planned. An interface is also being developed to allow visiting composers to make use of the models as easily as possible. The project focuses on six areas, covering a range of musical instrument families:
• Brass instruments
• Nonlinear plate and shell vibration
• Electromechanical instruments
• Modular synthesis environments
• Room acoustics modelling
• Embeddings and spatialisation
NESS is funded by the European Research Council.
James Perry j.perry@epcc.ed.ac.uk
www.ness-music.eu
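To give a flavour of what physical modelling synthesis means computationally, the sketch below advances a simple 1D wave equation (an idealised string) with an explicit finite-difference scheme. It is a generic illustration of the kind of data-parallel update described above, not NESS code; the grid size, duration and parameters are invented.

```c
#include <stdio.h>
#include <math.h>

/* Generic finite-difference update for an idealised 1D string
 * (wave equation u_tt = c^2 u_xx). Illustrative parameters only. */
#define N   200          /* number of grid points */
#define NT  1000         /* number of time steps  */

int main(void)
{
    double u_prev[N] = {0}, u_curr[N] = {0}, u_next[N] = {0};
    const double lambda = 0.9;          /* Courant number c*dt/dx (<= 1 for stability) */
    const double l2 = lambda * lambda;

    /* "Pluck": a smooth initial displacement in the middle of the string. */
    for (int i = 0; i < N; i++) {
        double x = (double)i / (N - 1);
        u_curr[i] = u_prev[i] = exp(-200.0 * (x - 0.5) * (x - 0.5));
    }

    for (int t = 0; t < NT; t++) {
        /* The same simple update is applied at every interior point: the
         * kind of operation that maps well onto a GPU. */
        for (int i = 1; i < N - 1; i++)
            u_next[i] = 2.0 * u_curr[i] - u_prev[i]
                      + l2 * (u_curr[i+1] - 2.0 * u_curr[i] + u_curr[i-1]);

        u_next[0] = u_next[N-1] = 0.0;  /* fixed (clamped) ends */

        for (int i = 0; i < N; i++) {   /* rotate time levels */
            u_prev[i] = u_curr[i];
            u_curr[i] = u_next[i];
        }
    }

    printf("Displacement at the string midpoint after %d steps: %f\n", NT, u_curr[N/2]);
    return 0;
}
```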
Design rules for new soft materials
Soft materials include colloids, pastes, emulsions, foams, surfactant solutions, powders and liquid crystals. Everyday examples are (respectively) paint, toothpaste, mayonnaise, shaving cream, shampoo, talcum powder and the mess that results when soap is left in water. Soft materials are also used in many industrial areas such as drug delivery and electronic displays. Improving existing products and designing new ones are the goals of active research. EPCC works with Prof. Mike Cates at The University of Edinburgh to investigate soft materials as part of a programme funded by the UK Engineering and Physical Sciences Research Council. The intention of this work is to combine theoretical and experimental approaches, alongside computer simulation, to establish scientific design principles that will allow the creation of a new generation of soft materials for use in future technologies.
Prof. Cates and his group perform theoretical, computational and experimental work on many different aspects of soft materials; EPCC supports the computational side of these activities by providing expertise in simulation and high performance computing (HPC). Such simulations complement the work of both theorists and experimentalists, and can help to identify design principles from existing materials. This is important in areas where analytical progress is difficult or impossible, and where practical approaches are technically awkward or prohibitively expensive. Important simulation approaches for soft materials, where relevant structure is typically at the "mesoscale", include atomistic molecular dynamics and coarse-grained methods (which discard atomistic detail in return for larger length and time scales). Irrespective of the exact type of simulation, the computational effort required calls for state-of-the-art HPC. This soft materials research generates close collaboration with other EPCC activities such as the CRESTA project, and makes use of UK Research Council resources, as well as European ones such as PRACE.
Even long-established products are continuously being updated or replaced. Such reformulation can make products healthier, safer, or more environmentally friendly. The process of developing new soft materials, or improving existing ones, usually involves a large element of trial and error. A set of design principles, based on well-understood fundamental science, could speed up that process.
Oliver Henrich o.henrich@epcc.ed.ac.uk
Kevin Stratford k.stratford@epcc.ed.ac.uk
Soft materials form many consumer products: computer simulation aids their design.
The group's work appeared in the Soft Matter journal: Henrich et al., Rheology of cubic blue phases, Soft Matter 9, 10243-10256, 2013. www.rsc.org/softmatter

Software Grand Challenges in the Chemical Sciences
In 2011 a call for proposals for transatlantic research collaborations to address "Software Grand Challenges in the Chemical Sciences" was published by EPSRC and the US National Science Foundation. EPCC has risen to the occasion with roles in three of the four consortia currently funded to research and develop software solutions to these challenges. Here are two of them.

APES: Advanced Potential Energy Surfaces
The APES (Advanced Potential Energy Surfaces) project aims to incorporate novel potential energy surface models into a range of computational chemistry packages including Amber, DL_POLY, ONETEP, and Q-Chem.
The choice of a suitable potential energy function is critical to performing meaningful molecular modelling and molecular mechanics simulations. Most established force field models in computational chemistry software use non-polarisable fixed-point-charge approximations, as these are computationally cheap and give reasonable results for equilibrium properties and for homogeneous systems. However, these models fall short when describing many-body effects, dynamical properties, and heterogeneous and out-of-equilibrium systems. This is a major limiting factor for the successful application of computer simulations to a variety of Grand Challenge problems in computational chemistry, biochemistry and materials science. APES will develop and promote the use of polarisable force fields based on AMOEBA (Atomic Multipole Optimized Energetics for Biomolecular Applications), a prominent empirical polarisable force field model that allows atom-centred charges to vary depending on their environment. AMOEBA includes polarisable atomic multipoles derived directly from ab initio quantum mechanical electron densities, and offers clear and systematic improvements in accuracy that make it a prime candidate for use in future grand challenge applications.
EPCC, together with the Software Sustainability Institute, will:
• Provide a distributed-memory parallelisation of TINKER, the reference implementation of AMOEBA, to take advantage of large-scale compute resources
• Test and validate the algorithms used in TINKER and promote interoperability with other packages to promote uptake of advanced polarisable force fields.
APES's outputs will give researchers better tools for understanding the structure and function of molecules. By using open development processes, a community will be built around packages that implement AMOEBA. This should make the future development and adoption of this force field self-sustaining.
Arno Proeme a.proeme@epcc.ed.ac.uk
Mario Antonioletti m.antonioletti@epcc.ed.ac.uk

About APES
The APES project started in April 2013 and will run for three years.

Image: a free energy landscape of the Alanine-12 molecule mapped out in two diffusion coordinates determined without a priori knowledge of the system. Key stable and transition structures are labelled as shown. From "Discovering Mountain Passes via Torchlight: Methods for the Definition of Reaction Coordinates and Pathways in Complex Macromolecular Reactions" by Mary A. Rohrdanz, Wenwei Zheng, and Cecilia Clementi, Annual Review of Physical Chemistry 64 (2013): 295-316.
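To make the distinction above concrete: a fixed-point-charge model assigns each atom a constant partial charge and sums pairwise Coulomb terms, whereas a polarisable model such as AMOEBA lets the multipoles respond to their environment, which is what makes it more expensive but more accurate. The sketch below evaluates only the simple fixed-charge case; all charges, positions and units are invented for illustration and it is not code from TINKER or any APES package.

```c
#include <stdio.h>
#include <math.h>

#define NATOMS 3

/* Fixed-point-charge electrostatics: each atom carries a constant partial
 * charge q[i] and the energy is a pairwise Coulomb sum (Coulomb constant
 * set to 1; values invented). In a polarisable model such as AMOEBA the
 * charges/multipoles would instead depend on each atom's environment. */
int main(void)
{
    double q[NATOMS]      = { -0.8, 0.4, 0.4 };         /* partial charges */
    double pos[NATOMS][3] = { { 0.00, 0.00, 0.0 },      /* positions       */
                              { 0.96, 0.00, 0.0 },
                              { -0.24, 0.93, 0.0 } };
    double energy = 0.0;

    for (int i = 0; i < NATOMS; i++) {
        for (int j = i + 1; j < NATOMS; j++) {
            double dx = pos[i][0] - pos[j][0];
            double dy = pos[i][1] - pos[j][1];
            double dz = pos[i][2] - pos[j][2];
            double r  = sqrt(dx*dx + dy*dy + dz*dz);
            energy += q[i] * q[j] / r;                  /* Coulomb pair term */
        }
    }

    printf("Fixed-charge electrostatic energy: %f (arbitrary units)\n", energy);
    return 0;
}
```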
ExTASY: Extensible Toolkit for Advanced Sampling and analYsis
The ExTASY project tackles the problem of understanding the behaviour and function of complex macromolecules such as proteins, DNA, and other biomolecules through sampling with Molecular Dynamics. The key problem is that to preserve accuracy, MD must use a time-step of only a few femtoseconds, whereas many events of biological importance occur on the order of seconds to hours. Even with state-of-the-art simulation software, high-performance computing and purpose-built hardware, only milliseconds of MD are feasible today.
The ExTASY project proposes a three-pronged attack on the problem:
• Support for high-performance, high-throughput execution of ensembles of MD calculations - managing thousands of coupled parallel jobs, and orchestrating 'big data' movement in a heterogeneous environment.
• Developing novel analysis tools to allow on-the-fly control of the simulations to rigorously bias sampling towards the rare events.
• Providing a flexible and portable interface to couple existing MD programs with new algorithms for ultra-large time-step integration.
If we can achieve these three objectives together in a single framework or toolkit - ExTASY - then we will truly make a step change in our ability to compute and understand the dynamics of these complex macromolecular systems. The ExTASY project consortium is led by Prof. Cecilia Clementi of Rice University and totals seven partner institutions, including EPCC.
Iain Bethune i.bethune@epcc.ed.ac.uk

Locally-scaled diffusion maps
Mapping out the transformations of biomolecules from one folded state to another is of key importance in understanding their function. The LSDMap program, developed in collaboration with EPCC, allows these mappings to be generated up to 100x faster than conventional methods. LSDMap analysis is one of the tools which will become part of ExTASY.
http://sourceforge.net/projects/lsdmap
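The femtosecond time-step constraint mentioned in the ExTASY description comes from the explicit integrators used in MD: each step advances positions and velocities by a tiny increment, so reaching biologically relevant timescales needs an enormous number of steps. The sketch below shows a generic velocity-Verlet step for a single particle in a harmonic potential, with invented parameters; it illustrates the integration scheme only and is not ExTASY code.

```c
#include <stdio.h>

/* Generic velocity-Verlet integration of one particle in a 1D harmonic
 * potential (force F = -k*x). Parameters are invented for illustration;
 * in real biomolecular MD the time-step is of the order of femtoseconds,
 * which is why seconds of simulated time would need ~10^15 steps. */
int main(void)
{
    const double k = 1.0, m = 1.0, dt = 0.01;
    const int nsteps = 1000;
    double x = 1.0, v = 0.0;
    double f = -k * x;

    for (int step = 0; step < nsteps; step++) {
        x += v * dt + 0.5 * (f / m) * dt * dt;   /* update position       */
        double f_new = -k * x;                   /* force at new position */
        v += 0.5 * (f + f_new) / m * dt;         /* update velocity       */
        f = f_new;
    }

    printf("After %d steps: x = %f, v = %f\n", nsteps, x, v);
    return 0;
}
```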
Next Generation Computing
What will next generation computing be like, and how should research, development and innovation be directed to achieve our vision of the future? This is the question that EPCC and our partners eutema, Optimat Ltd and 451 Research are attempting to answer under a contract with the European Commission to advise on its Horizon 2020 work programme. Under the contract, we will make recommendations for the future direction of research in the form of a roadmap. It has become increasingly difficult to make predictions about computing because many challenges that we see today (eg the need for energy-efficient computing, more heterogeneous computing, and dealing with big data) will almost certainly cause disruptive changes. There are also non-technical issues to consider. The massive take-up of social media, for example, is something few could have foreseen fifteen years ago. Will there be corresponding phenomena in the next decade that will define the markets for next generation computing? We are trying to predict the future by looking at the past and by consulting today's experts. We have interviewed leading industry and academic figures, held an online consultation, and run a workshop to analyse possible future scenarios. Together with examining existing technology and market trends that we know are happening today, we are beginning to see a picture of what next generation computing may look like. The project will publish its findings in early 2014 and will hold a further workshop to present the results to researchers.
Mark Sawyer m.sawyer@epcc.ed.ac.uk

GPGPU hardware for Numerical Simulations
The NAIS project (Numerical Algorithms and Intelligent Software), of which EPCC is a member, has recently purchased 8 NVIDIA K20 GPGPUs and associated compute nodes for use by NAIS members and researchers. The GPGPUs complement similar hardware that EPCC hosts for NAIS (NVIDIA Tesla GPGPUs) and will allow NAIS researchers to explore issues of performance and programmability associated with using such computing resources for scientific simulation. Four GPGPUs are each housed in a 2U server chassis, along with two 8-core Intel Xeon processors and 128GB of memory. This provides a double-precision peak performance per node of over 5.3 TFlop/s, with 16 processor cores and 7488 GPU cores available in each node. We are currently installing these GPGPUs in two compute nodes that will be attached to EPCC's existing hydra cluster, enabling access to the compilers and libraries installed on that system and facilitating scheduling of access to these GPGPUs through the common batch system used on hydra. Whilst these GPGPU resources have been purchased for NAIS, they are a resource that can be used by the computational simulation, mathematics, and HPC communities in general. If you are interested in accessing these systems, please contact us.
Adrian Jackson, EPCC a.jackson@epcc.ed.ac.uk

About NAIS
NAIS is a collaboration between the universities of Edinburgh, Strathclyde and Heriot-Watt. It is funded by EPSRC and the Scottish Funding Council (SFC). These GPGPU resources are funded directly by a grant from SFC to provide high performance computing resources for researchers.
www.nais.org.uk

Adept kicks off!
The Adept project is motivated by the desire to understand the energy consumption of parallel codes on various hardware platforms, from HPC to Embedded systems, and how programmers can optimise their codes for power efficiency as well as runtime, memory usage and I/O. Led by EPCC, Adept will build on the HPC community's skills in writing efficient parallel software, and the Embedded computing community's skills in working within strict power budgets. EPCC's technical contributions will be providing benchmark codes, real-life case studies based on scientific software, and provisioning and instrumenting hardware.

Kick-off
Adept started in September and we held our kick-off meeting in Edinburgh shortly after that. The project partners travelled from Sweden (Uppsala University and Ericsson) and Belgium (Ghent), as well as from just across town (Alpha Data). This first meeting discussed the general logistics of the project, but its key purpose was to ensure everyone involved could get up to speed quickly and focus on technical discussions.

Key issues
We took the opportunity to start addressing some of the key issues the project wants to tackle:
• What micro-, kernel- and application-level benchmarks can we develop that will allow us to measure power consumption of hardware components and to evaluate power use of different programming models and parallel algorithms?
• What hardware architectures, both representative of HPC and Embedded, are we going to measure power consumption on, and how will we achieve the high granularity we require for our modelling tool?
• What information and data will we need to extract from both the benchmarks and the hardware, and how will it feed into our performance and power model?
The meeting was wrapped up after a day and a half of in-depth technical discussions and the general feeling was that we had made an excellent start to the project. We will all get together again in Ghent in early December. In the meantime, work will start in earnest at the different sites and we hope to be able to report on first results in the near future – watch this space!
Michele Weiland m.weiland@epcc.ed.ac.uk
For more information on Adept, including announcements of upcoming events, see: www.adept-project.eu
Adept is partially funded by the 7th Framework Programme of the European Commission. It started on 1st September 2013 and is set to run for 3 years.

ECO2Clouds: energy efficiency in cloud computing
The EU-funded ECO2Clouds project is investigating how to make cloud computing more energy-efficient. The overall goal of the project is to couple the functional and economic advantages of cloud computing with measures that enable cloud providers and application developers to be more aware of the impact their operational and design decisions have on the environment. The project was developed as an extension of the BonFIRE cloud testing platform; however, the approach is general and could be applied to other cloud platforms. The ECO2Clouds utilities track the energy usage on a cloud infrastructure at three different layers:
• Infrastructure layer: ie the underlying physical host computer and the site where it resides
• Virtual Machine (VM) layer: ie the VMs that run on the host computer
• Application/Experiment layer: the application running on the cloud VMs.
To store these metrics and make them available to all ECO2Clouds modules, the ECO2Clouds Accounting Service was devised. A component of this Service, the Monitoring Collector, has been the recent focus of the EPCC team.

Monitoring Collector
The Monitoring Collector is the component of the ECO2Clouds Accounting Service tasked with tracking all ECO2Clouds-relevant resources, gathering their associated eco-metrics and recording them in a persistent database. These metrics are then utilised to inform decisions made by the ECO2Clouds Scheduler: the software component which uses optimisation techniques to produce energy-efficient application deployment configurations. The Monitoring Collector comprises three elements: the Monitor, the Collector and the accounting database. A relational database system is used, with a schema that captures the associations and dependencies between cloud resources at the different layers. Generally, these are that:
• an experiment can contain zero or more virtual machines, while a virtual machine can belong to only a single experiment
• a virtual machine can only be on a single physical host machine at any one time, but may migrate to a different host during its lifetime
• a host can only exist at a single site
• a site can contain multiple hosts.
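One way to picture those relationships is as a set of linked records. The C structs below are a purely illustrative sketch of the cardinalities just described (experiment, VM, host, site); they are not the actual ECO2Clouds accounting schema, and all field names are invented.

```c
/* Illustrative data model only: field names are invented and this is not
 * the real ECO2Clouds schema. Comments record the cardinalities from the
 * text. */
typedef struct {
    int  site_id;                /* a site can contain multiple hosts      */
    char name[64];
} Site;

typedef struct {
    int host_id;
    int site_id;                 /* a host exists at exactly one site      */
} Host;

typedef struct {
    int  experiment_id;          /* an experiment has zero or more VMs     */
    char owner[64];
} Experiment;

typedef struct {
    int vm_id;
    int experiment_id;           /* a VM belongs to exactly one experiment */
    int current_host_id;         /* one host at a time, but may migrate    */
} VirtualMachine;

int main(void) { return 0; }     /* declarations only; nothing to run      */
```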
Dominic Sloan-Murphy dsloanm@epcc.ed.ac.uk
Image: Monitoring Collector block interaction diagram.

Message management
The Monitor subscribes to the BonFIRE Management Message Queue (MMQ), through which all BonFIRE experiment notifications pass. BonFIRE is the cloud infrastructure currently used by ECO2Clouds. All messages from the queue are filtered by relevance, ie are they associated with ECO2Clouds, and are they an experiment or compute resource event? The Monitor extracts identifying information, such as the experiment ID, from the filtered messages and updates the accounting database to enable tracking of experiment resources and statuses.
The Collector is a concurrent component responsible for periodically gathering metric values for each of the resource layers. Metrics are measured by Zabbix, the enterprise-class monitoring solution capable of capturing statistics for all resource types. For host and site metrics, each site maintains a Zabbix infrastructure "aggregator" charged with aggregating all monitoring data related to that site and its physical host machines. Similarly, each ECO2Clouds experiment contains an experiment aggregator responsible for the application and VM layer metrics. The Collector first queries the accounting database for a list of active resources. This list is then employed to selectively query the appropriate aggregators, using an extension to the BonFIRE Application Programming Interface, for all ECO2Clouds-relevant metrics. This process is then repeated at an established polling rate, dependent on the desired age of the data to be made available to the Scheduler and other parts of the accounting service.
Through the Monitoring Collector, the ECO2Clouds Scheduler therefore has access to the information it needs to inform cloud providers and application developers of the impact that particular configurations and deployments will have on the environment, enabling them to make environmentally aware decisions.

About the project
ECO2Clouds is a collaboration between:
• EPCC
• ATOS (Spain)
• University of Manchester (UK)
• The University of Stuttgart's High Performance Computing Centre HLRS (Germany)
• Politecnico di Milano (Italy)
• Inria (France).
http://eco2clouds.eu

Addressing parallelism in legacy code
Over the past few years, multicore processors have become standard for all forms of computer, from mobile phones, to laptops, through to supercomputers. Processor manufacturers, unable to deliver cost-effective performance improvements through increased clock speed, have resorted to coupling together multiple cores into a CPU. Whilst multicore and manycore processors (such as GPGPUs) provide unprecedented parallelism and speeds in individual computing devices, the processing power of the individual cores in those processors has declined.

Parallelisation problem
This presents a problem for the wide range of existing applications currently used by companies and individuals for their work and play. Ensuring these applications can realise the performance potential of modern hardware, and indeed that they don't decline in performance due to the reduction in power of individual cores, requires the investment of significant effort and skill in parallelising software for these architectures. However, few software engineers have the experience of parallel programming needed to convert companies' large legacy code-bases to parallelism. This skill shortage represents a significant problem, which DynaPar addresses. DynaPar implements an assisted parallelisation approach that overcomes the limitations of the more usual source-code analysis tools by incorporating information from actual runs of the application with representative datasets. We have demonstrated an ability to deliver 96% of the performance of a manual parallelisation approach [1, 2], taking a fraction of the time to complete and with limited demands on the programmer's understanding of parallelisation.

Platform portability
DynaPar not only helps to address the parallel-programming skill shortage, but it also makes applications more sustainable through platform portability. Using conventional approaches, taking a parallel application from one computer architecture to another is a time-consuming and complicated process.
DynaPar effectively automates the porting process by using machine learning: a little preparation to train the tool for a new architecture allows one to compress the porting task into a short, clearly defined process. We are currently working with companies in Edinburgh to evaluate the performance and usability of the tool on existing C and C++ codes, and following this evaluation period we will be looking to roll out a full product, probably integrated into modern development environments (such as Eclipse), to help companies parallelise their programs.
Adrian Jackson, EPCC a.jackson@epcc.ed.ac.uk
Björn Franke, Institute for Computing Systems Architecture, School of Informatics bfranke@inf.ed.ac.uk

EPCC, working with the School of Informatics at Edinburgh, has been developing a parallelisation tool called DynaPar, designed to assist developers with parallelising serial programs. The collaboration, funded through the Numerical Algorithms and Intelligent Software (NAIS) project, builds on research undertaken in Informatics to identify parallel patterns in computer programs.

1. G. Tournavitis, Z. Wang, B. Franke and M. O'Boyle: Towards a Holistic Approach to Auto-Parallelization: Integrating Profile-Driven Parallelism Detection and Machine-Learning Based Mapping, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'09), Dublin, Ireland, June 15, 2009.
2. G. Tournavitis and B. Franke: Semi-Automatic Extraction and Exploitation of Hierarchical Pipeline Parallelism Using Profiling Information, Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT '10), Vienna, Austria, September 11-15, 2010.
www.nais.org.uk
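For readers unfamiliar with what an assisted-parallelisation tool of the kind described above produces, the sketch below shows the general shape of the transformation: a hot loop whose iterations profiling has shown to be independent is annotated so that it runs across the available cores. This is a generic OpenMP illustration, not output generated by DynaPar, and the loop and data are invented.

```c
#include <stdio.h>
#include <omp.h>

#define N 1000000

/* A serial hot spot of this form ...
 *
 *     for (i = 0; i < N; i++)
 *         y[i] = a * x[i] + y[i];
 *
 * ... can, once the iterations are known to be independent, be annotated
 * to run across all available cores, as below. */
int main(void)
{
    static double x[N], y[N];
    const double a = 2.5;

    for (int i = 0; i < N; i++) { x[i] = 1.0; y[i] = 2.0; }

    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        y[i] = a * x[i] + y[i];

    printf("y[0] = %f (using up to %d threads)\n", y[0], omp_get_max_threads());
    return 0;
}
```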
Exascale research in Europe
Image: participants of the exascale projects' meeting in Barcelona.
CRESTA is one of three complementary exascale projects funded by the EC. CRESTA's software focus sits comfortably with the DEEP and Mont-Blanc projects' focus on developing new hardware. Together, these projects underpin Europe's strategy for developing, producing and exploiting exascale platforms.

SC'13 activities
Collaboration activities have significantly increased between these projects over the past year. Following a series of successful joint birds-of-a-feather (BOF) sessions at SC and ISC, we expect our SC'13 BOF on 'Building on the European Exascale Approach' to stimulate interesting debate. The three projects will share a booth on the exhibit floor at SC'13, with demos from all three projects. CRESTA will demonstrate its tools (TUD's VAMPIR and MUST, Allinea DDT and Allinea MAP) and its applications (such as ECMWF's numerical weather prediction code IFS, and HemeLB, the computational haemodynamics code from UCL for simulating blood flow). We will also show a series of videos highlighting the socio-economic impact of the different applications engaged in CRESTA.

Future collaboration
The three projects had a successful meeting in Barcelona this summer, where we identified a number of key areas of potential collaboration. These included: testing CRESTA software on the other projects' hardware; tools training on each other's profiling and debugging tools; and testing novel programming models such as OpenACC and SMPSs on each other's applications.

MPI for exascale
Finally, it is worth highlighting EPCC's new exascale project, developed from work within CRESTA. Our collaborators at KTH in Sweden are leading an FP7 project on preparing MPI for the exascale. This will explore innovative, potentially disruptive concepts and algorithms for message passing. Its 'Exascale MPI' workshop at SC'13 will be a good opportunity to learn more.
This is an exciting time in European exascale research. In its final year, CRESTA will produce a series of systemware software components, enhanced versions of our co-design applications and novel scientific examples. We will keep you posted on their arrival.
Lorna Smith l.smith@epcc.ed.ac.uk

CRESTA (Collaborative Research into Exascale Systemware, Tools and Applications) investigates the software challenges of utilising future exascale resources. Coordinated by EPCC, this FP7 project has now entered its final year.
www.cresta-project.eu

Find us at SC'13
• Exascale MPI workshop: Nov 22, 9.00–13.00
• Building on the European Exascale Approach: Nov 19, 12:15–13:15
• European Exascale Projects (Booth 374)
• EPCC, University of Edinburgh (our own booth: 3932)

First impressions of the Intel® Xeon Phi™
Fiona Reid f.reid@epcc.ed.ac.uk
Iain Bethune i.bethune@epcc.ed.ac.uk
Image: performance of CP2K for an ab initio MD run with 64 water molecules on the Xeon Phi, showing the effect of task placement.
Over the last few months at EPCC we have been evaluating the new Xeon Phi co-processor from Intel, which recently powered the Chinese Tianhe-2 cluster to the number 1 spot on the Top500 list. Based on Intel's Many Integrated Core (MIC) architecture, each Xeon Phi card comprises 60 physical cores with 4-way simultaneous multi-threading (SMT), meaning that there are up to 240 virtual threads available to the user. Each core has a 512-bit wide SIMD vector-processing unit. With a clock speed of 1.053 GHz, this results in an aggregate peak double-precision floating-point performance of 1.01 TFLOP/s - all contained inside a single PCI card package, and using less than 225 Watts! If we consider that just 10 years ago the National Service HPCx Phase 1 delivered 2.2 TFLOP/s peak, occupied over 40 cabinets and was the sole Capability Computing resource for the entire UK computational science community, then it seems we have come a rather long way in 10 years!

Programmability
The Xeon Phi cards can be programmed using a number of different models including OpenMP, MPI, Intel Threading Building Blocks (TBB), Intel Cilk+, and OpenCL. Intel emphasises the ease of use compared with competitors such as CUDA on the NVIDIA GPU platform. To cross-compile code for the MIC architecture, all the user needs to do is use the Intel C/C++/Fortran compiler with the -mmic flag and ensure that any libraries required by the code are also compiled for the Xeon Phi. This makes porting existing parallel codes much easier than having to re-write large amounts of code in another language.

Xeon Phi at EPCC
Two Intel Xeon Phi 5110P cards were installed in May as an extension to our 'Hydra' cluster and were made available to staff and students for testing and evaluation. The majority of our initial investigations have involved the CP2K materials science application, which we have successfully ported to the Phi and are currently optimising with support from the PRACE project. One of our MSc students, Jonathan Low, ported a medical imaging application as part of his dissertation project, and we have several external users working with codes in the fields of Geoscience and Optics. For the benefit of others who may be considering the Xeon Phi, our key findings are below.
Performance of FFT libraries on Xeon Phi. Problem sizes are typical of those used by CP2K.

Application scalability and low memory usage
To get the best performance from the Xeon Phi, you need an application that can scale to 240 threads: unlike conventional processors, the Xeon Phi needs to keep as many of its 240 hardware threads busy as possible. However, the cards have relatively small amounts of memory - 8 GB in total, or around 34 MB per thread. Most applications designed for memory-rich, multi-core CPUs will therefore run out of memory before enough threads can be generated to make full use of the Xeon Phi. New algorithms that minimise memory use or maximise sharing between threads may be needed.

Task placement
Even for an application with 240 low-memory threads, poor distribution of those threads across the physical cores can result in very poor performance, as the cores transfer far more data across the ring interconnect that couples them to main memory. The figure on the opposite page shows the performance of MPI, OpenMP and MPI/OpenMP versions of CP2K running in native mode on the Xeon Phi. It is very important to place threads as close as possible to their parent process while maintaining overall load balance, particularly when running in mixed mode.

Performance tuning
In addition to parallelism, serial code must also be tuned for the Xeon Phi to get maximum performance. Arranging loops to allow compiler vectorisation (the MIC SIMD unit can perform 8 double-precision fused multiply-adds per cycle) or utilising specially tuned libraries (such as Intel's MKL) for the key computational kernels of your code is crucial. The figure above shows a comparison of the performance of the widely used FFTW 3.3.3 library against the Intel MKL implementation of the FFTW3 interface, which has been tuned specifically for the Xeon Phi. In our tests, MKL outperformed FFTW3 by up to 6x for 1D FFTs and 3x for 3D FFTs. For codes which spend considerable time doing Fourier transforms, using the right library can have a significant impact on performance.
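As an illustration of why trying the tuned library is cheap (a minimal sketch of our own, not the benchmark code behind the figure, and the exact link lines will depend on the local compiler and MKL installation), the same FFTW3 source can be linked against either stock FFTW or MKL's FFTW3 wrapper interface:

```c
/* fft1d.c - illustrative sketch only.
 *
 * Stock FFTW (built for MIC):  icc -mmic -O2 fft1d.c -lfftw3 -o fft1d.mic
 * Intel MKL FFTW3 interface:   icc -mmic -O2 fft1d.c -mkl    -o fft1d_mkl.mic
 *
 * Which library executes the transform is decided purely at link time;
 * the source code is unchanged.
 */
#include <fftw3.h>
#include <stdio.h>

int main(void)
{
    const int n = 128;   /* a 1D size of the order used by CP2K's grids */
    int i;

    fftw_complex *in  = fftw_malloc(sizeof(fftw_complex) * n);
    fftw_complex *out = fftw_malloc(sizeof(fftw_complex) * n);

    /* Plan before filling the arrays: FFTW_MEASURE may overwrite them. */
    fftw_plan plan = fftw_plan_dft_1d(n, in, out, FFTW_FORWARD, FFTW_MEASURE);

    for (i = 0; i < n; i++) { in[i][0] = (double)i; in[i][1] = 0.0; }

    fftw_execute(plan);
    printf("out[1] = (%g, %g)\n", out[1][0], out[1][1]);

    fftw_destroy_plan(plan);
    fftw_free(in);
    fftw_free(out);
    return 0;
}
```

For 3D transforms the equivalent plan call is fftw_plan_dft_3d, and the same link-time switch applies.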
MSc in High Performance Computing

Students and staff from all three MScs.

This year we welcome new students from 10 different countries to our MSc in High Performance Computing. The School of Physics & Astronomy has recently launched new MSc programmes in Theoretical and Mathematical Physics, so our students are now part of a larger taught postgraduate community in the School.

It is an exciting time for the MSc in HPC, as this year's students will be the first to have access to ARCHER, the new UK national supercomputer hosted in Edinburgh (see p4 for more details). At the recent two-day induction event, our new students were introduced to the University, to EPCC, and most importantly to their fellow students and teaching staff. Not only do the students come from a wide range of countries, they also have a diverse range of backgrounds: some have recently graduated from undergraduate studies, some have been in employment, while others have been doing research in other or related fields. They have all now come together to learn about parallel programming and HPC technologies.

Industrial projects
In addition to a wide range of academic dissertation projects, this year our MSc students will have the opportunity to undertake their dissertation project with a local company. By undertaking an industry-based dissertation project, students will have the opportunity to enhance their skills and employability by tackling a real-world industry project, gaining workplace experience, exploring potential career paths and building relationships with local companies. In addition there will be an opportunity to win the Summer Industry Project Prize.

Cluster Competition
We will again be offering a team of students the opportunity to participate in the International Supercomputing Conference's Student Cluster Competition as part of their MSc dissertation project. Student teams from around the world build high-performance clusters and compete against each other to achieve the maximum performance from a set of benchmarks and applications.

The MSc dissertation is considered by many students to be the culmination of their degree, and we're pleased to be able to offer such a wide range of project opportunities, catering for the varied interests, backgrounds and aspirations of our students.

Crystal Lei
yuhua.lei@ed.ac.uk

2012 dissertations
The class of 2012 recently received their degree classifications. To see their MSc dissertations go to: www.epcc.ed.ac.uk/msc/overview/programme-structure/mscdissertations/

MSc in HPC
www.epcc.ed.ac.uk/msc

Industrial projects
www.msc-projects.ph.ed.ac.uk/

ISC'14 Student Cluster Competition
http://hpcadvisorycouncil.com/events/2014/isc14-studentcluster-competition/

Workshop for research software engineers

The Software Sustainability Institute's inaugural Workshop for Research Software Engineers was held at the Oxford e-Research Centre in September. It brought together a wide range of interested parties to discuss the challenges facing the application of software to research, from funding models to infrastructure provision.

The day was split in two, with the morning dedicated to the sharing of experiences and best practice, including keynotes from Mark Hahnel, who left academia to develop Figshare, and Professor Ian Gent, author of The Recomputation Manifesto. The afternoon was spent in groups discussing and proposing solutions to the issues affecting software engineers who support research. These issues are wide-ranging and won't be resolved immediately, but there was a consensus that a strong community, fostered by events such as this workshop, should inform policy-making at the SSI. The SSI in turn can encourage funders and institutions to apply metrics that more effectively value the contribution of research software engineers, hopefully leading to greater recognition and higher-quality software and research.

Mark Woodbridge, Imperial College London
m.woodbridge@imperial.ac.uk

Software Sustainability Institute
www.software.ac.uk

DiRAC driving test roll-out under way

The "driving test" developed by the Software Sustainability Institute, Software Carpentry and the DiRAC consortium is now being rolled out across DiRAC's regional sites. DiRAC is the UK's integrated supercomputing facility for theoretical modelling and HPC-based research in particle physics, astronomy and cosmology.

The driving test is a basic software skills aptitude test which covers useful and essential software development skills including:
• the shell and automation
• version control
• testing
• code review
• using public/private keypairs
• secure shell.
Training coordinators at DiRAC's sites in Durham, London, Leeds, Edinburgh, Leicester, Exeter and Cambridge are collectively putting around 70 post-doctoral research associates, 130 PhD students and 40 other new users through the test between 16th September and early December, with an ongoing cohort estimated at 30-40 a year.

The test is designed to encourage researchers to undertake training in essential software development skills, both to benefit their research and to help ensure DiRAC's resources are used as efficiently and effectively as possible.

Mike Jackson
m.jackson@epcc.ed.ac.uk

DiRAC at EPCC
www.epcc.ed.ac.uk/facilities/dirac

We know what you did last summer

Image shows polarising microscope texture of a thin film of liquid crystal. Pic: redorbit.com

The PRACE Summer of HPC is an outreach initiative which allowed 24 students from across Europe to spend two months at a High Performance Computing centre, working on a visualisation project to produce a demo. The students came from ten European countries, and their projects ranged from modelling bioflow in coronary arteries to visualising plasma turbulence. They all shared a keen interest in computer simulation as a scientific methodology, a desire to learn more about it and to share their knowledge with other young scientists – to "spread some HPC magic", as one student, Vojtech Bardiovsky, put it.

Training
The Summer of HPC began with a training week at EPCC in Edinburgh. Five full days of courses in MPI, OpenMP and scientific visualisation were taught by Summer of HPC staff. In the evenings the students found time to explore Edinburgh. After their training was completed, they flew off to their host centres to begin work on their projects.

Four students were hosted at EPCC: Antoine Dewilde (Belgium), Marko Misic (Serbia), Simone de Camillis (Italy), and Stamatia Tourna (Greece). As Stamatia said in her blog: "'A Belgian, a Serbian, an Italian and a Greek are at a bus stop' could be the beginning of a joke, but in this case it is the beginning of our day!"

Projects
The work the students did was far from a joke, however. These were exciting times – Marko, who worked on the CP2K code with Iain Bethune, had to defeat the supercomputing Hydra along the way, and Antoine, together with Nick Brown, built virtual dinosaurs and then raced them. Stamatia, with the help of Nick Johnson, worked on Python support for Score-P's tracing library, and Simone, supervised by Oliver Henrich and Kevin Stratford, looked inside the liquid crystal display with ParaView.

Irina Nazarova
i.nazarova@epcc.ed.ac.uk

"These two months working at EPCC were wonderful; Edinburgh really welcomed us. It was an amazing summer and I would definitely suggest to anyone to give it a try. It could really be a lifetime's experience."
Stamatia Tourna

Read more on the blog: http://summerofhpc.prace-ri.eu

Supercomputing for the masses

For the second year running, EPCC attended both the British Science Festival – one of Europe's largest celebrations of science, engineering and technology – and Bang Goes the Borders, a local event that targets families and is held in Melrose, Scotland. We hoped to enthuse and inform the general public, especially young minds, about what supercomputers are and what they are used for, and even to let them have a go on HECToR, the UK national supercomputing service, using our dinosaur-racing demo (see p26).
For the dino-racer, we used a model of Argentinosaurus, a large quadruped that walked across what is now Argentina some 96-94 million years ago. Our demo allows three aspects of the dinosaur to be configured: the foot size, the leg size and the body size. The model is then passed to GaitSym (running on HECToR), which precalculates its motion. Several modified dinosaurs are placed on a running track and the race begins! Excited children cheered on their creature to victory or ignominy – a bad design risks your dinosaur falling over before reaching the finishing line.

Alongside this we have another demo showing how parallelism can solve problems more quickly, by sorting coloured balls into boxes. Individuals and small groups would compete against the clock to complete the sorting task. If too many people work at once they might get in each other's way – a synchronisation problem. Once visitors had learned about parallelism, we showed them some HPC hardware – a Cray XT4, a Cray T3E and a Connection Machine blade – comparing these to more familiar modern desktop hardware.

All the activities work independently of each other, and people can participate in the parts that interest them. We will continue to visit schools and science events to show why supercomputers are important and to convince young people to choose a career in science.

Mario Antonioletti
m.antonioletti@epcc.ed.ac.uk

Both science festivals were extremely busy, with over 500 people visiting our exhibits. Dino-racing proved to be a big draw. More dinosaur models are in the pipeline, especially bipeds like T. rex! You can read more about our outreach work on the EPCC blog: www.epcc.ed.ac.uk/blog

Running with dinosaurs

We have developed a dino-racing demo based on the pioneering work of the Animal Simulation Laboratory at the University of Manchester.

The Animal Simulation Laboratory investigates animal locomotion by creating computer simulations. It is perhaps best known for its work simulating dinosaur movements: by building accurate models based upon fossil evidence, the team has deduced the likely movements and top speeds of these prehistoric creatures. HECToR, the UK national supercomputing service, was used in conjunction with the laboratory's GaitSym simulator to do much of the heavy computation, and palaeontologists have learned a great deal from the results of this work.

Dinosaurs have a special appeal for many people, so we thought we might be able to use GaitSym and these models as the basis of an outreach demonstration. This forms an ideal illustration of HPC, because simulation is increasingly becoming the third research methodology, complementing theory and experimentation. Deducing dinosaur movements provides a very clear example of where scientists need simulation to test their theories, because experimentation is simply not possible.

Our end goal was for the public to easily configure their own dinosaur, simulate its movements on HECToR and then race it against other people's creatures to see who could design the fastest one. This allows people to do raw science at our outreach events: they design their creatures and then use HPC to validate their dinosaur before watching it race other similarly designed creatures.

Initially we developed a prototype, which connected to GaitSym running on HECToR and allowed us to race existing dinosaurs that the team at Manchester had very helpfully provided models for.
By far the most detailed model was the Argentinosaurus, and whilst our early version was certainly a work in progress, the visual impact of dinosaur skeletons plodding across the screen was still impressive. This gave us confidence that, with some further development, dinosaur racing could prove to be a successful outreach activity.

Our early work coincided with the project calls for the Summer of HPC programme, which offered undergraduate and junior postgraduate university students the opportunity to spend two months of the summer at European HPC centres (see p24). The programme specifically sought visualisation projects, so further development of our virtual palaeontology prototype seemed like an ideal fit, and Antoine from the Université libre de Bruxelles joined us for the summer.

Antoine certainly had his work cut out, as there was plenty to do – such as configuration of the dinosaurs, improvements to the graphics and reporting real-time simulation usage. Lots of development was done over the eight-week period, and by the time Antoine left, all of our project goals had been met and Virtual Palaeontology had developed from a rough prototype into a polished application.

It was with some trepidation that we arrived in Newcastle for the British Science Festival. This was to be the debut of our Virtual Palaeontology outreach demo, and it would be sitting alongside our other exhibits. How would it work outside of the lab? Would it stand up to a full day of heavy use? Would the network connection to HECToR be good enough? And, most importantly, would the general public engage with it and get a clear idea of the importance of HPC?

The doors opened at 10am, and it wasn't long before we had a race going between two dinosaurs. Not just kids but adults too enjoyed configuring their own dinosaur to see how well it would perform. We found out plenty along the way, such as that configuring dinosaurs is trickier than you might first think! Not all configurations will work and poorly designed creatures will fall over, so half the challenge for participants is completing the race. Some people thought that a smaller dinosaur would go faster, others that a larger one would be best. Not all creatures were stable, but all at least managed to run a few metres before falling over, and probably 60% of dinosaurs completed the race. Regardless of whether a dinosaur completed the race or not, we gave the designer a certificate with an image of their customised creature and vital statistics such as its weight, height and top speed.

Throughout the day the demo was kept busy, and at times people were queuing up to try to create the fastest creature!

Nick Brown
n.brown@epcc.ed.ac.uk

Our Virtual Palaeontology demo went down very well at the British Science Festival, and it has since been well received at two more outreach events. We have plenty of ideas for further developments; some of the best have come from questions asked about dinosaurs and HPC by the public during these events. You can read more about our dino-racing demo on the EPCC blog: www.epcc.ed.ac.uk/blog

Postgraduate Master's Degree in High Performance Computing

Scholarships available for 2014/15

This MSc is offered by EPCC, an institute at the University of Edinburgh. EPCC is one of Europe's leading supercomputing centres and operates ARCHER, a 72,000-processor Cray XC30 system. ARCHER is the new UK academic High Performance Computing system.
This MSc will equip participants with the multidisciplinary skills and knowledge to lead the way in the field of High Performance Computing. Through our strong links with industry, we also offer our students the opportunity to undertake their Master's dissertation with one of a wide range of local companies.

The University of Edinburgh is consistently ranked among the top 50 universities in the world*.

*Times Higher World University Ranking

www.epcc.ed.ac.uk/msc