Hi-Res PDF - Polyhedron

Newsletter
Summer 2007
With Dual-Core processors now outselling single cores and quad-cores just arriving
on the market, we look at how developers can exploit this extra processing power,
and stay ahead of their competitors.
If you buy a new computer today, the chances are
that it will have a dual or even a quad-core
processor. Intel and AMD have taken a break
from endlessly ramping up clock
speed, and are instead putting
several CPUs into a single socket - a
trend which will undoubtedly continue
for a few years.
To get the maximum performance
from these new architectures,
programmers have to divide their
problem up, so that several CPUs can
work on it simultaneously, and then
merge the results into a coherent
whole before presenting them to the user. This can
be a daunting task, with many new ways to go
wrong and get confused. Thankfully, there are tools
to ease the pain, by providing standardised ways to
analyse and implement parallel solutions, and to
visualize and debug the resulting software systems.
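The divide, compute and merge pattern described above can be sketched in a few lines. This is only an illustration in Python, not code from any of the products discussed here; the function names and the sum-of-squares workload are assumptions chosen for the example. Note that threads in standard CPython do not speed up CPU-bound work, so the sketch shows the structure of the approach rather than a real performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_of_squares(chunk):
    """Work done independently by one worker on its share of the data."""
    return sum(x * x for x in chunk)


def split(data, parts):
    """Divide data into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(data, workers=4):
    """Divide the problem, run the pieces concurrently, merge the results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)  # merge step: combine the partial results
```

Calling `parallel_sum_of_squares(list(range(1000)))` gives the same answer as the serial loop. The merge step is trivial here, but in real applications it is often where the subtle errors appear.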
Analysis
The first step in working towards parallelization is to
analyze the existing code. The Intel VTune
Performance Analyzer has been
designed to quickly find application
bottlenecks by using different techniques
to gather tuning data. The Sampling
technique runs with a very low overhead
and quickly shows where the application
is spending most of its time, while the Call
Graph profiler gives a pictorial view of program flow
and calling sequences.
Introduce Parallelism
The simplest option, provided by some compilers
(see table on page 2), is "Auto-Parallelization". This
requires no user intervention - the compiler
automatically detects loops that can be executed in
parallel and creates appropriate code. This can be
useful, but it's not a panacea: you can usually do
better with some programmer input.
OpenMP
OpenMP is the industry standard for developing portable
multithreaded applications. The compiler directives are
an easy and powerful way to
convert sections of serial code
into parallel code. Parallelism is
added incrementally so that the
sequential program gradually
evolves into a parallel program.
OpenMP is normally restricted to
"shared memory" systems, though
Intel have recently introduced
"Cluster OpenMP", which extends
the model to clusters.
MPI
MPI is designed for "clusters" - groups of separate
computers, each with its own memory, which exchange
data via a high-speed network. However, MPI programs
also run on shared memory systems, often with
efficiency similar to OpenMP. The MPI model is quite
different to OpenMP - you have to think of multiple copies
of a program passing messages to each other, rather
than a single program spawning new threads. It is
harder to program, but more flexible; it is the most
common choice for high performance super-computers.
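The "multiple copies of a program passing messages" model can be imitated on a single machine. The sketch below is not MPI and uses no MPI library: it is a Python analogy in which threads stand in for separate processes and queues stand in for the network. The names `run_rank` and `message_passing_sum` are hypothetical. Every "rank" runs the same function, works only on its own slice of the data, and sends its partial result to rank 0.

```python
import queue
import threading


def run_rank(rank, size, inboxes, data, result):
    """The same code runs on every rank; behaviour differs only by rank."""
    # Each rank works on its own strided slice, as if held in private memory.
    local = sum(data[rank::size])
    if rank != 0:
        inboxes[0].put((rank, local))      # "send" the partial result to rank 0
    else:
        total = local
        for _ in range(size - 1):          # "receive" one message per other rank
            _sender, value = inboxes[0].get()
            total += value
        result.append(total)


def message_passing_sum(data, size=4):
    """Start `size` copies of run_rank and return the total held by rank 0."""
    inboxes = [queue.Queue() for _ in range(size)]  # one mailbox per rank
    result = []
    ranks = [
        threading.Thread(target=run_rank, args=(r, size, inboxes, data, result))
        for r in range(size)
    ]
    for t in ranks:
        t.start()
    for t in ranks:
        t.join()
    return result[0]
```

The point of the analogy is the design discipline: no rank reads another rank's partial result directly, so all co-ordination is visible as explicit sends and receives.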
Debugging and Tuning
There are many new classes of error in parallel
programming. For example, a "data race"
occurs when multiple threads access the same
memory and the result depends on which gets there
first. Threads may also stall or block each
other. The Intel Thread Checker and Thread
Profiler can be used to detect data races,
pinpoint errors and highlight thread imbalances.
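A data race of the kind described above can be illustrated with a shared counter. The sketch below is a minimal Python example, not output from or input to the Intel tools; the `Counter` class and `run` helper are hypothetical names. The unprotected method performs a read-modify-write that two threads can interleave, while the protected method uses a lock so that only one thread updates the value at a time.

```python
import threading


class Counter:
    """A shared counter with an unprotected and a lock-protected update."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add_unsafe(self, n):
        # Read-modify-write with no protection: two threads may read the
        # same old value, and one of the two increments is then lost.
        for _ in range(n):
            self.value += 1

    def add_safe(self, n):
        # The lock makes each read-modify-write indivisible.
        for _ in range(n):
            with self._lock:
                self.value += 1


def run(method_name, threads=4, n=10000):
    """Run the chosen method from several threads and return the final count."""
    counter = Counter()
    workers = [
        threading.Thread(target=getattr(counter, method_name), args=(n,))
        for _ in range(threads)
    ]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return counter.value
```

`run("add_safe")` always returns `threads * n`. `run("add_unsafe")` may or may not, depending on the interpreter version and on timing, which is exactly what makes such bugs hard to find by testing alone.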
Similar problems arise with MPI, and in addition, the
time spent passing messages can eat into the gains
from parallelization. Intel's Cluster Toolkit for Linux
provides an excellent Trace Analyzer and Collector to
graphically show how time is being spent in each
separate process.
Polyhedron's website is widely respected
as a source for impartial reference material
Visit us at http://www.polyhedron.com/
Compiler Comparison
Windows compilers
                     ABSOFT 10  INTEL 10  LAHEY 7.1  NAG 5.0  PGI 7.x  Salford 5.10
OpenMP               YES*       YES       NO         YES      YES      NO
64-bit support       YES        YES       NO         NO       YES      NO
2003 extensions      NO         YES       NO         YES      NO       NO
VS2005 Integration   NO         YES       Soon       NO       YES      YES
Linux compilers
                      ABSOFT 10  INTEL 10  LAHEY 8.0  NAG 5.0  PGI 7.x  Pathscale 3.0
OpenMP                YES*       YES       YES        YES      YES      YES
Auto-Parallelization  YES*       YES       YES        NO       YES      NO
64-bit support        YES        YES       YES        YES      YES      YES
2003 extensions       NO         YES       NO         YES      NO       NO
* Separate MP compiler
Portland PGI
Compilers & Tools
Polyhedron are pleased to announce their new
partnership with Oregon-based The Portland
Group. The company offers high performance
scalar and parallel Fortran, C and C++ compilers
and tools for 32-bit and 64-bit workstations, servers
and clusters running either Windows or Linux.
The PGI Visual Fortran product fully integrates the
PGI suite of Fortran compilers into Microsoft Visual
Studio 2005 with standard features such as syntax
colouring, tips and keyword completion and
enhanced features such as graphical symbolic
debugging of multi-thread and OpenMP applications.
PathScale (EveryKnownOptimizationPath)
High Performance 32-bit & 64-bit Compiler Suite.
New to Polyhedron - PathScale Compiler Suite
• C, C++, and Fortran 77/90/95 compilers.
• Industry leading optimizations
• Complete support for OpenMP 2.0 (including WORKSHARE)
• Complete support for 64-bit and 32-bit x86 compilation
• Code generation for AMD64 ABI, AMD Opteron, and Intel EM64T
• QLogic optimized AMD Core Math Library (available for download)
• New advanced serial debugger - Pathdb
• Compatible with GNU/gcc tool chain and popular Third Party debuggers
• Supported on SUSE, RedHat, and Fedora Linux
We pride ourselves on our technical support and can offer advice to help you get the most out of your programming.
Alternatively, if you need to use our programming services, no job is too big or small. We use the software that we sell!
V7.1 of Winteracter, the Fortran GUI and graphics toolkit, has now been released and includes the following
features:
• 3D models can now be split into parts allowing transformations, visibility,
  material and names to be applied to sections of a model
• 3D DXF support has been upgraded to include the interrogation of the
  number of vertices and facets, part recognition and faster loading of 'shared
  vertices'
• Database interrogation via ODBC on Windows, Linux and Mac OS X
• Support for multi-line graphics text-blocks
• Faster bitmap rotation on Windows
• X/Winteracter now available for Mac OS X/x86 using Intel or g95 compilers
• Support for Absoft Pro Fortran v10/Win32 as well as Win64, Linux32 and
  Linux64
• WiDE improvements to Fortran source reformatter and source code
  exchange between Windows and Linux/Mac OS X
• Help Editor: HTML tag insert options, access to Microsoft's HTML Tag Reference, Expanded documentation
• Advanced documentation search options under Windows and new wsearch tool on Linux and Mac OS X
• Additional help options and HTML tag insertion options in Troubleshooter editor
All trademarks are acknowledged
Tecplot 360 is CFD & Numerical Simulation Visualization Software. Finally, with just one tool you can analyze and
explore complex datasets, arrange multiple XY, 2D and 3D plots, create
animations & then communicate your results with brilliant, high-quality output.
Tecplot 360 is designed for speed. Smarter loading of data results in a faster time to
display the first image and makes it possible to open files that previously could not
be opened. Unlike the previous single-threaded version, Tecplot 360's intensive
computing operations are now spread across all available CPUs, resulting in faster
streamtraces, slices and iso-surfaces.
Tecplot Focus
Tecplot Focus is a great way to get into Tecplot functionality if you don't need the
support of CFD data formats, CFD analysis and transient data. Its great price still
includes features such as multi-frame layout, macros and automation, XY, 2D and
3D plotting, and multi-platform support.
Intel® Compilers V10
Intel® Visual Fortran compilers now include Visual Studio
2005 PPE, so you only need to purchase a copy of VS if
you are doing mixed language programming.
The Professional Editions of Intel Fortran for Windows,
Linux and Mac OS include Intel® Math Kernel Library
(MKL®), and customers can upgrade from Standard to
Professional for the price of Professional support.
Other new Fortran compiler features include:
• More Fortran 2003 features (Win/Linux/MAC)
• Updated COM Server Wizard (Win)
• Improved performance & threading (Win/Linux/MAC)
• Security checking & diagnostics (Win/Linux/MAC)
• Optimization reports (Win/Linux/MAC)
• Support for latest multi-core processors
(Win/Linux/MAC)
• 64-bit Mac OS X support (MAC)
Intel C++ new features include:
• Improved performance & threading (Win/Linux/MAC)
• Security checking & diagnostics (Win/Linux/MAC)
• Windows Vista and Visual Studio 2005 support (Win)
• Optimization reports (Win/Linux/MAC)
• Support for latest multi-core processors
(Win/Linux/MAC)
Customers can also upgrade Intel® C++ Standard to
Professional for the price of Professional support. Intel®
C++ Professional adds Intel® Threading Building Blocks,
Intel® Integrated Performance Primitives and MKL®.
Intel Visual C++ does not include Microsoft Visual Studio
PPE - it still requires Microsoft Visual C++ or above.
GINO 7.0 is now available and includes the following new features:
• Importing of 2D and 3D DXF files including facilities to enquire the entity
  count, graphical extent and list of layer names
• Import DXF polymesh surface for interpretation by GINOSURF
• Access to Hardware fonts by name and point size
• Generate BMP/JPEG/PNG image containing OpenGL graphics
• Generate colour-scaled XY plots together with automatic graduated colour
  bar
• Extend the cut and fill surface functionality catering for break lines and site
  boundaries
• Cater for vertical fault lines in surfaces allowing for vertical discontinuity
  between two heights
• Addition of Spinner/up-down control in GINOMENU
• Generate Manifest files for maintaining Windows XP look and feel
• Allow definition of HyperText link callbacks in GINOMENU
• Improved Code Editor management in GINOMENU Studio including new Source Code viewer
• Support for Visual Studio 2005 including full online Help integration
Lahey LF64 for 64-bit Linux
LF64 is available in two configurations, Express and Professional. LF64 PRO adds auto-parallelization, OpenMP
compatibility, the Winteracter Starter Kit, WiSK, for creating Windows GUIs and displaying graphics, thread-safe BLAS
and LAPACK, Polyhedron's Automake utility, and the Fujitsu SSL2 math library (thread-safe for parallel applications).
LF64 compiles code for 64-bit machines, and that code will not run on 32-bit computers. LF95 for 32-bit Linux will run on
32-bit or 64-bit Linux machines, but it only produces 32-bit code.
Based on Mathematica, the tool of choice for
scientific research, engineering analysis,
modelling and technical education,
gridMathematica 2 delivers an optimized parallel
environment for modern multiprocessor machines,
clusters, grids and supercomputers.
Features include:
• Parallelization at the Mathematica language level
• Support for multiprocessor machines, clusters and grids
• High-performance MathLink communication protocol optimized for all
  common configurations
• Efficient, adaptive load balancing
• User-programmable scheduling
• Support for tracing and debugging
Ease of Development
gridMathematica introduces only a small number of new
parallel computing constructs, and users familiar with
Mathematica can transition to gridMathematica without
difficulty. Furthermore, programs written in Mathematica
can be easily modified to run on a grid. Even users who
are new to Mathematica can use its high-level
programming capabilities and thousands of built-in functions
to solve grid-computing problems that used to require
thousands of lines of code in C or Fortran.
Platform Independence
gridMathematica is platform independent and can be used on
dedicated multiprocessor machines as well as on homogeneous
and heterogeneous clusters. The only technical requirement,
apart from the ability to run Mathematica, is a TCP/IP
connection between the individual computing nodes. This
connection allows customers to run the same code on any
available machines without any porting work. It also makes it
easy to build ad-hoc clusters out of under-utilized computers
or to take advantage of low-use periods.
Polyhedron Iterative Linear Solver
Many of you will have seen that, on our web site, there is a short paper
about Nested Factorization, an iterative linear solver which was
developed over 25 years ago by Dr John Appleyard, one of the
founders of Polyhedron, and his then colleague Dr Ian Cheshire. Ever
since, this algorithm has been at the core of the ECLIPSE reservoir
simulator from Schlumberger. However, Nested Factorization is not
widely known outside the oil industry. The NF benchmark on the
Polyhedron compiler comparison pages includes a simple
implementation of that algorithm.
Over the past couple of years, Polyhedron has developed a radical new
version of this algorithm which, unlike the original, is applicable to
general sparse matrices - particularly those arising from space-filling
meshes. Serial and MPI versions of the new algorithm have been
implemented successfully in the current version of ECLIPSE software
and results have been extremely good.
Polyhedron is now looking to exploit this new algorithm in other
applications and industries. If you have an application that depends on
the fast solution of huge sparse matrices using iterative methods,
please email john.solver@polyhedron.com to arrange a meeting.
Visit our newly designed web-site at www.polyhedron.com for independent compiler comparisons, advice on programming and helpful articles.
Silverfrost FTN95 V5.10
Find your bugs fast with FTN95. Download the trial
version from our web-site to evaluate the compiler. See
for yourself how you can save hours using
CHECKMATE to find your undefined variables.
Absoft Pro Fortran
NEW - IMSL 32-bit & 64-bit numerical libraries for Pro Fortran
v10 under Mac/Intel are now shipping. Currently IMSL
numerical libraries are supported on over 70 platforms
worldwide, but IMSL for MacOS/Intel is available only through
Absoft.
Polyhedron Software Ltd, Linden House, 93 High Street, Standlake, WITNEY, OX29 7RH. United Kingdom
Tel (+44/0)1865-300579, Fax (+44/0)1865-300232 sales@polyhedron.com
www.polyhedron.com