Before building Sourcery VSIPL++, you must run a configuration script to tell Sourcery VSIPL++ what C++ compiler you are using and what optional software you wish to use. After running the configuration script, you will build and install the Sourcery VSIPL++ library.
These instructions assume that your shell's current directory is the
sourceryvsipl++-2.2-9 directory created when you
unpacked the VSIPL++ distribution. If you want to allow
Sourcery VSIPL++ to automatically configure itself, run:
> ./configure
You will see output explaining the configuration decisions that Sourcery VSIPL++ is making.
There are several options that you can use to tell Sourcery VSIPL++ about your particular environment.
CXX=path
Use path as the C++ compiler.
If you do not provide this option, Sourcery VSIPL++ will search
for a C++ compiler in your PATH.
CXXFLAGS=flags
Use flags as flags to pass to the
C++ compiler. The default value depends on your compiler. If
you are using multiple flags (like -O2
-ffast-math), you must enclose the
flags in quotes so that the shell
will consider all of the flags as a single argument.
--prefix=directory
Install the library in directory.
Header files will be placed in a subdirectory of
directory named
include; the library itself will be
placed in lib. You will need to have
sufficient permissions to write to the installation directory.
The default installation directory is
/usr/local, which is usually not writable
by non-administrators; therefore, you may want to use your
home directory as an installation directory.
--host=architecture
Specify the host architecture that Sourcery VSIPL++ will be built for. The default is to build Sourcery VSIPL++ to run natively on the build machine. This option is useful when cross-compiling Sourcery VSIPL++.
--disable-parallel
Do not use a parallel communications library, even if an appropriate MPI library is detected. This option is useful if you want to build a uniprocessor version of Sourcery VSIPL++. By default, MPI support will be included if it is available.
--enable-parallel
Search for and use a communications library for support of multi-processor systems for parallel computation.
--enable-parallel=lib
Search for and use the parallel communications library
indicated by lib. Available
options are lam, mpich2,
intelmpi, openmpi,
mpipro, and pas.
lam selects the LAM/MPI library.
mpich2 selects the MPICH2 library.
intelmpi selects the Intel MPI Library.
openmpi selects the Open MPI library.
mpipro selects Verari's MPI/Pro. This
option is necessary when using MPI/Pro on the Mercury platform.
pas enables the use of Mercury Parallel
Acceleration System (PAS) for parallel services if found.
This option is necessary to use PAS on the Mercury platform,
and when using PAS for Linux clusters.
--with-mpi-prefix=directory
Search for MPI installation in
directory first. MPI headers should
be in directory/include, MPI
libraries in directory/lib, and
MPI compilation commands (either mpicxx or
mpiCC) should be in
directory/bin. This option is
useful if MPI is installed in a non-standard location, or if
multiple MPI versions are installed.
--with-mpi-cxxflags=flags
--with-mpi-libs=flags
In some cases, Sourcery VSIPL++ is unable to automatically detect
the required compiler and linker options to enable MPI. In
these cases, the required C++ compiler flags can be specified
using the --with-mpi-cxxflags option, and the
required linker (library) flags can be specified using the
--with-mpi-libs option. These options must be used
together, and when they are used, the specific type of the MPI
library in use must be specified with the
--enable-parallel=type
option.
--disable-exceptions
Do not use C++ exceptions. Errors that would previously have generated an exception now cause an abort(). This option is useful if you want to build Sourcery VSIPL++ with a compiler that does not implement exceptions. By default, exceptions are used.
--with-ipp
Enable the use of the Intel Performance Primitives (IPP) if found. Enabling IPP will accelerate the performance of signal processing and view element-wise operations.
--with-ipp=win
Enable the use of the Intel Performance Primitives (IPP) for Windows if found. This option is useful when configuring Sourcery VSIPL++ on a Windows system.
--with-ipp-prefix=directory
Search for IPP installation in
directory first. IPP headers
should be in the include subdirectory of
directory and IPP libraries should
be in the lib subdirectory. This option
has the effect of enabling IPP
(i.e. --with-ipp). This option is useful
if IPP is installed in a non-standard location, or if multiple
IPP versions are installed.
--with-ipp-suffix=suffix
Use a processor specific version of the IPP libraries, as
indicated by suffix. For example,
the suffix em64t will select IPP libraries specific to em64t
processors. By default, non-suffix IPP libraries are used,
which determine the architecture at run-time and dynamically
load the appropriate processor-specific libraries. This
option is useful if the automatic dispatcher is not able to
determine the correct architecture.
--with-sal
Enable the use of the Mercury Scientific Algorithm Library (SAL) if found. Enabling SAL will accelerate the performance of view element-wise operations, linear algebra, solvers, and signal processing operations.
--with-sal-include=directory
Search for SAL header files in directory
first. This option has the effect of enabling SAL
(i.e. --with-sal). This option is useful
if SAL headers are installed in a non-standard location, such
as when using the CSAL library. However, it should not be
necessary when building natively on a Mercury system.
--with-sal-lib=directory
Search for SAL library files in directory
first. This option has the effect of enabling SAL
(i.e. --with-sal). This option is useful
if SAL libraries are installed in a non-standard location, such
as when using the CSAL library. However, it should not be
necessary when building natively on a Mercury system.
--with-cuda
Enable the use of NVidia's Compute Unified Device Architecture (CUDA).
This enables the use of certain graphics processing units (GPUs) as
computational accelerators (see NVidia's website for a list of
compatible cards). For FFT support, use --enable-fft=cuda
in addition to this option.
--enable-fft=lib
Search for and use the FFT library indicated by
lib to perform FFTs. Valid choices
for lib include
fftw3, cuda, ipp,
sal, and cvsip which select
FFTW3, CUDA, IPP, SAL, and C VSIPL libraries respectively. An additional
option, builtin, selects the FFTW3 library
that comes with Sourcery VSIPL++ (default). This option
should be used if an existing FFTW3 library is not available.
If no FFT library is to be used (disabling Sourcery VSIPL++'s
FFT functionality), no_fft should be chosen
for lib. Multiple libraries may be
given as a comma separated list. When performing an FFT,
VSIPL++ will use the first library in the list that can
support the FFT parameters. For example, on Mercury systems
--enable-fft=sal,builtin would use SAL's FFT
when possible, falling back to VSIPL++'s builtin FFTW3
otherwise.
--with-fftw3-prefix=directory
Search for FFTW3 installation in
directory first. FFTW3 headers
should be in the include subdirectory of
directory and FFTW3 libraries should
be in the lib subdirectory. This option
has the effect of enabling FFTW3 for FFTs
(i.e. --enable-fft=fftw3). This option is useful
if FFTW3 is installed in a non-standard location, or if multiple
FFTW3 versions are installed.
--disable-fftw3-simd
Disable builtin FFTW3 from using SIMD ISA extensions (such as AltiVec or SSE2). By default, FFTW3 uses SIMD ISA extensions because they improve performance. However, this option is useful when building for a platform that does not support the ISA extensions.
--with-lapack
Enable Sourcery VSIPL++ to search for an appropriate LAPACK implementation on the platform. If found, it will be used to perform linear algebra (matrix-vector products and solvers).
--with-lapack=lib
Search for and use the LAPACK library indicated by
lib to perform linear algebra
(matrix-vector products and solvers). Valid choices for
lib include mkl, mkl_win,
acml, atlas,
generic, builtin, and
no.
mkl selects the Intel Math Kernel Library (MKL)
to perform linear algebra if found.
mkl_win selects the Intel Math Kernel Library (MKL)
on Windows systems to perform linear algebra if found.
acml selects the AMD Core Math Library (ACML) to
perform linear algebra if found.
atlas selects the ATLAS library
to perform linear algebra if found.
generic selects a generic LAPACK library
(-llapack) to perform linear algebra if found.
builtin selects a version of LAPACK
that doesn't require ATLAS.
no is used to disable searching for a LAPACK
library.
--with-acml-prefix=directory
Search for ACML installation in
directory first. ACML headers
should be in the include subdirectory of
the install directory, whose path depends on the exact version of
the library you have. Similarly, ACML libraries should
be in the lib subdirectory. This option
has the effect of enabling ACML for lapack
(i.e. --with-lapack=acml). This option is useful
if the ACML is installed in a non-standard location, or if multiple
ACML versions are installed.
--with-atlas-prefix=directory
Search for ATLAS installation in
directory first. ATLAS headers
should be in the include subdirectory of
directory and ATLAS libraries should
be in the lib subdirectory, unless otherwise
specified by --with-atlas-include and
--with-atlas-libdir, respectively. This option
has the effect of enabling ATLAS for lapack
(i.e. --with-lapack=atlas). This option is useful
if ATLAS is installed in a non-standard location, or if multiple
ATLAS versions are installed.
--with-atlas-include=directory
Search for ATLAS include headers in
directory first. This option
has the effect of enabling ATLAS for lapack
(i.e. --with-lapack=atlas). This option is useful
if ATLAS is installed in a location that does not fit
the pattern assumed by --with-atlas-prefix.
--with-atlas-libdir=directory
Search for ATLAS library files in
directory first. This option
has the effect of enabling ATLAS for lapack
(i.e. --with-lapack=atlas). This option is useful
if ATLAS is installed in a location that does not fit
the pattern assumed by --with-atlas-prefix.
--with-mkl-prefix=directory
Search for MKL installation in
directory first. MKL headers
should be in the include subdirectory of
directory and MKL libraries should
be in the lib/(arch) subdirectory. This option
has the effect of enabling MKL for lapack
(i.e. --with-lapack=mkl). This option is useful
if MKL is installed in a non-standard location, or if multiple
MKL versions are installed.
--with-mkl-arch=architecture
Used in conjunction with --with-mkl-prefix to
specify which library subdirectory of MKL to use. If
--with-mkl-prefix=directory
is used to specify the MKL prefix, libraries are searched for
in directory/architecture. By default
architecture is deduced based on
the platform. This option is useful if this deduction is
incorrect.
--without-cblas
Disables the use of the C BLAS API, forcing the use of the Fortran BLAS API. This option is useful if building on a platform that does not provide the C BLAS API.
--with-cbe-sdk
Enable the use of the IBM Cell/B.E. Software Development Kit (SDK) version 3.0 or 3.1 if found. Enabling the Cell/B.E. SDK will accelerate the performance of FFTs, vector-multiplication, vector-matrix multiplication, and fast convolution.
--with-cbe-sdk-sysroot=directory
Search for Cell/B.E. SDK libraries and headers in a sysroot at
directory, rather than in the system
root directory (or the default sysroot location, in the case of
SDK version 2.1). This option has the effect of enabling use of
the Cell/B.E. SDK (i.e. --with-cbe-sdk).
This option is used for cross-compilation.
--with-numa
Enable the use of libnuma. This is useful on Cell/B.E. systems to ensure that SPE resources allocated for acceleration are local to the PPE running VSIPL++.
--with-cvsip
Enable Sourcery VSIPL++ to search for an appropriate C VSIPL
implementation on the platform. If found, it will be used to
perform linear algebra (matrix-vector products and solvers)
and some signal processing (convolution, correlation, and
FIR). If the --enable-fft=cvsip option is
also given, the VSIPL implementation will be used to perform
FFTs.
--with-cvsip-prefix=directory
Search for a C VSIPL installation in
directory first. Headers should be
in the include subdirectory of
directory and libraries should be
in the lib subdirectory. This option has
the effect of enabling the use of a VSIPL back end as if the
option --with-cvsip had been given. This
option is useful if VSIPL is installed in a non-standard
location, or if multiple VSIPL versions are installed.
--enable-only-ref-impl
Configure Sourcery VSIPL++ to be used as the VSIPL++ reference
implementation. When the BSD licensed files are configured
with this option, the result is the VSIPL++ reference
implementation. This option implies the
--enable-fft=cvsip and
--with-cvsip options. Refer to
Section 3.3.4, “Configuration Notes for the Reference Implementation” for
more information on configuring the reference implementation.
--with-png
Enables PNG I/O support, using libpng. By default, PNG support is enabled if libpng is found during configuration.
--enable-simd-loop-fusion
Enable VSIPL++ to generate SIMD instructions for loop-fusion expressions (containing data that is SIMD aligned). This option is useful for increasing performance of many VSIPL++ expressions on platforms with SIMD instruction set extensions (such as Intel SSE, or Power VMX/AltiVec). The default is not to generate SIMD instructions.
--enable-simd-unaligned-loop-fusion
Enable VSIPL++ to generate SIMD instructions for loop-fusion
expressions, possibly containing data that is SIMD unaligned.
This option is useful for increasing performance of VSIPL++
expressions that work with unaligned data on platforms with
SIMD instruction set extensions (such as Intel SSE, or Power
VMX/AltiVec).
The default is to follow the setting of
--enable-simd-loop-fusion.
--with-complex=format
Specify the format for storing
complex numbers.
Valid choices for format are
inter and split, which
select interleaved and split storage respectively.
This option is useful when a platform has better
performance using a particular complex storage format.
The default complex storage format is inter.
--enable-timer=timer
Use timer as the type of timer for
profiling. Valid choices for timer
include none, posix,
realtime, pentiumtsc,
x86_64_tsc, and power_tb.
By default no timer is used
(timer=none).
This option is necessary when you intend to use the library's
profiling or performance API features.
none disables profile timing.
posix selects the POSIX timer if present
on the system.
realtime selects the POSIX realtime timer if present
on the system.
pentiumtsc selects the Pentium time-stamp
counter (TSC) timer if present on the system.
x86_64_tsc selects the x86-64 (or em64t)
time-stamp counter (TSC) timer if present on the system.
power_tb selects the Power architecture
timebase counter timer if present on the system.
--enable-cpu-mhz=speed
Use speed MHz as the counter
frequency for the Pentium and x86-64 timestamp counters. By
default, the counter frequency is queried from the operating
system at runtime. This option is useful if the correct
counter frequency cannot be determined.
--with-obj-ext=EXT
Specify EXT as the file extension
to be used for object files. Object files will be
named file.EXT.
The default value is determined heuristically by configure.
--with-lib-ext=EXT
Specify EXT as the file extension
to be used for library archive files. Library archive files will be
named file.EXT.
The default value is determined heuristically by configure.
--with-exe-ext=EXT
Specify EXT as the file extension
to be used for executable files. Executable files will be
named fileEXT.
Unlike --with-obj-ext and
--with-lib-ext, no "." is implied.
The default value is determined heuristically by configure.
--enable-shared-acconfig
Generate an acconfig.hpp that can be shared by different configurations by putting macros on the compiler command line. This is useful when building binary packages. Normally an acconfig.hpp file is generated that can only be used by one configuration.
--enable-shared-libs
Build shared libraries as well as static libraries. This requires that position-independent code be generated, which may reduce performance.
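As a rough illustration of how several of the options above can be combined, the following sketch assembles a configure command for a Linux system with Open MPI and ATLAS. The three paths (PREFIX, MPI_PREFIX, ATLAS_PREFIX) are hypothetical and must be adapted to your system; the option names are the ones documented above. The script prints the command and runs it only if a configure script is present in the current directory.

```shell
#!/bin/sh
# Sketch only: all three paths below are hypothetical examples.
PREFIX="$HOME/sourceryvsipl++"   # a location writable without administrator rights
MPI_PREFIX="/opt/openmpi"        # assumed Open MPI location (include, lib, bin below it)
ATLAS_PREFIX="/opt/atlas"        # assumed ATLAS location (include, lib below it)

# Collect the arguments as positional parameters so that quoting is preserved.
set -- \
  CXXFLAGS="-O2 -ffast-math" \
  --prefix="$PREFIX" \
  --enable-parallel=openmpi \
  --with-mpi-prefix="$MPI_PREFIX" \
  --enable-fft=builtin \
  --with-atlas-prefix="$ATLAS_PREFIX" \
  --enable-timer=posix

# Print the command for inspection (quotes are not shown in this display).
echo "./configure $*"

# Run it only when invoked from the unpacked distribution directory.
if [ -x ./configure ]; then
  ./configure "$@"
fi
```

Because --with-atlas-prefix has the effect of enabling ATLAS, --with-lapack=atlas is not repeated in this sketch.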
Example 3.1, “Configuring Sourcery VSIPL++” shows how to use the configure script to use particular optimization options for the C++ compiler on a system where MPI support is not required. The exact output will vary from system to system, but the output shown here is representative.
Example 3.1. Configuring Sourcery VSIPL++
> ./configure CXXFLAGS="-O2 -ffast-math" --disable-parallel
checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking for g++... g++
checking for C++ compiler default output file name... a.out
checking whether the C++ compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for bugs in g++ and its runtime... no bugs found
checking for openjade... openjade
checking for pdfjadetex... pdfjadetex
checking for a BSD-compatible install... /usr/bin/install -c
configure: creating ./config.status
config.status: creating src/vsip/impl/acconfig.hpp
config.status: creating src/vsip/GNUmakefile.inc
config.status: creating tests/context
config.status: creating tests/QMTest/configuration
config.status: creating doc/GNUmakefile.inc
config.status: creating GNUmakefile
config.status: creating src/vsip/impl/acconfig.hpp
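The quotes around the CXXFLAGS value in Example 3.1 matter because the shell splits unquoted words at spaces. The following small demonstration shows the difference; count_args is a hypothetical helper written for this illustration and is not part of Sourcery VSIPL++.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
  echo "$#"
}

# Quoted: the shell passes a single argument, CXXFLAGS=-O2 -ffast-math.
count_args CXXFLAGS="-O2 -ffast-math"    # prints 1

# Unquoted: the shell splits at the space, so -ffast-math becomes a
# separate argument that is no longer part of the CXXFLAGS value.
count_args CXXFLAGS=-O2 -ffast-math      # prints 2
```

In the unquoted form, configure would receive only -O2 as the value of CXXFLAGS, and -ffast-math would arrive as a separate, unrelated argument.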
When configuring Sourcery VSIPL++ for a Mercury PowerPC system, the following environment variables and configuration flags are recommended:
CXX=ccmc++
This selects the ccmc++ cross compiler as the
C++ compiler.
CC=ccmc
This selects the ccmc cross compiler as the
C compiler.
AR=armc
This selects the armc archiver.
AR_FLAGS=cr
This selects the c (create archive if it
does not exist) and r (replace files in
archive) flags for the armc archiver.
armc does not support the u
flag (only replace files if they are an update).
CXXFLAGS="--no_implicit_include -Onotailrecursion -t architecture --no_exceptions -Ospeed --max_inlining -DNDEBUG --diag_suppress 177,550"
These are the recommended flags for compiling Sourcery VSIPL++ with the GreenHills C++ compiler on the Mercury platform. These flags fall into two categories: those necessary for a correct build, and those optional for good performance. The following are necessary to correctly build the library:
--no_implicit_include
GreenHills enables implicit inclusion by default. This permits the compiler to assume that if it needs to instantiate a template entity defined in a .hpp file it can implicitly include the corresponding .cpp file to get the source code for the definition.
Sourcery VSIPL++ does not use this capability. Leaving this feature enabled will result in multiple symbol definition errors at link-time.
Note: it is only necessary to disable implicit includes when building the library. After the library has been installed, applications using it may enable implicit includes.
-Onotailrecursion
This disables optimization of tail-recursive functions. This optimization has a defect which is triggered by some of Sourcery VSIPL++'s algorithms.
The following flags will improve the performance of the library and applications. These should be used for production.
-t architecture
This flag directs the compiler to generate code optimized
for the processor variant and endianness specified by
architecture.
Valid choices are listed in the ccmc++
documentation and include
ppc7400, ppc7400_le,
ppc7445, and ppc7445_le.
--no_exceptions
Disable exception handling, which can have a large
performance overhead with the GreenHills compiler.
This should be used in conjunction
with the configure flag --disable-exceptions.
-Ospeed
This option instructs the compiler to enable all optimizations which improve speed.
--max_inlining
By default, GreenHills will only consider functions composed
entirely of straightline code (no control flow) for inlining.
--max_inlining instructs the compiler
to consider all functions (whether containing control flow
statements or not) for inlining, subject to the usual
restraints in the case of excessively large or complicated
functions.
-DNDEBUG
Disable assertions. This option should be used when configuring the library for performance.
--diag_suppress 177,550
This option suppresses compiler diagnostics that warn
about unused variables. When compiling with
-DNDEBUG, assertions are removed, and an assertion
may be the only reference to a variable.
When compiling a development or debug version of the library,
replace -Ospeed -DNDEBUG with -g.
--host=powerpc
Cross compile for the PowerPC processor.
--with-sal
Enable the SAL library.
--enable-fft=sal,builtin
Use SAL and Sourcery VSIPL++ builtin FFTW3 to perform FFT operations. SAL FFT will be used for FFTs with power-of-two sizes; FFTW3 will be used otherwise.
--with-fftw3-cflags="-O2"
Compile Sourcery VSIPL++'s builtin FFTW3 library with
optimization level -O2. (Compiling
FFTW3 with optimization level -O3
produces link-errors with GreenHills C related to the
handling of static functions. CodeSourcery is currently
developing a work-around for this.)
--with-complex=split
Store complex data in split format by default.
--disable-exceptions
Disable the use of exceptions from within the library.
--enable-parallel=mpipro
Enable the use of Verari MPI/Pro for communications.
--enable-timer=realtime
Use the POSIX-realtime timer for profiling.
The file examples/mercury/mcoe-setup.sh is
an example of how to configure Sourcery VSIPL++ for Mercury
systems with these options.
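The recommended Mercury settings above can be gathered into a single invocation along the following lines. This is a minimal sketch and not the content of examples/mercury/mcoe-setup.sh; the architecture value ppc7400 is an assumption chosen from the valid choices listed above and should be replaced with the value for your processor. The script prints the command and runs it only if a configure script is present in the current directory.

```shell
#!/bin/sh
# Sketch only. ARCH is an assumption; pick the -t value for your processor.
ARCH=ppc7400

# Environment variables recommended for the Mercury cross tools.
CXX=ccmc++
CC=ccmc
AR=armc
AR_FLAGS=cr
CXXFLAGS="--no_implicit_include -Onotailrecursion -t $ARCH --no_exceptions -Ospeed --max_inlining -DNDEBUG --diag_suppress 177,550"
export CXX CC AR AR_FLAGS CXXFLAGS

# Configuration flags recommended above, kept as positional parameters.
set -- \
  --host=powerpc \
  --with-sal \
  --enable-fft=sal,builtin \
  --with-fftw3-cflags="-O2" \
  --with-complex=split \
  --disable-exceptions \
  --enable-parallel=mpipro \
  --enable-timer=realtime

# Print the command for inspection, then run it only from the source directory.
echo "./configure $*"
if [ -x ./configure ]; then
  ./configure "$@"
fi
```

For a development or debug build, -Ospeed -DNDEBUG in CXXFLAGS would be replaced with -g, as noted above.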
Before configuring Sourcery VSIPL++ for a Microsoft Windows system, the following prerequisites are recommended:
The Cygwin environment for Windows, including the GNU make and sed packages. Sourcery VSIPL++ uses this as the development environment for configuring and building the Sourcery VSIPL++ library. Cygwin is not necessary to build and run Sourcery VSIPL++ applications. For more information on the Cygwin environment, visit http://www.cygwin.com/
Intel C++ for Windows, version 9.1 or later. This may require installation of a Microsoft C++ compiler and Microsoft SDK for Windows. For more information on Intel C++ and its requirements, see http://www.intel.com/cd/software/products/asmo-na/eng/compilers/279578.htm
Intel IPP and MKL for Windows.
When configuring Sourcery VSIPL++ for a Microsoft Windows system, the following environment variables and configuration flags are recommended:
CXX=icl
This selects the Intel C/C++ compiler icl as the C++ compiler.
CC=icl
This selects the Intel C/C++ compiler icl as the
C compiler.
CXXFLAGS="/Qcxx-features /Qvc8"
These are the recommended flags for compiling Sourcery VSIPL++ with the Intel C++ compiler on Microsoft Windows platforms. The following are necessary to correctly build the library:
/Qcxx-features
This enables standard C++ features for exception handling and RTTI.
/Qvc8
This enables Microsoft Visual Studio 2005 compatibility. If using another version of Visual Studio, please consult the Intel C++ documentation for the correct option.
--build=i686-cygwin
Configure to build the library in the Cygwin environment.
--host=i686-mingw32
Target the resulting library to run on Microsoft Windows systems with the Win32 API.
--with-ipp=win
Enable the IPP library for Windows.
This requires that the IPP header, library, and DLL directories be
present in your INCLUDE, LIB,
and PATH environment variables, respectively. Manually
passing these paths to configure in Windows
is not recommended.
--enable-fft=ipp
Use the IPP FFT functions to perform FFT operations.
--with-lapack=mkl_win
Use the MKL library for Windows to implement linear-algebra
operations.
This requires that the MKL header and library directories be
present in your INCLUDE and LIB
environment variables, respectively. Manually passing these paths to
configure in Windows is not recommended.
--disable-parallel
Disable parallel service. Sourcery VSIPL++ does not support MPI on Windows at this time.
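Taken together, the Windows recommendations above might be applied from a Cygwin shell roughly as follows. This is a sketch under the assumption that INCLUDE, LIB, and PATH already contain the IPP and MKL directories, as required above. The script prints the command and runs it only if a configure script is present in the current directory.

```shell
#!/bin/sh
# Sketch only: run from a Cygwin shell. Assumes INCLUDE, LIB, and PATH
# already list the IPP and MKL directories.
CXX=icl
CC=icl
CXXFLAGS="/Qcxx-features /Qvc8"   # /Qvc8 assumes Visual Studio 2005
export CXX CC CXXFLAGS

# Configuration flags recommended above, kept as positional parameters.
set -- \
  --build=i686-cygwin \
  --host=i686-mingw32 \
  --with-ipp=win \
  --enable-fft=ipp \
  --with-lapack=mkl_win \
  --disable-parallel

# Print the command for inspection, then run it only from the source directory.
echo "./configure $*"
if [ -x ./configure ]; then
  ./configure "$@"
fi
```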
When configuring Sourcery VSIPL++ for a Cell/B.E. host system, the following environment variables and configuration flags are recommended:
--with-cbe-sdk
Enable use of the Cell/B.E. SDK and the Cell Math Library
(CML). This is necessary to use the Cell/B.E.'s SPE
processors to accelerate VSIPL++ functionality.
If the SDK is not installed in the standard location, the
--with-cbe-sdk-prefix option should be used to
specify the location.
--with-cml-prefix=PATH
Specify the installation path of CML. Headers are installed in a subdirectory named include; libraries in one named lib.
If the headers and libraries are installed in other places, use
the options --with-cml-include
and --with-cml-libdir instead.
--with-cml-include=PATH
Specify the directory containing CML header files.
Use this option in conjunction with
--with-cml-libdir.
Do not use with --with-cml-prefix.
--with-cml-libdir=PATH
Specify the directory containing CML libraries.
Use this option in conjunction with
--with-cml-include.
Do not use with --with-cml-prefix.
--with-numa
Enable use of libnuma for SPE/PPE affinity control. This may improve program performance by allocating SPEs close to the PPEs running VSIPL++.
--enable-timer=power_tb
Enable the Power Timebase high-resolution timer. This option is useful when using profiling or running library benchmarks.
Two additional options must be specified when using a non-Cell/B.E. build system to cross-compile Sourcery VSIPL++ for a Cell/B.E. host system.
--host=powerpc-cell-linux-gnu
Define the host system type.
--with-cbe-sdk-sysroot=directory
Specify the Cell/B.E. sysroot location. Typically, this will be
/opt/cell/sysroot on
a standard SDK 3.0 cross-compiler installation.
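A cross-compilation for Cell/B.E. that combines the recommended flags with the two additional options could look like the following sketch. The sysroot path is the typical SDK 3.0 location mentioned above and is an assumption about your installation; a CML location option (for example --with-cml-prefix) would be added if CML is not found automatically. The script prints the command and runs it only if a configure script is present in the current directory.

```shell
#!/bin/sh
# Sketch only: SYSROOT is the typical SDK 3.0 cross-compiler location;
# adjust it if your installation differs.
SYSROOT=/opt/cell/sysroot

# Recommended Cell/B.E. flags plus the two cross-compilation options.
set -- \
  --with-cbe-sdk \
  --with-numa \
  --enable-timer=power_tb \
  --host=powerpc-cell-linux-gnu \
  --with-cbe-sdk-sysroot="$SYSROOT"

# Print the command for inspection, then run it only from the source directory.
echo "./configure $*"
if [ -x ./configure ]; then
  ./configure "$@"
fi
```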
If you wish to use the BSD-licensed reference-implementation subset of Sourcery VSIPL++, you must configure with the following option:
--enable-only-ref-impl
Build only the reference-implementation subset of Sourcery VSIPL++. If you do not use this option, the complete, optimized implementation of Sourcery VSIPL++ will be built.
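For illustration, a reference-implementation configuration that also points configure at a C VSIPL installation might be written as in the sketch below. CVSIP_PREFIX is a hypothetical path; the --with-cvsip-prefix option is only needed when C VSIPL is installed in a non-standard location. The script prints the command and runs it only if a configure script is present in the current directory.

```shell
#!/bin/sh
# Sketch only: CVSIP_PREFIX is a hypothetical C VSIPL installation directory
# (headers in include, libraries in lib below it).
CVSIP_PREFIX="/opt/cvsip"

# --enable-only-ref-impl implies --enable-fft=cvsip and --with-cvsip.
set -- \
  --enable-only-ref-impl \
  --with-cvsip-prefix="$CVSIP_PREFIX"

# Print the command for inspection, then run it only from the source directory.
echo "./configure $*"
if [ -x ./configure ]; then
  ./configure "$@"
fi
```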