3.3. Configuration

Before building Sourcery VSIPL++, you must run a configuration script to tell Sourcery VSIPL++ what C++ compiler you are using and what optional software you wish to use. After running the configuration script, you will build and install the Sourcery VSIPL++ library.

These instructions assume that your shell's current directory is the sourceryvsipl++-2.2-9 directory created when you unpacked the VSIPL++ distribution. If you want to allow Sourcery VSIPL++ to automatically configure itself, run:

> ./configure

You will see output explaining the configuration decisions that Sourcery VSIPL++ is making.

There are several options that you can use to tell Sourcery VSIPL++ about your particular environment.

CXX=path

Use path as the C++ compiler. If you do not provide this option, Sourcery VSIPL++ will search for a C++ compiler in your PATH.

CXXFLAGS=flags

Use flags as flags to pass to the C++ compiler. The default value depends on your compiler. If you are using multiple flags (like -O2 -ffast-math), you must enclose the flags in quotes so that the shell will consider all of the flags as a single argument.
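
The effect of the quotes can be checked without running configure. The following sketch uses a hypothetical helper, count_args, that only reports how many arguments the shell passes along; it is not part of Sourcery VSIPL++.

```shell
# count_args is a hypothetical helper used only for this illustration.
# It prints the number of arguments it receives.
count_args() { echo "$#"; }

# Quoted: the shell passes both flags as one argument.
quoted=$(count_args CXXFLAGS="-O2 -ffast-math")

# Unquoted: the shell splits at the space, so -ffast-math becomes
# a separate argument that is no longer part of CXXFLAGS.
unquoted=$(count_args CXXFLAGS=-O2 -ffast-math)

echo "quoted: $quoted argument(s), unquoted: $unquoted argument(s)"
```

With the quotes, configure receives a single CXXFLAGS assignment; without them, -ffast-math arrives as a separate argument.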

--prefix=directory

Install the library in directory. Header files will be placed in a subdirectory of directory named include; the library itself will be placed in lib. You will need to have sufficient permissions to write to the installation directory. The default installation directory is /usr/local, which is usually not writable by non-administrators; therefore, you may want to use your home directory as an installation directory.

--host=architecture

Specify the host architecture that Sourcery VSIPL++ will be built for. The default is to build Sourcery VSIPL++ to run natively on the build machine. This option is useful when cross-compiling Sourcery VSIPL++.

--disable-parallel

Do not use a parallel communications library, even if an appropriate MPI library is detected. This option is useful if you want to build a uniprocessor version of Sourcery VSIPL++. By default, MPI support will be included if it is available.

--enable-parallel

Search for and use a communications library for support of multi-processor systems for parallel computation.

--enable-parallel=lib

Search for and use the parallel communications library indicated by lib. Available options are lam, mpich2, intelmpi, openmpi, mpipro, and pas.

lam selects the LAM/MPI library.

mpich2 selects the MPICH2 library.

intelmpi selects the Intel MPI Library.

openmpi selects the Open MPI library.

mpipro selects Verari's MPI/Pro. This option is necessary when using MPI/Pro on the Mercury platform.

pas enables the use of Mercury Parallel Acceleration System (PAS) for parallel services if found. This option is necessary to use PAS on the Mercury platform, and when using PAS for Linux clusters.

--with-mpi-prefix=directory

Search for MPI installation in directory first. MPI headers should be in directory/include, MPI libraries in directory/lib, and MPI compilation commands (either mpicxx or mpiCC) should be in directory/bin. This option is useful if MPI is installed in a non-standard location, or if multiple MPI versions are installed.

--with-mpi-cxxflags=flags --with-mpi-libs=flags

In some cases, Sourcery VSIPL++ is unable to automatically detect the required compiler and linker options to enable MPI. In these cases, the required C++ compiler flags can be specified using the --with-mpi-cxxflags option, and the required linker (library) flags can be specified using the --with-mpi-libs option. These options must be used together, and when they are used, the specific type of the MPI library in use must be specified with the --enable-parallel=lib option.
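
As a rough sketch of how these three options fit together, the following script assembles a configure command as text and prints it for review instead of running it. The include path, library path, and library names are hypothetical; the real values depend on your MPI installation.

```shell
# Hypothetical values: substitute the flags your MPI installation requires.
mpi_type="mpich2"
mpi_cxxflags="-I/opt/mpich2/include"
mpi_libs="-L/opt/mpich2/lib -lmpichcxx -lmpich"

# Both flag options are given together, along with the MPI library type.
cmd="./configure --enable-parallel=$mpi_type"
cmd="$cmd --with-mpi-cxxflags=\"$mpi_cxxflags\""
cmd="$cmd --with-mpi-libs=\"$mpi_libs\""

# Print the command; paste it into the shell once the values are correct.
echo "$cmd"
```

The embedded quotes keep each set of flags together as a single argument when the printed command is pasted into a shell.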

--disable-exceptions

Do not use C++ exceptions. Errors that would otherwise have generated an exception instead cause a call to abort(). This option is useful if you want to build Sourcery VSIPL++ with a compiler that does not implement exceptions. By default, exceptions are used.

--with-ipp

Enable the use of the Intel Performance Primitives (IPP) if found. Enabling IPP will accelerate the performance of signal processing and view element-wise operations.

--with-ipp=win

Enable the use of the Intel Performance Primitives (IPP) for Windows if found. This option is useful when configuring Sourcery VSIPL++ on a Windows system.

--with-ipp-prefix=directory

Search for IPP installation in directory first. IPP headers should be in the include subdirectory of directory and IPP libraries should be in the lib subdirectory. This option has the effect of enabling IPP (i.e. --with-ipp). This option is useful if IPP is installed in a non-standard location, or if multiple IPP versions are installed.

--with-ipp-suffix=suffix

Use a processor-specific version of the IPP libraries, as indicated by suffix. For example, the suffix em64t will select IPP libraries specific to em64t processors. By default, non-suffix IPP libraries are used, which determine the architecture at run-time and dynamically load the appropriate processor-specific libraries. This option is useful if the automatic dispatcher is not able to determine the correct architecture.

--with-sal

Enable the use of the Mercury Scientific Algorithm Library (SAL) if found. Enabling SAL will accelerate the performance of view element-wise operations, linear algebra, solvers, and signal processing operations.

--with-sal-include=directory

Search for SAL header files in directory first. This option has the effect of enabling SAL (i.e. --with-sal). This option is useful if the SAL headers are installed in a non-standard location, such as when using the CSAL library. However, it should not be necessary when building natively on a Mercury system.

--with-sal-lib=directory

Search for SAL library files in directory first. This option has the effect of enabling SAL (i.e. --with-sal). This option is useful if the SAL libraries are installed in a non-standard location, such as when using the CSAL library. However, it should not be necessary when building natively on a Mercury system.

--with-cuda

Enable the use of NVidia's Compute Unified Device Architecture (CUDA). This enables the use of certain graphics processing units (GPUs) as computational accelerators (see NVidia's website for a list of compatible cards). For FFT support, use --enable-fft=cuda in addition to this option.

--enable-fft=lib

Search for and use the FFT library indicated by lib to perform FFTs. Valid choices for lib include fftw3, cuda, ipp, sal, and cvsip, which select the FFTW3, CUDA, IPP, SAL, and C VSIPL libraries respectively. A further option, builtin, selects the FFTW3 library that comes with Sourcery VSIPL++ and is the default. This option should be used if an existing FFTW3 library is not available. If no FFT library is to be used (disabling Sourcery VSIPL++'s FFT functionality), no_fft should be chosen for lib. Multiple libraries may be given as a comma-separated list. When performing an FFT, VSIPL++ will use the first library in the list that can support the FFT parameters. For example, on Mercury systems --enable-fft=sal,builtin would use SAL's FFT when possible, falling back to VSIPL++'s builtin FFTW3 otherwise.
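
The fallback order can be pictured with a toy model. The sketch below is not how the library selects a back end internally. It only mimics the outcome for --enable-fft=sal,builtin under the assumption, taken from the Mercury notes in Section 3.3.1, that SAL is used for power-of-two sizes and the builtin FFTW3 otherwise. The function name fft_backend is hypothetical.

```shell
# Toy model of the fallback order for --enable-fft=sal,builtin.
# Assumption: SAL supports only power-of-two sizes; the builtin
# FFTW3 handles every other size.
fft_backend() {
  n=$1
  # A positive integer is a power of two when n & (n - 1) is zero.
  if [ "$n" -gt 0 ] && [ $(( n & (n - 1) )) -eq 0 ]; then
    echo sal
  else
    echo builtin
  fi
}

for size in 1024 1000 4096; do
  echo "size $size -> $(fft_backend "$size")"
done
```

In this model, sizes 1024 and 4096 are reported as handled by sal, while 1000 falls through to builtin. The real choice also depends on other FFT parameters that the model ignores.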

--with-fftw3-prefix=directory

Search for FFTW3 installation in directory first. FFTW3 headers should be in the include subdirectory of directory and FFTW3 libraries should be in the lib subdirectory. This option has the effect of enabling FFTW3 for FFTs (i.e. --enable-fft=fftw3). This option is useful if FFTW3 is installed in a non-standard location, or if multiple FFTW3 versions are installed.

--disable-fftw3-simd

Disable builtin FFTW3 from using SIMD ISA extensions (such as AltiVec or SSE2). By default, FFTW3 uses SIMD ISA extensions because they improve performance. However, this option is useful when building for a platform that does not support the ISA extensions.

--with-lapack

Enable Sourcery VSIPL++ to search for an appropriate LAPACK implementation on the platform. If found, it will be used to perform linear algebra (matrix-vector products and solvers).

--with-lapack=lib

Search for and use the LAPACK library indicated by lib to perform linear algebra (matrix-vector products and solvers). Valid choices for lib include mkl, mkl_win, acml, atlas, generic, builtin, and no.

mkl selects the Intel Math Kernel Library (MKL) to perform linear algebra if found.

mkl_win selects the Intel Math Kernel Library (MKL) on Windows systems to perform linear algebra if found.

acml selects the AMD Core Math Library (ACML) to perform linear algebra if found.

atlas selects the ATLAS library to perform linear algebra if found.

generic selects a generic LAPACK library (-llapack) to perform linear algebra if found.

builtin selects a version of LAPACK that doesn't require ATLAS.

no is used to disable searching for a LAPACK library.

--with-acml-prefix=directory

Search for ACML installation in directory first. ACML headers should be in the include subdirectory of the install directory, whose path depends on the exact version of the library you have. Similarly, ACML libraries should be in the lib subdirectory. This option has the effect of enabling ACML for lapack (i.e. --with-lapack=acml). This option is useful if the ACML is installed in a non-standard location, or if multiple ACML versions are installed.

--with-atlas-prefix=directory

Search for ATLAS installation in directory first. ATLAS headers should be in the include subdirectory of directory and ATLAS libraries should be in the lib subdirectory, unless otherwise specified by --with-atlas-include and --with-atlas-libdir, respectively. This option has the effect of enabling ATLAS for lapack (i.e. --with-lapack=atlas). This option is useful if ATLAS is installed in a non-standard location, or if multiple ATLAS versions are installed.

--with-atlas-include=directory

Search for ATLAS include headers in directory first. This option has the effect of enabling ATLAS for lapack (i.e. --with-lapack=atlas). This option is useful if ATLAS is installed in a location that does not fit the pattern assumed by --with-atlas-prefix.

--with-atlas-libdir=directory

Search for ATLAS library files in directory first. This option has the effect of enabling ATLAS for lapack (i.e. --with-lapack=atlas). This option is useful if ATLAS is installed in a location that does not fit the pattern assumed by --with-atlas-prefix.

--with-mkl-prefix=directory

Search for MKL installation in directory first. MKL headers should be in the include subdirectory of directory and MKL libraries should be in the lib/(arch) subdirectory. This option has the effect of enabling MKL for lapack (i.e. --with-lapack=mkl). This option is useful if MKL is installed in a non-standard location, or if multiple MKL versions are installed.

--with-mkl-arch=architecture

Used in conjunction with --with-mkl-prefix to specify which library subdirectory of MKL to use. If --with-mkl-prefix=directory is used to specify the MKL prefix, libraries are searched for in directory/architecture. By default architecture is deduced based on the platform. This option is useful if this deduction is incorrect.

--without-cblas

Disables the use of the C BLAS API, forcing the use of the Fortran BLAS API. This option is useful if building on a platform that does not provide the C BLAS API.

--with-cbe-sdk

Enable the use of the IBM Cell/B.E. Software Development Kit (SDK) version 3.0 or 3.1 if found. Enabling the Cell/B.E. SDK will accelerate the performance of FFTs, vector-multiplication, vector-matrix multiplication, and fast convolution.

--with-cbe-sdk-sysroot=directory

Search for Cell/B.E. SDK libraries and headers in a sysroot at directory, rather than in the system root directory (or the default sysroot location, in the case of SDK version 2.1). This option has the effect of enabling use of the Cell/B.E. SDK (i.e. --with-cbe-sdk). This option is used for cross-compilation.

--with-numa

Enable the use of libnuma. This is useful on Cell/B.E. systems to ensure that SPE resources allocated for acceleration are local to the PPE running VSIPL++.

--with-cvsip

Enable Sourcery VSIPL++ to search for an appropriate C VSIPL implementation on the platform. If found, it will be used to perform linear algebra (matrix-vector products and solvers) and some signal processing (convolution, correlation, and FIR). If the --enable-fft=cvsip option is also given, the VSIPL implementation will be used to perform FFTs.

--with-cvsip-prefix=directory

Search for a C VSIPL installation in directory first. Headers should be in the include subdirectory of directory and libraries should be in the lib subdirectory. This option has the effect of enabling the use of a VSIPL back end as if the option --with-cvsip had been given. This option is useful if VSIPL is installed in a non-standard location, or if multiple VSIPL versions are installed.

--enable-only-ref-impl

Configure Sourcery VSIPL++ to be used as the VSIPL++ reference implementation. When the BSD licensed files are configured with this option, the result is the VSIPL++ reference implementation. This option implies the --enable-fft=cvsip and --with-cvsip options. Refer to Section 3.3.4, “Configuration Notes for the Reference Implementation” for more information on configuring the reference implementation.

--with-png

Enables PNG I/O support, using libpng. By default, PNG support is enabled if libpng is found during configuration.

--enable-simd-loop-fusion

Enable VSIPL++ to generate SIMD instructions for loop-fusion expressions (containing data that is SIMD aligned). This option is useful for increasing performance of many VSIPL++ expressions on platforms with SIMD instruction set extensions (such as Intel SSE, or Power VMX/AltiVec). The default is not to generate SIMD instructions.

--enable-simd-unaligned-loop-fusion

Enable VSIPL++ to generate SIMD instructions for loop-fusion expressions, possibly containing data that is SIMD unaligned. This option is useful for increasing performance of VSIPL++ expressions that work with unaligned data on platforms with SIMD instruction set extensions (such as Intel SSE, or Power VMX/AltiVec). The default is to follow the setting of --enable-simd-loop-fusion.

--with-complex=format

Specify the format for storing complex numbers. Valid choices for format are inter and split, which select interleaved and split storage respectively. This option is useful when a platform has better performance using a particular complex storage format. The default complex storage format is inter.

--enable-timer=timer

Use timer as the type of timer for profiling. Valid choices for timer include none, posix, realtime, pentiumtsc, x86_64_tsc, and power_tb. By default no timer is used (timer=none). This option is necessary when you intend to use the library's profiling or performance API features.

none disables profile timing.

posix selects the POSIX timer if present on the system.

realtime selects the POSIX realtime timer if present on the system.

pentiumtsc selects the Pentium time-stamp counter (TSC) timer if present on the system.

x86_64_tsc selects the x86-64 (or em64t) time-stamp counter (TSC) timer if present on the system.

power_tb selects the Power architecture timebase counter timer if present on the system.
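
One possible way to choose a timer is by machine type. The sketch below maps the output of uname -m to one of the timer names listed above and prints the resulting configure command rather than running it. The mapping and the function name pick_timer are assumptions made for illustration; check that the selected counter actually exists on your processor.

```shell
# pick_timer is a hypothetical helper: it maps a machine name, as
# reported by uname -m, to one of the documented timer choices.
pick_timer() {
  case "$1" in
    x86_64)        echo x86_64_tsc ;;
    i?86)          echo pentiumtsc ;;
    ppc*|powerpc*) echo power_tb ;;
    *)             echo posix ;;      # fallback for other machines
  esac
}

machine=$(uname -m)
echo "./configure --enable-timer=$(pick_timer "$machine")"
```

The helper takes the machine name as an argument, so the mapping can be checked for other platforms, for example with pick_timer ppc64.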

--enable-cpu-mhz=speed

Use speed MHz as the counter frequency for the Pentium and x86-64 timestamp counters. By default, the counter frequency is queried from the operating system at runtime. This option is useful if the correct counter frequency cannot be determined.

--with-obj-ext=EXT

Specify EXT as the file extension to be used for object files. Object files will be named file.EXT. Default value is determined heuristically by configure.

--with-lib-ext=EXT

Specify EXT as the file extension to be used for library archive files. Library archive files will be named file.EXT. Default value is determined heuristically by configure.

--with-exe-ext=EXT

Specify EXT as the file extension to be used for executable files. Executable files will be named fileEXT. Unlike --with-obj-ext and --with-lib-ext, no "." is implied. Default value is determined heuristically by configure.

--enable-shared-acconfig

Generate an acconfig.hpp that can be shared by different configurations by putting macros on the compiler command line. This is useful when building binary packages. Normally an acconfig.hpp file is generated that can only be used by one configuration.

--enable-shared-libs

Build shared libraries as well as static libraries. This requires that position-independent code be generated, which may reduce performance.

Example 3.1, “Configuring Sourcery VSIPL++” shows how to use the configure script to use particular optimization options for the C++ compiler on a system where MPI support is not required. The exact output will vary from system to system, but the output shown here is representative.

Example 3.1. Configuring Sourcery VSIPL++

> ./configure CXXFLAGS="-O2 -ffast-math" --disable-parallel
checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking for g++... g++
checking for C++ compiler default output file name... a.out
checking whether the C++ compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables... 
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for bugs in g++ and its runtime... no bugs found
checking for openjade... openjade
checking for pdfjadetex... pdfjadetex
checking for a BSD-compatible install... /usr/bin/install -c
configure: creating ./config.status
config.status: creating src/vsip/impl/acconfig.hpp
config.status: creating src/vsip/GNUmakefile.inc
config.status: creating tests/context
config.status: creating tests/QMTest/configuration
config.status: creating doc/GNUmakefile.inc
config.status: creating GNUmakefile
config.status: creating src/vsip/impl/acconfig.hpp

3.3.1. Configuration Notes for Mercury Systems

When configuring Sourcery VSIPL++ for a Mercury PowerPC system, the following environment variables and configuration flags are recommended:

  • CXX=ccmc++

    This selects the ccmc++ cross compiler as the C++ compiler.

  • CC=ccmc

    This selects the ccmc cross compiler as the C compiler.

  • AR=armc

    This selects the armc archiver.

  • AR_FLAGS=cr

    This selects the c (create archive if it does not exist) and r (replace files in archive) flags for the armc archiver. armc does not support the u flag (only replace files if they are an update).

  • CXXFLAGS="--no_implicit_include -Onotailrecursion -t architecture --no_exceptions -Ospeed --max_inlining -DNDEBUG --diag_suppress 177,550"

    These are the recommended flags for compiling Sourcery VSIPL++ with the GreenHills C++ compiler on the Mercury platform. These flags fall into two categories: those necessary for a correct build, and those optional for good performance. The following are necessary to correctly build the library:

    • --no_implicit_include

      GreenHills enables implicit inclusion by default. This permits the compiler to assume that if it needs to instantiate a template entity defined in a .hpp file it can implicitly include the corresponding .cpp file to get the source code for the definition.

      Sourcery VSIPL++ does not use this capability. Leaving this feature enabled will result in multiple symbol definition errors at link-time.

      Note: it is only necessary to disable implicit includes when building the library. After the library has been installed, applications using it may enable implicit includes.

    • -Onotailrecursion

      This disables optimization of tail-recursive functions. This optimization has a defect which is triggered by some of Sourcery VSIPL++'s algorithms.

    The following flags will improve the performance of the library and applications. These should be used for production.

    • -t architecture

      This flag directs the compiler to generate code optimized for the processor variant and endianness specified by architecture. Valid choices are listed in the ccmc++ documentation and include ppc7400, ppc7400_le, ppc7445, and ppc7445_le.

    • --no_exceptions

      Disable exception handling, which can have a large performance overhead with the GreenHills compiler. This should be used in conjunction with the configure flag --disable-exceptions.

    • -Ospeed

      This option instructs the compiler to enable all optimizations which improve speed.

    • --max_inlining

      By default, GreenHills will only consider functions composed entirely of straightline code (no control flow) for inlining. --max_inlining instructs the compiler to consider all functions (whether containing control flow statements or not) for inlining, subject to the usual restraints in the case of excessively large or complicated functions.

    • -DNDEBUG

      Disable assertions. This option should be used when configuring the library for performance.

    • --diag_suppress 177,550

      This option suppresses compiler diagnostics that warn about unused variables. When compiling with -DNDEBUG, assertions are removed, and an assertion may have been the only reference to a variable.

    When compiling a development or debug version of the library, replace -Ospeed -DNDEBUG with -g.

  • --host=powerpc

    Cross compile for the PowerPC processor.

  • --with-sal

    Enable the SAL library.

  • --enable-fft=sal,builtin

    Use SAL and Sourcery VSIPL++ builtin FFTW3 to perform FFT operations. SAL FFT will be used for FFTs with power-of-two sizes, FFTW3 will be used otherwise.

  • --with-fftw3-cflags="-O2"

    Compile Sourcery VSIPL++'s builtin FFTW3 library with optimization level -O2. (Compiling FFTW3 with optimization level -O3 produces link-errors with GreenHills C related to the handling of static functions. CodeSourcery is currently developing a work-around for this.)

  • --with-complex=split

    Store complex data in split format by default.

  • --disable-exceptions

    Disable the use of exceptions from within the library.

  • --enable-parallel=mpipro

    Enable the use of Verari MPI/Pro for communications.

  • --enable-timer=realtime

    Use the POSIX-realtime timer for profiling.

The file examples/mercury/mcoe-setup.sh is an example of how to configure Sourcery VSIPL++ for Mercury systems with these options.
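
The recommendations above can be combined into a single invocation. The sketch below assembles the command as text and prints it for review rather than running it. The -t value ppc7400 is only one of the choices named above, and the script is not a copy of examples/mercury/mcoe-setup.sh.

```shell
# Processor variant for the -t flag; ppc7400 is one of the listed
# choices and is used here only as an example.
arch="ppc7400"

# Recommended GreenHills flags for a production build.
cxxflags="--no_implicit_include -Onotailrecursion -t $arch --no_exceptions -Ospeed --max_inlining -DNDEBUG --diag_suppress 177,550"

# Tool selection, then compiler flags, then library options.
cmd="./configure CXX=ccmc++ CC=ccmc AR=armc AR_FLAGS=cr"
cmd="$cmd CXXFLAGS=\"$cxxflags\""
cmd="$cmd --host=powerpc --with-sal --enable-fft=sal,builtin"
cmd="$cmd --with-fftw3-cflags=\"-O2\" --with-complex=split"
cmd="$cmd --disable-exceptions --enable-parallel=mpipro --enable-timer=realtime"

echo "$cmd"
```

For a development or debug build, replacing -Ospeed -DNDEBUG with -g in cxxflags follows the advice given above.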

3.3.2. Configuration Notes for Windows Systems

Before configuring Sourcery VSIPL++ for a Microsoft Windows system, the following prerequisites are recommended:

  • The Cygwin environment for Windows, including the GNU make and sed packages. Sourcery VSIPL++ uses this as the development environment for configuring and building the Sourcery VSIPL++ library. Cygwin is not necessary to build and run Sourcery VSIPL++ applications. For more information on the Cygwin environment, visit http://www.cygwin.com/

  • Intel C++ for Windows, version 9.1 or later. This may require installation of a Microsoft C++ compiler and the Microsoft SDK for Windows. For more information on Intel C++ and its requirements, see http://www.intel.com/cd/software/products/asmo-na/eng/compilers/279578.htm

  • Intel IPP and MKL for Windows.

When configuring Sourcery VSIPL++ for a Microsoft Windows system, the following environment variables and configuration flags are recommended:

  • CXX=icl

    This selects the Intel C/C++ compiler icl as the C++ compiler.

  • CC=icl

    This selects the Intel C/C++ compiler icl as the C compiler.

  • CXXFLAGS="/Qcxx-features /Qvc8"

    These are the recommended flags for compiling Sourcery VSIPL++ with the Intel C++ compiler on Microsoft Windows platforms. The following are necessary to correctly build the library:

    • /Qcxx-features

      This enables standard C++ features for exception handling and RTTI.

    • /Qvc8

      This enables Microsoft Visual Studio 2005 compatibility. If using another version of Visual Studio, please consult the Intel C++ documentation for the correct option.

  • --build=i686-cygwin

    Configure to build the library in the Cygwin environment.

  • --host=i686-mingw32

    Target the resulting library to run on Microsoft Windows systems with the Win32 API.

  • --with-ipp=win

    Enable the IPP library for Windows. This requires that the IPP header, library, and DLL directories be listed in your INCLUDE, LIB, and PATH environment variables, respectively. Manually passing these paths to configure in Windows is not recommended.

  • --enable-fft=ipp

    Use the IPP FFT functions to perform FFT operations.

  • --with-lapack=mkl_win

    Use the MKL library for Windows to implement linear-algebra operations. This requires that the MKL header and library directories be listed in your INCLUDE and LIB environment variables, respectively. Manually passing these paths to configure in Windows is not recommended.

  • --disable-parallel

    Disable parallel service. Sourcery VSIPL++ does not support MPI on Windows at this time.
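
Taken together, the recommended Windows settings might be combined as in the following sketch, which prints the command for review instead of running it. It assumes a Cygwin shell in which INCLUDE, LIB, and PATH already point at the IPP and MKL directories, and it uses /Qvc8, which applies to Visual Studio 2005 only.

```shell
# Recommended Intel C++ flags; /Qvc8 assumes Visual Studio 2005.
cxxflags="/Qcxx-features /Qvc8"

# Compiler selection, then build and host types, then library options.
cmd="./configure CXX=icl CC=icl CXXFLAGS=\"$cxxflags\""
cmd="$cmd --build=i686-cygwin --host=i686-mingw32"
cmd="$cmd --with-ipp=win --enable-fft=ipp"
cmd="$cmd --with-lapack=mkl_win --disable-parallel"

echo "$cmd"
```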

3.3.3. Configuration Notes for Cell/B.E. Systems

When configuring Sourcery VSIPL++ for a Cell/B.E. host system, the following environment variables and configuration flags are recommended:

  • --with-cbe-sdk

    Enable use of the Cell/B.E. SDK and the Cell Math Library (CML). This is necessary to use the Cell/B.E.'s SPE processors to accelerate VSIPL++ functionality. If the SDK is not installed in the standard location, the --with-cbe-sdk-prefix option should be used to specify the location.

  • --with-cml-prefix=PATH

    Specify the installation path of CML. Headers are installed in a subdirectory named include; libraries in one named lib.

    To install headers and libraries in other places, use instead the options --with-cml-include and --with-cml-libdir.

  • --with-cml-include=PATH

    Specify the directory containing CML header files. Use this option in conjunction with --with-cml-libdir. Do not use with --with-cml-prefix.

  • --with-cml-libdir=PATH

    Specify the directory containing CML libraries. Use this option in conjunction with --with-cml-include. Do not use with --with-cml-prefix.

  • --with-numa

    Enable use of libnuma for SPE/PPE affinity control. This may improve program performance by allocating SPEs close to the PPEs running VSIPL++.

  • --enable-timer=power_tb

    Enable the Power Timebase high-resolution timer. This option is useful when using profiling or running library benchmarks.

Two additional options must be specified when using a non-Cell/B.E. build system to cross-compile Sourcery VSIPL++ for a Cell/B.E. host system.

  • --host=powerpc-cell-linux-gnu

    Define the host system type.

  • --with-cbe-sdk-sysroot=directory

    Specify the Cell/B.E. sysroot location. Typically, this will be /opt/cell/sysroot on a standard SDK 3.0 cross-compiler installation.
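
A cross-compilation setup along these lines might look like the following sketch, which prints the command for review rather than running it. The sysroot path is the typical SDK 3.0 location mentioned above and may differ on your system; the CML and libnuma options are left out for brevity.

```shell
# Typical sysroot for a standard SDK 3.0 cross-compiler installation;
# adjust if your SDK is installed elsewhere.
sysroot="/opt/cell/sysroot"

# Host type and sysroot are the two options needed for cross-compiling;
# the SDK and timer options follow the general Cell/B.E. recommendations.
cmd="./configure --host=powerpc-cell-linux-gnu"
cmd="$cmd --with-cbe-sdk --with-cbe-sdk-sysroot=$sysroot"
cmd="$cmd --enable-timer=power_tb"

echo "$cmd"
```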

3.3.4. Configuration Notes for the Reference Implementation

If you wish to use the BSD-licensed reference-implementation subset of Sourcery VSIPL++, you must configure with the following option:

  • --enable-only-ref-impl

    Build only the reference-implementation subset of Sourcery VSIPL++. If you do not use this option, the complete, optimized implementation of Sourcery VSIPL++ will be built.