Sourcery VSIPL++ allows you to write custom functions that participate in expression template dispatch and evaluation. This optimizes the handling of the function's return value and allows custom evaluators to recognize fused expressions containing the function.
Let us work through an example, starting with the function scale():
template <typename T, typename BlockType>
Vector<T>
scale(Vector<T, BlockType> a, T value)
{
  Vector<T> r = a * value;
  return r;
}
This function takes a vector, multiplies it by a scalar value, and returns the result. Because the result is returned by value, it is copied during assignment. In other words, the return value is a temporary object, which we would like to avoid.
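To make the cost of that temporary visible, here is a minimal sketch in standard C++ only. CountingVector, scale_eager and allocations_for_eager_assignment are hypothetical names invented for this illustration; they are not part of VSIPL++ and only mimic the by-value return pattern described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense vector; NOT the VSIPL++ Vector.
// It counts how many element buffers get constructed.
struct CountingVector
{
  static int allocations;  // buffers constructed so far
  std::vector<float> data;

  explicit CountingVector(std::size_t n, float v = 0.f) : data(n, v) { ++allocations; }
  CountingVector(CountingVector const &o) : data(o.data) { ++allocations; }
  CountingVector &operator=(CountingVector const &o) { data = o.data; return *this; }
};
int CountingVector::allocations = 0;

// Eager version: computes into a local result and returns it by value.
CountingVector scale_eager(CountingVector const &a, float value)
{
  CountingVector r(a.data.size());
  for (std::size_t i = 0; i != a.data.size(); ++i)
    r.data[i] = a.data[i] * value;
  return r;
}

// Assign the result to an existing vector and report how many extra
// buffers were constructed along the way (at least the temporary result).
int allocations_for_eager_assignment()
{
  CountingVector a(8, 2.f);
  CountingVector dst(8);
  int const before = CountingVector::allocations;
  dst = scale_eager(a, 2.f);  // temporary result, then element copy into dst
  assert(dst.data[0] == 4.f);
  return CountingVector::allocations - before;
}
```

Even though dst already owns storage of the right size, the eager function has to build at least one additional buffer for its result before the elements are copied into dst. Whether a further copy happens on return depends on the compiler's copy elision.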
To avoid it, we use a variant of a technique known as return value optimization.
We rewrite scale() to return an expression type, which is evaluated only once
the result object is available, so the computed value can be stored in place. To do that,
we capture the function logic in a functor and rewrite the scale()
function to return an expression block vector:
using vsip_csl::expr::Unary;
using vsip_csl::expr::Unary_functor;
// Scale implements an apply() member that scales its input
// argument and writes the result into the given result block.
template <typename ArgumentBlockType>
struct Scale : Unary_functor<ArgumentBlockType>
{
  Scale(ArgumentBlockType const &a, typename ArgumentBlockType::value_type s)
    : Unary_functor<ArgumentBlockType>(a), value(s) {}
  template <typename ResultBlockType>
  void apply(ResultBlockType &r) const
  {
    ArgumentBlockType const &a = this->arg();
    for (index_type i = 0; i != r.size(); ++i)
      r.put(i, a.get(i) * value);
  }
  typename ArgumentBlockType::value_type value;
};
// scale is a return-block optimised function returning an expression.
template <typename T, typename BlockType>
lazy_Vector<T, Unary<Scale, BlockType> const>
scale(const_Vector<T, BlockType> input, T value)
{
  Scale<BlockType> s(input.block(), value);
  Unary<Scale, BlockType> block(s);
  return lazy_Vector<T, Unary<Scale, BlockType> const>(block);
}
With that improvement, the call to scale() in
Vector<> a(8);
Vector<> r = scale(a, 2.f);
is evaluated entirely during the assignment.
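The idea of deferring evaluation until the destination exists can be sketched without the library. The following is a simplified illustration in standard C++, not the VSIPL++ implementation: ScaleExpr, ToyVector and scale_lazy are hypothetical names, and the expression stores a plain reference to its argument instead of a block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lazy expression: it remembers its operands and
// computes nothing until apply() is called.
struct ScaleExpr
{
  std::vector<float> const &arg;
  float value;

  std::size_t size() const { return arg.size(); }

  // Write the scaled elements straight into the caller's storage.
  void apply(std::vector<float> &result) const
  {
    assert(result.size() == arg.size());
    for (std::size_t i = 0; i != result.size(); ++i)
      result[i] = arg[i] * value;
  }
};

// Hypothetical stand-in for a vector; NOT the VSIPL++ Vector.
struct ToyVector
{
  std::vector<float> data;

  explicit ToyVector(std::size_t n, float v = 0.f) : data(n, v) {}

  // Construction from an expression: allocate the final storage once,
  // then let the expression evaluate into it.
  ToyVector(ScaleExpr const &e) : data(e.size()) { e.apply(data); }
};

// Returns an expression rather than a computed vector. The argument
// must outlive the returned expression, which only holds a reference.
ScaleExpr scale_lazy(ToyVector const &a, float value)
{
  return ScaleExpr{a.data, value};
}
```

With these definitions, `ToyVector r = scale_lazy(a, 2.f);` performs the multiplication inside the constructor of r, so the scaled elements are written directly into r's own storage.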
The Unary_functor above imposes certain requirements on its function parameter. If they cannot be met, we need to write a different functor. For example:
#include <cassert>
#include <iostream>
using vsip_csl::View_block_storage;
template <typename ArgumentBlockType>
struct Interpolator
{
public:
  typedef typename ArgumentBlockType::value_type value_type;
  typedef typename ArgumentBlockType::value_type result_type;
  typedef typename ArgumentBlockType::map_type map_type;
  static vsip::dimension_type const dim = ArgumentBlockType::dim;
  Interpolator(ArgumentBlockType const &a, Domain<ArgumentBlockType::dim> const &s)
    : argument_(a), size_(s) {}
  // Report the size of the new interpolated block
  length_type size() const { return size_.size(); }
  length_type size(dimension_type b, dimension_type d) const
  {
    assert(b == ArgumentBlockType::dim);
    return size_[d].size();
  }
  map_type const &map() const { return argument_.map(); }
  ArgumentBlockType const &arg() const { return argument_; }
  template <typename ResultBlockType>
  void apply(ResultBlockType &) const
  {
    std::cout << "apply interpolation !" << std::endl;
    // interpolate 'argument' into 'result'
  }
private:
  typename View_block_storage<ArgumentBlockType>::expr_type argument_;
  Domain<ArgumentBlockType::dim> size_;
};
The Interpolator functor creates a new vector with a different shape than the input vector. To see the full requirements for the UnaryFunctor, see Section 6.6.3, “The Unary_functor class template”.
Here again, to write an interpolate() function that
evaluates lazily, we need to return an expression block
vector:
template <typename T, typename BlockType>
lazy_Vector<T, Unary<Interpolator, BlockType> const>
interpolate(lazy_Vector<T, BlockType> arg, Domain<1> const &size)
{
  typedef Unary<Interpolator, BlockType> expr_block_type;
  Interpolator<BlockType> interpolator(arg.block(), size);
  expr_block_type block(interpolator);
  return lazy_Vector<T, expr_block_type const>(block);
}
Now we can combine the above functions into a single expression:
Vector<float> a(8, 2.);
Vector<float> b = interpolate(scale(a, 2.f), 32);
The example above demonstrates how to improve the performance of expression evaluation using a variant of the well-known return value optimization: when the result can be evaluated in place, a copy operation (and the associated temporary object) can be elided.
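As a rough illustration of how nested lazy nodes can be combined into one evaluation pass, the sketch below uses standard C++ only. It is not how VSIPL++ evaluates its functors: the nodes here expose an element-wise get() instead of the apply() member used above, and the names Leaf, Scaled, Resampled, scale, resample and evaluate are all hypothetical. The resampling node uses a nearest-neighbour lookup purely as a placeholder for real interpolation; what matters is that it reports its own size, which differs from the size of its argument.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Leaf node: refers to existing storage, which must outlive the expression.
struct Leaf
{
  std::vector<float> const &data;
  std::size_t size() const { return data.size(); }
  float get(std::size_t i) const { return data[i]; }
};

// Scaling node: same size as its argument.
template <typename Arg>
struct Scaled
{
  Arg arg;
  float value;
  std::size_t size() const { return arg.size(); }
  float get(std::size_t i) const { return arg.get(i) * value; }
};

// Resampling node: reports a size of its own (nearest-neighbour placeholder).
template <typename Arg>
struct Resampled
{
  Arg arg;
  std::size_t new_size;
  std::size_t size() const { return new_size; }
  float get(std::size_t i) const { return arg.get(i * arg.size() / new_size); }
};

template <typename Arg>
Scaled<Arg> scale(Arg a, float v) { return Scaled<Arg>{a, v}; }

template <typename Arg>
Resampled<Arg> resample(Arg a, std::size_t n)
{
  assert(n > 0);
  return Resampled<Arg>{a, n};
}

// Evaluate the whole expression tree in a single loop. The result is
// allocated once, sized by the outermost node; no intermediate vector
// is created for the scaled values.
template <typename Expr>
std::vector<float> evaluate(Expr const &e)
{
  std::vector<float> r(e.size());
  for (std::size_t i = 0; i != r.size(); ++i)
    r[i] = e.get(i);
  return r;
}
```

For an eight-element input a, `evaluate(resample(scale(Leaf{a}, 2.f), 32))` yields a 32-element vector whose elements are the scaled inputs, computed on demand while the result is being filled.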