Overview

This directory contains a simple example that sums values in a tree. The example exhibits some speedup, but not a lot, because it quickly saturates the system bus on a multiprocessor. For good speedup, there need to be more computation cycles per memory reference. The point of the example is to teach how to use the raw task interface, so the computation is deliberately trivial.
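
To make the shape of the computation concrete, here is a minimal sketch of summing a tree sequentially and in parallel. It is not the code in this directory: it uses std::async from the C++ standard library in place of the raw task interface, and the TreeNode layout, build_tree, serial_sum, parallel_sum, and the cutoff parameter are all hypothetical names chosen for illustration (the real declarations are in common.h and may differ).

```cpp
#include <cassert>
#include <future>
#include <memory>

// Hypothetical node type for illustration only.
struct TreeNode {
    std::unique_ptr<TreeNode> left;
    std::unique_ptr<TreeNode> right;
    long node_count = 1;  // number of nodes in the subtree rooted here
    double value = 0.0;
};

// Build a roughly balanced tree of n nodes, each holding the value 1.0.
std::unique_ptr<TreeNode> build_tree(long n) {
    if (n <= 0) return nullptr;
    auto root = std::make_unique<TreeNode>();
    root->value = 1.0;
    root->node_count = n;
    long left_n = (n - 1) / 2;
    root->left = build_tree(left_n);
    root->right = build_tree(n - 1 - left_n);
    return root;
}

// Sequential sum: one addition per node visited, so memory traffic dominates.
double serial_sum(const TreeNode* node) {
    if (!node) return 0.0;
    return node->value + serial_sum(node->left.get()) + serial_sum(node->right.get());
}

// Parallel sum: subtrees smaller than cutoff are summed sequentially; larger
// ones sum the left child on another thread (std::async stands in for a task).
double parallel_sum(const TreeNode* node, long cutoff = 1000) {
    assert(cutoff > 0);
    if (!node) return 0.0;
    if (node->node_count < cutoff) return serial_sum(node);
    auto left = std::async(std::launch::async, parallel_sum, node->left.get(), cutoff);
    double right = parallel_sum(node->right.get(), cutoff);
    return node->value + left.get() + right;
}
```

A caller would build a tree with build_tree(n) and pass root.get() to either function. Each node contributes a single addition, which is why a computation like this is limited by memory bandwidth rather than by arithmetic.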

The performance of this example is better when objects are allocated by the Threading Building Blocks scalable_allocator instead of the default "operator new". The reason is that the scalable_allocator typically packs small objects more tightly than the default "operator new", resulting in a smaller memory footprint, and thus more efficient use of cache and virtual memory. In addition, the scalable_allocator performs better for multi-threaded allocations.
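
One way to switch between allocators is to route node allocation through a standard allocator interface. The sketch below is an assumption about how that could be done, not a copy of the example's code: Node, make_node, and free_node are hypothetical names, and std::allocator is used as a stand-in so the snippet compiles without Threading Building Blocks installed. With TBB available, tbb::scalable_allocator<Node> (declared in tbb/scalable_allocator.h) could be passed in its place.

```cpp
#include <cassert>
#include <memory>

// Hypothetical tree node for illustration only.
struct Node {
    Node* left;
    Node* right;
    double value;
};

// Allocate and construct one node through any standard-conforming allocator.
template <class Alloc>
Node* make_node(Alloc& alloc, double value) {
    using Traits = std::allocator_traits<Alloc>;
    Node* p = Traits::allocate(alloc, 1);
    Traits::construct(alloc, p, Node{nullptr, nullptr, value});
    return p;
}

// Destroy and release a node obtained from make_node with the same allocator.
template <class Alloc>
void free_node(Alloc& alloc, Node* p) {
    using Traits = std::allocator_traits<Alloc>;
    Traits::destroy(alloc, p);
    Traits::deallocate(alloc, p, 1);
}
```

For example, declaring std::allocator<Node> alloc and calling make_node(alloc, 1.0) yields a node that must later be released with free_node(alloc, p). Because the allocator is a template parameter, the tree-building code does not change when the allocator does.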

Files

SerialSumTree.cpp
Sums sequentially.
SimpleParallelSumTree.cpp
Sums in parallel without any fancy tricks.
OptimizedParallelSumTree.cpp
Sums in parallel, using "recycling" and "continuation-passing" tricks. In this case, it is only slightly faster than the simple version.
common.h
Shared declarations.
main.cpp
Driver.
Makefile
Makefile for building the example.

Directories

vc7.1
Contains a Microsoft* Visual Studio* .NET 2003 workspace for building and running the example.
vc8
Contains a Microsoft* Visual Studio* 2005 workspace for building and running the example.
vc9
Contains a Microsoft* Visual Studio* 2008 workspace for building and running the example.
xcode
Contains an Xcode* IDE workspace for building and running the example.

To Build

General build directions can be found here.

Usage

tree_sum [-stdmalloc] S N
S is the problem size (the number of nodes in the tree). N is the number of threads to be used.
Passing "-stdmalloc" as the first argument causes the default "operator new" to be used for memory allocations instead of the TBB scalable_allocator.
To run a short version of this example, e.g., for use with Intel® Threading Tools:
1. Build a debug version of the example (see the build directions).
2. Run it with a small problem size and the desired number of threads, e.g., tree_sum 100000 4.
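
The command-line form above could be parsed along the following lines. This is a hypothetical sketch, not the parser in main.cpp: the Options struct and parse_args function are names invented for illustration, and the rule that S and N must both be positive is an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <optional>

// Hypothetical container for the parsed arguments.
struct Options {
    bool use_std_malloc = false;  // true when "-stdmalloc" is given
    long problem_size = 0;        // S: number of nodes in the tree
    int thread_count = 0;         // N: number of threads
};

// Parse "tree_sum [-stdmalloc] S N"; returns std::nullopt on malformed input.
std::optional<Options> parse_args(int argc, const char* const argv[]) {
    Options opts;
    int i = 1;
    if (i < argc && std::strcmp(argv[i], "-stdmalloc") == 0) {
        opts.use_std_malloc = true;
        ++i;
    }
    if (argc - i != 2) return std::nullopt;  // exactly S and N must remain
    opts.problem_size = std::strtol(argv[i], nullptr, 10);
    opts.thread_count = static_cast<int>(std::strtol(argv[i + 1], nullptr, 10));
    if (opts.problem_size <= 0 || opts.thread_count <= 0) return std::nullopt;
    return opts;
}
```

With this sketch, the arguments "100000 4" yield a problem size of 100000 and 4 threads with the scalable allocator selected, while "-stdmalloc 100000 4" selects the default "operator new" instead.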

Copyright © 2005-2009 Intel Corporation. All Rights Reserved.

Intel, Pentium, Intel Xeon, Itanium, Intel XScale and VTune are registered trademarks or trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

* Other names and brands may be claimed as the property of others.