Channel: Intel® C++ Compiler

Applying Profile-Guided Optimization (PGO) to Fluid Animate


Fluid Animate is one of a class of algorithms for calculating fluid flow. There is an existing implementation that uses Intel® Cilk Plus technology. This sample uses the same serial implementation to demonstrate Intel Profile-Guided Optimization (PGO) without any code changes. PGO in the Intel® C++ Compiler improves application performance by reorganizing code layout to reduce instruction-cache problems and branch mispredictions. With runtime information collected from the application, the compiler can be more selective and specific in how it optimizes. This can also limit the code-size growth caused by aggressive inlining. For a more in-depth explanation, see the article "Guide to Profile-guided Optimization of Computational Fluid Dynamics with Intel® Compiler".
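
To make the idea concrete, here is a minimal sketch of the kind of code where profile data can help. The function names (clampDensity, sumDensities) are hypothetical and are not taken from the FluidAnimate sample. The assumption is that negative inputs are rare at run time: an instrumented run records how often each branch is taken, so the compiler can keep the common path as straight-line code and move the rare path out of the way.

```cpp
#include <vector>

// Hypothetical helper, not part of the FluidAnimate sample.
// For typical inputs the negative branch is rarely taken. A static compile
// cannot know that; a profile collected from an instrumented run can.
double clampDensity(double d) {
    if (d < 0.0) {
        return 0.0;  // assumed cold path: rare in representative runs
    }
    return d;        // assumed hot path
}

// Small, frequently called functions like clampDensity are also inlining
// candidates. Call counts from a profile can guide the compiler to inline
// only at the call sites that are actually hot.
double sumDensities(const std::vector<double>& values) {
    double total = 0.0;
    for (double v : values) {
        total += clampDensity(v);
    }
    return total;
}
```

The source itself needs no annotations. The benefit would come only from building once with profile generation, running on representative input, and rebuilding with profile use, which is why the sample can apply PGO without code changes.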

This code was originally written as part of the Princeton PARSEC benchmark suite by Richard O. Lee and was later modified by Christian Bienia and Christian Fensch.

 
System Requirements

  • Hardware:
    • Any Intel® processor, such as 2nd Generation Intel Core™ i3, i5, or i7 processors and the Intel Xeon® E3 or E5 processor family
  • For Microsoft* Windows*:
    • Microsoft Visual Studio 2010*, 2012*, or 2013* Professional Edition or above
    • Intel® Parallel Studio XE 2015 Composer Edition for C++ Windows*
  • For Linux*:
    • GNU* GCC 4.5 or newer
    • Intel® Parallel Studio XE 2015 Composer Edition for C++ Linux*
  • For OS X*:
    • OS X 10.9 or above
    • Xcode* 5.0 or above
    • Intel® Parallel Studio XE 2015 Composer Edition for C++ OS X OR
      Intel® Integrated Native Developer Experience 2015 Build Edition for OS X (Intel® INDE 2015 Build Edition for OS X)

Performance Gain with PGO

| PGO speedup (scalar / SIMD) | Code size reduction | Compiler (Intel® 64) | Compiler options | System specifications |
|---|---|---|---|---|
| Scalar: 4% over non-PGO; SIMD: 5% over non-PGO | 17% reduction with PGO | Intel® Parallel Studio XE 2015 Composer Edition for C++ Windows* | /O2 /Qipo /fp:fast /Oi /MD /EHsc /Qprof-dir "Intel-Release\" | Windows Server 2008 R2 Enterprise; 4th Gen Intel® Core™ i5-4670T CPU @ 2.30GHz + 8GB memory |
| Scalar: 4% over non-PGO; SIMD: 6% over non-PGO | 15% reduction with PGO | Intel® Parallel Studio XE 2015 Composer Edition for C++ OS X* | -O2 -ipo -fast -prof-dir ./intel-release | OS X 10.9.2; 2nd Gen Intel® Core™ i7-2635QM CPU @ 2GHz + 8GB memory |
| Scalar: 4% over non-PGO; SIMD: 6% over non-PGO | 13% reduction with PGO | Intel® Parallel Studio XE 2015 Composer Edition for C++ Linux* | -O2 -ipo -fast -prof-dir ./intel-release | Ubuntu* 12.04; 2nd Gen Intel® Core™ i7-2600K CPU @ 3.40GHz + 8GB memory |

Build Instructions:

  • For Microsoft Visual Studio 2010, 2012, or 2013 users:
    • Open Visual Studio 2010, 2012, or 2013 and load the solution (.sln) file.
    • Choose a configuration (for best performance, choose a release configuration):
      • Intel-debug and Intel-release: use the Intel C++ compiler
      • VSC-debug and VSC-release: use the Microsoft Visual C++* compiler
    • Run the application to get a baseline run time for later comparison. Use the "Intel-Release" configuration for performance testing:
      • Project Properties -> C/C++ -> Preprocessor -> Preprocessor Definitions: add PERF_NUM
    • Build with Intel C++ compiler PGO:
      • In the Solution Explorer window, right-click the FluidAnimate project name and select "Intel Compiler XE 15.0 > Profile Guided Optimization". This opens the "Profile Guided Optimization" dialog.
      • Without changing any default settings, click the [Run] button. This step does the following:
        • Builds FluidAnimate with the /Qprof-gen option
        • Runs the application to collect the profiling data
        • Rebuilds the application with the /Qprof-use option
        • Creates the PGO-optimized FluidAnimate binary at "PGOSample\Intel-Release\FluidAnimate.exe"
    • Run the PGO-optimized binary to get the performance number: press Ctrl+F5 to run the application.
  • For Windows Command Line users:
    • For Intel C++ Compiler:
      • Open the appropriate Intel C++ compiler command prompt from the Start menu
      • Build and run the application normally, without PGO:
        build
        build run -o 0
      • Build with PGO:
        build pgo
        This command does the following:
        • Builds FluidAnimate with /Qprof-gen
        • Runs the application to collect profile data
        • Rebuilds FluidAnimate with /Qprof-use and generates the PGO-optimized binary "release\FluidAnimatePGO.exe"
    • [Optional] For Visual C++ Compiler (only linear/scalar will run):
      • Open the appropriate Microsoft Visual Studio command prompt from the Start menu and navigate to the project folder
      • To compile: Build.bat [perf_num]
        • [perf_num]: collects performance numbers (runs the example 5 times and averages the time taken)
      • To run: Build.bat run
  • For Linux* or OS X* users:
    • From a terminal window, navigate to the project folder
    • Using Intel® C++ compiler:
      • Set the environment: source <icc-install-dir>/bin/compilervars.sh ia32 (or intel64)
      • To compile with PGO, run the BuildPGO.sh shell script: ". ./BuildPGO.sh"
        This step does the following:
        • Generates an instrumented binary
        • Runs the instrumented binary to collect profile data
        • Builds the application again using the profile data and generates the PGO-optimized binary intel-release/FluidAnimatePGO
      • To run the PGO-optimized binary:
        ./intel-release/FluidAnimatePGO
    • [Optional] Using gcc (only linear/scalar will run):
      • To compile: make gcc [perf_num=1]
        • [perf_num=1]: collects performance numbers (runs the example 5 times and averages the time taken)
      • To run: make run
  • For OS X* users using Intel® Integrated Native Developer Experience Build Edition for OS X* (Intel® INDE Build Edition for OS X*):
    • From a terminal window, navigate to the project folder
    • Using Intel® C++ compiler:
      • Set the environment: source <icc-install-dir>/bin/compilervars.sh ia32 (or intel64)
      • To compile with PGO, run the BuildPGO_inde.sh shell script: ". ./BuildPGO_inde.sh"
        This step does the following:
        • Generates an instrumented binary
        • Runs the instrumented binary to collect profile data
        • Builds the application again using the profile data and generates the PGO-optimized binary intel-release/FluidAnimatePGO
      • To run the PGO-optimized binary:
        ./intel-release/FluidAnimatePGO
    • [Optional] Using gcc (only linear/scalar will run):
      • To compile: make gcc [perf_num=1]
        • [perf_num=1]: collects performance numbers (runs the example 5 times and averages the time taken)
      • To run: make run
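
The perf_num option is described above as running the example 5 times and averaging the time taken. The following is a minimal sketch of such a measurement loop, assuming wall-clock timing with std::chrono. averageSeconds is a hypothetical helper and is not the sample's actual timing code.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` `runs` times (5 by default, matching the
// perf_num description) and returns the mean wall-clock time in seconds.
double averageSeconds(const std::function<void()>& work, int runs = 5) {
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    double totalSeconds = 0.0;
    for (int i = 0; i < runs; ++i) {
        const auto start = Clock::now();
        work();
        const auto stop = Clock::now();
        totalSeconds += std::chrono::duration<double>(stop - start).count();
    }
    return totalSeconds / runs;
}
```

Averaging over several runs smooths out run-to-run noise, which matters when the differences being measured are only a few percent, as in the baseline versus PGO comparison.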