Channel: Intel® C++ Compiler

Offload Timers for Intel® Graphics Technology

This topic only applies to IA-32 architecture targeting Intel® Graphics Technology. Intel® Graphics Technology is a preview feature.

The timers maintained by the offload runtime help you break down the time spent in an offload session and focus your tuning efforts. For example, the timers may show that further tuning of the compiled code will not help because offload overhead dominates the total time.

You can set the runtime to print timing information at the end of execution by setting GFX_SHOW_TIME to 1. The runtime prints something similar to the following:

GFX performance timers with non-zero value (millisecond,activation counter):
                   Offload Total = 3.94, 3
                 Device Creation = 42.10, 1
                 Kernel Creation = 0.09, 1
                Kernel Execution = 2.53, 1
      Kernel Execution on Device = 0.03, 1
                 Buffer Creation = 0.25, 4
              Buffer Destruction = 0.03, 3
                  Buffer Reading = 0.08, 1
                  Buffer Writing = 0.39, 4
       Iteration Space Splitting = 0.02, 1
                  Argument Setup = 0.62, 2
                     ELF Parsing = 0.12, 1
                 Program Loading = 14.31, 1
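
If you want to post-process this report, each timer line follows the pattern `name = milliseconds, activation count`. The following is a minimal sketch of a parser for that pattern, not part of the offload runtime. The function name `parse_gfx_timers` is hypothetical, and the line format is assumed from the sample output above only.

```python
import re

# Assumed line shape, taken from the sample report only:
#   "<timer name> = <milliseconds>, <activation count>"
TIMER_LINE = re.compile(
    r"^\s*(?P<name>[^=]+?)\s*=\s*(?P<ms>\d+(?:\.\d+)?)\s*,\s*(?P<count>\d+)\s*$"
)


def parse_gfx_timers(text):
    """Return {timer name: (milliseconds, activation count)}.

    Lines that do not match the assumed pattern (for example the
    report header) are skipped.
    """
    timers = {}
    for line in text.splitlines():
        match = TIMER_LINE.match(line)
        if match:
            timers[match.group("name")] = (
                float(match.group("ms")),
                int(match.group("count")),
            )
    return timers


# A few lines copied from the sample report above.
SAMPLE = """\
                   Offload Total = 3.94, 3
                 Device Creation = 42.10, 1
                Kernel Execution = 2.53, 1
      Kernel Execution on Device = 0.03, 1
                  Buffer Writing = 0.39, 4
"""

if __name__ == "__main__":
    for name, (ms, count) in parse_gfx_timers(SAMPLE).items():
        print(f"{name}: {ms:.2f} ms over {count} activation(s)")
```

Keying the result by timer name makes it straightforward to compare individual timers between runs.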

To disable timer printing, set GFX_SHOW_TIME to either an empty string or 0.
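
As a rough sketch of how the setting might be toggled from a script, the following launches a program with GFX_SHOW_TIME set for the child process only. It assumes GFX_SHOW_TIME is read from the process environment. The helper names `offload_env` and `run_with_timers` and the executable path in the usage comment are hypothetical. Because the source does not say which output stream the runtime prints to, the sketch captures both.

```python
import os
import subprocess


def offload_env(show_time, base=None):
    """Return a copy of the environment with GFX_SHOW_TIME set.

    "1" enables timer printing; "0" disables it.
    """
    env = dict(os.environ if base is None else base)
    env["GFX_SHOW_TIME"] = "1" if show_time else "0"
    return env


def run_with_timers(cmd):
    """Run cmd with timer printing enabled and capture its output.

    Both stdout and stderr are captured, since the stream used for
    the timer report is not stated in this topic (assumption).
    """
    return subprocess.run(
        cmd,
        env=offload_env(True),
        capture_output=True,
        text=True,
        check=False,
    )


# Hypothetical usage; "./my_offload_app" is a placeholder name:
#   result = run_with_timers(["./my_offload_app"])
#   print(result.stdout)
#   print(result.stderr)
```

Setting the variable on a copy of the environment keeps the change local to the launched program and leaves the calling shell untouched.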

The following table describes the meaning of each timer:

Timer Name | Description
--- | ---
Device Creation | The runtime and device initialization time.
Offload Total | The total time spent for all offload sessions.
Program Loading | The total time spent to load all the kernels in the program, including JIT compilation time.
Kernel Creation | The total time spent for kernel creation. Does not include JIT compilation time.
Kernel Execution | The total time for all kernel executions. Measured from placing a kernel into a queue until receiving the completion signal.
Kernel Execution on Device | The total time for kernel executions measured by the Intel® Graphics Technology driver stack. This time is usually less than Kernel Execution time because it excludes event waiting and other overhead.
Buffer Creation, Buffer Destruction | Time spent for creation and destruction of the target's memory areas.
Buffer Reading, Buffer Writing | Time spent for copying data to and from the target's memory areas.
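
To illustrate how these timers can guide tuning, the sketch below takes the values from the sample report and compares Kernel Execution with Kernel Execution on Device. The function names are hypothetical. Treating the difference between the two timers as host-side overhead is an interpretation of the descriptions above, not a documented formula.

```python
# Values in milliseconds, copied from the sample report in this topic.
timers_ms = {
    "Kernel Execution": 2.53,
    "Kernel Execution on Device": 0.03,
    "Buffer Reading": 0.08,
    "Buffer Writing": 0.39,
}


def kernel_overhead_ms(t):
    """Kernel Execution time not accounted for by the device itself
    (event waiting and other overhead, per the timer descriptions)."""
    return t["Kernel Execution"] - t["Kernel Execution on Device"]


def device_share(t):
    """Fraction of Kernel Execution time spent running on the device."""
    return t["Kernel Execution on Device"] / t["Kernel Execution"]


def buffer_copy_ms(t):
    """Time spent copying data to and from the target's memory areas."""
    return t["Buffer Reading"] + t["Buffer Writing"]


if __name__ == "__main__":
    print(f"Kernel overhead: {kernel_overhead_ms(timers_ms):.2f} ms")
    print(f"Device share:    {device_share(timers_ms):.1%}")
    print(f"Buffer copies:   {buffer_copy_ms(timers_ms):.2f} ms")
```

For the sample values, about 2.50 ms of the 2.53 ms Kernel Execution time falls outside the device measurement, so making the kernel code itself faster would change the total very little in that run.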

Optimization Notice

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804
