In this step, you will look at the section of the source code that is defined to run on both the host and the target.
Open the source file src/Workloads/CrossFade/CrossFade.cpp in a code editor.
Find the following code:
#pragma offload target(gfx) if (do_offload) pin(inputArray1, inputArray2, outputArray: length(arraySize))
_Cilk_for (int i = 0; i < arraySize; i++) {
    outputArray[i] = (inputArray1[i] * a1 + inputArray2[i] * a2) >> 8;
}
This code block performs a weighted sum of two input arrays. At the start of the code block is the offload pragma. This pragma instructs the compiler to compile the code block that follows so that it can run on both the host and the target.
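To make the arithmetic concrete, here is a minimal host-only sketch of the same weighted sum in standard C++, with no offload pragma. The function name crossFadeReference is hypothetical, and the sketch assumes 8-bit elements and weights a1 and a2 that sum to 256, so that the right shift by 8 (a division by 256) brings the result back into the 0..255 range. The sample source may use different types or weights.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-only reference for the weighted sum (hypothetical helper, not part of the sample).
// Assumption: elements are 8-bit and a1 + a2 == 256, so ">> 8" renormalizes the sum.
std::vector<std::uint8_t> crossFadeReference(const std::vector<std::uint8_t>& in1,
                                             const std::vector<std::uint8_t>& in2,
                                             int a1, int a2) {
    assert(in1.size() == in2.size());  // both inputs must have the same length
    std::vector<std::uint8_t> out(in1.size());
    for (std::size_t i = 0; i < in1.size(); ++i) {
        // Same expression as the loop body in CrossFade.cpp.
        out[i] = static_cast<std::uint8_t>((in1[i] * a1 + in2[i] * a2) >> 8);
    }
    return out;
}
```

With a1 = 256 and a2 = 0 the output equals the first input. With a1 = a2 = 128 each output element is the average of the two inputs, rounded down.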
The host and the target can share memory, but the address of the memory area to share must be provided explicitly to the offload runtime. The pin clause of the offload pragma serves this purpose. The syntax of the pin clause is similar to that of the in, out, and inout clauses.
The pin clause defines the variable as shared between the host and the target.
The in clause defines the variable as strictly an input to the target. The value is not copied back to the host.
The out clause defines the variable as strictly an output of the target. The host does not copy the variable to the target.
The inout clause defines the variable as both an input to and an output of the target. The value is copied from the host to the target and then back to the host.
All of these clauses accept the optional length modifier which allows you to specify the length of the array in elements if it is not known at compile time.
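The clauses above could be combined as in the following sketch, which copies one array to the target with in and copies the result back with out instead of sharing memory with pin. The function scaleArray and the macro USE_GFX_OFFLOAD are hypothetical and are not part of the sample. The macro is assumed to be defined only when you build with a compiler that supports the offload pragma and _Cilk_for. Without it, the same loop compiles as ordinary host code.

```cpp
#include <cassert>

// Multiplies n elements of src by factor and stores them in dst.
// Assumption: USE_GFX_OFFLOAD is a hypothetical macro, defined only for
// compilers that understand "#pragma offload" and "_Cilk_for".
void scaleArray(const int* src, int* dst, int n, int factor) {
    assert(n >= 0);
#ifdef USE_GFX_OFFLOAD
    // src is only an input to the target and dst is only an output of it.
    // length(n) gives the element count, which is not known at compile time.
    #pragma offload target(gfx) in(src: length(n)) out(dst: length(n))
    _Cilk_for (int i = 0; i < n; i++)
#else
    // Host-only fallback: an ordinary serial loop over the same body.
    for (int i = 0; i < n; i++)
#endif
    {
        dst[i] = src[i] * factor;
    }
}
```

Compared with pin, this form trades shared memory for explicit copies, which may be simpler to reason about for small arrays but adds transfer cost for large ones.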
The target clause of the pragma directs the code to the target designated by the keyword gfx.
The pragma also has the if clause, which gives the application run-time control over whether the code executes on the host or on the target. If the value provided to the if clause evaluates to true, the code that follows the offload pragma executes on the available target. If the value evaluates to false, the code executes only on the host.
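One way an application might compute the value passed to the if clause is from its command line. The following sketch is an assumption for illustration only: the helper parseDoOffload and the --host-only option are hypothetical, and the sample application may set do_offload differently.

```cpp
#include <cstring>

// Hypothetical helper: offload by default, unless "--host-only" is passed.
bool parseDoOffload(int argc, const char* const argv[]) {
    for (int i = 1; i < argc; ++i) {  // argv[0] is the program name
        if (std::strcmp(argv[i], "--host-only") == 0) {
            return false;  // the if clause will keep execution on the host
        }
    }
    return true;  // the if clause will allow execution on the target
}
```

The result could then be stored in do_offload before the offloaded loop is reached, for example with bool do_offload = parseDoOffload(argc, argv); at the start of main.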
Immediately after the offload pragma is the _Cilk_for keyword. This keyword reflects one of the most important requirements for code that runs on the target: only a parallel loop or a parallel loop nest can be offloaded, and it must immediately follow the offload pragma.
Although the compiler by default compiles the source code into an application that runs on both the host and target, you can also compile the same source code into an application that runs on just the host.
In the next step, you will compile the source code into an application that runs only on the host.