Programming and Compiling for Intel® Many Integrated Core Architecture

Compiler Methodology for Intel® MIC Architecture

This article is part of the Intel® Modern Code Developer Community documentation which supports developers in leveraging application performance in code through a systematic step-by-step optimization framework methodology. This article addresses: parallelization.

This methodology enables you to determine your application's suitability for performance gains using Intel® Many Integrated Core Architecture (Intel® MIC Architecture). The following links will allow you to understand the programming environment and help you evaluate the suitability of your app to the Intel Xeon and MIC environment.

Preparing for the Intel® Many Integrated Core Architecture
- Application Analysis for Intel® MIC Architecture Suitability
- Expectations for User Source Code Changes

Because of the rich and varied programming environments provided by the Intel Xeon and Xeon Phi processors, the Intel compilers offer a wide variety of switches and options for controlling the executable code that they produce. This chapter provides the information necessary to insure that a user gets the maximum benefit from the compilers.

New User Compiler Basic Usage

The Intel® MIC Architecture provides two principal programming models: the native model covers compiling applications to run directly on the coprocessor, the heterogeneous offload model covers running a main host program and offloading work to the coprocessor, including standard offload and the Cilk_Offload model. The following chapter gives you insights into the applicability of these models to your application.

Efficient Parallelization

The third level of parallelism associated with code modernization is vectorization and SIMD instructions. The Intel compilers recognize a broad array of vector constructs and are capable of enabling significant performance boosts for both scalar and vector code. The following chapter provides detailed information on ways to maximize your vector performance.

Vectorization Essentials

The final chapter in the section provides insight into some advanced optimization topics. Included are discussions of floating point accuracy, data movement, thread scheduling, and many more. This is a good chapter for users still not seeing their desired performance OR are looking for the last level of performance enhancements.