Compiler Methodology for Intel® MIC Architecture
Unlike the IA-32 and Intel® 64 architectures, the Intel® MIC Architecture requires all data accesses to be properly aligned according to their size; otherwise, the program may behave unpredictably.
For example, an integer variable, which requires four bytes of storage, has to be allocated on an address that is a multiple of four. Likewise, a double-precision floating-point variable or a pointer variable, which requires eight bytes of storage, has to be allocated on an address that is a multiple of eight.
Structures and unions assume the alignment of their most strictly aligned component. Each member is assigned to the lowest available offset with the appropriate alignment. The size of any object is always a multiple of the object's alignment.
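The three layout rules above can be checked directly in C. The following is a minimal sketch, assuming a C11 compiler (for _Alignof); the struct sample type and the helper names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical record; the names are illustrative only. */
struct sample {
    int    id;     /* 4 bytes on typical ABIs */
    double value;  /* 8 bytes on typical ABIs, the most strictly aligned member */
    char   tag;    /* 1 byte, normally followed by tail padding */
};

/* Returns 1 if n is a multiple of align, 0 otherwise. */
static int is_multiple(size_t n, size_t align)
{
    assert(align != 0);
    return n % align == 0;
}

/* Returns 1 if struct sample follows the three layout rules. */
static int sample_layout_ok(void)
{
    size_t a = _Alignof(struct sample);

    /* Rule 1: the struct is at least as strictly aligned as each member. */
    int rule1 = a >= _Alignof(int) && a >= _Alignof(double) && a >= _Alignof(char);

    /* Rule 2: every member sits at an offset that suits its own alignment. */
    int rule2 = is_multiple(offsetof(struct sample, id), _Alignof(int))
             && is_multiple(offsetof(struct sample, value), _Alignof(double))
             && is_multiple(offsetof(struct sample, tag), _Alignof(char));

    /* Rule 3: the total size is a multiple of the struct's alignment. */
    int rule3 = is_multiple(sizeof(struct sample), a);

    return rule1 && rule2 && rule3;
}
```

Calling sample_layout_ok() from a test program should return 1 on any conforming compiler. On a typical 64-bit ABI the compiler inserts 4 padding bytes after id and 7 after tag, which is how the rules are satisfied.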
Note that removing the misaligned accesses on IA-32 and Intel® 64 architectures (through appropriate source changes) will likely lead to improved performance there too.
1. Here is a Fortran example that is not ABI-compliant on the Intel® MIC Architecture. Note the use of the SEQUENCE keyword inside the derived type.
Consider the following structure:
type, public :: GridEdge_t
   sequence
   integer :: head_face                  ! needed if head vertex has shape (i.e. square)
   integer :: tail_face                  ! needed if tail vertex has shape (i.e. square)
   integer :: head_ind                   !
   integer :: tail_ind                   !
   type (GridVertex_t), pointer :: head  ! edge head vertex
   type (GridVertex_t), pointer :: tail  ! edge tail vertex
   logical :: reverse
end type GridEdge_t
Adding up the sizes of the individual fields (four 4-byte integers, two 8-byte pointers, and one 4-byte logical), the size of this object is 36 bytes. Because the SEQUENCE keyword is used, the fields are contiguous in memory, and the elements of an array of these objects are packed without padding bytes. Assuming the array itself starts on an 8-byte boundary, the second element starts at byte offset 36, so its head and tail fields are no longer 8-byte aligned; the same holds for every other element after it. According to the ABI requirements, the fields head and tail must be 8-byte aligned, so the alignment of a GridEdge_t must be 8 bytes and its size must be a multiple of 8, namely 40. If the SEQUENCE keyword is removed, the compiler automatically creates GridEdge_t with the correct size of 40 bytes.
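The arithmetic behind the 36-byte and 40-byte figures can be worked through in a few lines of C. This is only a sketch of the calculation, not of the compiler's layout algorithm: the constants come from the text above (the head field at byte offset 16, pointers needing 8-byte alignment), the helper names are hypothetical, and the array is assumed to start on an 8-byte boundary.

```c
#include <assert.h>
#include <stddef.h>

enum {
    PACKED_SIZE   = 36,  /* 4 integers + 2 pointers + 1 logical, no padding */
    HEAD_OFFSET   = 16,  /* head follows the four 4-byte integers */
    POINTER_ALIGN = 8    /* required alignment of head and tail */
};

/* Rounds n up to the next multiple of align. */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Distance (in bytes) of the head field of array element `index` from the
 * previous 8-byte boundary; 0 means the field is properly aligned. */
static size_t head_misalignment(size_t index, size_t elem_size)
{
    return (index * elem_size + HEAD_OFFSET) % POINTER_ALIGN;
}
```

With the packed size, head_misalignment(1, PACKED_SIZE) evaluates to 4, so the second element's head pointer is misaligned. With the padded size round_up(PACKED_SIZE, POINTER_ALIGN), which is 40, the same call returns 0 for every index.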
2. Here is a simple synthetic example in C that violates the ABI:
#include <stdlib.h>

int main(int argc, char **argv)
{
    char *blob = (char *)malloc(100);     // malloc returns a pointer aligned to at least 8 bytes
    float *ptr = (float *)(blob + argc);  // Assume the program is invoked with no arguments, so argc = 1
    for (int i = 0; i < argc; i++)
    {
        ptr[i] = 0;  // GP fault here since the float is not aligned on a 4-byte boundary
    }
    free(blob);
    return 0;
}
This kind of access violation can arise from a user-written memory allocation routine. It's not uncommon for users to write their own memory allocation routines, which can inadvertently return unaligned memory. On the Intel® MIC Architecture this leads to runtime errors because of the ABI requirements, and the user should fix it by making appropriate changes in the source code.
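One way such a routine can respect the alignment requirement is to round each returned address up to the requested alignment. The following is a minimal sketch of a hypothetical bump allocator, not code from any particular application; struct arena, arena_alloc, and is_aligned are illustrative names. It assumes that alignments are powers of two and that converting a pointer to uintptr_t yields its numeric address, which holds on mainstream platforms but is implementation-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p is a multiple of align (numeric address assumed). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}

/* Hypothetical bump allocator over a caller-supplied buffer. */
struct arena {
    unsigned char *base;  /* start of the buffer */
    size_t size;          /* total bytes in the buffer */
    size_t used;          /* bytes handed out so far, including padding */
};

/* Returns nbytes of storage aligned to `align` (a power of two),
 * or NULL if the arena does not have enough room left. */
static void *arena_alloc(struct arena *a, size_t nbytes, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    uintptr_t start   = (uintptr_t)a->base + a->used;
    uintptr_t aligned = (start + align - 1) & ~(uintptr_t)(align - 1);
    size_t padding    = (size_t)(aligned - start);
    size_t remaining  = a->size - a->used;

    if (padding > remaining || nbytes > remaining - padding)
        return NULL;

    a->used += padding + nbytes;
    return a->base + (a->used - nbytes);
}
```

A caller would request storage with the alignment of the type it intends to store, for example arena_alloc(&a, 4 * sizeof(float), _Alignof(float)), instead of adding an arbitrary byte offset to a raw pointer as in the example above.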