5 Implementation

5.1 Implementation Documentation as per ISO 10514

Some properties, such as the memory requirements of the data types, cannot be standardized. The standard calls such quantities "implementation defined" and stipulates that each implementation must define them clearly. Further specifications regarding the operation of the compiler should also be contained in the compiler's documentation. Both kinds of information are summarized in this section.

5.1.1 General Information

5.1.2 Implementation-Defined Factors

5.1.3 Constant Expressions

Any standard function calls yielding a constant as result are allowed in a constant expression. Moreover, any functions from "SYSTEM" having this property are also allowed (cf. sections 2.8.2 and 4.1.4). For example:
    CONST c = MAX (INTEGER) - SYSTEM.TSIZE (REAL) * 3;
        dif = VAL (CHAR, ORD ("a") - ORD ("A"));
        low = SYSTEM.CAST (CARDINAL, SYSTEM.CAST (BITSET, c) * BITSET {0..7});
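
As a rough cross-check of the values these declarations yield, the following Python sketch redoes the arithmetic. It assumes a 32-bit target (so MAX (INTEGER) = 2147483647), TSIZE (REAL) = 4 bytes as listed in section 5.3.1, and a BITSET layout in which element 0 is the least significant bit (cf. section 5.3.7). The Python names are illustrative only and are not part of the compiler.

```python
# Assumptions (32-bit target): MAX(INTEGER) = 2**31 - 1, TSIZE(REAL) = 4 bytes.
MAX_INTEGER = 2**31 - 1
TSIZE_REAL = 4

c = MAX_INTEGER - TSIZE_REAL * 3      # 2147483635
dif = chr(ord("a") - ord("A"))        # character with ordinal 32 (a space)
# BITSET {0..7} selects the eight least significant bits, i.e. the mask 0xFF.
low = c & 0xFF                        # 243

print(c, repr(dif), low)
```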

Constant calculations involving real or complex numbers are carried out in IEEE double precision mode.

5.2 Restrictions

For the PowerPC implementation, the degree to which "WITH" statements and arithmetic expressions can be nested is limited by the number of registers available.

5.3 Implementation Details

5.3.1 Size and Alignment of Modula-2 Types

SYSTEM.LOC, SYSTEM.BYTE, CHAR, BOOLEAN, and enumeration types with up to 256 values:
  1 byte (8 bits)

SYSTEM.WORD16, SYSTEM.CARDINAL16, SYSTEM.INTEGER16, and enumeration types with more than 256 values:
  1 half word (16 bits)

SYSTEM.WORD32, SYSTEM.CARDINAL32, SYSTEM.INTEGER32, and REAL:
  1 word (32 bits)

SYSTEM.WORD, CARDINAL, INTEGER, BITSET, PROCEDURE types, SYSTEM.ADDRESS, pointer types:
  1 word (32 bits) for ppc and i386 targets; 2 words (64 bits) for target x86_64

LONGREAL, COMPLEX, SYSTEM.CARD64, SYSTEM.INT64, SYSTEM.BCD (15 nibbles for the decimal digits, one for the sign):
  2 words (64 bits)

LONGCOMPLEX, SYSTEM.LONGDOUBLE:
  4 words (128 bits)

SYSTEM.DOUBLECOMPLEX:
  8 words (256 bits)

SYSTEM.STR255 (Pascal string of maximum size):
  256 bytes (not necessarily aligned to half-word boundary)

Subrange types:
If the pragma variable "FixedSubrangeSize" is set to "TRUE", subrange types take the size of the base type.
If the pragma variable "FixedSubrangeSize" is set to "FALSE", the size is determined by:
  If lower and upper bound reside within [0..255] or [-128..127]:
    1 byte (8 bits)
  If lower and upper bound reside within [0..65535] or [-32768..32767]:
    1 half word (16 bits)
  If lower and upper bound reside within [0..4294967295] or [-2147483648..2147483647]:
    1 word (32 bits)
  If the lower and/or upper bound lies outside these ranges (possible for target x86_64 exclusively):
    2 words (64 bits)
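
The selection rule for subrange sizes can be sketched as a small Python function. This is only a model of the list above, not compiler code; the function name, the parameter names, and the default base size of 4 bytes are assumptions made for illustration.

```python
def subrange_size(low, high, fixed_subrange_size=False, base_size=4):
    """Size in bytes of a subrange [low..high] under the rules listed above.

    base_size is the size of the base type in bytes (assumed here to be 4,
    i.e. a 32-bit INTEGER or CARDINAL base type).
    """
    if fixed_subrange_size:
        return base_size
    ranges = [
        (1, (0, 2**8 - 1), (-2**7, 2**7 - 1)),
        (2, (0, 2**16 - 1), (-2**15, 2**15 - 1)),
        (4, (0, 2**32 - 1), (-2**31, 2**31 - 1)),
    ]
    for size, unsigned, signed in ranges:
        for lo, hi in (unsigned, signed):
            if lo <= low and high <= hi:
                return size
    return 8  # bounds outside all ranges: only possible on the 64-bit target
```

For instance, `subrange_size(-1, 200)` falls outside both one-byte ranges and therefore selects a half word.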

ARRAY:
Number of elements times the size of a single element.

RECORD:
Sum of the sizes of all fields (taking the largest variant of each variant part), including any fill bytes needed to align individual fields to alignment boundaries, plus a single fill byte appended to the end of the record where necessary so that it occupies a whole number of half words. A record that occupies only one byte is not extended.
For example:

TYPE
    Rec = RECORD          (* SIZE (Rec) yields the value 8. *)
        ch: CHAR;         (* After "ch" there is a fill byte, *)
        i: INTEGER;       (* since "i" is aligned to a half-word boundary. *)
        b: BOOLEAN;       (* Another fill byte follows "b", so that the *)
    END(*RECORD*);        (* record occupies a whole number of half words. *)

If the record directly or indirectly contains a vector field, the size is padded to an integral multiple of 16 bytes.
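
The record rule can be approximated by the following Python sketch. It assumes that every field larger than one byte is aligned to a half-word boundary, as in the "Rec" example; the actual boundary depends on the alignment model in use. Variant parts and vector fields are not modelled, and the function name is hypothetical.

```python
def record_size(field_sizes, align=2):
    """Size in bytes of a record whose fields have the given sizes.

    Assumption of this sketch: fields larger than one byte are aligned to
    `align` bytes (half-word alignment); one-byte fields are not aligned.
    """
    offset = 0
    for size in field_sizes:
        if size > 1 and offset % align:
            offset += align - offset % align   # fill bytes before the field
        offset += size
    if offset > 1 and offset % 2:
        offset += 1                            # pad to a whole number of half words
    return offset

print(record_size([1, 4, 1]))  # fields ch, i, b of Rec in the example
```

With the field sizes of "Rec" (1, 4, and 1 bytes) the sketch reproduces the value 8 stated in the example, and a record consisting of a single one-byte field stays at 1 byte.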

SET:
The ordinal values of the base type's bounds determine the memory-space requirements of SETs. If the difference between the upper and lower bounds is in the range 0 to 7, a variable of this SET type occupies one byte. In all other cases an integral number of half words is used, as given by the following formula:
  (ORD (MAX (base type)) - ORD (MIN (base type))) DIV 16 + 1 half words

PACKEDSET:
In this case the ordinal value of the base type's upper bound is what determines the amount of memory needed. If this value lies between 0 and 7 the set occupies one byte, otherwise it takes up an integral number of half words according to the following formula:
  ORD (MAX (base type)) DIV 16 + 1 half words
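
The two size formulas for SET and PACKEDSET can be compared with a short Python sketch. The function names are made up for this illustration; the arguments are the ordinal values of the base type's bounds, and the results are in bytes.

```python
def set_size(min_ord, max_ord):
    """Bytes occupied by a SET whose base type has the given ordinal bounds."""
    if 0 <= max_ord - min_ord <= 7:
        return 1
    return ((max_ord - min_ord) // 16 + 1) * 2   # half words converted to bytes


def packedset_size(max_ord):
    """Bytes occupied by a PACKEDSET; only the upper bound matters."""
    if 0 <= max_ord <= 7:
        return 1
    return (max_ord // 16 + 1) * 2               # half words converted to bytes
```

The difference shows up for base types that do not start at 0: under these formulas a SET OF [100 .. 101] fits in one byte, whereas a PACKEDSET OF [100 .. 101] needs 7 half words (14 bytes).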

CLASS:
Object variables contain a pointer to the object; it occupies
  1 word (32 bits) for ppc and i386 targets; 2 words (64 bits) for target x86_64
The object itself takes up the same amount of memory as a record with the same fields (including the fields of all ancestor types), plus additional runtime information.

Alignment
Variables of simple types occupying only one byte and of the type SYSTEM.STR255 can be aligned to any byte boundary (i.e. any address is possible); the same is true for arrays of byte-alignable elements. Variables of all other types start on half-word, word, double-word, or quad-word boundaries (depending on the alignment conventions). Arrays have the same alignment as their elements. Constants are always aligned to word boundaries.

By assignment to the pragma variable "AlignModelPPC" the alignment model may be chosen:

The above default may be overridden by assigning one of the following values to the pragma variable "AlignBorder". Attention:

5.3.2 Parameter Passing

For PPC:
In accordance with the PPC runtime conventions, parameters are passed in registers as far as possible. Other parameters are passed on the stack, always aligned on word boundaries. For parameters smaller than 4 bytes only the first bytes are used; the fill bytes that follow are undefined.
Value parameters of types "COMPLEX" or "LONGCOMPLEX" are passed in register pairs if available.
Any parameter larger than 4 bytes, except those of types "LONGREAL", "COMPLEX", or "LONGCOMPLEX", is always passed to a procedure as an address. If this "reference" parameter is actually intended to be a value parameter, a copy of it is created within the procedure itself to avoid side effects.

For I386:
In accordance with the I386 runtime conventions, parameters are passed on the stack, always aligned on word boundaries. For parameters smaller than 4 bytes only the first bytes are used; the fill bytes that follow are undefined.

For X86_64:
In accordance with the X86_64 runtime conventions, parameters are passed in registers or on the stack, always aligned on word boundaries. For parameters smaller than 8 bytes only the first bytes are used; the fill bytes that follow are undefined.

Any parameter larger than 4 bytes for 32-bit targets or 8 bytes for 64-bit targets, except those of types "LONGREAL" or "LONGDOUBLE", is always passed to a procedure as an address. If this "reference" parameter is actually intended to be a value parameter, a copy of it is created within the procedure itself to avoid side effects.

5.3.3 Reference Parameters

It is conventional programming practice always to declare "ARRAYs" as "VAR" parameters, even if the parameter is never actually written to. This saves processing time and memory but masks the parameter's intended purpose.

p1 Modula-2 therefore provides the pragma variable "CopyRefparams". The copying of value parameters can be suppressed using the directive "ASSIGN (CopyRefparams, FALSE)" (or shorter: "CopyRefparams (FALSE)"). This has the same effect as placing "VAR" before the formal parameter's name, except that constants and expressions can also be passed as actual parameters. In addition, the compiler can check that the parameter is not written to. However, the compiler cannot determine in all cases whether there really is no write access to the parameter. As a result, ill-considered use of this feature can lead to strange side effects (e.g. overwriting a string constant!). In standard mode such reference parameters are always copied.

Examples:

Procedure declarations:

    TYPE
        MyRecord = RECORD
            c1: CARDINAL;
            c2: CARDINAL;
        END(*RECORD*);

  1. <* CopyRefparams (FALSE) *>
     PROCEDURE test (r: MyRecord);
  2. <* CopyRefparams (TRUE) *>
     PROCEDURE test (r: MyRecord);
  3. PROCEDURE test (VAR r: MyRecord);

Procedure calls: Assignment to "r" within procedure "test":

5.3.4 Open Arrays

Open array parameters ("ARRAY OF ...") are passed as an address (1 word) and the maximal index in each dimension (each 1 word with offset +4*n, where "n" is the dimension: 1st: n = 1; 2nd: n = 2, etc.).

The address is passed as usual (i.e. in a register if available); the high indices are always passed on the stack.

5.3.5 Value of "NIL" / "EMPTY" / Undefined Variables

For "NIL" and "EMPTY" the value 0 is used.

The initial value of all variables is undefined. The compiler attempts to detect the use of undefined variables statically and issues a warning where it finds one.

5.3.6 Arrangement of Bits in SET Variables

The bits of SET variables whose base type (or universe) comprises no more than 32 values (32-bit targets) or no more than 64 values (64-bit targets) are arranged like the bits in the Intel or 680x0 registers (the LSB corresponds to the lowest value of the base type). Sets with larger base types are allocated bytewise:
  byte position = (bit number - offset) DIV 8
  bit position = (bit number - offset) MOD 8
where
  bit number = ORD (base type value)
  offset = 0 if ORD (MIN (base type)) IN {0 .. 7}
  offset = ORD (MIN (base type)) otherwise
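
The bytewise allocation rule can be expressed as a short Python sketch. It is only a model of the formulas above for sets with large base types; the function name is hypothetical, and the arguments are the ordinal values of the element and of the base type's lower bound.

```python
def set_bit_location(value_ord, min_ord):
    """(byte position, bit position) of an element in a bytewise-allocated SET."""
    # offset = 0 if ORD (MIN (base type)) IN {0 .. 7}, else ORD (MIN (base type))
    offset = 0 if 0 <= min_ord <= 7 else min_ord
    bit_number = value_ord
    return (bit_number - offset) // 8, (bit_number - offset) % 8
```

For a base type starting at 100, the element with ordinal 109 ends up in byte 1, bit 1; for a base type starting at 3, no offset is subtracted, so the element with ordinal 3 sits at bit 3 of byte 0.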

5.3.7 Arrangement of Bits in PACKEDSET Variables

The layout of packed sets depends on the chosen target architecture. For ppc, the bits are arranged as in the 680x0 registers, i.e. the MSB in the byte with the lowest memory address and the LSB in the byte with the highest memory address. For i386 and x86_64, the MSB is in the byte with the highest memory address and the LSB in the byte with the lowest memory address. The following relationship holds:
  CAST (CARDINAL, BITSET {0}) = 1

The position of a single bit can be determined as follows:
ppc
  byte position = maxbytes - bit number DIV 8
  bit position = bit number MOD 8
i386 and x86_64
  byte position = bit number DIV 8
  bit position = bit number MOD 8
where bit number = ORD (base type value) and maxbytes = ORD (MAX (base type)) DIV 8. This means that, for example, a PACKEDSET OF [100 .. 101] takes up not 1 but 14 bytes. Bits 0 .. 99 and 102 .. 111 are also always present, but their values are undefined.
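
The two byte-order variants can be put side by side in a Python sketch. It models only the formulas given here; the function name and the target labels "ppc" and "i386" are chosen for this illustration.

```python
def packedset_bit_location(bit_number, max_ord, target):
    """(byte position, bit position) of a bit in a PACKEDSET.

    bit_number is the ordinal of the base type value, max_ord the ordinal
    of the base type's upper bound; target is "ppc" or "i386".
    """
    maxbytes = max_ord // 8
    if target == "ppc":
        byte_position = maxbytes - bit_number // 8   # MSB at the lowest address
    elif target == "i386":
        byte_position = bit_number // 8              # LSB at the lowest address
    else:
        raise ValueError("unknown target: " + target)
    return byte_position, bit_number % 8
```

For a 32-element set, bit 0 is therefore found in byte 3 on ppc but in byte 0 on i386, while its position within the byte is the same on both.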

Example:
The sign of a number of type "LONGREAL" can be determined like this:

    TYPE
        LongSet = PACKEDSET OF [0 .. 63];
    VAR
        r: LONGREAL;

        … 63 IN CAST (LongSet, r) …
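
The same test can be tried outside Modula-2. The Python sketch below assumes that LONGREAL corresponds to an IEEE double, whose sign is stored in bit 63; it reads the bit pattern via the standard `struct` module. The function name is illustrative.

```python
import struct


def sign_bit_set(x):
    """True if bit 63 of the IEEE double representation of x is set."""
    # Pack as a big-endian double and reinterpret the 8 bytes as an integer,
    # so that the sign bit becomes bit 63 of the integer.
    bits = int.from_bytes(struct.pack(">d", x), "big")
    return bool((bits >> 63) & 1)
```

Unlike a comparison with zero, the bit test also reports the sign of negative zero.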

5.3.8 Register Usage

5.3.8.1 PowerPC

The PowerPC registers are used by the compiler according to the runtime conventions:
r0, r2, r11, r12      Scratch registers, used only locally.
r1                    Stack pointer
r3..r10               Parameter registers
r13..r31              Local variables, scratch registers in expressions, pointer for "WITH" statements etc.
f0                    Scratch register, used only locally.
f1..f13               Parameter registers
f14..f31              Local variables, scratch registers in expressions, etc.
v0, v1, v14..v19      Scratch registers, used only locally.
v2..v13               Parameter registers
v20..v31              Local variables, scratch registers in expressions, etc.

The following additional conventions are used:
r31 If necessary, contains global pointer used for addressing a module's global variables (cf. section 6.3.4)
r30 If necessary, contains static link used for addressing variables of surrounding procedures (cf. section 6.3.4)

All registers used for local variables (r13 .. r31, f14 .. f31, v20 .. v31) are saved and restored within the called procedure. Setting a register using "SYSTEM.SETREGISTER" is also regarded as a "use" of the register. If the procedure or module has an exception handler, all of the registers listed are saved regardless.

5.3.8.2 Intel 32-Bit

The Intel registers are used by the compiler according to the runtime conventions:
eax, ecx, edx         Scratch registers, used only locally.
ebx                   Code base pointer
ebp                   Frame pointer
esp                   Stack pointer
st0..st7              Intermediate results in LONGDOUBLE-expressions
xmm0..xmm7            Intermediate results in REAL- and LONGREAL-expressions

The following additional conventions are used:
edi Contains global pointer used for addressing a module's global variables (cf. section 6.3.4)
esi If necessary, contains static link used for addressing variables of surrounding procedures (cf. section 6.3.4). Within methods on level 0, it contains the SELF parameter.

The registers ebx, esi, edi, and ebp are saved and restored within the called procedure.

5.3.8.3 Intel 64-Bit

The Intel registers are used by the compiler according to the runtime conventions:
rax, rcx, rdx, rsi, rdi, r8, r9, r10, r11
                      Scratch registers, used only locally.
rbx                   Code base pointer for PIC, scratch register otherwise
rbp                   Frame pointer
rsp                   Stack pointer
st0..st7              Intermediate results in LONGDOUBLE-expressions
xmm0..xmm15           Intermediate results in REAL- and LONGREAL-expressions

The following additional conventions are used:
r15 Contains global pointer used for addressing a module's global variables (cf. section 6.3.4)
r14 Within methods on level 0 and all nested procedures, it contains the SELF parameter.
r13 If necessary, contains static link used for addressing variables of surrounding procedures (cf. section 6.3.4).

The registers rbx, r12, r13, r14, and r15 are saved and restored within the called procedure.

5.3.9 Generation of Floating-Point Code

On PPC, floating-point numbers are always represented in IEEE double precision format when loaded into a register. Floating-point arithmetic for values of type "REAL" is therefore also executed in IEEE double precision format. Where possible, the 4-address instructions "fmadd" and "fmsub" are used. On Intel, floating-point arithmetic is always executed in the type of the operand(s).

5.3.10 Branch Control

The PowerPC instruction set allows a prediction to be specified as to whether a branch is likely to be taken or not. p1 Modula-2 makes use of this possibility with the following assumptions: The predictions for the branch instructions used to evaluate the control expressions of these statements are set according to these assumptions.

The above assumptions may be reversed by assigning the value "TRUE" to the pragma variable "InvertBranches" (cf. 4.2).
