5 Implementation

5.1 Implementation Documentation as per ISO 10514

Some properties, such as the memory requirements of the data types, cannot be standardized. The standard calls such quantities "implementation defined" and stipulates that each implementation must define them clearly. Further details on how the compiler operates should also appear in the compiler's documentation. Both kinds of information are summarized in this section.

5.1.1 General Information

5.1.2 Implementation-Defined Factors

5.1.3 Constant Expressions

Any standard function calls yielding a constant as result are allowed in a constant expression. Moreover, any functions from "SYSTEM" having this property are also allowed (cf. sections 2.8.2 and 4.1.4). For example:
        dif = VAL (CHAR, ORD ("a") - ORD ("A"));
        low = SYSTEM.CAST (CARDINAL, SYSTEM.CAST (BITSET, c) * BITSET {0..7});

Constant calculations involving real or complex numbers are carried out in IEEE double precision mode.

5.2 Restrictions

In the Arm implementation, the depth to which "WITH" statements and arithmetic expressions can be nested is limited by the number of registers available.

5.3 Implementation Details

5.3.1 Size and Alignment of Modula-2 Types

SYSTEM.LOC, SYSTEM.BYTE, CHAR, BOOLEAN, and enumeration types with up to 256 values:
  1 byte (8 bits)

SYSTEM.WORD16, SYSTEM.CARDINAL16, SYSTEM.INTEGER16, and enumeration types with more than 256 values:
  1 half word (16 bits)

  1 word (32 bits)

  1 doubleword (64 bits)

LONGREAL, COMPLEX, SYSTEM.BCD (15 nibbles for the decimal digits, one for the sign):
  1 doubleword (64 bits)

  2 doublewords (128 bits)

SYSTEM.STR255 (Pascal string of maximum size):
  256 bytes (not necessarily aligned to half-word boundary)

Subrange types:
If the pragma variable "FixedSubrangeSize" is set to "TRUE", subrange types take the size of the base type.
If the pragma variable "FixedSubrangeSize" is set to "FALSE", the size is determined by:
  If lower and upper bound reside within [0..255] or [-128..127]:
    1 byte (8 bits)
  If lower and upper bound reside within [0..65535] or [-32768..32767]:
    1 half word (16 bits)
  If lower and upper bound reside within [0..4294967295] or [-2147483648..2147483647]:
    1 word (32 bits)
  If lower and/or upper bound are outside these ranges:
    2 words (64 bits)
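
As an illustration of these rules, the following declarations show the sizes that would result with "FixedSubrangeSize" set to "FALSE". The module and type names are hypothetical, and the sizes in the comments are derived from the rules above, not measured:

```modula2
MODULE SubrangeSizes;

(* Hypothetical type names. The sizes in the comments follow from the
   rules above for FixedSubrangeSize = FALSE; they are not measured. *)
TYPE
  Digit  = [0 .. 9];         (* within [0..255]:         1 byte      *)
  Offset = [-128 .. 127];    (* within [-128..127]:      1 byte      *)
  Delta  = [-1 .. 255];      (* within [-32768..32767]:  1 half word *)
  Year   = [1900 .. 2100];   (* within [0..65535]:       1 half word *)
  Count  = [0 .. 100000];    (* within [0..4294967295]:  1 word      *)

END SubrangeSizes.
```

Note that "Delta" needs a half word even though it has only 257 values: its bounds fit neither [0..255] nor [-128..127].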

Array types:
Number of elements times the size of a single element.

Record types:
Sum of the sizes of all fields (taking the largest variant in each variant part), including any fill bytes needed to align individual fields to their alignment boundaries, plus a single fill byte appended to the end of the record where needed so that it occupies a whole number of half words. A record that occupies only one byte is not extended.
For example:

    Rec = RECORD          (* SIZE (Rec) yields the value 8. *)
        ch: CHAR;         (* After "ch" there is a fill byte, *)
        i: INTEGER;       (* since "i" is aligned to a half-word boundary. *)
        b: BOOLEAN;       (* Another fill byte follows "b", so that the *)
    END(*RECORD*);        (* record occupies a whole number of half words. *)
If the record directly or indirectly contains a vector field, its size is padded to an integral multiple of 16.

The ordinal values of the base type's bounds determine the memory-space requirements of SETs. If the difference between the upper and lower bounds is in the range 0 to 7, then a variable of this SET type occupies one byte; in all other cases an integral number of half words is used—exactly how many is given by the following formula:
  (ORD (MAX (base type)) - ORD (MIN (base type))) DIV 16 + 1 half words

For PACKEDSET types, only the ordinal value of the base type's upper bound determines the amount of memory needed. If this value lies between 0 and 7, the set occupies one byte; otherwise it takes up an integral number of half words according to the following formula:
  ORD (MAX (base type)) DIV 16 + 1 half words
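
The two size formulas can be checked with a small calculation. The sketch below is only an illustration: the procedure names are hypothetical, it assumes a half word of 2 bytes as stated in section 5.3.1, and it assumes that the second formula applies to PACKEDSET types, as the example in section 5.3.7 suggests:

```modula2
MODULE SetSizes;

FROM STextIO IMPORT WriteString, WriteLn;
FROM SWholeIO IMPORT WriteCard;

(* Size in bytes of a SET whose base type has the ordinal bounds
   lo .. hi (hypothetical helper, first formula). *)
PROCEDURE SetBytes (lo, hi: CARDINAL): CARDINAL;
BEGIN
  IF hi - lo <= 7 THEN
    RETURN 1
  ELSE
    RETURN 2 * ((hi - lo) DIV 16 + 1)
  END
END SetBytes;

(* Size in bytes when only the upper bound counts (hypothetical
   helper, second formula). *)
PROCEDURE PackedSetBytes (hi: CARDINAL): CARDINAL;
BEGIN
  IF hi <= 7 THEN
    RETURN 1
  ELSE
    RETURN 2 * (hi DIV 16 + 1)
  END
END PackedSetBytes;

BEGIN
  WriteString ("SET OF [100 .. 101]:       ");
  WriteCard (SetBytes (100, 101), 1); WriteLn;     (* 1 *)
  WriteString ("SET OF [0 .. 31]:          ");
  WriteCard (SetBytes (0, 31), 1); WriteLn;        (* 4 *)
  WriteString ("PACKEDSET OF [100 .. 101]: ");
  WriteCard (PackedSetBytes (101), 1); WriteLn     (* 14 *)
END SetSizes.
```

The last value agrees with the 14 bytes quoted for PACKEDSET OF [100 .. 101] in section 5.3.7.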

Object variables contain a pointer to the object; it occupies
  1 doubleword (64 bits)
The object itself takes up the same amount of memory as a record with the same fields (including the fields of all ancestor types), plus additional runtime information.

Variables of simple types occupying only one byte and variables of type SYSTEM.STR255 can be aligned to any byte boundary (i.e. any address is possible); the same is true for arrays of byte-alignable elements. Variables of all other types start on half-word, word, double-word, or quad-word boundaries (depending on the alignment conventions). Arrays have the same alignment as their elements. Constants are always aligned to word boundaries.

The alignment model can be modified by assigning to the pragma variable "AlignBorder":


5.3.2 Parameter Passing

For X86_64:
In accordance with the X86_64 runtime conventions, parameters are passed in registers or on the stack, always aligned on word boundaries. For parameters smaller than 8 bytes, only the first bytes are used; the following fill bytes are undefined.

For Arm_64:
In accordance with the Arm_64 runtime conventions, parameters are passed in registers or on the stack, always aligned on longword boundaries. For parameters smaller than 8 bytes, only the first bytes are used; the following fill bytes are undefined.

Any parameter larger than 8 bytes, except those of type "LONGCOMPLEX", is always passed to a procedure as an address. If this "reference" parameter is actually intended to be a value parameter, a copy of it is created within the procedure itself to avoid side effects.

5.3.3 Reference Parameters

It is conventional programming practice always to declare "ARRAYs" as "VAR" parameters, even if the parameter is never actually written to. This saves processing time and memory but masks the parameter's intended purpose.

p1 Modula-2 therefore provides the pragma variable "CopyRefparams". The copying of value parameters can be suppressed with the directive "ASSIGN (CopyRefparams, FALSE)" (or, shorter, "CopyRefparams (FALSE)"). This has the same effect as placing "VAR" before the formal parameter's name, except that constants and expressions can also be passed as actual parameters. In addition, the compiler can check that the parameter is not written to. However, the compiler cannot determine in all cases whether there really is no write access to the parameter. As a result, ill-considered use of this feature can lead to strange side effects (e.g. overwriting a string constant!). In standard mode such reference parameters are always copied.


Procedure declarations:

        MyRecord = RECORD
            c1: CARDINAL;
            c2: CARDINAL;
        END;

  1. <* CopyRefparams (FALSE) *>
     PROCEDURE test (r: MyRecord);
  2. <* CopyRefparams (TRUE) *>
     PROCEDURE test (r: MyRecord);
  3. PROCEDURE test (VAR r: MyRecord);
Procedure calls: Assignment to "r" within procedure "test":

5.3.4 Open Arrays

Open array parameters ("ARRAY OF ...") are passed as an address (1 word) and the maximal index in each dimension (each 1 word with offset +4*n, where "n" is the dimension: 1st: n = 1; 2nd: n = 2, etc.).

The address is passed as usual (i.e. in a register if available); the high indices are always passed on the stack.
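
From the programmer's point of view, the passed maximal index is presumably what the standard function "HIGH" yields. The following minimal sketch uses hypothetical names ("OpenArrayDemo", "Sum", "data") and shows that inside the procedure the indices run from 0 to HIGH (a), whatever the bounds of the actual parameter are:

```modula2
MODULE OpenArrayDemo;

FROM STextIO IMPORT WriteLn;
FROM SWholeIO IMPORT WriteCard;

VAR
  data: ARRAY [10 .. 14] OF CARDINAL;
  i: CARDINAL;

(* Adds up all elements of an open array. The formal parameter is
   indexed from 0 to HIGH (a), independent of the actual bounds. *)
PROCEDURE Sum (a: ARRAY OF CARDINAL): CARDINAL;
VAR
  k, total: CARDINAL;
BEGIN
  total := 0;
  FOR k := 0 TO HIGH (a) DO
    total := total + a[k]
  END;
  RETURN total
END Sum;

BEGIN
  FOR i := 10 TO 14 DO
    data[i] := i
  END;
  WriteCard (Sum (data), 1); WriteLn    (* 10 + 11 + 12 + 13 + 14 = 60 *)
END OpenArrayDemo.
```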

5.3.5 Value of "NIL" / "EMPTY" / Undefined Variables

For "NIL" and "EMPTY" the value 0 is used.

The initial value of all variables is undefined. The compiler attempts to detect the use of undefined variables statically and issues a warning when it finds one.

5.3.6 Arrangement of Bits in SET Variables

The bits of SET variables having a base type (or universe) comprising no more than 64 values are arranged like the bits in the Intel or Arm registers (LSB corresponds to lowest value of base type). Sets with larger base types are allocated bytewise:
  byte position = (bit number - offset) DIV 8
  bit position = (bit number - offset) MOD 8
  bit number = ORD (base type value)
  offset = 0 if ORD (MIN (base type)) IN {0 .. 7}
  offset = ORD (MIN (base type)) otherwise
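
As a worked example of the bytewise layout, the sketch below applies these formulas to one element of a large set. All names ("SetBitPosition", "Locate", "minOrd" and so on) are hypothetical, and the code only evaluates the formulas; it does not inspect memory:

```modula2
MODULE SetBitPosition;

FROM STextIO IMPORT WriteString, WriteLn;
FROM SWholeIO IMPORT WriteCard;

VAR
  byteIdx, bitIdx: CARDINAL;

(* Evaluates the formulas above. "value" is ORD of the element and
   "minOrd" is ORD (MIN (base type)). *)
PROCEDURE Locate (value, minOrd: CARDINAL; VAR bytePos, bitPos: CARDINAL);
VAR
  offset: CARDINAL;
BEGIN
  IF minOrd <= 7 THEN
    offset := 0
  ELSE
    offset := minOrd
  END;
  bytePos := (value - offset) DIV 8;
  bitPos := (value - offset) MOD 8
END Locate;

BEGIN
  (* Element 117 of a SET OF [100 .. 299] (200 values, more than 64). *)
  Locate (117, 100, byteIdx, bitIdx);
  WriteString ("byte "); WriteCard (byteIdx, 1);     (* 2 *)
  WriteString (", bit "); WriteCard (bitIdx, 1);     (* 1 *)
  WriteLn
END SetBitPosition.
```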

5.3.7 Arrangement of Bits in PACKEDSET Variables

The MSB is in the byte with the highest memory address and the LSB in the byte with the lowest memory address. The following relationship holds:

The position of a single bit can be determined as follows:
  byte position = maxbytes - bit number DIV 8
  bit position = bit number MOD 8
  byte position = bit number DIV 8
  bit position = bit number MOD 8
where bit number = ORD (base type value) and maxbytes = ORD (MAX (base type)) DIV 8. This means that, for example, a PACKEDSET OF [100 .. 101] takes up not 1 but 14 bytes. Bits 0 .. 99 and 102 .. 111 are also always present, but their values are undefined.

The sign of a number of type "LONGREAL" can be determined like this:

        LongSet = PACKEDSET OF [0 .. 63];
        r: LONGREAL;

        … 63 IN CAST (LongSet, r) …
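
Put into a complete module, the fragment above might look as follows. This is a sketch only: the names "SignBit" and "SignBitSet" are hypothetical, and it assumes that "LONGREAL" is a 64-bit IEEE double whose sign is bit 63:

```modula2
MODULE SignBit;

FROM SYSTEM IMPORT CAST;
FROM STextIO IMPORT WriteString, WriteLn;

TYPE
  LongSet = PACKEDSET OF [0 .. 63];

(* TRUE if bit 63 of r is set, which is assumed to be the sign bit.
   Note that this also holds for negative zero. *)
PROCEDURE SignBitSet (r: LONGREAL): BOOLEAN;
BEGIN
  RETURN 63 IN CAST (LongSet, r)
END SignBitSet;

BEGIN
  IF SignBitSet (-1.5) THEN
    WriteString ("sign bit set")
  ELSE
    WriteString ("sign bit clear")
  END;
  WriteLn
END SignBit.
```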

5.3.8 Register Usage

Arm

The Arm registers are used by the compiler according to the runtime conventions:
  x0..x7      Parameter registers, in leaf procedures used locally.
  x8          Address register for large procedure results.
  x9..x15     Temporary registers, used in expression evaluation and WITH statements.
  x16, x17    Used locally by the compiler.
  x18         Used by the system exclusively.
  x19..x25    Preserved registers, used for local variables.
  x26         SELF register.
  x27         Static link register.
  x28         Global pointer.
  x29         Frame pointer.
  x30         Link register.
  x31         Stack pointer.
  v0..v7      Parameter registers.
  v8..v15     Preserved registers, used for local variables.
  v16..v31    Temporary registers, used in expression evaluation.

All registers used for local variables (x19 .. x28, v0 .. v15) are saved and restored within the called procedure. If the procedure or module has an exception handler, all of the registers listed are saved regardless.

Intel

The Intel registers are used by the compiler according to the runtime conventions:
  rax, rcx, rdx, rsi, rdi, r8, r9, r10, r11
              Scratch registers, used only locally.
  rbx         Code base pointer for PIC, scratch register otherwise.
  rbp         Frame pointer.
  rsp         Stack pointer.
  st0..st7    Intermediate results in LONGDOUBLE expressions.
  xmm0..xmm15 Intermediate results in REAL and LONGREAL expressions.

The following additional conventions are used:
  r15         Contains the global pointer used for addressing a module's global variables (cf. section 6.3.4).
  r14         Within methods on level 0 and all nested procedures, contains the SELF parameter.
  r13         If necessary, contains the static link used for addressing variables of surrounding procedures (cf. section 6.3.4).

The registers rbx, r12, r13, r14, and r15 are saved and restored within the called procedure.

5.3.9 Generation of Floating-Point Code

On Arm, floating-point arithmetic is executed in the type of the operand(s). If possible, and if optimization is "full", the 4-address instructions "fmadd" and "fmsub" are used. On Intel, floating-point arithmetic is likewise executed in the type of the operand(s).