4 Extensions to the Language in p1 Modula-2

p1 Modula-2 incorporates a number of extensions with respect to the ISO standard. They comprise access to operating system functions, code written in other languages, p1 specific data types etc.

4.1 Extensions to the Module "SYSTEM"

4.1.1 Extensions to the Pseudo Definition-Module

The extensions to the module "SYSTEM" are summarized in the following pseudo definition module. As might be expected, an exact representation is not possible using Modula-2 language constructs alone.
DEFINITION MODULE SYSTEM;
(* Extensions with respect to the standard *)
CONST
  SOURCELINE = (* a constant containing the actual source line number. *)
  SOURCEFILE = (* a constant containing the name the actual source file. *)
TYPE
  INT8 = [-128 .. 127];
  CARD8 = [0 .. 255];
  INT16 = [-32768 .. 32767];
  CARD16 = [0 .. 65535];
  INT32 = [-2147483648 .. 2147483647];
  CARD32 = [0 .. 4294967295];
  INT64 = [-9223372036854775808 .. 9223372036854775807];
  CARD64 = [0 .. 18446744073709551615];
  ADRCARD = CARDINAL;
  WORD8 = LOC;
  WORD16 = ARRAY [0 .. 1] OF LOC;
  WORD32 = ARRAY [0 .. 3] OF LOC;
  WORD64 = ARRAY [0 .. 7] OF LOC;
  SET8 = PACKEDSET OF [0 .. 7];
  SET16 = PACKEDSET OF [0 .. 15];
  SET32 = PACKEDSET OF [0 .. 31];
  SET64 = PACKEDSET OF [0 .. 63];
  LONGDOUBLE = (* 128-bit double reals *)
  DOUBLECOMPLEX = (* 2 components each 128-bit double real *)
  BCD = (* 8 bytes: 15 + 1 nibbles for digits + sign *)
  STR255 = (* 256 bytes: largest possible Pascal string *)

PROCEDURE REGISTER (r: CARDINAL): <CARDINAL/LONGREAL/VECTOR OF REAL>
(* Delivers the contents of register r *)

PROCEDURE SETREGISTER (r: CARDINAL; x: <WORD32/LONGREAL/ANY VECTOR>);
(* Sets register r to the given value *)

PROCEDURE CODE (x: <CARD32 constant>);
(* Places the value x as a 32-bit word directly into the code *)

PROCEDURE ASSEMBLER (text: ARRAY OF WORD);
(* Places the given text as one line into the emitted assembler source *)

PROCEDURE INCADR (VAR addr: ADDRESS; offset: CARDINAL);
(* Increments addr by offset (or 1 if offset is omitted. *)

PROCEDURE DECADR (VAR addr: ADDRESS; offset: CARDINAL);
(* Decrements addr by offset (or 1 if offset is omitted. *)

PROCEDURE TOSTR255 (str: ARRAY OF CHAR): STR255;
(* Converts a Modula-2 string into a Pascal string *)

PROCEDURE FROMSTR255 (str255: STR255; VAR str: ARRAY OF CHAR);
(* Converts a Pascal string into a Modula-2 string *)

PROCEDURE BREAK;
(* Calls the Modula-2 debugger *)

PROCEDURE SETEXITCODE (x: INTEGER);
(* Sets the return code for the program to x *)

PROCEDURE ROUND (x: <real-number expression>): INTEGER;
(* rounds x to the nearest integer value *)

PROCEDURE CLONE (VAR target: <class type>; source <class type>);
(* generates an exact copy of the objects referenced by source *)

PROCEDURE OFFS f: <qualified record field>): CARDINAL;
(* calculates the offset of f from the beginning of the record in LOCs *)

END SYSTEM.

4.1.2 Additional Constants

SOURCELINE
This constant always reflects the number of the source line in which this constant is used.

SOURCEFILE
This constant contains the name of the source file currently compiled.

4.1.3 Additional Data Types

To make accessing routines from the operating system or other languages easier, p1 Modula-2 provides several additional data types in the module "SYSTEM".

4.1.3.1 Types with Universally Fixed Size

These types were created so that variables of the right size are available for use with external routines. These types could in fact be defined using standard Modula-2, but as they are needed so often, defining them centrally avoids unnecessary imports (the compiler deals with imports from "SYSTEM" internally).

Each data type described here (except "ADRCARD") includes its size in memory in bits at the end of its name.

INT8, CARD8, INT16, CARD16
These types are defined as subranges ([-128 .. 127], [0 .. 255], [-32768 .. 32767], [0 .. 65535]) of "INTEGER" or "CARDINAL" and are therefore compatible to them.

INT32, CARD32
On 32 bit architectures, these are pseudonyms for "INTEGER" and "CARDINAL" respectively and should be used whenever their 32-bit size in memory is to be documented. On 64 bit architectures, these are proper subranges of "INTEGER" resp. "CARDINAL" and thus compatible to them.

INT64, CARD64
These types offer 64-bit ordinal arithmetic as it is used e.g. for calculation positions in large files. In general "INT64" and "CARD64" provide for the same features as "INTEGER" resp. "CARDINAL" with the small restrictions explained below.
On 64-bit architectures (x86_64) these types are synonyms for "INTEGER" resp. "CARDINAL" and no restrictions apply.
On 32-bit architectures (ppc, i386) the types "INT64" and "CARD64" are no superranges of "INTEGER" resp. "CARDINAL". So no subranges may be derived from "INT64/CARD64". But as "INT64/CARD64" are compatible to "INTEGER/CARDINAL" no explicit conversion with "VAL" has to be used; all allowed conversions from and to "INTEGER/CARDINAL" are also applicable to "INT64/CARD64".
The zz-type covers the range from MIN(INT64) to MAX(CARD64), so constant arithmetic is executed in 64-bit mode even for 32-bit targets. If a constant value exceeds the range for "INTEGER/CARDINAL" it is not compatible to expressions of these types but may be used in expressions of type "INT64/CARD64".

WORD8, WORD16, WORD32, WORD64
These are defined as "ARRAY [0 .. n-1] of SYSTEM.LOC" (for n = 1, 2, 4, 8 respectively) and as a result have all the compatibility properties of type "WORD"
(on 32-bit architectures "WORD32" is identical to "WORD", on 64-bit architectures "WORD64" is identical to "WORD").

SET8, SET16, SET32, SET64
These types are defined as "PACKEDSET OF [0 .. n-1]" (for n = 8, 16, 32, 64 respectively). They are intended for use as set types occupying a predefined amount of space in memory.

ADRCARD
Subrange of "CARDINAL" having the same memory space requirement as "ADDRESS". For all actually supported architectures ADRCARD is a synonym for CARDINAL.

4.1.3.2 HANDLE

In the same way that "ADDRESS" serves as universal pointer type, "HANDLE" is provided as universal handle type in the sense of a "handle" in the Macintosh operating system. Implicitly declared as "POINTER TO ADDRESS", this type is compatible to all handle types (i.e. all types of the form "POINTER TO POINTER TO …"), like "ADDRESS" is to all pointer types.

4.1.3.3 STR255

Many Macintosh-specific routines have Pascal-style string parameters (first byte specifies length). "STR255" represents a Pascal string of maximum size as used in most of these routines. Normal string constants are assignment compatible to this type—they do not need to be converted explicitly.

The standard function "LENGTH" is applicable to variables and constants of type "STR255", its result is the actual length of the string.

For converting between variables of this type and Modula-2 strings "SYSTEM" also furnishes the subroutines "FROMSTR255" and "TOSTR255" (cf. 4.1.4).

The type "STR255" may be indexed. The elements are of type "CHAR". The element with index "0" represents the length of the string (i. e. the following equation holds "ORD (str [0]) = LENGTH (str)").

4.1.3.4 Additional Arithmetic Types

LONGDOUBLE, DOUBLECOMPLEX
For special purposes the runtime provides for floating point numbers with a size of 128 bit. "LONGDOUBLE" represents this type in p1 Modula-2. "DOUBLECOMPLEX" represents a complex number with real and imaginary parts in this format. The same rules apply to the types "LONGDOUBLE" and "DOUBLECOMPLEX" as to "REAL" / "LONGREAL" and "COMPLEX" / "LONGCOMPLEX" respectively, i.e. in particular that the four basic arithmetical operations are allowed and that the functions "VAL", "RE", "IM" and "CMPLX" are respectively applicable. Floating point constants of variable type ("RR type") are compatible to "LONGDOUBLE" and complex constants of variable type ("CC type") are compatible to "DOUBLECOMPLEX".
ATTENTION: In ppc arithmetic operations on the types "LONGDOUBLE" / "DOUBLECOMPLEX" are performed with 64 bits, the lower bits of the fraction are set to 0.

BCD
"BCD" is a floating point type represented in decimal form (binary coded decimal) with 15 significant digits. Since the decimal point can be anywhere in the number, absolute values ranging from 0.000000000000001 to 999999999999999.0 are possible. The type covers the range of values -999999999999999.0 to 999999999999999.0 overall.
"BCD" belongs to the scalar types and the four basic arithmetical operations can be used with it. Conversions between expressions of type "BCD" and expressions of other types are possible using "VAL". In doing so, the same restrictions apply as for corresponding conversions involving the type "REAL" (e.g. no direct conversions to or from enumerated types). "MIN", "MAX" and "ABS" can be used.
Constants of type "BCD" are denoted by an appended "$". Exponential format is not allowed. Examples:
1.34$legal
10000000$illegal (no decimal point)
1.8E5$illegal (exponent not allowed)
9876543210.987654$illegal (too many digits)

4.1.4 Additional Functions and Procedures

PROCEDURE REGISTER (r: CARDINAL): CARDINAL/LONGREAL/VECTOR OF REAL;
The contents of register "r" are returned.
For ppc:
Values of "r" from 0 to 31 refer to the integer registers r0 to r31, whereas values from 32 to 63 refer to the fpu registers f0 to f31 respectively. Values from 64 to 95 are valid only for processors with AltiVec instruction set, here they refer to the vector registers v0 to v31.
For i386:
Values of "r" from 0 to 7 refer to the integer registers eax, ebx, ecx, edx, esi, edi, ebp, and esp (in this order), whereas values from 8 to 15 refer to the fpu registers st0 to st7 respectively. Values from 16 to 23 denote mm0 through mm7; values from 24 to 31 denote xmm0 through xmm7 (which are used for REAL and LONGREAL arithmetic).
Access to other registers is not possible.
For x86:
Values of "r" from 0 to 15 refer to the integer registers rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, ... , r15 (in this order), whereas values from 16 to 23 refer to the fpu registers st0 to st7 respectively. Values from 24 to 31 denote mm0 through mm7; values from 32 to 39 denote xmm0 through xmm7 (which are used for REAL and LONGREAL arithmetic).
Access to other registers is not possible.
Only constant values are allowed as parameter.

PROCEDURE SETREGISTER (r: CARDINAL; x: WORD/LONGREAL/ANY VECTOR);
Sets the register specified by "r" to the value passed in the second parameter. For ppc, this register is then added to the list of registers to be saved by the surrounding procedure. When setting registers care must be taken that the register has not already been put to use by the compiler. Besides that, the register might be overwritten in the next statement (see also 5.3.9 and 5.3.10)! Only constant values are allowed as first parameter.

PROCEDURE CODE (x: CARD32);
The parameter's value is placed in the code as a 32-bit word at compile time, i.e. this is only a pseudo procedure, as no procedure call as such is generated. Only constants are allowed as parameters. This function is very dangerous; the compiler does not undertake checks of any kind. A special TRAP command could be inserted using this "procedure" for example.

PROCEDURE ASSEMBLER (text: ARRAY OF CHAR);
The parameter's value is placed in the assembler output file at compile time, i.e. this is only a pseudo procedure, as no procedure call as such is generated. Only constants are allowed as parameters. Care must be taken when using this procedure as the compiler does not undertake checks of any kind.

PROCEDURE TOSTR255 (str: ARRAY OF CHAR): STR255;
Converts an arbitrary Modula-2 string to type STR255 (largest possible Pascal string type, up to 255 characters). If the string passed is too long it is simply truncated without error notification. This function procedure is allowed in constant declarations.

PROCEDURE FROMSTR255 (str255: STR255; VAR str: ARRAY OF CHAR);
Converts a value of type "STR255" (largest possible Pascal string type) into a Modula-2 string. If the array specified is not large enough to accommodate the entire resulting string, the overflowing characters are simply discarded without error notification.

PROCEDURE BREAK;
"BREAK" calls the error dialog of the p1 Modula-2 run-time system as a subprogram. The debugger can be intentionally brought into action this way for example. The program can be continued normally afterwards.

PROCEDURE SETEXITCODE (x: INTEGER);
Sets up the value of the variable "x" as return code for the shell. In this way the shell can be informed to skip a certain series of commands following the program when an error occurs.

PROCEDURE ROUND (x: <floating point expression>): INTEGER;
The value of "x" is rounded to the nearest whole number (as specified by the processor, not necessarily the mathematical definition for rounding of x.5 !). An exception is raised if the value cannot be represented as number of type "INTEGER".

PROCEDURE CLONE (VAR target: <class type>; source: <class type>);
An exact copy is made of the object referenced by "source" and a reference to this new object is stored in "target". If "source" contains the value "EMPTY" or if not enough storage for the copy is available, "EMPTY" is returned in "target". For "source" the type of "target" is used so that all classes assignment compatible to the type of "target" are allowed for "source". If "target" specifies an untraced class, "ALLOCATE" has to be visible as with "CREATE".

PROCEDURE OFFS (f: <qualified record field>): CARDINAL;
The offset of the given record field "f" from the beginning of the record is calculated. Fields of deeper nested records may be specified (multiple qualification).

4.1.5 Additional Features of Predefined Functions and Procedures

PROCEDURE ADR (x: ): ADDRESS;
The address of the given constant or procedure is returned.

4.2 Pragmas

The ISO standard makes a clear distinction between comments (enclosed between "(*" and "*)") and compiler directives (pragmas, enclosed between "<*" and "*>"). It does not define the form and function of the pragmas in any more detail however. p1 Modula-2 follows a recommendation of the DIN team NI22.13 responsible for the standardization of Modula-2. This recommendation is described below.

The syntax of pragmas in EBNF:
pragma="<*" single_pragma { single_pragma } "*>"
single_pragma=[ assignment | definition | environment | save | condition | ";" ]
assignment=known_variable "(" value ")" | "ASSIGN" "(" known_variable "," value ")"
environment="ENVIRON" "(" unknown_variable "," value ")"
definition="DEFINE" "(" unknown_variable "," value ")"
save="PUSH" | "POP"
condition="IF" boolean_expression "THEN" { "ELSIF" boolean_expression "THEN" } [ "ELSE" ] "END"
boolean_expression=cond_expr [ ( "=" | "<>" | "#" ) cond_expr ]
cond_expr=cond_term { "OR" cond_term }
cond_term=cond_factor { "AND" cond_factor }
cond_factor=[ "NOT" ] cond_factor | value | "(" boolean_expression ")"
known_variable=identifier
unknown_variable=identifier
identifier=any legal Modula-2 identifier
value=any legal Modula-2 string | "TRUE" | "FALSE" | predefined_constant | identifier
predefined_constant="ALIGN1" | "ALIGN2" | "ALIGN4" | "ALIGN8" | "ALIGN16" | "ISA68k" | "ISAPPC" | "ISAi386" | "ppc" | "i386" | "68k"
Note:
The elements of a "condition" may be spread over more than one pragma.

The following pragma variables are already known to the compiler (cf. 4.3.3):
NameMeaningNotes
IndexCheckMonitor array indices (for under-/overflow)1, 4
PointerCheckTest for NIL pointer / EMPTY reference (on deref)1, 4
RangeCheckMonitor subrange values (for under-/overflow)1, 4
ComplexCheckSeparate checks for complex arithmetic (see below)1, 4
OverflowCheckTest for overflow in whole-number arithmetic1, 4
DivZeroCheckTest for division by 0 
StackCheckTest for stack overflow4
WarningsInclude warnings in compiler error-output 
CopyRefparamsCopy reference (i.e. value) parameters1, 5
PragmaCheckIssue warnings if unknown pragmas encountered 
SpecialStringsInterpret "\" as meta-symbol in strings2
CastRegisterType casting in registers (see below) 
FixedSubrangeSizeSize of subrange types (cf. 5.3.1)7
VectorAltiVec extensions (cf. 4.6)6
FloatoptFloating point optimization (see below) 
ForeignGenerate foreign module2
UpperCaseNamesCompletely capitalize variable names for linker3
CallingUse different calling sequence (see below) 
ReferenceReference parameter (see below) 
ForeignEntryExternal module initialization (cf. 4.4.2.3) 
EntryNameLinker name (see below) 
AlignBorderAlignment control (see below) 
InvertBranchesControl of branch prediction (cf. 5.3.14) 
ISAtarget code6
ARCHtarget architecture6
OptLevelOptimization level (see below) 
p1Always TRUE6, 7
StonyBrookAlways FALSE6, 7

The value of the pragma variable "CastRegister" effects only ppc code as for Power processors, byte ordering differs between registers and memory. If the value "TRUE" is assigned to the pragma variable "CastRegister", then casting of ordinal types is performed according to register allocation conventions instead of memory allocation conventions. As variables are allocated left justified in memory but right justified in a register, a call of e. g. "CAST (CAST16, card32variable)" accesses the two rightmost bytes in register, but the two leftmost bytes in memory.

The values ""CCalling"", ""CByRef"", ""Pascal"" and ""Default"" can be specified for the pragma variable "Calling" (cf. 4.4.1).

If "TRUE" is assigned to the pragma variable "Reference" the following (value) parameter is passed by reference (cf. 5.3.4).

Any string which is usable as an external linker name can be specified for the pragma variable "EntryName". (see also section 4.4.2.1 / 4.4.2.2).

The values "ALIGN1", "ALIGN2", "ALIGN4", "ALIGN8", and "ALIGN16" can be specified for the pragma variable "AlignBorder" (cf. 5.3.1).

The pragma variable "ARCH" allows to check for the target architecture. Actually, the values "ppc" (PowerPC code generation), "i386" (Intel 32-bit code generation), and "x86" (Intel 64-bit code generation) may be returned.
These values are meaningful if the C back end is used too.

The pragma variable "ISA" (Instruction Set Architecture) is still supported for backward compatibility. Actually, the values "ISAPPC" (PowerPC code generation), "ISAi386" (Intel 32-bit code generation), and "ISAx86" (Intel 64-bit code generation) may be returned.

The pragma variable "OptLevel" allows to check for or set the actual optimization level. In the current implementation the values 0, 1, and 2 are accepted (cf. section 1.2).

If the pragma variable "Floatopt " is set to "TRUE ", multiply add / multiply subtract instructions are used in PowerPC code where possible.

All the other pragma variables take logical values.

If the pragma variable "ComplexCheck" is set to "TRUE", additional code is generated to distinguish between exceptions in real arithmetic and exceptions in complex arithmetic. If this variable is set to "FALSE", complex arithmetic exceptions are reported as real arithmetic exceptions. In standard mode "ComplexCheck" always has the value "TRUE".

Notes:

  1. In standard mode always has the value "TRUE".
  2. In standard mode always has the value "FALSE".
  3. Can only be used when "Foreign" is set to "TRUE". This means that in standard mode "ASSIGN (UpperCaseNames, FALSE)" and "ASSIGN (Calling, "Default")" are both implicitly set without warning cf. 4.4.1).
  4. The corresponding run-time tests can be turned on ("TRUE") and off ("FALSE") using these pragmas.
  5. Controls the way large value parameters are passed (cf. 5.3.4).
  6. Read only variable.
  7. Useful if the same sources are also compiled with other compilers, especially Stony Brook Modula-2.

4.3 Conditional Compilation

Individual sections of source text can be conditionally included or excluded from the compilation using special pragmas. Any pragmas contained in such a section of source text are also subject to this inclusion or exclusion as the case may be. Conditional-compilation directives are subject to the pragma syntax described in section 4.2, which means they are also enclosed in "<*" and "*>". Several such directives, optionally separated by semicolons, can be included, as can be seen from the syntax, in a single pragma. Pragmas can be freely intermingled.

Compilation is controlled by so-called compile-time variables which can be assigned values using special pragmas (and also from the command line). Their values can then be determined using relational expressions. The values "TRUE" and "FALSE" have special meanings. Variables which can take on these and only these values behave like "BOOLEAN" variables. Expressions used in conditions always have to have the type "BOOLEAN".

4.3.1 Syntax of Conditional Compilation Directives

DEFINE (name, value)Declare the variable "name" and initialize it with the value "value". A logical variable is defined if "value" is "TRUE", "FALSE" or another logical variable, in all other cases a string variable is defined.
ASSIGN (name, value)Assign the value "value" to the variable "name".
ENVIRON (name, value)Declare the variable "name" and assign it a value from the command line. If the command line does not contain a value for it, "name" is assigned the value "value". "value" determines the type of the variable.
IF boolExpression1 THENCompile the subsequent text if "boolExpression1" is true.
ELSIF boolExpression2 THENCompile the subsequent parts of the program if previous boolean expression(s) were not true but boolExpression2 is. This pragma is only allowed after a previous "IF booleanExpression THEN" pragma.
ELSECompile the subsequent text if previous boolean expression(s) were not true. This pragma is only allowed after a previous "IF booleanExpression THEN" pragma.
ENDEnd of the conditionally-compiled program text. Every "IF booleanExpression THEN" pragma must be matched by a concluding "END" pragma.
value1 = value2This boolean expression is true if value1 is equal to value2.
value1 <> value2This boolean expression is true if value1 is not equal to value2.
nameSimplified form of "name = TRUE"; "name" must be a logical variable.
NOT valueNegates "value". "value" has to be "TRUE" or "FALSE".
value1 AND value2Logical and, "value1" and "value2" have to be "TRUE" or "FALSE".
value1 OR value2Logical or, "value1" and "value2" have to be "TRUE" or "FALSE".
(boolExpression1)Bracketing of boolean expressions.
"Conditions" (in the sense used in the EBNF syntax above) may be nested in the same way that Modula-2 "IF" statements can be nested.

4.3.2 Compile-Time Variables

The names of compile-time variables may consist of up to 32 characters excluding spaces (" "). Case is significant (i.e. upper and lower case letters are differentiated). New compile-time variables are generated using "DEFINE" and given an initial value. Using "DEFINE" enables the compiler to check whether the name already has some other meaning (e.g. a predefined pragma variable like for example "CopyRefparams"). "ENVIRON" enables variables to be declared like with "DEFINE" and at the same time assigned a value from the command line. Variables which are referred to in conditions must be known to the compiler, i.e. they must either be internal variables or have been explicitly declared using "DEFINE" or "ENVIRON". This makes it possible for the compiler to issue a warning if it encounters an undefined compile-time variable (e.g. due to a typing error).

4.3.3 Examples

<* IF TEST THEN *> WriteCard (x, 1); <* END *>
"WriteCard …" is only compiled if the compile-time variable "TEST" has the value "TRUE".

FROM <* IF LONG THEN *> LongMath <* ELSE *> RealMath <* END *> IMPORT sqrt;
The function "sqrt" is imported either from "LongMath" or from "RealMath" depending on the value of compile-time variable "LONG".

<* IF RANGE THEN RangeCheck (TRUE); END *>
If "RANGE" is specified, subrange monitoring is turned on.

<* ENVIRON (TARGET, "MAC") *>
The variable "TARGET" is defined and given the value ""MAC"", unless it was assigned a value in the command line, in which case it receives that value.

<* IF NOT test THEN *> x := 1; <* END *>
The assignment will only be included in the program if the compile-time variable "test" has the value "FALSE".

4.4 Interfacing to Other Languages

4.4.1 Calling Foreign Code Modules from Modula-2

p1 Modula-2 provides pragmas for defining so-called interface modules. Such modules serve to enable software written in other languages (e.g. system APIs) to be included in a program.

An interface module consists solely of a definition module. It is in fact impossible to compile a corresponding Modula-2 implementation module (the foreign software takes its place after all). Textually an interface module differs from a simple definition module only in having the normal module head preceded by the pragma "<* ASSIGN (Foreign, TRUE) *>". For example:

    <* ASSIGN (Foreign, TRUE) *>
    DEFINITION MODULE Memory;

All programming languages which pass parameters via standard conventions are supported. The programmer is responsible for ensuring that the interface ties in correctly by appropriately mapping parameter types (and possibly also their order) in the foreign software segment. One limitation worth mentioning: no other popular programming language has anything corresponding to Modula-2's concept of open arrays. (see also sections 5.3.2 / 5.3.3 parameter passing.)

In particular, two different kinds of procedure interface are supported by the compiler:

  1. The programming language C
  2. The programming language Pascal

Examples for interfacing to Pascal are given by the Universal Definition Files which are (for the most part automatically) derived from the universal header files written in Pascal.
The definition files for HIToolbox give an example for converting C header files to Modula-2. They contain the original C code as comments, so that the more complex type matching can be followed.

4.4.1.1 Interfacing to C

4.4.1.2 Interfacing to Pascal

4.4.2 Calling Modula-2 Procedures from Other Languages

As it is possible to call procedures written in other languages from a Modula-2 program, it is also possible to call procedures written in Modula-2 from a program written in an other language. Examples for this section can be found in the folder "M2Examples/Foreign main/" and the Cococa examples.

In this case, the main procedure must not be defined in the Modula-2 runtime, so the Modula-2 library has to be replaced by "libm2libforeign.a".

4.4.2.1 Calling Conventions

The pragma variable "Calling" can be used (analogous to the description for interface procedures) to adapt the calling convention of other languages to Modula-2 procedures. This is achieved simply by writing the procedure heading in the implementation module (and in the case of exported procedures also in the definition module) in the following way:
    <* ASSIGN (Calling, "CCalling") *>
    PROCEDURE name (...);

Similarly, procedure types can be declared like this:

    <* ASSIGN (Calling, "CCalling") *>
    TYPE name = PROCEDURE (...);

A procedure type of this kind is not compatible to procedures with default calling conventions. If only the address shall be passed for value parameters of structured types, the calling convention "CByRef" may be specified instead of "CCalling". In this case the parameters have to be declared as "const MyType *", "const MyType&", or something similar in the corresponding C header file.

Modula-2 procedures which conform to C calling conventions (like the examples above) can also be called from Modula-2 without taking any special precautions: the compiler automatically takes these conventions into account when such procedures are called.

The pragma variable "EntryName" allows to specify a linker name in addition to the internal compiler name, e.g. the original procedure name:

    <* ASSIGN (Calling, "CCalling") *>
    <* ASSIGN (EntryName, "MyExternalName") *>
    PROCEDURE MyName (...);

Defined this way, the procedure "MyName" may be called from a C program also via "MyExternalName".

4.4.2.2 Runtime- and module initialization

If the main program is not written in Modula-2, the Modula-2 runtime system is not automatically linked to the program and therefore not initialized. It must be also noted that the initialization parts (if any) of those Modula-2 modules are not necessarily executed.

When required, the initialization of the Modula-2 runtime system (including the execution of the module initialization parts) can be invoked manually. The runtime system provides for the following auxiliary procedures (the headings are written according to Modula-2 conventions):

To define a starting point for the initialization chain inside the desired implementation module, the pragma "<* ASSIGN (ForeignEntry, TRUE) *>" has to be placed inside this module. The pragma must be specified before the initialization part of the module and may be specified only for one module in the whole program. If no such pragma is specified at all, the linker will issue an error message for the undefined symbol "_M2_START_CHAIN".

4.5 Dynamic Arrays

It is a common problem to need some array the size of which cannot be determined until run-time. Known tricks are to use an array large enough for all possible data and to store the top index in a separate variable or to allocate the array on heap using the type "POINTER TO ARRAY [0 .. 0] OF ..." and switching off index check. Both ways are clumsy and unsafe. Dynamic arrays, the element number of which is specified at runtime, solve the problem. Space for dynamic arrays is allocated from heap. This is the reason that dynamic arrays are not declared directly, only pointers to dynamic arrays are possible.

4.5.1 Declaration of the Data Structure

The data structure for one dimensional dynamic arrays has the form:
POINTER TO ARRAY OF ElementType

For multi-dimensional arrays "ARRAY OF" is specified according to the number of dimensions. The declaration of dynamic arrays is a normal type declaration, it may be used in any place a type declaration is possible.

Like with open arrays the index range starts at "0", is a subrange of "CARDINAL", and reaches up to the top index specified on allocation (cf. 4.5.2).

4.5.2 Allocation / Deallocation of Storage

For dynamic arrays storage is allocated by a special call to "NEW"; in addition to the pointer variable as first parameter a top index has to be specified for each dimension (order of notation), e.g.:
VAR
    matrix: POINTER TO ARRAY OF ARRAY OF REAL;
BEGIN
    NEW (matrix, 5, 10);
allocates storage for a two-dimensional array with the actual type of "ARRAY [0 .. 5] OF ARRAY [0 .. 10] OF REAL".

Using "ALLOCATE"directly to allocate storage for a dynamic array is not allowed and results in an illegal program as the data structure for the index descriptors is not set up, though the compiler cannot check whether "ALLOCATE" is used directly to allocate storage.

Deallocation of storage is done by "DISPOSE" as usual.

4.5.3 Operations on Dynamic Arrays

The following operations are legal for dynamic arrays (similar to open arrays):

4.6 Support for the AltiVec instruction set

The AltiVec instruction set for vector operations available on G4 and G5 processors is supported by p1 Modula-2 in a wide range. The new key word "VECTOR" is used to define the according data types. As far as appropriate the predefined index operations are overloaded for vector types. Additionally, procedures or functions are defined in the pseudo module "VECTORS" to make available each machine instruction. This functions expand with one exception to exactly one AltiVec instruction so that no code overhead is produced. The necessary alignment to 16 byte boundaries for vector variables is forced automatically; only for dynamically allocated storage the allocation mechanism has to guarantee alignment on 16 byte boundaries (cf. 4.6.3).

4.6.1 New language elements for vector operations

To make the language elements for vector operations available, the given module has to be compiled with the command line option "-VECTOR". The pragma variable "Vector" reflects the value of this command line option to allow conditional compilation depending on activated vector support.

Vector data types are very similar to array types. The type declaration has the form:
VECTOR OF ElementType
where "ElementType" denotes one of the following types: "REAL", "CARD8", "CARD16", "CARD32", "INT8", "INT16", "INT32", "SET8", "SET16", "SET32", or any compatible type.

Access to vector elements is possible by indexing, the index range is
[0 .. TSIZE (VectorType) / TSIZE (ElementType) - 1].
Value constructors are supported, the syntax is equal to the syntax of value constructors of arrays.

The following operators are overloaded for vector operations with the given element type:

OperatorIntegerCardinalRealSet
+oooo
-oooo
*  oo
/   o
=, ><, >, <, >=, <=oooo
For whole number types modulo arithmetic is used, i. e. overflow may occur. For compare operators the result is true if every vector element meets the given condition.

4.6.2 The Module "VECTORS"

This module contains the functions representing the machine operations available for vector types as well as some often used types of vectors.

Like "SYSTEM", "VECTORS" is also a pseudo-module and therefore does not have a symbol file in p1 Modula-2.

4.6.2.1 The Pseudo Definition-Module "VECTORS"

The module "VECTORS" could have a definition module something like this:
DEFINITION MODULE VECTORS;
TYPE
    INT8VECTOR = VECTOR OF INT8;
    INT16VECTOR = VECTOR OF INT16;
    INT32VECTOR = VECTOR OF INT32;
    CARD8VECTOR = VECTOR OF CARD8;
    CARD16VECTOR = VECTOR OF CARD16;
    CARD32VECTOR = VECTOR OF CARD32;
    SET8VECTOR = VECTOR OF SET8;
    SET16VECTOR = VECTOR OF SET16;
    SET32VECTOR = VECTOR OF SET32;
    REALVECTOR = VECTOR OF REAL;

    VecAbs (<vector>): <vector>;
    VecAbss (<vector>): <vector>;
    VecAdd  (<vector>, <vector>): <vector>;
    VecAddc  (<vector>, <vector>): <vector>;
    VecAdds  (<vector>, <vector>): <vector>;
    VecAnd  (<vector>, <vector>): <vector>;
    VecAndc  (<vector>, <vector>): <vector>;
    VecAvg  (<vector>, <vector>): <vector>;
    VecCeil (<vector>): <vector>;
    VecCmpb  (<vector>, <vector>): <vector>;
    VecCmpeq  (<vector>, <vector>): <vector>;
    VecCmpge  (<vector>, <vector>): <vector>;
    VecCmpgt  (<vector>, <vector>): <vector>;
    VecCmple  (<vector>, <vector>): <vector>;
    VecCmplt  (<vector>, <vector>): <vector>;
    VecCtf  (<vector>, <vector>): <vector>;
    VecCts  (<vector>, <vector>): <vector>;
    VecCtu  (<vector>, <vector>): <vector>;
    VecExpte (<vector>): <vector>;
    VecFloor (<vector>): <vector>;
    VecLoge (<vector>): <vector>;
    VecMadd (<vector>, <vector>, <vector>): <vector>;
    VecMadds (<vector>, <vector>, <vector>): <vector>;
    VecMax  (<vector>, <vector>): <vector>;
    VecMergeh  (<vector>, <vector>): <vector>;
    VecMergel  (<vector>, <vector>): <vector>;
    VecMfvscr  (<vector>, <vector>): <vector>;
    VecMin  (<vector>, <vector>): <vector>;
    VecMladd (<vector>, <vector>, <vector>): <vector>;
    VecMradds (<vector>, <vector>, <vector>): <vector>;
    VecMsum (<vector>, <vector>, <vector>): <vector>;
    VecMsums (<vector>, <vector>, <vector>): <vector>;
    VecMule  (<vector>, <vector>): <vector>;
    VecMulo  (<vector>, <vector>): <vector>;
    VecLd (INTEGER, POINTER TO <vector>): <vector>;
    VecLde (INTEGER, POINTER TO <vector>): <vector>;
    VecLdl (INTEGER, POINTER TO <vector>): <vector>;
    VecLvsl (INTEGER, POINTER TO <vector>): <vector>;
    VecLvsr (INTEGER, POINTER TO <vector>): <vector>;
    VecNmsub (<vector>, <vector>, <vector>): <vector>;
    VecNor  (<vector>, <vector>): <vector>;
    VecOr  (<vector>, <vector>): <vector>;
    VecPack  (<vector>, <vector>): <vector>;
    VecPackpx  (<vector>, <vector>): <vector>;
    VecPacks  (<vector>, <vector>): <vector>;
    VecPacksu  (<vector>, <vector>): <vector>;
    VecPerm  (<vector>, <vector>): <vector>;
    VecRe (<vector>): <vector>;
    VecRl  (<vector>, <vector>): <vector>;
    VecRound (<vector>): <vector>;
    VecRsqrte (<vector>): <vector>;
    VecSel (<vector>, <vector>, <vector>): <vector>;
    VecSl  (<vector>, <vector>): <vector>;
    VecSld  (<vector>, <vector>): <vector>;
    VecSll  (<vector>, <vector>): <vector>;
    VecSlo  (<vector>, <vector>): <vector>;
    VecSplat (<vector>): <vector>;
    VecSplats8 (<vector>): <vector>;
    VecSplats16 (<vector>): <vector>;
    VecSplats32 (<vector>): <vector>;
    VecSplatu8 (<vector>): <vector>;
    VecSplatu16 (<vector>): <vector>;
    VecSplatu32 (<vector>): <vector>;
    VecSr  (<vector>, <vector>): <vector>;
    VecSra  (<vector>, <vector>): <vector>;
    VecSrl  (<vector>, <vector>): <vector>;
    VecSro  (<vector>, <vector>): <vector>;
    VecSub  (<vector>, <vector>): <vector>;
    VecSubc  (<vector>, <vector>): <vector>;
    VecSubs  (<vector>, <vector>): <vector>;
    VecSum4s  (<vector>, <vector>): <vector>;
    VecSum2s  (<vector>, <vector>): <vector>;
    VecSums  (<vector>, <vector>): <vector>;
    VecTrunc (<vector>): <vector>;
    VecUnpackh  (<vector>, <vector>): <vector>;
    VecUnpackl  (<vector>, <vector>): <vector>;
    VecUnpackhpx  (<vector>, <vector>): <vector>;
    VecUnpacklpx  (<vector>, <vector>): <vector>;
    VecXor  (<vector>, <vector>): <vector>;
    VecAlleq  (<vector>, <vector>): BOOLEAN;
    VecAllge  (<vector>, <vector>): BOOLEAN;
    VecAllgt  (<vector>, <vector>): BOOLEAN;
    VecAllin  (<vector>, <vector>): BOOLEAN;
    VecAllle  (<vector>, <vector>): BOOLEAN;
    VecAlllt  (<vector>, <vector>): BOOLEAN;
    VecAllnan  (<vector>, <vector>): BOOLEAN;
    VecAllne  (<vector>, <vector>): BOOLEAN;
    VecAllnge  (<vector>, <vector>): BOOLEAN;
    VecAllngt  (<vector>, <vector>): BOOLEAN;
    VecAllnle  (<vector>, <vector>): BOOLEAN;
    VecAllnlt  (<vector>, <vector>): BOOLEAN;
    VecAllnumeric  (<vector>, <vector>): BOOLEAN;
    VecAnyeq  (<vector>, <vector>): BOOLEAN;
    VecAnyge  (<vector>, <vector>): BOOLEAN;
    VecAnygt  (<vector>, <vector>): BOOLEAN;
    VecAnyle  (<vector>, <vector>): BOOLEAN;
    VecAnylt  (<vector>, <vector>): BOOLEAN;
    VecAnynan  (<vector>, <vector>): BOOLEAN;
    VecAnyne  (<vector>, <vector>): BOOLEAN;
    VecAnynge  (<vector>, <vector>): BOOLEAN;
    VecAnyngt  (<vector>, <vector>): BOOLEAN;
    VecAnynle  (<vector>, <vector>): BOOLEAN;
    VecAnynlt  (<vector>, <vector>): BOOLEAN;
    VecAnynumeric  (<vector>, <vector>): BOOLEAN;
    VecAnyout  (<vector>, <vector>): BOOLEAN;

    VecDss (CARDINAL);
    VecDssall ();
    VecDst (ADDRESS, CARDINAL, <Cardinal constant>);
    VecDstst (ADDRESS, CARDINAL, <Cardinal constant>);
    VecDststt (ADDRESS, CARDINAL, <Cardinal constant>);
    VecDstt (ADDRESS, CARDINAL, <Cardinal constant>);
    VecMtvscr (CARDINAL);
    VecSt (<vector>, INTEGER, ADDRESS);
    VecSte ((<vector>, INTEGER, ADDRESS);
    VecStl ((<vector>, INTEGER, ADDRESS);
END VECTORS.

4.6.4.2 Description

The pseudo definition module shows the number and general type of the parameters of the supported procedures an functions. For more specific information about the meaning of the procedures / functions and the exact types of the parameters please refer to the description of the AltiVec instruction set by Motorola or the corresponding parts of the C/C++ compiler manuals.

4.6.3 Alignment of dynamically allocated storage

For objects the storage of which is allocated by the runtime system ("TRACED CLASS") correct alignment is guaranteed.

For manually allocated storage area the p1 Modula-2 library provides for a module "VectorStorage". The semantics of this module is similar to that of the module "Storage". Additionally all returned addresses point to a storage area aligned on a 16 byte boundary.

4.7 Further Extensions

Identifiers
The character "$" is also allowed in identifiers. It has the same status as a letter, i.e. it can also be placed at the beginning or end of an identifier and can also occur more than once consecutively. This character is needed above all for the names of system procedures in foreign modules.

Importing all items from a module
"FROM Module_Ident IMPORT *;" has the effect of importing from a module every single (exported) entity declared in its definition module. The extension "FROM Module_Ident IMPORT * EXCEPT ident1, ident2, ...;" allows to import all entities but "ident1", "ident2", etc. Too frequent use of this feature is however discouraged, as it can reduce the readability of a module considerably.

Open arrays
Not only arrays having the same element type are compatible to open-array parameters, but also variables and expressions of the array-element type.

String constants
String constants may be indexed like a constant of type ARRAY someRange OF CHAR.
They may be used within value constructores to build an array of character.

Constant "NILPROC"
"NILPROC" is compatibel to every procedure type. If a procedure variable containing the value "NILPROC" is called, the exception "invalidLocation" is raised (cf. 2.11.3).

Function "HIGH"
The standard function "HIGH" can be used not only with open arrays but also with dynamic arrays and arrays of known size. In the case of a multidimensional array, the type of the "constexpr" used to indicate the dimension must be assignment compatible to the corresponding array index type.

Undefined variables
The compiler checks statically whether a local variable used in an expression has actually been assigned a value beforehand. If neither a direct assignment nor an indirect one resulting from use as a "VAR" parameter has taken place a warning is issued. In rare cases ("IF" statements within loops, assignments resulting from side effects of internal procedures) the compiler cannot recognize an assignment as having occurred "before" and issues a warning superfluously. Such warnings can be suppressed locally using "<* Warnings (FALSE) *>".

chapter 3 (compiler) start page chapter 5 (compiler)