The programming language Pascal was originally designed by Professor Niklaus Wirth, and named after a French mathematician and philosopher widely admired for the clear and direct nature of his ideas. Wirth set out two principal aims for the design of Pascal: that it should be systematic and coherent, so far as possible avoiding arbitrary restrictions, and that it should be suitable for efficient implementation on the currently available machines. A "User Manual and Report" defining Pascal was produced by Jensen and Wirth during the 1970s; the most recent (third) edition corresponds exactly to the first standard mentioned below.
Closely-related but not identical standards were published in 1983 by ISO (originally a BSI standard) and ANSI. While the definition they contained was very precise, and in this respect was something of a landmark, it did not attempt to add much to Wirth's original specification. One consequence was that while many implementations provided the standard language, they also devised their own extensions to provide users with features which had been found to be in demand.
After publication of the first standard, the ANSI/IEEE joint project committee continued to develop extensions, and from 1984 worked closely with the ISO group to produce the definition of an Extended Pascal language which would be upward-compatible with the earlier versions, equally rigorously defined but covering areas which experience had proved to be desirable. Extended Pascal standards which are identical in technical content were published in 1990 and 1991. After finalising the standard, the technical committees have continued to develop new features in the areas of Object Oriented programming and exception handling.
The Extended Pascal language is somewhat larger than the
earlier standard (which will generally be referred to here as
"classic" Pascal), and an introduction of this kind can give only
a general survey. However, the first complete implementation has
been produced for microcomputers running DOS, so it is fair to
conclude that Wirth's original aims have still been kept in
mind.
Before considering major new features, it is useful to start a
survey of the Extended Pascal language with some enhancements
which make the facilities inherited from classic Pascal easier to
use (and which apply also in the major extensions described
later). A number of these enhancements may be familiar as local
extensions to users of current Pascal implementations.
Constant expressions can be used where classic Pascal
requires constants, for example in declarations. These constant
expressions can employ most of the predefined functions, as well
as operators.
CONST linefeed = chr(10);
TYPE buffer = ARRAY[0..bufflen-1] OF char;
Relaxed ordering. The declarations and definitions
(CONST, TYPE, VAR etc.) can be repeated, and can appear
in any order, provided that there are no forward references.
Functions can return results of any assignable type,
including arrays and records, and the processes of dereference
(^), indexing and field selection can be applied when
appropriate to the type. The result variable may be given a name
(different from the function name) that can be referenced in
contexts that require a variable without causing a recursive call
of the function.
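As a minimal sketch of these two points (the names resultdemo, pair, ordered and r are invented for illustration), the following program returns a record from a function, names the result variable with the "= r" form of heading, and applies field selection directly to the function result:

```pascal
PROGRAM resultdemo (output);

TYPE pair = RECORD
              lo, hi: integer
            END;

{ "= r" names the result variable; r can be used like any other
  variable without causing a recursive call of ordered }
FUNCTION ordered (a, b: integer) = r: pair;
BEGIN
  IF a <= b THEN
    BEGIN r.lo := a; r.hi := b END
  ELSE
    BEGIN r.lo := b; r.hi := a END
END {ordered};

BEGIN
  { field selection applied directly to the function result }
  writeln(ordered(7, 3).lo : 1, ' ', ordered(7, 3).hi : 1)
END.
```

With these definitions the program should write the two values in ascending order, 3 followed by 7.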
Subrange bounds may be constant expressions, in line with
the wider relaxation described above, and such subranges define
conventional static types. The bounds may also be general
expressions, involving run-time variables, and introducing
nonstatic types; this topic is discussed separately later.
CASE enhancements. In both variant record declarations and
CASE statements, ranges are permitted in constant-lists,
and an OTHERWISE clause may be introduced to complete
the list.
CASE ch OF
  '0'..'9':            digit;
  'A','E','I','O','U': vowel;
  OTHERWISE            other;
END {case};
Short-circuit operators. Variants of the Boolean operators
AND and OR are provided which guarantee that an
expression is evaluated no further than is necessary to determine
the result. The new variants are called AND_THEN and
OR_ELSE, and they can be used as in the following
example to simplify conditional code.
IF (p <> NIL) AND_THEN (p^.f2 = 2)
THEN .....
ELSE .....
If p is nil, control passes immediately to the else
part, and the illegal dereference p^.f2 (which would
result in a protection violation if the program were running in a
protected memory environment) is not executed.
Nondecimal numbers may be introduced into source programs
using the notation base#number, for instance 16#FF or
8#377 (hexadecimal FF = octal 377 = decimal 255).
Characteristics. Just as maxint gives
implementation-specific information about the integer type, there
is a predefined constant maxchar (the character with
largest ordinal value), and constants maxreal,
minreal and epsreal which give the characteristics
of the real type.
Numeric input from textfiles accepts the representation
laid down in the data-interchange standard (ISO 6093), which
permits a decimal point in leading or trailing position.
Zero fieldwidth. The optional fieldwidth parameter in
textfile output may take the value zero.
Inverse ord. An enhancement to the succ and
pred functions allows them to take an optional second
parameter of integer type. This is more versatile than a simple
inverse of ord, but among other things gives succ that
capability; for example, given a conventional day-of-week
enumeration, succ(Monday,3) yields
Thursday.
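A short sketch of the two-parameter forms, assuming a day-of-week enumeration like the one mentioned above (the program name succdemo and the variable d are illustrative only). Because Monday has ordinal 0, succ(Monday,n) yields the value whose ordinal is n, which is how succ serves as an inverse of ord:

```pascal
PROGRAM succdemo (output);

TYPE day = (Monday, Tuesday, Wednesday, Thursday,
            Friday, Saturday, Sunday);

VAR d: day;

BEGIN
  d := succ(Monday, 3);        {three steps forward: Thursday}
  writeln(ord(d) : 1);         {writes 3}
  d := pred(Sunday, 2);        {two steps back: Friday}
  writeln(ord(d) : 1)          {writes 4}
END.
```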
Underscores are permitted as significant characters within
identifiers, though not in leading or trailing positions.
In classic Pascal, all variables are created in an undefined state (that is, containing an undefined value). In Extended Pascal, an initial state can be defined which is automatically given to a variable when it is created. At the outer level an implementation may preset the initial state, but the facility applies also to local variables of procedures and to variables created on the heap. Furthermore, a type definition may carry an initial state, which is given to every variable of the type unless the variable declaration itself overrides it.
TYPE intz = integer VALUE zero;
     trec = RECORD
              a,b: intz;
              c: char VALUE '*';
            END;
VAR recp: ^ trec;
With these declarations, a call of new(recp) causes a
record to be created on the heap with fields a and
b initialised to zero and c initialised to
asterisk.
The set constructors of classic Pascal form the basis of constructors to define other structured values. When all the ingredients are constant, a constructed value can be named as a constant or used to define initial states. Within statements, the constructor may include run-time values.
An array constructor may list specific indexes, with the values those elements are to be given, and/or may provide a default to be given to any elements not individually listed. A record constructor names record fields together with the values to be given; the whole record must be defined. (For some purposes it may be more convenient to specify values of individual fields in the record declaration.)
[3..6,9:5.5; 10:10.5; OTHERWISE 0] {array}
[f1,f4:10; f2:'$'; f3:'Message'] {record}
The processes of indexing and field selection can be applied
to structured constants (including string literals) just as to
variables; furthermore, within statements an index may be a
run-time value, so that different elements of the constant are
accessed on different occasions.
Classic Pascal does not include any facility for separate compilation of parts of a program. Besides limiting the scope of programs which can be produced on small machines, this has the important disadvantage that there is no standard form for the preparation of precompiled libraries. Almost every implementation of Pascal introduced an extension of some kind to overcome this limitation, and it was seen as one of the most important tasks of Extended Pascal to define a form of separate compilation which would not forfeit type security.
Besides the main program, Extended Pascal programs may include components known as modules. A module can export constants, types, variables, procedures and functions through named interfaces, and these interfaces may be imported by other modules or by the main program. By default, an interface is imported complete, with the names of all its constituents accessible, but there are several options to meet the difficulties which can arise in practice when importing from modules which were not designed in conjunction with one another. Instead of importing the whole of an interface, just selected items may be chosen; the names of constituents may be kept apart and referred to by giving the interface name; and constituents may be renamed on import.
A module has two parts: a heading and a block. The module heading contains declarations and definitions of any items which are to be exported, in particular the headings (but no more) of procedures and functions. The block includes the definitions of any exported procedures or functions, together with items which do not need to be known outside the module. There may also be initialization and finalization code.
The heading and block may be combined or separate. When separate, the possibility arises of alternative implementations of the same heading (with and without diagnostic code, for example), or of an implementation coded in some other language such as assembler.
A module heading may import from another module, and may
re-export these imported items, allowing for example composite
library interfaces to be constructed. A module block may
independently import interfaces from other modules. The export
and import of an interface set up what is known as a "supplying"
relationship, in which the exporting module supplies the
importing module or program. The supply network puts some
constraints on the sequence in which modules can be compiled; in
particular, it must not contain any loops (which would imply that
a module was indirectly attempting to supply itself). Any
initialization code for a module is executed before that of any
component which it supplies.
Modular construction is normally appropriate to larger programs,
and small examples inevitably appear trivial. However, the three
related modules which follow demonstrate a number of the
possibilities.
Module one exports an interface named i1, containing two constants named lower and upper. A variable dummy is declared but not exported. Module one has a minimal module block.
MODULE one;
EXPORT i1 = (lower,upper);
CONST lower = 0;
      upper = 11; {must be prime}
VAR dummy: Boolean;
END {of module-heading};
END {of module-block}.
Module two imports the constants lower and
upper from one, uses them to define a type, and
also re-exports them. It exports two interfaces named i2
and j2. Interface i2 contains the type
subr; j2 contains the constants lower and
upper. Module two also has a minimal module
block.
MODULE two;
EXPORT i2 = (subr); {has just one constituent}
       j2 = (lower,upper);
IMPORT i1; {import all (both) constituents}
TYPE subr = lower..upper;
END {of module-heading};
END {of module-block}.
Module three demonstrates qualified import and
renaming. It exports one interface i3 containing a
function, a type, and two constants. It imports interfaces
i1 and i2 qualified, so references to the
constituents within the module are prefixed with the
interface-names; further, the type subr is renamed
lim_range on import, so it is referred to as
i2.lim_range. The constants lower and
upper are renamed on export as lim_lower and
lim_upper. The function-heading of limited is
declared in the module-heading, and the function-definition in
the module-block. Note that the parameter-list and result-type of
limited are not repeated in the definition; this
arrangement is similar to forward-declared procedures in classic
Pascal.
MODULE three;
EXPORT i3 = (limited,i2.lim_range,          {function and type}
             i1.lower=>lim_lower,i1.upper=>lim_upper);
IMPORT i1 QUALIFIED;     {lower, upper to be referenced
                          as i1.lower and i1.upper}
       i2 QUALIFIED ONLY (subr=>lim_range);
FUNCTION limited(x: integer): i2.lim_range;
END {of module-heading};
FUNCTION limited;
BEGIN
  IF x < i1.lower THEN limited := i1.lower
  ELSE
    IF x > i1.upper THEN limited := i1.upper
    ELSE limited := x
END {limited};
END {of module-block}.
Restricted types provide a means of hiding the details of a
type when it is exported. The originator of a module may declare
a "restricted" version of a type, and export only the restricted
form. An importer can declare variables of such a type, and pass
them as parameters to procedures imported from the originating
module, but can only treat them as black boxes with no knowledge
of their internal structure. Within the module, the restricted
parameters are of the unrestricted original type.
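The arrangement might be sketched as follows. This is an illustration under stated assumptions rather than a definitive pattern: the module counters, its interface counting, and the names counter_rec, counter, clear, bump and current are all invented here. Only the restricted type counter is exported, so the importing program can declare and pass counters but cannot touch the field n.

```pascal
MODULE counters;
EXPORT counting = (counter, clear, bump, current);
TYPE counter_rec = RECORD n: integer END;    {not exported}
     counter = RESTRICTED counter_rec;       {exported, opaque to importers}
PROCEDURE clear (VAR c: counter);
PROCEDURE bump (VAR c: counter);
FUNCTION current (c: counter): integer;
END {of module-heading};

{ inside the module the parameters have the unrestricted type,
  so the field n is accessible }
PROCEDURE clear;
BEGIN c.n := 0 END;

PROCEDURE bump;
BEGIN c.n := c.n + 1 END;

FUNCTION current;
BEGIN current := c.n END;
END {of module-block}.

PROGRAM usecount (output);
IMPORT counting;
VAR c: counter;                 {a black box to this program}
BEGIN
  clear(c); bump(c); bump(c);
  writeln(current(c) : 1)       {c.n could not be referenced here}
END.
```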
In classic Pascal, the only string facilities are associated
with packed arrays of char. This is another area in which a
variety of local extensions have arisen. Extended Pascal includes
provision for dynamic string types, and unifies them with classic
Pascal strings and with characters.
String variables are declared with a maximum capacity, for
instance:
VAR s1,s2: string(20);
fname: PACKED ARRAY [1..20] OF char;
String values have a length (number of characters). A dynamic
string variable such as s1 can hold a value of any
length from zero up to its capacity, and the object code keeps
track of the current length. With a fixed string such as
fname, as found in classic Pascal, the length of the
contents is equal to the capacity; when a shorter value is
assigned to fname, it is padded on the right with spaces
until it fits. A variable of type char has a capacity of
1. Variables of these three kinds, together with string literals
and character constants, produce general string values. In
addition, individual characters or substrings of string variables
can be referenced by indexing, for instance s1[i] or
fname[1..8].
String values can be concatenated using the + operator,
and constants can be defined by constant expressions of string
type, e.g. 'ABC'+chr(13). There are predeclared functions
for the commonly-required string operations such as locating a
substring within a longer string.
Strings can be written to or read from textfiles, and versions of
the textfile read and write procedures are
provided which take a string variable in place of the file,
making all the conversion and editing processes available
internally.
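The following sketch brings these facilities together, using the predeclared functions length, index and substr and the string forms of the textfile procedures, writestr and readstr. The program name strdemo and the variables s and n are illustrative only.

```pascal
PROGRAM strdemo (output);

VAR s: string(40);
    n: integer;

BEGIN
  s := 'Extended' + ' ' + 'Pascal';      {concatenation}
  writeln(length(s) : 1);                {current length: 15}
  writeln(index(s, 'Pascal') : 1);       {position of the substring: 10}
  writeln(substr(s, 1, 8));              {8 characters from position 1}

  writestr(s, 'n = ', 6 * 7 : 1);        {formats into s instead of a file}
  writeln(s);                            {n = 42}

  readstr('123', n);                     {converts from a string value}
  writeln(n + 1 : 1)                     {124}
END.
```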
A string may be declared with capacity fixed at compile time, as
in the example above, or defined by a run-time variable
expression. There are also provisions for formal parameters which
adjust themselves to the actual parameter at each call.
PROCEDURE p (VAR s: string);
Dynamic strings of different capacities may be passed to this
procedure with each call; the code within the procedure can
discover the capacity of each actual parameter by reference to
s.capacity.
If a variable n has the value 10, a string declared as
string(n) has the same type as one declared as
string(10) for compatibility purposes (though the type
checking cannot be performed until run-time). As will be seen in
the next section, this rule and the adaptable formal parameters
are both particular cases of facilities that apply to all
schematic types, and arise from string being formally
defined to be a predeclared "schema" with additional special
properties.
It has been a characteristic of almost all versions of Pascal
that data types are static, that is, are ultimately defined at
compile time. The ISO version of classic Pascal included an
optional feature called conformant array parameters, a
specialised parameter form for which actual arrays of different
sizes can be supplied in different calls of the same procedure.
This feature has been included in a number of implementations,
and provides a measure of flexibility which suits, for example,
mathematical procedures which manipulate arrays. All that is
required is that actual parameters "conform" to the formal in the
sense of having the same number of dimensions and final element
type.
In the context of classic Pascal, conformant arrays give a degree
of flexibility when ready-made procedures are included in a
program in source form, but the actual parameters must ultimately
all be static, with sizes defined at compile time. Conformant
arrays continue to be an optional feature of Extended Pascal, but
there is in addition a more far-reaching variety of nonstatic
types based on schemata.
A schema is a template describing a family of related types, from
which individual types can be produced by substituting either
compile-time or run-time values, typically to define subrange
bounds or to select record variants. These "schematic" types can
be used in almost all respects just like conventional static
types: they can be used in the declaration of variables, record
fields and formal parameters; they can be used as domain types of
pointers, and returned by functions. It was observed earlier that
subranges can have their bounds determined at runtime; such
subranges are similar to individual schematic types without the
benefit of a family connection.
TYPE s(a,b: integer) = ARRAY [a..b] OF real;
VAR x: s(0,n-1);
The schema s defines a family of array types. The variable x has a type produced from the schema by substituting the values 0 and n-1 for a and b. If n is a variable, the size of the array is determined at run-time. The index bounds of array x can be referenced as x.a and x.b, as in the statement
FOR i := x.a TO x.b DO writeln(x[i]);
Formal parameters may be declared with the original schema
name, and will adapt themselves to the actual parameter at each
call, as described earlier for the particular case of strings. In
this respect they are similar to conformant array parameters, but
require that each actual is of a type produced from the same
schema. A pointer may also be declared to have the schema name as
its domain type, and an additional form of the procedure
new is provided which includes actual values to select a
type from the schema.
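A sketch of both facilities, assuming a schema like s above (here renamed vec, with the hypothetical names schemademo, fill, p and sum). The formal parameter v is declared with the schema name and adapts to each actual parameter; the extended form of new supplies the values 1 and 5 to select a type from the schema.

```pascal
PROGRAM schemademo (output);

TYPE vec(a, b: integer) = ARRAY [a..b] OF real;

VAR p: ^vec;            {schema name as domain type}
    sum: real;
    i: integer;

{ v accepts any type produced from vec; its bounds are v.a and v.b }
PROCEDURE fill (VAR v: vec; x: real);
VAR k: integer;
BEGIN
  FOR k := v.a TO v.b DO v[k] := x
END {fill};

BEGIN
  new(p, 1, 5);         {creates a variable of type vec(1,5)}
  fill(p^, 2.5);
  sum := 0;
  FOR i := p^.a TO p^.b DO sum := sum + p^[i];
  writeln(sum : 1 : 1); {five elements of 2.5 each: 12.5}
  dispose(p)
END.
```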
A schema can define a family of record types in which a variant
is selected by a parameter of the schema. The selection may be
decided at run-time, and unlike the form of new
inherited from classic Pascal, it does not require a constant
selector. As with all schematic types, such records can be local
variables, parameters, fields of other records, and so on. The
choice of variant produces a specific type, which cannot
subsequently be changed; but as with any schema a parameter may
be declared with the schema name which will accept as actual
arguments any of the produced types. The use of variant records
is made safer and more flexible by these arrangements.
TYPE sub = 1..4;
     rec(m: sub; n: integer) =
       RECORD
         a,b: integer;
         CASE m OF
           1:   (f1: real);
           2,3: (f2: string(n));
           4:   ( );
       END;
     rec2_20 = rec(2,20);
These definitions show both the selection of a record variant by parameter m and (when m is 2 or 3) how the capacity of string f2 can also be specified by parameter n. The type rec2_20 is one type produced from the schema rec.
PROCEDURE show_cap (r: rec);
BEGIN
  IF r.m = 2 THEN writeln(r.f2.capacity);
END {show_cap};
If an actual parameter of type rec2_20 is passed to
procedure show_cap, the value 20 is displayed.
Type inquiry, written TYPE OF, is of use primarily in conjunction with schema types; it allows, for example, a local work variable to be declared with the same type as an actual parameter, when this type is not known until run-time and may differ from call to call. For example, in procedure show_cap above, a variable v could be declared as
VAR v: TYPE OF r;
This variable acquires the type of the actual parameter at
each activation.
Extended Pascal provides a method of binding a variable within
the program to an external entity; the most common example is the
binding of a file variable to an operating-system file. There is
a predeclared record type called BindingType which holds
binding information; procedures bind and unbind
perform the actions, and a function binding returns the
current state of a variable. File binding can be carried out by a
sequence of operations which is relatively independent of the
environment; some other bindings (such as to a screen image, or a
clock) may be available in specific implementations.
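A typical file-binding sequence might look like the sketch below. It relies on details of the standard that are not spelled out above and should be read as assumptions: that the file variable is declared with the BINDABLE prefix, and that BindingType has at least a string field name and a Boolean field bound. The file name results.txt and the identifiers binddemo, f and b are purely illustrative.

```pascal
PROGRAM binddemo (output);

VAR f: BINDABLE text;
    b: BindingType;

BEGIN
  b := binding(f);              {start from the current binding state}
  b.name := 'results.txt';      {hypothetical external file name}
  bind(f, b);                   {attempt the binding}
  b := binding(f);              {ask whether it succeeded}
  IF b.bound THEN
    BEGIN
      rewrite(f);
      writeln(f, 'hello');
      unbind(f)
    END
  ELSE
    writeln('could not bind f')
END.
```

Checking b.bound after the call, rather than assuming success, keeps the sequence independent of how a particular environment reports failures.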
Classic Pascal provided only sequential file processing. Extended
Pascal adds the capability to extend a sequential file, and also
allows file variables to be declared with an index type. Such
variables can provide direct access to individual file elements,
by specifying an index value. Direct-access files allow updating
as well as reading and writing.
This example displays the string which is element i of the
file:
VAR f: FILE [0..9999] OF string(20);
...
SeekRead(f,i); writeln(f^);
A complex data type is provided. It is intentionally opaque, to permit implementations to choose the most appropriate representation; there are functions to obtain the real and imaginary parts (a Cartesian view), the magnitude and argument (a polar view), and also to construct a complex value from either pair of inputs. The mathematical operators and functions of classic Pascal can also take complex arguments and return complex results.
z2 := cos(z1 * 5.5);
writeln(re(z2),im(z2)); {Cartesian view}
writeln(abs(z2),arg(z2)); {polar view}
Two exponentiation operators are included. POW raises a
value to an integer power, and ** accepts a real
exponent. In either case, the left-hand operand can be integer,
real or complex. An integer operand of ** (as with the
/ operator) is cast to real before the operation.
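The difference between the two operators can be seen in a few lines (the program name powdemo is illustrative):

```pascal
PROGRAM powdemo (output);
BEGIN
  writeln(2 POW 10 : 1);         {integer result: 1024}
  writeln(2.0 POW 3 : 1 : 1);    {real raised to an integer power: 8.0}
  writeln(2 ** 0.5 : 1 : 4)      {2 is cast to real first: 1.4142}
END.
```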
A new operator >< is defined, which takes the symmetric difference of two set values; there is a new predefined function card which returns the cardinality of a set (the number of members present); and the FOR statement allows a new form in which the control variable is given in turn the values defined by a set.
FOR n IN setvalue DO ...
A predeclared record type TimeStamp is defined, which contains fields for year, month, day, hour, minute and second. (It is envisaged that an implementation might add further details such as millisecond or time zone which would be processed transparently by the predefined routines.) A procedure GetTimeStamp sets the current values in a TimeStamp record; functions Date and Time take such a record as a parameter and return strings in display format. This division of tasks allows the display functions to be used independently of system dates:
VAR ts: TimeStamp;
...
ts.year := 1993;
ts.month := 1; {=January}
ts.day := 1;
writeln(date(ts)); {display in local format}
Protection may be given to a variable in two contexts. The
first is on export of the variable from a module; such a variable
can be modified by code within the module, but importers must
treat it as read-only. The originator of the module thereby
ensures the security of that variable. The second context for
protection is in parameter lists. A parameter may be declared to
be protected; the code within the procedure or function must then
not contain statements which might change the parameter. A caller
passing a variable to a protected VAR parameter, for
example, knows there is no risk of it being modified, and does
not need to make a copy first; in the case of a large structure
this may represent a significant saving. Declaring a protected
value parameter indicates to an implementation or to the reader
of the program text that the actual parameter is "safe" and will
not be modified during execution of the procedure.
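A sketch of the parameter case, with the hypothetical names protdemo, table and total: the array is passed by reference, so no copy is made, yet the function body is not permitted to alter it.

```pascal
PROGRAM protdemo (output);

TYPE table = ARRAY [1..1000] OF integer;

VAR t: table;
    i: integer;

{ a is passed by reference but may only be inspected }
FUNCTION total (PROTECTED VAR a: table): integer;
VAR k, sum: integer;
BEGIN
  sum := 0;
  FOR k := 1 TO 1000 DO sum := sum + a[k];
  total := sum
  { an assignment such as a[1] := 0 would be rejected here }
END {total};

BEGIN
  FOR i := 1 TO 1000 DO t[i] := 1;
  writeln(total(t) : 1)          {1000}
END.
```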
Pascal, at least in its standard form, has the reputation of being a safe but limited language. The purpose of this introduction to the features of Extended Pascal is to show that the range of the language has been greatly increased without compromising its security.
To err is human, and people (even programmers) make mistakes. In the development and maintenance of software, these are expensive if not worse, and the contribution that the programming language can make to the avoidance of mistakes is very significant. Classic Pascal has features which encourage, and indeed sometimes require, a secure programming style; it also encourages readability, which greatly benefits long-term maintenance. Extended Pascal gives much extra flexibility without sacrificing these advantages. Also, any programmer familiar with classic Pascal can adopt the new features of the extended language gradually, achieving a smooth transition as familiarity grows.
Portability across platforms is important to the serious developer, and the use of a standardised language provides an assurance of continuity both vertically across levels of machine and horizontally in time.
To fill one of the most significant gaps in classic Pascal, the language standard provides a framework in which libraries can be developed and distributed. An implementor with proprietary source code can, if he wishes, supply processed interfaces and compiled object files, his code still retaining the advantages of portability. On the other hand, the standard rightly does not set out to specify what individual libraries should contain. Areas such as graphics or numerical computing are essentially language-independent: the desirable facilities are specified once and can then be "bound" to different languages, as for example in the emerging set of Language Independent Arithmetic standards. All languages can then benefit from the care and attention given by experts in each particular field.