The Pitmanual: Arrays


Array	Concept	`Composite Data Object`

An array is a first class, composite data object. It has associated with it a number of cells which are stored contiguously and which are accessed by one or more numerical indices called subscripts.

An array has a fixed number of dimensions, and an array access requires the same number of subscripts as the array has dimensions. Omitted subscripts do not default; it is an error to provide the wrong number of subscripts for an array. The maximum number of indices for an array is implementation-dependent; on the PDP-10, 5 dimensions is the limit.

Although arrays are composite datatypes, the functions CAR and CDR will not accept them as arguments. Hence, arrays are considered to be atoms, and the predicate ATOM will return T for them.

Maclisp supports four basic kinds of arrays: T, FIXNUM, FLONUM, and NIL. The type of the array refers to the kinds of objects which can be stored in the array and how such objects will be treated by the garbage collector.

Type T arrays, also called s-expression arrays, are the most general. Slots in a type T array may contain any kind of object. On the PDP-10, this type of array uses a half-word of memory per cell to address its contents, plus a small number of words for the array header. Hence, a length 1000 array of type T takes up slightly more than 500 words of memory.

Type FIXNUM and type FLONUM arrays are constrained to only contain fixnums or flonums, respectively. In interpreted code, storing a non-fixnum into a fixnum array or a non-flonum into a flonum array will signal an error. In compiled code, the results of such an illegal action are not defined, since for efficiency the compiler may not output code to check for errors of this sort. On the PDP-10, these array types use a full word of memory per cell to store their contents, so a length 1000 array of type FIXNUM would take about 1000 words of memory. The contents are stored as immediate quantities, rather than as pointers into fixnum or flonum space. Hence, accesses to this sort of an array in interpreted code (or in compiled code which is not appropriately declared) will cause a lot of fixnum or flonum consing. For this reason, type T arrays are recommended for use even when fixnums or flonums are involved unless appropriate number declarations are to be provided.

Type NIL arrays are discouraged even for those who think they know what they are doing. They are in all respects like type T arrays except that the garbage collector will not mark through them. This means that if an s-expression is pointed to only by a type NIL array, the garbage collector will not realize that there are any pointers to the object and will reclaim the object. This leaves randomness (e.g., a free-list pointer) in that cell of the array.

A named array is a symbol which has an ARRAY property, the value of which is an array object. For many applications, named arrays may be thought of as functions of a fixed number of fixnum arguments, which are its subscripts. The notation (F X) can refer to either a function F called on an argument X, or to a named array F with the single subscript X.

An access to an array, called an array reference, is done in one of two ways.

Array references to named arrays look like function calls:

(arrayname dim1 dim2 ...) ;named array: general format

For example, a reference to slot 5 of a one-dimensional named array F would look like:

(F 5) ;sample named array referenced

Array references to unnamed arrays (i.e., to array objects), require the use of ARRAYCALL and the specification of the type of the array. The format is:

(ARRAYCALL dtp arrayobj dim1 dim2 ...) ;un-named array: general format

For example, to reference the third slot of the second array in a list, L, of fixnum arrays, one would write:

(ARRAYCALL FIXNUM (CADR L) 3) ;sample un-named array reference

Arrays are also applicable, so may be used as a first argument to such functions as FUNCALL, LEXPR-FUNCALL, and APPLY. The preferred primitive for accessing an array, however, is ARRAYCALL.

There are several distinctions between arrays and other randomly accessible data structures such as hunks and vectors: Only arrays have a header area which contains information about array dimensions and some machine code which can be used accessing the array. The existence of this header area allows the array storage area to be relocated safely by the garbage collector, since user code points to the array storage only indirectly through the header, which does not move. The capability of an array to be multi-dimensional is also unique to that kind of data structure; hunks and vectors have only one dimension. Arrays are also the only composite datatype in Maclisp with special storage modes for certain kinds of data (fixnums and flonums); hunks and vectors always allow any kind of object in their associated storage cells. Arrays may also take up odd-length amounts of memory; hunks can only be allocated in fixed-size bundles of cells---powers of two, less than or equal to 512.

Several other types of arrays exist in the Maclisp world. File arrays, SFA's (Software File Arrays), Readtables, Obarrays, and Job arrays are examples. These are all of datatype ARRAY, as can be verified by passing them to TYPEP, but are not generally accepted by most of the general array functions. Each of these special kinds of objects has a set of primitives which know how to deal with the special needs of that object. Such primitives are discussed elsewhere in this manual.

If a symbol has both an array property and a functional property, the interpreter will prefer the property which is closest to the head of the property list. The compiler, on the other hand, may prefer to open compile such references as array references since it may not know there will be a conflict. As such, it is stylistically an extremely poor idea to overload a symbol's functional meaning in this way.


Array	Style Note	`Named vs Un-Named Arrays`

If you're not sure about whether you want named or un-named arrays, it's recommended that you avoid named arrays in most circumstances.

Error detection can be made harder by the use of named arrays if the programmer tries to pass the name of the array (a symbol) around as data. Not all symbols are arrays and some symbols which are erroneously passed to a function as data might have an ARRAY property which is intended for an unrelated use. Because of this, use of named arrays is recommended only for arrays which are statically allocated and intended to be globally accessible to all parts of a program without having to be passed around as functional arguments. Especially if an array will need to be given as data to some other program, un-named arrays are preferred.

Array Creation


*`ARRAY`**	Function	`(*ARRAY name dtp dim₁ dim₂ ... dimN)`

Creates an N-dimensional array of type given by dtp, which may be one of T, FIXNUM, FLONUM, or NIL.

If the name is NIL then the array object is returned. Else it must be a symbol which is to designate a named array, in which case the requested array object will be put on the ARRAY property of that symbol and the symbol's name will be returned.

The number of dimensions, N, may not exceed 5.

For any dimension, i, each dimi spec must be a positive fixnum specifying the length of that dimension. The valid indices along that dimension are 0 through i-1.

Newly created arrays are guaranteed to come with all cells appropriately initialized. Fresh type T or NIL arrays are filled with NIL's. Flonum and fixnum arrays contain 0.0's or 0's, as appropriate.

A special case of *ARRAY occurs if dtp is one of the symbols OBARRAY or READTABLE. The syntax then becomes: (*ARRAY name dtp [data]). This form will create a readtable or obarray with the given name (or if name is NIL, will return the obarray or readtable).

If dtp is the symbol READTABLE then a third argument, data, of NIL means to initialize the readtable from the current dynamic binding of the symbol READTABLE; this is the default. If data is T, then the system readtable is used instead. If data is a readtable, its contents are used to initialize the newly created readtable.

If dtp is the symbol OBARRAY then a third argument, data, of T means to initialize the obarray from the current dynamic binding of the symbol OBARRAY; this is the default. If data is NIL, then an empty obarray is created. If data is an obarray, then its contents are used to initialize the newly created obarray. Always, when creating an obarray, the buckets are copied but the symbols pointed to by those buckets are not copied.


`ARRAY`	Special Form	`(ARRAY name type dim₁ dim₂ ...)`

Like (*ARRAY name type dim₁ dim₂ ...) except that name and type are not evaluated. The other arguments, the dims, are evaluated. Using *ARRAY is probably a better idea.


*`REARRAY`**	Function	`(*REARRAY x [type dim₁ dim₂ ...])`

*REARRAY is used to redefine the dimensions of an array.

(*REARRAY x) gets rid of the array x. x is evaluated and must evaluate to an atomic symbol or an array object. The array storage is reclaimed and x becomes a “dead array.” The argument, x, is returned.

(*REARRAY x type dim₁ dim₂ ... ) is like (*ARRAY x type dim₁ dim₂ ...) except that the contents of the previously existing array named x are copied into the new array x and then x is made to point at the new array.

Storage in the array is recycled as if you'd done a LISTARRAY of the array before *REARRAY'ing it and then FILLARRAY'd the data back in afterward (see LISTARRAY, FILLARRAY). Because Maclisp uses row major order internally, this won't do the same thing as ADJUST-ARRAY-SIZE would on the Lisp Machine (which uses column major order internally).

Array Access


`ARRAYCALL`	Special Form	`(ARRAYCALL dtp obj i₁ i₂ ...)`

The first argument, dtp, is not evaluated. It should be the symbol T, FIXNUM, FLONUM, or NIL. The other arguments are evaluated. The second argument, obj, must be a valid array object. To access named arrays with arraycall, one must use something like

(ARRAYCALL dtp (GET 'name 'ARRAY) ...).

The remaining arguments are the subscripts for use in the access. The return value is the contents of the specified slot in the specified array.


`STORE`	Special Form	`(STORE arrayref val)`

The first argument, arrayref, must be an array reference. The second argument is evaluated and stored into the array cell referenced by arrayref.

STORE evaluates its second argument before its first argument.

;; Using array objects (un-named arrays)
(setq testarray (*array nil t 100))
=> #T-100-70720

(arraycall t testarray 0)
=> NIL

(store (arraycall t testarray 0) 5)
=> 5

(arraycall t testarray 0)
=> 5

;; Using named arrays.
(*array 'foo 'fixnum 100)
=> FOO

(foo 0)
=> 0

(store (foo 0) 3)
=> 3

(foo 0)
=> 3

Bulk Array Processing


`FILLARRAY`	Function	`(FILLARRAY arrayspec q)`

If q is a list, this fills the array named by arrayspec (which may be either a symbol or an array object) sequentially with successive elements of q. If there are too many elements in q, the excess elements are ignored. If there are too few, the last element of q is used repeatedly to fill all remaining slots in arrayspec.

If q is an array, its successive elements are used to fill in the array the same as if q were a list, except that if q is too short, the slots in the array specified by arrayspec which have no corresponding elements in q are unchanged.

On ITS, a FIXNUM, FLONUM, or NIL array may be saved to a file array open in FIXNUM mode using fillarray. This doesn't work on non-ITS sites.

For multi-dimensional arrays, row-major order is observed in the conversion between an array and its listed format.


`LISTARRAY`	Function	`(LISTARRAY arrayspec)`

Takes the elements of the array specified by arrayspec and returns them as the elements of a list. The length of the list is the size of the array (the product of its dimensions) and the elements are present in the list in the same order as they are stored in the array, starting with the zeroth element.

For multi-dimensional arrays, row-major order is observed in the conversion between an array and its listed format.

On Multics, this function takes an optional second argument, i, which if supplied specifies a maximum length for the returned list. If more than i elements exist in the array, only the first i will be returned.

Array Information


`ARRAYDIMS`	Function	`(ARRAYDIMS q)`

Returns type and dimensionality information about an array, q, in the format

(type dimension1 dimension2 ...).

The array may be either an array object or a symbol denoting a named array.

Examples:

(ARRAYDIMS (*ARRAY NIL NIL 5.))		 => (NIL 5.)

(ARRAYDIMS (*ARRAY NIL T 10. 20.))	 => (T 10. 20.)

(ARRAYDIMS (*ARRAY NIL 'FIXNUM 2. 3. 4.)) => (FIXNUM 2. 3. 4.)

(ARRAYDIMS TYI)				 => (FILE 0.)

(ARRAYDIMS OBARRAY)			 => (OBARRAY 640.)

(ARRAYDIMS 'OBARRAY)			 => (OBARRAY 640.)

(ARRAYDIMS READTABLE)			 => (READTABLE 0.)


`ARRAY-/#-DIMS`	Function	`(ARRAY-/#-DIMS q)`

[PDP-10 Only] Returns the dimensionality (number of dimensions) of an array, q.

Example:

(ARRAY-/#-DIMS (*ARRAY NIL T 3 5)) => 2


`ARRAY-DIMENSION-N`	Function	`(ARRAY-DIMENSION-N i q)`

[PDP-10 Only] Returns the length of dimension i of an array, q. The dimension argument, i, must be a fixnum. If i specifies a valid dimension of q, the length of that dimension (a fixnum) is returned. If i is out of range, NIL is returned.

Note: For Lisp Machine compatibility, the dimension is 1-based. That is, an array of dimensionality 2 is said to have dimensions 1 and 2, rather than 0 and 1 as one might expect. The Lisp Machine uses the 0 dimension of this argument to mean the array leader, a concept that does not exist in Maclisp. For compatibility with the Lisp Machine, the form (ARRAY-DIMENSION-N 0 q) is always NIL in Maclisp, which is what the Lisp Machine's ARRAY-DIMENSION-N would return for an array with no array leader.

Examples:

(ARRAY-DIMENSION-N 3 (*ARRAY NIL T 5 3)) => NIL

(ARRAY-DIMENSION-N 2 (*ARRAY NIL T 5 3)) => 3

(ARRAY-DIMENSION-N 1 (*ARRAY NIL T 5 3)) => 5

(ARRAY-DIMENSION-N 0 (*ARRAY NIL T 5 3)) => NIL


`ARRAY-TYPE`	Function	`(ARRAY-TYPE q)`

[PDP-10 Only] Returns the symbolic type of an array, q.

Examples:

(ARRAY-TYPE (*ARRAY NIL T 10.))		=> T

(ARRAY-TYPE (*ARRAY NIL 'FIXNUM 10.))	=> FIXNUM

Array Saving

Unfortunately, DUMPARRAYS and LOADARRAYS currently work only on ITS. This is because FILLARRAY does not work with file array arguments on non-ITS sites. This is a bug we hope someday will be fixed.


`DUMPARRAYS`	Function	`(DUMPARRAYS names filespec)`

Saves the contents of the specified arrays into filespec in a format that can be restored by LOADARRAYS. The arrays are specified as a list of names, which must be symbols. Each symbol must have an ARRAY property of the ARRAY to be saved. Arrays held by value cells are not handled. Also, only type FIXNUM, FLONUM, and NIL arrays are handled. The filespec may be a namelist or namestring. On Multics, this only works with type FIXNUM and FLONUM arrays. Also on Multics, if the filespec is of the form “(pdp-10 filespec)” the file will be written in PDP-10 dumparrays format.


`LOADARRAYS`	Function	`(LOADARRAYS filespec)`

Retrieves the arrays saved by a DUMPARRAYS command.