The Pitmanual: Datatypes

Type Information


Type	Concept	`Primitive Datatypes`

Abstractly, a datatype is just some representation and a set of operations you intend to use on that representation (e.g., to produce new representations of that same datatype or of some other known datatype). What characterizes a flonum (floating point number), for example, is precisely that the flonum operations are defined to work on that object.

In many languages, datatype information is associated with variables rather than with the objects held by those variables. In such languages, identical internal representations are used to mean more than one thing and the only way programs can do the right thing with objects given them by other programs is if type information has been correctly supplied explicitly in the user program. For example, the octal constant “201600,,000000” on the PDP-10 might mean the floating point number 1.5, the decimal integer 17414750208, or the PDP-10 instruction “MOVEI 14,” depending on what program returned it.

In Lisp, all data is typed. That is, if you look at the return value of some function, you can tell its type without knowing anything about the function that returned it. We call such a language “object-oriented” because it tries very hard to preserve the property that every object has associated type information (which specifies how its internal representation is meant to be interpreted) and that type information is a property of the object itself, not of the program manipulating the object.

If you manage to accidentally pass an object of the wrong type as an argument to a program in some language which does not associate type information with objects themselves, then the program will probably interpret the data as if it were valid (since it will have little other choice) and produce output which is garbage. In Lisp, programs may recognize when they have arguments of the wrong type and signal a useful error. Here's an example of how that sort of thing is typically done:

(DEFUN FACTORIAL (X)
  (IF (OR (NOT (FIXNUMP X))
          (MINUSP X))
      (ERROR "Argument must be a non-negative fixnum" X))
  ...rest of definition...)

In languages such as Fortran or PL/1, detecting errors of this sort must be done at compile time. Designers of these languages would argue that this makes programs more efficient because it reduces the need for runtime type checking. However, the cost of this decision is that it makes it hard to write “generic” programs which allow their arguments to be one of several different types and which decide at runtime how to deal with the argument. Lisp's generic arithmetic functions use just this strategy; namely:

(DEFUN ADD1 (X)
  (COND ((FIXNUMP X) (1+ X)) ;call fixnum-only add1 operator
        ((FLOATP X) (1+$ X)) ;call flonum-only add1 operator
        ...etc))
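
The two patterns shown above (rejecting an argument of the wrong type, and dispatching on the argument's type at runtime) can be combined in one function. The following is only a sketch: DOUBLE is a hypothetical name invented for illustration, and the sketch assumes that + and +$ are the fixnum-only and flonum-only addition operators and that PLUS is the generic one.

```lisp
;; DOUBLE is a hypothetical function, not part of Maclisp.
(DEFUN DOUBLE (X)
  (COND ((FIXNUMP X) (+ X X))      ;fixnum-only addition (no overflow check)
        ((FLOATP X) (+$ X X))      ;flonum-only addition
        ((NUMBERP X) (PLUS X X))   ;any other number (e.g., a bignum)
        (T (ERROR "Argument must be a number" X))))

(double 3)        ;=> 6
(double 1.5)      ;=> 3.0
(double 'foo)     ;signals an error
```

Because the decision is made by inspecting the object itself, callers need not declare in advance which kind of number they will pass.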


`TYPEP`	Function	`(TYPEP q)`

Every object in Maclisp has exactly one datatype. On the PDP-10, it is one of LIST, SYMBOL, FIXNUM, FLONUM, BIGNUM, ARRAY, HUNK2, HUNK4, HUNK8, HUNK16, HUNK32, HUNK64, HUNK128, HUNK256, HUNK512, or RANDOM; on Multics, it is one of LIST, SYMBOL, FIXNUM, FLONUM, BIGNUM, ARRAY, STRING, or RANDOM. The TYPEP function returns the symbol which names the type of q. This may be useful in type checking and in writing “generic” functions which dispatch at runtime on the datatypes of their arguments.

The object NIL, which serves many purposes in Lisp, is defined to be of primitive type SYMBOL.

Examples:

(typep 5)		=>	FIXNUM

(typep 5.0)		=>	FLONUM

(typep 'a)		=>	SYMBOL

(typep '(a . b))	=>	LIST

(typep '(a . b . c .))	=>	HUNK4   ;PDP-10 only

(typep nil)		=>	SYMBOL	;by definition
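
A function that dispatches on the symbol returned by TYPEP might look like the sketch below. KIND-OF is a hypothetical name, and the categories it returns are invented for illustration; only TYPEP and the type names come from the description above.

```lisp
;; KIND-OF is a hypothetical helper, not part of Maclisp.
;; It maps the primitive type of Q onto a coarser, made-up category.
(DEFUN KIND-OF (Q)
  (LET ((TYPE (TYPEP Q)))
    (COND ((MEMQ TYPE '(FIXNUM BIGNUM)) 'INTEGER)
          ((EQ TYPE 'FLONUM) 'FLOAT)
          ((EQ TYPE 'SYMBOL) 'SYMBOL)   ;includes NIL, by definition
          ((EQ TYPE 'LIST) 'LIST)
          (T 'OTHER))))                 ;arrays, hunks, strings, RANDOM

(kind-of 5)         ;=> INTEGER
(kind-of 5.0)       ;=> FLOAT
(kind-of nil)       ;=> SYMBOL
(kind-of '(a b))    ;=> LIST
```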

Atoms and Lists


Atom	Concept	`Non-List`

The term “atom” goes back to the earliest days of Lisp, when it meant “not a composed expression.” Its precise meaning has varied in subtle ways from dialect to dialect, but it seems to address the idea that a list was the only available aggregate data structure, so non-lists have generally been thought of as atoms in that they are not further decomposable by some recursive description, as lists might be. Since the introduction of the term, however, a variety of other structures have been introduced into Lisps (most notably arrays and strings). Probably because programs were by this time already calling ATOM to find out if CAR and CDR would work on something, it was decided that arrays and strings were also atoms. The result of this decision is that “non-atom” has now come to mean “has a meaningful car and cdr,” and “atom” means anything else. So lists are non-atoms and so are hunks (though their use is rare). Everything else in Lisp is atomic.


`ATOM`	Function	`(ATOM q)`

This predicate returns T if q is an atom, and NIL otherwise. Lists and hunks are not atoms; objects of all other datatypes are atoms. NIL is defined to be an atom in spite of the fact that some functions allow it to be treated like a list.

Examples:

(atom 3)		=>	T

(atom '(a b))		=>	NIL

(atom nil)		=>	T

(atom '(a . b .))	=>	NIL
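
The typical use of ATOM is to decide whether it is safe to take the CAR and CDR of an object while walking list structure. The sketch below is an illustration only: COUNT-LEAVES is a hypothetical name, and the function assumes its argument is built from lists and atoms (no hunks).

```lisp
;; COUNT-LEAVES is a hypothetical function, not part of Maclisp.
;; It counts the non-NIL atoms found anywhere in TREE.
(DEFUN COUNT-LEAVES (TREE)
  (COND ((NULL TREE) 0)                      ;end of a list contributes nothing
        ((ATOM TREE) 1)                      ;any other atom is one leaf
        (T (+ (COUNT-LEAVES (CAR TREE))      ;non-atom: CAR and CDR are safe
              (COUNT-LEAVES (CDR TREE))))))

(count-leaves '(a (b c) d))    ;=> 4
(count-leaves 'a)              ;=> 1
(count-leaves nil)             ;=> 0
```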

Equivalence Predicates

There are many kinds of equivalence, so Lisp has many kinds of equality operators. The strongest kind of equality is EQ, which is true of two objects iff they are the same object. Two distinct objects may nevertheless appear to be the same because most of their interesting attributes are the same. Such is the case for the values of X and Y after evaluating:

(SETQ X (LIST 'A 'B 'C) Y (LIST 'A 'B 'C))

These lists both display as (A B C) and are said to be EQUAL, but they are not EQ because they are not the same list. If you destructively modify (e.g., with RPLACA) one of the lists, the other does not change. In the case of:

(SETQ X (SETQ Y (LIST 'A 'B 'C)))

the lists held by X and Y are the same object, so they are EQ.

Here's an informal, but useful, analogy someone once made: Consider two bearded people Daniel and David. If you shave Daniel and as a result David becomes cleanshaven, then Daniel and David were EQ. If David does not become cleanshaven as a result, then they were very similar (perhaps twins) but they were not EQ. This weaker kind of equality we call EQUAL.
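
The “shaving” test can be tried directly with RPLACA, which destructively replaces the car of a list. The following sketch simply replays the two SETQ forms shown earlier and modifies one list in each case.

```lisp
;; Case 1: two separately built lists (EQUAL but not EQ).
(SETQ X (LIST 'A 'B 'C) Y (LIST 'A 'B 'C))
(RPLACA X 'Z)    ;"shave" X: X is now (Z B C)
Y                ;=> (A B C), unaffected

;; Case 2: one list reachable through two variables (EQ).
(SETQ X (SETQ Y (LIST 'A 'B 'C)))
(RPLACA X 'Z)    ;X is now (Z B C)
Y                ;=> (Z B C), because X and Y hold the same object
```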


`EQ`	Function	`(EQ q₁ q₂)`

Returns T if q₁ and q₂ are exactly the same object, NIL otherwise. (Contrast EQUAL.) It should be noted that things that print the same are not necessarily EQ to each other. Numbers with the same value need not be EQ, lists with the same elements need not be EQ, etc. In general, two symbols with the same printname are EQ, but it is possible with MAKNAM or variable obarrays to generate symbols which have the same printname but are not EQ. (SAMEPNAMEP may be useful in that case.)

Fixnums with the same sign and magnitude will always be EQ on Multics. In general, however, numbers with identical sign and magnitude should never be relied upon to be EQ: sometimes they will be, but often, and unpredictably, they will not. This unpredictability is even greater in compiled code. Always use EQUAL or = to compare numbers.

See also: EQUAL, SAMEPNAMEP, =

(eq 'a 'b)
=> NIL

(eq 'a 'a)
=> T

(setq x (list 'a 'b) y (list 'a 'b))
=> (A B)

(eq x y)
=> NIL

;; Sometimes small numbers are EQ, sometimes not. Don't rely on it!
(list (eq 1 1) (eq 98765. 98765.))
=> (T NIL)


`EQUAL`	Function	`(EQUAL q₁ q₂)`

The EQUAL predicate returns T if its arguments have similar structure. (Contrast EQ.) Two numbers are EQUAL if they have the same value (flonums are never equal to fixnums though). Two strings (Multics only) are EQUAL if they have the same length, and the contents are the same. All other atomic objects are EQUAL if and only if they are EQ. For dotted pairs and lists, EQUAL is defined so that:

(EQUAL X Y)

is the same as

(OR (EQ X Y)
    (AND (EQUAL (CAR X) (CAR Y))
         (EQUAL (CDR X) (CDR Y))))

As a consequence of this definition, it may be seen that EQUAL need not terminate when applied to looped list structure. In addition, EQ always implies EQUAL. An intuitive definition of EQUAL (which is not quite correct, but is nevertheless useful) is that two objects are EQUAL if they look the same when printed out.

On Multics, where there is a primitive string datatype, EQUAL will return true for strings which are made up of the same characters. On the PDP-10, using the default “fake string” facility (see String), EQUAL will return NIL for lookalike strings unless they are also EQ. To compare two “fake strings”, use SAMEPNAMEP.

Examples:

(equal 0 0.0)		=> NIL

(equal 0.0 0.0)		=> T

(equal '(a b) '(a b))	=> T

(equal 9876. 9876.)	=> T
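
To make the rules above concrete, here is a rough sketch of a function that follows them. MY-EQUAL is a hypothetical name, and this is not how EQUAL is actually implemented. The sketch ignores Multics strings, looks only at the car and cdr of a hunk, and, like EQUAL itself, need not terminate on looped list structure.

```lisp
;; MY-EQUAL is a hypothetical function approximating the description of EQUAL.
(DEFUN MY-EQUAL (X Y)
  (COND ((EQ X Y) T)                          ;EQ always implies EQUAL
        ((AND (NUMBERP X) (NUMBERP Y))        ;numbers: same type and same value
         (AND (EQ (TYPEP X) (TYPEP Y))
              (ZEROP (DIFFERENCE X Y))))
        ((OR (ATOM X) (ATOM Y)) NIL)          ;other atoms must be EQ
        (T (AND (MY-EQUAL (CAR X) (CAR Y))    ;non-atoms: compare car and cdr
                (MY-EQUAL (CDR X) (CDR Y))))))

(my-equal '(a (b 2)) '(a (b 2)))    ;=> T
(my-equal 0 0.0)                    ;=> NIL
(my-equal 'a 'b)                    ;=> NIL
```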


`SAMEPNAMEP`	Function	`(SAMEPNAMEP q₁ q₂)`

This function compares two symbols or strings, q₁ and q₂, to see whether they have the same printname. It is an error if either q₁ or q₂ is not a symbol or a string.
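
The difference between EQ and SAMEPNAMEP shows up with symbols that are not on the obarray. The sketch below assumes that MAKNAM accepts a list of single-character symbols and returns an uninterned symbol whose printname is made from them; that calling convention is an assumption here, not something stated in this section.

```lisp
;; Assumption: (MAKNAM '(F O O)) builds an uninterned symbol named FOO.
(SETQ S (MAKNAM '(F O O)))

(eq s 'foo)            ;=> NIL, different objects
(samepnamep s 'foo)    ;=> T, same printname
```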


`=`	Function	`(= k₁ k₂)`

This function compares two numbers, returning T if k₁ and k₂ have the same value and NIL otherwise. It is an error unless k₁ and k₂ are either both fixnums or both flonums.

Examples:

(= 3 4)		=> NIL

(= 3 3)		=> T

(= 3.0 3.0)	=> T

(= 3 'foo)	=> error! ;FOO NON-NUMERIC VALUE
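
When the types of the arguments are not known in advance, a caller can check them before using =. The following is a sketch under that assumption: SAME-NUMBER-P is a hypothetical name, and falling back to EQUAL for everything else is one possible design choice among several.

```lisp
;; SAME-NUMBER-P is a hypothetical helper, not part of Maclisp.
;; It uses = only when the arguments satisfy its type restriction.
(DEFUN SAME-NUMBER-P (X Y)
  (COND ((AND (FIXNUMP X) (FIXNUMP Y)) (= X Y))   ;both fixnums
        ((AND (FLOATP X) (FLOATP Y)) (= X Y))     ;both flonums
        (T (EQUAL X Y))))                         ;mixed or non-numeric: no error

(same-number-p 3 3)       ;=> T
(same-number-p 3 3.0)     ;=> NIL
(same-number-p 3 'foo)    ;=> NIL
```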