*** This document is now OBSOLETE. Please see setl.org/setl instead. Thanks!

   dB

SETL Documentation

This is reference documentation for the SETL ``library'' of built-in operations, which greatly extends the repertoire given in any of the more tutorial works. But let us begin with some pointers to those works, because until I put the full tree of SETL documentation into perfect on-line hypertext, this document will have to serve as the primary reference.

The textbook for the SETL programming language is Schwartz, J.T., Dewar, R.B.K., Dubinsky, E., and Schonberg, E., Programming with Sets: An Introduction to SETL (1986), Springer-Verlag, New York. You could teach your kids how to program in SETL with this book, and thereby keep their innocent minds unsullied by other languages. It introduces the student to the world of programming through what I regard as the world's most luxurious language. But if you are a seasoned professional, you probably don't want to buy this book until after you've leafed through one or more of the excellent tutorials I will now proceed to describe. (They will print beautifully and cheaply on your PostScript device.)

My personal favourite introduction to SETL is The SETL Programming Language by Robert Dewar. The examples in it should all still work under my free implementation of SETL if you leave out the stuff about macros, backtracking, the data representation sublanguage, and old-style modules. [By the way, if you have trouble downloading the Dewar tutorial using Internet Explorer, try Netscape. I never use IE, so I don't know what kind of trouble you might run into, but at least one person has reported having problems. It may be a mere matter of configuration settings, of course.]

Chapter 1 of that tutorial has been updated for SETL2 by Robert Hummel, and these examples should also all work under my implementation, which flings the concept of language purity to the wind and supports both SETL and ``SETL2'' syntax, or any shameful mix of the two. You really have too much choice in this.

SETL2 is a close relative of SETL created by Kirk Snyder and supported by some excellent (though now rather old) documentation in The SETL2 Programming Language and SETL2: An Update on Current Developments. SETL2 has some clumsy support for closures, and an experimental (and broken) object system. SETL2 also redefines integer division to mean something different than it does in ``real'' SETL: 3 / 2 will give you 1.5 in SETL, but only 1 in SETL2. (I provide a compatibility hack for those who are wedded to that C/Fortran convention instead of the more mathematically satisfying Pascal/Algol one that SETL has historically used and that is the default in my implementation.) Toto Paxia has been doing some further work on SETL2, but I think it is fair to state that as of this writing (January, 1998), the recent work on SETL2 is highly experimental. The old SETL2 implementation remains of interest to those using DOS or Macintosh platforms, though a couple of individuals have already volunteered to try porting my (hopefully more stable and certainly more actively maintained) version of SETL to the newer Mac and PC operating systems.

If you want to play with SETL a little before downloading any implementation, you can do so by visiting Dave's Famous Original SETL Server, which lets you run simple examples right from the Web page.

PSETL by Zhiqing Liu has a fancy back end that records intermediate program states using a space-efficient encoding scheme based on the ``persistent'' data structures (not to be confused with the database meaning of ``persistent'') due to Driscoll, Sarnak, Sleator, and Tarjan in J.Comp.Sys.Sci. 38, 1989. This allows execution histories to be reviewed in complete detail, including all values ever acquired by all variables and when. Its graphical interface makes a great pedagogical and debugging tool.

SETL-S by Robert Dewar and Julius VandeKopple is a high-performance subset implementation for DOS systems.

ISETL by Gary Levin is a more distant relative of SETL that has been widely used in discrete mathematics.

The Slim language by Herman Venter is another interesting ``cousin'' of SETL.

Finally, ProSet is said to be a descendant of SETL.

So is Cantor.

And Bagl, now called SequenceL.

Griffin

My Ph.D. dissertation is perhaps the best available introduction to Internet programming with SETL. It discusses how to use processes, SETL streams, and sockets harmoniously to create your own smooth-running, low-latency server.

Work on set-theoretic languages and programming continues on various fronts, and Enrico Pontelli at the New Mexico State University maintains a site ``Programming with {Sets}'' containing an extensive bibliography, information on workshops and conferences, links to implementation sites, etc. Gianfranco Rossi at the Università di Parma maintains a site in Italy which claims to mirror the NMSU site, while the NMSU site claims to mirror the Italian site. I looped for days trying to figure out which one was the real reflection.

There is also a draft in progress of a new book about SETL at www.settheory.com.

[But now back to our show...

This is called ``SETL Documentation'' because it is the main guide you need for day-to-day use of SETL. It documents the standard operators, functions, and procedures (collectively called ``intrinsics'' or ``built-ins'') that make SETL useful as well as elegant.

So that means it should be called ``SETL Library Documentation'' or something like that, with links to the other things I still need but haven't written yet, like summaries of:

All complete examples should come with links to let you run them interactively straight off the Web page, too.]

[I probably need to proof-read the library routine descriptions to make sure I am in every case distinguishing carefully between what I am defining as the language and what is just a feature of the world's most wonderful implementation. (Now that I am replacing the low-level buffering, it becomes easier than ever to define the semantics of many I/O-oriented, system-oriented, process-oriented etc. routines in terms of Posix routines. It is particularly helpful, for example, to know which flags are passed to open(2) for a given how argument to open.) The terms ``operand'', ``argument'', and perhaps even ``parameter'' are essentially equivalent and I use at least the first two freely. (Need I mention that?)

I also use the notation routine(n) to indicate a system routine in the style of Unix ``man'' pages, so you can for example say ``man 2 read'' (or ``man -s 2 read'' on Solaris) to your friendly neighbourhood shell when you see a reference to read(2). I will probably grab a copy of the Unix 98 documentation from The Open Group at some point if they ever put it into better shape (and provided I can do this without it constituting ``redistribution''), and then process this file to turn all such man-page designations into hyperlinks into the documentation (either the local mirror or the original, depending on how the legalities work out).

One more thing: I speak a little glibly of ``raising exceptions'' in some places; the mechanism for handling them has not yet been fully designed and is certainly not part of the old SETL definition, so you can interpret ``raises exception'' as ``crashes your program, hopefully with some kind of nice traceback or debugger entry.'']


Index to the intrinsics

[Later I should extend this table to include all the words that are ``reserved'' in the default stropping mode, even those which are obsolescent (and therefore will lack hyperlinks). For now, this table just lists the routines (and ``sysvars'' and ``sysvals'').]

# floating not send
* floor notany sendto
** flush notin send_fd
+ fork npow setenv
- from nprint setgid
/ fromb nprinta setpgrp
= frome odd setrandom
/= fsize om setuid
< get open set_intslash
> geta or set_magic
<= getb pack_... shutdown
>= getc peekc shut_rd
? getchar peekchar shut_wr
abs getegid peer_address shut_rdwr
accept getenv peer_name sign
acos geteuid peer_port sin
and getfile pexists sinh
any getgid pid span
arb getline pipe split
asin getn pipe_from_child sqrt
atan getpgrp pipe_to_child status
atan2 gets port stdin
bit_and getuid pow stdout
bit_not getwd pretty stderr
bit_or gmark print store_...
bit_xor gsub printa str
break hex pump strad
call hostaddr put sub
callout hostname puta subset
ceil ichar putb symlink
char impl putc system
chdir in putchar sys_read
clear_error incs putenv sys_write
clock intslash putfile tan
close ip_addresses putline tanh
command_line ip_names puts tie
command_name is_... random time
cos is_open range tmpnam
cosh kill rany to_lower
date last_error rbreak to_upper
denotype len rlen tod
div less rmatch true
domain lessf rnotany type
dup lexists rspan umask
dup2 link read ungetc
eof log reada ungetchar
even lpad readlink unhex
exec magic reads unlink
exp mark recv unpack_...
false match recvfrom unpretty
fdate max recv_fd unsetenv
fetch_... mem_alloc rem unstr
fexists mem_copy reverse val
filename mem_free rewind wait
fileno min round whole
filter mod routine with
fix nargs rpad write
float newat seek writea
fixed no_error select

The syntax of this presentation looks ahead to the SETL in which operators, functions, and procedures can be overloaded. For example, the cardinality operator (the first one below) is declared as three overloadings of ``#'', but if you were to define your own operator card in the existing SETL, it would have to be something like this:

op card (s);
  case
    is_string s =>   -- length of string
      n := 0;
      for x in s loop n +:= 1; end loop;
      return n;
    is_set s =>   -- cardinality of set
      n := 0;
      for x in s loop n +:= 1; end loop;
      return n;
    is_tuple s =>   -- length of tuple
      n := 0;
      for x in s loop n +:= 1; end loop;
      return n;
    otherwise =>
      printa(stderr,"card: invalid argument type: ",type s);
      stop 1;
  end case;
end op card;

In a proposed version of SETL [to be discussed in my thesis], you will be able to implement the same operator as three overloaded ones like this, with exactly the same effect in the case of a valid argument (and just a possibly different error message otherwise):

op card (set s) : integer
  n := 0
  for x in s loop n +:= 1; end loop
  return n
end op card

op card (string s) : integer
  n := 0
  for x in s loop n +:= 1; end loop
  return n
end op card

op card (tuple t) : integer
  n := 0
  for x in t loop n +:= 1; end loop
  return n
end op card
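For readers who would like to experiment with the idea today, here is a rough analogy in Python rather than SETL. It is only a sketch: `functools.singledispatch` chooses an implementation from the run-time type of the argument, which is not the same thing as the proposed SETL overloading, and the helper name `_count` is my own invention for the illustration.

```python
from functools import singledispatch


def _count(s):
    """Count elements by iteration, as the SETL bodies above do."""
    n = 0
    for _ in s:
        n += 1
    return n


@singledispatch
def card(s):
    # Fallback: reached only when no registered overloading matches.
    raise TypeError(f"card: invalid argument type: {type(s).__name__}")


@card.register(set)
def _(s):
    return _count(s)      # cardinality of set


@card.register(str)
def _(s):
    return _count(s)      # length of string


@card.register(tuple)
def _(s):
    return _count(s)      # length of tuple
```

As in the SETL version, a valid argument gives the same answer whichever way the operator is written, and only the error for an invalid argument differs.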

Noting that the original definition could be written as follows (in old-style syntax for variety), you might wonder what the advantage to overloading is. Ultimately, if you were a computer scientist, the question would lead you into the world of ``polymorphic'' functions; the body of the following operator definition is clearly applicable to all kinds of aggregates, and the restriction to sets, tuples, and strings is an artifice from that point of view (or would be, if you could define other kinds of aggregates as first-class types in SETL).

op card (s);
  case of
  (is_set s, is_string s, is_tuple s):
    n := 0;
    (for x in s) n +:= 1; end;
    return n;
  else
    printa(stderr,"card: invalid argument type: ",type s);
    stop 1;
  end case;
end op card;

It is interesting to note that you could actually just delete the type check in the above, producing the seemingly very tidy definition below. If the argument s were not of a type susceptible to in-style iteration, you would get a different error message, that is all. The computer scientist would consider that a kind of uncontrolled polymorphism, because ideally the inappropriateness of the type would be caught at entry to card instead of at the point where iteration was attempted. That is an advantage that overloading always has over the old free-and-easy approach, though it is perhaps clearer in non-polymorphic cases. But even in cases like the present one, languages such as Griffin and ML can describe an argument such as s purely in terms of the operations (such as iteration) supported by s. If you later add a type called ``list'' that supports in-iteration, your polymorphic definition would still be valid without adding the new case.

op card (s);
  n := 0;
  loop for x in s do n +:= 1; end loop;
  return n;
end op card;

Unfortunately, we cannot give the ``controlled'' polymorphic version in SETL, which means we must spell out a few cases where some languages would be able to express a whole group of such generically related operators in a single sweeping flourish. On the other hand, features like this require considerable implementation effort and are widely regarded as exotic, even though they provide extra opportunities for static checking. They seem to be mainly of value to programmers who are designing highly general libraries and interfaces, and for that erudite audience a language more at the level of Ada 95 is really much more suitable than a language like SETL, whose ``high level'' nature consists mainly in predefining abstractions that are useful in prototyping and data processing.

Indeed, since polymorphism is much like having different versions of code around, in many cases, people will often in fact prefer to generate code with their own tools rather than learn the conventions of a particular programming language (such as how to define ``templates'' or ``generics'' in it). Rolling your own code by your own rules has the advantage that you maintain complete flexibility in how it is generated. This sometimes wins over the disciplined approach, though perhaps not as often as some of us hacks would like to believe.

Finally, getting back to the presentation conventions in the detailed descriptions of built-in routines below, the still miraculously alert reader will note that in two cases (``**'' on integer arguments, and val), the notation steps beyond any reasonable proposal for overloading in SETL, because the return type actually depends on the input values, not just the input types. This presents no problem as a mnemonic notation, but it would be rather difficult to accommodate such declarations formally.

In fact, there is also one case (``/'' on integer arguments) where the return type depends on the current value of a switch (intslash) that can be toggled at run-time, and several cases where a return value of om is possible even though this is not apparent from the given signature but instead is just mentioned in the description of the routine.

For all these cases where multiple return types are listed or described for a single combination of input types, the reality is that barring some extraordinary proposal, there will only ever be one declaration for that combination of input types, and its return type will be general enough to include the union of all the possible return types. In the notation here, only one such union is invoked, and is called var, representing the universe of all SETL values. See for example unstr, which takes strings to various types, all lumped together as var.

[I didn't bother introducing number* as the union of integer and real, because there are really very few cases (all mentioned above) where it would have been useful; in a notation meant for human consumption, it is better to have the explicit, short, and infrequent list of possibilities each time. Is this worth blathering on about? Would it be better to put number* in and change the existing blathering to mention it, lauding it as looking ahead to when there is rational* and complex* as well? I don't think so, and in fact that would miss the point, because obviously (for example) the slash operator would choose just between rational* and integer for the quotient of integers; you still wouldn't want number* given as the return type for that case.]

(*) Imaginary keywords, like spam*, are indicated by a trailing asterisk, to show that they may have been in Indo-European but are not to be found in modern-day SETL.


The intrinsics, in asciibetical order

In the following, var is used to denote an argument or return value of any type. Three dots indicate 0 or more optional arguments. (To repeat, there are no type declarations in this SETL, but we pretend there are in the signatures listed here. If a routine can return a result other than om, this is signified by a colon and a typename. A slight liberty is taken with the typenames where a result is normally of a particular type but sometimes om. In such cases, the particular typename is used to stand for both possibilities. As a general guideline, there's no point in your checking for om if the operation is so outrageous that it raises an exception. If you take a chance and don't bother checking for om, your program will probably crash almost immediately if an unintended one does crop up anyway, so not checking for om can even be perfectly reasonable style in circumstances where om is technically possible. Conversely, checking for om where it cannot occur is at worst redundant. But the documentation should really have been strict on this. Patches will be gratefully accepted.)


Size of set, length of string or tuple

op # (set) : integer
op # (string) : integer
op # (tuple) : integer

Numeric multiplication, set intersection, string or tuple replication

op * (integer, integer) : integer
op * (real, real) : real
op * (real, integer) : real
op * (integer, real) : real
op * (set, set) : set
op * (string, integer) : string
op * (integer, string) : string
op * (tuple, integer) : tuple
op * (integer, tuple) : tuple

Exponentiation

op ** (integer, integer) : integer
op ** (integer, integer) : real
op ** (real, real) : real
op ** (real, integer) : real
op ** (integer, real) : real

When both arguments are of integer type, the return type is real if and only if the second argument is negative.
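The value-dependent return type can be modelled in a few lines of Python (not SETL). This is a sketch under my own assumptions: the name `setl_pow` is hypothetical, Python's int and float stand in for SETL's integer and real, and `math.pow` is used for the real-valued cases.

```python
import math


def setl_pow(base, exponent):
    """Model of '**': integer result only for integer ** non-negative integer."""
    if isinstance(base, int) and isinstance(exponent, int) and exponent >= 0:
        return base ** exponent            # stays an integer
    # Negative integer exponent, or any real operand: the result is real.
    return math.pow(base, exponent)
```

So `setl_pow(2, 10)` is the integer 1024, while `setl_pow(2, -1)` is the real 0.5, even though both calls pass two integers.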


Numeric affirmation or addition, set union, string or tuple concatenation

op + (integer) : integer
op + (real) : real
op + (integer, integer) : integer
op + (real, real) : real
op + (real, integer) : real
op + (integer, real) : real
op + (set, set) : set
op + (string, string) : string
op + (tuple, tuple) : tuple
op + (string, var) : string
op + (var, string) : string

The binary forms in which one argument is a real and the other is an integer are treated as if the integer is ``promoted'' to a real before addition using float (which see for more information about floating-point overflow).

The binary forms in which one argument is a string and the other is not are treated as if str is first applied to the non-string argument to convert it preparatory to string concatenation.
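The two promotion rules just described can be sketched as a small Python model (again, not SETL). The name `setl_plus` is hypothetical, Python's built-in `str` merely stands in for SETL's str (the exact text produced for reals and aggregates will differ), and Python sets and tuples stand in for their SETL namesakes.

```python
def setl_plus(a, b):
    """Model of binary '+' with the promotions described above."""
    if isinstance(a, str) or isinstance(b, str):
        # A non-string operand is converted first, then concatenated.
        left = a if isinstance(a, str) else str(a)
        right = b if isinstance(b, str) else str(b)
        return left + right
    if isinstance(a, (set, frozenset)) and isinstance(b, (set, frozenset)):
        return a | b                       # set union
    if isinstance(a, tuple) and isinstance(b, tuple):
        return a + b                       # tuple concatenation
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a + b                       # an integer is promoted when mixed with a real
    raise TypeError("setl_plus: invalid operand types")
```

For example, `setl_plus("n = ", 42)` gives the string "n = 42", and `setl_plus(1, 2.5)` gives the real 3.5.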

See also ``?'' regarding special-case treatment of the ``+:='' operator when the left-hand argument has the initial value om.


Numeric negation or subtraction, set difference

op - (integer) : integer
op - (real) : real
op - (integer, integer) : integer
op - (real, real) : real
op - (real, integer) : real
op - (integer, real) : real
op - (set, set) : set

Numeric quotient

op / (integer, integer) : real
op / (integer, integer) : integer
op / (real, real) : real
op / (real, integer) : real
op / (integer, real) : real

Note the return type of integer division here: by default, integer / integer yields a real. You can cause integer / integer to return a truncated (integer) result by setting intslash := true (or equivalently, by calling set_intslash with an argument of true), but you do so at your peril. Consider a program that reads pairs of numbers and computes their quotients. Unless you are careful to ensure that each number (or at least one of each pair) is real, your program will sometimes truncate quotients and sometimes not, depending (mostly!) on whether the input numbers happen to have decimal points in them or not.

Without touching intslash, you can always get truncated integer division using the div operator, as in the Algol tradition. This has the advantage of making explicit the fact that the division is not the kind implied by the customary mathematical symbol.

C and Fortran programmers trip constantly over the bad design decision made in those languages for them. Even when they know the rule, it is an easy one to forget when trying to stare down a bug. In such a mode one usually focuses intensely on the ``logic''. My favourite story in that connection is by Jack Schwartz, who reports spending hours once upon a time trying to puzzle out why a particular segment of Fortran code wasn't working. He had isolated the problem down to a single line, and almost succeeded in convincing himself that there was a code generation bug in the compiler, when he finally noticed the comment symbol in column 1. Anyway, I helped a student only yesterday (summer solstice, 1995) with one of these pestilent C integer division things. It hits almost everyone who uses C at some time or other; usually what happens is that you are thinking that at least one operand is floating-point but actually have them both declared as int. Okay, enough diatribe. This is supposed to be a reference guide, isn't it?
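The peril is easy to reproduce in a small Python model (not SETL). Everything here is a sketch: the names `parse_number`, `truncate`, and `slash` are hypothetical, the `intslash` keyword argument stands in for the run-time switch, and I am assuming that the truncation goes toward zero as it does for div.

```python
def parse_number(text):
    """Read a number carelessly: real only if the text contains a decimal point."""
    return float(text) if "." in text else int(text)


def truncate(a, b):
    """Integer quotient truncated toward zero (assumed to match div)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def slash(a, b, intslash=False):
    """Model of '/': real quotient unless intslash is on and both operands are integers."""
    if intslash and isinstance(a, int) and isinstance(b, int):
        return truncate(a, b)
    return a / b
```

With the switch on, the inputs "3" and "2" give 1 but the inputs "3.0" and "2" give 1.5, so the program's arithmetic depends on how its data happened to be typed. With the switch off, both give 1.5.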


Equality and inequality

op = (var, var) : boolean
op /= (var, var) : boolean

Numeric and lexicographic comparisons

op < (integer, integer) : boolean
op < (real, real) : boolean
op < (real, integer) : boolean
op < (integer, real) : boolean
op < (string, string) : boolean
op > (integer, integer) : boolean
op > (real, real) : boolean
op > (real, integer) : boolean
op > (integer, real) : boolean
op > (string, string) : boolean
op <= (integer, integer) : boolean
op <= (real, real) : boolean
op <= (real, integer) : boolean
op <= (integer, real) : boolean
op <= (string, string) : boolean
op >= (integer, integer) : boolean
op >= (real, real) : boolean
op >= (real, integer) : boolean
op >= (integer, real) : boolean
op >= (string, string) : boolean

Query

op ? (var, var) : var

The expression a ? b is equivalent to the expression if a = om then b else a end. Thus the query operator is ``short-circuited'' like and and or. For a map m used to count occurrences of items, the sequence

m(item) ?:= 0;  -- initialize if undefined
m(item) +:= 1;  -- accumulate

is idiomatic in SETL. There is in fact a case to be made for treating an undefined operand of ``+'' as 0 if the other argument is numeric, or as the null string if the other argument is a string, or as {} if it is a set, or as [] if it is a tuple. The fact that ``*'' has overloadings like (integer, string) suggests that it probably should not treat om as the identity element for multiplication (1) even when one argument is numeric. Opinions are welcome.

Experimentally, it has been decided that the above sequence is more idiotic than idiomatic. Bob Paige, if I (dB) recall correctly, thought it was more of a nuisance than a useful guard against failure-to-initialize errors to have to do the ``?:= 0'' part, and I agree.

So, whenever the left-hand side of ``+:='' has the (initial) value om, and the right-hand side is an integer, real, string, set, or tuple, the left-hand side will be treated as if it had the appropriate identity element as its initial value, i.e. 0, 0.0, "", {}, or [] respectively. Thus the above sequence can be shortened to the (more rational?)

m(item) +:= 1;  -- initialize if undefined, and accumulate

This rule takes precedence over the non-assigning-form rule for the ``+'' operator, which makes om + s = str om + s for any string s.
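Here is how the query operator and the ``+:='' special case might look in a Python model (not SETL), with None standing in for om. The names `query`, `plus_assign`, and `count_items` are hypothetical. One caveat is flagged in the code: a Python function call evaluates both arguments eagerly, so `query` does not reproduce the short-circuiting of ``?''.

```python
def query(a, b):
    """Model of a ? b. Unlike SETL, Python evaluates b before the call."""
    return b if a is None else a


def plus_assign(lhs, rhs):
    """Model of lhs +:= rhs, treating an undefined lhs as the identity element."""
    if lhs is None:
        lhs = type(rhs)()          # 0, 0.0, "", set(), or () as appropriate
    if isinstance(rhs, set):
        return lhs | rhs           # '+' on sets is union
    return lhs + rhs


def count_items(items):
    """Count occurrences with the shortened idiom m(item) +:= 1."""
    m = {}
    for item in items:
        m[item] = plus_assign(m.get(item), 1)
    return m
```

Calling `count_items("abca")` yields a map in which "a" has the count 2 and "b" and "c" each have the count 1, with no explicit initialization step.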


Numeric absolute value, integer value of character

op abs (integer) : integer
op abs (real) : real
op abs (string) : integer

For a string operand, abs is equivalent to ichar.


[The reason there is no access routine is that the Unix access(2) permissions relate to the real user and group ids, whereas the success of open depends on the effective uid and gid. I don't want any routines in the SETL library with such subtle semantics. Use system with the test(1) command to get the information access would retrieve if you really need it. See also fexists (the file existence predicate).] [That seems a little silly, doesn't it, dB?]


Accept connection on socket

proc accept (integer) : integer
proc accept (string) : integer

The argument must be a server socket opened by open. The accept procedure blocks until a client connection comes in, and then returns a new socket for that connection. The select procedure can be used to test whether an accept would block on the given server socket, and pump or fork is often used to help programs serve clients concurrently instead of making them queue for service.

It is possible for accept to fail due to conditions arising between the time of a successful select and the issuing of the accept call. In this case accept returns om.

See also my Ph.D. dissertation for detailed guidance on network programming in SETL.


Arc cosine

op acos (real) : real
op acos (integer) : real

Logical conjunction

op and (boolean, boolean) : boolean

The expression a and b is equivalent to the expression if a then b else false end. The and operator is ``short-circuited'' in that it only evaluates its second argument if necessary. This makes it suitable for use as a ``guard'' against erroneous evaluations such as subscripting a tuple with a nonpositive integer (see also or):

if i > 0 and t(i) = "banana" then ...

Break out initial character

proc any (rw string s, string p) : string

If the first character of s appears anywhere in the string p, that character is removed from s and returned as the function result. Otherwise, nothing happens to s, and the null string is returned. See also notany, rany, and rnotany.
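A Python model (not SETL) may make the two outcomes concrete. Python strings are immutable and there are no rw parameters, so this hypothetical `setl_any` returns a pair: the function result and the updated value of s.

```python
def setl_any(s, p):
    """Model of any(s, p): returns (result, updated s)."""
    if s and s[0] in p:
        return s[0], s[1:]     # first character broken off
    return "", s               # nothing happens to s


result, rest = setl_any("hello", "gh")
```

After the call above, `result` is "h" and `rest` is "ello". Had p been "xyz", `result` would be the null string and `rest` would still be "hello".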


Arbitrary element of set

op arb (set) : var

An arbitrary (which is not to say ``random'' but rather nondeterministically chosen) element of the argument set is returned. If the set is empty, om is returned.


Arc sine

op asin (real) : real
op asin (integer) : real

Arc tangent

op atan (real) : real
op atan (integer) : real

See also atan2.


Arc tangent of quotient

op atan2 (real, real) : real
op atan2 (real, integer) : real
op atan2 (integer, real) : real
op atan2 (integer, integer) : real

Suppose y = c * sin t and x = c * cos t, for some c and t. Then y atan2 x = t can be safely evaluated even when the otherwise equivalent expression atan(y/x) would overflow, i.e., when cos t = 0.
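The same contrast exists in most languages, so it can be tried out in Python (not SETL) with the standard `math` module. The function names `angle` and `naive_angle` are hypothetical. In Python the failure of the naive form shows up as a ZeroDivisionError rather than an overflow.

```python
import math


def angle(y, x):
    """Two-argument arc tangent: safe even when x is 0."""
    return math.atan2(y, x)


def naive_angle(y, x):
    """One-argument form: the division fails when x is 0."""
    return math.atan(y / x)


straight_up = angle(2.0, 0.0)   # the case cos t = 0, with c = 2
```

Here `straight_up` is pi/2 (to within rounding), whereas `naive_angle(2.0, 0.0)` never gets as far as the arc tangent.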


Bitwise logical operators

op bit_and (integer, integer) : integer
op bit_not (integer) : integer
op bit_or (integer, integer) : integer
op bit_xor (integer, integer) : integer

These operators treat their operands as binary integers, that is to say as bit patterns. They are machine-dependent with respect to word size, but have some use in expressing certain algorithms that do specialized tricks with bit patterns.


Break out initial substring based on delimiter

proc break (rw string s, string p) : string

If s contains a character that appears in p, the substring of s up to but not including that character is ``broken off'' from the beginning of s and returned as the function result, while s itself is updated to reflect the loss. Otherwise (no character from p appears in s), the function result is the input value of s, and s is reduced to the null string by the operation. See also span, rbreak, and rspan.
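As a sketch of both cases, here is a Python model (not SETL). Because Python has no rw parameters, the hypothetical `setl_break` returns a pair: the function result and the updated value of s. Note that the delimiter itself stays at the front of what remains.

```python
def setl_break(s, p):
    """Model of break(s, p): returns (result, updated s)."""
    for i, ch in enumerate(s):
        if ch in p:
            return s[:i], s[i:]    # delimiter remains at the start of s
    return s, ""                   # no delimiter: everything is broken off


key, rest = setl_break("colour=blue", "=:")
```

After the call above, `key` is "colour" and `rest` is "=blue".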


Indirect call

proc call (proc_ref, ...) : var

See routine. Note that all arguments to call are read-only, so the procedure referenced through the proc_ref value must not have any rw or wr arguments. It may, however, return a result of any type, including tuple, so multiple values can be returned at relatively minor syntactic cost to the caller like this:

f_ref := routine f;
...
[x, y, z] := call (f_ref, a, b, ...)
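For comparison, the same pattern in Python (not SETL), where functions are already first-class values. This is only an analogy: the Python `call` below is a hypothetical stand-in for the intrinsic, the assignment `f_ref = f` plays the role of routine, and the body of `f` is made up for the illustration.

```python
def call(proc_ref, *args):
    """Model of call: invoke the referenced procedure with read-only arguments."""
    return proc_ref(*args)


def f(a, b):
    # Return several values at once as a tuple.
    return a + b, a - b, a * b


f_ref = f                       # like f_ref := routine f
x, y, z = call(f_ref, 7, 3)     # like [x, y, z] := call (f_ref, a, b)
```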

Call C function

proc callout (integer, om, tuple) : string

This is a SETL2 compatibility feature of dubious value. It is easier to use the SETL customization protocol, and you get a superior result (the interface you want) that way. The SETL2 callout interface is very awkward, and was invented so that the SETL2 interpreter could be extended without the need for any of its source code to be revealed. Basically, you had to supply a C function with a predetermined name and have it dispatch on the integer argument. You would then link your C function with the SETL2 interpreter in place of the default no-op version.


Ceiling (lowest integer upper bound)

op ceil (real) : integer
op ceil (integer) : integer

See also floor, round, and fix.


Character representation of small integer

op char (integer) : string

For example, char 97 = "a" in an ASCII environment. Note that in SETL, it is safe to include null characters (characters corresponding to the integer 0) inside strings, so char 0 is well-defined. See also ichar and abs.
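A Python model (not SETL) of the round trip between char and ichar, including the null character, might look like this. Python's `chr` and `ord` stand in for the SETL operators, and the one-character check in `ichar` is my assumption about what a sensible implementation would insist on.

```python
def char(i):
    """Character whose code is i."""
    return chr(i)


def ichar(s):
    """Integer code of a one-character string (what abs does for strings)."""
    if len(s) != 1:
        raise ValueError("ichar expects a one-character string")
    return ord(s)


packet = "a" + char(0) + "b"    # the embedded null is an ordinary character
```

The string `packet` has length 3, and its middle character has the code 0.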


Change directory

proc chdir
proc chdir (string s)

The current working directory is changed to s if given, otherwise to the user's ``home'' directory as found in the HOME environment variable. See also getwd.


Clear error indicator

proc clear_error

The local system's ``last error'' indicator is cleared. It is a good idea to do this just before a call if you intend to check last_error after the call, as in this example:

clear_error;
kill(p, 0);
if last_error = no_error then
  -- The process p exists and can be sent a signal.
  ...
else
  -- "No such process", or else the process exists but
  -- we can't signal it.
  ...
end if;

This approach can be used whenever you want to check for errors after calling a ``system'' routine like kill or close, which are normally silent about errors. Contrast this with open, which returns om instead of a file descriptor if there is an error, allowing you to test the result directly.

Note that the above example is for illustrative purposes only; in practice, you would use the pexists operator to test for process existence more directly.
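The underlying Posix trick, sending signal 0 with kill(2) to probe a process without disturbing it, can be tried from Python (not SETL) on a Unix system. This is a sketch: the function name `probe` is hypothetical, and Python reports the failure as an exception carrying errno, whereas the SETL routines stay silent and leave the evidence in last_error.

```python
import os


def probe(pid):
    """Return None if signal 0 could be sent to pid, else the errno that resulted.

    Posix only: signal 0 performs the permission and existence checks
    without actually delivering a signal.
    """
    try:
        os.kill(pid, 0)
    except OSError as e:
        return e.errno      # e.g. errno.ESRCH or errno.EPERM
    return None


self_check = probe(os.getpid())   # our own process certainly exists
```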


Elapsed time in milliseconds

proc clock : integer

This is the total amount of wall-clock (``real'') time, in milliseconds, that has passed since the current process began. See also time and tod.


Close stream

proc close (string f)
proc close (integer f)

The stream f previously passed to or returned by open (or returned by accept) is closed. It is also permissible to close a predefined stream such as stdin, stdout, and stderr (or equivalently 0, 1, and 2 respectively), and to close a file descriptor (integer f) that is not open at the SETL level but may be open at the underlying system level. Closing a SETL pipe, pump, or line-pump stream causes the associated subprocess to be waited for, and status to be set to its exit status.

See also shutdown.


Command-line parameters

command_line : const tuple

This is a tuple of strings giving the command-line parameters that were passed to the SETL program at execution time. In a Unix script that uses the ``#!'' kernel escape, or a SETL program compiled to machine code, these are all the parameters after (but not including) the command name. If the program is being run under the standard Unix driver (usually the ``setl'' command), command_line lists all the arguments after the ``-x'' flag. See also command_name.


Command name

command_name : const string

This string is the name by which the SETL program was invoked from the system. In Unix, this is usually the name of a file containing a SETL script (see example below), or of a file containing the machine-code version of a SETL program. If, however, the program was run under the standard Unix driver (usually the ``setl'' command), then command_name instead returns the name of the SETL interpreter (usually ``setlrun'').

As a Unix example, suppose the standard executables of the SETL system have been installed in ``/usr/bin''. Then if the following script is stored in the executable file ``Yhwh'' and invoked, it will write "I'm Yhwh" and a newline character to the standard output:

#! /usr/bin/setl -k
print("I'm", command_name);  -- like sh $0 or C argv[0]

Note the use of the ``#!'' kernel escape. See also command_line.

Trigonometric cosine

op cos (real) : real
op cos (integer) : real

Hyperbolic cosine

op cosh (real) : real
op cosh (integer) : real

Date and time of day

proc date : string

See also fdate, tod, clock, and time.


Type of denotation within string

op denotype (string s) : string

If s contains a denotation that would be acceptable to unstr, then denotype s = type unstr s. If s is some other string, denotype returns om instead of raising an exception as unstr would, which is the advantage of checking a string with denotype first.

See also val and str.


Integer division

op div (integer, integer) : integer

SETL guarantees that div always truncates fractional results towards zero. (In C before the 1999 standard, the direction of rounding was implementation-defined for negative quotients.) See also mod and rem.
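The truncation rule can be sketched in Python (not SETL), whose own ``//'' operator floors rather than truncates and so makes a handy contrast. The name setl_div is made up for this illustration.

```python
def setl_div(a: int, b: int) -> int:
    """Model of SETL's integer div: the quotient truncated towards zero."""
    q = abs(a) // abs(b)                     # magnitude of the truncated quotient
    return q if (a < 0) == (b < 0) else -q   # reattach the sign

print(setl_div(7, 2), setl_div(-7, 2), setl_div(7, -2), setl_div(-7, -2))
# prints: 3 -3 -3 3
# Python's floor division differs for mixed signs: -7 // 2 == -4
```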


Domain of map

op domain (set) : set

The argument must be a set of ordered pairs, that is, a set of tuples each of size 2. The result is the set of all first members of those tuples. See also range.


Duplicate file descriptor

op dup (integer) : integer
op dup2 (integer, integer) : integer

These are direct interfaces to the low-level Posix routines dup(2) and dup2(2), useful when you need close control over system-level file descriptors, typically in fork-and-exec situations.


Test for end of file

proc eof : boolean
proc eof (string) : boolean
proc eof (integer) : boolean

Called with no arguments, eof indicates whether the last input operation was incomplete because the end of the stream was reached. Called with a single argument referring to an open stream, it returns the end-of-file status of that particular stream. See also open, get, geta, getb, getc, getchar, getline, getn, gets, peekc, peekchar, read, reada, recv, and recvfrom, but do not see getfile. [(:-)]

Test for integer divisible by 2

op even (integer) : boolean

Replace current process

proc exec (string cmd)
proc exec (string cmd, tuple argv)
proc exec (string cmd, tuple argv, tuple envp)

This is a low-level interface to the Posix routine execve(2). The cmd is a full pathname identifying a command in the local system. If there is a second argument, argv, it must be a tuple of strings specifying the ``argv'' array that will be supplied to the command. This defaults to a one-element tuple, [cmd]. If envp is present, it must be another tuple of strings defining ``envp'' for the command. (See execve(2) for the meanings of ``argv'' and ``envp''.) If exec is successful, it does not return; the current process is overlaid by the new command. The execve(2) routine does not take the value of the PATH environment variable into consideration, but you can get that effect easily without processing PATH yourself by using something like:

exec ("/bin/sh", ["-sh", "-c", "command and arguments"]);

Compare filter, system, and the open modes "PUMP", "LINE-PUMP", "PIPE-IN", and "PIPE-OUT", one of which will probably be able to achieve the effect you want more conveniently than exec does.

See also fork and especially pump.

[I'm not entirely happy with the PATH not applying to cmd, even though exec is decidedly low-level. Anyway, if you stick to absolute pathnames for now, as in the above example with "/bin/sh", you'll be safe even if PATH does get used later. (Whatever that means.) You accept shell expansion of your "command and arguments" whether you like it or not with this approach, though.]


Natural exponential (e raised to a power)

op exp (real) : real
op exp (integer) : real

Predefined ``false'' boolean value

false : const boolean

See also true.


Format date and time

proc fdate (integer ms, string fmt) : string
proc fdate (integer ms) : string

The ms argument represents some number of milliseconds since 1 January 1970 UTC, to be formatted as a date and time according to fmt, which defaults to "%a %b %e %H:%M:%S.%s %Z %Y". For example, fdate (936433255069) is "Sat Sep  4 04:20:55.069 EDT 1999" in the timezone the author occupied at a certain moment in history, and fdate (tod) is a similarly fancy rendering of the current calendar time.

The %-sign patterns in fmt which expand to the various slices of eternity embodied in ms are those patterns defined for strftime(3) when applied to the result of applying localtime(3) to ms div 1000, together with one extension: %s will expand to the low-order 3 decimal digits of ms.
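As a rough model of that rule in Python (not SETL), the following sketch expands the %s extension by hand and then passes the rest of the format to strftime on the local time for ms div 1000. The name fdate_model is hypothetical, ms is assumed to be non-negative, and the default format relies on %e, which not every C library supports.

```python
import re
import time

def fdate_model(ms: int, fmt: str = "%a %b %e %H:%M:%S.%s %Z %Y") -> str:
    """Format ms (milliseconds since the epoch) in the local timezone."""
    millis = f"{ms % 1000:03d}"          # low-order 3 decimal digits of ms
    # Expand the %s extension first, leaving a literal "%%" for strftime.
    expanded = re.sub(
        r"%%|%s",
        lambda m: m.group(0) if m.group(0) == "%%" else millis,
        fmt,
    )
    return time.strftime(expanded, time.localtime(ms // 1000))

# Seconds and milliseconds of the moment used in the example above.
print(fdate_model(936433255069, "%S.%s"))
```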

See also date, which should give the same result as the expression fdate (tod, "%c").


Machine memory read

proc fetch_char (integer address) : integer
proc fetch_short (integer address) : integer
proc fetch_int (integer address) : integer
proc fetch_long (integer address) : integer
proc fetch_float (integer address) : real
proc fetch_double (integer address) : real
proc fetch_long_double (integer address) : real
proc fetch_string (integer address, integer n) : string
proc fetch_c_string (integer address) : string

These are extremely low-level, machine-dependent, uncontrolled procedures to get at specific locations in the computer's memory. The integer address is assumed to contain a machine address from which some number n of bytes will be read. For fetch_string, n is given by the second argument. For fetch_c_string, n is deduced using the C convention of a null terminating character. For fetch_char, n = 1. For all the rest, n is machine-dependent. These procedures are typically only used if you have customized a C library interface rather roughly and want to peek inside structs based on pointers to them without going to the trouble of mapping the structs to SETL objects properly.

See also store_..., pack_..., unpack_..., and mem_copy.


Test for existence of file

op fexists (string) : boolean

Return true if the Posix stat(2) routine returns 0, indicating that the file named by the string exists, otherwise false.

See also lexists, link, symlink, readlink, and unlink.


Return filename of open stream

op filename (string) : string
op filename (integer) : string

The filename operator returns the string originally used as the first argument to open, if any. Otherwise it returns om.


Return fd of open stream

op fileno (string) : integer
op fileno (integer) : integer

The fileno operator returns the ``file descriptor'' (fd) associated with the open stream designated by the argument. Applied to an fd, it should merely return that fd, but applied to om, it will raise an exception, so the following idiom is common for programs that would rather crash immediately than continue with a non-fd result from open:

fd := fileno open(...);

The nominal use of fileno, however, is to obtain the fd associated with the string designation of a stream. Note that SETL buffering is active regardless of whether you refer to streams through their fd's or by their original names. The use of the fd is preferable from the standpoint of uniqueness, and on many implementations is likely to be more efficient, so fd's are the generally recommended stream designators.


Filter string through external process

proc filter (string cmd, string input) : string
proc filter (string cmd) : string

The cmd argument designates a command that reads from standard input and writes to standard output. The input argument is a string (default null) that is fed into the command's standard input, and the result string is the contents of the command's standard output. The command itself is processed by the standard Bourne shell, sh(1). Thus it may contain arguments with patterns such as "2>&1" to redirect the standard error stream into the same destination as the standard output refers to, like the SETL call dup2 (1, 2) does, and may also contain special characters such as "*" for filename ``globbing''.
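For readers who know Python better than SETL, the behaviour described above resembles this sketch, which hands cmd to /bin/sh, feeds it a string on standard input, and collects standard output. The name filter_model is an invention for illustration, and the sketch says nothing about how SETL itself implements filter.

```python
import subprocess

def filter_model(cmd: str, text_in: str = "") -> str:
    """Run cmd under the Bourne shell, feeding text_in to its stdin."""
    done = subprocess.run(
        ["/bin/sh", "-c", cmd],      # the shell does globbing and redirection
        input=text_in,
        stdout=subprocess.PIPE,      # only standard output is captured
        text=True,
    )
    return done.stdout

print(filter_model("tr a-z A-Z", "hello\n"), end="")
# prints: HELLO
```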

See also status, system, pump, and the open modes "PUMP", "LINE-PUMP", "PIPE-IN", and "PIPE-OUT".


Truncate floating-point number to integer

op fix (real) : integer
op fix (integer) : integer

Truncation is towards zero. See also ceil, floor, and round.


Convert to floating-point

op float (integer) : real
op float (real) : real

Note that since integers in SETL are unbounded, but reals are limited in an implementation-dependent way (usually to 64-bit IEEE floating point or something close to that---the C ``double'' type for local C implementations is likely to be the best guide to the actual constraints), it is possible for this conversion to fail, with unpredictable results. Loss of precision is also possible for integers whose absolute value is larger than what can fit in the ``mantissa'' of the local real representation, which is likely to be on the order of only 50 bits or so.
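Python has the same pairing of unbounded integers with C ``double'' reals, so it can illustrate both hazards. This is only an analogy, assuming 64-bit IEEE reals; the helper name survives_round_trip is made up, and SETL's behaviour on overflow is, as stated above, unpredictable rather than an exception.

```python
def survives_round_trip(n: int) -> bool:
    """True if converting n to floating point and back loses nothing."""
    return int(float(n)) == n

print(survives_round_trip(2**53))       # True: fits in the 53-bit mantissa
print(survives_round_trip(2**53 + 1))   # False: the low-order bit is lost

try:
    float(10**400)                      # far beyond the range of a double
except OverflowError as err:
    print("conversion failed:", err)
```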

[The specification for the overflow case may be tightened up at some point to guarantee that the result be some representation of floating-point infinity, which IEEE supports. A clever SETL implementation could work around a lack of such representational support by maintaining the infinity indication in separate bits.]


Format number

proc fixed (real x, integer w, integer a) : string
proc fixed (integer x, integer w, integer a) : string
proc floating (real x, integer w, integer a) : string
proc floating (integer x, integer w, integer a) : string

The number x is converted to a string of length w having a digits after the decimal point. The string is padded on the left with blanks. If a = 0, there is no decimal point. If the converted string does not fit within the requested length, a longer string will be produced as necessary to accommodate the number, with no leading blanks. The only difference between fixed and floating is that floating prints the number in ``scientific'' notation, that is, with E+dd appended on the right to stand for ``times 10 to the power dd'', and fixed does not.
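The padding and overflow rules can be mimicked in Python (not SETL) as follows. The names fixed_model and floating_model are hypothetical, and details such as rounding of halfway cases and the number of exponent digits are assumptions that may not match a real SETL implementation.

```python
def fixed_model(x: float, w: int, a: int) -> str:
    """a digits after the point, right-justified in at least w columns."""
    return f"{x:.{a}f}".rjust(w)    # rjust never truncates a longer string

def floating_model(x: float, w: int, a: int) -> str:
    """Like fixed_model, but in scientific notation with E+dd appended."""
    return f"{x:.{a}E}".rjust(w)

print(repr(fixed_model(3.14159, 8, 2)))       # '    3.14'
print(repr(fixed_model(12345.678, 4, 1)))     # '12345.7' (wider than w)
print(repr(floating_model(12345.678, 10, 2))) # '  1.23E+04'
```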

See also whole, str, and strad.


Greatest integer lower bound

op floor (real) : integer
op floor (integer) : integer

See also ceil, round, and fix.


Flush output buffer

proc flush (string)
proc flush (integer)

The open stream designated by the argument is ``flushed''. This can be particularly important when streams are used to communicate between processes. Without a flush, data can remain in a stream's output buffer for arbitrarily long periods. Note that the SETL implementation is supposed to flush bidirectional streams automatically whenever a switch is made from output to input. This includes at least pumps, line-pumps, sockets, some tty-like devices, and inherited ``rw'' file descriptors.


Fork process

proc fork : integer

This is a direct interface to the Posix fork(2) routine. Your process splits into two processes. In the ``parent'' process, fork returns an integer representing the process id of the ``child'' process. In the child process, fork returns 0. If the system cannot spawn a new process, fork returns om, and no child process is created.

In many cases, pump will be preferable to fork because pump also sets up communication with the child without your having to go through the customary low-level pipe, dup2, and close calls to set up the child's environment the hard way.

See also pipe_from_child, pipe_to_child, kill, and exec.


Take from set

op from (wr var x, rw set s)

An arbitrary (which is not to say ``random'' but rather nondeterministically chosen) element is removed from s and assigned to x. If s is empty, x := om instead. In this version of SETL, from is actually a statement form, not an operator, but this may be extended in the future so that the extracted element (or om) is also returned as the result.


Take from beginning of string or tuple

op fromb (wr string x, rw string s)
op fromb (wr var x, rw tuple s)

The string or tuple s is stripped of its first element, which is assigned to x. If s is of length 0, x := om instead. In this version of SETL, fromb is actually a statement form, not an operator, but this may be extended in the future so that the extracted one-character string, tuple element, or om is also returned as the result.


Take from end of string or tuple

op frome (wr string x, rw string s)
op frome (wr var x, rw tuple s)

The string or tuple s is stripped of its last element, which is assigned to x. If s is of length 0, x := om instead. In this version of SETL, frome is actually a statement form, not an operator, but this may be extended in the future so that the extracted one-character string, tuple element, or om is also returned as the result.
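The effect of fromb and frome on both operands can be modelled in Python (not SETL) by functions that return the extracted element together with what is left of s. Python cannot update its arguments in place the way these SETL statements do, so the pair returned here stands in for the new values of x and s, with None playing the part of om.

```python
def fromb(s):
    """Return (x, rest): x is the first element of s, or None if s is empty."""
    if len(s) == 0:
        return None, s
    return s[0], s[1:]

def frome(s):
    """Return (x, rest): x is the last element of s, or None if s is empty."""
    if len(s) == 0:
        return None, s
    return s[-1], s[:-1]

print(fromb([10, 20, 30]))   # (10, [20, 30])
print(frome("abc"))          # ('c', 'ab')
print(fromb(""))             # (None, '')
```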


Size of disk file in bytes

op fsize (string) : integer

The string argument should be a filename.


Get one or more lines from standard input

proc get (wr string...)

Equivalent to geta (stdin, ...).


Get one or more lines from input stream

proc geta (string f, wr string...)
proc geta (integer f, wr string...)

Lines are read from the input stream f and assigned to the succeeding arguments in order. Lines are terminated by newline ("\n"), and there is no restriction on line length. The trailing newline character is not delivered as part of the assigned string. If an end-of-file condition occurs on f, trailing arguments may be assigned om. The final line before the end of the input need not be terminated by a newline. If f is not already open, an attempt will automatically be made to open it for reading.


Get one or more values from input stream

proc getb (string f, wr var...)
proc getb (integer f, wr var...)

Values are read from the input stream f and assigned to the succeeding arguments in order. If the end of the input is reached, trailing arguments may be assigned om. Values written by putb, except for atoms (see newat) and procedure references (see routine), are guaranteed to be readable by getb. If f is not already open, an attempt will automatically be made to open it for reading.

There is a subtle difference between getb and reada in that reada will always start reading at the beginning of a line, skipping ahead to just after the next newline character if necessary, but getb will simply start with the next available character in the input stream. Both routines will happily cross newline boundaries if necessary to obtain more values, however.


Get character from input stream

op getc (string f) : string
op getc (integer f) : string

One character is read from the input stream f and returned as a string of length 1. If there are no more characters (the end of the file was reached), getc returns om instead. If f is not already open, an attempt will automatically be made to open it for reading.


Get a character from standard input

proc getchar : string

Equivalent to getc stdin.


Get effective group id

proc getegid : integer

Returns the result of calling the Posix getegid(2) routine. See also getgid and setgid.


Get value of environment variable

op getenv (string v) : string

If the environment variable named by the string v exists, its value (a string) is returned; otherwise you get om. See also putenv and setenv.


Get effective user id

proc geteuid : integer

Returns the result of calling the Posix geteuid(2) routine. See also getuid and setuid.


Read stream up to the end

op getfile (string f) : string
op getfile (integer f) : string

Characters are read into a string from the input stream f until the end of the input is reached. If the end-of-file condition is immediate, getfile returns a null string. If f is not already open, an attempt will automatically be made to open it for reading. Since getfile always reads until it reaches the end of the file, it does not alter the current value of eof.

The getfile intrinsic is unique among routines that will attempt to auto-open a file for input in that if getfile fails on the auto-open attempt, it humbly returns om rather than raising an exception. This is part of an effort to dissuade SETL programmers from writing racy code like

x := if fexists f then getfile f else default_string end;

when the idiomatic

x := getfile f ? default_string;

is race-free. Of course, it would have been possible for the hardy thinker to have coded

if (fd := open(f, "r")) /= om then
  x := getfile fd;
  close(fd);
else
  x := default_string;
end if;

but that is hardly in the spirit of a language which aims for concise and natural expression.
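The same point carries over to other languages. In Python, for instance, the race-free habit is to attempt the open and handle the failure, rather than to test for existence first. The function name getfile_or is hypothetical, and this is an analogy to the SETL idiom above, not a description of how getfile is implemented.

```python
def getfile_or(path: str, default: str) -> str:
    """Return the whole contents of path, or default if it cannot be opened."""
    try:
        with open(path, "r") as stream:
            return stream.read()
    except OSError:
        # No separate existence check, so nothing can change between
        # the "check" and the "use".
        return default

print(getfile_or("/no/such/directory/file.txt", "fallback"))
# prints: fallback
```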


Get real group id

proc getgid : integer

Returns the result of calling the Posix getgid(2) routine. See also getegid and setgid.


Get line from input stream

op getline (string f) : string
op getline (integer f) : string

This is an operator-form alternative to geta. Characters are read up to the next newline character, and the accumulated string is returned, not including the newline. If the end of the input is reached with no characters being read, om is returned. If the input stream f is not already open, an attempt will automatically be made to open it for reading.


Get fixed number of characters from input stream

proc getn (string f, integer n) : string
proc getn (integer f, integer n) : string

Exactly n characters are read from the input stream f if at least that many remain before the end of the input. If fewer than n characters remain in the stream f, a shorter string is returned.


Get process group id

proc getpgrp : integer

Retrieve the process group id of the current process by calling the Posix getpgrp(2) system routine. This is the process id of the process group leader. See also setpgrp and pid.


Direct-access read

proc gets (string f, integer start, integer n, wr string x)
proc gets (integer f, integer start, integer n, wr string x)

The direct-access stream f (see open mode "RANDOM") is viewed as a string, where start specifies the index (1 or higher) of the first character to read. The gets procedure will read n characters from f if that many remain before the end of the file, and assign that substring to x. If fewer remain, x will be assigned a shorter substring. If the end-of-file condition is immediate, x := "" (the null string). See also puts, seek, and rewind.


Get real user id

proc getuid : integer

Returns the result of calling the Posix getuid(2) routine. See also geteuid and setuid.


Current working directory

proc getwd : string

This is the current working directory of the process. See also chdir.


Find all occurrences of pattern in string

proc gmark (string s, string p) : tuple
proc gmark (string s, tuple p) : tuple

All non-overlapping leftmost occurrences of the regular expression p are found in s, and the result is returned as a tuple of pairs of integers [i, j], where each matched substring of s can be addressed as s(i..j). See also mark, gsub, and sub.
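The index convention is perhaps easiest to see in a small Python model (not SETL). Python's re module counts from 0 and excludes the end position, so the sketch converts each match to the 1-based, inclusive pair described here. The name gmark_model is made up, only string patterns are handled, and Python's regular expression dialect is not the same as SETL's.

```python
import re

def gmark_model(s: str, p: str) -> list:
    """Return [i, j] pairs such that s(i..j) in SETL terms is each match."""
    return [[m.start() + 1, m.end()] for m in re.finditer(p, s)]

print(gmark_model("banana", "an"))
# prints: [[2, 3], [4, 5]]
```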

[Need to spell out the exact rules for regular expressions, which are similar to those supported by the egrep(1) command, and how the ``tuple'' regexps work: basically, the tuple must be a pair of string regexps, meaning ``from this first regexp up to the first subsequent occurrence of this second regexp''. It would also be helpful to show the near-correspondence between certain slice-update operations and sub, and how the older-style "pattern matching" operations can be regarded as degenerate forms of the regexp-based operations.]


Substitute all occurrences of pattern in string

proc gsub (rw string s, string p) : tuple
proc gsub (rw string s, tuple p) : tuple
proc gsub (rw string s, string p, string r) : tuple
proc gsub (rw string s, tuple p, string r) : tuple

All non-overlapping leftmost occurrences in s of the pattern given by the regular expression p are replaced by r, which defaults to the null string. The original substrings of s replaced by this operation are returned as a tuple of strings. See also sub, gmark, and mark.
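A Python model (not SETL) of the two results may help: the updated string and the tuple of substrings that were replaced. Since Python strings are immutable, the sketch returns both instead of updating s in place. The name gsub_model is hypothetical, the replacement r is treated as literal text (an assumption), and the regular expression dialect is Python's.

```python
import re

def gsub_model(s: str, p: str, r: str = ""):
    """Return (new_s, replaced): every match of p in s replaced by r."""
    replaced = []

    def swap(match):
        replaced.append(match.group(0))   # remember the original substring
        return r                          # literal replacement text

    return re.sub(p, swap, s), replaced

print(gsub_model("banana", "an", "AN"))   # ('bANANa', ['an', 'an'])
print(gsub_model("a1b22c", "[0-9]+"))     # ('abc', ['1', '22'])
```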

[Need rules for regexps, as mentioned under gmark.]


Convert to hexadecimal

op hex (string) : string

For example, hex "djB" = "646A42" in an ASCII environment. See also unhex.


Local host address

proc hostaddr : string

Primary Internet (IP) address of the current host, if it can be found; otherwise om. See also peer_address, ip_addresses, and hostname.


Local hostname

proc hostname : string

Primary Internet (DNS) name of the current host. See also peer_name, ip_names, and hostaddr.


Integer equivalent of character

op ichar (string) : integer

For example, ichar "a" = 97 in an ASCII environment.

The argument must be 1 character in length. That will not change with the advent of Unicode, but the range of the result could then change from 0-255 to 0-65535.

See also abs.


Implication

op impl (boolean, boolean) : boolean

Here is the customary ``truth table'' defining this operator:

true impl true = true
true impl false = false
false impl true = true
false impl false = true

This rarely-used operator could have been ``short-circuited'' like and, or, and the query operator ``?'', but it isn't.
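In languages that lack an implication operator, the table above is usually written as ``(not p) or q''. A Python rendering (not SETL) follows; because impl is an ordinary function here, both arguments are evaluated before the call, which mirrors the lack of short-circuiting.

```python
def impl(p: bool, q: bool) -> bool:
    """Logical implication: false only when p is true and q is false."""
    return (not p) or q

for p in (True, False):
    for q in (True, False):
        print(p, "impl", q, "=", impl(p, q))
```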


Membership test

op in (var x, set s) : boolean
op in (var x, tuple s) : boolean
op in (string x, string s) : boolean

The keyword in is also used in a common iterator form, as in

for x in s loop
...
end loop;
or
squares := {x*x : x in s};

This use of in as an iterator should not be confused with its use as a membership test, where it is simply a boolean-valued binary operator.

For strings x and s, the expression x in s is true if x is a substring of s.


Subset test

op incs (set s, set ss) : boolean

Returns true if ss is a subset of s. Thus (s incs ss) = (ss subset s).


Integer quotient type switch

intslash : boolean

By default, the result of dividing two integer values in SETL is real, as in Pascal and in the Algol family (and as opposed to the Fortran/C/SETL2 family). This default corresponds to intslash = false. See ``/'' for a rather heated and windy argument in favour of leaving it set this way, and see also set_intslash for the preferred way of changing it should you desire to do so.


Internet addresses

proc ip_addresses : set
proc ip_addresses (string) : set

Called with no parameters, ip_addresses returns the set of all Internet (IP) addresses of the machine hosting the current process, as strings like "128.122.129.66". Otherwise, it returns a set of such strings for the host whose name or IP address is given in dotted notation by the string parameter.

See also hostaddr, peer_address, and ip_names.


Internet hostnames

proc ip_names : set
proc ip_names (string) : set

Called with no parameters, ip_names returns the set of all Internet (IP) names of the machine hosting the current process, as strings like "GALT.CS.NYU.EDU". Otherwise, it returns a set of such strings for the host whose name or IP address is given in dotted notation by the string parameter.

See also hostname, peer_name, and ip_addresses.


Type testers

op is_atom (var x) : boolean
op is_boolean (var x) : boolean
op is_integer (var x) : boolean
op is_map (var x) : boolean
op is_mmap (var x) : boolean
op is_numeric (var x) : boolean
op is_om (var x) : boolean
op is_real (var x) : boolean
op is_routine (var x) : boolean
op is_set (var x) : boolean
op is_smap (var x) : boolean
op is_string (var x) : boolean
op is_tuple (var x) : boolean

The operator is_map (or equivalently is_mmap) returns true if x is a set consisting entirely of ordered pairs (tuples of length 2). The operator is_smap adds the further condition that #domain x = #x, which is to say that the map is ``single-valued'' in the sense of taking every domain element to just one range element.
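These two conditions can be sketched in Python (not SETL), with Python sets of 2-tuples standing in for SETL sets of ordered pairs. The function names simply echo the SETL operators, and the treatment of the empty set as a map is an assumption.

```python
def is_map(x) -> bool:
    """True if x is a set consisting entirely of ordered pairs."""
    return isinstance(x, (set, frozenset)) and all(
        isinstance(e, tuple) and len(e) == 2 for e in x
    )

def is_smap(x) -> bool:
    """True if x is a map with as many domain elements as pairs."""
    return is_map(x) and len({a for a, _ in x}) == len(x)

multi = {(1, "a"), (1, "b")}      # 1 is taken to two range elements
single = {(1, "a"), (2, "a")}     # every domain element has one image
print(is_map(multi), is_smap(multi))     # True False
print(is_map(single), is_smap(single))   # True True
```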


Test for open stream

op is_open (string f) : boolean
op is_open (integer f) : boolean

Returns true if f is one of the pre-opened streams stdin, stdout, or stderr, a stream opened by open, or an automatically opened stream, provided the stream is still open.


Send signal to process

proc kill (integer p)
proc kill (integer p, integer signal)
proc kill (integer p, string signal)

Calls the Posix kill(2) routine on p, which is a process number if positive. If p is negative and not equal to -1, then -p is a process group id, and the signal is sent to every process in that process group. If p is -1, the signal is sent to every process owned by the caller. If p is 0, the meaning is system-dependent, and if p indicates a nonexistent process or process group, the call has no effect except to set last_error.

If signal is omitted, it defaults to "TERM", or equivalently "SIGTERM". Signals may be specified as integers or more portably as strings. Case is not significant. The signal names HUP, INT, QUIT, ILL, ABRT, FPE, KILL, SEGV, PIPE, ALRM, TERM, USR1, USR2, CHLD, CONT, STOP, TSTP, TTIN, and TTOU are defined by POSIX.1; see your local kill(2) documentation (or perhaps signal(2) or signal(5), or the C ``header'' files customarily found under the /usr/include directory) for additional signal names that may also be available on your system.

See also pid, fork, pump, pipe_from_child, pipe_to_child, and pexists for more details on processes, and the clear_error example for more information on detecting and handling errors.


Last error message from system routine

last_error : string

This is the error message corresponding to the last setting of the C global variable ``errno'' by a Posix routine. See also clear_error and no_error.


Break out initial substring based on length

proc len (rw string s, integer n) : string

The lesser of n and #s characters are removed from the beginning of s and returned as the function result. See also rlen.


Set less one element

op less (set, var) : set

Definition: s less x = s - {x}.


Map less one domain element

op lessf (set s, var x) : set

The set s must be a map. The lessf operator returns a copy of the map in which all pairs having x as a domain element are removed.


Test for existence of symbolic link

op lexists (string) : boolean

Return true if lstat(2) returns 0, indicating that the symbolic link named by the string exists, otherwise false.

See also fexists, link, symlink, readlink, and unlink.

When fexists is applied to a symbolic link, it interrogates the existence of the file referred to by that link, whereas when lexists is applied to a symbolic link, it interrogates the existence of the link itself.
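The distinction shows up most clearly with a dangling symbolic link. Python's os.path.exists and os.path.lexists draw the same line between following the link and looking at the link itself, so they serve here as stand-ins for fexists and lexists. This is a Python illustration on a Unix-like system, not SETL code.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    link = os.path.join(d, "dangling")
    os.symlink(os.path.join(d, "no-such-target"), link)
    follows_link = os.path.exists(link)   # follows the link: target is missing
    sees_link = os.path.lexists(link)     # examines the link itself

print(follows_link, sees_link)
# prints: False True
```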


Create hard link

proc link (string existing, string new)

Atomically create a ``hard link'' new to the existing file existing using link(2), if new does not exist before the call. There is no return value, but calling clear_error before the operation and inspecting last_error after it can be used to determine whether the operation was successful. Thus link can be used to implement a ``test and set'' mutex lock in the file system: assuming existing exists, then if new also exists, the operation will fail; and if it doesn't exist, it will be created and the calling process will then ``own'' the lock.
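The locking idea can be sketched in Python (not SETL) using os.link, which wraps the same link(2) call. The name try_lock is made up for illustration, and where the SETL version would inspect last_error, the Python version catches the exception raised when the lock name already exists.

```python
import os

def try_lock(existing: str, lock: str) -> bool:
    """Try to take the lock by hard-linking lock to existing.

    Returns True if this caller created the link (and so owns the lock),
    False if the lock name was already present.
    """
    try:
        os.link(existing, lock)   # atomic: fails if lock already exists
        return True
    except FileExistsError:
        return False
```

Releasing the lock is then a matter of removing the link, for which see unlink.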

See also symlink, unlink, and fexists.


Natural logarithm

op log (real) : real
op log (integer) : real

Pad string on left with blanks

proc lpad (string s, integer n) : string

If n > #s, the returned string is a copy of s padded on the left with blanks to length n. Otherwise, (a copy of) s is returned. See also rpad.


Regular expressions switch

magic : boolean

By default, magic = true, meaning that subscripting and slicing of strings by pattern strings causes the pattern strings to be interpreted as regular expressions. This also affects sub, gsub, mark, gmark, and split. You can assign magic := false to turn off this behaviour, so that pattern strings are interpreted literally. It is a good idea to save and restore the value of magic in any utility routines you write that need to have it set one way or the other. See also set_magic for the preferred way of setting this switch.


Find first occurrence of pattern in string

proc mark (string s, string p) : tuple
proc mark (string s, tuple p) : tuple

The leftmost occurrence, if any, of the regular expression p is found in s, and the result is returned as a pair of integers [i, j] such that the matched substring of s can be addressed as s(i..j). If there is no such occurrence, om is returned. See also gmark, sub, and gsub.

[Need rules for regexps, as mentioned under gmark.]


Break out initial substring based on exact match

proc match (rw string s, string p) : string

If p occurs as an initial substring of s, it is removed from s and returned as the function result. Otherwise, nothing happens to s and the null string is returned. See also rmatch.


Numeric maximum

op max (integer, integer) : integer
op max (integer, real) : real
op max (real, integer) : real
op max (real, real) : real

Allocate machine memory

proc mem_alloc (integer n) : integer

This is an extremely low-level interface to the malloc(3) routine that should never be needed in ``normal'' SETL use. It allocates n bytes of system memory and returns the address of that memory. It is typically only used if you have customized a C library interface rather roughly and need to supply the address of a struct to some C function without going to the trouble of mapping the struct to a SETL value properly.

See also mem_copy and mem_free.


Copy machine memory

proc mem_copy (integer dst, integer src, integer n)

This is an extremely low-level procedure for copying machine memory. It is very dangerous and should never be needed in ``normal'' SETL use. It copies n bytes starting at any src address to consecutive locations starting at any dst address. If {src..src+n-1} * {dst..dst+n-1} /= {} (the memory regions overlap), the copying may produce unexpected results.

See also mem_alloc, mem_free, fetch_..., and store_....


Free machine memory

proc mem_free (integer)

This is an extremely low-level interface to the free(3) routine that should never be needed in ``normal'' SETL use. Pass it an address as returned by mem_alloc when (and only when) you are sure that the memory block associated with that address can be released. Wildly unpredictable results can ensue if you try to free a given block twice, or try in any way to refer to a block you have already freed.

See also mem_copy.


Numeric minimum

op min (integer, integer) : integer
op min (integer, real) : real
op min (real, integer) : real
op min (real, real) : real

Integer modulus, symmetric set difference

op mod (integer, integer) : integer
op mod (set, set) : set

SETL guarantees a non-negative remainder as the result of mod, following the usual mathematical definition. The sign of the divisor is ignored, so:

5 mod 3 = 2
-5 mod 3 = 1
5 mod -3 = 2
-5 mod -3 = 1

The use of a single operator name for both the non-commutative, non-associative operation of integer modulus and the commutative, associative operation of symmetric set difference is an unfortunate matter of history.

The set-theoretic mod is analogous to the boolean ``exclusive or'' or ``not equals'' operator.
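Both meanings of mod can be modelled in Python (not SETL). The integer rule above amounts to taking the remainder with respect to the absolute value of the divisor, and the set rule is Python's ``^'' operator on sets. The names setl_mod and set_mod are made up for this sketch.

```python
def setl_mod(a: int, b: int) -> int:
    """Non-negative remainder; the sign of the divisor b is ignored."""
    return a % abs(b)    # Python's % is non-negative for a positive divisor

def set_mod(s: set, t: set) -> set:
    """Symmetric difference: elements in exactly one of s and t."""
    return s ^ t

print(setl_mod(5, 3), setl_mod(-5, 3), setl_mod(5, -3), setl_mod(-5, -3))
# prints: 2 1 2 1
print(set_mod({1, 2, 3}, {3, 4}) == {1, 2, 4})
# prints: True
```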

See also div and rem.


Number of arguments passed to subroutine

nargs : const integer

This is particularly useful in cases where there is a possibility of trailing om and/or writable parameters on procedures that take a variable number of arguments. Note that nargs is the total number of arguments passed to the currently active routine, including the required ones. It is a constant in that you cannot assign to it, but of course its value will depend on which routine is currently active.


Create new atom

proc newat : atom

This creates a unique atom, whose salient property is merely that it is different from all other atoms created by the current process.


Non-error message

no_error : string

This is the value last_error has immediately after a call to clear_error.


Logical negation

op not (boolean) : boolean

Break out initial character

proc notany (rw string s, string p) : string

If the first character of s does not appear anywhere in the string p, that character is removed from s and returned as the function result. Otherwise, nothing happens to s, and the null string is returned. See also any, rnotany, and rany.


Membership test

op notin (var x, set s) : boolean
op notin (var x, tuple s) : boolean
op notin (string x, string s) : boolean

Definition: x notin s = not (x in s).


All subsets of a given size

op npow (integer n, set s) : set
op npow (set s, integer n) : set

Definition: s npow n = n npow s = {ss in pow s | #ss = n}, i.e., the set of all subsets of s that have n members. For m = #s >= n >= 0, the number of such subsets is the binomial coefficient m!/((m-n)! n!).
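The definition and the count can be checked with a short Python model (not SETL). The function name npow echoes the SETL operator, and frozenset is used because Python sets cannot contain ordinary mutable sets.

```python
from itertools import combinations
from math import comb

def npow(s, n: int) -> set:
    """The set of all subsets of s that have exactly n members."""
    return {frozenset(c) for c in combinations(s, n)}

subsets = npow({1, 2, 3, 4, 5}, 3)
print(len(subsets), comb(5, 3))
# prints: 10 10
```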


Print to standard output with no trailing newline

proc nprint (...)

Equivalent to nprinta (stdout, ...).


Print to output stream with no trailing newline

proc nprinta (string f, ...)
proc nprinta (integer f, ...)

There can be 0 or more arguments after f, of any type. They are sent in sequence to the stream f, separated by single spaces. String arguments are written directly; all others are written as if they had been passed through str first.

Note that the output of the program

nprinta (stdout, 1, 2);

is ``1 2'', which is not the same as the output of the program

nprinta (stdout, 1);
nprinta (stdout, 2);

which is ``12''.

See also nprint and printa.


Test for integer not divisible by 2

op odd (integer) : boolean

The ``undefined'' value

om

This is the default value of all uninitialized SETL variables, undefined set, range, or tuple elements, the implicit return value of all proc routines that are not defined to return anything else, and the default result of various operations when they do not ``succeed'' in obtaining a primary result. Om really has no type, but type om = "OM", and str om = "*".
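
A brief sketch of some of the places om turns up (the tuple and map here are made up for the purpose):

t := [10, 20];
f := {["a", 1]};
print(t(5) = om);      -- the test is true: t has no fifth element
print(f("b") = om);    -- the test is true: f has no pair whose first member is "b"
print(type om);        -- writes OM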


Open stream

proc open (string f, string how) : integer
proc open (integer f, string how) : integer

This procedure returns an integer file descriptor (fd) for f, which may be a string or may be an integer fd that is open at the operating system level but not at the SETL level. If open cannot open the file, pipe, command, socket, or whatever f identifies, it raises an exception or returns om.

[Obviously the intention is to return om for things like unwriteable files and unreachable hosts, and to trap for things that look like clear mistakes. But in the absence of a formal specification of all relevant exceptional conditions, I must leave the semantics of all opens that don't succeed as simply ``undefined''. This does not even rule out erroneously returning an integer when for example an open-on-fd looks good until the first I/O operation is then attempted. Implementations are implicitly expected to catch as much as they can at the time of open, but I know of no specific legal documents on this. My dissertation is a bit more specific.]

[I've been a bit schizophrenic about what to call fd's. They seem to be ``stream designators'' all over the place now, which isn't quite right since the advent of UDP sockets (which are not streams). ``Streams'' is a good enough informal term anyway, but I think I should adopt the traditional term ``file descriptor'' which after all gave rise to this ``fd'' convention, and just point out here in the open documentation that the ``file'' can be any of the world of wonders suggested by the mighty list below.]

[On Unix-like systems, the fd's correspond directly to standard fd's, which are small integers, but here again we stray from what should be part of the standard semantics. New programmers may be a bit mystified as to what open actually does, and the best examples are probably not old-fashioned files, but rather client sockets (the connection to the server must be established) and pipes (the co-process must be started). Of course, implementations have to do a bunch of other housekeeping, like setting up I/O buffers, and in fact the existence of buffering should be written right into the semantics of streams so that the programmer's responsibilities with respect to flushing can be specified the more clearly.]

This is almost upward-compatible with the open described in the Schwartz et al. textbook, and completely compatible for programs which ignored the return value, because all the I/O routines accept the argument that was originally passed to open if it is unambiguous (i.e., if there isn't more than one stream by that name open at once - note, however, that multiple opens of a file, socket, or subcommand under the same name are perfectly reasonable under normal circumstances, so some danger of ambiguity lurks in the direct use of the original f on I/O calls for those cases). This open is also compatible with SETL2 in that the file descriptor serves as a unique handle.

But this open offers far more I/O modes, including network, inter-process, pipe, co-process (pump), signal, and timing streams, than any previous definition. See the second table below.

There are three predefined streams with the following aliases:

fd      aliases                                   meaning
stdin   0, "", "-", "input", "INPUT", "stdin"     standard input
stdout  1, "", "-", "output", "OUTPUT", "stdout"  standard output
stderr  2, "error", "ERROR", "stderr"             standard error

Files whose actual names are "input", "ERROR", etc. may still be referred to by explicitly opening them before starting I/O on them. This will cause such names not to act as standard aliases again until they are closed as streams.

The null string acts as stdin or stdout depending on the direction of the stream operation. Likewise "-".

You can close stdin, stdout, or stderr at any time, and on Unix-like systems the next open will choose the lowest fd, providing a mechanism by which you can implement redirection à la shell. See also dup2.

The fileno operator returns the fd of any open stream.

[Perhaps the deliberate violation of abstraction bears mention here: stdin, stdout, and stderr really are the integers 0, 1, and 2 respectively, not (as you might think they should be) a proper file* type or something. The point is that integers are easier to exchange with programs written in other languages, being more universal than any particular language's ``file'' type. This is a communications feature, not an exercise in academic purity.

On the other hand, I am a would-be academic purist, so I am far from happy with this, and I would suggest to those who are writing SETL programs as prototypes of Ada programs that they introduce a unary ifd* operator, implemented as the identity function on (some) integers, to serve as a placeholder for an operator to extract the integer within a file*. This is not a conversion operator, and should have no inverse. The only way to obtain a file* from an integer would be (as at present) to call open on a file descriptor that is already open at the underlying system level.

Should ifd* be built in to SETL? What do you think? Send me your comments and opinions before it's too late!]

The is_open operator can be used to test whether a stream is open without otherwise disturbing it, and is in fact the only thing you can call without error or side-effects on a stream that is not open.

Note that many input and output routines open files automatically on first reference if they can.

Any file that has been automatically opened for input will be automatically closed on end-of-file. The only routine that auto-closes an auto-opened output file is putfile.

Summary and hints:

  • For simplicity at the risk of ambiguity, code open in statement form and use whatever you passed to it as a handle everywhere.
  • Or, for SETL2 compatibility and a bit of extra efficiency, use the result of open as the file handle.
  • If open fails, it may return om. The value of last_error will give the cause in string form.
  • Put fd := fileno open (...) to make sure open does not return om (otherwise fileno raises an exception).
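
The following sketch shows the om-checking style suggested by these hints. The file name "notes.txt" is hypothetical, and since a failed open may raise an exception instead of returning om, this covers only the om case:

fd := open("notes.txt", "r");     -- "notes.txt" is a hypothetical file name
if fd = om then
  printa(stderr, "cannot open notes.txt:", last_error);
else
  putchar(getfile(fd));           -- copy the whole file to standard output
  close(fd);
end if;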

    Here are the valid second argument (how) values for open. They are case-insensitive, despite their presentation. The modes towards the end of the table, presented with a lowercase "b" in them, and the modes with the word "BINARY" in them, are completely redundant with other modes in the table on Unix systems, but in the stream-oriented cases may be more ``raw'' on non-Unix systems, though it is more likely that even on those systems, they will still be redundant and that any nonsense such as inserting a carriage return before each newline character will have to be done by completely independent external filters. Where simple files are concerned, all the semantically distinguishable open modes are actually captured in the first 8 entries. For fifos, processes, networks, signals, and timers, though, there is a world of wonders starting at "rw":

    [It would probably be better to present these as groups of synonyms.]

    how meaning
    "r" stream input
    "w" stream output
    "a" stream output at end
    "n" stream output to new file
    "r+" direct access
    "w+" direct access, empty file first
    "a+" direct access, write at end
    "n+" direct access, new file
    "INPUT" stream input
    "OUTPUT" stream output
    "APPEND" stream output at end
    "OUTPUT-APPEND" stream output at end
    "NEW" stream output to new file
    "NEW+" direct access, new file
    "NEW-r+" direct access, new file
    "NEW-w+" direct access, new file
    "CODED" stream input
    "CODED-IN" stream input
    "CODED-OUT" stream output
    "CODED-APPEND" stream output at end
    "CODED-NEW" stream output to new file
    "NEW-CODED" stream output to new file
    "PRINT" stream output
    "PRINT-APPEND" stream output at end
    "TEXT" stream input
    "TEXT-IN" stream input
    "TEXT-OUT" stream output
    "TEXT-APPEND" stream output at end
    "TEXT-NEW" stream output to new file
    "NEW-TEXT" stream output to new file
    "DIRECT" direct access
    "DIRECT-NEW" direct access, new file
    "NEW-DIRECT" direct access, new file
    "RANDOM" direct access
    "RANDOM-NEW" direct access, new file
    "NEW-RANDOM" direct access, new file
    "rw" stream input/output
    "TWO-WAY" stream input/output
    "TWOWAY" stream input/output
    "BIDIRECTIONAL" stream input/output
    "INPUT-OUTPUT" stream input/output
    "READ-WRITE" stream input/output
    "PIPE-IN" input from command's stdout
    "PIPE-FROM" input from command's stdout
    "PIPE-OUT" output to command's stdin
    "PIPE-TO" output to command's stdin
    "PUMP" I/O on command's stdin and stdout (co-process)
    "TTY-PUMP" co-process pumping through a pseudo-tty
    "LINE-PUMP" co-process pumping through a pseudo-tty
    "SOCKET" Internet TCP client socket
    "CLIENT-SOCKET" Internet TCP client socket
    "TCP-CLIENT-SOCKET" Internet TCP client socket
    "SERVER-SOCKET" Internet TCP server socket
    "TCP-SERVER-SOCKET" Internet TCP server socket
    "UDP-CLIENT-SOCKET" Internet UDP client socket
    "UDP-SERVER-SOCKET" Internet UDP server socket
    "SIGNAL" signals input as newlines
    "SIGNAL-IN" signals input as newlines
    "IGNORE" ignore signals
    "IGNORE-SIGNAL" ignore signals
    "SIGNAL-IGNORE" ignore signals
    "REAL-MS" input newline per wall clock interval
    "VIRTUAL-MS" input newline per user CPU time interval
    "VIRT-MS" input newline per user CPU time interval
    "PROFILE-MS" input newline per total CPU time interval
    "PROF-MS" input newline per total CPU time interval
    "rb" stream input
    "wb" stream output
    "ab" stream output at end
    "nb" stream output to new file
    "rb+" direct access
    "r+b" direct access
    "wb+" direct access, empty file first
    "w+b" direct access, empty file first
    "ab+" direct access, write at end
    "a+b" direct access, write at end
    "nb+" direct access, new file
    "n+b" direct access, new file
    "BINARY" stream input
    "BINARY-IN" stream input
    "BINARY-OUT" stream output
    "BINARY-APPEND" stream output at end
    "BINARY-NEW" stream output to new file
    "NEW-BINARY" stream output to new file
    "BINARY-DIRECT" direct access
    "DIRECT-BINARY" direct access
    "RANDOM-BINARY" direct access
    "BINARY-DIRECT-NEW" direct access, new file
    "BINARY-RANDOM-NEW" direct access, new file
    "DIRECT-BINARY-NEW" direct access, new file
    "RANDOM-BINARY-NEW" direct access, new file
    "NEW-BINARY-DIRECT" direct access, new file
    "NEW-BINARY-RANDOM" direct access, new file
    "NEW-DIRECT-BINARY" direct access, new file
    "NEW-RANDOM-BINARY" direct access, new file

    For how = "TCP-CLIENT-SOCKET" or "UDP-CLIENT-SOCKET", the f argument should be of the form "ip.name.or.address:portnum". For example, here is a program to fetch a document from a popular HTTP server and write it on standard output:

    fd := open("galt.cs.nyu.edu:80", "tcp-client-socket");
    printa(fd, "GET /");
    putchar(getfile fd);

    For how = "TCP-SERVER-SOCKET" or "UDP-SERVER-SOCKET", the f argument should be a port number contained in a string or a service name that is listed in the local ``/etc/services'' file or equivalent and can therefore be mapped to a port number by getservbyname(2). It should not be an integer except where you intend an already open fd. For example, "80" indicates a port number, but naked 80 indicates a file descriptor that probably doesn't exist. In the special case when f is "0", the system chooses a port number which can be retrieved using port like this:

    fd := open("0", "tcp-server-socket");
    print("server port number is", port fd);

    The only I/O (data-transferring) operations allowed on UDP client sockets are send and recv, and the only ones allowed on UDP server sockets are sendto and recvfrom. And conversely, recv and send can only be used on UDP client sockets, and recvfrom and sendto can only be used on UDP server sockets.

    For how = "PIPE-IN" ("PIPE-FROM"), "PIPE-OUT" ("PIPE-TO"), "PUMP", or "TTY-PUMP" ("LINE-PUMP"), the f argument is a command. The difference between "TTY-PUMP" ("LINE-PUMP") and ordinary "PUMP" is that the ordinary pump is fully buffered, whereas the child process in the ``tty pump'' is given an environment in which its standard input and output are connected to the slave end of a pseudo-terminal (of which your SETL program gets the master end as an fd), so whatever buffering applies to interactive use for the command takes place. This can be used to allow off-the-shelf programs like ``sed'' or ``awk'' to be used as pumps by causing them to flush after every output line, or to implement very fancy drivers of programs that really are meant to be interactive, sending them commands and getting back results line by line. [Need to separate out the Unix-specific semantics there.]
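
    As a minimal sketch of the simplest of these modes, the following reads everything a command writes on its standard output. The command ``ls -l'' is only an illustration and assumes a Unix-like system:

    fd := open("ls -l", "pipe-in");
    if fd /= om then
      putchar(getfile(fd));       -- copy the command's output to standard output
      close(fd);
    end if;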

    For how = "SIGNAL" ("SIGNAL-IN") or "IGNORE" ("SIGNAL-IGNORE", "IGNORE-SIGNAL"), the signals that can be received as input newlines or studiously ignored are as follows, where the names are given as strings with or without the "SIG" prefix, and case is not significant (in contrast with most other interpretations of the f parameter to open):

    signal   default action  meaning
    "HUP"    terminate       hangup
    "INT"    terminate       interrupt from keyboard
    "QUIT"   terminate       quit from keyboard
    "USR1"   terminate       user-defined signal 1
    "USR2"   terminate       user-defined signal 2
    "PIPE"   terminate       write to pipe with no readers
    "TERM"   terminate       software termination
    "CHLD"   ignore          child exit
    "CONT"   ignore          continue after suspension
    "PWR"    ignore          battery low
    "WINCH"  ignore          terminal window size change

    Whenever a signal is received and there is at least one input stream open on that signal type, a newline is delivered to every such stream. Otherwise, if the signal is ``open for ignoring'', it is ignored. Otherwise, its effect defaults to the action specified in the table above. All stream input routines and select can be used on signal streams just as they can on regular input streams.
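
    For instance, a program that wants to wait for a keyboard interrupt rather than be terminated by it might be sketched like this (getc is used here merely as one of the available stream input routines):

    fd := open("INT", "signal");
    c := getc(fd);                -- waits here until an interrupt signal arrives
    print("got SIGINT");
    close(fd);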

    For how = "REAL-MS", "VIRT-MS" ("VIRTUAL-MS"), or "PROF-MS" ("PROFILE-MS"), case not significant, f must be a pure digit string indicating a positive integer. This is interpreted as the number of milliseconds that should elapse between newlines to be delivered on the created input stream. Any number of input streams can be open on each kind of timer. Just as for signal streams, all stream input routines and select can be used on timer streams.
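
    A minimal sketch of a wall-clock timer stream follows. The interval of 500 ms and the count of 3 ticks are arbitrary choices:

    fd := open("500", "real-ms");
    for i in [1..3] loop
      c := getc(fd);              -- waits for the next tick, delivered as a newline
      print("tick", i);
    end loop;
    close(fd);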

    [In the dB SETL implementation, the greater of 10 ms and the GCD of all the timers defines the tick interval requested of the Unix system timer of that type. This helps to reduce overhead and improve timing accuracy, especially if the SETL programmer knows about this small refinement.]

    See also umask, close, shutdown, fexists, link, symlink, lexists, and unlink.

    [A current difference in the behaviour of the SETL pipe and pump stream file descriptor dispositions from the analogous streams created by popen (there is no Unix analogy to the pump stream, however) is that the ``close on exec'' (FD_CLOEXEC) flag is not set for these file descriptors in SETL, whereas it is for the file descriptor underlying a popen-created stream. This may change soon; the jury's still out on this one, but I'm inclined to side with the C-library decision as of this writing, because there is no obvious way to communicate information such as the pid (which needs to be waited for upon the closing of such a stream) across exec(2) boundaries automatically, though the programmer could implement that spam.

    Actually, FD_CLOEXEC is asserted for pipe and pump streams currently. It is nice (though not essential semantics) that these SETL streams work the same way as Unix popened streams, and nicer still that a succession of equivalent child processes will tend to start with the same set of available file descriptors. But probably the best reason for keeping FD_CLOEXEC in place is that not doing so doesn't really buy you anything very useful. You might think that the ``prophylactic pump over a socket'' technique might generalize well to untrusted child processes, but it doesn't: you can't just close the fd associated with the untrusted child process after passing it to the prophylactic pump (though indeed the latter will preserve the reference to the kernel data structure by having inherited the fd), as you would with a newly accepted socket fd, because you will get stuck trying to reap the child process immediately instead of holding off until it is ready to exit or has already become a zombie (see wait).]


    Logical disjunction

    op or (boolean, boolean) : boolean

    The expression a or b is equivalent to the expression if a then true else b end. The or operator is ``short-circuited'' in that it only evaluates its second argument if necessary. This makes it suitable for use as a ``guard'' (see also and).
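
    For example, in the following sketch (where t is assumed to be either om or a tuple), the first operand guards the second, so #t is never evaluated when t is om:

    if t = om or #t = 0 then
      print("nothing to process");
    end if;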


    Byte packing

    op pack_char (integer) : string
    op pack_short (integer) : string
    op pack_int (integer) : string
    op pack_long (integer) : string
    op pack_float (real) : string
    op pack_double (real) : string
    op pack_long_double (real) : string

    These are low-level, machine-dependent (but not dangerous) operators for obtaining strings corresponding to the primitive predefined C types on the machine you are running SETL on. For example, pack_long 97 might produce, on a typical machine, a 4- or 8-byte string consisting entirely of null bytes except for a character having the value char 97 (ASCII ``a'') in either the first or the last position. You could try the one-line program

    print (hex pack_long 97);

    to see what your machine does. See also unpack_..., fetch_..., and store_....


    Peek at next character in input stream

    op peekc (string f) : string
    op peekc (integer f) : string

    The next available character, if any, in the input stream f is returned as a string of length 1, just as with getc. However, the character also remains in the input stream as if it had been ``pushed back'' by ungetc. If there are no more characters (the end of the input was reached), peekc behaves exactly the same as getc, returning om. If f is not already open, an attempt will automatically be made to open it for reading.

    See also peekchar.


    Peek at next character in standard input

    proc peekchar : string

    Equivalent to peekc stdin.


    Peer host address

    proc peer_address (integer f): string
    proc peer_address (string f): string

    If f refers to a socket in a ``connected'' state, peer_address returns the remote peer's Internet (IP) address.

    See open, and see also hostaddr, ip_addresses, and peer_name.


    Peer hostname

    proc peer_name (integer f): string
    proc peer_name (string f): string

    If f refers to a socket in a ``connected'' state, peer_name returns the remote peer's Internet (DNS) name. If the name cannot be found, peer_name returns om.

    See open, and see also hostname, ip_names, and peer_address.


    Peer port number

    proc peer_port (integer f): integer
    proc peer_port (string f): integer

    If f refers to a socket in a ``connected'' state, peer_port returns the remote peer's TCP or UDP port number.

    See open, and see also port, peer_name, and peer_address.


    Test for existence of process

    op pexists (integer) : boolean

    Return true if the process identified by the integer argument exists on the local system, otherwise false. See also kill, which can make a similar test a little more clumsily as suggested in the clear_error example, and pid.


    Process id of current process or subtask

    proc pid : integer
    proc pid (string) : integer
    proc pid (integer) : integer

    Called with no argument, pid returns the Posix process identifier (pid) of the current process. Called with an argument that refers to an open pipe, pump, or line-pump stream, it returns the child's process id.

    See also open, pipe_from_child, pipe_to_child, pump, pexists, kill, getpgrp, and setpgrp.


    Create primitive pipe

    proc pipe : tuple

    This is a low-level interface to the Posix routine pipe(2), and returns a tuple of two integer file descriptors [pull,push] such that pull is open (at the system level, not the SETL level) for reading and push is open for writing. See the open modes "PIPE-IN", "PIPE-OUT", and "PUMP", and also the pump primitive, one of which probably does whatever you want to do with pipes much more conveniently than the raw pipe procedure does.

    [Since I now map ``pipe'' calls to socketpair(2) when possible, and pretty well everyone that doesn't have socketpair is probably SVR4 by now, the low-level pipe you get from this call is almost certainly bidirectional. Indeed, this is taken for granted in the current dB implementation of pumps: gone is the messiness of having to have a separate fd for each direction.]


    Create pipe stream over local subprocess

    proc pipe_from_child : integer
    proc pipe_to_child : integer

    The pipe_from_child and pipe_to_child primitives are degenerate forms of pump. Each creates a unidirectional stream, but is in other respects just like pump except that shutdown cannot be used on a stream created by pipe_from_child or pipe_to_child. In the child, either stdin or stdout as appropriate is connected through a pipe to the parent process, and the other is merely inherited from the parent process, much as the C-library popen(3) routine arranges.


    Retrieve port number

    op port (string f) : integer
    op port (integer f) : integer

    Returns the local port number associated with a client or server socket f. See open, and see also peer_port.


    Power set

    op pow (set s) : set

    Returns the set of all subsets of s, including the null set {} and s itself. There are 2 ** #s such subsets in all. See also npow.
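
    A small illustration, with an arbitrary two-element set:

    s := {"a", "b"};
    p := pow s;                   -- the subsets {}, {"a"}, {"b"}, and {"a", "b"}
    print(#p = 2 ** #s);          -- the test is true: 4 = 2 ** 2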


    ``Prettify'' string

    op pretty (string) : string

    The pretty operator returns a copy of its argument in which the 95 characters that ASCII considers ``printable'' are left unmolested, except for the apostrophe ('), which becomes two apostrophes in a row, and the backslash, which becomes two backslashes in a row. An apostrophe is also added at each end. Among the other codes, the audible alarm, backspace, formfeed, newline, return, horizontal tab, and vertical tab are converted to \a, \b, \f, \n, \r, \t, and \v respectively (these are the same as the C conventions), and all remaining codes are converted to \xyz, where x, y, and z are octal digits. See also unpretty, str, and unstr.
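
    Following the rules just stated, two small cases can be sketched as follows, where char 9 is the horizontal tab:

    print(pretty "it's");                 -- writes 'it''s'
    print(pretty ("a" + char 9 + "b"));   -- writes 'a\tb'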


    Print to standard output

    proc print (...)

    Equivalent to printa (stdout, ...).


    Print to output stream

    proc printa (string f, ...)
    proc printa (integer f, ...)

    There can be 0 or more arguments after f, of any type. They are sent in sequence to the stream f, separated by single spaces and followed by a newline character. String arguments are written directly; all others are written as if they had been passed through str first.

    See also print and nprinta (which omits the trailing newline).


    Create pump stream over local subprocess

    proc pump : integer

    The pump primitive creates a child co-process just as fork does, but returns in the parent a bidirectional stream that is connected to the child's standard input and output. Other streams open in the parent are left open in the child, and pump returns -1 in the child.

    If the system cannot create a new process, or if it cannot create the requisite bidirectional stream, pump returns om.
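
    Here is a minimal sketch of the usual pattern, assuming a Unix-like system. The message text is arbitrary, and the child simply terminates after writing one line:

    fd := pump();
    if fd = -1 then
      -- child: standard output is connected to the parent's stream
      print("hello from the child");
      stop;
    end if;
    -- parent: fd is om if no child or stream could be created
    if fd /= om then
      putchar(getfile(fd));       -- read until the child closes its end
      close(fd);
    end if;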

    See also open (particularly the "PUMP" and "LINE-PUMP" modes), close, shutdown, pipe_from_child, pipe_to_child, and filter.


    Put line(s) on standard output

    proc put (...)

    Equivalent to puta (stdout, ...).


    Put line(s) on output stream

    proc puta (string f, ...)
    proc puta (integer f, ...)

    There can be 0 or more string arguments after the output stream designator f. They are written directly to f, with a newline character after each. A synonym for puta is putline.


    Put value(s) on output stream

    proc putb (string f, ...)
    proc putb (integer f, ...)

    There can be 0 or more arguments after f, of any type. They are sent in sequence to the stream f, separated by single spaces and followed by a newline character. All of them are written as if they had been passed through str first, with no exception for strings (contrast printa). Values written by putb, except for atoms (see newat) and procedure references (see routine), can be read by getb. This procedure is functionally identical to writea.


    Put character(s) on output stream

    proc putc (string f, string c)
    proc putc (integer f, string c)

    The 0 or more characters in c are sent to the stream f.


    Put character(s) on standard output

    proc putchar (string c)

    The call putchar (c) is the same as putc (stdout, c).


    Set environment variable

    proc putenv (string)

    The argument should have the form "id=value" as in the Posix putenv(3) routine. However, it is possible on some systems to ``unset'' an environment variable by passing just the "id" part. The use of setenv and unsetenv is recommended in preference to the rather old-fashioned putenv. See also getenv.


    Put character(s) on output stream

    proc putfile (string f, string c)
    proc putfile (integer f, string c)

    The 0 or more characters in c are sent to the stream f; thus putfile is equivalent to putc, except that putfile will automatically close a file that was automatically opened.


    Put line(s) on output stream

    proc putline (string f, ...)
    proc putline (integer f, ...)

    There can be 0 or more string arguments after the output stream designator f. They are written directly to f, with a newline character after each. A synonym for putline is puta.


    Direct-access write

    proc puts (string f, integer start, string x)
    proc puts (integer f, integer start, string x)

    The direct-access stream f (see open mode "RANDOM") is viewed as a string, where start specifies the index (1 or higher) of the first character to write. The puts procedure will write the #x characters of x to f, increasing the size of the file as necessary. See also gets, seek, and rewind.
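
    A short sketch of overwriting and extending a file in place follows. The file name "scratch.dat" is hypothetical, and the "w+" mode is chosen so that the file starts out empty:

    fd := open("scratch.dat", "w+");
    puts(fd, 1, "hello");         -- file contents: hello
    puts(fd, 3, "LP");            -- overwrites characters 3 and 4: heLPo
    puts(fd, 6, "!");             -- extends the file: heLPo!
    close(fd);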


    Pseudo-random numbers and selections

    op random (integer i) : integer
    op random (real r) : real
    op random (string s) : string
    op random (set t) : var
    op random (tuple t) : var

    For an integer i > 0, random i returns a pseudo-random integer in {0..i}; for i <= 0, in {i..0}. Note that this is a little unusual compared to other programming languages, in that both endpoints are included in this definition (so for i > 0, random i can produce any of i + 1 different numbers).

    For a real r, random r returns r * u, where u is a pseudo-random real in the half-open interval [0,1).

    For a string s, random s returns a pseudo-randomly chosen character from s, or om if s is the null string.

    For a set or tuple t, random t returns a pseudo-randomly chosen element from t, or om if #t = 0.
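
    The following sketch exercises several of these cases (the variable names are arbitrary):

    die := 1 + random 5;                  -- an integer in {1..6}, since random 5 is in {0..5}
    coin := random ["heads", "tails"];    -- one of the two strings
    x := random 1.0;                      -- a real in [0,1)
    nothing := random {};                 -- om, since the set is empty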

    See also setrandom.


    Range of map

    op range (set) : set

    The argument must be a set of ordered pairs, that is, a set of tuples each of size 2. The result is the set of all second members of those tuples. See also domain.
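
    For example, with a small made-up map:

    f := {[1, "a"], [2, "b"], [3, "a"]};
    r := range f;                 -- {"a", "b"}
    d := domain f;                -- {1, 2, 3}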


    Right-to-left string breakers

    proc rany (rw string s, string p) : string
    proc rbreak (rw string s, string p) : string
    proc rlen (rw string s, integer n) : string
    proc rmatch (rw string s, string p) : string
    proc rnotany (rw string s, string p) : string
    proc rspan (rw string s, string p) : string

    These procedures look for rightmost occurrences of something. Thus rany and rnotany try to pick off the character at s(#s), rbreak and rspan scan from right to left, rlen tries to consume n characters on the right, and rmatch succeeds only if p occupies the last #p characters of s.
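
    A brief sketch of three of them at work on an arbitrary example string, assuming that each behaves as the mirror image of its left-to-right counterpart:

    s := "report.txt";
    t := rlen(s, 3);              -- t = "txt" and s = "report."
    u := rmatch(s, ".");          -- u = "." and s = "report"
    v := rany(s, "xyz");          -- v = "" and s is unchanged, since "t" is not in "xyz"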

    No reversal of actual character order is implied here; see instead reverse.


    Get one or more values from standard input

    proc read (wr var...)

    Equivalent to reada (stdin, ...).


    Get one or more values from input stream

    proc reada (string f, wr var...)
    proc reada (integer f, wr var...)

    Values are read from the input stream f and assigned to the succeeding arguments in order. If the end of the input is reached, trailing arguments may be assigned om. Values written by writea, except for atoms (see newat) and procedure references (see routine), are guaranteed to be readable by reada. If f is not already open, an attempt will automatically be made to open it for reading. See also unstr and reads.

    There is a subtle difference between reada and getb in that reada will always start reading at the beginning of a line, skipping ahead to just after the next newline character if necessary in order to do so, but getb will simply start with the next available character in the input stream.


    Symbolic link referent

    op readlink (string f) : string

    When f names a file that is really a symbolic link, readlink yields the string associated with f. The resulting string may or may not name another existing file.

    Contrast this with a regular input operation on f, which will fail if f is a symbolic link to a file that doesn't exist.

    If f itself doesn't exist, or is not a symbolic link, readlink yields om, and last_error indicates which case applies.
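
    A minimal sketch of checking for both outcomes, where "latest" is a hypothetical file name:

    target := readlink("latest");
    if target = om then
      print("no link target:", last_error);
    else
      print("latest ->", target);
    end if;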

    See also lexists, fexists, symlink, link, and unlink.


    Get one or more values from a string

    proc reads (string s, wr var...)

    Values are ``read'' from the string s and assigned to the succeeding arguments in order. If the end of the string is reached, unsatisfied trailing arguments will be assigned om. The rules for value recognition and conversion are the same as for reada and unstr.
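
    For example, in this sketch the string holds only two values, so the third variable is left undefined:

    reads("12 3.5", n, x, y);     -- n = 12, x = 3.5, y = om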


    Receive datagram on UDP client socket

    op recv (string f) : string
    op recv (integer f) : string

    A datagram is read from the UDP client socket f and yielded as a string. The select procedure may be used to check or wait for input of this kind.

    See open, and see also recvfrom, send, and sendto.


    Receive datagram on UDP server socket

    op recvfrom (string f) : tuple
    op recvfrom (integer f) : tuple

    A datagram is read from the UDP server socket f, and its sender's address is sensed. The address is formatted as a string of the form "ip.address:portnum", and returned with the datagram as the pair of strings [address, datagram]. Example of use:

    [address, datagram] := recvfrom f;

    The select procedure may be used to check or wait for input of this kind.

    See open, and see also recv, send, and sendto.


    Receive file descriptor

    op recv_fd (string f) : integer
    op recv_fd (integer f) : integer

    A file descriptor sent by a process executing a send_fd is returned and left open at the underlying system level (but not at the SETL level, so you still have to do an open on it to specify the I/O mode you want). The stream designator f must refer to a Unix-domain socket. Note that the returned file descriptor may arrive as a different integer than was passed by the sending process to send_fd.

    [While there are no explicit open modes that create Unix-domain sockets, they are in fact what you get from opening in a "PIPE-..." or "PUMP" mode, and are also what pipe creates. The system-specific semantics obviously won't do, but will serve as a stand-in until I learn what the equivalent of the bidirectional (modern) form of BSD Unix pipes is on NT and Mac OS 8 (hah!) etc.]

    The select procedure may be used to test or wait for the presence of a file descriptor ready to be received on f.


    Integer remainder

    op rem (integer, integer) : integer

    The definition of rem is such that for all a and b, where b is nonzero:

    a rem b = a - ((a div b) * b)

    so, for example:

    5 rem 3 = 2
    -5 rem 3 = -2
    5 rem -3 = 2
    -5 rem -3 = -2

    Note that unless the remainder is zero, it has the same sign as the dividend. See also div and mod.
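
    The defining identity can be checked directly. This sketch simply prints both sides for the four sign combinations listed above:

    ```setl
    for [a, b] in [[5, 3], [-5, 3], [5, -3], [-5, -3]] loop
      print(a, b, a rem b, a - ((a div b) * b));   -- last two columns should agree
    end loop;
    ```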


    Reverse string

    op reverse (string) : string

    Characters in reverse order.


    Rewind direct-access stream

    proc rewind (string f)
    proc rewind (integer f)

    The call rewind(f) is equivalent to the call seek (f, 0).


    Round to nearest integer

    op round (real) : integer
    op round (integer) : integer

    Numbers ending in .5 are always rounded upward, so round x = floor (x+0.5). See also ceil and fix.


    Create procedure reference

    op routine (proc_name) : proc_ref

    This pseudo-operator produces a value that can subsequently be passed to call in order to effect an indirect procedure call. The typenames in the signature shown here do not really exist as SETL keywords, but suggest how this operator is used: you pass it the name of a procedure in your program, and routine returns an opaque handle which you can save for later use. For example, it is sometimes convenient to use a mapping to associate strings with procedure references, as is illustrated by the use of select in the ``callback'' style of programming. A more familiar application is to have a generic numerical integration function to which you pass a reference to the function that is to be integrated, like this:

    f_ref := routine f;
    area := integrate(f_ref, 0, 1);

    proc f(x);  -- function to be integrated
      return sin x;
    end f;

    proc integrate(g, x_lo, x_hi);
      ...
      sum +:= dx * call(g, x);
      ...
      return sum;
    end integrate;


    Pad string on right with blanks

    proc rpad (string s, integer n) : string

    If n > #s, the returned string is a copy of s padded on the right with blanks to length n. Otherwise, (a copy of) s is returned. See also lpad.
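
    For instance, brackets make the padding visible in this sketch:

    ```setl
    print("[" + rpad("abc", 6) + "]");      -- n > #s, so three blanks are appended
    print("[" + rpad("abcdef", 3) + "]");   -- n <= #s, so s comes back unchanged
    ```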


    Reposition direct-access stream

    proc seek (string f, integer offset)
    proc seek (integer f, integer offset)

    The direct-access stream f (see open mode "RANDOM") is repositioned so that the next ordinary read or write operation will start at offset bytes past the beginning of the file. Note that offset should be 0 or more, consistent with the conventions of Unix fseek(3), unlike the start parameter of gets and puts: offset is the same as start - 1. See also rewind.
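
    A sketch of the offset convention, assuming a hypothetical file data.bin that already exists and is at least 11 bytes long, and assuming getc counts as an ordinary read on such a stream:

    ```setl
    fd := open("data.bin", "RANDOM");
    seek(fd, 10);      -- skip the first 10 bytes
    c := getc(fd);     -- so this should be the 11th byte of the file
    rewind(fd);        -- back to offset 0
    close(fd);
    ```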


    Wait for I/O event or timeout

    proc select (tuple fds) : tuple
    proc select (tuple fds, integer ms) : tuple

    This is an extended interface to the Unix select(2) routine, which allows a program to wait for I/O events on multiple streams simultaneously, optionally taking a timeout or immediate return. Because interprocess communication, signals, and interval timers are all wrapped as I/O streams in SETL too, this is the fundamental procedure for event-driven programming in SETL.

    The fds argument contains up to 3 sets of stream designators, where fds(1) lists streams that may produce input (the meaning of this is actually extended to include TCP server sockets that are ready to accept without blocking, UDP sockets that have datagrams ready to be received by recv or recvfrom, and Unix-domain sockets on which recv_fd can be called without blocking), fds(2) lists streams that take output (including UDP sockets ready for send or sendto operations, and Unix-domain sockets that can take send_fd calls without blocking), and fds(3) lists streams that can generate exceptional conditions. For convenience, an empty fds tuple can be indicated by passing fds as om.

    The ms argument, if present, gives the number of milliseconds select should wait if none of the stream designators given in fds becomes ready sooner. If ms is 0, it expires immediately, giving the effect of ``polling'' (no wait). If it is absent or om, select waits indefinitely for a stream to become ready, so the expression select(om) means wait (``sleep'') indefinitely.

    The return value from select is always a 3-element tuple consisting of 3 sets of stream designators [input, output, except]. Any of these sets can be empty, and if select returns because ms milliseconds have elapsed with no file descriptors becoming ready, all three sets are empty. Otherwise, they contain stream designators corresponding to file descriptors that are ready for input, output, or exception processing, respectively.
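
    The timeout form can be sketched as follows. The 5-second limit is arbitrary, and geta is used only as a convenient way to consume a ready line:

    ```setl
    [ready] := select([{stdin}], 5000);   -- wait up to 5000 ms for input on stdin
    if stdin in ready then
      geta(stdin, line);
      print("got:", line);
    else
      print("timed out");                 -- all three returned sets were empty
    end if;
    ```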

    The ``callback'' style of event-driven programming can be supported easily by passing control to a routine that repeatedly calls select and looks up what procedure to call from a map from fd's. Here is an example that happens to deal only in input streams (a typical case):

    proc callback_scheduler;
      loop
        [ready] := select([pool]);
        for fd in ready loop
          call(callback_map(fd), [fd]);
        end loop;
      end loop;
    end proc;

    A callback procedure p is ``registered'' in the callback map by executing

    callback_map(fd) := routine p;

    which associates it with fd. It can later be ``unregistered'' thus:

    callback_map(fd) := om;

    The idea is that you pass control to callback_scheduler after setting up your callback procedures (``event routines''), and everything thereafter is driven from them.

    Notice that the callback procedure has the option of reading further information from the fd passed to it as a parameter if this is appropriate, so the callback model is as convenient to use in conjunction with SETL streams of all kinds as is the basic select model.

    [The above discussion should be updated to warn the unwary about the hazards of modifying callback_map in the callback routines, as expounded in my dissertation, where a solution is given. Since everything is supposed to happen in the callback routines when you program in this style, everyone who does so should be aware of this!]


    Send datagram on UDP client socket

    proc send (string f, string datagram)
    proc send (integer f, string datagram)

    The datagram is sent to the UDP client socket f. The select procedure may be used to check or wait for f to be ready to take a datagram.

    See open, and see also recv, recvfrom, and sendto.


    Send datagram on UDP server socket

    proc sendto (string f, string address, string datagram)
    proc sendto (integer f, string address, string datagram)

    The datagram is sent by the UDP server socket f to the destination address, which should be formatted as a string of the form "ip.name.or.address:portnum". The select procedure may be used to check or wait for f to be ready to take a datagram.

    See open, and see also recvfrom, recv, and send.


    Send file descriptor

    proc send_fd (string f, integer fd)
    proc send_fd (integer f, integer fd)

    The file descriptor fd is sent to a process that is executing recv_fd. The stream designator f must refer to a Unix-domain socket.

    The select procedure may be used to check or wait for f to be ready to take a file descriptor.


    Set environment variable

    proc setenv (string name, string value)
    proc setenv (string name)

    The call setenv (name, value) gives the environment variable name the value value, and omitting the value is equivalent to specifying the null string. See also getenv, unsetenv, and the mildly deprecated putenv.
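
    A short sketch of the difference between an empty value and no definition at all (DEMO_VAR is a made-up name):

    ```setl
    setenv("DEMO_VAR", "hello");
    print(getenv("DEMO_VAR"));
    setenv("DEMO_VAR");            -- same as setenv("DEMO_VAR", "")
    unsetenv("DEMO_VAR");          -- now it is not defined at all
    ```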


    Set group id

    proc setgid (rd integer)

    On Unix systems, if the caller has ``superuser'' privileges, setgid sets both the real and effective group id. For more humble callers, the effect depends on whether the BSD or System V conventions are being followed locally. Refer to the Unix setgid(2) manual page for more details.

    See also getgid, getegid, and setuid.


    Set process group id

    proc setpgrp

    Make the current process a process group leader by making its process group id equal to its process id. This is done by calling the Posix setpgid(2) routine with both arguments zero, not by calling the non-Posix setpgrp(2) routine (because the latter on some systems has the side-effect of making the process a session leader as well). See also getpgrp and pid.


    Set random seed

    proc setrandom (rd integer)

    Establishes a starting point (``seed'') for pseudo-random number generation. If the integer argument is 0, the sequence will be seeded by some unpredictable number related to the clock time. See also random.


    Set user id

    proc setuid (rd integer)

    On Unix systems, if the caller has ``superuser'' privileges, setuid sets both the real and effective user id. For more humble callers, the effect depends on whether the BSD or System V conventions are being followed locally. Refer to the Unix setuid(2) manual page for more details.

    See also getuid, geteuid, and setgid.


    Determine the type of integer quotients

    proc set_intslash (boolean) : boolean

    Calling set_intslash is equivalent to retrieving the current value of intslash and setting it to a new (boolean) value. Using set_intslash, should you have the bad taste to override the default false setting, is arguably more polite than assigning directly to intslash. See also ``/''.


    Distinguish regular expressions, or not

    proc set_magic (boolean) : boolean

    Calling set_magic is equivalent to retrieving the current value of magic and setting it to a new (boolean) value. If you are writing utility routines that depend on the value of magic, use of set_magic is likely to be more convenient (and probably better style) than fetching and assigning magic directly.


    Close I/O in one or both directions

    proc shutdown (string f, integer how)
    proc shutdown (integer f, integer how)
    shut_rd : const integer
    shut_wr : const integer
    shut_rdwr : const integer

    If f designates a bidirectional stream that is (still) open for the I/O direction(s) indicated in the how value, which may be any one of the predefined constants shut_rd, shut_wr, or shut_rdwr, then that direction (or both directions, for shut_rdwr) is closed, using a call to shutdown(2). The underlying file descriptor remains open, however.

    For TCP streams, shutdown can perform what is called a ``half-close'', which can be used, for example, to tell a peer that you have finished sending data (thus making the peer see an end-of-file condition on its input side) but that you would still like to receive a reply on the same (still ``half-open'') connection.
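
    The half-close idiom might look like the following sketch. The mode name "TCP-CLIENT", the host and port, and the use of getfile to collect the reply are all assumptions; see open for what your implementation provides.

    ```setl
    fd := open("localhost:7777", "TCP-CLIENT");  -- assumed mode name; hypothetical peer
    printa(fd, "request");
    flush(fd);                -- make sure the request has really been sent
    shutdown(fd, shut_wr);    -- the peer now sees end-of-file on its input
    reply := getfile(fd);     -- but we can still read until the peer closes
    close(fd);
    ```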

    It is an error to call shutdown on any other kind of stream, except for a file descriptor that is not open at the SETL level (as with close), which is always acceptable but may set last_error.

    See also pump and open.


    Numeric sign

    op sign (integer x) : integer
    op sign (real x) : integer

    Returns -1, 0, or 1 according as x < 0, x = 0, or x > 0 respectively.


    Trigonometric sine

    op sin (real) : real
    op sin (integer) : real


    Hyperbolic sine

    op sinh (real) : real
    op sinh (integer) : real


    Consume initial substring

    proc span (rw string s, string p) : string

    The longest initial run of characters of s that all appear in p is removed from s and returned as the function result. If the first character of s does not appear in p (or s is the null string), nothing happens to s, and the null string is returned. See also break, rspan, and rbreak.
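
    For example, stripping leading blanks:

    ```setl
    s := "   indented";
    blanks := span(s, " ");    -- blanks = "   ", and s is now "indented"
    nothing := span(s, " ");   -- s starts with "i", so nothing = "" and s is unchanged
    ```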


    Split string into tuple

    proc split (string s, string p) : tuple
    proc split (string s, tuple p) : tuple
    proc split (string s) : tuple

    Substrings of s are returned as a tuple, where the regular expression p is a ``delimiter'' pattern defaulting to whitespace, "[ \f\n\r\t\v]+".

    The subject string s is considered to be surrounded by strings satisfying the delimiter pattern p, and split returns the strings between the delimiters. So for example split("David Bacon::WGP:", ":") is ["David Bacon", "", "WGP", ""], but when a line is split into whitespace-delimited words, extra whitespace on either end of the line does not appear as a null string in the result of split. This is because the leading or trailing whitespace merges with the added surrounding whitespace from the delimiter pattern's point of view, and indeed there will be some end-merging whenever p matches 2 or more copies of a leading or trailing substring of s.

    As a special case, split("", p) = [] for any pattern p. (Splitting the null string yields the null tuple.) If you would prefer to regard the semantics of this case as nothing special, you can visualize that when the imaginary surrounding delimiter-satisfying strings actually touch each other, they fuse into just one such satisfying string, leaving no ``between'' in which to find the null string lurking, even for a delimiter pattern like ":".
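
    The three cases discussed above, gathered into one sketch:

    ```setl
    print(split("David Bacon::WGP:", ":"));  -- ["David Bacon", "", "WGP", ""]
    print(split("  two   words  "));         -- end whitespace contributes no null strings
    print(split("", ":"));                   -- the special case: []
    ```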

    See also magic.


    Square root

    op sqrt (real) : real
    op sqrt (integer) : real


    Last subprocess exit status

    status : integer

    Always contains the status of the subprocess that last exited and was waited for by filter, system, or wait, or by close as applied to a pipe, pump, or line-pump stream.

    See also open, pipe_from_child, pipe_to_child, and pump.


    Standard input, output, and error streams

    stdin : const integer = 0
    stdout : const integer = 1
    stderr : const integer = 2

    See also open.


    Machine memory write

    proc store_char (integer i, integer address)
    proc store_short (integer i, integer address)
    proc store_int (integer i, integer address)
    proc store_long (integer i, integer address)
    proc store_float (real r, integer address)
    proc store_double (real r, integer address)
    proc store_long_double (real r, integer address)
    proc store_string (string s, integer address)
    proc store_c_string (string s, integer address)

    These are extremely low-level, machine-dependent, uncontrolled, dangerous procedures to clobber specific locations in the computer's memory. The integer address is assumed to contain a machine address to which some number n of bytes will be written. For store_string, n = #s. For store_c_string, n is 1 more than the lesser of #s and the number of bytes before the first null character ("\0") in s (a trailing null character is always written for store_c_string). For store_char, n = 1. For all the rest, n is machine-dependent. These procedures are typically only used if you have customized a C library interface rather roughly and want to store into structs based on pointers to them without going to the trouble of mapping the structs to SETL objects properly.

    See also fetch_..., pack_..., unpack_..., and mem_copy.


    String representation of value

    op str (var x) : string

    The argument x is converted to a string such that the result could be converted back to the same value using unstr. (Some loss of precision is possible in the case of real x, however. This can be overcome by the use of fixed or floating in place of str on reals. Also, for a procedure reference x, unstr str x is not defined, because str x = "<ROUTINE>".) For a string x that does not have the form of a SETL identifier (alphabetic character followed by 0 or more alphanumeric or underscore characters), the result string is identical to x except that each apostrophe (') is twinned (producing two apostrophes in a row), and an apostrophe is added at each end. If a string x does have the form of a SETL identifier, it is returned unmolested (and this is indeed recognizable by unstr as a string).
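
    A sketch of the quoting rule for strings and of the round trip through unstr:

    ```setl
    print(str "abc");         -- identifier form, so just: abc
    print(str "it's");        -- quoted, with the apostrophe twinned: 'it''s'
    x := [1, {2, 3}, "x y"];
    print(unstr str x = x);   -- the round trip should give true
    ```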

    See also pretty, unpretty, whole, and strad.


    Radix-prefixed string representation of integer

    proc strad (integer x, integer radix) : string

    The integer argument x is converted to a string representing its value as an explicit-radix denotation. The radix argument, which must be an integer in the range 2 to 36, gives the radix (``base'') for the denotation, which will be produced in the form "radix#digits", where the radix is represented in decimal and the digits after the sharp sign are some subset of the characters "0" through "9" and lowercase "a" through "z". For example,

    strad (10, 10) = "10#10"
    strad (10, 16) = "16#a"
    strad (10, 2) = "2#1010"
    strad (-899, 36) = "-36#oz"

    Also, the following identity holds for any integer x when the integer radix is in the range 2 through 36:

    val strad (x, radix) = x

    Note that val in the above identity can be replaced by the more general unstr; strad always produces a denotation acceptable to them both.
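
    The identity can be exercised with a short loop; this is only a sketch, and the radix values are arbitrary picks from the legal range:

    ```setl
    for radix in [2, 10, 16, 36] loop
      s := strad(-899, radix);
      print(s, val s = -899);   -- by the identity, the second column is always true
    end loop;
    ```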

    See also str and whole.


    Substitute first occurrence of pattern in string

    proc sub (rw string s, string p) : string
    proc sub (rw string s, tuple p) : string
    proc sub (rw string s, string p, string r) : string
    proc sub (rw string s, tuple p, string r) : string

    The leftmost occurrence in s, if any, of the pattern given by the regular expression p is replaced by r, which defaults to the null string. The substring of s replaced by this operation is returned as a string, unless p did not occur in s, in which case om is returned and s is left unmodified. See also gsub, mark, and gmark.
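
    A sketch using a pattern with no special regular-expression characters in it:

    ```setl
    s := "hello world";
    old := sub(s, "o", "0");      -- old = "o", s = "hell0 world" (leftmost match only)
    gone := sub(s, "xyz", "?");   -- no match: gone = om, and s is unchanged
    ```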

    [Need rules for regexps, as mentioned under gmark.]


    Subset test

    op subset (set ss, set s) : boolean

    Returns true if ss is a subset of s. Thus (ss subset s) = (s incs ss).


    Create symbolic link

    proc symlink (string f, string new)

    Atomically create a symbolic link to f under the filename new using symlink(2), if new does not exist before the call. There is no return value, but calling clear_error before the operation and inspecting last_error after it can be used to determine whether the operation was successful. Thus symlink can be used to implement a ``test and set'' mutex lock in the file system: if new already exists, the operation will fail; and if it doesn't exist, it will be created and the calling process will then ``own'' the lock. Note that f may or may not refer to an existing file. In any case, if new ``points to'' f, such as after a successful call to symlink, subsequent attempts to read or write new will attempt to read or write f.
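
    The ``test and set'' idea might be sketched as below. The helper try_lock and the lock filename are hypothetical, and comparing last_error with no_error is an assumption about how a clean result is reported.

    ```setl
    if try_lock("/tmp/demo.lock") then
      print("lock acquired");      -- critical section goes here
      unlink("/tmp/demo.lock");    -- release the lock
    else
      print("someone else holds the lock");
    end if;

    proc try_lock(lockname);       -- hypothetical helper, not a built-in
      clear_error;
      symlink(str (pid()), lockname);   -- the link target need not exist
      return last_error = no_error;     -- assumption: no_error means success
    end proc;
    ```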

    See also link, readlink, unlink, lexists, and fexists.


    Execute system command in subshell

    proc system (string cmd) : integer

    The command cmd is passed to the Posix system(3) routine for execution as ``/bin/sh -c cmd'', so the cmd may itself include parameters to the program to be executed in the subshell. The subshell's exit status is returned, and also made available in status.
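
    For example (the command line itself is arbitrary):

    ```setl
    rc := system("ls -l /tmp > /dev/null");
    if rc /= 0 then
      print("command failed with status", status);   -- status holds the same value as rc
    end if;
    ```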


    Low-level read

    proc sys_read (integer fd, integer n) : string

    This procedure bypasses SETL buffering and calls the Posix read(2) routine directly. The file descriptor fd may or may not be open at the SETL level (see open). Up to n bytes are read from the presumed input stream and returned as a string. It may be less than n bytes long if the end of the input is encountered before n bytes have been read or if the fd is a socket or pipe or pump where the output process does a system-level ``write'' or ``send'' or flushes its buffer after sending fewer than n bytes.


    Low-level write

    proc sys_write (integer fd, string s) : integer

    This procedure bypasses SETL buffering and calls the Posix write(2) routine directly. The file descriptor fd may or may not be open at the SETL level (see open). An attempt is made to write all #s bytes of s. This should succeed except in rather obscure circumstances, but in any case the number of bytes actually written is returned.
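
    Since the returned count may be less than #s, a cautious caller can loop until everything is out. The helper write_all is hypothetical, and treating om or a non-positive count as failure is an assumption:

    ```setl
    proc write_all(fd, s);         -- hypothetical helper, not a built-in
      while #s > 0 loop
        n := sys_write(fd, s);
        if n = om or n <= 0 then   -- assumption: this means no progress was made
          return false;
        end if;
        s := s(n+1..);             -- drop the bytes that were written
      end loop;
      return true;
    end proc;
    ```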


    Trigonometric tangent

    op tan (real) : real
    op tan (integer) : real


    Hyperbolic tangent

    op tanh (real) : real
    op tanh (integer) : real


    Autoflush output stream per other stream input

    proc tie (string, string)
    proc tie (string, integer)
    proc tie (integer, string)
    proc tie (integer, integer)

    The arguments designate open streams. After the call to tie, whenever an input operation such as reada or geta is requested on one of the two streams, any buffered output on the other is written out first. See also flush.

    There is no untie routine; it does not seem useful.


    CPU time in milliseconds

    proc time : integer

    This gives the total amount of CPU time used by the current process and all its child processes that have exited and been waited for (automatically or by wait), in milliseconds. This includes both ``user'' time and time spent by the ``system'' on behalf of the user. See also clock and tod.


    Unique temporary filename

    proc tmpnam : string

    This is an interface to the Posix tmpnam(3) routine.


    Alphabetic case conversions

    op to_lower (string) : string
    op to_upper (string) : string

    A string of length equal to that of the argument is returned. Characters other than A-Z and a-z are unaffected.


    Calendar time in milliseconds

    proc tod : integer

    This is the total number of milliseconds that have elapsed in the ``epoch'' beginning 1 January 1970 UTC. See also clock, time, date, and fdate.


    Predefined ``true'' boolean value

    true : const boolean

    See also false.


    Type of SETL value

    op type (var) : string

    Returns "ATOM", "BOOLEAN", "INTEGER", "REAL", "SET", "STRING", "TUPLE", "ROUTINE", or "OM".


    Set file creation mask

    proc umask : integer
    proc umask (integer) : integer

    This is an interface to the Posix routine umask(2). Both forms of the call return the current value of the mask, and the second form also changes it to a new value. For example,

    umask(8#022);

    arranges that files created by the SETL program and its child processes will not be writable by other users or groups unless subsequently made so by the chmod(1) command. See also system and open.
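
    Because the old mask is returned, a temporary change is easy to undo, as in this sketch:

    ```setl
    old := umask(8#077);   -- tighten the mask, remembering the previous one
    print("previous mask was", old);
    umask(old);            -- restore it
    ```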


    Push character(s) back on input stream

    proc ungetc (string f, string c)
    proc ungetc (integer f, string c)

    Immediately after any input operation on the stream f that yielded at least one character c, the call ungetc(f,c) ``pushes back'' the character c on the stream f so that c will appear as the next input character. At least one character of pushback is guaranteed after at least one character has been fetched.

    See also peekc and ungetchar.


    Push character(s) back on standard input

    proc ungetchar (string c)

    The call ungetchar (c) is the same as ungetc (stdin, c).


    Convert from hexadecimal

    op unhex (string) : string

    This is the inverse of hex, but returns om if its argument is not a string consisting of an even number of (case-insensitively recognized) hexadecimal characters.


    Destroy file reference

    proc unlink (string f)

    Remove the name f from the file system. If f is the last reference (link) to the underlying file, including ``invisible'' references from running processes, the file will be destroyed.

    See also link, symlink, readlink, fexists, and lexists.


    Byte unpacking

    op unpack_char (string) : integer
    op unpack_short (string) : integer
    op unpack_int (string) : integer
    op unpack_long (string) : integer
    op unpack_float (string) : real
    op unpack_double (string) : real
    op unpack_long_double (string) : real

    These are low-level, machine-dependent (but not very dangerous) operators for interpreting strings as representations of the predefined C types on the machine you are running SETL on. You must pass a string of the right length for the C type suggested by the operator name (the SETL implementation should check this).

    See also pack_..., fetch_..., and store_....


    ``Unprettify'' string

    op unpretty (string s) : string

    The string s should be in ``pretty'' form, although the unpretty operator is somewhat liberal in what it accepts relative to what pretty produces. However, s must still begin and end with an apostrophe (') or begin and end with a double quote (").

    Inside s, every character must be one of the 95 characters ASCII considers ``fit to print'', including blank. The unpretty operator makes the following interpretations in transforming s into an unrestricted string:

    Backslash followed by any of the 32 ``glyphs'' (that is, all of the ASCII ``printable'' characters apart from alphanumerics and blank) means just that glyph.

    Backslash followed by up to 3 octal digits means a character having the bit pattern suggested by the digits, as in C.

    Backslash followed by x and then 1 or 2 hexadecimal digits (0123456789abcdefABCDEF) is an alternative to the octal escape.

    Backslash followed by a, b, f, n, r, t, or v means the same thing as it does in C (i.e., audible alarm, backspace, formfeed, newline, carriage return, horizontal tab, or vertical tab, respectively).

    Currently these are exactly the rules governing what can be in a literal character string in SETL source code, and the escape sequences have the same meanings.

    The rules may be liberalized in the future, however, because I don't think insisting on the use of octal or hexadecimal escapes for specifying, say, the ESC character is more portable than just embedding the thing right into a string---a good ASCII<->EBCDIC translation of the source code would leave ESC meaning ESC either way, whereas the use of \x1b (the ASCII code for ESC) would be an error in an EBCDIC environment if the source code was just translated and otherwise unmodified. So maybe you want an escape convention that allows special characters to be identified by name, e.g., ESC, SOH, etc. Suggestions are welcome.

    See also pretty, unstr, and str.


    Remove environment variable definition

    proc unsetenv (string name)

    If the environment variable name was defined, undefine it. Note that this is not the same as setting it to the null string. See also setenv, getenv, and poor old putenv.


    Read value from string

    op unstr (string s) : var

    Essentially, this is the inverse of str, but more liberal in what it accepts relative to what str produces. In particular, quoted strings can use either the apostrophe (') or the double quote (") as the beginning and ending character q. Whichever one is used is also the one that should be twinned internally to represent q. As a convenience (to be consistent with what reada accepts), each backslash followed immediately by a true newline is silently absorbed when it occurs in a quoted string. Apart from the interpretation of qq and the backslash-newline absorption, unstr is completely literal about how it interprets what is inside a quoted string. Other types are recognized by their first non-whitespace character, except that numeric types don't necessarily get resolved to real or integer that early.

    The denotype operator can be used to check whether a string is acceptable to unstr.

    See also val, reads, and unpretty.


    Read numeric value from string

    op val (string) : integer
    op val (string) : real

    This is similar to unstr but expects a numeric denotation as an argument. Unlike unstr, however, it will return om instead of raising an exception if the argument string does not satisfy its syntactic requirements.
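
    A few illustrative calls (the argument strings are arbitrary):

    ```setl
    print((val "42") + 1);         -- integer denotation
    print((val "1.5") * 2.0);      -- real denotation
    print(val "16#ff");            -- explicit-radix form, as produced by strad
    print(val "forty-two" = om);   -- not numeric, so val yields om rather than failing
    ```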

    See also denotype and strad.


    Wait for subprocess to complete

    proc wait : integer
    proc wait (boolean) : integer

    The call wait() (or wait) is equivalent to wait (true), which means block until some child process exits. The return value is the process id of a child process that has exited, or 0 if wait (false) was called but no child has terminated yet.

    Waiting is automatic for child processes started by system or filter, and happens upon close for those started by pipe_from_child, pipe_to_child, pump, or open on a pipe, pump, or line-pump stream. Only in the case of fork is it necessary (or even wise) to call wait to clear the child process entry from the kernel's record. This wait-and-clear sequence is sometimes called ``reaping'' the child process. The process itself is called a ``zombie'' between the time it exits and the time it is reaped. If the parent exits with zombies outstanding, they are reaped automatically by the ancestor of all processes on the host system. That primordial process, incidentally, also inherits as direct children any unterminated processes that are ``orphaned'' by a parent that exits before they do.
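
    A sketch of reaping a forked child. It assumes the usual Unix convention that fork yields 0 in the child and the child's process id in the parent:

    ```setl
    child := fork();
    if child = 0 then
      print("child running");   -- in the child process
      stop;
    else
      reaped := wait();         -- blocks until a child exits
      print("reaped", reaped, "status", status);
    end if;
    ```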

    See also exec, status, and time.


    Format integer

    proc whole (integer x, integer w) : string
    proc whole (real x, integer w) : string

    The number x is converted to a string of at least abs w characters. If w is positive, the result is padded with blanks on the left as necessary; if w is negative, it is padded on the right as necessary to reach abs w characters. For integer x, whole (x, 0) is the same as str x.

    However, if x is real, then whole treats it as round x.
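
    Brackets make the padding visible in this sketch:

    ```setl
    print("[" + whole(42, 6) + "]");    -- right-justified: [    42]
    print("[" + whole(42, -6) + "]");   -- left-justified:  [42    ]
    print("[" + whole(3.7, 0) + "]");   -- a real is rounded first: [4]
    ```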

    See also fixed, floating, and strad.


    Set plus one element

    op with (set, var) : set

    Definition: s with x = s + {x}.


    Write value(s) to standard output

    proc write (...)

    Equivalent to writea (stdout, ...).


    Write value(s) to output stream

    proc writea (string f, ...)
    proc writea (integer f, ...)

    There can be 0 or more arguments after f, of any type. They are sent in sequence to the stream f, separated by single spaces and followed by a newline character. All of them are written as if they had been passed through str first, with no exception for strings (contrast printa). Values written by writea, except for atoms (see newat) and procedure references (see routine), can be read by reada. This procedure is functionally identical to putb.
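
    The difference in how strings are treated shows up in a sketch like this:

    ```setl
    write("a b", 1);   -- strings go through str:     'a b' 1
    print("a b", 1);   -- print leaves strings alone: a b 1
    write("abc");      -- identifier-like, so str adds no quotes: abc
    ```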



    [Terry Boult has requested that some credit go to MURI and the NSF for their tacit support of the recent portions of this work. This is right and proper, and in fact there should be a whole string of credits to people who have put up with my working on this, going right back to 1988.]

    [In fact, I should also thank people like Toto Paxia and Eric Freudenthal for giving me access to some machines I might otherwise not have been able to test the SETL system on.]

           dB    bacon@cs.nyu.edu