Copyright (C) 1994-1996 Marc Feeley.
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.
Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the copyright holder.
The Gambit programming system is a full implementation of the Scheme language which conforms to the R4RS and IEEE Scheme standards. It consists of two programs: gsi, the Gambit Scheme interpreter, and gsc, the Gambit Scheme compiler.
Gambit-C is a version of the Gambit system in which the compiler generates portable C code, making the whole Gambit-C system and the programs compiled with it easily portable to many computer architectures for which a C compiler is available.
For the most up-to-date information on Gambit, please check the Gambit web page at `http://www.iro.umontreal.ca/~gambit' or send mail to `gambit@iro.umontreal.ca'.
Bug reports should be sent to `gambit@iro.umontreal.ca'.
Synopsis:
gsi [-:runtimeoption,...] [-f] [-i] [file...]
The interpreter is executed in interactive mode when no command line argument is given other than options and the input does not come from a pipe. Pipe mode is when no command line argument is given and the input comes from a pipe. Finally, batch mode is when command line arguments are present. The interpreter ignores the `-i' option.
In interactive mode the interpreter starts a read-eval-print loop (REPL) to interact with the user. The system prompts the user for a command, reads the command from standard input and executes it, sending any output generated, including error messages, to standard output.
The commands entered by the user are typically expressions that are to be evaluated. These expressions are evaluated in the global interaction environment. The REPL adds to this environment any definition entered using the define and define-macro special forms.
The result of evaluation is written to standard output unless it is the special "void" object. This object is returned by most procedures and special forms which the standard defines as returning an unspecified value (e.g. write, set!, define).
When an evaluation error occurs or the user interrupts the system (usually by typing ^C), a nested REPL is initiated at the point of error, making it possible to inspect the context of the error. The prompt of nested REPLs includes the nesting level. An end of file (usually ^D on UNIX and ^Z on MSDOS and Windows) will cause the current REPL to be aborted and the enclosing REPL (one nesting level less) to be resumed.
Gambit combines the standard REPL functions with those of the debugger. At any time the user can examine the frames in the REPL's continuation (which is the continuation of the error or the initial continuation). This is useful to determine which part of the program triggered an error and which chain of calls led to the error.
Expressions entered at a nested REPL are evaluated in the environment of the continuation frame currently being examined if that frame was created by interpreted Scheme code. If the frame was created by compiled Scheme code then expressions get evaluated in the global interaction environment. This feature may be used in interpreted code to get the value of a variable in the current frame or to change its value with set!. Note that some special forms (define in particular) can only be evaluated in the global interaction environment.
In addition to expressions, the REPL accepts the following special "comma" commands:
,?
,q
,t
,d
,r
,n
,+
,-
,b
,l
,i
,y
Here is a sample interaction with gsi:
% gsi
Gambit Version 2.4
> (define (f x) (let* ((y 10) (z (* x y))) (- x z)))
> (define (g n) (if (> n 1) (+ 1 (g (/ n 2))) (f 'oops)))
> (g 8)
*** ERROR -- NUMBER expected
(* 'oops 10)
1> ,b
0   f  (* x y)
-1  g  (g (/ n 2))
-2  g  (g (/ n 2))
-3  g  (g (/ n 2))
-4  (interaction)  (g 8)
-5  ##initial-continuation
1> ,i
#<procedure f> =
(lambda (x) (let ((y 10)) (let ((z (* x y))) (- x z))))
1> ,y
(* x y)
1> ,l
y = 10
x = oops
1> (set! x 1)
1> ,l
y = 10
x = 1
1> ,r
Return value: (* x y)
-6
> ,q
In pipe mode the interpreter evaluates the expressions read from standard input in the global interaction environment and writes each result on a separate line on standard output. Evaluation errors cause the interpreter to exit. Error messages are sent to standard error.
For example, under UNIX:
% echo "(sqrt (read)) 9 (expt 2 100)" | gsi
3
1267650600228229401496703205376
In batch mode the command line arguments designate files to be loaded. The interpreter loads these files in left-to-right order using the load procedure. The files can have no extension, or the extension `.scm' or `.on' where n is a positive integer that acts as a version number (the `.on' extension is used for object files produced by gsc). When the file name has no extension the load procedure first attempts to load the file with no extension as a Scheme source file. If that file doesn't exist it completes the file name with a `.on' extension with the highest consecutive version number starting with 1, and loads that file as an object file. If that file doesn't exist the file name is completed with a `.scm' extension and the file is loaded as a Scheme source file.
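The lookup order described above can be sketched as a small Scheme procedure. This is only an illustration of the rule as stated in the text, not the code used by load: the name resolve-load-name and its exists? parameter are hypothetical, and the predicate is passed in so the rule can be tried without touching the file system.

```scheme
;; Hypothetical helper, not part of Gambit: mirrors the lookup order that
;; the text gives for a file name written without an extension.
;; exists? is a one-argument predicate on file names (e.g. file-exists?).
(define (resolve-load-name name exists?)
  ;; Largest n such that name.o1 ... name.on all exist, or 0 if none.
  (define (highest-version n)
    (if (exists? (string-append name ".o" (number->string (+ n 1))))
        (highest-version (+ n 1))
        n))
  (if (exists? name)
      (cons 'source name)                 ; 1. the bare name, as source
      (let ((n (highest-version 0)))
        (if (> n 0)
            ;; 2. object file with the highest consecutive version
            (cons 'object (string-append name ".o" (number->string n)))
            ;; 3. fall back to the .scm source file
            (cons 'source (string-append name ".scm"))))))
```

With a directory holding `m1.o1', `m1.o2', `m1.o4' and `m1.scm', the call (resolve-load-name "m1" file-exists?) would select `m1.o2', because `m1.o3' breaks the consecutive sequence.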
The interpreter exits after loading the files or as soon as an error occurs. Input is taken from standard input and any output generated is sent to standard output except for error messages which go to standard error.
For example, under UNIX:
% cat m1.scm
(display "hello") (newline)
% cat m2.scm
(display "world") (newline)
% gsi m1 m2
hello
world
There are two ways to customize the interpreter. When the interpreter starts, it first tries to evaluate `(load "~~/gambc")' (for an explanation of how file names are interpreted see section Handling of file names). No error is signaled if the file does not exist. Interpreter extensions and patches that are meant to apply to all users and all modes should go in that file.
Extensions which are meant to apply to a single user or to a specific directory are best placed in the initialization file, which is a file containing Scheme code. In all modes, the interpreter first tries to locate the initialization file by searching the following locations: `gambc.scm' and `~/gambc.scm'. The first file that is found is examined as though the expression (include initialization-file) had been entered at the read-eval-print loop, where initialization-file is the file that was found. Note that by using an include the macros defined in the initialization file will be visible from the read-eval-print loop (this would not have been the case if load had been used). The initialization file is not searched for or examined if the `-f' option is specified.
Under UNIX, the status is 0 when the interpreter exits normally and is 1 when the interpreter exits due to an error.
For example, if the shell is csh:
% echo "(/ 1 0)" | gsi
*** ERROR -- Division by zero
(/ 1 0)
% echo $status
1
Gambit's load procedure treats specially any Scheme source file beginning with the token `#!'. The load procedure discards the rest of the line and then loads the rest of the file normally. If this file is being loaded because it is an argument on the interpreter's command line, then the interpreter is terminated after loading the file.
This feature can be used under UNIX to write Scheme scripts by simply prefixing a file of Scheme code with a line containing `#! gsi' (note the space between the `#!' and the `gsi' so that the `#!' token is read properly by gsi). When such a script is executed, the script's file name followed by the script's command line arguments are added to the arguments passed to the interpreter. Thus, the interpreter will be run in batch mode and the interpreter will call load with the script's file name as argument. The script's arguments can be accessed by calling the procedure argv. This nullary procedure returns the script's file name and its arguments as a list of strings.
For example:
% cat upto
#! gsi -f
(define (usage)
  (display "usage: upto n")
  (newline))
(if (not (= (length (argv)) 2))
    (usage)
    (let ((n (string->number (list-ref (argv) 1))))
      (if (and n (exact? n) (integer? n))
          (let loop ((i 1))
            (if (<= i n)
                (begin (write i) (newline) (loop (+ i 1)))))
          (usage))))
% upto 3
1
2
3
An interesting application of Scheme scripts is to implement CGI scripts. Here is a sample CGI script that maintains a counter that is incremented each time the CGI script is accessed:
#! gsi -f
(define n (+ 1 (with-input-from-file "counter" read)))
(with-output-to-file "counter" (lambda () (write n)))
(display "Content-type: text/html") (newline)
(newline)
(display "Access #") (display n) (newline)
Synopsis:
gsc [-:runtimeoption,...] [-f] [-i] [-verbose] [-report] [-expansion] [-gvm] [-debug] [-o output] [-c] [-flat] [-l base] [file...]
When no command line argument is present other than options the compiler behaves like the interpreter. This means that interactive mode is selected if the input does not come from a pipe, otherwise pipe mode is selected. In these modes, the only difference with the interpreter is that some additional predefined procedures are available (notably compile-file).
Just like the interpreter, the compiler will examine the initialization file unless the `-f' option is specified.
In batch mode gsc takes a set of file names (either with `.scm', `.c', or no extension) on the command line and compiles each Scheme source file into a C file. File names with no extension are taken to be Scheme source files and a `.scm' extension is automatically appended to the file name. For each Scheme source file `file.scm', the C file `file.c' will be produced.
The C files produced by the compiler serve two purposes. They will have to be compiled by a C compiler to generate object files, and also they contain information to be read by Gambit's linker to generate a link file. The link file is a C file that collects various linking information for a group of modules, such as the set of all symbols and global variables used by the modules. The linker is automatically invoked unless the `-c' option appears on the command line.
Compiler options must be specified before the first file name and after the `-:' runtime option (see section Runtime options for all programs). If present, the `-f' and `-i' compiler options must come first. The available options are:
-f
-i
-verbose
-report
-expansion
-gvm
-debug
-o output
-c
-flat
-l base
The `-i' option forces the compiler to process the remaining command line arguments like the interpreter.
The `-verbose' option displays on standard output a trace of the compiler's activity.
The `-report' option displays on standard output a global variable usage report. Each global variable used in the program is listed with 4 flags that indicate if the global variable is defined, referenced, mutated and called.
The `-expansion' option displays on standard output the source code after expansion and inlining by the front end.
The `-gvm' option generates a listing of the intermediate code for the "Gambit Virtual Machine" (GVM) of each Scheme file on `file.gvm'.
The `-debug' option causes debugging information to be saved in the code generated. This makes it possible for the REPL to display a more precise backtrace and for pp to display the source code of procedures. The debugging information is very large (it typically increases the size of the object file by a factor of 5).
The `-o' option sets the name of the output file generated by the compiler. If a link file is being generated the name specified is that of the link file. Otherwise the name specified is that of the C file (this option is ignored if the compiler generates more than one C file).
If the `-c' option does not appear on the command line, the Gambit linker is invoked to generate the link file from the set of C files specified on the command line or produced by the Gambit compiler. Unless the name is specified explicitly with the `-o' option, the link file is named `last_.c', where `last.c' is the last file in the set of C files.
The `-flat' option is only meaningful if a link file is being generated (i.e. the `-c' option is absent). The `-flat' option directs the Gambit linker to generate a flat link file. By default, the linker generates an incremental link file (see the next section for a description of the two types of link files).
The `-l' option is only meaningful if an incremental link file is being generated (i.e. the `-c' and `-flat' options are absent). The `-l' option specifies the link file (without the `.c' extension) of the base library to use for the incremental link. By default the link file of the Gambit runtime library is used (i.e. `~~/_gambc.c').
Gambit can be used to create applications and libraries of Scheme modules. This section explains the steps required to do so and the role played by the link files.
In general, an application is composed of a set of Scheme modules and C modules. Some of the modules are part of the Gambit runtime library and the other modules are supplied by the user. When the application is started it must setup various global tables (including the symbol table and the global variable table) and then sequentially execute all the Scheme modules. The information required for this is contained in one or more link files generated by the Gambit linker from the C files produced by the Gambit compiler.
When a single link file is used to contain the linking information of all the Scheme modules it is called a flat link file. Thus an application built with a flat link file contains in its link file both information on the user modules and on the runtime library. This is fine if the application is to be statically linked but is wasteful in a shared-library context because the linking information of the runtime library can't be shared and will be duplicated in all applications (this linking information typically takes 150 Kbytes).
Flat link files are mainly useful to bundle multiple Scheme modules to make a runtime library (such as the Gambit runtime library) or to make a single file that can be loaded with the load procedure.
An incremental link file contains only the linking information that is not already contained in a second link file (the "base" link file). Assuming that a flat link file was produced when the runtime library was linked, an application can be built by linking the user modules with the runtime library's link file, producing an incremental link file. This allows the creation of a shared-library which contains the modules of the runtime library and its flat link file. The application is dynamically linked with this shared-library and only contains the user modules and the incremental link file. For small applications this approach greatly reduces the size of the application because the incremental link file is small. A "hello world" program built this way can be as small as 5 Kbytes. Note that it is perfectly fine to use an incremental link file for statically linked programs (there is very little loss compared to a single flat link file).
Incremental link files may be built from other incremental link files. This allows the creation of shared-libraries which extend the functionality of the Gambit runtime library.
The simplest way to create an executable program is to call gsc to compile each Scheme module into a C file and create an incremental link file. The C files and the link file must then be compiled with a C compiler and linked (at the object file level) with the Gambit runtime library and possibly other libraries (such as the math library and the dynamic loading library). For example, here is how a program with three modules (one in C and two in Scheme) can be built:
% uname -a
Linux bailey 1.2.13 #2 Wed Aug 28 16:29:41 GMT 1996 i586
% cat m1.c
int power_of_2 (int x) { return 1<<x; }
% cat m2.scm
(c-declare "extern int power_of_2 ();")
(define pow2 (c-lambda (int) int "power_of_2"))
(define (twice x) (cons x x))
% cat m3.scm
(write (map twice (map pow2 '(1 2 3 4)))) (newline)
% gsc m2 m3
m2:
m3:
% gcc m1.c m2.c m3.c m3_.c -lgambc
% a.out
((2 . 2) (4 . 4) (8 . 8) (16 . 16))
To bundle multiple modules into a single file that can be loaded with the load procedure, a flat link file is needed. When compiling the C files and link file generated, the flags `-D___LIBRARY', `-D___SHARED' and `-D___BIND_LATE' must be passed to the C compiler. The three modules of the previous example can be bundled in this way:
% uname -a
Linux bailey 1.2.13 #2 Wed Aug 28 16:29:41 GMT 1996 i586
% gsc -flat -o foo.c m2 m3
m2:
m3:
*** WARNING -- "cons" is not defined,
*** referenced in: ("m2.c")
*** WARNING -- "map" is not defined,
*** referenced in: ("m3.c")
*** WARNING -- "newline" is not defined,
*** referenced in: ("m3.c")
*** WARNING -- "write" is not defined,
*** referenced in: ("m3.c")
% gcc -shared -fPIC -D___LIBRARY -D___SHARED -D___BIND_LATE m1.c m2.c m3.c foo.c -o foo.o1
% gsi
Gambit Version 2.4
> (load "foo")
((2 . 2) (4 . 4) (8 . 8) (16 . 16))
"/users/feeley/foo.o1"
> ,q
The warnings indicate that there are no definitions (defines or set!s) of the variables cons, map, newline and write in the set of modules being linked. Before `foo.o1' is loaded, these variables will have to be bound, either implicitly (by the runtime library) or explicitly.
To build a shared-library which extends the functionality of the Gambit runtime library, an incremental link file should be used. When compiling the C files and link file generated, the flags `-D___LIBRARY' and `-D___SHARED' must be passed to the C compiler. The shared-library `mylib.so' containing the first two modules of the previous example can be built this way:
% uname -a
Linux bailey 1.2.13 #2 Wed Aug 28 16:29:41 GMT 1996 i586
% gsc -o mylib.c m2
% gcc -shared -fPIC -D___LIBRARY -D___SHARED m1.c m2.c mylib.c -o mylib.so
This shared-library can then be used to build an executable program from the third module of the previous example:
% gsc -l mylib m3
% gcc m3.c m3_.c mylib.so -lgambc
% LD_LIBRARY_PATH=. a.out
((2 . 2) (4 . 4) (8 . 8) (16 . 16))
The performance of the code can be increased by passing the `-D___SINGLE_HOST' flag to the C compiler. This will merge all the procedures of a module into a single C procedure, which reduces the cost of intra-module procedure calls. In addition the `-O' option can be passed to the C compiler. For large modules, it will not be practical to specify both `-O' and `-D___SINGLE_HOST' for typical C compilers because the compile time will be high and the C compiler might even fail to compile the program for lack of memory.
Both gsi and gsc as well as executable programs compiled and linked using gsc take a `-:' option which supplies parameters to the runtime system. This option must appear first on the command line. The colon is followed by a comma separated list of options with no intervening spaces.
The available options are:
d
t (treat stdin, stdout and stderr as terminals)
mheapsize
hheapsize
c
1
2
4
8
The `d' option selects debugging mode which displays a trace on standard error to monitor the activity of the runtime system.
The `t' option forces the standard input and output to be treated like a terminal (i.e. as though isatty was true on stdin, stdout and stderr). This is useful in situations, such as running emacs under Windows-NT, where running the interpreter as a subprocess invokes pipe mode. By using the `t' option in this situation, the interpreter will enter interactive mode.
The `m' option specifies the minimum size of the heap. The `m' is immediately followed by an integer indicating the number of kilobytes of memory. The heap will not shrink lower than this size. By default, the minimum size is 0.
The `h' option specifies the maximum size of the heap. The `h' is immediately followed by an integer indicating the number of kilobytes of memory. The heap will not grow larger than this size. By default, there is no limit (i.e. the heap will grow until the virtual memory is exhausted).
The `c' option selects the native character encoding (1 byte per character) as the default character encoding for I/O. This is used by default if no default encoding is specified.
The `1' option selects `LATIN-1' (1 byte Unicode) default character encoding for I/O.
The `2' option selects `UCS-2' (2 byte Unicode) default character encoding for I/O.
The `4' option selects `UCS-4' (4 byte Unicode) default character encoding for I/O.
The `8' option selects `UTF-8' (variable length Unicode) default character encoding for I/O.
A file name which starts with the characters `~/' designates a file in the user's home directory. The user's home directory is contained in the `HOME' environment variable.
A file name which starts with the characters `~user/' designates a file in the home directory of the given user.
A file name which starts with the characters `~~/' designates a file in the Gambit installation directory. This directory is defined when the Gambit runtime library is compiled. By default it is `/usr/local/share/gambc'. To override this binding define the `GAMBCDIR' environment variable.
The Gambit Scheme system conforms to the R4RS and IEEE Scheme standards. Gambit supports a number of extensions to these standards by extending the behavior of standard special forms and procedures, and by adding special forms and procedures.
The extensions given in this section are all compatible with the Scheme standards. This means that the special forms and procedures behave as defined in the standards when they are used according to the standards.
procedure: open-input-file file [char-encoding]
procedure: open-output-file file [char-encoding]
procedure: call-with-input-file file proc [char-encoding]
procedure: call-with-output-file file proc [char-encoding]
procedure: with-input-from-file file thunk [char-encoding]
procedure: with-output-to-file file thunk [char-encoding]
procedure: load file [char-encoding]
These procedures take an optional argument which specifies the character encoding to use for I/O operations on the port. char-encoding must be one of the following symbols:
char
latin1
ucs2
ucs4
utf8
If char-encoding is not specified, the default character encoding is used (see section Runtime options for all programs).
procedure: transcript-on file
These procedures do nothing.
procedure: read [port]
The read and write procedures support the following names for characters:
#\newline
#\space
#\nul
#\backspace
#\tab
#\linefeed
#\page
#\return
#\rubout
The read and write procedures support the following escape sequences inside character strings:
\n
\a
\b
\t
\v
\f
\r
\"
"
\\
\
\ooo
\xhh
The read and write procedures support the following objects, which are distinct from any other type:
#!
#!eof
#!optional
#!rest
#!key
special form: include file
file must be a string naming an existing file containing Scheme source code. The include special form splices the content of the specified source file. This form can only appear where a define form is acceptable.
special form: define-macro (name arg...) body
Define name as a macro special form which expands into body. This form can only appear where a define form is acceptable.
Macros are lexically scoped. The scope of a local macro definition extends from the definition to the end of the body of the surrounding binding construct. Macros defined at the top level of a Scheme module are only visible in that module. To have access to the macro definitions contained in a file, that file must be included using the include special form. Macros which are visible from the REPL are also visible during the compilation of Scheme source files.
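As a minimal sketch of the form, the following defines and uses a two-argument macro. The macro swap! and the variable names are illustrative only and do not come from the manual. Because define-macro performs plain substitution, the temporary swap-tmp would clash with a user variable of the same name.

```scheme
;; Illustrative macro (not from the manual): exchange the values of two
;; variables. The body is built with quasiquote from the unevaluated
;; argument forms a and b.
(define-macro (swap! a b)
  `(let ((swap-tmp ,a))     ; swap-tmp must not be one of the arguments
     (set! ,a ,b)
     (set! ,b swap-tmp)))

(define x 1)
(define y 2)
(swap! x y)
(write (list x y))   ; writes (2 1)
(newline)
```

To use swap! from another file, that file would have to include the file containing the definition, as explained above.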
special form: declare declaration...
This form introduces declarations to be used by the compiler (currently the interpreter ignores the declarations). This form can only appear where a define form is acceptable. Declarations are lexically scoped in the same way as macros. The following declarations are accepted by the compiler:
(dialect)
(strategy)
([not] inline)
(inlining-limit n)
define or lambda) and the call site must be declared as (inline), and the compiler must be able to find the definition of the procedure referred to at the call site (if the procedure is bound to a global variable, the definition site must have a (block) declaration). Note that inlining usually causes much less code expansion than specified by the inlining limit (an expansion around 10% is common for n=300).
([not] lambda-lift)
([not] standard-bindings var...)
([not] extended-bindings var...)
([not] safe)
char=? may disregard the type of its arguments in `safe' as well as `not safe' mode.
([not] interrupts-enabled)
(number-type primitive...)
The default declarations used by the compiler are:
(ieee-scheme) (separate) (inline) (inlining-limit 300) (lambda-lift) (not standard-bindings) (not extended-bindings) (safe) (interrupts-enabled) (generic)
These declarations are compatible with the semantics of Scheme. Typically used declarations that enhance performance, at the cost of violating the Scheme semantics, are: (standard-bindings), (block), (not safe) and (fixnum).
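A module that opts into the performance-oriented declarations named above might look like the sketch below. The procedure fib is only an illustration. The comments give the usual reading of each declaration, which is an assumption here since the per-declaration descriptions are not reproduced in this section. The interpreter ignores the declare form, so the file behaves the same under gsi.

```scheme
;; Assumed reading of the declarations: +, - and < denote the predefined
;; procedures (standard-bindings), globals defined in this file are not
;; redefined from outside it (block), run time checks may be omitted
;; (not safe), and arithmetic is done on small exact integers (fixnum).
(declare (standard-bindings) (block) (not safe) (fixnum))

(define (fib n)
  (if (< n 2)
      n
      (+ (fib (- n 1)) (fib (- n 2)))))

(write (fib 20))   ; writes 6765
(newline)
```

Since these declarations give up the guarantees of standard Scheme, it seems prudent to test a module with the default declarations first.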
special form: c-declare c-declaration
special form: c-initialize c-code
special form: c-lambda (type1...) result-type c-name-or-code
special form: c-define (name param1...) (type1...) result-type c-name scope body...
These special forms are part of the "C-interface" which allows Scheme code to interact with C code. For a complete description of the C-interface see section Interface to C.
special form: define-structure name field...
Record data types similar to Pascal records and C struct types can be defined using the define-structure special form. The identifier name specifies the name of the new data type. The structure name is followed by k identifiers naming each field of the record. The define-structure form expands into a set of definitions of the following procedures:
Record data types are printed out as `#s(name (field value)...)', where the field/value pair appears for each field and value is the value contained in the corresponding field. Record data types cannot be read by the read procedure.
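The sketch below defines a two-field record and exercises the generated procedures. The generated names used here (make-point, point?, point-x, point-y and point-y-set!) follow the naming convention Gambit is known to use, but they are stated as an assumption because the list of generated procedures is not reproduced in this section.

```scheme
;; A record type with two fields, x and y.
(define-structure point x y)

(define p (make-point 3 4))   ; constructor: one argument per field
(point-y-set! p 10)           ; mutator for field y

;; predicate, accessor for x, accessor for y
(write (list (point? p) (point-x p) (point-y p)))   ; writes (#t 3 10)
(newline)
(write (point? 'not-a-point))                       ; writes #f
(newline)
```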
special form: trace var...
trace starts tracing calls to the specified procedures. untrace stops the tracing. The form (trace) returns the names of the currently traced procedures. The void object is returned by trace if it is passed one or more arguments. The form (untrace) stops the tracing on all those procedures and returns the void object.
procedure: file-exists? file
file must be a string. file-exists? returns #t if a file by that name exists and can be opened for reading, and returns #f otherwise.
procedure: flush-output [port]
flush-output causes all data buffered on the output port port to be written out. If port is not specified, the current output port is used.
procedure: pretty-print obj [port [width]]
procedure: pp obj [port [width]]
pretty-print and pp are similar to write except that the result is nicely formatted. The argument width specifies the width of the page. If obj is a procedure created by the interpreter or a procedure created by code compiled with the `-debug' option, pp will display its source code.
procedure: open-input-string string
These procedures implement string ports. String ports can be used like normal ports. open-input-string returns an input string port which obtains characters from the given string instead of a file. When the port is closed with a call to close-input-port, a string containing the characters that were not read is returned. open-output-string returns an output string port which accumulates the characters written to it. When the port is closed with a call to close-output-port, a string containing the characters accumulated is returned.
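As a small sketch of an input string port, the following reads data from a string exactly as it would from a file. The variable names are illustrative. Only open-input-string and the standard read and eof-object? procedures are used.

```scheme
;; Read two data from a string port, then reach end of file.
(define port (open-input-string "(1 2) foo"))

(define first-datum  (read port))   ; the list (1 2)
(define second-datum (read port))   ; the symbol foo
(define third-datum  (read port))   ; an end-of-file object

(write (list first-datum second-datum (eof-object? third-datum)))
(newline)                           ; writes ((1 2) foo #t)
```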
procedure: with-input-from-string string thunk
procedure: with-output-to-string thunk
The procedure with-input-from-string is similar to with-input-from-file except that the characters are obtained from a string. The procedure with-output-to-string calls the thunk and returns a string containing all characters output to the current output port.
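A short sketch of both procedures, using the signatures given above (a string and a thunk, and a thunk alone). The variable names are illustrative.

```scheme
;; Parse the first datum of a string: read is called with the current
;; input port bound to a port on the string.
(define n (with-input-from-string "42 rest" read))

;; Capture everything the thunk sends to the current output port.
(define s
  (with-output-to-string
    (lambda ()
      (write (* n 2))
      (display " apples"))))

(write (list n s))   ; writes (42 "84 apples")
(newline)
```

This pairing is a convenient way to convert between data and their printed representation without creating temporary files.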
procedure: with-input-from-port port thunk
procedure: with-output-to-port port thunk
These procedures are respectively similar to with-input-from-file and with-output-to-file. The difference is that the first argument is a port instead of a file name.
procedure: set-gc-report report?
set-gc-report controls the generation of reports during garbage collections. If the argument is true, a brief report of memory usage is generated after every garbage collection. It contains: the proportion of the heap that contains live data, the size of the heap in kilobytes, and the number of bytes allocated to movable and non-movable objects.
procedure: make-will owner action
These procedures implement the will data type. Wills provide support for object finalization. A will is an object that contains a reference to an owner object (the owner of the will), and an action procedure which is a single argument procedure. When the runtime system detects that an object is only referenced as the owner of a will (this is normally detected by the garbage-collector), the current computation is interrupted, the will's owner is set to #f and the will's action procedure is called with the owner as the sole argument.
procedure: gensym [prefix]
gensym returns a new uninterned symbol. Uninterned symbols are guaranteed to be distinct from the symbols generated by the procedures read and string->symbol. The symbol prefix is the prefix used to generate the new symbol's name. If it is not specified, the prefix defaults to `g'.
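The distinctness guarantee can be checked directly, as in the sketch below. The variable names are illustrative. The printed names of the generated symbols are not shown because they depend on the system.

```scheme
(define a (gensym))        ; default prefix
(define b (gensym))
(define c (gensym 'tmp))   ; explicit prefix, given as a symbol

;; Each result is a symbol, two results are never the same object, and a
;; generated symbol is distinct from the interned symbol tmp read by read.
(write (list (symbol? a) (eq? a b) (eq? c 'tmp)))   ; writes (#t #f #f)
(newline)
```

A typical use is to generate the name of a temporary variable inside a define-macro expansion so that it cannot clash with user variables.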
procedure: void
void returns the void object. The read-eval-print loop prints nothing when the result is the void object.
procedure: eval expr [env]
eval's first argument is a datum representing an expression. eval evaluates this expression in the global interaction environment and returns the result. If present, the second argument is ignored (it is provided for compatibility with R5RS).
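Because the argument is an ordinary datum, an expression can be assembled with list operations before it is evaluated. The helper make-sum below is hypothetical and serves only to build such a datum.

```scheme
;; Hypothetical helper: build the datum (+ term ...).
(define (make-sum . terms)
  (cons '+ terms))

(define expr (make-sum 1 2 (list '* 3 4)))

(write expr)          ; writes (+ 1 2 (* 3 4))
(newline)
(write (eval expr))   ; writes 15
(newline)
```

Any variable named in the datum is looked up in the global interaction environment, so local variables of the caller are not visible to the evaluated expression.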
procedure: compile-file-to-c file [options [output]]
file must be a string naming an existing file containing Scheme source code. The extension can be omitted from file if the Scheme file has a `.scm' extension. This procedure compiles the source file into a file containing C code. By default, this file is named after file with the extension replaced with `.c'. However, if output is supplied the file is named `output'.
Compilation options are given as a list of symbols after the file name. Any combination of the following options can be used: `verbose', `report', `expansion', `gvm', and `debug'.
Note that this procedure is only available in gsc.
procedure: compile-file file [options]
The arguments of compile-file are the same as the first two arguments of compile-file-to-c. The compile-file procedure compiles the source file into an object file by first generating a C file and then compiling it with the C compiler. The object file is named after file with the extension replaced with `.on', where n is a positive integer that acts as a version number. The next available version number is generated automatically by compile-file. Object files can be loaded dynamically by using the load procedure. The `.on' extension can be specified (to select a particular version) or omitted (to load the highest numbered version). Versions which are no longer needed must be deleted manually and the remaining version(s) must be renamed to start with extension `.o1'.
Note that this procedure is only available in gsc and that it is only useful on operating systems that support dynamic loading.
procedure: link-incremental module-list [output [base]]
The first argument must be a non empty list of strings naming Scheme modules to link (extensions must be omitted). The remaining optional arguments must be strings. An incremental link file is generated for the modules specified in the first argument. By default the link file generated is named `last_.c', where last is the name of the last module. However, if output is supplied the link file is named `output'. The base link file is specified by the base parameter. By default the base link file is the Gambit runtime library link file `~~/_gambc.c'. However, if base is supplied the base link file is named `base.c'.
Note that this procedure is only available in `gsc'.
The following example shows how to build an executable program `hello' which contains the two Scheme modules `m1.scm' and `m2.scm'.
    % uname -a
    Linux bailey 1.2.13 #2 Wed Aug 28 16:29:41 GMT 1996 i586
    % cat m1.scm
    (display "hello") (newline)
    % cat m2.scm
    (display "world") (newline)
    % gsc
    Gambit Version 2.4
    > (compile-file-to-c "m1")
    #t
    > (compile-file-to-c "m2")
    #t
    > (link-incremental '("m1" "m2") "hello.c")
    > ,q
    % gcc m1.c m2.c hello.c -lgambc -o hello
    % hello
    hello
    world
procedure: link-flat module-list [output]
The first argument must be a non empty list of strings. The first string must be the name of a Scheme module or the name of a link file and the remaining strings must name Scheme modules (in all cases extensions must be omitted). The second argument must be a string, if it is supplied. A flat link file is generated for the modules specified in the first argument. By default the link file generated is named `last_.c', where last is the name of the last module. However, if output is supplied the link file is named `output'.
Note that this procedure is only available in `gsc'.
The following example shows how to build a dynamically loadable Scheme library `lib.o1' which contains the two Scheme modules `m1.scm' and `m2.scm'.
    % uname -a
    Linux bailey 1.2.13 #2 Wed Aug 28 16:29:41 GMT 1996 i586
    % cat m1.scm
    (define (f x) (g (* x x)))
    % cat m2.scm
    (define (g y) (+ n y))
    % gsc
    Gambit Version 2.4
    > (compile-file-to-c "m1")
    #t
    > (compile-file-to-c "m2")
    #t
    > (link-flat '("m1" "m2") "lib.c")
    *** WARNING -- "*" is not defined,
    ***            referenced in: ("m1.c")
    *** WARNING -- "+" is not defined,
    ***            referenced in: ("m2.c")
    *** WARNING -- "n" is not defined,
    ***            referenced in: ("m2.c")
    > ,q
    % gcc -shared -fPIC -D___LIBRARY -D___SHARED -D___BIND_LATE m1.c m2.c lib.c -o lib.o1
    % gsc
    Gambit Version 2.4
    > (load "lib")
    *** WARNING -- Variable "n" used in module ";m2" is undefined
    "/users/feeley/lib.o1"
    > (define n 10)
    > (f 5)
    35
    > ,q
The warnings indicate that there are no definitions (`define's or
`set!'s) of the variables `*', `+' and `n' in the modules contained
in the library. Before the library is used, these variables will have
to be bound; either implicitly (by the runtime library) or explicitly.
procedure: error string obj...
`error' signals an error and causes a nested REPL to be started.
The error message displayed is string followed by the remaining
arguments. The continuation of the REPL is the same as the one passed
to `error'. Thus, returning from the REPL with the `,r' command
causes a return from the call to `error'.
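A small sketch of how `error' might be used to guard a computation follows. The procedure `safe-div' is a hypothetical helper, not part of Gambit.

```scheme
; safe-div is a hypothetical helper used only for illustration.
(define (safe-div a b)
  (if (= b 0)
      (error "division by zero, numerator was:" a)
      (/ a b)))

(write (safe-div 10 4))   ; writes 5/2
(newline)

; (safe-div 10 0) would signal the error and start a nested REPL.
; Typing ,r there returns from the call to error, as described above.
```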
procedure: exit [status]
`exit' causes the program to terminate with the status given by
status. If it is not specified, the status defaults to 0.
procedure: argv
`argv' returns a list of strings corresponding to the command
line arguments, including the program file name as the first element
of the list. When the interpreter executes a Scheme script, the list
returned by `argv' contains the script's file name followed by
the remaining command line arguments.
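The following sketch shows one way a script could combine `argv' and `exit'. The variable name `args' and the messages are made up for the example. The first element of the list is skipped because it is the program or script name.

```scheme
; args holds the program (or script) name followed by its arguments.
(define args (argv))

(if (null? (cdr args))
    (begin
      (display "no arguments given")
      (newline)
      (exit 1)))            ; a non-zero status reports failure

; Print each remaining argument on its own line.
(for-each
  (lambda (arg)
    (display arg)
    (newline))
  (cdr args))

(exit)                      ; the status defaults to 0
```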
procedure: runtime
`runtime' returns the amount of process time (user time plus
system time) in seconds since the program was started.
special form: time expr
`time' evaluates expr and returns the result. As a side
effect it displays a message which indicates how long the evaluation
took.
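The two timing facilities can be compared with a small sketch. The procedure `fib' is a deliberately slow workload chosen for illustration, and the names `start' and `result' are likewise assumptions of the example. The format of the message printed by `time' is not shown because it is not specified here.

```scheme
; fib is a deliberately slow hypothetical workload.
(define (fib n)
  (if (< n 2)
      n
      (+ (fib (- n 1)) (fib (- n 2)))))

; Manual timing with runtime: difference of two readings, in seconds.
(define start (runtime))
(define result (fib 20))
(display "fib(20) = ")
(write result)                      ; writes 6765
(display ", elapsed seconds: ")
(write (- (runtime) start))
(newline)

; The time special form does the measurement and reporting itself,
; and still returns the value of the expression.
(write (time (fib 20)))
(newline)
```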
This section contains additional special forms and procedures which are documented only in the interest of experimentation. They may be modified or removed in future releases of Gambit. The procedures in this section do not check the type of their arguments so they may cause the program to crash if called improperly.
procedure: ##add-gc-interrupt-job thunk
procedure: ##clear-gc-interrupt-jobs
Using the procedure `##add-gc-interrupt-job' it is possible to
add a thunk that is called at the end of every garbage collection.
The procedure `##clear-gc-interrupt-jobs' removes all the thunks
added with `##add-gc-interrupt-job'.
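As a sketch, a thunk can be used to count garbage collections. The names `gc-count' and `churn' are hypothetical, and whether any collection happens during the loop depends on the heap size, so no particular count should be expected.

```scheme
; gc-count is a hypothetical counter introduced for this sketch.
(define gc-count 0)

(##add-gc-interrupt-job
  (lambda ()
    (set! gc-count (+ gc-count 1))))

; Allocate enough to make a collection likely; whether one actually
; happens depends on the heap size.
(define (churn n)
  (if (> n 0)
      (begin
        (make-vector 1000 0)
        (churn (- n 1)))))

(churn 10000)

(display "collections seen: ")
(write gc-count)
(newline)

; Remove the job once it is no longer needed.
(##clear-gc-interrupt-jobs)
```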
procedure: ##add-timer-interrupt-job thunk
procedure: ##clear-timer-interrupt-jobs
The runtime system sets up a free running timer that raises an
interrupt at approximately 10 Hz. Using the procedure
`##add-timer-interrupt-job' it is possible to add a thunk that is
called every time a timer interrupt is received. The procedure
`##clear-timer-interrupt-jobs' removes all the thunks added with
`##add-timer-interrupt-job'. It is relatively easy to implement
threads by using these procedures in conjunction with
`call-with-current-continuation'.
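The simplest use of the timer is a tick counter, sketched below. The names `ticks' and `spin' are hypothetical. With a timer running at roughly 10 Hz, a loop that takes about one second would be expected to see about ten interrupts, but the actual number depends on the machine.

```scheme
; ticks is a hypothetical counter incremented on each timer interrupt.
(define ticks 0)

(##add-timer-interrupt-job
  (lambda ()
    (set! ticks (+ ticks 1))))

; A busy loop that gives the timer a chance to fire.
(define (spin n)
  (if (> n 0)
      (spin (- n 1))))

(spin 1000000)

(display "timer interrupts seen: ")
(write ticks)
(newline)

; Remove the job once it is no longer needed.
(##clear-timer-interrupt-jobs)
```

A thread scheduler would replace the counting thunk with one that captures the current continuation and resumes another one.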
procedure: ##shell-command command
The procedure `##shell-command' calls up the shell to execute
command which must be a string. `##shell-command' returns
the exit status of the shell in the form that the C `system'
function returns.
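A hedged sketch of a call follows. It assumes a Unix-like shell in which `echo' is available and in which a status of 0 means success, which is the usual convention but is not guaranteed by this manual. The name `status' is introduced for the example.

```scheme
; The command must be a string; no type checking is done.
(define status (##shell-command "echo hello from the shell"))

(if (= status 0)
    (display "command succeeded")
    (begin
      (display "command failed, raw status: ")
      (write status)))
(newline)
```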
procedure: ##path-expand path
procedure: ##path-absolute? path
procedure: ##path-extension path
procedure: ##path-strip-extension path
procedure: ##path-directory path
procedure: ##path-strip-directory path
These procedures manipulate file paths. `##path-expand' takes a
relative or absolute path of a file or directory (possibly starting with
`~/', `~~/' or `~user/') and returns the absolute
path of the file or directory. The expanded path of a directory will
always end with `/'. If the path is the empty string, the current
working directory is returned. `#f' is returned if the path is
invalid.
The procedure `##path-absolute?' tests if the given path is
absolute.
The remaining procedures extract various parts of a path.
`##path-extension' returns the file extension (including the
period) or the empty string if there is no extension.
`##path-strip-extension' returns the path with the extension
stripped off. `##path-directory' returns the file's directory
(including the last path separator) or the empty string if no directory
is specified in the path. `##path-strip-directory' returns the
path with the directory stripped off.
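The sketch below applies each procedure to one sample path. The path `src/hello.scm' and the helper `show' are hypothetical. The comments restate what the descriptions above say each call returns; exact results are not shown because they depend on the system and the current directory.

```scheme
; p is a hypothetical relative path used only for illustration.
(define p "src/hello.scm")

; show is a hypothetical helper that prints a label and a value.
(define (show label value)
  (display label)
  (write value)
  (newline))

(show "absolute?       " (##path-absolute? p))
(show "extension       " (##path-extension p))        ; extension, period included
(show "strip-extension " (##path-strip-extension p))  ; path without the extension
(show "directory       " (##path-directory p))        ; directory part, separator included
(show "strip-directory " (##path-strip-directory p))  ; file name without the directory
(show "expanded        " (##path-expand p))           ; absolute path, or #f if invalid
```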
special form: dynamic-define var val
special form: dynamic-ref var
special form: dynamic-set! var val
special form: dynamic-let ((var val)...) body...
These special forms provide support for "dynamic variables" which
have dynamic scope. Dynamic variables and normal (lexically scoped)
variables are in different namespaces so there is no possible naming
conflict between them. In all these special forms var is an
identifier which names the dynamic variable. `dynamic-define'
defines the global dynamic variable var (if it doesn't already
exist) and assigns to it the value of val. `dynamic-let' has a
syntax similar to `let'. It creates bindings of the given
dynamic variables which are accessible for the duration of the
evaluation of body. `dynamic-ref' returns the value
currently bound to the dynamic variable var. `dynamic-set!'
assigns the value of val to the dynamic variable var. The dynamic
environment that was in effect when a continuation was created by
`call-with-current-continuation' is restored when that continuation
is invoked.
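The difference from lexical scoping shows up when a procedure defined outside a `dynamic-let' is called from inside it. In the sketch below the dynamic variable `depth' and the procedure `show-depth' are hypothetical names, and the comments give the values one would expect from the description above.

```scheme
; depth is a hypothetical dynamic variable.
(dynamic-define depth 0)

; show-depth reads whatever binding of depth is current when called.
(define (show-depth)
  (write (dynamic-ref depth))
  (newline))

(show-depth)                 ; writes 0
(dynamic-let ((depth 1))
  (show-depth))              ; writes 1: the binding is visible in the callee
(show-depth)                 ; writes 0 again

(dynamic-set! depth 5)
(show-depth)                 ; writes 5
```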
Gambit supports the Unicode character encoding standard
(ISO/IEC-10646-1). Scheme characters can be any of the characters in
the 16 bit subset of Unicode known as UCS-2. Scheme strings can
contain any character in UCS-2. Source code can also contain any
character in UCS-2. However, to read such source code properly
`gsi' and `gsc' must be told which character encoding to use
for reading the source code (i.e. UTF-8, UCS-2, or UCS-4). This can
be done by passing a character encoding parameter to `load' or by
specifying one of the runtime options `-:8', `-:2', or
`-:4' when `gsi' and `gsc' are started.
The Gambit Scheme system offers a simple mechanism for interfacing Scheme code and C code called the "C-interface". A Scheme program indicates which C functions it needs to have access to and which Scheme procedures can be called from C, and the C-interface automatically constructs the corresponding Scheme procedures and C functions. The conversions needed to transform data from the Scheme representation to the C representation (and back) are generated automatically in accordance with the argument and result types of the C function or Scheme procedure.
The interface places some restrictions on the types of data that can be exchanged between C and Scheme. The mapping of datatypes between C and Scheme is discussed in the next section. The remaining sections of this chapter describe each special form of the C-interface.
Scheme and C do not provide the same set of built-in datatypes so it is important to understand which Scheme type is compatible with which C type and how values get mapped from one environment to the other. For the sake of explaining the mapping, we assume that Scheme and C have been augmented with some new datatypes. To Scheme is added the datatype `C-pointer' to support the C concept of pointer. The following datatypes are added to C:
scheme-object   ___WORD (defined in `gambit.h')
boolean
latin1          ___LATIN1 (defined in `gambit.h')
ucs2            ___UCS2 (defined in `gambit.h')
ucs4            ___UCS4 (defined in `gambit.h')
char-string
latin1-string   ___LATIN1*
ucs2-string     ___UCS2*
ucs4-string     ___UCS4*
utf8-string     char, i.e. char*
To specify a particular C type inside the `c-lambda' and
`c-define' forms, the following "Scheme notation" is used:
Scheme notation                     C type
void                                void
boolean                             boolean
char                                char (may be signed or unsigned
                                    depending on the C compiler)
signed-char                         signed char
unsigned-char                       unsigned char
latin1                              latin1
ucs2                                ucs2
ucs4                                ucs4
short                               short
unsigned-short                      unsigned short
int                                 int
unsigned-int                        unsigned int
long                                long
unsigned-long                       unsigned long
float                               float
double                              double
(pointer S)                         T* (where T is the C equivalent of
                                    the type S; S must be the Scheme
                                    notation of a type or a string
                                    naming a C type, e.g. (pointer int)
                                    or (pointer "FILE"))
(function (type1...) result-type)
char-string                         char-string
latin1-string                       latin1-string
ucs2-string                         ucs2-string
ucs4-string                         ucs4-string
utf8-string                         utf8-string
scheme-object                       scheme-object
The following table gives the C types to which each Scheme type can be converted:
Scheme type               C types
#f                        scheme-object; boolean; any string, pointer
                          or function type
#t                        scheme-object; boolean
character                 scheme-object; boolean; [[un]signed] char;
                          latin1; ucs2; ucs4
exact integer             scheme-object; boolean; [unsigned]
                          short/int/long
inexact real              scheme-object; boolean; float; double
string                    scheme-object; boolean; any string type
`C-pointer'               scheme-object; boolean; any pointer type
procedure defined with    scheme-object; boolean; any function type
`c-define'
other objects             scheme-object; boolean
The following table gives the Scheme types to which each C type will be converted:
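C type              Scheme type
scheme-object       the Scheme object itself
boolean             boolean
character types     character
integer types       exact integer
float/double        inexact real
string types        string, or #f if it is equal to `NULL'
pointer types       `C-pointer', or #f if it is equal to `NULL'
function types      #f if it is equal to `NULL' (no other conversion
                    is implemented)
void                void object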
All Scheme types are compatible with the C types `scheme-object'
and `boolean'. Conversion to and from the C type
`scheme-object' is the identity function on the object encoding.
This provides a low-level mechanism for accessing Scheme's object
representation from C (with the help of the macros in the
`gambit.h' header file). When a C `boolean' type is expected,
an extended Scheme boolean can be passed (`#f' is converted to 0
and all other values are converted to 1).
The Scheme boolean `#f' can be passed to the C environment where
any C string type, C pointer type, or C function type is expected. In
this case, `#f' is converted to the `NULL' pointer. C
`boolean's are extended booleans so any value different from 0
represents true. Thus, a C `boolean' passed to the Scheme
environment is mapped as follows: 0 to `#f' and all other values to
`#t'.
A Scheme character passed to the C environment where any C character type is expected is converted to the corresponding character in the C environment. An error is signaled if the Scheme character does not fit in the C character. Any C character type passed to Scheme is converted to the corresponding Scheme character. An error is signaled if the C character does not fit in the Scheme character.
A Scheme exact integer passed to the C environment where the C types
`short', `int', and `long' are expected is converted to
the corresponding integral value. An error is signaled if the value
falls outside of the range representable by that integral type. C
`short', `int' and `long' values passed to the Scheme
environment are mapped to the same Scheme exact integer. If the value
is outside the fixnum range, a bignum is created.
A Scheme inexact real passed to the C environment is converted to the
corresponding `float' or `double' value. C `float' and
`double' values passed to the Scheme environment are mapped to the
closest Scheme inexact real.
Scheme's rational numbers and complex numbers are not compatible with any C numeric type.
A Scheme string passed to the C environment where any C string type is expected is converted to a null terminated string using the appropriate encoding. The C string is a fresh copy of the Scheme string. Any C string type passed to the Scheme environment causes the creation of a fresh Scheme string containing a copy of the C string.
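The numeric and string conversions can be observed with a few small `c-lambda' wrappers. This is a sketch under stated assumptions: the names `c-add', `c-half' and `c-strlen' are hypothetical, and the file is meant to be compiled with `gsc' since the C code has to go through the C compiler. `strlen' is the standard C library function declared in `string.h'.

```scheme
; Meant to be compiled with gsc.  All three names are hypothetical.
(c-declare "#include <string.h>")

; Exact integers are converted to C int and back.
(define c-add
  (c-lambda (int int) int "___result = ___arg1 + ___arg2;"))

; Inexact reals are converted to C double and back.
(define c-half
  (c-lambda (double) double "___result = ___arg1 / 2.0;"))

; The Scheme string is copied into a null terminated C string.
(define c-strlen
  (c-lambda (char-string) int "___result = strlen (___arg1);"))

(write (c-add 2 3))          ; writes 5
(newline)
(write (c-half 3.0))         ; writes 1.5
(newline)
(write (c-strlen "hello"))   ; writes 5
(newline)

; (c-add 1/2 3) would signal an error: rationals are not compatible
; with any C numeric type.
```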
A C pointer passed to the Scheme environment causes the creation and
initialization of a new `C-pointer' object. This object is
simply a cell containing the pointer to a memory location in the C
environment. The pointer is ignored by the garbage collector. As a
special case, the `NULL' C pointer is converted to `#f'. A
Scheme `C-pointer' and `#f' can be passed to the C
environment where a C pointer is expected. The conversion simply
recreates the original C pointer or `NULL' pointer.
Only Scheme procedures defined with the `c-define' special form and
`#f' can be passed where a C function is expected. Conversion from
C functions to Scheme procedures is not currently implemented.
The c-declare special form
Synopsis:
(c-declare c-declaration)
Initially, the C file produced by `gsc' contains only an
`#include' of `gambit.h'. This header file provides a
number of macro and procedure declarations to access the Scheme object
representation. The special form `c-declare' adds
c-declaration (which must be a string containing the C
declarations) to the C file. This string is copied to the C file on a
new line so it can start with preprocessor directives. All types of C
declarations are allowed (including type declarations, variable
declarations, function declarations, `#include' directives,
`#define's, and so on). These declarations are visible to
subsequent `c-declare's, `c-initialize's, `c-lambda's, and
`c-define's in the same module. The most
common use of this special form is to declare the external functions
that are referenced in `c-lambda' special forms. Such functions
must either be declared explicitly or by including a header file which
contains the appropriate C declarations.
The `c-declare' special form does not return a value.
It can only appear at top level.
For example:
    (c-declare "
    #include <stdio.h>

    extern char *getlogin ();

    #ifdef sparc
    char *host = \"sparc\"; /* note backslashes */
    #else
    char *host = \"unknown\";
    #endif

    FILE *tfile;
    ")
The c-initialize special form
Synopsis:
(c-initialize c-code)
Just after the program is loaded and before control is passed to the
Scheme code, each C file is initialized by calling its associated
initialization function. The body of this function is normally empty
but it can be extended by using the `c-initialize' form. Each
occurrence of the `c-initialize' form adds code to the body of the
initialization function in the order of appearance in the source file.
c-code must be a string containing the C code to execute. This
string is copied to the C file on a new line so it can start with
preprocessor directives.
The `c-initialize' special form does not return a value.
It can only appear at top level.
For example:
(c-initialize "tfile = tmpfile ();")
The c-lambda special form
Synopsis:
(c-lambda (type1...) result-type c-name-or-code)
The `c-lambda' special form makes it possible to create a Scheme
procedure that will act as a representative of some C function or C
code sequence. The first subform is a list containing the type of
each argument. The type of the function's result is given next.
Finally, the last subform is a string that either contains the name of
the C function to call or some sequence of C code to execute.
Variadic C functions are not supported. The resulting Scheme procedure
takes exactly the number of arguments specified and delivers them in
the same order to the C function. When the Scheme procedure is
called, the arguments will be converted to their C representation and
then the C function will be called. The result returned by the C
function will be converted to its Scheme representation and this value
will be returned from the Scheme procedure call. An error will be
signaled if some conversion is not possible (see below for supported
conversions).
When c-name-or-code is not a valid C identifier, it is treated as
an arbitrary piece of C code. Within the C code the variables
`___arg1', `___arg2', etc. can be referenced to access the
converted arguments. Similarly, the result to be returned from the call
should be assigned to the variable `___result'. If no result needs
to be returned, the result-type should be `void' and no
assignment to the variable `___result' should take place. Note
that the C code should not contain `return' statements as this is
meaningless. Control must always fall off the end of the C code. The C
code is copied to the C file on a new line so it can start with
preprocessor directives. Moreover the C code is always placed at the
head of a compound statement whose lifetime encloses the C to Scheme
conversion of the result. Consequently, temporary storage (strings in
particular) declared at the head of the C code can be returned by
assigning them to `___result'.
When passed to the Scheme environment, the C `void' type is
converted to the void object.
For example:
    (define fopen
      (c-lambda (char-string char-string) (pointer "FILE") "fopen"))

    (define fgetc
      (c-lambda ((pointer "FILE")) int "fgetc"))

    (let ((f (fopen "datafile" "r")))
      (if f (write (fgetc f))))

    (define char-code
      (c-lambda (char) int "___result = ___arg1;"))

    (define host
      ((c-lambda () char-string "___result = host;")))

    (define stdin
      ((c-lambda () (pointer "FILE") "___result = stdin;")))

    ((c-lambda () void
       "printf( \"hello\\n\" ); printf( \"world\\n\" );"))

    (define pack-2-chars
      (c-lambda (char char) char-string
       "
       char s[3];
       s[0] = ___arg1;
       s[1] = ___arg2;
       s[2] = 0;
       ___result = s;
       "))
The c-define special form
Synopsis:
(c-define (name param1...) (type1...) result-type c-name scope body...)
The `c-define' special form makes it possible to create a C
function that will act as a representative of some Scheme procedure. A
C function named c-name as well as a Scheme procedure bound to the
variable name are defined. The parameters of the Scheme procedure
are param1, etc (the same syntax as a normal `define' is
permitted) and its body is at the end of the form. The type of each
argument of the C function, its result type and c-name (which must
be a string) are specified after the parameter specification of the
Scheme procedure. When the C function c-name is called from C,
its arguments are converted to their Scheme representation and passed to
the Scheme procedure. The result of the Scheme procedure is then
converted to its C representation and the C function c-name
returns it to its caller.
The scope of the C function can be changed with the scope
parameter, which must be a string. This string is placed immediately
before the declaration of the C function. So if the scope
is the string "static", the scope of c-name is local to
the module it is in, whereas if the scope is the empty
string, c-name is visible from other modules.
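The effect of the scope string can be sketched with two definitions that differ only in that argument. The Scheme names `square' and `cube' and the C names `square_local' and `cube_global' are hypothetical, and the file is assumed to be compiled with `gsc'.

```scheme
; Scope "static": the C function square_local can only be called
; from C code in the same module.
(c-define (square x) (int) int "square_local" "static"
  (* x x))

; Scope "": the C function cube_global is visible from other modules.
(c-define (cube x) (int) int "cube_global" ""
  (* x x x))

; Both remain ordinary Scheme procedures as well.
(write (square 4))   ; writes 16
(newline)
(write (cube 3))     ; writes 27
(newline)
```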
Nested C to Scheme calls (that is calls from C to Scheme during the execution of a call from C to Scheme) are not allowed.
The `c-define' special form does not return a value.
It can only appear at top level.
For example:
    (c-define (proc x y . z) (char int int float) int "f" ""
      (display "proc was called with: ")
      (write (cons x (cons y z)))
      (newline)
      (+ y (car z)))

    (proc #\x 1 2 1.5) => 3 and prints "proc was called with: (#\x 1 2 1.5)"

    ; if f is called from C with the call  f ('x', 1, 2, 1.5)  the value 3
    ; is returned and "proc was called with: (#\x 1 2 1.5)" is printed
The `c-define' special form is particularly useful when the
driving part of an application is written in C and Scheme procedures
are called directly from C. The Scheme part of the application is in
a sense a "server" that is providing services to the C part. The
Scheme procedures that are to be called from C need to be defined
using the `c-define' special form. Before it can be used, the
Scheme part must be initialized with a call to the function
`___setup_and_run'. Before the program terminates, it must call
the function `___cleanup' so that the Scheme part may do final
cleanup. A sample application is given in the file
`check/server.scm'.
(`define') of the same procedure. Replace all but the first `define'
with assignments (`set!').
`write' may be read back by `read' as a slightly different number.
(`define-structure') can be written with `write' but can not be read
by `read'.
The `floor' and `ceiling' procedures gave incorrect results for
negative arguments.
The `round' procedure did not obey the round to even rule. A value
exactly in between two consecutive integers is now correctly rounded
to the closest even integer.
The Gambit system (including the Gambit-C version) is Copyright (C) 1994-1996 by Marc Feeley, all rights reserved.
The Gambit system and programs developed with it may be distributed
only under the following conditions: they must not be sold or
transferred for compensation and they must include this copyright and
distribution notice. For a commercial license please contact
`gambit@iro.umontreal.ca'.