The unix library makes many Unix system calls and system-related library functions available to OCaml programs. This chapter describes briefly the functions provided. Refer to sections 2 and 3 of the Unix manual for more details on the behavior of these functions.
Not all functions are provided by all Unix variants. A function that is not available on a given platform raises Invalid_argument when called.
Programs that use the unix library must be linked as follows:
    ocamlc other options unix.cma other files
    ocamlopt other options unix.cmxa other files
For interactive use of the unix library, do:
    ocamlmktop -o mytop unix.cma
    ./mytop
or (if dynamic linking of C libraries is supported on your platform), start ocaml and type #load "unix.cma";;.
Windows: A fairly complete emulation of the Unix system calls is provided in the Windows version of OCaml. The end of this chapter gives more information on the functions that are not supported under Windows.
Windows: The Cygwin port of OCaml fully implements all functions from the Unix module. The native Win32 ports implement a subset of them. Below is a list of the functions that are not implemented, or only partially implemented, by the Win32 ports. Functions not mentioned are fully implemented and behave as described previously in this chapter.
Functions | Comment |
fork | not implemented, use create_process or threads |
wait | not implemented, use waitpid |
waitpid | can only wait for a given PID, not any child process |
getppid | not implemented (meaningless under Windows) |
nice | not implemented |
truncate, ftruncate | not implemented |
link | implemented (since 3.02) |
symlink, readlink | implemented (since 4.03.0) |
access | execute permission X_OK cannot be tested, it just tests for read permission instead |
fchmod | not implemented |
chown, fchown | not implemented (make no sense on a DOS file system) |
umask | not implemented |
mkfifo | not implemented |
kill | partially implemented (since 4.00.0): only the sigkill signal is implemented |
pause | not implemented (no inter-process signals in Windows) |
alarm | not implemented |
times | partially implemented, will not report timings for child processes |
getitimer, setitimer | not implemented |
getuid, geteuid, getgid, getegid | always return 1 |
getgroups | always returns [|1|] (since 2.00) |
setuid, setgid, setgroups | not implemented |
getpwnam, getpwuid | always raise Not_found |
getgrnam, getgrgid | always raise Not_found |
type socket_domain | PF_INET is fully supported; PF_INET6 is fully supported (since 4.01.0); PF_UNIX is not supported |
establish_server | not implemented; use threads |
terminal functions (tc*) | not implemented |
This document is intended as a reference manual for the OCaml language. It lists the language constructs, and gives their precise syntax and informal semantics. It is by no means a tutorial introduction to the language: there is not a single example. A good working knowledge of OCaml is assumed.
No attempt has been made at mathematical rigor: words are employed with their intuitive meaning, without further definition. As a consequence, the typing rules have been left out, by lack of the mathematical framework required to express them, while they are definitely part of a full formal definition of the language.
The syntax of the language is given in BNF-like notation. Terminal symbols are set in typewriter font (like this). Non-terminal symbols are set in italic font (like that). Square brackets […] denote optional components. Curly brackets {…} denote zero, one or several repetitions of the enclosed components. Curly brackets with a trailing plus sign {…}+ denote one or several repetitions of the enclosed components. Parentheses (…) denote grouping.
The str library provides high-level string processing functions, some based on regular expressions. It is intended to support the kind of file processing that is usually performed with scripting languages such as awk, perl or sed.
Programs that use the str library must be linked as follows:
    ocamlc other options str.cma other files
    ocamlopt other options str.cmxa other files
For interactive use of the str library, do:
    ocamlmktop -o mytop str.cma
    ./mytop
or (if dynamic linking of C libraries is supported on your platform), start ocaml and type #load "str.cma";;.
This chapter introduces the module system of OCaml.
A primary motivation for modules is to package together related definitions (such as the definitions of a data type and associated operations over that type) and enforce a consistent naming scheme for these definitions. This avoids running out of names or accidentally confusing names. Such a package is called a structure and is introduced by the struct…end construct, which contains an arbitrary sequence of definitions. The structure is usually given a name with the module binding. Here is for instance a structure packaging together a type of priority queues and their operations:
module PrioQueue =
  struct
    type priority = int
    type 'a queue = Empty | Node of priority * 'a * 'a queue * 'a queue
    let empty = Empty
    let rec insert queue prio elt =
      match queue with
        Empty -> Node(prio, elt, Empty, Empty)
      | Node(p, e, left, right) ->
          if prio <= p
          then Node(prio, elt, insert right p e, left)
          else Node(p, e, insert right prio elt, left)
    exception Queue_is_empty
    let rec remove_top = function
        Empty -> raise Queue_is_empty
      | Node(prio, elt, left, Empty) -> left
      | Node(prio, elt, Empty, right) -> right
      | Node(prio, elt, (Node(lprio, lelt, _, _) as left),
                        (Node(rprio, relt, _, _) as right)) ->
          if lprio <= rprio
          then Node(lprio, lelt, remove_top left, right)
          else Node(rprio, relt, left, remove_top right)
    let extract = function
        Empty -> raise Queue_is_empty
      | Node(prio, elt, _, _) as queue -> (prio, elt, remove_top queue)
  end;;
module PrioQueue :
  sig
    type priority = int
    type 'a queue = Empty | Node of priority * 'a * 'a queue * 'a queue
    val empty : 'a queue
    val insert : 'a queue -> priority -> 'a -> 'a queue
    exception Queue_is_empty
    val remove_top : 'a queue -> 'a queue
    val extract : 'a queue -> priority * 'a * 'a queue
  end
Outside the structure, its components can be referred to using the “dot notation”, that is, identifiers qualified by a structure name. For instance, PrioQueue.insert is the function insert defined inside the structure PrioQueue and PrioQueue.queue is the type queue defined in PrioQueue.
PrioQueue.insert PrioQueue.empty 1 "hello";;
- : string PrioQueue.queue =
PrioQueue.Node (1, "hello", PrioQueue.Empty, PrioQueue.Empty)
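As a further illustration of the dot notation, the sketch below inserts a few elements and then extracts them in priority order. The structure is repeated so that the example is self-contained; the names drain, q and drained are hypothetical helpers introduced here and are not part of the original example.

```ocaml
(* PrioQueue as defined in the text, repeated to keep the sketch self-contained. *)
module PrioQueue = struct
  type priority = int
  type 'a queue = Empty | Node of priority * 'a * 'a queue * 'a queue
  let empty = Empty
  let rec insert queue prio elt =
    match queue with
    | Empty -> Node (prio, elt, Empty, Empty)
    | Node (p, e, left, right) ->
        if prio <= p
        then Node (prio, elt, insert right p e, left)
        else Node (p, e, insert right prio elt, left)
  exception Queue_is_empty
  let rec remove_top = function
    | Empty -> raise Queue_is_empty
    | Node (_, _, left, Empty) -> left
    | Node (_, _, Empty, right) -> right
    | Node (_, _, (Node (lprio, lelt, _, _) as left),
                  (Node (rprio, relt, _, _) as right)) ->
        if lprio <= rprio
        then Node (lprio, lelt, remove_top left, right)
        else Node (rprio, relt, left, remove_top right)
  let extract = function
    | Empty -> raise Queue_is_empty
    | Node (prio, elt, _, _) as queue -> (prio, elt, remove_top queue)
end

(* Hypothetical helper: extract repeatedly until the queue is empty. *)
let rec drain q =
  match PrioQueue.extract q with
  | (prio, elt, rest) -> (prio, elt) :: drain rest
  | exception PrioQueue.Queue_is_empty -> []

(* Build a queue from an unsorted list of (priority, element) pairs. *)
let q =
  List.fold_left
    (fun acc (prio, elt) -> PrioQueue.insert acc prio elt)
    PrioQueue.empty
    [ (3, "low"); (1, "high"); (2, "medium") ]

(* Elements come out smallest priority first. *)
let drained = drain q
```

Every component is reached through its qualified name (PrioQueue.insert, PrioQueue.extract, PrioQueue.Queue_is_empty); no open directive is needed.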
Signatures are interfaces for structures. A signature specifies which components of a structure are accessible from the outside, and with which type. It can be used to hide some components of a structure (e.g. local function definitions) or export some components with a restricted type. For instance, the signature below specifies the three priority queue operations empty, insert and extract, but not the auxiliary function remove_top. Similarly, it makes the queue type abstract (by not providing its actual representation as a concrete type).
module type PRIOQUEUE =
  sig
    type priority = int         (* still concrete *)
    type 'a queue               (* now abstract *)
    val empty : 'a queue
    val insert : 'a queue -> int -> 'a -> 'a queue
    val extract : 'a queue -> int * 'a * 'a queue
    exception Queue_is_empty
  end;;
module type PRIOQUEUE =
  sig
    type priority = int
    type 'a queue
    val empty : 'a queue
    val insert : 'a queue -> int -> 'a -> 'a queue
    val extract : 'a queue -> int * 'a * 'a queue
    exception Queue_is_empty
  end
Restricting the PrioQueue structure by this signature results in another view of the PrioQueue structure where the remove_top function is not accessible and the actual representation of priority queues is hidden:
module AbstractPrioQueue = (PrioQueue : PRIOQUEUE);;
module AbstractPrioQueue : PRIOQUEUE

AbstractPrioQueue.remove_top ;;
Error: Unbound value AbstractPrioQueue.remove_top

AbstractPrioQueue.insert AbstractPrioQueue.empty 1 "hello";;
- : string AbstractPrioQueue.queue = <abstr>
The restriction can also be performed during the definition of the structure, as in
module PrioQueue = (struct ... end : PRIOQUEUE);;
An alternate syntax is provided for the above:
module PrioQueue : PRIOQUEUE = struct ... end;;
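The elided struct ... end bodies above stand for the PrioQueue definitions. As a smaller, complete instance of the same syntax, here is a minimal sketch using a hypothetical COUNTER signature and Counter structure (neither appears in the original text). The signature makes the type abstract and omits one component, which is therefore hidden.

```ocaml
module type COUNTER = sig
  type t                    (* abstract: clients cannot see that it is an int *)
  val zero : t
  val incr : t -> t
  val to_int : t -> int
end

(* The signature constraint is applied while the structure is defined. *)
module Counter : COUNTER = struct
  type t = int
  let step = 1              (* not listed in COUNTER, hence not exported *)
  let zero = 0
  let incr n = n + step
  let to_int n = n
end

let two = Counter.to_int (Counter.incr (Counter.incr Counter.zero))
```

Outside the structure, Counter.step is unbound and an expression such as Counter.zero + 1 is rejected by the type checker, because Counter.t is no longer known to be int.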
Functors are “functions” from structures to structures. They are used to express parameterized structures: a structure A parameterized by a structure B is simply a functor F with a formal parameter B (along with the expected signature for B) which returns the actual structure A itself. The functor F can then be applied to one or several implementations B1 …Bn of B, yielding the corresponding structures A1 …An.
For instance, here is a structure implementing sets as sorted lists, parameterized by a structure providing the type of the set elements and an ordering function over this type (used to keep the sets sorted):
type comparison = Less | Equal | Greater;;
type comparison = Less | Equal | Greater

module type ORDERED_TYPE =
  sig
    type t
    val compare: t -> t -> comparison
  end;;
module type ORDERED_TYPE = sig type t val compare : t -> t -> comparison end

module Set =
  functor (Elt: ORDERED_TYPE) ->
    struct
      type element = Elt.t
      type set = element list
      let empty = []
      let rec add x s =
        match s with
          [] -> [x]
        | hd::tl ->
           match Elt.compare x hd with
             Equal   -> s         (* x is already in s *)
           | Less    -> x :: s    (* x is smaller than all elements of s *)
           | Greater -> hd :: add x tl
      let rec member x s =
        match s with
          [] -> false
        | hd::tl ->
            match Elt.compare x hd with
              Equal   -> true     (* x belongs to s *)
            | Less    -> false    (* x is smaller than all elements of s *)
            | Greater -> member x tl
    end;;
module Set :
  functor (Elt : ORDERED_TYPE) ->
    sig
      type element = Elt.t
      type set = element list
      val empty : 'a list
      val add : Elt.t -> Elt.t list -> Elt.t list
      val member : Elt.t -> Elt.t list -> bool
    end
By applying the Set functor to a structure implementing an ordered type, we obtain set operations for this type:
module OrderedString =
  struct
    type t = string
    let compare x y = if x = y then Equal else if x < y then Less else Greater
  end;;
module OrderedString :
  sig type t = string val compare : 'a -> 'a -> comparison end

module StringSet = Set(OrderedString);;
module StringSet :
  sig
    type element = OrderedString.t
    type set = element list
    val empty : 'a list
    val add : OrderedString.t -> OrderedString.t list -> OrderedString.t list
    val member : OrderedString.t -> OrderedString.t list -> bool
  end

StringSet.member "bar" (StringSet.add "foo" StringSet.empty);;
- : bool = false
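The same functor can be applied to any other structure that matches ORDERED_TYPE. The sketch below repeats the definitions from the text so that it is self-contained, then applies Set to a hypothetical OrderedInt structure (an assumption made for illustration, not part of the original example).

```ocaml
type comparison = Less | Equal | Greater

module type ORDERED_TYPE = sig
  type t
  val compare : t -> t -> comparison
end

(* The Set functor from the text: sets as sorted lists without duplicates. *)
module Set (Elt : ORDERED_TYPE) = struct
  type element = Elt.t
  type set = element list
  let empty = []
  let rec add x s =
    match s with
    | [] -> [x]
    | hd :: tl ->
        (match Elt.compare x hd with
         | Equal -> s                 (* x is already in s *)
         | Less -> x :: s             (* x is smaller than all elements of s *)
         | Greater -> hd :: add x tl)
  let rec member x s =
    match s with
    | [] -> false
    | hd :: tl ->
        (match Elt.compare x hd with
         | Equal -> true
         | Less -> false
         | Greater -> member x tl)
end

(* Hypothetical argument structure: integers with their usual ordering. *)
module OrderedInt = struct
  type t = int
  let compare x y = if x = y then Equal else if x < y then Less else Greater
end

module IntSet = Set (OrderedInt)

(* Insert in arbitrary order, with a duplicate. *)
let s = List.fold_left (fun acc x -> IntSet.add x acc) IntSet.empty [3; 1; 2; 3]
let has_two = IntSet.member 2 s
let has_five = IntSet.member 5 s
```

Because the type set is still concrete at this point, s can be inspected as an ordinary int list, which is the kind of dependency on the representation that a signature constraint can rule out.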
As in the PrioQueue example, it would be good style to hide the actual implementation of the type set, so that users of the structure will not rely on sets being lists, and we can switch later to another, more efficient representation of sets without breaking their code. This can be achieved by restricting Set by a suitable functor signature:
module type SETFUNCTOR =
  functor (Elt: ORDERED_TYPE) ->
    sig
      type element = Elt.t      (* concrete *)
      type set                  (* abstract *)
      val empty : set
      val add : element -> set -> set
      val member : element -> set -> bool
    end;;
module type SETFUNCTOR =
  functor (Elt : ORDERED_TYPE) ->
    sig
      type element = Elt.t
      type set
      val empty : set
      val add : element -> set -> set
      val member : element -> set -> bool
    end

module AbstractSet = (Set : SETFUNCTOR);;
module AbstractSet : SETFUNCTOR

module AbstractStringSet = AbstractSet(OrderedString);;
module AbstractStringSet :
  sig
    type element = OrderedString.t
    type set = AbstractSet(OrderedString).set
    val empty : set
    val add : element -> set -> set
    val member : element -> set -> bool
  end

AbstractStringSet.add "gee" AbstractStringSet.empty;;
- : AbstractStringSet.set = <abstr>
In an attempt to write the type constraint above more elegantly, one may wish to name the signature of the structure returned by the functor, then use that signature in the constraint:
module type SET =
  sig
    type element
    type set
    val empty : set
    val add : element -> set -> set
    val member : element -> set -> bool
  end;;
module type SET =
  sig
    type element
    type set
    val empty : set
    val add : element -> set -> set
    val member : element -> set -> bool
  end

module WrongSet = (Set : functor(Elt: ORDERED_TYPE) -> SET);;
module WrongSet : functor (Elt : ORDERED_TYPE) -> SET

module WrongStringSet = WrongSet(OrderedString);;
module WrongStringSet :
  sig
    type element = WrongSet(OrderedString).element
    type set = WrongSet(OrderedString).set
    val empty : set
    val add : element -> set -> set
    val member : element -> set -> bool
  end

WrongStringSet.add "gee" WrongStringSet.empty ;;
Error: This expression has type string but an expression was expected of type
         WrongStringSet.element = WrongSet(OrderedString).element
The problem here is that SET specifies the type element abstractly, so that the type equality between element in the result of the functor and t in its argument is forgotten. Consequently, WrongStringSet.element is not the same type as string, and the operations of WrongStringSet cannot be applied to strings. As demonstrated above, it is important that the type element in the signature SET be declared equal to Elt.t; unfortunately, this is impossible above since SET is defined in a context where Elt does not exist. To overcome this difficulty, OCaml provides a with type construct over signatures that allows enriching a signature with extra type equalities:
module AbstractSet2 =
  (Set : functor(Elt: ORDERED_TYPE) -> (SET with type element = Elt.t));;
module AbstractSet2 :
  functor (Elt : ORDERED_TYPE) ->
    sig
      type element = Elt.t
      type set
      val empty : set
      val add : element -> set -> set
      val member : element -> set -> bool
    end
As in the case of simple structures, an alternate syntax is provided for defining functors and restricting their result:
module AbstractSet2(Elt: ORDERED_TYPE) : (SET with type element = Elt.t) = struct ... end;;
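For a complete instance of this last form, here is a minimal self-contained sketch. It deliberately simplifies the example: ORDERED_TYPE uses an int-returning compare (an assumption made here so that the standard String module can serve as argument), and the functor is named MakeSet, a hypothetical name.

```ocaml
module type ORDERED_TYPE = sig
  type t
  val compare : t -> t -> int   (* simplification: negative, zero or positive *)
end

module type SET = sig
  type element
  type set
  val empty : set
  val add : element -> set -> set
  val member : element -> set -> bool
end

(* The result signature keeps set abstract but records that element = Elt.t. *)
module MakeSet (Elt : ORDERED_TYPE) : SET with type element = Elt.t = struct
  type element = Elt.t
  type set = element list
  let empty = []
  let rec add x = function
    | [] -> [x]
    | hd :: tl as s ->
        let c = Elt.compare x hd in
        if c = 0 then s else if c < 0 then x :: s else hd :: add x tl
  let rec member x = function
    | [] -> false
    | hd :: tl ->
        let c = Elt.compare x hd in
        c = 0 || (c > 0 && member x tl)
end

(* String provides type t = string and compare : t -> t -> int. *)
module StringSet = MakeSet (String)

(* Accepted: StringSet.element is known to be string. *)
let s = StringSet.add "gee" StringSet.empty
let found = StringSet.member "gee" s
let missing = StringSet.member "foo" s
```

Without the with type constraint, the call StringSet.add "gee" would be rejected in the same way as the WrongStringSet example.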
Abstracting a type component in a functor result is a powerful technique that provides a high degree of type safety, as we now illustrate. Consider an ordering over character strings that is different from the standard ordering implemented in the OrderedString structure. For instance, we compare strings without distinguishing upper and lower case.
module NoCaseString =
  struct
    type t = string
    let compare s1 s2 =
      OrderedString.compare (String.lowercase_ascii s1) (String.lowercase_ascii s2)
  end;;
module NoCaseString :
  sig type t = string val compare : string -> string -> comparison end

module NoCaseStringSet = AbstractSet(NoCaseString);;
module NoCaseStringSet :
  sig
    type element = NoCaseString.t
    type set = AbstractSet(NoCaseString).set
    val empty : set
    val add : element -> set -> set
    val member : element -> set -> bool
  end

NoCaseStringSet.add "FOO" AbstractStringSet.empty ;;
Error: This expression has type
         AbstractStringSet.set = AbstractSet(OrderedString).set
       but an expression was expected of type
         NoCaseStringSet.set = AbstractSet(NoCaseString).set
Note that the two types AbstractStringSet.set and NoCaseStringSet.set are not compatible, and values of these two types do not match. This is the correct behavior: even though both set types contain elements of the same type (strings), they are built upon different orderings of that type, and different invariants need to be maintained by the operations (being strictly increasing for the standard ordering in one case, and for the case-insensitive ordering in the other). Applying operations from AbstractStringSet to values of type NoCaseStringSet.set could give incorrect results, or build lists that violate the invariants of NoCaseStringSet.
All examples of modules so far have been given in the context of the interactive system. However, modules are most useful for large, batch-compiled programs. For these programs, it is a practical necessity to split the source into several files, called compilation units, that can be compiled separately, thus minimizing recompilation after changes.
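In OCaml, compilation units are special cases of structures and signatures, and the relationship between the units can be explained easily in terms of the module system. A compilation unit A comprises two files:
- the implementation file A.ml, which contains a sequence of definitions, analogous to the inside of a struct…end construct;
- the interface file A.mli, which contains a sequence of specifications, analogous to the inside of a sig…end construct.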
These two files together define a structure named A as if the following definition was entered at top-level:
module A: sig (* contents of file A.mli *) end
        = struct (* contents of file A.ml *) end;;
The files that define the compilation units can be compiled separately using the ocamlc -c command (the -c option means “compile only, do not try to link”); this produces compiled interface files (with extension .cmi) and compiled object code files (with extension .cmo). When all units have been compiled, their .cmo files are linked together using the ocamlc command. For instance, the following commands compile and link a program composed of two compilation units Aux and Main:
$ ocamlc -c Aux.mli                     # produces aux.cmi
$ ocamlc -c Aux.ml                      # produces aux.cmo
$ ocamlc -c Main.mli                    # produces main.cmi
$ ocamlc -c Main.ml                     # produces main.cmo
$ ocamlc -o theprogram Aux.cmo Main.cmo
The program behaves exactly as if the following phrases were entered at top-level:
module Aux: sig (* contents of Aux.mli *) end
          = struct (* contents of Aux.ml *) end;;
module Main: sig (* contents of Main.mli *) end
           = struct (* contents of Main.ml *) end;;
In particular, Main can refer to Aux: the definitions and declarations contained in Main.ml and Main.mli can refer to definitions in Aux.ml, using the Aux.ident notation, provided these definitions are exported in Aux.mli.
The order in which the .cmo files are given to ocamlc during the linking phase determines the order in which the module definitions occur. Hence, in the example above, Aux appears first and Main can refer to it, but Aux cannot refer to Main.
Note that only top-level structures can be mapped to separately-compiled files, but neither functors nor module types. However, all module-class objects can appear as components of a structure, so the solution is to put the functor or module type inside a structure, which can then be mapped to a file.
This chapter describes the functions provided by the OCaml standard library. The modules from the standard library are automatically linked with the user’s object code files by the ocamlc command. Hence, these modules can be used in standalone programs without having to add any .cmo file on the command line for the linking phase. Similarly, in interactive use, these globals can be used in toplevel phrases without having to load any .cmo file in memory.
Unlike the Pervasives module from the core library, the modules from the standard library are not automatically “opened” when a compilation starts, or when the toplevel system is launched. Hence it is necessary to use qualified identifiers to refer to the functions provided by these modules, or to add open directives.
For easy reference, the modules are listed below in alphabetical order of module names. For each module, the declarations from its signature are printed one by one in typewriter font, followed by a short comment. All modules and the identifiers they export are indexed at the end of this report.
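This chapter describes the OCaml core library, which is composed of declarations for built-in types and exceptions, plus the module Pervasives that provides basic operations on these built-in types. The Pervasives module is special in two ways:
- It is automatically linked with the user’s object code files by the ocamlc command.
- It is automatically “opened” when a compilation starts, or when the toplevel system is launched. Hence, it is possible to use unqualified identifiers to refer to the functions provided by the Pervasives module, without adding an open Pervasives directive.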
The declarations of the built-in types and the components of module Pervasives are printed one by one in typewriter font, followed by a short comment. All library modules and the components they provide are indexed at the end of this report.
The following built-in types and predefined exceptions are always defined in the compilation environment, but are not part of any module. As a consequence, they can only be referred to by their short names.
type int
The type of integer numbers.
type char
The type of characters.
type bytes
The type of (writable) byte sequences.
type string
The type of (read-only) character strings.
type float
The type of floating-point numbers.
type bool = false | true
The type of booleans (truth values).
type unit = ()
The type of the unit value.
type exn
The type of exception values.
type 'a array
The type of arrays whose elements have type 'a.
type 'a list = [] | :: of 'a * 'a list
The type of lists whose elements have type 'a.
type 'a option = None | Some of 'a
The type of optional values of type 'a.
type int32
The type of signed 32-bit integers. See the Int32 module.
type int64
The type of signed 64-bit integers. See the Int64 module.
type nativeint
The type of signed, platform-native integers (32 bits on 32-bit processors, 64 bits on 64-bit processors). See the Nativeint module.
type ('a, 'b, 'c, 'd, 'e, 'f) format6
The type of format strings. 'a is the type of the parameters of the format, 'f is the result type for the printf-style functions, 'b is the type of the first argument given to %a and %t printing functions (see module Printf), 'c is the result type of these functions, and also the type of the argument transmitted to the first argument of kprintf-style functions, 'd is the result type for the scanf-style functions (see module Scanf), and 'e is the type of the receiver function for the scanf-style functions.
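In everyday code the six-parameter type rarely appears directly; the three-parameter abbreviation format from the standard library is usually enough. The sketch below shows how the first parameter records the arguments that a format expects. The names entry_fmt and line are hypothetical.

```ocaml
(* 'a = int -> string -> string: the format takes an int, then a string,
   and Printf.sprintf finally returns a string. *)
let entry_fmt : (int -> string -> string, unit, string) format = "%d: %s"

(* Because the format is typed, passing the arguments in the wrong order
   is a compile-time type error, not a run-time failure. *)
let line = Printf.sprintf entry_fmt 3 "apples"
```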
type 'a lazy_t
This type is used to implement the Lazy module. It should not be used directly.
exception Match_failure of (string * int * int)
Exception raised when none of the cases of a pattern-matching apply. The arguments are the location of the match keyword in the source code (file name, line number, column number).
exception Assert_failure of (string * int * int)
Exception raised when an assertion fails. The arguments are the location of the assert keyword in the source code (file name, line number, column number).
exception Invalid_argument of string
Exception raised by library functions to signal that the given arguments do not make sense. The string gives some information to the programmer. As a general rule, this exception should not be caught, it denotes a programming error and the code should be modified not to trigger it.
exception Failure of string
Exception raised by library functions to signal that they are undefined on the given arguments. The string is meant to give some information to the programmer; you must not pattern match on the string literal because it may change in future versions (use Failure _ instead).
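A minimal sketch of the recommended style, using int_of_string from the standard library, which raises Failure on malformed input. The wrapper name parse_int_opt is hypothetical.

```ocaml
(* Match on the constructor only; the message carried by Failure is ignored
   because its exact text may change between versions. *)
let parse_int_opt s =
  match int_of_string s with
  | n -> Some n
  | exception Failure _ -> None
```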
exception Not_found
Exception raised by search functions when the desired object could not be found.
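For instance, List.assoc raises Not_found when the key is absent from an association list. The sketch below turns that into an option; the names capitals and capital_of are hypothetical.

```ocaml
let capitals = [ ("France", "Paris"); ("Japan", "Tokyo") ]

(* Translate the Not_found exception into None. *)
let capital_of country =
  try Some (List.assoc country capitals) with Not_found -> None
```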
exception Out_of_memory
Exception raised by the garbage collector when there is insufficient memory to complete the computation.
exception Stack_overflow
Exception raised by the bytecode interpreter when the evaluation stack reaches its maximal size. This often indicates infinite or excessively deep recursion in the user’s program. (Not fully implemented by the native-code compiler; see section 11.5.)
exception Sys_error of string
Exception raised by the input/output functions to report an operating system error. The string is meant to give some information to the programmer; you must not pattern match on the string literal because it may change in future versions (use Sys_error _ instead).
exception End_of_file
Exception raised by input functions to signal that the end of file has been reached.
exception Division_by_zero
Exception raised by integer division and remainder operations when their second argument is zero.
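A short sketch of the behavior; safe_div and safe_rem are hypothetical helper names. Note that the exception concerns integer operations only: floating-point division by zero follows IEEE 754 and does not raise.

```ocaml
(* Integer division and remainder raise Division_by_zero on a zero divisor. *)
let safe_div a b = try Some (a / b) with Division_by_zero -> None
let safe_rem a b = try Some (a mod b) with Division_by_zero -> None

(* Floating-point division does not raise; it yields an infinity. *)
let float_quotient = 1.0 /. 0.0
```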
exception Sys_blocked_io
A special case of Sys_error raised when no I/O is possible on a non-blocking I/O channel.
exception Undefined_recursive_module of (string * int * int)
Exception raised when an ill-founded recursive module definition is evaluated. (See section 7.4.) The arguments are the location of the definition in the source code (file name, line number, column number).
The graphics library provides a set of portable drawing primitives. Drawing takes place in a separate window that is created when Graphics.open_graph is called.
Unix: This library is implemented under the X11 windows system. Programs that use the graphics library must be linked as follows:
    ocamlc other options graphics.cma other files
For interactive use of the graphics library, do:
    ocamlmktop -o mytop graphics.cma
    ./mytop
or (if dynamic linking of C libraries is supported on your platform), start ocaml and type #load "graphics.cma";;.
Here are the graphics mode specifications supported by Graphics.open_graph on the X11 implementation of this library: the argument to Graphics.open_graph has the format "display-name geometry", where display-name is the name of the X-windows display to connect to, and geometry is a standard X-windows geometry specification. The two components are separated by a space. Either can be omitted, or both. Examples:
- Graphics.open_graph "foo:0"
- connects to the display foo:0 and creates a window with the default geometry
- Graphics.open_graph "foo:0 300x100+50-0"
- connects to the display foo:0 and creates a window 300 pixels wide by 100 pixels tall, at location (50,0)
- Graphics.open_graph " 300x100+50-0"
- connects to the default display and creates a window 300 pixels wide by 100 pixels tall, at location (50,0)
- Graphics.open_graph ""
- connects to the default display and creates a window with the default geometry.
Windows: This library is available both for standalone compiled programs and under the toplevel application ocamlwin.exe. For the latter, this library must be loaded in-core by typing
    #load "graphics.cma";;
The screen coordinates are interpreted as shown in the figure below. Notice that the coordinate system used is the same as in mathematics: y increases from the bottom of the screen to the top of the screen, and angles are measured counterclockwise (in degrees). Drawing is clipped to the screen.
See also the following language extensions: lazy patterns, local opens, first-class modules, attributes, extension nodes and exception cases in pattern matching.
The table below shows the relative precedences and associativity of operators and non-closed pattern constructions. The constructions with higher precedences come first.
Operator | Associativity |
.. | – |
lazy (see section 7.3) | – |
Constructor application, Tag application | right |
:: | right |
, | – |
| | left |
as | – |
Patterns are templates that allow selecting data structures of a given shape, and binding identifiers to components of the data structure. This selection operation is called pattern matching; its outcome is either “this value does not match this pattern”, or “this value matches this pattern, resulting in the following bindings of names to values”.
A pattern that consists of a value name matches any value, binding the name to the value. The pattern _ also matches any value, but does not bind any name.
Patterns are linear: a variable cannot be bound several times by a given pattern. In particular, there is no way to test for equality between two parts of a data structure using only a pattern (but when guards can be used for this purpose).
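A minimal sketch of the guard-based workaround; same_components is a hypothetical name. Writing the pattern (x, x) is rejected by the compiler because x would be bound twice, so the equality test moves into a when guard.

```ocaml
(* Two distinct variables are bound, then compared in the guard. *)
let same_components pair =
  match pair with
  | (x, y) when x = y -> true
  | _ -> false
```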
A pattern consisting of a constant matches the values that are equal to this constant.
The pattern pattern1 as value-name matches the same values as pattern1. If the matching against pattern1 is successful, the name value-name is bound to the matched value, in addition to the bindings performed by the matching against pattern1.
The pattern ( pattern1 ) matches the same values as pattern1. A type constraint can appear in a parenthesized pattern, as in ( pattern1 : typexpr ). This constraint forces the type of pattern1 to be compatible with typexpr.
The pattern pattern1 | pattern2 represents the logical “or” of the two patterns pattern1 and pattern2. A value matches pattern1 | pattern2 if it matches pattern1 or pattern2. The two sub-patterns pattern1 and pattern2 must bind exactly the same identifiers to values having the same types. Matching is performed from left to right. More precisely, in case some value v matches pattern1 | pattern2, the bindings performed are those of pattern1 when v matches pattern1. Otherwise, value v matches pattern2 whose bindings are performed.
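The requirement that both alternatives bind the same identifiers can be illustrated with a small sketch. The type shape and the function first_dimension are hypothetical names introduced for this example.

```ocaml
type shape = Circle of float | Square of float | Rect of float * float

let first_dimension = function
  | Circle d | Square d -> d   (* both alternatives bind d, of type float *)
  | Rect (w, _) -> w
```

An or-pattern such as Circle d | Rect (w, _) would be rejected, since its two sides do not bind the same variables.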
The pattern constr ( pattern1 , … , patternn ) matches all variants whose constructor is equal to constr, and whose arguments match pattern1 … patternn. It is a type error if n is not the number of arguments expected by the constructor.
The pattern constr _ matches all variants whose constructor is constr.
The pattern pattern1 :: pattern2 matches non-empty lists whose heads match pattern1, and whose tails match pattern2.
The pattern [ pattern1 ; … ; patternn ] matches lists of length n whose elements match pattern1 …patternn, respectively. This pattern behaves like pattern1 :: … :: patternn :: [].
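The following sketch combines both list pattern forms and checks that the bracket form and its :: expansion behave alike. All function names are hypothetical.

```ocaml
(* Bracket patterns fix the length; :: patterns only constrain a prefix. *)
let classify = function
  | [] -> "empty"
  | [ _ ] -> "singleton"
  | [ _; _ ] -> "pair"
  | _ :: _ :: _ -> "longer"    (* reached only for three or more elements *)

(* The same two-element match, written in both notations. *)
let sum_pair_bracket = function [ a; b ] -> Some (a + b) | _ -> None
let sum_pair_cons = function a :: b :: [] -> Some (a + b) | _ -> None
```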
The pattern `tag-name pattern1 matches all polymorphic variants whose tag is equal to tag-name, and whose argument matches pattern1.
If the type [('a,'b,…)] typeconstr = [ ` tag-name1 typexpr1 | … | ` tag-namen typexprn] is defined, then the pattern #typeconstr is a shorthand for the following or-pattern: ( `tag-name1(_ : typexpr1) | … | ` tag-namen(_ : typexprn)). It matches all values of type [< typeconstr ].
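A small sketch of the #typeconstr shorthand; the types primary and color and the function describe are hypothetical names chosen for illustration.

```ocaml
type primary = [ `Red | `Green | `Blue ]
type color = [ primary | `Gray of int ]

let describe (c : color) =
  match c with
  | #primary -> "primary"                  (* same as `Red | `Green | `Blue *)
  | `Gray n -> "gray " ^ string_of_int n
```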
The pattern pattern1 , … , patternn matches n-tuples whose components match the patterns pattern1 through patternn. That is, the pattern matches the tuple values (v1, …, vn) such that patterni matches vi for i = 1,… , n.
The pattern { field1 = pattern1 ; … ; fieldn = patternn } matches records that define at least the fields field1 through fieldn, and such that the value associated to fieldi matches the pattern patterni, for i = 1,… , n. The record value can define more fields than field1 …fieldn; the values associated to these extra fields are not taken into account for matching. Optional type constraints can be added field by field with { field1 : typexpr1 = pattern1 ;… ; fieldn : typexprn = patternn } to force the type of fieldk to be compatible with typexprk.
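A minimal sketch of record patterns that mention only some of the fields. The type point and both functions are hypothetical. The trailing ; _ is an explicit marker that the remaining fields are ignored on purpose; it is optional, and is used here as a stylistic choice.

```ocaml
type point = { x : int; y : int; label : string }

(* Only the field y is inspected; x and label are not taken into account. *)
let on_x_axis p =
  match p with
  | { y = 0; _ } -> true
  | _ -> false

(* A record pattern can also appear directly as a function parameter. *)
let label_of { label = l; _ } = l
```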
The pattern [| pattern1 ; … ; patternn |] matches arrays of length n such that the i-th array element matches the pattern patterni, for i = 1,… , n.
The pattern ' c ' .. ' d ' is a shorthand for the pattern
' c ' | ' c1 ' | ' c2 ' | … | ' cn ' | ' d '
where c1, c2, …, cn are the characters that occur between c and d in the ASCII character set. For instance, the pattern '0'..'9' matches all characters that are digits.
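A short sketch of character range patterns, combined with as to keep the matched character. The function names are hypothetical.

```ocaml
let is_digit = function
  | '0' .. '9' -> true
  | _ -> false

(* The range binds nothing by itself; "as c" names the matched character. *)
let digit_value = function
  | '0' .. '9' as c -> Some (Char.code c - Char.code '0')
  | _ -> None
```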
This chapter describes the OCaml batch compiler ocamlc, which compiles OCaml source files to bytecode object files and links these object files to produce standalone bytecode executable files. These executable files are then run by the bytecode interpreter ocamlrun.
The ocamlc command has a command-line interface similar to the one of most C compilers. It accepts several types of arguments and processes them sequentially, after all options have been processed:
If the interface file x.mli exists, the implementation x.ml is checked against the corresponding compiled interface x.cmi, which is assumed to exist. If no interface x.mli is provided, the compilation of x.ml produces a compiled interface file x.cmi in addition to the compiled object code file x.cmo. The file x.cmi produced corresponds to an interface that exports everything that is defined in the implementation x.ml.
The output of the linking phase is a file containing compiled bytecode that can be executed by the OCaml bytecode interpreter: the command named ocamlrun. If a.out is the name of the file produced by the linking phase, the command
ocamlrun a.out arg1 arg2 … argn
executes the compiled code contained in a.out, passing it as arguments the character strings arg1 to argn. (See chapter 10 for more details.)
On most systems, the file produced by the linking phase can be run directly, as in:
./a.out arg1 arg2 … argn
The produced file has the executable bit set, and it manages to launch the bytecode interpreter by itself.
The following command-line options are recognized by ocamlc. The options -pack, -a, -c and -output-obj are mutually exclusive.
If -custom, -cclib or -ccopt options are passed on the command line, these options are stored in the resulting .cma library. Then, linking with this library automatically adds back the -custom, -cclib and -ccopt options as if they had been provided on the command line, unless the -noautolink option is given.
The environment variable OCAML_COLOR is considered if -color is not provided. Its values are auto/always/never as above.
Unix: Never use the strip command on executables produced by ocamlc -custom; this would remove the bytecode part of the executable.
Unix: Security warning: never set the “setuid” or “setgid” bits on executables produced by ocamlc -custom; this would make them vulnerable to attacks.
If the given directory starts with +, it is taken relative to the standard library directory. For instance, -I +labltk adds the subdirectory labltk of the standard library to the search path.
The -opaque option, available since 4.04, disables cross-module optimization information for the currently compiled unit. When compiling .mli interface, using -opaque marks the compiled .cmi interface so that subsequent compilations of modules that depend on it will not rely on the corresponding .cmx file, nor warn if it is absent. When the native compiler compiles a .ml implementation, using -opaque generates a .cmx that does not contain any cross-module optimization information.
Using this option may degrade the quality of generated code, but it reduces compilation time, both on clean and incremental builds. Indeed, with the native compiler, when the implementation of a compilation unit changes, all the units that depend on it may need to be recompiled – because the cross-module information may have changed. If the compilation unit whose implementation changed was compiled with -opaque, no such recompilation needs to occur. This option can thus be used, for example, to get faster edit-compile-test feedback loops.
ocamlc -pack -o p.cmo a.cmo b.cmo c.cmo
generates compiled files p.cmo and p.cmi describing a compilation unit having three sub-modules A, B and C, corresponding to the contents of the object files a.cmo, b.cmo and c.cmo. These contents can be referenced as P.A, P.B and P.C in the remainder of the program.
The warning-list argument is a sequence of warning specifiers, with no separators between them. A warning specifier is one of the following:
Warning numbers and letters which are out of the range of warnings that are currently defined are ignored. The warnings are as follows.
The default setting is -w +a-4-6-7-9-27-29-32..39-41..42-44-45-48-50. It is displayed by ocamlc -help. Note that warnings 5 and 10 are not always triggered, depending on the internals of the type checker.
Note: it is not recommended to use warning sets (i.e. letters) as arguments to -warn-error in production code, because this can break your build when future versions of OCaml add some new warnings.
The default setting is -warn-error -a+31 (only warning 31 is fatal).
The compiler command line can be modified “from the outside” with the following mechanisms. These are experimental and subject to change. They should be used only for experimental and development work, not in released packages.
This short section is intended to clarify the relationship between the names of the modules corresponding to compilation units and the names of the files that contain their compiled interface and compiled implementation.
The compiler always derives the module name by taking the capitalized base name of the source file (.ml or .mli file). That is, it strips the leading directory name, if any, as well as the .ml or .mli suffix; then, it sets the first letter to uppercase, in order to comply with the requirement that module names must be capitalized. For instance, compiling the file mylib/misc.ml provides an implementation for the module named Misc. Other compilation units may refer to components defined in mylib/misc.ml under the names Misc.name; they can also do open Misc, then use unqualified names name.
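The derivation rule can be mimicked in a few lines of OCaml. This is only a sketch of the rule as stated above, not the compiler's own code; the helper name module_name_of_file is hypothetical.

```ocaml
(* Hypothetical helper, not part of the compiler: derive a module name
   from a source file path following the rule described above. *)
let module_name_of_file path =
  path
  |> Filename.basename          (* strip the leading directory name *)
  |> Filename.chop_extension    (* strip the .ml or .mli suffix *)
  |> String.capitalize_ascii    (* set the first letter to uppercase *)
```

Note that Filename.chop_extension raises Invalid_argument when the file name has no extension, so this sketch assumes a path ending in .ml or .mli.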
The .cmi and .cmo files produced by the compiler have the same base name as the source file. Hence, the compiled files always have their base name equal (modulo capitalization of the first letter) to the name of the module they describe (for .cmi files) or implement (for .cmo files).
When the compiler encounters a reference to a free module identifier Mod, it looks in the search path for a file named Mod.cmi or mod.cmi and loads the compiled interface contained in that file. As a consequence, renaming .cmi files is not advised: the name of a .cmi file must always correspond to the name of the compilation unit it implements. It is admissible to move them to another directory, if their base name is preserved, and the correct -I options are given to the compiler. The compiler will flag an error if it loads a .cmi file that has been renamed.
Compiled bytecode files (.cmo files), on the other hand, can be freely renamed once created. That’s because the linker never attempts to find by itself the .cmo file that implements a module with a given name: it relies instead on the user providing the list of .cmo files by hand.
This section describes and explains the most frequently encountered error messages.
If filename has the format mod.cmo, this means you are trying to link a bytecode object file that does not exist yet. Fix: compile mod.ml first.
If your program spans several directories, this error can also appear because you haven’t specified the directories to look into. Fix: add the correct -I options to the command line.
In some cases, it is hard to understand why the two types t1 and t2 are incompatible. For instance, the compiler can report that “expression of type foo cannot be used with type foo”, and it really seems that the two types foo are compatible. This is not always true. Two type constructors can have the same name, but actually represent different types. This can happen if a type constructor is redefined. Example:
type foo = A | B
let f = function A -> 0 | B -> 1
type foo = C | D
f C
This results in the error message “expression C of type foo cannot be used with type foo”.
Non-generalized type variables in a type cause no difficulties inside a given structure or compilation unit (the contents of a .ml file, or an interactive session), but they cannot be allowed inside signatures nor in compiled interfaces (.cmi file), because they could be used inconsistently later. Therefore, the compiler flags an error when a structure or compilation unit defines a value name whose type contains non-generalized type variables. There are two ways to fix this error:
let sort_int_list = Sort.list (<) (* inferred type 'a list -> 'a list, with 'a not generalized *)
write
let sort_int_list = (Sort.list (<) : int list -> int list);;
let map_length = List.map Array.length (* inferred type 'a array list -> int list, with 'a not generalized *)
write
let map_length lv = List.map Array.length lv
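To see what the eta-expanded form buys, the following sketch applies map_length at two different element types, which is only possible when its type variable is generalized. The names int_lengths and string_lengths are hypothetical.

```ocaml
(* Eta-expanded: map_length is a syntactic function, so 'a is generalized. *)
let map_length lv = List.map Array.length lv

(* The same definition can be used at two different element types. *)
let int_lengths = map_length [ [| 1; 2 |]; [||] ]
let string_lengths = map_length [ [| "a" |] ]
```

With the partial application List.map Array.length, the first use would fix the element type and the second use would be rejected.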
Of course, you will always encounter this error if you have mutually recursive functions across modules. That is, function Mod1.f calls function Mod2.g, and function Mod2.g calls function Mod1.f. In this case, no matter what permutations you perform on the command line, the program will be rejected at link-time. Fixes:
mod1.ml: let f x = ... Mod2.g ...
mod2.ml: let g y = ... Mod1.f ...
define
mod1.ml: let f g x = ... g ...
mod2.ml: let rec g y = ... Mod1.f g ...
and link mod1.cmo before mod2.cmo.
mod1.ml: let forward_g = ref((fun x -> failwith "forward_g") : <type>)
         let f x = ... !forward_g ...
mod2.ml: let g y = ... Mod1.f ...
         let _ = Mod1.forward_g := g
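The forward-reference technique can be tried in a single file by using two sub-modules in place of the two compilation units. This is a minimal sketch under that assumption; the bodies of f and g (an even/odd test) are hypothetical and serve only to make the mutual recursion observable.

```ocaml
(* Stands in for mod1.ml: f needs g, which is defined later. *)
module Mod1 = struct
  (* Placeholder that fails until the real g is installed. *)
  let forward_g : (int -> bool) ref = ref (fun _ -> failwith "forward_g")

  (* f x is true when x is even; it calls g through the reference. *)
  let f x = if x = 0 then true else !forward_g (x - 1)
end

(* Stands in for mod2.ml: g can call Mod1.f directly. *)
module Mod2 = struct
  (* g y is true when y is odd. *)
  let g y = if y = 0 then false else Mod1.f (y - 1)

  (* Install the real g as soon as this module is initialized. *)
  let () = Mod1.forward_g := g
end
```

The cost of this approach is that a call to Mod1.f made before Mod2 has been initialized fails at run time instead of being rejected at link time.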
This section describes and explains in detail some warnings:
Some constructors, such as the exception constructors Failure and Invalid_argument, take as parameter a string value holding a text message intended for the user.
These text messages are usually not stable over time: call sites building these constructors may refine the message in a future version to make it more explicit, etc. Therefore, it is dangerous to match over the precise value of the message. For example, until OCaml 4.02, Array.iter2 would raise the exception
Invalid_argument "arrays must have the same length"
Since 4.03 it raises the more helpful message
Invalid_argument "Array.iter2: arrays must have the same length"
but this means that any code of the form
try ... with Invalid_argument "arrays must have the same length" -> ...
is now broken and may suffer from uncaught exceptions.
Warning 52 is there to prevent users from writing such fragile code in the first place. It does not occur on every matching on a literal string, but only in the case in which library authors expressed their intent to possibly change the constructor parameter value in the future, by using the attribute ocaml.warn_on_literal_pattern (see the manual section on builtin attributes in 7.18.1):
type t =
  | Foo of string [@ocaml.warn_on_literal_pattern]
  | Bar of string

let no_warning = function
  | Bar "specific value" -> 0
  | _ -> 1

let warning = function
  | Foo "specific value" -> 0
  | _ -> 1

> | Foo "specific value" -> 0
>       ^^^^^^^^^^^^^^^^
> Warning 52: Code should not depend on the actual values of
> this constructor's arguments. They are only for information
> and may change in future versions. (See manual section 8.5)
In particular, all built-in exceptions with a string argument have this attribute set: Invalid_argument, Failure, Sys_error will all raise this warning if you match for a specific string argument.
If your code raises this warning, you should not change the way you test for the specific string to avoid the warning (for example using a string equality inside the right-hand-side instead of a literal pattern), as your code would remain fragile. You should instead enlarge the scope of the pattern by matching on all possible values.
let warning = function | Foo _ -> 0 | _ -> 1
This may require some care: if the scrutinee may return several different cases of the same pattern, or raise distinct instances of the same exception, you may need to modify your code to separate those several cases.
For example,
try (int_of_string count_str, bool_of_string choice_str) with
| Failure "int_of_string" -> (0, true)
| Failure "bool_of_string" -> (-1, false)
should be rewritten into more atomic tests. For example, using the exception patterns documented in Section 7.21, one can write:
match int_of_string count_str with
| exception (Failure _) -> (0, true)
| count ->
  begin match bool_of_string choice_str with
  | exception (Failure _) -> (-1, false)
  | choice -> (count, choice)
  end
The only case where that transformation is not possible is if a given function call may raise distinct exceptions with the same constructor but different string values. In this case, you will have to check for specific string values. This is dangerous API design and it should be discouraged: it’s better to define more precise exception constructors than to store useful information in strings.
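The advice to define more precise exception constructors might look as follows. This is a sketch only: the exceptions Bad_count and Bad_ratio and the functions parse_count, parse_ratio and parse are hypothetical names introduced for the example, and the fallback values are arbitrary.

```ocaml
(* Hypothetical exceptions: each failure mode gets its own constructor,
   so handlers never need to inspect a message string. *)
exception Bad_count of string
exception Bad_ratio of string

let parse_count s =
  match int_of_string s with
  | exception (Failure _) -> raise (Bad_count s)
  | n -> n

let parse_ratio s =
  match float_of_string s with
  | exception (Failure _) -> raise (Bad_ratio s)
  | r -> r

(* The handler matches on the constructor and ignores the payload.
   The two parses are sequenced with let so their order is explicit. *)
let parse count_str ratio_str =
  match
    let count = parse_count count_str in
    let ratio = parse_ratio ratio_str in
    (count, ratio)
  with
  | exception (Bad_count _) -> (0, 1.0)
  | exception (Bad_ratio _) -> (-1, 0.0)
  | pair -> pair
```

The string carried by each exception remains available for error reporting, but no control flow depends on its exact value.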
The semantics of or-patterns in OCaml is specified with a left-to-right bias: a value v matches the pattern p | q if it matches p or q, but if it matches both, the environment captured by the match is the environment captured by p, never the one captured by q.
While this property is generally intuitive, there is at least one specific case where a different semantics might be expected. Consider a pattern followed by a when-guard: | p when g -> e, for example:
| ((Const x, _) | (_, Const x)) when is_neutral x -> branch
The semantics is clear: match the scrutinee against the pattern; if it matches, test the guard; and if the guard passes, take the branch. In particular, consider the input (Const a, Const b), where a fails the test is_neutral a, while b passes the test is_neutral b. With the left-to-right semantics, the clause above is not taken for this input: matching (Const a, Const b) against the or-pattern succeeds in the left branch and returns the environment x -> a; the guard is_neutral a is then tested and fails, so the branch is not taken.
However, another semantics may be considered more natural here: any pair that has one side passing the test will take the branch. With this semantics the previous code fragment would be equivalent to
| (Const x, _) when is_neutral x -> branch | (_, Const x) when is_neutral x -> branch
This is not the semantics adopted by OCaml.
Warning 57 is dedicated to these confusing cases where the specified left-to-right semantics is not equivalent to a non-deterministic semantics (any branch can be taken) relative to a specific guard. More precisely, it warns when the guard uses “ambiguous” variables, that is, variables bound to different parts of the scrutinee by different sides of an or-pattern.
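The difference between the two semantics can be observed on a concrete input. In this sketch the type term, the test is_neutral (here: equality with 0) and both function names are assumptions made for the example; the attribute at the top only silences warning 57 so that the file compiles without warnings.

```ocaml
[@@@ocaml.warning "-57"] (* remove this line to see warning 57 reported *)

type term = Const of int | Var of string

(* Assumed guard for the example: 0 is the neutral element. *)
let is_neutral x = (x = 0)

(* Or-pattern with a guard: x is bound by the left alternative
   whenever the left alternative matches. *)
let with_or_pattern = function
  | ((Const x, _) | (_, Const x)) when is_neutral x -> true
  | _ -> false

(* Two separate clauses: the guard is tried again on the second component. *)
let with_two_clauses = function
  | (Const x, _) when is_neutral x -> true
  | (_, Const x) when is_neutral x -> true
  | _ -> false
```

On the input (Const 1, Const 0), with_or_pattern returns false while with_two_clauses returns true; on inputs where only one side is a Const, the two functions agree.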
Spacetime is the name given to functionality within the OCaml compiler that provides for accurate profiling of the memory behaviour of a program. Using Spacetime it is possible to determine the source of memory leaks and excess memory allocation quickly and easily. Excess allocation slows programs down both by imposing a higher load on the garbage collector and reducing the cache locality of the program’s code. Spacetime provides full backtraces for every allocation that occurred on the OCaml heap during the lifetime of the program including those in C stubs.
Spacetime only analyses the memory behaviour of a program with respect to the OCaml heap allocators and garbage collector. It does not analyse allocation on the C heap. Spacetime does not affect the memory behaviour of a program being profiled with the exception of any change caused by the overhead of profiling (see section 21.3)—for example the program running slower might cause it to allocate less memory in total.
Spacetime is currently only available for x86-64 targets and has only been tested on Linux systems (although it is expected to work on most modern Unix-like systems and provision has been made for running under Windows). It is expected that the set of supported platforms will be extended in the future.
To use Spacetime it is necessary to use an OCaml compiler that was configured with the -spacetime option. It is not possible to select Spacetime on a per-source-file basis or for a subset of files in a project; all files involved in the executable being profiled must be built with the Spacetime compiler. Only native code compilation is supported (not bytecode).
If the libunwind library is not available on the system then it will not be possible for Spacetime to profile allocations occurring within C stubs. If the libunwind library is available but in an unusual location then that location may be specified to the configure script using the -libunwinddir option (or alternatively, using separate -libunwindinclude and -libunwindlib options).
OPAM switches will be provided for Spacetime-configured compilers.
Once the appropriate compiler has been selected the program should be built as normal (ensuring that all files are built with the Spacetime compiler—there is currently no protection to ensure this is the case, but it is essential). For many uses it will not be necessary to change the code of the program to use the profiler.
Spacetime-configured compilers run slower and occupy more memory than their counterparts. It is hoped this will be fixed in the future as part of improved cross compilation support.
Programs built with Spacetime instrumentation have a dependency on the libunwind library unless that was unavailable at configure time or the -disable-libunwind option was specified (see section 21.3).
Setting the OCAML_SPACETIME_INTERVAL environment variable to an integer representing a number of milliseconds before running a program built with Spacetime will cause memory profiling to be in operation when the program is started. The contents of the OCaml heap will be sampled each time the number of milliseconds that the program has spent executing since the last sample exceeds the given number. (Note that the time base is combined user plus system time—not wall clock time. This peculiarity may be changed in future.)
The program being profiled must exit normally or be caused to exit using the SIGINT signal (e.g. by pressing Ctrl+C). When the program exits files will be written in the directory that was the working directory when the program was started. One Spacetime file will be written for each process that was involved, indexed by process ID; there will normally only be one such. The Spacetime files may be substantial. The directory to which they are written may be overridden by setting the OCAML_SPACETIME_SNAPSHOT_DIR environment variable before the program is started.
Instead of using the automatic snapshot facility described above it is also possible to manually control Spacetime profiling. (The environment variables OCAML_SPACETIME_INTERVAL and OCAML_SPACETIME_SNAPSHOT_DIR are then not relevant.) Full documentation as regards this method of profiling is provided in the standard library documentation (section 24) for the Spacetime module.
The compiler distribution does not itself provide the facility for analysing Spacetime output files; this is left to external tools. The first such tool will appear in OPAM as a package called prof_spacetime. That tool will provide interactive graphical and terminal-based visualisation of the results of profiling.
The runtime overhead imposed by Spacetime varies considerably depending on the particular program being profiled. The overhead may be as low as ten percent—but more usually programs should be expected to run at perhaps a third or quarter of their normal speed. It is expected that this overhead will be reduced in future versions of the compiler.
Execution speed of instrumented programs may be increased by using a compiler configured with the -disable-libunwind option. This prevents collection of profiling information from C stubs.
Programs running with Spacetime instrumentation consume significantly more memory than their non-instrumented counterparts. It is expected that this memory overhead will also be reduced in the future.
The compiler distribution provides an “otherlibs” library called raw_spacetime_lib for decoding Spacetime files. This library provides facilities to read not only memory profiling information but also the full dynamic call graph of the profiled program which is written into Spacetime output files.
A library package spacetime_lib will be provided in OPAM to provide an interface for decoding profiling information at a higher level than that provided by raw_spacetime_lib.
Compilation units bridge the module system and the separate compilation system. A compilation unit is composed of two parts: an interface and an implementation. The interface contains a sequence of specifications, just as the inside of a sig … end signature expression. The implementation contains a sequence of definitions and expressions, just as the inside of a struct … end module expression. A compilation unit also has a name unit-name, derived from the names of the files containing the interface and the implementation (see chapter 8 for more details). A compilation unit behaves roughly as the module definition
module unit-name : sig interface end = struct implementation end
A compilation unit can refer to other compilation units by their names, as if they were regular modules. For instance, if U is a compilation unit that defines a type t, other compilation units can refer to that type under the name U.t; they can also refer to U as a whole structure. Except for names of other compilation units, a unit interface or unit implementation must not have any other free variables. In other terms, the type-checking and compilation of an interface or implementation proceeds in the initial environment
name1 : sig specification1 end … namen : sig specificationn end
where name1 … namen are the names of the other compilation units available in the search path (see chapter 8 for more details) and specification1 … specificationn are their respective interfaces.
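The correspondence between a compilation unit and a constrained module definition can be sketched in a single file. The unit name Counter and its contents are hypothetical; the sig part plays the role of an interface file and the struct part the role of an implementation file.

```ocaml
(* Hypothetical unit "counter": the sig part stands for counter.mli,
   the struct part stands for counter.ml. *)
module Counter : sig
  type t
  val zero : t
  val succ : t -> t
  val to_int : t -> int
end = struct
  type t = int
  let zero = 0
  let succ n = n + 1
  let to_int n = n
end

(* Another unit would refer to the components as Counter.t, Counter.succ, ... *)
let two = Counter.(to_int (succ (succ zero)))
```

Because the interface keeps t abstract, code outside Counter cannot rely on t being int, just as with a separately compiled unit.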
(Chapter written by Jacques Garrigue)
This chapter gives an overview of the new features in OCaml 3: labels, and polymorphic variants.
If you have a look at modules ending in Labels in the standard library, you will see that function types have annotations you did not have in the functions you defined yourself.
ListLabels.map;;
- : f:('a -> 'b) -> 'a list -> 'b list = <fun>
StringLabels.sub;;
- : string -> pos:int -> len:int -> string = <fun>
Such annotations of the form name: are called labels. They are meant to document the code, allow more checking, and give more flexibility to function application. You can give such names to arguments in your programs, by prefixing them with a tilde ~.
let f ~x ~y = x - y;;
val f : x:int -> y:int -> int = <fun>
let x = 3 and y = 2 in f ~x ~y;;
- : int = 1
When you want to use distinct names for the variable and the label appearing in the type, you can use a naming label of the form ~name:. This also applies when the argument is not a variable.
let f ~x:x1 ~y:y1 = x1 - y1;;
val f : x:int -> y:int -> int = <fun>
f ~x:3 ~y:2;;
- : int = 1
Labels obey the same rules as other identifiers in OCaml, that is you cannot use a reserved keyword (like in or to) as label.
Formal parameters and arguments are matched according to their respective labels, the absence of label being interpreted as the empty label. This allows commuting arguments in applications. One can also partially apply a function on any argument, creating a new function of the remaining parameters.
let f ~x ~y = x - y;;
val f : x:int -> y:int -> int = <fun>
f ~y:2 ~x:3;;
- : int = 1
ListLabels.fold_left;;
- : f:('a -> 'b -> 'a) -> init:'a -> 'b list -> 'a = <fun>
ListLabels.fold_left [1;2;3] ~init:0 ~f:( + );;
- : int = 6
ListLabels.fold_left ~init:0;;
- : f:(int -> 'a -> int) -> 'a list -> int = <fun>
If several arguments of a function bear the same label (or no label), they will not commute among themselves, and order matters. But they can still commute with other arguments.
let hline ~x:x1 ~x:x2 ~y = (x1, x2, y);;
val hline : x:'a -> x:'b -> y:'c -> 'a * 'b * 'c = <fun>
hline ~x:3 ~y:2 ~x:5;;
- : int * int * int = (3, 5, 2)
As an exception to the above parameter matching rules, if an application is total (omitting all optional arguments), labels may be omitted. In practice, many applications are total, so that labels can often be omitted.
f 3 2;;
- : int = 1
ListLabels.map succ [1;2;3];;
- : int list = [2; 3; 4]
But beware that functions like ListLabels.fold_left whose result type is a type variable will never be considered as totally applied.
ListLabels.fold_left ( + ) 0 [1;2;3];;
Error: This expression has type int -> int -> int
       but an expression was expected of type 'a list
When a function is passed as an argument to a higher-order function, labels must match in both types. Neither adding nor removing labels is allowed.
let h g = g ~x:3 ~y:2;;
val h : (x:int -> y:int -> 'a) -> 'a = <fun>
h f;;
- : int = 1
h ( + ) ;;
Error: This expression has type int -> int -> int
       but an expression was expected of type x:int -> y:int -> 'a
Note that when you don’t need an argument, you can still use a wildcard pattern, but you must prefix it with the label.
h (fun ~x:_ ~y -> y+1);;
- : int = 3
An interesting feature of labeled arguments is that they can be made optional. For optional parameters, the question mark ? replaces the tilde ~ of non-optional ones, and the label is also prefixed by ? in the function type. Default values may be given for such optional parameters.
let bump ?(step = 1) x = x + step;;
val bump : ?step:int -> int -> int = <fun>
bump 2;;
- : int = 3
bump ~step:3 2;;
- : int = 5
A function taking some optional arguments must also take at least one non-optional argument. The criterion for deciding whether an optional argument has been omitted is the non-labeled application of an argument appearing after this optional argument in the function type. Note that if that argument is labeled, you will only be able to eliminate optional arguments through the special case for total applications.
let test ?(x = 0) ?(y = 0) () ?(z = 0) () = (x, y, z);;
val test : ?x:int -> ?y:int -> unit -> ?z:int -> unit -> int * int * int = <fun>
test ();;
- : ?z:int -> unit -> int * int * int = <fun>
test ~x:2 () ~z:3 ();;
- : int * int * int = (2, 0, 3)
Optional parameters may also commute with non-optional or unlabeled ones, as long as they are applied simultaneously. By nature, optional arguments do not commute with unlabeled arguments applied independently.
test ~y:2 ~x:3 () ();;
- : int * int * int = (3, 2, 0)
test () () ~z:1 ~y:2 ~x:3;;
- : int * int * int = (3, 2, 1)
(test () ()) ~z:1 ;;
Error: This expression has type int * int * int
       This is not a function; it cannot be applied.
Here (test () ()) is already (0,0,0) and cannot be further applied.
Optional arguments are actually implemented as option types. If you do not give a default value, you have access to their internal representation, type 'a option = None | Some of 'a. You can then provide different behaviors when an argument is present or not.
let bump ?step x =
  match step with
  | None -> x * 2
  | Some y -> x + y
;;
val bump : ?step:int -> int -> int = <fun>
It may also be useful to relay an optional argument from a function call to another. This can be done by prefixing the applied argument with ?. This question mark disables the wrapping of optional argument in an option type.
let test2 ?x ?y () = test ?x ?y () ();;
val test2 : ?x:int -> ?y:int -> unit -> int * int * int = <fun>
test2 ?x:None;;
- : ?y:int -> unit -> int * int * int = <fun>
While they provide an increased comfort for writing function applications, labels and optional arguments have the pitfall that they cannot be inferred as completely as the rest of the language.
You can see it in the following two examples.
let h' g = g ~y:2 ~x:3;;
val h' : (y:int -> x:int -> 'a) -> 'a = <fun>
h' f ;;
Error: This expression has type x:int -> y:int -> int
       but an expression was expected of type y:int -> x:int -> 'a
let bump_it bump x = bump ~step:2 x;;
val bump_it : (step:int -> 'a -> 'b) -> 'a -> 'b = <fun>
bump_it bump 1 ;;
Error: This expression has type ?step:int -> int -> int
       but an expression was expected of type step:int -> 'a -> 'b
The first case is simple: g is passed ~y and then ~x, but f expects ~x and then ~y. This is correctly handled if we know the type of g to be x:int -> y:int -> int in advance, but otherwise this causes the above type clash. The simplest workaround is to apply formal parameters in a standard order.
The second example is more subtle: while we intended the argument bump to be of type ?step:int -> int -> int, it is inferred as step:int -> int -> 'a. These two types being incompatible (internally normal and optional arguments are different), a type error occurs when applying bump_it to the real bump.
We will not try here to explain in detail how type inference works. One must just understand that there is not enough information in the above program to deduce the correct type of g or bump. That is, there is no way to know whether an argument is optional or not, or which is the correct order, by looking only at how a function is applied. The strategy used by the compiler is to assume that there are no optional arguments, and that applications are done in the right order.
The right way to solve this problem for optional parameters is to add a type annotation to the argument bump.
let bump_it (bump : ?step:int -> int -> int) x = bump ~step:2 x;;
val bump_it : (?step:int -> int -> int) -> int -> int = <fun>
bump_it bump 1;;
- : int = 3
In practice, such problems appear mostly when using objects whose methods have optional arguments, so that writing the type of object arguments is often a good idea.
Normally the compiler generates a type error if you attempt to pass to a function a parameter whose type is different from the expected one. However, in the specific case where the expected type is a non-labeled function type, and the argument is a function expecting optional parameters, the compiler will attempt to transform the argument to have it match the expected type, by passing None for all optional parameters.
let twice f (x : int) = f(f x);;
val twice : (int -> int) -> int -> int = <fun>
twice bump 2;;
- : int = 8
This transformation is coherent with the intended semantics, including side-effects. That is, if the application of optional parameters shall produce side-effects, these are delayed until the received function is really applied to an argument.
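The delaying of side-effects can be observed with a default value whose evaluation increments a counter. This is a sketch only: the counter default_evaluations and the names bump_twice, evaluations_before, result and evaluations_after are hypothetical, and bump is redefined locally with a default value that has a side effect.

```ocaml
(* Hypothetical counter recording how often the default value is computed. *)
let default_evaluations = ref 0

(* The default expression has a side effect: it increments the counter. *)
let bump ?(step = (incr default_evaluations; 1)) x = x + step

let twice f (x : int) = f (f x)

(* Passing bump where int -> int is expected does not evaluate the default. *)
let bump_twice = twice bump
let evaluations_before = !default_evaluations

(* Each real application of the received function evaluates the default once. *)
let result = bump_twice 2
let evaluations_after = !default_evaluations
```

Here the counter is still 0 after bump has been passed to twice, and it reaches 2 once the resulting function has been applied, since twice calls its argument two times.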
Like for names, choosing labels for functions is not an easy task. A good labeling is a labeling which
We explain here the rules we applied when labeling OCaml libraries.
To speak in an “object-oriented” way, one can consider that each function has a main argument, its object, and other arguments related to its action, the parameters. To permit the combination of functions through functionals in commuting label mode, the object will not be labeled. Its role is clear from the function itself. The parameters are labeled with names that recall their nature or their role. The best labels combine nature and role. When this is not possible the role is to be preferred, since the nature will often be given by the type itself. Obscure abbreviations should be avoided.
ListLabels.map : f:('a -> 'b) -> 'a list -> 'b list
UnixLabels.write : file_descr -> buf:bytes -> pos:int -> len:int -> unit
When there are several objects of same nature and role, they are all left unlabeled.
ListLabels.iter2 : f:('a -> 'b -> 'c) -> 'a list -> 'b list -> unit
When there is no preferable object, all arguments are labeled.
BytesLabels.blit : src:bytes -> src_pos:int -> dst:bytes -> dst_pos:int -> len:int -> unit
However, when there is only one argument, it is often left unlabeled.
BytesLabels.create : int -> bytes
This principle also applies to functions of several arguments whose return type is a type variable, as long as the role of each argument is not ambiguous. Labeling such functions may lead to awkward error messages when one attempts to omit labels in an application, as we have seen with ListLabels.fold_left.
Here are some of the label names you will find throughout the libraries.
Label | Meaning |
f: | a function to be applied |
pos: | a position in a string, array or byte sequence |
len: | a length |
buf: | a byte sequence or string used as buffer |
src: | the source of an operation |
dst: | the destination of an operation |
init: | the initial value for an iterator |
cmp: | a comparison function, e.g. Pervasives.compare |
mode: | an operation mode or a flag list |
All these are only suggestions, but keep in mind that the choice of labels is essential for readability. Bizarre choices will make the program harder to maintain.
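As an illustration of these conventions, here is a sketch of a user-defined function labeled along the same lines. The function sub_list is hypothetical and is not part of the standard library: the list is the object and stays unlabeled, while the parameters use the labels pos: and len: from the table above.

```ocaml
(* Hypothetical function: returns the len elements of l starting at
   position pos, or fewer if the list is too short. *)
let sub_list l ~pos ~len =
  (* Skip the first n elements. *)
  let rec drop n = function
    | _ :: tl when n > 0 -> drop (n - 1) tl
    | rest -> rest
  in
  (* Keep at most n elements. *)
  let rec take n = function
    | x :: tl when n > 0 -> x :: take (n - 1) tl
    | _ -> []
  in
  take len (drop pos l)
```

Since the parameters are labeled, sub_list l ~pos:1 ~len:3 and sub_list ~len:3 ~pos:1 l denote the same application.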
Ideally, the right function name with the right labels should be enough to understand the function’s meaning. Since one can get this information with OCamlBrowser or the ocaml toplevel, the documentation is only used when a more detailed specification is needed.
Variants as presented in section 1.4 are a powerful tool to build data structures and algorithms. However they sometimes lack flexibility when used in modular programming. This is due to the fact that every constructor is assigned to a unique type when defined and used. Even if the same name appears in the definition of multiple types, the constructor itself belongs to only one type. Therefore, one cannot decide that a given constructor belongs to multiple types, or consider a value of some type to belong to some other type with more constructors.
With polymorphic variants, this original assumption is removed. That is, a variant tag does not belong to any type in particular, the type system will just check that it is an admissible value according to its use. You need not define a type before using a variant tag. A variant type will be inferred independently for each of its uses.
In programs, polymorphic variants work like usual ones. You just have to prefix their names with a backquote character `.
[`On; `Off];;
- : [> `Off | `On ] list = [`On; `Off]
`Number 1;;
- : [> `Number of int ] = `Number 1
let f = function `On -> 1 | `Off -> 0 | `Number n -> n;;
val f : [< `Number of int | `Off | `On ] -> int = <fun>
List.map f [`On; `Off];;
- : int list = [1; 0]
[>`Off|`On] list means that to match this list, you should at least be able to match `Off and `On, without argument. [<`On|`Off|`Number of int] means that f may be applied to `Off, `On (both without argument), or `Number n where n is an integer. The > and < inside the variant types show that they may still be refined, either by defining more tags or by allowing less. As such, they contain an implicit type variable. Because each of the variant types appears only once in the whole type, their implicit type variables are not shown.
The above variant types were polymorphic, allowing further refinement. When writing type annotations, one will most often describe fixed variant types, that is types that cannot be refined. This is also the case for type abbreviations. Such types do not contain < or >, but just an enumeration of the tags and their associated types, just like in a normal datatype definition.
type 'a vlist = [`Nil | `Cons of 'a * 'a vlist];;
type 'a vlist = [ `Cons of 'a * 'a vlist | `Nil ]
let rec map f : 'a vlist -> 'b vlist = function
  | `Nil -> `Nil
  | `Cons(a, l) -> `Cons(f a, map f l) ;;
val map : ('a -> 'b) -> 'a vlist -> 'b vlist = <fun>
Type-checking polymorphic variants is a subtle thing, and some expressions may result in more complex type information.
let f = function `A -> `C | `B -> `D | x -> x;;
val f : ([> `A | `B | `C | `D ] as 'a) -> 'a = <fun>
f `E;;
- : [> `A | `B | `C | `D | `E ] = `E
Here we are seeing two phenomena. First, since this matching is open (the last case catches any tag), we obtain the type [> `A | `B] rather than [< `A | `B] in a closed matching. Then, since x is returned as is, input and return types are identical. The notation as 'a denotes such type sharing. If we apply f to yet another tag `E, it gets added to the list.
let f1 = function `A x -> x = 1 | `B -> true | `C -> false
let f2 = function `A x -> x = "a" | `B -> true ;;
val f1 : [< `A of int | `B | `C ] -> bool = <fun>
val f2 : [< `A of string | `B ] -> bool = <fun>
let f x = f1 x && f2 x;;
val f : [< `A of string & int | `B ] -> bool = <fun>
Here f1 and f2 both accept the variant tags `A and `B, but the argument of `A is int for f1 and string for f2. In f’s type `C, only accepted by f1, disappears, but both argument types appear for `A as int & string. This means that if we pass the variant tag `A to f, its argument should be both int and string. Since there is no such value, f cannot be applied to `A, and `B is the only accepted input.
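To see the consequence of this conjunctive type in practice, here is a minimal sketch that repeats the definitions of f1, f2 and f and then applies f to the one tag it accepts. The name ok is only an illustrative binding.

```ocaml
let f1 = function `A x -> x = 1 | `B -> true | `C -> false
let f2 = function `A x -> x = "a" | `B -> true

(* The argument must be acceptable to both f1 and f2. *)
let f x = f1 x && f2 x

(* `B carries no argument, so it satisfies both functions. *)
let ok = f `B

(* Neither [f (`A 1)] nor [f (`A "a")] type-checks: the argument of `A
   would have to be an int and a string at the same time. *)
```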
Even if a value has a fixed variant type, one can still give it a larger type through coercions. Coercions are normally written with both the source type and the destination type, but in simple cases the source type may be omitted.
type 'a wlist = [`Nil | `Cons of 'a * 'a wlist | `Snoc of 'a wlist * 'a];;
type 'a wlist = [ `Cons of 'a * 'a wlist | `Nil | `Snoc of 'a wlist * 'a ]
let wlist_of_vlist l = (l : 'a vlist :> 'a wlist);;
val wlist_of_vlist : 'a vlist -> 'a wlist = <fun>
let open_vlist l = (l : 'a vlist :> [> 'a vlist]);;
val open_vlist : 'a vlist -> [> 'a vlist ] = <fun>
fun x -> (x :> [`A|`B|`C]);;
- : [< `A | `B | `C ] -> [ `A | `B | `C ] = <fun>
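The following sketch shows why such a coercion is needed in practice. It reuses the vlist and wlist types and wlist_of_vlist from above; the function wlength and the values v, w and n are hypothetical names introduced only for this illustration.

```ocaml
type 'a vlist = [ `Nil | `Cons of 'a * 'a vlist ]
type 'a wlist = [ `Nil | `Cons of 'a * 'a wlist | `Snoc of 'a wlist * 'a ]

let wlist_of_vlist l = (l : 'a vlist :> 'a wlist)

(* Hypothetical helper: number of elements in a wlist. *)
let rec wlength : 'a wlist -> int = function
  | `Nil -> 0
  | `Cons (_, tl) -> 1 + wlength tl
  | `Snoc (init, _) -> wlength init + 1

(* A value with the fixed type int vlist. *)
let v : int vlist = `Cons (1, `Cons (2, `Nil))

(* [wlength v] would be rejected, since int vlist and int wlist are
   distinct fixed types; the coercion makes the widening explicit. *)
let w : int wlist = `Snoc (wlist_of_vlist v, 3)

let n = wlength w
```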
You may also selectively coerce values through pattern matching.
let split_cases = function
  | `Nil | `Cons _ as x -> `A x
  | `Snoc _ as x -> `B x ;;
val split_cases : [< `Cons of 'a | `Nil | `Snoc of 'b ] -> [> `A of [> `Cons of 'a | `Nil ] | `B of [> `Snoc of 'b ] ] = <fun>
When an or-pattern composed of variant tags is wrapped inside an alias-pattern, the alias is given a type containing only the tags enumerated in the or-pattern. This allows for many useful idioms, like incremental definition of functions.
let num x = `Num x
let eval1 eval (`Num x) = x
let rec eval x = eval1 eval x ;;
val num : 'a -> [> `Num of 'a ] = <fun>
val eval1 : 'a -> [< `Num of 'b ] -> 'b = <fun>
val eval : [< `Num of 'a ] -> 'a = <fun>
let plus x y = `Plus(x,y)
let eval2 eval = function
  | `Plus(x,y) -> eval x + eval y
  | `Num _ as x -> eval1 eval x
let rec eval x = eval2 eval x ;;
val plus : 'a -> 'b -> [> `Plus of 'a * 'b ] = <fun>
val eval2 : ('a -> int) -> [< `Num of int | `Plus of 'a * 'a ] -> int = <fun>
val eval : ([< `Num of int | `Plus of 'a * 'a ] as 'a) -> int = <fun>
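The same pattern can be repeated to add further cases. The sketch below is one possible continuation: it restates num, plus, eval1 and eval2 as given above and adds a third layer for a `Neg tag. The names neg, eval3 and result are assumptions made for this example and do not come from the text above.

```ocaml
let num x = `Num x
let plus x y = `Plus (x, y)

(* First layer: numbers only. The recursion argument is unused here. *)
let eval1 _eval (`Num x) = x

(* Second layer: adds `Plus and delegates `Num to eval1. *)
let eval2 eval = function
  | `Plus (x, y) -> eval x + eval y
  | `Num _ as x -> eval1 eval x

(* Hypothetical third layer: adds `Neg and delegates the rest to eval2.
   The alias x only has the tags listed in the or-pattern. *)
let neg x = `Neg x
let eval3 eval = function
  | `Neg x -> - (eval x)
  | (`Num _ | `Plus _) as x -> eval2 eval x

(* Tie the recursive knot once, at the outermost layer. *)
let rec eval x = eval3 eval x

(* 1 + -(2 + 3) *)
let result = eval (plus (num 1) (neg (plus (num 2) (num 3))))
```

Each layer takes the final evaluator as a parameter, so subterms are always evaluated with the most complete version.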
To make this even more comfortable, you may use type definitions as abbreviations for or-patterns. That is, if you have defined type myvariant = [`Tag1 of int | `Tag2 of bool], then the pattern #myvariant is equivalent to writing (`Tag1(_ : int) | `Tag2(_ : bool)).
Such abbreviations may be used alone,
let f = function
  | #myvariant -> "myvariant"
  | `Tag3 -> "Tag3";;
val f : [< `Tag1 of int | `Tag2 of bool | `Tag3 ] -> string = <fun>
or combined with aliases.
let g1 = function `Tag1 _ -> "Tag1" | `Tag2 _ -> "Tag2";;
val g1 : [< `Tag1 of 'a | `Tag2 of 'b ] -> string = <fun>
let g = function
  | #myvariant as x -> g1 x
  | `Tag3 -> "Tag3";;
val g : [< `Tag1 of int | `Tag2 of bool | `Tag3 ] -> string = <fun>
After seeing the power of polymorphic variants, one may wonder why they were added to core language variants, rather than replacing them.
The answer is twofold. The first aspect is that, while polymorphic variants are pretty efficient, the lack of static type information allows for fewer optimizations and makes them slightly heavier than core language variants. However, noticeable differences would only appear on huge data structures.
More important is the fact that polymorphic variants, while being type-safe, result in a weaker type discipline. That is, core language variants do much more than ensure type-safety: they also check that you use only declared constructors, that all constructors present in a data structure are compatible, and they enforce typing constraints on their parameters.
For this reason, you must be more careful about making types explicit when you use polymorphic variants. When you write a library, this is easy since you can describe exact types in interfaces, but for simple programs you are probably better off with core language variants.
Beware also that some idioms make trivial errors very hard to find. For instance, the following code is probably wrong but the compiler has no way to see it.
type abc = [`A | `B | `C] ;;
type abc = [ `A | `B | `C ]
let f = function
  | `As -> "A"
  | #abc -> "other" ;;
val f : [< `A | `As | `B | `C ] -> string = <fun>
let f : abc -> string = f ;;
val f : abc -> string = <fun>
You can avoid such risks by annotating the definition itself.
let f : abc -> string = function
  | `As -> "A"
  | #abc -> "other" ;;
Error: This pattern matches values of type [? `As ]
       but a pattern was expected which matches values of type abc
       The second variant type does not allow tag(s) `As
This chapter describes OCamldoc, a tool that generates documentation from special comments embedded in source files. The comments used by OCamldoc are of the form (**…*) and follow the format described in section 15.2.
OCamldoc can produce documentation in various formats: HTML, LaTeX, TeXinfo, Unix man pages, and dot dependency graphs. Moreover, users can add their own custom generators, as explained in section 15.3.
In this chapter, we use the word element to refer to any of the following parts of an OCaml source file: a type declaration, a value, a module, an exception, a module type, a type constructor, a record field, a class, a class type, a class method, a class value or a class inheritance clause.
OCamldoc is invoked via the command ocamldoc, as follows:
ocamldoc options sourcefiles
The following options determine the format for the generated documentation.
OCamldoc calls the OCaml type-checker to obtain type information. The following options impact the type-checking phase. They have the same meaning as for the ocamlc and ocamlopt commands.
The following options apply in conjunction with the -html option:
module M : functor (A:Module) -> functor (B:Module2) -> sig .. end
is displayed as:
module M (A:Module) (B:Module2) : sig .. end
The following options apply in conjunction with the -latex option:
These options are useful when you have, for example, a type and a value with the same name. If you do not specify prefixes, LaTeX will complain about multiply defined labels.
The following options apply in conjunction with the -texi option:
The following options apply in conjunction with the -dot option:
The following options apply in conjunction with the -man option:
Information on a module can be extracted either from the .mli or .ml file, or both, depending on the files given on the command line. When both .mli and .ml files are given for the same module, information extracted from these files is merged according to the following rules:
The following rules must be respected in order to avoid name clashes resulting in cross-reference errors:
open Foo (* which has a module Bar with a value x *)
module Foo =
  struct
    module Bar =
      struct
        let x = 1
      end
  end
let dummy = Bar.x
In this case, OCamldoc will associate Bar.x to the x of module Foo defined just above, instead of to the Bar.x defined in the opened module Foo.
Comments containing documentation material are called special comments and are written between (** and *). Special comments must start exactly with (**. Comments beginning with ( and more than two * are ignored.
OCamldoc can associate comments to some elements of the language encountered in the source files. The association is made according to the locations of comments with respect to the language elements. The locations of comments in .mli and .ml files are different.
A special comment is associated to an element if it is placed before or after the element.
A special comment before an element is associated to this element if :
A special comment after an element is associated to this element if there is no blank line or comment between the special comment and the element.
There are two exceptions: for constructors and record fields in type definitions, the associated comment can only be placed after the constructor or field definition, without blank lines or other comments between them. The special comment for a constructor with another constructor following must be placed before the ’|’ character separating the two constructors.
The following sample interface file foo.mli illustrates the placement rules for comments in .mli files.
(** The first special comment of the file is the comment associated
    with the whole module.*)


(** Special comments can be placed between elements and are kept
    by the OCamldoc tool, but are not associated to any element.
    @-tags in these comments are ignored.*)

(*******************************************************************)
(** Comments like the one above, with more than two asterisks,
    are ignored. *)

(** The comment for function f. *)
val f : int -> int -> int
(** The continuation of the comment for function f. *)

(** Comment for exception My_exception, even with a simple comment
    between the special comment and the exception.*)
(* Hello, I'm a simple comment :-) *)
exception My_exception of (int -> int) * int

(** Comment for type weather *)
type weather =
| Rain of int (** The comment for constructor Rain *)
| Sun (** The comment for constructor Sun *)

(** Comment for type weather2 *)
type weather2 =
| Rain of int (** The comment for constructor Rain *)
| Sun (** The comment for constructor Sun *)
(** I can continue the comment for type weather2 here
    because there is already a comment associated to the last constructor.*)

(** The comment for type my_record *)
type my_record = {
    foo : int ;    (** Comment for field foo *)
    bar : string ; (** Comment for field bar *)
  }
  (** Continuation of comment for type my_record *)

(** Comment for foo *)
val foo : string
(** This comment is associated to foo and not to bar. *)
val bar : string
(** This comment is associated to bar. *)

(** The comment for class my_class *)
class my_class :
  object
    (** A comment to describe inheritance from cl *)
    inherit cl

    (** The comment for attribute tutu *)
    val mutable tutu : string

    (** The comment for attribute toto. *)
    val toto : int

    (** This comment is not attached to titi since
        there is a blank line before titi, but is kept
        as a comment in the class. *)

    val titi : string

    (** Comment for method toto *)
    method toto : string

    (** Comment for method m *)
    method m : float -> int
  end

(** The comment for the class type my_class_type *)
class type my_class_type =
  object
    (** The comment for variable x. *)
    val mutable x : int

    (** The comment for method m. *)
    method m : int -> int
  end

(** The comment for module Foo *)
module Foo :
  sig
    (** The comment for x *)
    val x : int

    (** A special comment that is kept but not associated to any element *)
  end

(** The comment for module type my_module_type. *)
module type my_module_type =
  sig
    (** The comment for value x. *)
    val x : int

    (** The comment for module M. *)
    module M :
      sig
        (** The comment for value y. *)
        val y : int

        (* ... *)
      end

  end
A special comment is associated to an element if it is placed before the element and there is no blank line between the comment and the element. Meanwhile, there can be a simple comment between the special comment and the element. There are two exceptions, for constructors and record fields in type definitions, whose associated comment must be placed after the constructor or field definition, without blank line between them. The special comment for a constructor with another constructor following must be placed before the ’|’ character separating the two constructors.
The following example of file toto.ml shows where to place comments in a .ml file.
(** The first special comment of the file is the comment associated
    to the whole module. *)

(** The comment for function f *)
let f x y = x + y

(** This comment is not attached to any element since there is another
    special comment just before the next element. *)

(** Comment for exception My_exception, even with a simple comment
    between the special comment and the exception.*)
(* A simple comment. *)
exception My_exception of (int -> int) * int

(** Comment for type weather *)
type weather =
| Rain of int (** The comment for constructor Rain *)
| Sun (** The comment for constructor Sun *)

(** The comment for type my_record *)
type my_record = {
    foo : int ;    (** Comment for field foo *)
    bar : string ; (** Comment for field bar *)
  }

(** The comment for class my_class *)
class my_class =
  object
    (** A comment to describe inheritance from cl *)
    inherit cl

    (** The comment for the instance variable tutu *)
    val mutable tutu = "tutu"
    (** The comment for toto *)
    val toto = 1
    val titi = "titi"
    (** Comment for method toto *)
    method toto = tutu ^ "!"
    (** Comment for method m *)
    method m (f : float) = 1
  end

(** The comment for class type my_class_type *)
class type my_class_type =
  object
    (** The comment for the instance variable x. *)
    val mutable x : int
    (** The comment for method m. *)
    method m : int -> int
  end

(** The comment for module Foo *)
module Foo =
  struct
    (** The comment for x *)
    let x = 0
    (** A special comment in the module, but not associated to any element. *)
  end

(** The comment for module type my_module_type. *)
module type my_module_type =
  sig
    (* Comment for value x. *)
    val x : int
    (* ... *)
  end
The special comment (**/**) tells OCamldoc to discard elements placed after this comment, up to the end of the current class, class type, module or module type, or up to the next stop comment. For instance:
class type foo =
  object
    (** comment for method m *)
    method m : string

    (**/**)

    (** This method won't appear in the documentation *)
    method bar : int
  end

(** This value appears in the documentation, since the Stop special comment
    in the class does not affect the parent module of the class.*)
val foo : string

(**/**)
(** The value bar does not appear in the documentation.*)
val bar : string
(**/**)

(** The type t appears in the documentation since the previous stop comment
    toggled off the "no documentation mode". *)
type t = string
The -no-stop option to ocamldoc causes the Stop special comments to be ignored.
The inside of documentation comments (**…*) consists of free-form text with optional formatting annotations, followed by optional tags giving more specific information about parameters, version, authors, … The tags are distinguished by a leading @ character. Thus, a documentation comment has the following shape:
(** The comment begins with a description, which is text formatted
    according to the rules described in the next section.
    The description continues until the first non-escaped '@' character.
    @author Mr Smith
    @param x description for parameter x
*)
Some elements support only a subset of all @-tags. Tags that are not relevant to the documented element are simply ignored. For instance, all tags are ignored when documenting type constructors, record fields, and class inheritance clauses. Similarly, a @param tag on a class instance variable is ignored.
Finally, (**) is the empty documentation comment.
Here is the BNF grammar for the simple markup language used to format text descriptions.
|
text-element | ::= |
∣ | { { 0 … 9 }+ text } | format text as a section header; the integer following { indicates the sectioning level. |
∣ | { { 0 … 9 }+ : label text } | same, but also associate the name label to the current point. This point can be referenced by its fully-qualified label in a {! command, just like any other element. |
∣ | {b text } | set text in bold. |
∣ | {i text } | set text in italic. |
∣ | {e text } | emphasize text. |
∣ | {C text } | center text. |
∣ | {L text } | left align text. |
∣ | {R text } | right align text. |
∣ | {ul list } | build a list. |
∣ | {ol list } | build an enumerated list. |
∣ | {{: string } text } | put a link to the given address (given as string) on the given text. |
∣ | [ string ] | set the given string in source code style. |
∣ | {[ string ]} | set the given string in preformatted source code style. |
∣ | {v string v} | set the given string in verbatim style. |
∣ | {% string %} | target-specific content (LaTeX code by default, see details in 15.2.4.4) |
∣ | {! string } | insert a cross-reference to an element (see section 15.2.4.2 for the syntax of cross-references). |
∣ | {!modules: string string ... } | insert an index table for the given module names. Used in HTML only. |
∣ | {!indexlist} | insert a table of links to the various indexes (types, values, modules, ...). Used in HTML only. |
∣ | {^ text } | set text in superscript. |
∣ | {_ text } | set text in subscript. |
∣ | escaped-string | typeset the given string as is; special characters (’{’, ’}’, ’[’, ’]’ and ’@’) must be escaped by a ’\’ |
∣ | blank-line | force a new line. |
|
A shortcut syntax exists for lists and enumerated lists:
(** Here is a {b list}
- item 1
- item 2
- item 3

The list is ended by the blank line.*)
is equivalent to:
(** Here is a {b list}
{ul {- item 1}
{- item 2}
{- item 3}}
The list is ended by the blank line.*)
The same shortcut is available for enumerated lists, using ’+’ instead of ’-’. Note that only one list can be defined by this shortcut in nested lists.
Cross-references are fully qualified element names, as in the example {!Foo.Bar.t}. This is an ambiguous reference as it may designate a type name, a value name, a class name, etc. It is possible to make explicit the intended syntactic class, using {!type:Foo.Bar.t} to designate a type, and {!val:Foo.Bar.t} a value of the same name.
The list of possible syntactic classes is as follows:
tag | syntactic class |
module: | module |
modtype: | module type |
class: | class |
classtype: | class type |
val: | value |
type: | type |
exception: | exception |
attribute: | attribute |
method: | class method |
section: | ocamldoc section |
const: | variant constructor |
recfield: | record field |
In the case of variant constructors or record fields, the constructor or field name should be preceded by the name of the corresponding type, to avoid the ambiguity of several types having the same constructor names. For example, the constructor Node of the type tree will be referenced as {!tree.Node} or {!const:tree.Node}, or possibly {!Mod1.Mod2.tree.Node} from outside the module.
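The sketch below puts several of these reference forms into documentation comments. Only the names tree and Node come from the example above; Leaf, size and sample are hypothetical. The short, unqualified references assume that OCamldoc resolves names relative to the enclosing module, as the {!tree.Node} form suggests.

```ocaml
(** A binary tree of integers. *)
type tree =
  | Leaf (** The empty tree. *)
  | Node of tree * int * tree (** An inner node carrying one integer. *)

(** Count the {!const:tree.Node} constructors in a value of type
    {!type:tree}. *)
let rec size = function
  | Leaf -> 0
  | Node (l, _, r) -> size l + 1 + size r

(** A sample tree with two nodes; see {!val:size}. *)
let sample = Node (Leaf, 1, Node (Leaf, 2, Leaf))
```

The explicit prefixes type:, val: and const: matter once a type, a value and a constructor could share a name.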
In the description of a value, type, exception, module, module type, class or class type, the first sentence is sometimes used in indexes, or when just a part of the description is needed. The first sentence is composed of the first characters of the description, until the first period followed by a blank, or the first blank line, outside of the following text formatting: {ul list } , {ol list } , [ string ] , {[ string ]} , {v string v} , {% string %} , {! string } , {^ text } , {_ text } .
The content inside {%foo: ... %} is target-specific and will only be interpreted by the backend foo, and ignored by the others. The backends of the distribution are latex, html, texi and man. If no target is specified (syntax {% ... %}), latex is chosen by default. Custom generators may support their own target prefix.
The HTML tags <b>..</b>, <code>..</code>, <i>..</i>, <ul>..</ul>, <ol>..</ol>, <li>..</li>, <center>..</center> and <h[0-9]>..</h[0-9]> can be used instead of, respectively, {b ..} , [..] , {i ..} , {ul ..} , {ol ..} , {li ..} , {C ..} and {[0-9] ..}.
The following table gives the list of predefined @-tags, with their syntax and meaning.
@author string | The author of the element. One author per @author tag. There may be several @author tags for the same element. |
@deprecated text | The text should describe when the element was deprecated, what to use as a replacement, and possibly the reason for deprecation. |
@param id text | Associate the given description (text) to the given parameter name id. This tag is used for functions, methods, classes and functors. |
@raise Exc text | Explain that the element may raise the exception Exc. |
@return text | Describe the return value and its possible values. This tag is used for functions and methods. |
@see < URL > text | Add a reference to the URL with the given text as comment. |
@see 'filename' text | Add a reference to the given file name (written between single quotes), with the given text as comment. |
@see "document-name" text | Add a reference to the given document name (written between double quotes), with the given text as comment. |
@since string | Indicate when the element was introduced. |
@before version text | Associate the given description (text) to the given version in order to document compatibility issues. |
@version string | The version number for the element. |
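As a short illustration of several of these tags together, here is a documented function. The function checked_div and the version number 1.0 are made up for the example.

```ocaml
(** Integer division with an explicit check on the divisor.

    @param num the numerator
    @param den the denominator
    @return the quotient of [num] by [den]
    @raise Division_by_zero if [den] is [0]
    @since 1.0 *)
let checked_div num den =
  if den = 0 then raise Division_by_zero else num / den
```

The names given to @param match the parameter names in the definition, which is how each description is tied to its parameter.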
You can use custom tags in the documentation comments, but they will have no effect if the generator used does not handle them. To use a custom tag, for example foo, just put @foo with some text in your comment, as in:
(** My comment to show you a custom tag.
@foo this is the text argument to the [foo] custom tag.
*)
To handle custom tags, you need to define a custom generator, as explained in section 15.3.2.
OCamldoc operates in two steps:
Users can provide their own documentation generator to be used during step 2 instead of the default generators. All the information retrieved during the analysis step is available through the Odoc_info module, which gives access to all the types and functions representing the elements found in the given modules, with their associated description.
The files you can use to define custom generators are installed in the ocamldoc sub-directory of the OCaml standard library.
The type of a generator module depends on the kind of generated documentation. Here is the list of generator module types, with the name of the generator class in the module :
That is, to define a new generator, one must implement a module with the expected signature and the given generator class, providing the generate method as the entry point that makes the generator produce documentation for a given list of modules:
method generate : Odoc_info.Module.t_module list -> unit
This method will be called with the list of analysed and possibly merged Odoc_info.Module.t_module structures.
It is recommended to inherit from the current generator of the same kind as the one you want to define. Doing so, it is possible to load various custom generators to combine improvements brought by each one.
This is done using first-class modules (see section 7.10).
The easiest way to define a custom generator is to follow this example, which extends the current HTML generator. We do not need to know whether this is the original HTML generator defined in ocamldoc or one that has already been extended by a previously loaded custom generator:
module Generator (G : Odoc_html.Html_generator) =
struct
  class html =
    object(self)
      inherit G.html as html
      (* ... *)
      method generate module_list =
        (* ... *)
        ()
      (* ... *)
  end
end;;

let _ = Odoc_args.extend_html_generator (module Generator : Odoc_gen.Html_functor);;
To know which methods to override and/or which methods are available, have a look at the different base implementations, depending on the kind of generator you are extending :
Making a custom generator handle custom tags (see 15.2.5) is very simple.
Here is how to develop an HTML generator handling your custom tags.
The class Odoc_html.Generator.html inherits from the class Odoc_html.info, which contains a field tag_functions: a list of pairs, each composed of a custom tag (e.g. "foo") and a function taking a text and returning HTML code (of type string). To handle a new tag bar, extend the current HTML generator and complete the tag_functions field:
module Generator (G : Odoc_html.Html_generator) =
struct
  class html =
    object(self)
      inherit G.html

      (** Return HTML code for the given text of a bar tag. *)
      method html_of_bar t = (* your code here *)

      initializer
        tag_functions <- ("bar", self#html_of_bar) :: tag_functions
  end
end

let _ = Odoc_args.extend_html_generator (module Generator : Odoc_gen.Html_functor);;
Another method of the class Odoc_html.info will look for the function associated to a custom tag and apply it to the text given to the tag. If no function is associated to a custom tag, then the method prints a warning message on stderr.
You can act the same way for other kinds of generators.
The command line analysis is performed after loading the module containing the documentation generator, thus allowing command line options to be added to the list of existing ones. Adding an option can be done with the function
Odoc_args.add_option : string * Arg.spec * string -> unit
Note: Existing command line options can be redefined using this function.
Let custom.ml be the file defining a new generator class. Compilation of custom.ml can be performed by the following command :
ocamlc -I +ocamldoc -c custom.ml
The file custom.cmo is created and can be used this way :
ocamldoc -g custom.cmo other-options source-files
Options selecting a built-in generator to ocamldoc, such as -html, have no effect if a custom generator of the same kind is provided using -g. If the kinds do not match, the selected built-in generator is used and the custom one is ignored.
It is possible to define a generator class in several modules, which are defined in several files file1.ml[i], file2.ml[i], ..., filen.ml[i]. A .cma library file must be created, including all these files.
The following commands create the custom.cma file from files file1.ml[i], ..., filen.ml[i] :
ocamlc -I +ocamldoc -c file1.ml[i]
ocamlc -I +ocamldoc -c file2.ml[i]
...
ocamlc -I +ocamldoc -c filen.ml[i]
ocamlc -o custom.cma -a file1.cmo file2.cmo ... filen.cmo
Then, the following command uses custom.cma as custom generator:
ocamldoc -g custom.cma other-options source-files
This chapter describes the toplevel system for OCaml, which permits interactive use of the OCaml system through a read-eval-print loop. In this mode, the system repeatedly reads OCaml phrases from the input, then typechecks, compiles and evaluates them, then prints the inferred type and result value, if any. The system prints a # (sharp) prompt before reading each phrase.
Input to the toplevel can span several lines. It is terminated by ;; (a double-semicolon). The toplevel input consists of one or several toplevel phrases, with the following syntax:
|
A phrase can consist of a definition, like those found in implementations of compilation units or in struct … end module expressions. The definition can bind value names, type names, an exception, a module name, or a module type name. The toplevel system performs the bindings, then prints the types and values (if any) for the names thus defined.
A phrase may also consist of a value expression (section 6.7). It is simply evaluated without performing any bindings, and its value is printed.
Finally, a phrase can also consist of a toplevel directive, starting with # (the sharp sign). These directives control the behavior of the toplevel; they are listed below in section 9.2.
Unix: The toplevel system is started by the command ocaml, as follows:
ocaml options objects # interactive mode
ocaml options objects scriptfile # script mode
options are described below. objects are filenames ending in .cmo or .cma; they are loaded into the interpreter immediately after options are set. scriptfile is any file name not ending in .cmo or .cma.
If no scriptfile is given on the command line, the toplevel system enters interactive mode: phrases are read on standard input, results are printed on standard output, errors on standard error. End-of-file on standard input terminates ocaml (see also the #quit directive in section 9.2).
On start-up (before the first phrase is read), if the file .ocamlinit exists in the current directory, its contents are read as a sequence of OCaml phrases and executed as per the #use directive described in section 9.2. The evaluation outcomes for the phrases are not displayed. If the current directory does not contain an .ocamlinit file, but the user’s home directory (environment variable HOME) does, the latter is read and executed as described below.
The toplevel system does not perform line editing, but it can easily be used in conjunction with an external line editor such as ledit, ocaml2 or rlwrap (see the Caml Hump). Another option is to use ocaml under Gnu Emacs, which gives the full editing power of Emacs (command run-caml from library inf-caml).
At any point, the parsing, compilation or evaluation of the current phrase can be interrupted by pressing ctrl-C (or, more precisely, by sending the INTR signal to the ocaml process). The toplevel then immediately returns to the # prompt.
If scriptfile is given on the command-line to ocaml, the toplevel system enters script mode: the contents of the file are read as a sequence of OCaml phrases and executed, as per the #use directive (section 9.2). The outcome of the evaluation is not printed. On reaching the end of file, the ocaml command exits immediately. No commands are read from standard input. Sys.argv is transformed, ignoring all OCaml parameters, and starting with the script file name in Sys.argv.(0).
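To make the handling of Sys.argv concrete, here is a minimal script sketch. The file name args.ml and the invocation shown in the comment are assumptions for illustration; the names script_name and user_args are likewise hypothetical.

```ocaml
(* Hypothetical script, saved for instance as args.ml and run as:
     ocaml args.ml first second
   In script mode, Sys.argv.(0) is the script file name and the
   remaining entries are the arguments that follow it. *)
let script_name = Sys.argv.(0)

let user_args =
  Array.to_list (Array.sub Sys.argv 1 (Array.length Sys.argv - 1))

let () =
  Printf.printf "script: %s\n" script_name;
  List.iteri (fun i a -> Printf.printf "arg %d: %s\n" (i + 1) a) user_args
```

Since the outcome of evaluation is not printed in script mode, the script has to produce its own output explicitly, here with Printf.printf.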
In script mode, the first line of the script is ignored if it starts with #!. Thus, it should be possible to make the script itself executable and put as first line #!/usr/local/bin/ocaml, thus calling the toplevel system automatically when the script is run. However, ocaml itself is a #! script on most installations of OCaml, and Unix kernels usually do not handle nested #! scripts. A better solution is to put the following as the first line of the script:
#!/usr/local/bin/ocamlrun /usr/local/bin/ocaml
The following command-line options are recognized by the ocaml command.
If the given directory starts with +, it is taken relative to the standard library directory. For instance, -I +labltk adds the subdirectory labltk of the standard library to the search path.
Directories can also be added to the list once the toplevel is running with the #directory directive (section 9.2).
The warning-list argument is a sequence of warning specifiers, with no separators between them. A warning specifier is one of the following:
Warning numbers and letters which are out of the range of warnings that are currently defined are ignored. The warnings are as follows.
The default setting is -w +a-4-6-7-9-27-29-32..39-41..42-44-45-48-50. It is displayed by -help. Note that warnings 5 and 10 are not always triggered, depending on the internals of the type checker.
Note: it is not recommended to use warning sets (i.e. letters) as arguments to -warn-error in production code, because this can break your build when future versions of OCaml add some new warnings.
The default setting is -warn-error -a+31 (only warning 31 is fatal).
Unix: The following environment variables are also consulted:
- TERM
- When printing error messages, the toplevel system attempts to underline visually the location of the error. It consults the TERM variable to determine the type of output terminal and looks up its capabilities in the terminal database.
- HOME
- Directory where the .ocamlinit file is searched.
The following directives control the toplevel behavior, load files in memory, and trace program execution.
Note: all directives start with a # (sharp) symbol. This # must be typed before the directive, and must not be confused with the # prompt displayed by the interactive loop. For instance, typing #quit;; will exit the toplevel loop, but typing quit;; will result in an “unbound value quit” error.
For directives that take file names as arguments, if the given file name specifies no directory, the file is searched in the following directories:
The printing function printer-name should have type Format.formatter -> t -> unit, where t is the type for the values to be printed, and should output its textual representation for the value of type t on the given formatter, using the functions provided by the Format library. For backward compatibility, printer-name can also have type t -> unit and should then output on the standard formatter, but this usage is deprecated.
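As a minimal sketch of such a printing function, assume a hypothetical record type point (not part of this manual) and a printer named pp_point. The printer has the type Format.formatter -> point -> unit described above; show_point is an extra helper, added only so that the printer can be exercised outside the toplevel.

```ocaml
(* Hypothetical type, introduced only for this illustration. *)
type point = { x : int; y : int }

(* Printer of type Format.formatter -> point -> unit: it writes the
   textual representation of its argument on the given formatter. *)
let pp_point fmt p = Format.fprintf fmt "(%d, %d)" p.x p.y

(* Helper (an assumption of this sketch): run the printer into a string. *)
let show_point p = Format.asprintf "%a" pp_point p
```

In the toplevel, such a printer would then be registered by typing #install_printer pp_point;;.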
Toplevel phrases can refer to identifiers defined in compilation units with the same mechanisms as for separately compiled units: either by using qualified names (Modulename.localname), or by using the open construct and unqualified names (see section 6.3).
However, before referencing another compilation unit, an implementation of that unit must be present in memory. At start-up, the toplevel system contains implementations for all the modules in the standard library. Implementations for user modules can be entered with the #load directive described above. Referencing a unit for which no implementation has been provided results in the error Reference to undefined global `...'.
Note that entering open Mod merely accesses the compiled interface (.cmi file) for Mod, but does not load the implementation of Mod, and does not cause any error if no implementation of Mod has been loaded. The error “reference to undefined global Mod” will occur only when executing a value or module definition that refers to Mod.
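The two ways of referring to identifiers of another unit can be sketched with the standard library module List, whose implementation is already in memory at start-up. The names n1 and n2 are chosen only for this illustration.

```ocaml
(* Qualified name: Modulename.localname. *)
let n1 = List.length [1; 2; 3]

(* Unqualified name, after opening the module. *)
open List
let n2 = length [1; 2; 3]
```

Both definitions refer to the same function, so n1 and n2 hold the same value.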
This section describes and explains the most frequently encountered error messages.
If filename has the format mod.cmi, this means you have referenced the compilation unit mod, but its compiled interface could not be found. Fix: compile mod.mli or mod.ml first, to create the compiled interface mod.cmi.
If filename has the format mod.cmo, this means you are trying to load with #load a bytecode object file that does not exist yet. Fix: compile mod.ml first.
If your program spans several directories, this error can also appear because you haven’t specified the directories to look into. Fix: use the #directory directive to add the correct directories to the search path.
The ocamlmktop command builds OCaml toplevels that contain user code preloaded at start-up.
The ocamlmktop command takes as argument a set of .cmo and .cma files, and links them with the object files that implement the OCaml toplevel. The typical use is:
ocamlmktop -o mytoplevel foo.cmo bar.cmo gee.cmo
This creates the bytecode file mytoplevel, containing the OCaml toplevel system, plus the code from the three .cmo files. This toplevel is directly executable and is started by:
./mytoplevel
This enters a regular toplevel loop, except that the code from foo.cmo, bar.cmo and gee.cmo is already loaded in memory, just as if you had typed:
#load "foo.cmo";;
#load "bar.cmo";;
#load "gee.cmo";;
on entrance to the toplevel. The modules Foo, Bar and Gee are not opened, though; you still have to do
open Foo;;
yourself, if this is what you wish.
The following command-line options are recognized by ocamlmktop.
The dynlink library supports type-safe dynamic loading and linking of bytecode object files (.cmo and .cma files) in a running bytecode program, or of native plugins (usually .cmxs files) in a running native program. Type safety is ensured by limiting the set of modules from the running program that the loaded object file can access, and checking that the running program and the loaded object file have been compiled against the same interfaces for these modules. In native code, there are also some compatibility checks on the implementations (to avoid errors with cross-module optimizations); it might be useful to hide .cmx files when building native plugins so that they remain independent of the implementation of modules in the main program.
Programs that use the dynlink library simply need to link dynlink.cma or dynlink.cmxa with their object files and other libraries.
This section describes the kinds of values that are manipulated by OCaml programs.
Integer values are integer numbers from −2³⁰ to 2³⁰−1, that is −1073741824 to 1073741823. The implementation may support a wider range of integer values: on 64-bit platforms, the current implementation supports integers ranging from −2⁶² to 2⁶²−1.
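The bounds above can be checked on a given platform with a short sketch. It relies on Sys.int_size (the number of bits of an int, available since OCaml 4.03) and on the predefined max_int and min_int; the other names are introduced only for this illustration.

```ocaml
(* Number of bits in an int: 31 on 32-bit platforms, 63 on 64-bit ones. *)
let bits = Sys.int_size

(* 2^(bits-2), which is small enough to be computed without overflow. *)
let half = 1 lsl (bits - 2)

(* Bounds rebuilt from the formulas in the text. *)
let expected_max = half - 1 + half   (* 2^(bits-1) - 1 *)
let expected_min = - half - half     (* -2^(bits-1) *)

let bounds_match = (max_int = expected_max) && (min_int = expected_min)

(* Integer arithmetic wraps around silently at the bounds. *)
let wraps = (max_int + 1 = min_int)
```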
Floating-point values are numbers in floating-point representation. The current implementation uses double-precision floating-point numbers conforming to the IEEE 754 standard, with 53 bits of mantissa and an exponent ranging from −1022 to 1023.
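The mantissa and exponent figures can be observed with the predefined constants min_float, max_float and epsilon_float. The sketch below assumes IEEE 754 double precision as stated above; the names bound with let are illustrative.

```ocaml
(* ldexp x n computes x * 2^n; the results below are exact. *)
let two_52 = ldexp 1.0 52
let two_53 = ldexp 1.0 53

(* With 53 bits of mantissa, 2^52 + 1 is representable ... *)
let exact_at_52 = (two_52 +. 1.0 <> two_52)
(* ... but 2^53 + 1 is not: the sum rounds back to 2^53. *)
let rounded_at_53 = (two_53 +. 1.0 = two_53)

(* Smallest positive normalized float and largest finite float. *)
let min_is_2_pow_minus_1022 = (min_float = ldexp 1.0 (-1022))
let max_uses_exponent_1023 = (max_float = ldexp (2.0 -. epsilon_float) 1023)
```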
Character values are represented as 8-bit integers between 0 and 255. Character codes between 0 and 127 are interpreted following the ASCII standard. The current implementation interprets character codes between 128 and 255 following the ISO 8859-1 standard.
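A small sketch of the correspondence between characters and their 8-bit codes, using Char.code and Char.chr from the standard library (the bound names are illustrative):

```ocaml
(* 'A' has ASCII code 65. *)
let code_of_a = Char.code 'A'

(* 255 is the largest valid character code. *)
let last_char = Char.chr 255

(* Codes outside 0..255 are rejected with Invalid_argument. *)
let out_of_range =
  try ignore (Char.chr 256); false with Invalid_argument _ -> true
```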
String values are finite sequences of characters. The current implementation supports strings containing up to 2²⁴ − 5 characters (16777211 characters); on 64-bit platforms, the limit is 2⁵⁷ − 9.
Tuples of values are written (v1, …, vn), standing for the n-tuple of values v1 to vn. The current implementation supports tuples of up to 2²² − 1 elements (4194303 elements).
Record values are labeled tuples of values. The record value written { field1 = v1; …; fieldn = vn } associates the value vi to the record field fieldi, for i = 1 … n. The current implementation supports records with up to 2²² − 1 fields (4194303 fields).
Arrays are finite, variable-sized sequences of values of the same type. The current implementation supports arrays containing up to 2²² − 1 elements (4194303 elements) unless the elements are floating-point numbers (2097151 elements in this case); on 64-bit platforms, the limit is 2⁵⁴ − 1 for all arrays.
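The string and array limits are exposed at run time as Sys.max_string_length and Sys.max_array_length. The sketch below only checks a relation that the figures quoted above satisfy on both 32-bit and 64-bit platforms; it is an observation about those numbers, not a statement about how the runtime computes them.

```ocaml
let bytes_per_word = Sys.word_size / 8

(* 4 * (2^22 - 1) - 1 = 2^24 - 5 and 8 * (2^54 - 1) - 1 = 2^57 - 9. *)
let limits_agree =
  Sys.max_string_length = bytes_per_word * Sys.max_array_length - 1

(* Asking for one element too many is rejected with Invalid_argument. *)
let too_large_rejected =
  try ignore (Array.make (Sys.max_array_length + 1) 0); false
  with Invalid_argument _ -> true
```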
Variant values are either a constant constructor, or a non-constant constructor applied to a number of values. The former case is written constr; the latter case is written constr (v1, ... , vn ), where the vi are said to be the arguments of the non-constant constructor constr. The parentheses may be omitted if there is only one argument.
The following constants are treated like built-in constant constructors:
Constant | Constructor |
false | the boolean false |
true | the boolean true |
() | the “unit” value |
[] | the empty list |
The current implementation limits each variant type to have at most 246 non-constant constructors and 2³⁰−1 constant constructors.
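As an illustration of constant and non-constant constructors, here is a minimal sketch with a hypothetical type shape (the type and the function area are not part of this manual):

```ocaml
type shape =
  | Origin              (* constant constructor *)
  | Square of int       (* non-constant constructor, one argument *)
  | Rect of int * int   (* non-constant constructor, two arguments *)

let area = function
  | Origin -> 0
  | Square s -> s * s       (* parentheses omitted: single argument *)
  | Rect (w, h) -> w * h

(* Built-in constants such as [] are matched like constant constructors. *)
let is_empty = function [] -> true | _ -> false
```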
Polymorphic variants are an alternate form of variant values, not belonging explicitly to a predefined variant type, and following specific typing rules. They can be either constant, written `tag-name, or non-constant, written `tag-name(v).
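A minimal sketch of constant and non-constant polymorphic variants, used without any prior type declaration. The tag names and the function describe are hypothetical.

```ocaml
(* The tags `On, `Off and `Level are not declared anywhere beforehand. *)
let describe = function
  | `On -> "on"
  | `Off -> "off"
  | `Level n -> "level " ^ string_of_int n
```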
Functional values are mappings from values to values.
Objects are composed of a hidden internal state which is a record of instance variables, and a set of methods for accessing and modifying these variables. The structure of an object is described by the toplevel class that created it.
The ocamlrun command executes bytecode files produced by the linking phase of the ocamlc command.
The ocamlrun command comprises three main parts: the bytecode interpreter, that actually executes bytecode files; the memory allocator and garbage collector; and a set of C functions that implement primitive operations such as input/output.
The usage for ocamlrun is:
ocamlrun options bytecode-executable arg1 ... argn
The first non-option argument is taken to be the name of the file containing the executable bytecode. (That file is searched in the executable path as well as in the current directory.) The remaining arguments are passed to the OCaml program, in the string array Sys.argv. Element 0 of this array is the name of the bytecode executable file; elements 1 to n are the remaining arguments arg1 to argn.
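The layout of Sys.argv described above can be sketched as follows. The names program_name and user_args are introduced only for this illustration.

```ocaml
(* Element 0 is the name of the bytecode executable file. *)
let program_name = Sys.argv.(0)

(* Elements 1 to n are the remaining arguments arg1 ... argn. *)
let user_args =
  Array.to_list (Array.sub Sys.argv 1 (Array.length Sys.argv - 1))

let () =
  Printf.printf "program: %s\n" program_name;
  List.iteri (fun i a -> Printf.printf "arg%d = %s\n" (i + 1) a) user_args
```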
As mentioned in chapter 8, the bytecode executable files produced by the ocamlc command are self-executable, and manage to launch the ocamlrun command on themselves automatically. That is, assuming a.out is a bytecode executable file,
a.out arg1 ... argn
works exactly as
ocamlrun a.out arg1 ... argn
Notice that it is not possible to pass options to ocamlrun when invoking a.out directly.
Windows: Under several versions of Windows, bytecode executable files are self-executable only if their name ends in .exe. It is recommended to always give .exe names to bytecode executables, e.g. compile with ocamlc -o myprog.exe ... rather than ocamlc -o myprog ....
The following command-line options are recognized by ocamlrun.
The following environment variables are also consulted:
If the option letter is not recognized, the whole parameter is ignored; if the equal sign or the number is missing, the value is taken as 1; if the multiplier is not recognized, it is ignored.
For example, on a 32-bit machine, under bash the command
export OCAMLRUNPARAM='b,s=256k,v=0x015'
tells a subsequent ocamlrun to print backtraces for uncaught exceptions, set its initial minor heap size to 1 megabyte and print a message at the start of each major GC cycle, when the heap size changes, and when compaction is triggered.
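For comparison, part of this setting can also be requested from inside a program through the standard Printexc and Gc modules. This is only a sketch of the correspondence, not a replacement for OCAMLRUNPARAM; the names minor_heap_words and minor_heap_bytes are introduced for illustration.

```ocaml
(* s=256k means 256 * 1024 words; the byte size depends on the word size. *)
let minor_heap_words = 256 * 1024
let minor_heap_bytes = minor_heap_words * (Sys.word_size / 8)

let () =
  (* Counterpart of the b flag. *)
  Printexc.record_backtrace true;
  (* Counterpart of s=256k. The v=... flags correspond to the verbose
     field of Gc.control, left unchanged here to keep the output quiet. *)
  Gc.set { (Gc.get ()) with Gc.minor_heap_size = minor_heap_words }
```

With 4-byte words, minor_heap_bytes is 1048576, which is the 1 megabyte mentioned above; with 8-byte words the same setting corresponds to twice that size.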
On platforms that support dynamic loading, ocamlrun can link dynamically with C shared libraries (DLLs) providing additional C primitives beyond those provided by the standard runtime system. The names for these libraries are provided at link time (as described in section 19.1.4), and recorded in the bytecode executable file; ocamlrun, then, locates these libraries and resolves references to their primitives when the bytecode executable program starts.
The ocamlrun command searches shared libraries in the following directories, in the order indicated:
This section describes and explains the most frequently encountered error messages.
To help you diagnose this error, run your program with the -v option to ocamlrun, or with the OCAMLRUNPARAM environment variable set to v=63. If it displays lots of “Growing stack…” messages, this is probably a looping recursive function. If it displays lots of “Growing heap…” messages, with the heap size growing slowly, this is probably an attempt to construct a data structure with too many (infinitely many?) cells. If it displays few “Growing heap…” messages, but with a huge increment in the heap size, this is probably an attempt to build an excessively large array, string or byte sequence.
(Chapter written by Jérôme Vouillon, Didier Rémy and Jacques Garrigue)
This chapter gives an overview of the object-oriented features of OCaml. Note that the relation between object, class and type in OCaml is very different from that in mainstream object-oriented languages like Java or C++, so that you should not assume that similar keywords mean the same thing.
3.1 Classes and objects
3.2 Immediate objects
3.3 Reference to self
3.4 Initializers
3.5 Virtual methods
3.6 Private methods
3.7 Class interfaces
3.8 Inheritance
3.9 Multiple inheritance
3.10 Parameterized classes
3.11 Polymorphic methods
3.12 Using coercions
3.13 Functional objects
3.14 Cloning objects
3.15 Recursive classes
3.16 Binary methods
3.17 Friends
The class point below defines one instance variable x and two methods get_x and move. The initial value of the instance variable is 0. The variable x is declared mutable, so the method move can change its value.
class point =
  object
    val mutable x = 0
    method get_x = x
    method move d = x <- x + d
  end;;
class point : object val mutable x : int method get_x : int method move : int -> unit end
We now create a new point p, instance of the point class.
let p = new point;;
val p : point = <obj>
Note that the type of p is point. This is an abbreviation automatically defined by the class definition above. It stands for the object type <get_x : int; move : int -> unit>, listing the methods of class point along with their types.
We now invoke some methods on p:
p#get_x;;
- : int = 0
p#move 3;;
- : unit = ()
p#get_x;;
- : int = 3
The evaluation of the body of a class only takes place at object creation time. Therefore, in the following example, the instance variable x is initialized to different values for two different objects.
let x0 = ref 0;;
val x0 : int ref = {contents = 0}
class point =
  object
    val mutable x = incr x0; !x0
    method get_x = x
    method move d = x <- x + d
  end;;
class point : object val mutable x : int method get_x : int method move : int -> unit end
new point#get_x;;
- : int = 1
new point#get_x;;
- : int = 2
The class point can also be abstracted over the initial values of the x coordinate.
class point = fun x_init ->
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end;;
class point : int -> object val mutable x : int method get_x : int method move : int -> unit end
As in function definitions, the definition above can be abbreviated as:
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end;;
class point : int -> object val mutable x : int method get_x : int method move : int -> unit end
The constructor new point is now a function that expects an initial value and returns a point object:
new point;;
- : int -> point = <fun>
let p = new point 7;;
val p : point = <obj>
The parameter x_init is, of course, visible in the whole body of the definition, including methods. For instance, the method get_offset in the class below returns the position of the object relative to its initial position.
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method get_offset = x - x_init
    method move d = x <- x + d
  end;;
class point : int -> object val mutable x : int method get_offset : int method get_x : int method move : int -> unit end
Expressions can be evaluated and bound before defining the object body of the class. This is useful to enforce invariants. For instance, points can be automatically adjusted to the nearest point on a grid, as follows:
class adjusted_point x_init =
  let origin = (x_init / 10) * 10 in
  object
    val mutable x = origin
    method get_x = x
    method get_offset = x - origin
    method move d = x <- x + d
  end;;
class adjusted_point : int -> object val mutable x : int method get_offset : int method get_x : int method move : int -> unit end
(One could also raise an exception if the x_init coordinate is not on the grid.) In fact, the same effect could here be obtained by calling the definition of class point with the value of the origin.
class adjusted_point x_init = point ((x_init / 10) * 10);;
class adjusted_point : int -> point
An alternate solution would have been to define the adjustment in a special allocation function:
let new_adjusted_point x_init = new point ((x_init / 10) * 10);;
val new_adjusted_point : int -> point = <fun>
However, the former pattern is generally more appropriate, since the code for adjustment is part of the definition of the class and will be inherited.
This ability provides class constructors as can be found in other languages. Several constructors can be defined this way to build objects of the same class but with different initialization patterns; an alternative is to use initializers, as described below in section 3.4.
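Returning to the remark that one could raise an exception when x_init is not on the grid, here is a minimal sketch. The class name grid_point and the error message are hypothetical; the check sits in a let-binding evaluated before the object is built.

```ocaml
class grid_point x_init =
  (* Evaluated before the object is constructed: reject off-grid values. *)
  let () =
    if x_init mod 10 <> 0 then
      invalid_arg "grid_point: x_init is not on the grid"
  in
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end
```

With this definition, new grid_point 20 yields an object, whereas new grid_point 17 raises Invalid_argument and no object is created.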
There is another, more direct way to create an object: create it without going through a class.
The syntax is exactly the same as for class expressions, but the result is a single object rather than a class. All the constructs described in the rest of this section also apply to immediate objects.
let p =
  object
    val mutable x = 0
    method get_x = x
    method move d = x <- x + d
  end;;
val p : < get_x : int; move : int -> unit > = <obj>
p#get_x;;
- : int = 0
p#move 3;;
- : unit = ()
p#get_x;;
- : int = 3
Unlike classes, which cannot be defined inside an expression, immediate objects can appear anywhere, using variables from their environment.
let minmax x y =
  if x < y then object method min = x method max = y end
  else object method min = y method max = x end;;
val minmax : 'a -> 'a -> < max : 'a; min : 'a > = <fun>
Immediate objects have two weaknesses compared to classes: their types are not abbreviated, and you cannot inherit from them. But these two weaknesses can be advantages in some situations, as we will see in sections 3.3 and 3.10.
A method or an initializer can send messages to self (that is, the current object). For that, self must be explicitly bound, here to the variable s (s could be any identifier, even though we will often choose the name self.)
class printable_point x_init =
  object (s)
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
    method print = print_int s#get_x
  end;;
class printable_point : int -> object val mutable x : int method get_x : int method move : int -> unit method print : unit end
let p = new printable_point 7;;
val p : printable_point = <obj>
p#print;;
7- : unit = ()
Dynamically, the variable s is bound at the invocation of a method. In particular, when the class printable_point is inherited, the variable s will be correctly bound to the object of the subclass.
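The dynamic binding of self can be sketched with two hypothetical classes, base and derived: the method describe is written once in base, yet its call to s#name reaches the definition of the subclass when invoked on a derived object.

```ocaml
class base =
  object (s)
    method name = "base"
    (* s#name is resolved on the object receiving the message. *)
    method describe = "I am " ^ s#name
  end

class derived =
  object
    inherit base
    method name = "derived"
  end
```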
A common problem with self is that, as its type may be extended in subclasses, you cannot fix it in advance. Here is a simple example.
let ints = ref [];;
val ints : '_a list ref = {contents = []}
class my_int =
  object (self)
    method n = 1
    method register = ints := self :: !ints
  end;;
Error: This expression has type < n : int; register : 'a; .. >
       but an expression was expected of type 'b
       Self type cannot escape its class
You can ignore the first two lines of the error message. What matters is the last one: putting self into an external reference would make it impossible to extend it through inheritance. We will see in section 3.12 a workaround to this problem. Note however that, since immediate objects are not extensible, the problem does not occur with them.
let my_int =
  object (self)
    method n = 1
    method register = ints := self :: !ints
  end;;
val my_int : < n : int; register : unit > = <obj>
Let-bindings within class definitions are evaluated before the object is constructed. It is also possible to evaluate an expression immediately after the object has been built. Such code is written as an anonymous hidden method called an initializer. Therefore, it can access self and the instance variables.
class printable_point x_init =
  let origin = (x_init / 10) * 10 in
  object (self)
    val mutable x = origin
    method get_x = x
    method move d = x <- x + d
    method print = print_int self#get_x
    initializer print_string "new point at "; self#print; print_newline ()
  end;;
class printable_point : int -> object val mutable x : int method get_x : int method move : int -> unit method print : unit end
let p = new printable_point 17;;
new point at 10
val p : printable_point = <obj>
Initializers cannot be overridden. Instead, all initializers are evaluated sequentially. Initializers are particularly useful to enforce invariants. Another example can be seen in section 5.1.
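The sequential evaluation of initializers can be observed with a small sketch. The classes parent and child, the buffer log and the function trace are hypothetical names used only here.

```ocaml
(* Records the order in which initializers run. *)
let log = Buffer.create 16

class parent =
  object
    initializer Buffer.add_string log "parent;"
  end

class child =
  object
    inherit parent
    (* Does not replace the inherited initializer: both are run. *)
    initializer Buffer.add_string log "child;"
  end

(* Create one child object and return what the initializers recorded. *)
let trace () =
  Buffer.clear log;
  ignore (new child);
  Buffer.contents log
```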
It is possible to declare a method without actually defining it, using the keyword virtual. This method will be provided later in subclasses. A class containing virtual methods must be flagged virtual, and cannot be instantiated (that is, no object of this class can be created). It still defines type abbreviations (treating virtual methods as other methods.)
class virtual abstract_point x_init =
  object (self)
    method virtual get_x : int
    method get_offset = self#get_x - x_init
    method virtual move : int -> unit
  end;;
class virtual abstract_point : int -> object method get_offset : int method virtual get_x : int method virtual move : int -> unit end
class point x_init =
  object
    inherit abstract_point x_init
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end;;
class point : int -> object val mutable x : int method get_offset : int method get_x : int method move : int -> unit end
Instance variables can also be declared as virtual, with the same effect as with methods.
class virtual abstract_point2 =
  object
    val mutable virtual x : int
    method move d = x <- x + d
  end;;
class virtual abstract_point2 : object val mutable virtual x : int method move : int -> unit end
class point2 x_init =
  object
    inherit abstract_point2
    val mutable x = x_init
    method get_offset = x - x_init
  end;;
class point2 : int -> object val mutable x : int method get_offset : int method move : int -> unit end
Private methods are methods that do not appear in object interfaces. They can only be invoked from other methods of the same object.
class restricted_point x_init =
  object (self)
    val mutable x = x_init
    method get_x = x
    method private move d = x <- x + d
    method bump = self#move 1
  end;;
class restricted_point : int -> object val mutable x : int method bump : unit method get_x : int method private move : int -> unit end
let p = new restricted_point 0;;
val p : restricted_point = <obj>
p#move 10;;
Error: This expression has type restricted_point
       It has no method move
p#bump;;
- : unit = ()
Note that this is not the same thing as private and protected methods in Java or C++, which can be called from other objects of the same class. This is a direct consequence of the independence between types and classes in OCaml: two unrelated classes may produce objects of the same type, and there is no way at the type level to ensure that an object comes from a specific class. However a possible encoding of friend methods is given in section 3.17.
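The independence between types and classes can be sketched with two hypothetical, unrelated classes, counter and stepper. Their object types both expand to < get : int; incr : unit >, so objects of either class can be used where the other is expected.

```ocaml
class counter =
  object
    val mutable n = 0
    method get = n
    method incr = n <- n + 1
  end

class stepper =
  object
    val mutable k = 100
    method get = k
    method incr = k <- k + 10
  end

(* The annotation mentions counter, yet stepper objects are accepted too. *)
let bump_all (objs : counter list) =
  List.iter (fun o -> o#incr) objs;
  List.map (fun o -> o#get) objs
```

For instance, bump_all [new counter; new stepper] typechecks even though the two classes are unrelated.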
Private methods are inherited (they are by default visible in subclasses), unless they are hidden by signature matching, as described below.
Private methods can be made public in a subclass.
class point_again x =
  object (self)
    inherit restricted_point x
    method virtual move : _
  end;;
class point_again : int -> object val mutable x : int method bump : unit method get_x : int method move : int -> unit end
The annotation virtual here is only used to mention a method without providing its definition. Since we didn’t add the private annotation, this makes the method public, keeping the original definition.
An alternative definition is
class point_again x =
  object (self : < move : _; ..> )
    inherit restricted_point x
  end;;
class point_again : int -> object val mutable x : int method bump : unit method get_x : int method move : int -> unit end
The constraint on self’s type requires a public move method, and this is sufficient to override private.
One could think that a private method should remain private in a subclass. However, since the method is visible in a subclass, it is always possible to pick its code and define a method of the same name that runs that code, so yet another (heavier) solution would be:
class point_again x =
  object
    inherit restricted_point x as super
    method move = super#move
  end;;
class point_again : int -> object val mutable x : int method bump : unit method get_x : int method move : int -> unit end
Of course, private methods can also be virtual. Then, the keywords must appear in this order: method private virtual.
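A minimal sketch of a private virtual method, with hypothetical classes walker and by_two: the method step is declared in walker with the keywords in the order given above, and defined in the subclass.

```ocaml
class virtual walker =
  object (self)
    val mutable x = 0
    method get_x = x
    (* Declared but not defined here; remains private in subclasses. *)
    method private virtual step : int
    method advance = x <- x + self#step
  end

class by_two =
  object
    inherit walker
    method private step = 2
  end

(* Advance a by_two object twice and report its position. *)
let position_after_two_steps () =
  let w = new by_two in
  w#advance;
  w#advance;
  w#get_x
```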
Class interfaces are inferred from class definitions. They may also be defined directly and used to restrict the type of a class. Like class declarations, they also define a new type abbreviation.
class type restricted_point_type =
  object
    method get_x : int
    method bump : unit
  end;;
class type restricted_point_type = object method bump : unit method get_x : int end
fun (x : restricted_point_type) -> x;;
- : restricted_point_type -> restricted_point_type = <fun>
In addition to program documentation, class interfaces can be used to constrain the type of a class. Both concrete instance variables and concrete private methods can be hidden by a class type constraint. Public methods and virtual members, however, cannot.
class restricted_point' x = (restricted_point x : restricted_point_type);;
class restricted_point' : int -> restricted_point_type
Or, equivalently:
class restricted_point' = (restricted_point : int -> restricted_point_type);;
class restricted_point' : int -> restricted_point_type
The interface of a class can also be specified in a module signature, and used to restrict the inferred signature of a module.
module type POINT = sig
  class restricted_point' : int ->
    object
      method get_x : int
      method bump : unit
    end
end;;
module type POINT = sig class restricted_point' : int -> object method bump : unit method get_x : int end end
module Point : POINT = struct
  class restricted_point' = restricted_point
end;;
module Point : POINT
We illustrate inheritance by defining a class of colored points that inherits from the class of points. This class has all instance variables and all methods of class point, plus a new instance variable c and a new method color.
class colored_point x (c : string) =
  object
    inherit point x
    val c = c
    method color = c
  end;;
class colored_point : int -> string -> object val c : string val mutable x : int method color : string method get_offset : int method get_x : int method move : int -> unit end
let p' = new colored_point 5 "red";;
val p' : colored_point = <obj>
p'#get_x, p'#color;;
- : int * string = (5, "red")
A point and a colored point have incompatible types, since a point has no method color. However, the function get_succ_x below is a generic function applying method get_x to any object p that has this method (and possibly some others, which are represented by an ellipsis in the type). Thus, it applies to both points and colored points.
let get_succ_x p = p#get_x + 1;;
val get_succ_x : < get_x : int; .. > -> int = <fun>
get_succ_x p + get_succ_x p';;
- : int = 8
Methods need not be declared previously, as shown by the example:
let set_x p = p#set_x;;
val set_x : < set_x : 'a; .. > -> 'a = <fun>
let incr p = set_x p (get_succ_x p);;
val incr : < get_x : int; set_x : int -> 'a; .. > -> 'a = <fun>
Multiple inheritance is allowed. Only the last definition of a method is kept: the redefinition in a subclass of a method that was visible in the parent class overrides the definition in the parent class. Previous definitions of a method can be reused by binding the related ancestor. Below, super is bound to the ancestor printable_point. The name super is a pseudo value identifier that can only be used to invoke a super-class method, as in super#print.
class printable_colored_point y c =
  object (self)
    val c = c
    method color = c
    inherit printable_point y as super
    method print =
      print_string "(";
      super#print;
      print_string ", ";
      print_string (self#color);
      print_string ")"
  end;;
class printable_colored_point : int -> string -> object val c : string val mutable x : int method color : string method get_x : int method move : int -> unit method print : unit end
let p' = new printable_colored_point 17 "red";;
new point at (10, red)
val p' : printable_colored_point = <obj>
p'#print;;
(10, red)- : unit = ()
A private method that has been hidden in the parent class is no longer visible, and is thus not overridden. Since initializers are treated as private methods, all initializers along the class hierarchy are evaluated, in the order they are introduced.
Reference cells can be implemented as objects. The naive definition fails to typecheck:
class oref x_init =
  object
    val mutable x = x_init
    method get = x
    method set y = x <- y
  end;;
Error: Some type variables are unbound in this type:
         class oref : 'a -> object val mutable x : 'a method get : 'a method set : 'a -> unit end
       The method get has type 'a where 'a is unbound
The reason is that at least one of the methods has a polymorphic type (here, the type of the value stored in the reference cell), thus either the class should be parametric, or the method type should be constrained to a monomorphic type. A monomorphic instance of the class could be defined by:
class oref (x_init:int) =
  object
    val mutable x = x_init
    method get = x
    method set y = x <- y
  end;;
class oref : int -> object val mutable x : int method get : int method set : int -> unit end
Note that since immediate objects do not define a class type, they have no such restriction.
let new_oref x_init =
  object
    val mutable x = x_init
    method get = x
    method set y = x <- y
  end;;
val new_oref : 'a -> < get : 'a; set : 'a -> unit > = <fun>
On the other hand, a class for polymorphic references must explicitly list the type parameters in its declaration. Class type parameters are listed between [ and ]. The type parameters must also be bound somewhere in the class body by a type constraint.
class ['a] oref x_init =
  object
    val mutable x = (x_init : 'a)
    method get = x
    method set y = x <- y
  end;;
class ['a] oref : 'a -> object val mutable x : 'a method get : 'a method set : 'a -> unit end
let r = new oref 1 in r#set 2; (r#get);;
- : int = 2
The type parameter in the declaration may actually be constrained in the body of the class definition. In the class type, the actual value of the type parameter is displayed in the constraint clause.
class ['a] oref_succ (x_init:'a) =
  object
    val mutable x = x_init + 1
    method get = x
    method set y = x <- y
  end;;
class ['a] oref_succ : 'a -> object constraint 'a = int val mutable x : int method get : int method set : int -> unit end
Let us consider a more complex example: define a circle, whose center may be any kind of point. We put an additional type constraint in method move, since no free variables must remain unaccounted for by the class type parameters.
class ['a] circle (c : 'a) =
  object
    val mutable center = c
    method center = center
    method set_center c = center <- c
    method move = (center#move : int -> unit)
  end;;
class ['a] circle : 'a -> object constraint 'a = < move : int -> unit; .. > val mutable center : 'a method center : 'a method move : int -> unit method set_center : 'a -> unit end
An alternate definition of circle, using a constraint clause in the class definition, is shown below. The type #point used below in the constraint clause is an abbreviation produced by the definition of class point. This abbreviation unifies with the type of any object belonging to a subclass of class point. It actually expands to < get_x : int; move : int -> unit; .. >. This leads to the following alternate definition of circle, which has slightly stronger constraints on its argument, as we now expect center to have a method get_x.
class ['a] circle (c : 'a) =
  object
    constraint 'a = #point
    val mutable center = c
    method center = center
    method set_center c = center <- c
    method move = center#move
  end;;
class ['a] circle : 'a -> object constraint 'a = #point val mutable center : 'a method center : 'a method move : int -> unit method set_center : 'a -> unit end
The class colored_circle is a specialized version of class circle that requires the type of the center to unify with #colored_point, and adds a method color. Note that when specializing a parameterized class, the instance of type parameter must always be explicitly given. It is again written between [ and ].
class ['a] colored_circle c = object
  constraint 'a = #colored_point
  inherit ['a] circle c
  method color = center#color
end;;
class ['a] colored_circle :
  'a ->
  object
    constraint 'a = #colored_point
    val mutable center : 'a
    method center : 'a
    method color : string
    method move : int -> unit
    method set_center : 'a -> unit
  end
While parameterized classes may be polymorphic in their contents, they are not enough to allow polymorphism of method use.
A classical example is defining an iterator.
List.fold_left;;
- : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a = <fun>

class ['a] intlist (l : int list) = object
  method empty = (l = [])
  method fold f (accu : 'a) = List.fold_left f accu l
end;;
class ['a] intlist :
  int list ->
  object
    method empty : bool
    method fold : ('a -> int -> 'a) -> 'a -> 'a
  end
At first glance, we seem to have a polymorphic iterator; however, this does not work in practice.
let l = new intlist [1; 2; 3];;
val l : '_a intlist = <obj>

l#fold (fun x y -> x+y) 0;;
- : int = 6

l;;
- : int intlist = <obj>

l#fold (fun s x -> s ^ string_of_int x ^ " ") "" ;;
Error: This expression has type int but an expression was expected of type
         string
Our iterator works, as its first use for summation shows. However, since objects themselves are not polymorphic (only their constructors are), using the fold method fixes its type for this individual object. Our next attempt to use it as a string iterator fails.
The problem here is that quantification was wrongly located: it is not the class we want to be polymorphic, but the fold method. This can be achieved by giving an explicitly polymorphic type in the method definition.
class intlist (l : int list) = object
  method empty = (l = [])
  method fold : 'a. ('a -> int -> 'a) -> 'a -> 'a =
    fun f accu -> List.fold_left f accu l
end;;
class intlist :
  int list ->
  object
    method empty : bool
    method fold : ('a -> int -> 'a) -> 'a -> 'a
  end

let l = new intlist [1; 2; 3];;
val l : intlist = <obj>

l#fold (fun x y -> x+y) 0;;
- : int = 6

l#fold (fun s x -> s ^ string_of_int x ^ " ") "";;
- : string = "1 2 3 "
As you can see in the class type shown by the compiler, while polymorphic method types must be fully explicit in class definitions (appearing immediately after the method name), quantified type variables can be left implicit in class descriptions. Why require types to be explicit? The problem is that (int -> int -> int) -> int -> int would also be a valid type for fold, and it happens to be incompatible with the polymorphic type we gave (automatic instantiation only works for toplevel type variables, not for inner quantifiers, where it becomes an undecidable problem). So the compiler cannot choose between those two types, and must be helped.
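The same technique applies to any container. The following minimal sketch uses a hypothetical class string_bag (not part of this chapter) whose fold method carries an explicit polymorphic annotation, so that a single object can be folded at two different result types.

```ocaml
(* Hypothetical class: [fold] is explicitly quantified over 'a,
   so the class itself needs no type parameter. *)
class string_bag (items : string list) = object
  method fold : 'a. ('a -> string -> 'a) -> 'a -> 'a =
    fun f accu -> List.fold_left f accu items
end

let bag = new string_bag ["a"; "bc"; "def"]

(* Same object, two instantiations of 'a: int, then string. *)
let total_length = bag#fold (fun n s -> n + String.length s) 0
let joined = bag#fold (fun acc s -> acc ^ s) ""
```

Without the 'a. annotation, the first call would fix the type of fold for bag and the second call would be rejected.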
However, the type can be completely omitted in the class definition if it is already known, through inheritance or type constraints on self. Here is an example of method overriding.
class intlist_rev l = object
  inherit intlist l
  method fold f accu = List.fold_left f accu (List.rev l)
end;;
The following idiom separates description and definition.
class type ['a] iterator = object
  method fold : ('b -> 'a -> 'b) -> 'b -> 'b
end;;

class intlist l = object (self : int #iterator)
  method empty = (l = [])
  method fold f accu = List.fold_left f accu l
end;;
Note here the (self : int #iterator) idiom, which ensures that this object implements the interface iterator.
Polymorphic methods are called in exactly the same way as normal methods, but you should be aware of some limitations of type inference. Namely, a polymorphic method can only be called if its type is known at the call site. Otherwise, the method will be assumed to be monomorphic, and given an incompatible type.
let sum lst = lst#fold (fun x y -> x+y) 0;;
val sum : < fold : (int -> int -> int) -> int -> 'a; .. > -> 'a = <fun>

sum l ;;
Error: This expression has type intlist
       but an expression was expected of type
         < fold : (int -> int -> int) -> int -> 'a; .. >
       Types for method fold are incompatible
The workaround is easy: you should put a type constraint on the parameter.
let sum (lst : _ #iterator) = lst#fold (fun x y -> x+y) 0;;
val sum : int #iterator -> int = <fun>
Of course the constraint may also be an explicit method type. Only occurrences of quantified variables are required.
let sum lst =
  (lst : < fold : 'a. ('a -> _ -> 'a) -> 'a -> 'a; .. >)#fold (+) 0;;
val sum : < fold : 'a. ('a -> int -> 'a) -> 'a -> 'a; .. > -> int = <fun>
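Once the parameter is annotated, the polymorphic method can be used at several types within the same function. The sketch below restates iterator and a reduced intlist so that it is self-contained; the function summary is a hypothetical name introduced for illustration.

```ocaml
class type ['a] iterator = object
  method fold : ('b -> 'a -> 'b) -> 'b -> 'b
end

(* Reduced version of intlist, keeping only the fold method. *)
class intlist (l : int list) = object (self : int #iterator)
  method fold f accu = List.fold_left f accu l
end

(* Hypothetical helper: because the type of [lst] is known at the call
   sites, fold can be instantiated at int and then at string. *)
let summary (lst : int #iterator) =
  let total = lst#fold (fun x y -> x + y) 0 in
  let text = lst#fold (fun s x -> s ^ string_of_int x ^ " ") "" in
  (total, text)

let result = summary (new intlist [1; 2; 3])
```

Removing the annotation on lst would make the first call fix a monomorphic type for fold, and the second call would fail to type-check.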
Another use of polymorphic methods is to allow some form of implicit subtyping in method arguments. We have already seen in section 3.8 how some functions may be polymorphic in the class of their argument. This can be extended to methods.
class type point0 = object method get_x : int end;;
class type point0 = object method get_x : int end

class distance_point x = object
  inherit point x
  method distance : 'a. (#point0 as 'a) -> int =
    fun other -> abs (other#get_x - x)
end;;
class distance_point :
  int ->
  object
    val mutable x : int
    method distance : #point0 -> int
    method get_offset : int
    method get_x : int
    method move : int -> unit
  end

let p = new distance_point 3 in
(p#distance (new point 8), p#distance (new colored_point 1 "blue"));;
- : int * int = (5, 2)
Note here the special syntax (#point0 as 'a) we have to use to quantify the extensible part of #point0. As for the variable binder, it can be omitted in class specifications. If you want polymorphism inside an object field, it must be quantified independently.
class multi_poly = object
  method m1 : 'a. (< n1 : 'b. 'b -> 'b; .. > as 'a) -> _ =
    fun o -> o#n1 true, o#n1 "hello"
  method m2 : 'a 'b. (< n2 : 'b -> bool; .. > as 'a) -> 'b -> _ =
    fun o x -> o#n2 x
end;;
class multi_poly :
  object
    method m1 : < n1 : 'b. 'b -> 'b; .. > -> bool * string
    method m2 : < n2 : 'b -> bool; .. > -> 'b -> bool
  end
In method m1, o must be an object with at least a method n1, itself polymorphic. In method m2, the argument of n2 and x must have the same type, which is quantified at the same level as 'a.
Subtyping is never implicit. There are, however, two ways to perform subtyping. The most general construction is fully explicit: both the domain and the codomain of the type coercion must be given.
We have seen that points and colored points have incompatible types. For instance, they cannot be mixed in the same list. However, a colored point can be coerced to a point, hiding its color method:
let colored_point_to_point cp = (cp : colored_point :> point);;
val colored_point_to_point : colored_point -> point = <fun>

let p = new point 3 and q = new colored_point 4 "blue";;
val p : point = <obj>
val q : colored_point = <obj>

let l = [p; (colored_point_to_point q)];;
val l : point list = [<obj>; <obj>]
An object of type t can be seen as an object of type t' only if t is a subtype of t'. For instance, a point cannot be seen as a colored point.
(p : point :> colored_point);;
Error: Type point = < get_offset : int; get_x : int; move : int -> unit >
       is not a subtype of
         colored_point =
           < color : string; get_offset : int; get_x : int;
             move : int -> unit >
Indeed, narrowing coercions without runtime checks would be unsafe. Runtime type checks might raise exceptions, and they would require the presence of type information at runtime, which is not the case in the OCaml system. For these reasons, there is no such operation available in the language.
Be aware that subtyping and inheritance are not related. Inheritance is a syntactic relation between classes while subtyping is a semantic relation between types. For instance, the class of colored points could have been defined directly, without inheriting from the class of points; the type of colored points would remain unchanged and thus still be a subtype of points.
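To make this concrete, the sketch below defines a colored point from scratch, with no inherit clause, and still coerces it to a point. Both classes are simplified, hypothetical stand-ins for the point and colored_point classes of this chapter (in particular, get_offset is left out).

```ocaml
(* Simplified stand-in for the chapter's point class. *)
class point (x_init : int) = object
  val mutable x = x_init
  method get_x = x
  method move d = x <- x + d
end

(* Defined directly: there is no [inherit point] here. *)
class standalone_colored_point (x_init : int) (c : string) = object
  val mutable x = x_init
  method get_x = x
  method move d = x <- x + d
  method color = c
end

(* The coercion is accepted because the object type has every method
   of point with a compatible type; inheritance plays no role. *)
let as_point cp = (cp : standalone_colored_point :> point)

let points = [new point 1; as_point (new standalone_colored_point 2 "red")]
let xs = List.map (fun p -> p#get_x) points
```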
The domain of a coercion can often be omitted. For instance, one can define:
let to_point cp = (cp :> point);;
val to_point : #point -> point = <fun>
In this case, the function colored_point_to_point is an instance of the function to_point. This is not always true, however. The fully explicit coercion is more precise and is sometimes unavoidable. Consider, for example, the following class:
class c0 = object method m = {< >} method n = 0 end;;
class c0 : object ('a) method m : 'a method n : int end
The object type c0 is an abbreviation for <m : 'a; n : int> as 'a. Consider now the type declaration:
class type c1 = object method m : c1 end;;
class type c1 = object method m : c1 end
The object type c1 is an abbreviation for the type <m : 'a> as 'a. The coercion from an object of type c0 to an object of type c1 is correct:
fun (x:c0) -> (x : c0 :> c1);;
- : c0 -> c1 = <fun>
However, the domain of the coercion cannot always be omitted. In that case, the solution is to use the explicit form. Sometimes, a change in the class-type definition can also solve the problem:
class type c2 = object ('a) method m : 'a end;;
class type c2 = object ('a) method m : 'a end

fun (x:c0) -> (x :> c2);;
- : c0 -> c2 = <fun>
While class types c1 and c2 are different, both object types c1 and c2 expand to the same object type (same method names and types). Yet, when the domain of a coercion is left implicit and its co-domain is an abbreviation of a known class type, then the class type, rather than the object type, is used to derive the coercion function. This allows leaving the domain implicit in most cases when coercing from a subclass to its superclass. The type of a coercion can always be seen as below:
let to_c1 x = (x :> c1);;
val to_c1 : < m : #c1; .. > -> c1 = <fun>

let to_c2 x = (x :> c2);;
val to_c2 : #c2 -> c2 = <fun>
Note the difference between these two coercions: in the case of to_c2, the type #c2 = < m : 'a; .. > as 'a is polymorphically recursive (according to the explicit recursion in the class type of c2); hence the success of applying this coercion to an object of class c0. On the other hand, in the first case, c1 was only expanded and unrolled twice to obtain < m : < m : c1; .. >; .. > (remember #c1 = < m : c1; .. >), without introducing recursion. You may also note that the type of to_c2 is #c2 -> c2 while the type of to_c1 is more general than #c1 -> c1. This is not always true, since there are class types for which some instances of #c are not subtypes of c, as explained in section 3.16. Yet, for parameterless classes the coercion (_ :> c) is always more general than (_ : #c :> c).
A common problem may occur when one tries to define a coercion to a class c while defining class c. The problem is due to the type abbreviation not being completely defined yet, and so its subtypes are not clearly known. Then, a coercion (_ :> c) or (_ : #c :> c) is taken to be the identity function, as in
function x -> (x :> 'a);;
- : 'a -> 'a = <fun>
As a consequence, if the coercion is applied to self, as in the following example, the type of self is unified with the closed type c (a closed object type is an object type without ellipsis). This would constrain the type of self to be closed and is thus rejected. Indeed, the type of self cannot be closed: this would prevent any further extension of the class. Therefore, a type error is generated when the unification of this type with another type would result in a closed object type.
class c = object method m = 1 end
and d = object (self)
  inherit c
  method n = 2
  method as_c = (self :> c)
end;;
Error: This expression cannot be coerced to type c = < m : int >; it has type
         < as_c : c; m : int; n : int; .. >
       but is here used with type c
       Self type cannot escape its class
However, the most common instance of this problem, coercing self to its current class, is detected as a special case by the type checker, and properly typed.
class c = object (self) method m = (self :> c) end;;
class c : object method m : c end
This allows the following idiom, keeping a list of all objects belonging to a class or its subclasses:
let all_c = ref [];;
val all_c : '_a list ref = {contents = []}

class c (m : int) = object (self)
  method m = m
  initializer all_c := (self :> c) :: !all_c
end;;
class c : int -> object method m : int end
This idiom can in turn be used to retrieve an object whose type has been weakened:
let rec lookup_obj obj = function
    [] -> raise Not_found
  | obj' :: l ->
     if (obj :> < >) = (obj' :> < >) then obj' else lookup_obj obj l ;;
val lookup_obj : < .. > -> (< .. > as 'a) list -> 'a = <fun>

let lookup_c obj = lookup_obj obj !all_c;;
val lookup_c : < .. > -> < m : int > = <fun>
The type < m : int > we see here is just the expansion of c, due to the use of a reference; we have succeeded in getting back an object of type c.
The previous coercion problem can often be avoided by first defining the abbreviation, using a class type:
class type c' = object method m : int end;;
class type c' = object method m : int end

class c : c' = object method m = 1 end
and d = object (self)
  inherit c
  method n = 2
  method as_c = (self :> c')
end;;
class c : c'
and d : object method as_c : c' method m : int method n : int end
It is also possible to use a virtual class. Inheriting from this class simultaneously forces all methods of c to have the same type as the methods of c'.
class virtual c' = object method virtual m : int end;;
class virtual c' : object method virtual m : int end

class c = object (self) inherit c' method m = 1 end;;
class c : object method m : int end
One could think of defining the type abbreviation directly:
type c' = <m : int>;;
However, the abbreviation #c' cannot be defined directly in a similar way. It can only be defined by a class or a class-type definition. This is because a #-abbreviation carries an implicit anonymous variable .. that cannot be explicitly named. The closest you can get to it is:
type 'a c'_class = 'a constraint 'a = < m : int; .. >;;
with an extra type variable capturing the open object type.
It is possible to write a version of class point without assignments on the instance variables. The override construct {< ... >} returns a copy of “self” (that is, the current object), possibly changing the value of some instance variables.
class functional_point y = object
  val x = y
  method get_x = x
  method move d = {< x = x + d >}
end;;
class functional_point :
  int ->
  object ('a) val x : int method get_x : int method move : int -> 'a end

let p = new functional_point 7;;
val p : functional_point = <obj>

p#get_x;;
- : int = 7

(p#move 3)#get_x;;
- : int = 10

p#get_x;;
- : int = 7
Note that the type abbreviation functional_point is recursive, which can be seen in the class type of functional_point: the type of self is 'a and 'a appears inside the type of the method move.
The above definition of functional_point is not equivalent to the following:
class bad_functional_point y = object
  val x = y
  method get_x = x
  method move d = new bad_functional_point (x+d)
end;;
class bad_functional_point :
  int ->
  object
    val x : int
    method get_x : int
    method move : int -> bad_functional_point
  end
While objects of either class will behave the same, objects of their subclasses will be different. In a subclass of bad_functional_point, the method move will keep returning an object of the parent class. On the contrary, in a subclass of functional_point, the method move will return an object of the subclass.
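The difference can be observed with a small sketch. The two subclasses named_point and bad_named_point are hypothetical and only add a method name; the parent classes are restated so that the example is self-contained.

```ocaml
class functional_point y = object
  val x = y
  method get_x = x
  method move d = {< x = x + d >}
end

class bad_functional_point y = object
  val x = y
  method get_x = x
  method move d = new bad_functional_point (x + d)
end

(* Hypothetical subclasses that add one method each. *)
class named_point y = object
  inherit functional_point y
  method name = "named"
end

class bad_named_point y = object
  inherit bad_functional_point y
  method name = "named"
end

(* move returns the self type, so the result still has method name. *)
let kept_name = ((new named_point 1)#move 2)#name

(* Here move returns a bad_functional_point: get_x is available, but
   asking the result for #name would be rejected by the type checker. *)
let moved_x = ((new bad_named_point 1)#move 2)#get_x
```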
Functional update is often used in conjunction with binary methods as illustrated in section 5.2.1.
Objects can also be cloned, whether they are functional or imperative. The library function Oo.copy makes a shallow copy of an object. That is, it returns a new object that has the same methods and instance variables as its argument. The instance variables are copied but their contents are shared. Assigning a new value to an instance variable of the copy (using a method call) will not affect instance variables of the original, and conversely. A deeper assignment (for example if the instance variable is a reference cell) will of course affect both the original and the copy.
The type of Oo.copy is the following:
Oo.copy;;
- : (< .. > as 'a) -> 'a = <fun>
The keyword as in that type binds the type variable 'a to the object type < .. >. Therefore, Oo.copy takes an object with any methods (represented by the ellipsis), and returns an object of the same type. The type of Oo.copy is different from type < .. > -> < .. > as each ellipsis represents a different set of methods. Ellipsis actually behaves as a type variable.
let p = new point 5;;
val p : point = <obj>

let q = Oo.copy p;;
val q : point = <obj>

q#move 7; (p#get_x, q#get_x);;
- : int * int = (5, 12)
In fact, Oo.copy p will behave as p#copy assuming that a public method copy with body {< >} has been defined in the class of p.
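The distinction between assigning to an instance variable and a deeper assignment can be sketched as follows. The class tally is hypothetical: it holds one mutable instance variable and one instance variable that contains a reference cell.

```ocaml
(* Hypothetical class used only to illustrate shallow copying. *)
class tally = object
  val mutable own = 0      (* the slot itself is duplicated by Oo.copy *)
  val shared = ref 0       (* the reference cell is shared by Oo.copy *)
  method bump = (own <- own + 1; incr shared)
  method own = own
  method shared = !shared
end

let a = new tally
let b = Oo.copy a
let () = b#bump

(* Only the copy sees the change to [own]; both see the change
   made through the shared reference cell. *)
let observed = (a#own, a#shared, b#own, b#shared)
```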
Objects can be compared using the generic comparison functions = and <>. Two objects are equal if and only if they are physically equal. In particular, an object and its copy are not equal.
let q = Oo.copy p;;
val q : point = <obj>

p = q, p = p;;
- : bool * bool = (false, true)
Other generic comparisons such as (<, <=, ...) can also be used on objects. The relation < defines an unspecified but strict ordering on objects. The ordering relationship between two objects is fixed once and for all after the two objects have been created, and it is not affected by mutation of fields.
Cloning and override have a non-empty intersection. They are interchangeable when used within an object and without overriding any field:
class copy = object method copy = {< >} end;;
class copy : object ('a) method copy : 'a end

class copy = object (self) method copy = Oo.copy self end;;
class copy : object ('a) method copy : 'a end
Only the override can be used to actually override fields, and only the Oo.copy primitive can be used externally.
Cloning can also be used to provide facilities for saving and restoring the state of objects.
class backup = object (self : 'mytype)
  val mutable copy = None
  method save = copy <- Some {< copy = None >}
  method restore = match copy with Some x -> x | None -> self
end;;
class backup :
  object ('a)
    val mutable copy : 'a option
    method restore : 'a
    method save : unit
  end
The above definition will only backup one level. The backup facility can be added to any class by using multiple inheritance.
class ['a] backup_ref x = object inherit ['a] oref x inherit backup end;;
class ['a] backup_ref :
  'a ->
  object ('b)
    val mutable copy : 'b option
    val mutable x : 'a
    method get : 'a
    method restore : 'b
    method save : unit
    method set : 'a -> unit
  end

let rec get p n = if n = 0 then p # get else get (p # restore) (n-1);;
val get : (< get : 'b; restore : 'a; .. > as 'a) -> int -> 'b = <fun>

let p = new backup_ref 0 in
p # save; p # set 1; p # save; p # set 2;
[get p 0; get p 1; get p 2; get p 3; get p 4];;
- : int list = [2; 1; 1; 1; 1]
We can define a variant of backup that retains all copies. (We also add a method clear to manually erase all copies.)
class backup = object (self : 'mytype)
  val mutable copy = None
  method save = copy <- Some {< >}
  method restore = match copy with Some x -> x | None -> self
  method clear = copy <- None
end;;
class backup :
  object ('a)
    val mutable copy : 'a option
    method clear : unit
    method restore : 'a
    method save : unit
  end

class ['a] backup_ref x = object inherit ['a] oref x inherit backup end;;
class ['a] backup_ref :
  'a ->
  object ('b)
    val mutable copy : 'b option
    val mutable x : 'a
    method clear : unit
    method get : 'a
    method restore : 'b
    method save : unit
    method set : 'a -> unit
  end

let p = new backup_ref 0 in
p # save; p # set 1; p # save; p # set 2;
[get p 0; get p 1; get p 2; get p 3; get p 4];;
- : int list = [2; 1; 0; 0; 0]
Recursive classes can be used to define objects whose types are mutually recursive.
class window = object
  val mutable top_widget = (None : widget option)
  method top_widget = top_widget
end
and widget (w : window) = object
  val window = w
  method window = window
end;;
class window :
  object
    val mutable top_widget : widget option
    method top_widget : widget option
  end
and widget : window -> object val window : window method window : window end
Although their types are mutually recursive, the classes widget and window are themselves independent.
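A minimal usage sketch follows. It adds a setter set_top_widget to window, which is an assumption made for illustration and is not part of the definition above; without some such method the top widget could never be installed.

```ocaml
class window = object
  val mutable top_widget = (None : widget option)
  method top_widget = top_widget
  (* Hypothetical setter, added for this example only. *)
  method set_top_widget w = top_widget <- Some w
end
and widget (w : window) = object
  val window = w
  method window = window
end

let win = new window
let wid = new widget win
let () = win#set_top_widget wid

(* Follow the links in both directions and compare physically. *)
let linked_back =
  match win#top_widget with
  | Some w -> w#window == win
  | None -> false
```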
A binary method is a method which takes an argument of the same type as self. The class comparable below is a template for classes with a binary method leq of type 'a -> bool where the type variable 'a is bound to the type of self. Therefore, #comparable expands to < leq : 'a -> bool; .. > as 'a. We see here that the binder as also allows writing recursive types.
class virtual comparable = object (_ : 'a)
  method virtual leq : 'a -> bool
end;;
class virtual comparable : object ('a) method virtual leq : 'a -> bool end
We then define a subclass money of comparable. The class money simply wraps floats as comparable objects. We will extend it below with more operations. We have to use a type constraint on the class parameter x because the primitive <= is a polymorphic function in OCaml. The inherit clause ensures that the type of objects of this class is an instance of #comparable.
class money (x : float) = object
  inherit comparable
  val repr = x
  method value = repr
  method leq p = repr <= p#value
end;;
class money :
  float ->
  object ('a)
    val repr : float
    method leq : 'a -> bool
    method value : float
  end
Note that the type money is not a subtype of type comparable, as the self type appears in contravariant position in the type of method leq. Indeed, an object m of class money has a method leq that expects an argument of type money since it accesses its value method. Considering m of type comparable would allow a call to method leq on m with an argument that does not have a method value, which would be an error.
Similarly, the type money2 below is not a subtype of type money.
class money2 x = object
  inherit money x
  method times k = {< repr = k *. repr >}
end;;
class money2 :
  float ->
  object ('a)
    val repr : float
    method leq : 'a -> bool
    method times : float -> 'a
    method value : float
  end
It is however possible to define functions that manipulate objects of type either money or money2: the function min will return the minimum of any two objects whose type unifies with #comparable. The type of min is not the same as #comparable -> #comparable -> #comparable, as the abbreviation #comparable hides a type variable (an ellipsis). Each occurrence of this abbreviation generates a new variable.
let min (x : #comparable) y = if x#leq y then x else y;;
val min : (#comparable as 'a) -> 'a -> 'a = <fun>
This function can be applied to objects of type money or money2.
(min (new money 1.3) (new money 3.1))#value;;
- : float = 1.3

(min (new money2 5.0) (new money2 3.14))#value;;
- : float = 3.14
More examples of binary methods can be found in sections 5.2.1 and 5.2.3.
Note the use of override for method times. Writing new money2 (k *. repr) instead of {< repr = k *. repr >} would not behave well with inheritance: in a subclass money3 of money2 the times method would return an object of class money2 but not of class money3 as would be expected.
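The following sketch checks this behavior on a hypothetical subclass money3 that adds a method currency. The classes comparable, money and money2 are restated so that the example is self-contained.

```ocaml
class virtual comparable = object (_ : 'a)
  method virtual leq : 'a -> bool
end

class money (x : float) = object
  inherit comparable
  val repr = x
  method value = repr
  method leq p = repr <= p#value
end

class money2 x = object
  inherit money x
  method times k = {< repr = k *. repr >}
end

(* Hypothetical subclass, introduced only for this illustration. *)
class money3 x = object
  inherit money2 x
  method currency = "EUR"
end

(* Thanks to the override, times returns an object of the self type,
   so the result of times on a money3 still has the currency method. *)
let doubled = (new money3 2.0)#times 2.0
let info = (doubled#currency, doubled#value)
```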
The class money could naturally carry another binary method. Here is a direct definition:
class money x = object (self : 'a)
  val repr = x
  method value = repr
  method print = print_float repr
  method times k = {< repr = k *. repr >}
  method leq (p : 'a) = repr <= p#value
  method plus (p : 'a) = {< repr = repr +. p#value >}
end;;
class money :
  float ->
  object ('a)
    val repr : float
    method leq : 'a -> bool
    method plus : 'a -> 'a
    method print : unit
    method times : float -> 'a
    method value : float
  end
The above class money reveals a problem that often occurs with binary methods. In order to interact with other objects of the same class, the representation of money objects must be revealed, using a method such as value. If we remove all binary methods (here plus and leq), the representation can easily be hidden inside objects by removing the method value as well. However, this is not possible as soon as some binary method requires access to the representation of objects of the same class (other than self).
class safe_money x = object (self : 'a)
  val repr = x
  method print = print_float repr
  method times k = {< repr = k *. repr >}
end;;
class safe_money :
  float ->
  object ('a)
    val repr : float
    method print : unit
    method times : float -> 'a
  end
Here, the representation of the object is known only to a particular object. To make it available to other objects of the same class, we are forced to make it available to the whole world. However we can easily restrict the visibility of the representation using the module system.
module type MONEY =
  sig
    type t
    class c : float ->
      object ('a)
        val repr : t
        method value : t
        method print : unit
        method times : float -> 'a
        method leq : 'a -> bool
        method plus : 'a -> 'a
      end
  end;;

module Euro : MONEY =
  struct
    type t = float
    class c x =
      object (self : 'a)
        val repr = x
        method value = repr
        method print = print_float repr
        method times k = {< repr = k *. repr >}
        method leq (p : 'a) = repr <= p#value
        method plus (p : 'a) = {< repr = repr +. p#value >}
      end
  end;;
Another example of friend functions may be found in section 5.2.3. These examples occur when a group of objects (here objects of the same class) and functions should see each other's internal representation, while their representation should be hidden from the outside. The solution is always to define all friends in the same module, give access to the representation and use a signature constraint to make the representation abstract outside the module.
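As a minimal sketch of the effect of the signature constraint, the code below restates MONEY and Euro (writing repr rather than x inside the overrides, so that chained operations accumulate) and then uses the module from the outside. The bindings a, b and total are hypothetical names.

```ocaml
module type MONEY = sig
  type t
  class c : float -> object ('a)
    val repr : t
    method value : t
    method print : unit
    method times : float -> 'a
    method leq : 'a -> bool
    method plus : 'a -> 'a
  end
end

module Euro : MONEY = struct
  type t = float
  class c x = object (self : 'a)
    val repr = x
    method value = repr
    method print = print_float repr
    method times k = {< repr = k *. repr >}
    method leq (p : 'a) = repr <= p#value
    method plus (p : 'a) = {< repr = repr +. p#value >}
  end
end

let a = new Euro.c 1.5
let b = new Euro.c 2.0
let total = a#plus b

(* Binary methods still work between objects of the class... *)
let ordered = a#leq total && not (total#leq a)

(* ...but a#value has the abstract type Euro.t outside the module,
   so an expression such as  a#value +. 1.0  would be rejected. *)
```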
The bigarray library implements large, multi-dimensional, numerical arrays. These arrays are called “big arrays” to distinguish them from the standard OCaml arrays described in Module Array. The main differences between “big arrays” and standard OCaml arrays are as follows:
Programs that use the bigarray library must be linked as follows:
ocamlc other options bigarray.cma other files
ocamlopt other options bigarray.cmxa other files
For interactive use of the bigarray library, do:
ocamlmktop -o mytop bigarray.cma
./mytop
or (if dynamic linking of C libraries is supported on your platform), start ocaml and type #load "bigarray.cma";;.
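Once the library is available, big arrays are created and accessed from OCaml through the Bigarray module. The following is a minimal sketch, not taken from this chapter, that builds one two-dimensional array with C layout and one with Fortran layout. It assumes the library has been linked or loaded as described above; more recent OCaml releases provide Bigarray as part of the standard library, in which case no extra step is needed.

```ocaml
(* A 2 x 3 array of 64-bit floats in C layout: indices start at 0. *)
let a = Bigarray.Array2.create Bigarray.float64 Bigarray.c_layout 2 3

let () =
  for i = 0 to 1 do
    for j = 0 to 2 do
      a.{i, j} <- float_of_int (10 * i + j)
    done
  done

let dims = (Bigarray.Array2.dim1 a, Bigarray.Array2.dim2 a)
let last = a.{1, 2}

(* A 2 x 3 array of 32-bit integers in Fortran layout: indices start at 1. *)
let f = Bigarray.Array2.create Bigarray.int32 Bigarray.fortran_layout 2 3
let () = Bigarray.Array2.fill f 7l
let corner = f.{2, 3}
```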
C stub code that interfaces C or Fortran code with OCaml code, as described in chapter 19, can exploit big arrays as follows.
The include file <caml/bigarray.h> must be included in the C stub file. It declares the functions, constants and macros discussed below.
If v is an OCaml value representing a big array, the expression Caml_ba_data_val(v) returns a pointer to the data part of the array. This pointer is of type void * and can be cast to the appropriate C type for the array (e.g. double [], char [][10], etc).
Various characteristics of the OCaml big array can be consulted from C as follows:
C expression | Returns |
Caml_ba_array_val(v)->num_dims | number of dimensions |
Caml_ba_array_val(v)->dim[i] | i-th dimension |
Caml_ba_array_val(v)->flags & CAML_BA_KIND_MASK | kind of array elements |
The kind of array elements is one of the following constants:
Constant | Element kind |
CAML_BA_FLOAT32 | 32-bit single-precision floats |
CAML_BA_FLOAT64 | 64-bit double-precision floats |
CAML_BA_SINT8 | 8-bit signed integers |
CAML_BA_UINT8 | 8-bit unsigned integers |
CAML_BA_SINT16 | 16-bit signed integers |
CAML_BA_UINT16 | 16-bit unsigned integers |
CAML_BA_INT32 | 32-bit signed integers |
CAML_BA_INT64 | 64-bit signed integers |
CAML_BA_CAML_INT | 31- or 63-bit signed integers |
CAML_BA_NATIVE_INT | 32- or 64-bit (platform-native) integers |
The following example shows the passing of a two-dimensional big array to a C function and a Fortran function.
#include <caml/mlvalues.h>
#include <caml/bigarray.h>

extern void my_c_function(double * data, int dimx, int dimy);
extern void my_fortran_function_(double * data, int * dimx, int * dimy);

value caml_stub(value bigarray)
{
  int dimx = Caml_ba_array_val(bigarray)->dim[0];
  int dimy = Caml_ba_array_val(bigarray)->dim[1];
  /* C passes scalar parameters by value */
  my_c_function(Caml_ba_data_val(bigarray), dimx, dimy);
  /* Fortran passes all parameters by reference */
  my_fortran_function_(Caml_ba_data_val(bigarray), &dimx, &dimy);
  return Val_unit;
}
A pointer p to an already-allocated C or Fortran array can be wrapped and returned to OCaml as a big array using the caml_ba_alloc or caml_ba_alloc_dims functions.
caml_ba_alloc returns an OCaml big array wrapping the data pointed to by p. kind is the kind of array elements (one of the CAML_BA_ kind constants above). layout is CAML_BA_C_LAYOUT for an array with C layout and CAML_BA_FORTRAN_LAYOUT for an array with Fortran layout. numdims is the number of dimensions in the array. dims is an array of numdims long integers, giving the sizes of the array in each dimension.
caml_ba_alloc_dims is the same as caml_ba_alloc, but the sizes of the array in each dimension are listed as extra arguments in the function call, rather than being passed as an array.
The following example illustrates how statically-allocated C and Fortran arrays can be made available to OCaml.
extern long my_c_array[100][200];
extern float my_fortran_array_[300][400];

value caml_get_c_array(value unit)
{
  long dims[2];
  dims[0] = 100; dims[1] = 200;
  return caml_ba_alloc(CAML_BA_NATIVE_INT | CAML_BA_C_LAYOUT,
                       2, my_c_array, dims);
}

value caml_get_fortran_array(value unit)
{
  return caml_ba_alloc_dims(CAML_BA_FLOAT32 | CAML_BA_FORTRAN_LAYOUT,
                            2, my_fortran_array_, 300L, 400L);
}
Classes are defined using a small language, similar to the module language.
Class types are the class-level equivalent of type expressions: they specify the general shape and type properties of classes.
See also the following language extensions: attributes and extension nodes.
The expression classtype-path is equivalent to the class type bound to the name classtype-path. Similarly, the expression [ typexpr1 , … typexprn ] classtype-path is equivalent to the parametric class type bound to the name classtype-path, in which type parameters have been instantiated to respectively typexpr1, …typexprn.
The class type expression typexpr -> class-type is the type of class functions (functions from values to classes) that take as argument a value of type typexpr and return as result a class of type class-type.
The class type expression object [( typexpr )] {class-field-spec} end is the type of a class body. It specifies its instance variables and methods. In this type, typexpr is matched against the self type, therefore providing a name for the self type.
A class body will match a class body type if it provides definitions for all the components specified in the class body type, and these definitions meet the type requirements given in the class body type. Furthermore, all methods either virtual or public present in the class body must also be present in the class body type (on the other hand, some instance variables and concrete private methods may be omitted). A virtual method will match a concrete method, which makes it possible to forget its implementation. An immutable instance variable will match a mutable instance variable.
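These matching rules can be sketched as follows. The names counter_type and counter are hypothetical; the class body contains an instance variable and a private method that the class body type leaves out, while its only public method is listed.

```ocaml
(* Hypothetical class type: only the public method is specified. *)
class type counter_type = object
  method next : int
end

(* The body matches counter_type: the public method next is present in
   the type, while the instance variable n and the concrete private
   method bump are omitted from it and therefore hidden. *)
class counter : counter_type = object (self)
  val mutable n = 0
  method private bump = n <- n + 1
  method next = (self#bump; n)
end

let c = new counter
let first = c#next
let second = c#next
```

Adding a second public method to the body without listing it in counter_type would make the match fail.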
The inheritance construct inherit class-body-type provides for inclusion of methods and instance variables from other class types. The instance variable and method types from class-body-type are added into the current class type.
A specification of an instance variable is written val [mutable] [virtual] inst-var-name : typexpr, where inst-var-name is the name of the instance variable and typexpr its expected type. The flag mutable indicates whether this instance variable can be physically modified. The flag virtual indicates that this instance variable is not initialized. It can be initialized later through inheritance.
An instance variable specification will hide any previous specification of an instance variable of the same name.
The specification of a method is written method [private] method-name : poly-typexpr, where method-name is the name of the method and poly-typexpr its expected type, possibly polymorphic. The flag private indicates that the method cannot be accessed from outside the object.
The polymorphism may be left implicit in public method specifications: any type variable which is not bound to a class parameter and does not appear elsewhere inside the class specification will be assumed to be universal, and made polymorphic in the resulting method type. Writing an explicit polymorphic type will disable this behaviour.
If several specifications are present for the same method, they must have compatible types. Any non-private specification of a method forces it to be public.
A virtual method specification is written method [private] virtual method-name : poly-typexpr, where method-name is the name of the method and poly-typexpr its expected type.
The construct constraint typexpr1 = typexpr2 forces the two type expressions to be equal. This is typically used to specify type parameters: in this way, they can be bound to specific type expressions.
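As a minimal sketch of the specifications described above, the following class type lists one mutable instance variable and two public methods, followed by a class whose body matches it. The names counter_type, counter, count, incr and get are hypothetical and chosen only for illustration.

```ocaml
(* Hypothetical class type: one mutable instance variable, two methods. *)
class type counter_type = object
  val mutable count : int
  method incr : unit
  method get : int
end

(* A class whose body provides every component listed in counter_type. *)
class counter : counter_type = object
  val mutable count = 0
  method incr = count <- count + 1
  method get = count
end

let c = new counter
let () = c#incr; c#incr
let current = c#get
```

Dropping either method from the class body would make the class fail to match counter_type.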
Class expressions are the class-level equivalent of value expressions: they evaluate to classes, thus providing implementations for the specifications expressed in class types.
See also the following language extensions: locally abstract types, explicit overriding in class definitions, attributes and extension nodes.
The expression class-path evaluates to the class bound to the name class-path. Similarly, the expression [ typexpr1 , … typexprn ] class-path evaluates to the parametric class bound to the name class-path, in which type parameters have been instantiated respectively to typexpr1, …typexprn.
The expression ( class-expr ) evaluates to the same class as class-expr.
The expression ( class-expr : class-type ) checks that class-type matches the type of class-expr (that is, that the implementation class-expr meets the type specification class-type). The whole expression evaluates to the same class as class-expr, except that all components not specified in class-type are hidden and can no longer be accessed.
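The hiding effect of a class type constraint can be sketched as follows. The names getter and point are hypothetical; the example assumes only that an instance variable may be omitted from the class type, as stated earlier for class body types.

```ocaml
class type getter = object
  method get : int
end

(* The constraint hides the instance variable x from the resulting class;
   the public method get must still appear in the class type. *)
class point = (object
  val x = 3
  method get = x
end : getter)

let p = new point
let v = p#get
```

A class inheriting from point can call get but can no longer refer to x.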
Class application is denoted by juxtaposition of (possibly labeled) expressions. It denotes the class whose constructor is the first expression applied to the given arguments. The arguments are evaluated as for expression application, but the constructor itself will only be evaluated when objects are created. In particular, side-effects caused by the application of the constructor will only occur at object creation time.
The expression fun [[?]label-name:] pattern -> class-expr evaluates to a function from values to classes. When this function is applied to a value v, this value is matched against the pattern pattern and the result is the result of the evaluation of class-expr in the extended environment.
The conversion from functions with default values to functions with patterns only works the same way for class functions as for normal functions.
The expression fun parameter1 … parametern -> class-expr is a short form for fun parameter1 -> … fun parametern -> class-expr.
The let and let rec constructs bind value names locally, as for the core language expressions.
If a local definition occurs at the very beginning of a class definition, it will be evaluated when the class is created (just as if the definition was outside of the class). Otherwise, it will be evaluated when the object constructor is called.
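The difference between the two evaluation times can be observed with counters. In this sketch the names class_level, object_level and widget are hypothetical; the first let precedes the class function and so belongs to the class definition itself, while the second sits under the parameter and is evaluated at each construction.

```ocaml
let class_level = ref 0
let object_level = ref 0

class widget =
  (* At the very beginning of the class definition:
     evaluated once, when the class is created. *)
  let () = incr class_level in
  fun () ->
    (* After the parameter: evaluated each time the
       object constructor is called. *)
    let () = incr object_level in
    object
      method id = 0
    end

let a = new widget ()
let b = new widget ()
```

After the two constructions, class_level should hold 1 and object_level should hold 2.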
The expression object class-body end denotes a class body. This is the prototype for an object: it lists the instance variables and methods of an object of this class.
A class body is a class value: it is not evaluated at once. Rather, its components are evaluated each time an object is created.
In a class body, the pattern ( pattern [: typexpr] ) is matched against self, therefore providing a binding for self and self type. Self can only be used in methods and initializers.
Self type cannot be a closed object type, so that the class remains extensible.
Since OCaml 4.01, it is an error if the same method or instance variable name is defined several times in the same class body.
The inheritance construct inherit class-expr allows reusing methods and instance variables from other classes. The class expression class-expr must evaluate to a class body. The instance variables, methods and initializers from this class body are added into the current class. The addition of a method will override any previously defined method of the same name.
An ancestor can be bound by appending as lowercase-ident to the inheritance construct. lowercase-ident is not a true variable and can only be used to select a method, i.e. in an expression lowercase-ident # method-name. This gives access to the method method-name as it was defined in the parent class even if it is redefined in the current class. The scope of this ancestor binding is limited to the current class. The ancestor method may be called from a subclass but only indirectly.
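A short sketch of an ancestor binding, using the hypothetical classes base and derived: the name super can only be used to select a method of the parent class.

```ocaml
class base = object
  method describe = "base"
end

class derived = object
  inherit base as super
  (* Overrides describe, but still reaches the parent's definition
     through the ancestor binding. *)
  method describe = "derived over " ^ super#describe
end

let s = (new derived)#describe
```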
The definition val [mutable] inst-var-name = expr adds an instance variable inst-var-name whose initial value is the value of expression expr. The flag mutable allows physical modification of this variable by methods.
An instance variable can only be used in the methods and initializers that follow its definition.
Since version 3.10, redefinitions of a visible instance variable with the same name do not create a new variable, but are merged, using the last value for initialization. They must have identical types and mutability. However, if an instance variable is hidden by omitting it from an interface, it will be kept distinct from other instance variables with the same name.
A variable specification is written val [mutable] virtual inst-var-name : typexpr. It specifies whether the variable is modifiable, and gives its type.
Virtual instance variables were added in version 3.10.
A method definition is written method method-name = expr. The definition of a method overrides any previous definition of this method. The method will be public (that is, not private) if any of the definitions states so.
A private method, method private method-name = expr, is a method that can only be invoked on self (from other methods of the same object, defined in this class or one of its subclasses). This invocation is performed using the expression value-name # method-name, where value-name is directly bound to self at the beginning of the class definition. Private methods do not appear in object types. A method may have both public and private definitions, but as soon as there is a public one, all subsequent definitions will be made public.
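For illustration, here is a sketch with a hypothetical class account whose private method check is invoked only through self.

```ocaml
class account = object (self)
  val mutable balance = 0
  (* Private: can only be invoked on self. *)
  method private check amount = amount > 0
  method deposit amount =
    if self#check amount then balance <- balance + amount
  method balance = balance
end

let a = new account
let () = a#deposit 10; a#deposit (-5)
let total = a#balance
```

An expression such as a#check 1 written outside the class is rejected by the type checker, since check does not appear in the object type.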
Methods may have an explicitly polymorphic type, allowing them to be used polymorphically in programs (even for the same object). The explicit declaration may be done in one of three ways: (1) by giving an explicit polymorphic type in the method definition, immediately after the method name, i.e. method [private] method-name : {' ident}+ . typexpr = expr; (2) by a forward declaration of the explicit polymorphic type through a virtual method definition; (3) by importing such a declaration through inheritance and/or constraining the type of self.
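The first of these three ways can be sketched as follows. The class folder and its method fold are hypothetical; the point is that a single object uses the method at two different types.

```ocaml
class folder = object
  val items = [1; 2; 3]
  (* Explicitly polymorphic in 'a. *)
  method fold : 'a. ('a -> int -> 'a) -> 'a -> 'a =
    fun f init -> List.fold_left f init items
end

let o = new folder
let sum = o#fold (+) 0                                       (* 'a = int *)
let text = o#fold (fun acc x -> acc ^ string_of_int x) ""    (* 'a = string *)
```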
Some special expressions are available in method bodies for manipulating instance variables and duplicating self:
The expression inst-var-name <- expr modifies in-place the current object by replacing the value associated to inst-var-name by the value of expr. Of course, this instance variable must have been declared mutable.
The expression {< inst-var-name1 = expr1 ; … ; inst-var-namen = exprn >} evaluates to a copy of the current object in which the values of instance variables inst-var-name1, …, inst-var-namen have been replaced by the values of the corresponding expressions expr1, …, exprn.
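Both expressions can be seen side by side in the following sketch, where the class cell and its members are hypothetical names.

```ocaml
class cell = object
  val mutable contents = 0
  val label = "a"
  method set v = contents <- v           (* in-place modification *)
  method get = contents
  method label = label
  method relabel l = {< label = l >}     (* modified copy of self *)
end

let c1 = new cell
let () = c1#set 5
let c2 = c1#relabel "b"
```

The copy c2 carries the new label and the current contents, while c1 keeps its original label.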
A method specification is written method [private] virtual method-name : poly-typexpr. It specifies whether the method is public or private, and gives its type. If the method is intended to be polymorphic, the type must be explicitly polymorphic.
The construct constraint typexpr1 = typexpr2 forces the two type expressions to be equal. This is typically used to specify type parameters: in this way, they can be bound to specific type expressions.
A class initializer initializer expr specifies an expression that will be evaluated whenever an object is created from the class, once all its instance variables have been initialized.
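A small sketch of an initializer, with hypothetical names log and logged: each object creation records an entry after the instance variable id has received its value.

```ocaml
let log : (string * int) list ref = ref []

class logged (name : string) = object
  val id = String.length name
  (* Runs at each object creation, once id has been initialized. *)
  initializer log := (name, id) :: !log
end

let _ = new logged "ab"
let _ = new logged "xyz"
```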
A class definition class class-binding { and class-binding } is recursive. Each class-binding defines a class-name that can be used in the whole expression except for inheritance. It can also be used for inheritance, but only in the definitions that follow its own.
A class binding binds the class name class-name to the value of expression class-expr. It also binds the class type class-name to the type of the class, and defines two type abbreviations: class-name and # class-name. The first one is the type of objects of this class, while the second is more general as it unifies with the type of any object belonging to a subclass (see section 6.4).
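The two abbreviations can be illustrated with the hypothetical classes animal and dog. The function name_of accepts any object whose type is an instance of #animal, which includes objects of the subclass.

```ocaml
class animal = object
  method name = "animal"
end

class dog = object
  inherit animal
  method name = "dog"
  method bark = "woof"
end

(* #animal unifies with the type of any object having at least
   the methods of animal. *)
let name_of (x : #animal) = x#name

let n1 = name_of (new animal)
let n2 = name_of (new dog)
```

Had the parameter been annotated with animal instead, the call on new dog would require an explicit coercion.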
A class must be flagged virtual if one of its methods is virtual (that is, appears in the class type, but is not actually defined). Objects cannot be created from a virtual class.
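As a sketch, the hypothetical class shape below declares a virtual method area and must therefore be flagged virtual; the subclass square defines the method and can be instantiated.

```ocaml
class virtual shape = object (self)
  method virtual area : float
  method describe = Printf.sprintf "area %.1f" self#area
end

class square (side : float) = object
  inherit shape
  method area = side *. side
end

let d = (new square 2.0)#describe
```

Writing new shape is rejected at compile time.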
The class type parameters correspond to the ones of the class type and of the two type abbreviations defined by the class binding. They must be bound to actual types in the class definition using type constraints. So that the abbreviations are well-formed, type variables of the inferred type of the class must either be type parameters or be bound in the constraint clause.
This is the counterpart in signatures of class definitions. A class specification matches a class definition if they have the same type parameters and their types match.
A class type definition class type class-name = class-body-type defines an abbreviation class-name for the class body type class-body-type. As for class definitions, two type abbreviations class-name and # class-name are also defined. The definition can be parameterized by some type parameters. If any method in the class type body is virtual, the definition must be flagged virtual.
Two class type definitions match if they have the same type parameters and they expand to matching types.
Flambda is the term used to describe a series of optimisation passes provided by the native code compilers as of OCaml 4.03.
Flambda aims to make it easier to write idiomatic OCaml code without incurring performance penalties.
To use the Flambda optimisers it is necessary to pass the -flambda option to the OCaml configure script. (There is no support for a single compiler that can operate in both Flambda and non-Flambda modes.) Code compiled with Flambda cannot be linked into the same program as code compiled without Flambda. Attempting to do this will result in a compiler error.
Whether or not a particular ocamlopt uses Flambda may be determined by invoking it with the -config option and looking for any line starting with “flambda:”. If such a line is present and says “true”, then Flambda is supported, otherwise it is not.
Flambda provides full optimisation across different compilation units, so long as the .cmx files for the dependencies of the unit currently being compiled are available. (A compilation unit corresponds to a single .ml source file.) However it does not yet act entirely as a whole-program compiler: for example, elimination of dead code across a complete set of compilation units is not supported.
Optimisation with Flambda is not currently supported when generating bytecode.
Flambda should not in general affect the semantics of existing programs. Two exceptions to this rule are: possible elimination of pure code that is being benchmarked (see section 20.14) and changes in behaviour of code using unsafe operations (see section 20.15).
Flambda does not yet optimise array or string bounds checks. Neither does it take hints for optimisation from any assertions written by the user in the code.
Consult the Glossary at the end of this chapter for definitions of technical terms used below.
The Flambda optimisers provide a variety of command-line flags that may be used to control their behaviour. Detailed descriptions of each flag are given in the referenced sections. Those sections also describe any arguments which the particular flags take.
Commonly-used options:
Less commonly-used options:
Advanced options, only needed for detailed tuning:
Flambda operates in rounds: one round consists of a certain sequence of transformations that may then be repeated in order to achieve more satisfactory results. The number of rounds can be set manually using the -rounds parameter (although this is not necessary when using predefined optimisation levels such as with -O2 and -O3). For high optimisation the number of rounds might be set at 3 or 4.
Command-line flags that may apply per round, for example those with -cost in the name, accept arguments of the form:
The flags -Oclassic, -O2 and -O3 are applied before all other flags, meaning that certain parameters may be overridden without having to specify every parameter usually invoked by the given optimisation level.
Inlining refers to the copying of the code of a function to a place where the function is called. The code of the function will be surrounded by bindings of its parameters to the corresponding arguments.
The aims of inlining are:
These goals are often reached not just by inlining itself but also by other optimisations that the compiler is able to perform as a result of inlining.
When a recursive call to a function (within the definition of that function or another in the same mutually-recursive group) is inlined, the procedure is also known as unrolling. This is somewhat akin to loop peeling. For example, given the following code:
let rec fact x =
  if x = 0 then
    1
  else
    x * fact (x - 1)
let n = fact 4
unrolling once at the call site fact 4 produces (with the body of fact unchanged):
let n = if 4 = 0 then 1 else 4 * fact (4 - 1)
This simplifies to:
let n = 4 * fact 3
Flambda provides significantly enhanced inlining capabilities relative to previous versions of the compiler.
Inlining is performed together with all of the other Flambda optimisation passes, that is to say, after closure conversion. This has three particular advantages over a potentially more straightforward implementation prior to closure conversion:
In -Oclassic mode the behaviour of the Flambda inliner mimics previous versions of the compiler. (Code may still be subject to further optimisations not performed by previous versions of the compiler: functors may be inlined, constants are lifted and unused code is eliminated, all as described elsewhere in this chapter. See sections 20.3.3, 20.8.1 and 20.10.) At the definition site of a function, the body of the function is measured. It will then be marked as eligible for inlining (and hence inlined at every direct call site) if:
Non-Flambda versions of the compiler cannot inline functions that contain a definition of another function. However -Oclassic does permit this. Further, non-Flambda versions also cannot inline functions that are only themselves exposed as a result of a previous pass of inlining, but again this is permitted by -Oclassic. For example:
module M : sig
  val i : int
end = struct
  let f x =
    let g y = x + y in
    g
  let h = f 3
  let i = h 4  (* h is correctly discovered to be g and inlined *)
end
All of this contrasts with the normal Flambda mode, that is to say without -Oclassic, where:
The Flambda mode is described in the next section.
The Flambda inlining heuristics, used whenever the compiler is configured for Flambda and -Oclassic was not specified, make inlining decisions at call sites. This helps in situations where the context is important. For example:
let f b x =
  if b then
    x
  else
    ... big expression ...
let g x = f true x
In this case, we would like to inline f into g, because a conditional jump can be eliminated and the code size should reduce. If the inlining decision had been made after the declaration of f without seeing the use, its size would probably have made it ineligible for inlining; but at the call site, its final size can be known. Further, this function should probably not be inlined systematically: if b is unknown, or indeed false, there is little benefit to trade off against a large increase in code size. In the existing non-Flambda inliner this isn’t a great problem because chains of inlining are cut off fairly quickly. However it has led to excessive use of overly-large inlining parameters such as -inline 10000.
In more detail, at each call site the following procedure is followed:
Inlining within recursive functions of calls to other functions in the same mutually-recursive group is kept in check by an unrolling depth, described below. This ensures that functions are not unrolled to excess. (Unrolling is only enabled if -O3 optimisation level is selected and/or the -inline-max-unroll flag is passed with an argument greater than zero.)
There is nothing particular about functors that inhibits inlining compared to normal functions. To the inliner, these both look the same, except that functors are marked as such.
Applications of functors at toplevel are biased in favour of inlining. (This bias may be adjusted: see the documentation for -inline-lifting-benefit below.)
Applications of functors not at toplevel, for example in a local module inside some other expression, are treated by the inliner identically to normal function calls.
The inliner will be able to consider inlining a call to a function in a first class module if it knows which particular function is going to be called. The presence of the first-class module record that wraps the set of functions in the module does not per se inhibit inlining.
Method calls to objects are not at present inlined by Flambda.
If the -inlining-report option is provided to the compiler then a file will be emitted corresponding to each round of optimisation. For the OCaml source file basename.ml the files are named basename.round.inlining.org, with round a zero-based integer. Inside the files, which are formatted as “org mode”, will be found English prose describing the decisions that the inliner took.
Inlining typically results in an increase in code size, which if left unchecked, may not only lead to grossly large executables and excessive compilation times but also a decrease in performance due to worse locality. As such, the Flambda inliner trades off the change in code size against the expected runtime performance benefit, with the benefit being computed based on the number of operations that the compiler observes may be removed as a result of inlining.
For example given the following code:
let f b x =
  if b then
    x
  else
    ... big expression ...
let g x = f true x
it would be observed that inlining of f would remove:
Formally, an estimate of runtime performance benefit is computed by first summing the cost of the operations that are known to be removed as a result of the inlining and subsequent simplification of the inlined body. The individual costs for the various kinds of operations may be adjusted using the various -inline-...-cost flags as follows. Costs are specified as integers. All of these flags accept a single argument describing such integers using the conventions detailed in section 20.2.1.
(Default values are described in section 20.5 below.)
The initial benefit value is then scaled by a factor that attempts to compensate for the fact that the current point in the code, if under some number of conditional branches, may be cold. (Flambda does not currently compute hot and cold paths.) The factor, an estimate of the probability that the inliner really is on a hot path, is calculated as $\left(\frac{1}{1 + f}\right)^{d}$, where f is set by -inline-branch-factor and d is the nesting depth of branches at the current point. As the inliner descends into more deeply-nested branches, the benefit of inlining thus lessens.
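To give a feel for the numbers, the scaling factor can be evaluated directly. The function branch_scale below is a hypothetical helper written for this illustration, not part of the compiler; it computes (1 / (1 + f)) raised to the power d for a branch factor f and a nesting depth d.

```ocaml
(* Hypothetical helper: the factor (1 / (1 + f))^d applied to the
   initial benefit, for branch factor f and branch nesting depth d. *)
let branch_scale ~branch_factor ~depth =
  (1.0 /. (1.0 +. branch_factor)) ** float_of_int depth

let at_depth_0 = branch_scale ~branch_factor:0.1 ~depth:0
let at_depth_3 = branch_scale ~branch_factor:0.1 ~depth:3
```

With a branch factor of 0.1, a call site outside any branch keeps its full benefit, while one nested under three branches keeps roughly three quarters of it.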
The resulting benefit value is known as the estimated benefit.
The change in code size is also estimated: morally speaking it should be the change in machine code size, but since that is not available to the inliner, an approximation is used.
If the estimated benefit exceeds the increase in code size then the inlined version of the function will be kept. Otherwise the function will not be inlined.
Applications of functors at toplevel will be given an additional benefit (which may be controlled by the -inline-lifting-benefit flag) to bias inlining in such situations towards keeping the inlined version.
As described above, there are three parameters that restrict the search for inlining opportunities during speculation:
These parameters are ultimately bounded by the arguments provided to the corresponding command-line flags (or their default values):
Note in particular that -inline does not have the meaning that it has in the previous compiler or in -Oclassic mode. In both of those situations -inline was effectively some kind of basic assessment of inlining benefit. However in Flambda inlining mode it corresponds to a constraint on the search; the assessment of benefit is independent, as described above.
When speculation starts the inlining threshold starts at the value set by -inline (or -inline-toplevel if appropriate, see above). Upon making a speculative inlining decision the threshold is reduced by the code size of the function being inlined. If the threshold becomes exhausted, at or below zero, no further speculation will be performed.
The inlining depth starts at zero and is increased by one every time the inliner descends into another function. It is then decreased by one every time the inliner leaves such a function. If the depth exceeds the value set by -inline-max-depth then speculation stops. This parameter is intended as a general backstop for situations where the inlining threshold does not control the search sufficiently.
The unrolling depth applies to calls within the same mutually-recursive group of functions. Each time an inlining of such a call is performed the depth is incremented by one when examining the resulting body. If the depth reaches the limit set by -inline-max-unroll then speculation stops.
The inliner may discover a call site to a recursive function where something is known about the arguments: for example, they may be equal to some other variables currently in scope. In this situation it may be beneficial to specialise the function to those arguments. This is done by copying the declaration of the function (and any others involved in any same mutually-recursive declaration) and noting the extra information about the arguments. The arguments augmented by this information are known as specialised arguments. In order to try to ensure that specialisation is not performed uselessly, arguments are only specialised if it can be shown that they are invariant: in other words, during the execution of the recursive function(s) themselves, the arguments never change.
Unless overridden by an attribute (see below), specialisation of a function will not be attempted if:
The compiler can prove invariance of function arguments across multiple functions within a recursive group (although this has some limitations, as shown by the example below).
It should be noted that the unboxing of closures pass (see below) can introduce specialised arguments on non-recursive functions. (No other place in the compiler currently does this.)
The function iter, which applies f to each element of a list, might be written like so:
let rec iter f l =
  match l with
  | [] -> ()
  | h :: t ->
    f h;
    iter f t
and used like this:
let print_int x =
  print_endline (string_of_int x)
let run xs =
  iter print_int (List.rev xs)
The argument f to iter is invariant so the function may be specialised:
let run xs =
  let rec iter' f l =
    (* The compiler knows: f holds the same value as print_int throughout iter'. *)
    match l with
    | [] -> ()
    | h :: t ->
      f h;
      iter' f t
  in
  iter' print_int (List.rev xs)
The compiler notes down that for the function iter', the argument f is specialised to the constant closure print_int. This means that the body of iter' may be simplified:
let run xs =
  let rec iter' f l =
    (* The compiler knows: f holds the same value as print_int throughout iter'. *)
    match l with
    | [] -> ()
    | h :: t ->
      print_int h;  (* this is now a direct call *)
      iter' f t
  in
  iter' print_int (List.rev xs)
The call to print_int can indeed be inlined:
let run xs =
  let rec iter' f l =
    (* The compiler knows: f holds the same value as print_int throughout iter'. *)
    match l with
    | [] -> ()
    | h :: t ->
      print_endline (string_of_int h);
      iter' f t
  in
  iter' print_int (List.rev xs)
The unused specialised argument f may now be removed, leaving:
let run xs =
  let rec iter' l =
    match l with
    | [] -> ()
    | h :: t ->
      print_endline (string_of_int h);
      iter' t
  in
  iter' (List.rev xs)
The compiler cannot currently detect invariance in cases such as the following.
let rec iter_swap f g l =
  match l with
  | [] -> ()
  | 0 :: t ->
    iter_swap g f l
  | h :: t ->
    f h;
    iter_swap f g t
The benefit of specialisation is assessed in a similar way as for inlining. Specialised argument information may mean that the body of the function being specialised can be simplified: the removed operations are accumulated into a benefit. This, together with the size of the duplicated (specialised) function declaration, is then assessed against the size of the call to the original function.
The default settings (when not using -Oclassic) are for one round of optimisation using the following parameters.
Parameter | Setting |
-inline | 10 |
-inline-branch-factor | 0.1 |
-inline-alloc-cost | 7 |
-inline-branch-cost | 5 |
-inline-call-cost | 5 |
-inline-indirect-cost | 4 |
-inline-prim-cost | 3 |
-inline-lifting-benefit | 1300 |
-inline-toplevel | 160 |
-inline-max-depth | 1 |
-inline-max-unroll | 0 |
-unbox-closures-factor | 10 |
When -O2 is specified two rounds of optimisation are performed. The first round uses the default parameters (see above). The second uses the following parameters.
Parameter | Setting |
-inline | 25 |
-inline-branch-factor | Same as default |
-inline-alloc-cost | Double the default |
-inline-branch-cost | Double the default |
-inline-call-cost | Double the default |
-inline-indirect-cost | Double the default |
-inline-prim-cost | Double the default |
-inline-lifting-benefit | Same as default |
-inline-toplevel | 400 |
-inline-max-depth | 2 |
-inline-max-unroll | Same as default |
-unbox-closures-factor | Same as default |
When -O3 is specified three rounds of optimisation are performed. The first two rounds are as for -O2. The third round uses the following parameters.
Parameter | Setting |
-inline | 50 |
-inline-branch-factor | Same as default |
-inline-alloc-cost | Triple the default |
-inline-branch-cost | Triple the default |
-inline-call-cost | Triple the default |
-inline-indirect-cost | Triple the default |
-inline-prim-cost | Triple the default |
-inline-lifting-benefit | Same as default |
-inline-toplevel | 800 |
-inline-max-depth | 3 |
-inline-max-unroll | 1 |
-unbox-closures-factor | Same as default |
Should the inliner prove recalcitrant and refuse to inline a particular function, or if the observed inlining decisions are not to the programmer’s satisfaction for some other reason, inlining behaviour can be dictated by the programmer directly in the source code. One example where this might be appropriate is when the programmer, but not the compiler, knows that a particular function call is on a cold code path. It might be desirable to prevent inlining of the function so that the code size along the hot path is kept smaller, so as to increase locality.
The inliner is directed using attributes. For non-recursive functions (and one-step unrolling of recursive functions, although @unroll is clearer for this purpose) the following are supported:
For recursive functions the relevant attributes are:
A compiler warning will be emitted if it was found impossible to obey an annotation from an @inlined or @specialised attribute.
module F (M : sig type t end) = struct
  let[@inline never] bar x =
    x * 3
  let foo x =
    (bar [@inlined]) (42 + x)
end [@@inline never]
module X = F [@inlined] (struct type t = int end)
Simplification, which is run in conjunction with inlining, propagates information (known as approximations) about which variables hold what values at runtime. Certain relationships between variables and symbols are also tracked: for example, some variable may be known to always hold the same value as some other variable; or perhaps some variable may be known to always hold the value pointed to by some symbol.
The propagation can help to eliminate allocations in cases such as:
let f x y =
  ...
  let p = x, y in
  ...
  ... (fst p) ... (snd p) ...
The projections from p may be replaced by uses of the variables x and y, potentially meaning that p becomes unused.
The propagation performed by the simplification pass is also important for discovering which functions flow to indirect call sites. This can enable the transformation of such call sites into direct call sites, which makes them eligible for an inlining transformation.
Note that no information is propagated about the contents of strings, even in safe-string mode, because it cannot yet be guaranteed that they are immutable throughout a given program.
Expressions found to be constant will be lifted to symbol bindings—that is to say, they will be statically allocated in the object file—when they evaluate to boxed values. Such constants may be straightforward numeric constants, such as the floating-point number 42.0, or more complicated values such as constant closures.
Lifting of constants to toplevel reduces allocation at runtime.
The compiler aims to share constants lifted to toplevel such that there are no duplicate definitions. However if .cmx files are hidden from the compiler then maximal sharing may not be possible.
The following language semantics apply specifically to constant float arrays. (By “constant float array” is meant an array consisting entirely of floating point numbers that are known at compile time. A common case is a literal such as [| 42.0; 43.0; |].)
Toplevel let-expressions may be lifted to symbol bindings to ensure that the corresponding bound variables are not captured by closures. If the defining expression of a given binding is found to be constant, it is bound as such (the technical term is a let-symbol binding).
Otherwise, the symbol is bound to a (statically-allocated) preallocated block containing one field. At runtime, the defining expression will be evaluated and the first field of the block filled with the resulting value. This initialise-symbol binding causes one extra indirection but ensures, by virtue of the symbol’s address being known at compile time, that uses of the value are not captured by closures.
It should be noted that the blocks corresponding to initialise-symbol bindings are kept alive forever, by virtue of them occurring in a static table of GC roots within the object file. This extended lifetime of expressions may on occasion be surprising. If it is desired to create some non-constant value (for example when writing GC tests) that does not have this extended lifetime, then it may be created and used inside a function, with the application point of that function (perhaps at toplevel)—or indeed the function declaration itself—marked as to never be inlined. This technique prevents lifting of the definition of the value in question (assuming of course that it is not constant).
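The technique described above might look as follows. This is only a sketch: run_test and table are hypothetical names, and the attribute is placed on the function declaration, which is one of the two options mentioned.

```ocaml
(* The declaration is marked never to be inlined, so the array below is
   created and used inside the function at each call, rather than being
   lifted to a toplevel symbol binding. *)
let[@inline never] run_test () =
  let table = Array.make 4 0 in   (* non-constant value, local to the call *)
  table.(0) <- 1;
  Array.fold_left (+) 0 table

let result = run_test ()
```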
The transformations in this section relate to the splitting apart of boxed (that is to say, non-immediate) values. They are largely intended to reduce allocation, which tends to result in a runtime performance profile with lower variance and smaller tails.
This transformation is enabled unless -no-unbox-free-vars-of-closures is provided.
Variables that appear in closure environments may themselves be boxed values. As such, they may be split into further closure variables, each of which corresponds to some projection from the original closure variable(s). This transformation is called unboxing of closure variables or unboxing of free variables of closures. It is only applied when there is reasonable certainty that there are no uses of the boxed free variable itself within the corresponding function bodies.
In the following code, the compiler observes that the closure returned from the function f contains a variable pair (free in the body of f) that may be split into two separate variables.
let f x0 x1 = let pair = x0, x1 in Printf.printf "foo\n"; fun y -> fst pair + snd pair + y
After some simplification one obtains:
let f x0 x1 =
  let pair_0 = x0 in
  let pair_1 = x1 in
  Printf.printf "foo\n";
  fun y ->
    pair_0 + pair_1 + y
and then:
let f x0 x1 =
  Printf.printf "foo\n";
  fun y ->
    x0 + x1 + y
The allocation of the pair has been eliminated.
This transformation does not operate if it would cause the closure to contain more than twice as many closure variables as it did beforehand.
This transformation is enabled unless -no-unbox-specialised-args is provided.
It may become the case during compilation that one or more invariant arguments to a function become specialised to a particular value. When such values are themselves boxed the corresponding specialised arguments may be split into more specialised arguments corresponding to the projections out of the boxed value that occur within the function body. This transformation is called unboxing of specialised arguments. It is only applied when there is reasonable certainty that the boxed argument itself is unused within the function.
If the function in question is involved in a recursive group then unboxing of specialised arguments may be immediately replicated across the group based on the dataflow between invariant arguments.
Having been given the following code, the compiler will inline loop into f, and then observe inv being invariant and always the pair formed by adding 42 and 43 to the argument x of the function f.
let rec loop inv xs =
  match xs with
  | [] -> fst inv + snd inv
  | x::xs -> x + loop2 xs inv
and loop2 ys inv =
  match ys with
  | [] -> 4
  | y::ys -> y - loop inv ys

let f x =
  Printf.printf "%d\n" (loop (x + 42, x + 43) [1; 2; 3])
Since the functions have sufficiently few arguments, more specialised arguments will be added. After some simplification one obtains:
let f x =
  let rec loop' xs inv_0 inv_1 =
    match xs with
    | [] -> inv_0 + inv_1
    | x::xs -> x + loop2' xs inv_0 inv_1
  and loop2' ys inv_0 inv_1 =
    match ys with
    | [] -> 4
    | y::ys -> y - loop' ys inv_0 inv_1
  in
  Printf.printf "%d\n" (loop' [1; 2; 3] (x + 42) (x + 43))
The allocation of the pair within f has been removed. (Since the two closures for loop' and loop2' are constant they will also be lifted to toplevel with no runtime allocation penalty. This would also happen without having run the transformation to unbox specialised arguments.)
The transformation to unbox specialised arguments never introduces extra allocation.
The transformation will not unbox arguments if it would result in the original function having sufficiently many arguments so as to inhibit tail-call optimisation.
The transformation is implemented by creating a wrapper function that accepts the original arguments. Meanwhile, the original function is renamed and extra arguments are added corresponding to the unboxed specialised arguments; this new function is called from the wrapper. The wrapper will then be inlined at direct call sites. Indeed, all call sites will be direct unless -unbox-closures is being used, since they will have been generated by the compiler when originally specialising the function. (In the case of -unbox-closures other functions may appear with specialised arguments; in this case there may be indirect calls and these will incur a small penalty owing to having to bounce through the wrapper. The technique of direct call surrogates used for -unbox-closures is not used by the transformation to unbox specialised arguments.)
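The wrapper scheme can be imitated by hand at source level. The following is only a sketch under simplifying assumptions: the names sum_original, sum_unboxed and sum_wrapper are hypothetical, the compiler performs the rewrite on its intermediate representation rather than on source code, and the boxed argument is dropped outright here for brevity.

```ocaml
(* Original function: [inv] is an invariant, boxed argument. *)
let rec sum_original inv xs =
  match xs with
  | [] -> fst inv + snd inv
  | x :: xs -> x + sum_original inv xs

(* Renamed function: extra arguments stand for the projections
   [fst inv] and [snd inv] that occur within the body. *)
let rec sum_unboxed xs inv_0 inv_1 =
  match xs with
  | [] -> inv_0 + inv_1
  | x :: xs -> x + sum_unboxed xs inv_0 inv_1

(* Wrapper accepting the original arguments; once inlined at a direct
   call site, the projections can be resolved there. *)
let sum_wrapper inv xs = sum_unboxed xs (fst inv) (snd inv)
```

Both versions compute the same result; only the way the pair reaches the recursive calls differs.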
This transformation is not enabled by default. It may be enabled using the -unbox-closures flag.
The transformation replaces closure variables by specialised arguments. The aim is to cause more closures to become closed. It is particularly applicable, as a means of reducing allocation, where the function concerned cannot be inlined or specialised. For example, some non-recursive function might be too large to inline; or some recursive function might offer no opportunities for specialisation perhaps because its only argument is one of type unit.
At present there may be a small penalty in terms of actual runtime performance when this transformation is enabled, although more stable performance may be obtained due to reduced allocation. It is recommended that developers experiment to determine whether the option is beneficial for their code. (It is expected that in the future it will be possible for the performance degradation to be removed.)
In the following code (which might typically occur when g is too large to inline) the value of x would usually be communicated to the application of the + function via the closure of g.
let f x =
  let g y =
    x + y
  in
  (g [@inlined never]) 42
Unboxing of the closure causes the value for x inside g to be passed as an argument to g rather than through its closure. This means that the closure of g becomes constant and may be lifted to toplevel, eliminating the runtime allocation.
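A hand-written analogue of the result may help to picture the effect. This is a sketch only: the names f_before, f_after and g_closed are illustrative, and the compiler applies the rewrite internally rather than producing such source code.

```ocaml
(* Before: [g] captures [x], so a closure for [g] is built on each
   call to [f_before]. *)
let f_before x =
  let g y = x + y in
  (g [@inlined never]) 42

(* After, written by hand: [x] is an ordinary argument, so [g_closed]
   has no free variables and can live at toplevel. *)
let g_closed x y = x + y

let f_after x = (g_closed [@inlined never]) x 42
```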
The transformation is implemented by adding a new wrapper function in the manner of that used when unboxing specialised arguments. The closure variables are still free in the wrapper, but the intention is that when the wrapper is inlined at direct call sites, the relevant values are passed directly to the main function via the new specialised arguments.
Adding such a wrapper will penalise indirect calls to the function (which might exist in arbitrary places; remember that this transformation is not for example applied only on functions the compiler has produced as a result of specialisation) since such calls will bounce through the wrapper. To mitigate this, if a function is small enough when weighed up against the number of free variables being removed, it will be duplicated by the transformation to obtain two versions: the original (used for indirect calls, since we can do no better) and the wrapper/rewritten function pair as described in the previous paragraph. The wrapper/rewritten function pair will only be used at direct call sites of the function. (The wrapper in this case is known as a direct call surrogate, since it takes the place of another function—the unchanged version used for indirect calls—at direct call sites.)
The -unbox-closures-factor command line flag, which takes an integer, may be used to adjust the point at which a function is deemed large enough to be ineligible for duplication. The benefit of duplication is scaled by the integer before being evaluated against the size.
In the following code, there are two closure variables that would typically cause closure allocations. One is called fv and occurs inside the function baz; the other is called z and occurs inside the function bar. In this toy (yet sophisticated) example we again use an attribute to simulate the typical situation where the first argument of baz is too large to inline.
let foo c =
  let rec bar zs fv =
    match zs with
    | [] -> []
    | z::zs ->
      let rec baz f = function
        | [] -> []
        | a::l -> let r = fv + ((f [@inlined never]) a) in r :: baz f l
      in
      (baz (fun y -> z + y) [z; 2; 3; 4]) @ bar zs fv
  in
  Printf.printf "%d" (List.length (bar [1; 2; 3; 4] c))
The code resulting from applying -O3 -unbox-closures to this code passes the free variables via function arguments in order to eliminate all closure allocation in this example (aside from any that might be performed inside printf).
The simplification pass removes unused let bindings so long as their corresponding defining expressions have “no effects”. See the section “Treatment of effects” below for the precise definition of this term.
This transformation is analogous to the removal of let-expressions whose defining expressions have no effects. It operates instead on symbol bindings, removing those that have no effects.
This transformation is only enabled by default for specialised arguments. It may be enabled for all arguments using the -remove-unused-arguments flag.
The pass analyses functions to determine which arguments are unused. Removal is effected by creating a wrapper function, which will be inlined at every direct call site, that accepts the original arguments and then discards the unused ones before calling the original function. As a consequence, this transformation may be detrimental if the original function is usually indirectly called, since such calls will now bounce through the wrapper. (The technique of direct call surrogates used to reduce this penalty during unboxing of closure variables (see above) does not yet apply to the pass that removes unused arguments.)
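The shape of the generated wrapper can be pictured with the following hand-written sketch. The names scale and scale_impl are hypothetical, and the compiler creates the equivalent functions internally rather than in source form.

```ocaml
(* The function with its unused argument removed. *)
let scale_impl factor x = factor * x

(* Wrapper with the original arity: it accepts the unused argument and
   discards it before calling [scale_impl]. An indirect call pays for
   this extra hop; a direct call has the wrapper inlined. *)
let scale factor _unused x = scale_impl factor x
```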
This transformation performs an analysis across the whole compilation unit to determine whether there exist closure variables that are never used. Such closure variables are then eliminated. (Note that this has to be a whole-unit analysis because a projection of a closure variable from some particular closure may have propagated to an arbitrary location within the code due to inlining.)
Flambda performs a simple analysis analogous to that performed elsewhere in the compiler that can transform refs into mutable variables that may then be held in registers (or on the stack as appropriate) rather than being allocated on the OCaml heap. This only happens so long as the reference concerned can be shown to not escape from its defining scope.
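As an illustration of the escape condition, consider the two functions below (the names are chosen for this sketch only). In the first the reference never leaves its defining scope, so it is a candidate for becoming a mutable variable; in the second it is captured by the returned closure and must remain on the heap.

```ocaml
(* [acc] is only read and written inside [sum_to]: it does not escape. *)
let sum_to n =
  let acc = ref 0 in
  for i = 1 to n do
    acc := !acc + i
  done;
  !acc

(* [count] escapes through the returned closure. *)
let make_counter () =
  let count = ref 0 in
  fun () ->
    incr count;
    !count
```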
This transformation discovers closure variables that are known to be equal to specialised arguments. Such closure variables are replaced by the specialised arguments; the closure variables may then be removed by the “removal of unused closure variables” pass (see below).
The Flambda optimisers classify expressions in order to determine whether an expression:
This is done by forming judgements on the effects and the coeffects that might be performed were the expression to be executed. Effects talk about how the expression might affect the world; coeffects talk about how the world might affect the expression.
Effects are classified as follows:
There is a single classification for coeffects:
It is assumed in the compiler that, subject to data dependencies, expressions with neither effects nor coeffects may be reordered with respect to other expressions.
Compilation of modules that are able to be statically allocated (for example, the module corresponding to an entire compilation unit, as opposed to a first class module dependent on values computed at runtime) initially follows the strategy used for bytecode. A sequence of let-bindings, which may be interspersed with arbitrary effects, surrounds a record creation that becomes the module block. The Flambda-specific transformation follows: these bindings are lifted to toplevel symbols, as described above.
Especially when writing benchmarking suites that run non-side-effecting algorithms in loops, it may be found that the optimiser entirely elides the code being benchmarked. This behaviour can be prevented by using the Sys.opaque_identity function (which indeed behaves as a normal OCaml function and does not possess any “magic” semantics). The documentation of the Sys module should be consulted for further details.
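A minimal benchmarking sketch along these lines follows. The functions work and bench are hypothetical stand-ins for the code under test and its harness; only Sys.opaque_identity and Sys.time come from the standard library.

```ocaml
(* Hypothetical side-effect-free function to be benchmarked. *)
let work n =
  let rec go acc i = if i > n then acc else go (acc + (i * i)) (i + 1) in
  go 0 1

(* Runs [work] repeatedly and returns the elapsed processor time.
   The input and the result both pass through [Sys.opaque_identity],
   so the optimiser cannot treat the call as dead or constant. *)
let bench iterations =
  let start = Sys.time () in
  for _i = 1 to iterations do
    ignore (Sys.opaque_identity (work (Sys.opaque_identity 1000)))
  done;
  Sys.time () -. start
```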
The behaviour of the Flambda simplification pass means that certain unsafe operations, which may be safe without Flambda or when using previous versions of the compiler, must not be used. This specifically refers to functions found in the Obj module.
In particular, it is forbidden to change any value (for example using Obj.set_field or Obj.set_tag) that is not mutable. (Values returned from C stubs are always treated as mutable.) The compiler will emit warning 59 if it detects such a write—but it cannot warn in all cases. Here is an example of code that will trigger the warning:
let f x =
  let a = 42, x in
  (Obj.magic a : int ref) := 1;
  fst a
This is unsafe because the simplification pass believes that fst a holds the value 42; and indeed it must, unless type soundness has been broken via unsafe operations.
If it must be the case that code has to be written that triggers warning 59, but the code is known to actually be correct (for some definition of correct), then Sys.opaque_identity may be used to wrap the value before unsafe operations are performed upon it. Great care must be taken when doing this to ensure that the opacity is added at the correct place. It must be emphasised that this use of Sys.opaque_identity is only for exceptional cases. It should not be used in normal code or to try to guide the optimiser.
As an example, this code will return the integer 1:
let f x =
  let a = Sys.opaque_identity (42, x) in
  (Obj.magic a : int ref) := 1;
  fst a
However the following code will still return 42:
let f x =
  let a = 42, x in
  Sys.opaque_identity (Obj.magic a : int ref) := 1;
  fst a
High levels of inlining performed by Flambda may expose bugs in code thought previously to be correct. Take care, for example, not to add type annotations that claim some mutable value is always immediate if it might be possible for an unsafe operation to update it to a boxed value.
The following terminology is used in this chapter of the manual.
This manual is also available in PDF, PostScript, DVI, plain text, as a bundle of HTML files, and as a bundle of Emacs Info files.
This document was translated from LaTeX by HeVeA.
Type definitions bind type constructors to data types: either variant types, record types, type abbreviations, or abstract data types. They also bind the value constructors and record fields associated with the definition.
See also the following language extensions: private types, generalized algebraic datatypes, attributes, extension nodes, extensible variant types and inline records.
Type definitions are introduced by the type keyword, and consist of one or several simple definitions, possibly mutually recursive, separated by the and keyword. Each simple definition defines one type constructor.
A simple definition consists of a lowercase identifier, possibly preceded by one or several type parameters, and followed by an optional type equation, then an optional type representation, and then a constraint clause. The identifier is the name of the type constructor being defined.
In the right-hand side of type definitions, references to one of the type constructor names being defined are considered as recursive, unless type is followed by nonrec. The nonrec keyword was introduced in OCaml 4.02.2.
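The effect of nonrec can be seen in the following small sketch, where the module name Wrapped is chosen purely for illustration.

```ocaml
type t = int

module Wrapped = struct
  (* With [nonrec], the [t] on the right-hand side refers to the outer
     [t] (that is, int). Without it, the definition would refer to
     itself and be rejected as a cyclic abbreviation. *)
  type nonrec t = t list
end

let xs : Wrapped.t = [1; 2; 3]
```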
The optional type parameters are either one type variable 'ident, for type constructors with one parameter, or a list of type variables ('ident1, …, 'identn), for type constructors with several parameters. Each type parameter may be prefixed by a variance constraint + (resp. -) indicating that the parameter is covariant (resp. contravariant). These type parameters can appear in the type expressions of the right-hand side of the definition, optionally restricted by a variance constraint; i.e. a covariant parameter may only appear on the right side of a functional arrow (more precisely, follow the left branch of an even number of arrows), and a contravariant parameter only on the left side (left branch of an odd number of arrows). If the type has a representation or an equation, and the parameter is free (i.e. not bound via a type constraint to a constructed type), its variance constraint is checked but subtyping etc. will use the inferred variance of the parameter, which may be less restrictive; otherwise (i.e. for abstract types or non-free parameters), the variance must be given explicitly, and the parameter is invariant if no variance is given.
The optional type equation = typexpr makes the defined type equivalent to the type expression typexpr: one can be substituted for the other during typing. If no type equation is given, a new type is generated: the defined type is incompatible with any other type.
The optional type representation describes the data structure representing the defined type, by giving the list of associated constructors (if it is a variant type) or associated fields (if it is a record type). If no type representation is given, nothing is assumed on the structure of the type besides what is stated in the optional type equation.
The type representation = [|] constr-decl { | constr-decl } describes a variant type. The constructor declarations constr-decl1, …, constr-decln describe the constructors associated to this variant type. The constructor declaration constr-name of typexpr1 * … * typexprn declares the name constr-name as a non-constant constructor, whose arguments have types typexpr1 …typexprn. The constructor declaration constr-name declares the name constr-name as a constant constructor. Constructor names must be capitalized.
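For instance, the following sketch (with an illustrative type shape and function area) declares one constant constructor and two non-constant constructors, then consumes them by pattern matching.

```ocaml
type shape =
  | Point                   (* constant constructor *)
  | Circle of float         (* non-constant, one argument *)
  | Rect of float * float   (* non-constant, two arguments *)

let pi = 4.0 *. atan 1.0

let area = function
  | Point -> 0.0
  | Circle r -> pi *. r *. r
  | Rect (w, h) -> w *. h
```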
The type representation = { field-decl { ; field-decl } [;] } describes a record type. The field declarations field-decl1, …, field-decln describe the fields associated to this record type. The field declaration field-name : poly-typexpr declares field-name as a field whose argument has type poly-typexpr. The field declaration mutable field-name : poly-typexpr behaves similarly; in addition, it allows physical modification of this field. Immutable fields are covariant, mutable fields are non-variant. Both mutable and immutable fields may have explicitly polymorphic types. The polymorphism of the contents is statically checked whenever a record value is created or modified. Extracted values may have their types instantiated.
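The three kinds of field can be illustrated together. The record type counter and its field names below are assumptions made for this sketch.

```ocaml
type counter = {
  name : string;        (* immutable field *)
  mutable hits : int;   (* mutable field *)
  id : 'a. 'a -> 'a;    (* explicitly polymorphic field *)
}

let c = { name = "requests"; hits = 0; id = (fun x -> x) }

(* Physical modification is only allowed on the mutable field. *)
let () = c.hits <- c.hits + 1

(* The extracted polymorphic field is instantiated at two types. *)
let pair = (c.id 1, c.id "one")
```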
The two components of a type definition, the optional equation and the optional representation, can be combined independently, giving rise to four typical situations:
The type variables appearing as type parameters can optionally be prefixed by + or - to indicate that the type constructor is covariant or contravariant with respect to this parameter. This variance information is used to decide subtyping relations when checking the validity of :> coercions (see section 6.7.6).
For instance, type +'a t declares t as an abstract type that is covariant in its parameter; this means that if the type τ is a subtype of the type σ, then τ t is a subtype of σ t. Similarly, type -'a t declares that the abstract type t is contravariant in its parameter: if τ is a subtype of σ, then σ t is a subtype of τ t. If no + or - variance annotation is given, the type constructor is assumed non-variant in the corresponding parameter. For instance, the abstract type declaration type 'a t means that τ t is neither a subtype nor a supertype of σ t if τ is a subtype of σ.
The variance indicated by the + and - annotations on parameters is enforced only for abstract and private types, or when there are type constraints. Otherwise, for abbreviations, variant and record types without type constraints, the variance properties of the type constructor are inferred from its definition, and the variance annotations are only checked for conformance with the definition.
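The following sketch shows a covariance annotation on an abstract type being used by a :> coercion. The module Box and the types small and large are illustrative names only.

```ocaml
module Box : sig
  type +'a t                 (* abstract and declared covariant *)
  val make : 'a -> 'a t
  val get : 'a t -> 'a
end = struct
  type 'a t = Box of 'a      (* covariance is inferred and checked here *)
  let make x = Box x
  let get (Box x) = x
end

type small = [ `A ]
type large = [ `A | `B ]

(* [small] is a subtype of [large]; since [Box.t] is covariant,
   [small Box.t] may be coerced to [large Box.t]. *)
let widen (b : small Box.t) = (b :> large Box.t)
```

Removing the + from the signature would make the coercion in widen ill-typed, since an abstract type without annotation is treated as non-variant.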
The construct constraint 'ident = typexpr allows the specification of type parameters. Any actual type argument corresponding to the type parameter ident has to be an instance of typexpr (more precisely, ident and typexpr are unified). Type variables of typexpr can appear in the type equation and the type declaration.
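As a small sketch of a constrained parameter (the type name pair_list is hypothetical), the definition below only accepts type arguments that are pairs, and uses the variables of the constraint in its equation.

```ocaml
(* The parameter 'a must be an instance of 'x * 'y. *)
type 'a pair_list = ('x * 'y) list constraint 'a = 'x * 'y

let ok : (int * string) pair_list = [ (1, "one"); (2, "two") ]

(* A declaration such as [int pair_list] would be rejected, because
   int cannot be unified with 'x * 'y. *)
```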
Exception definitions add new constructors to the built-in variant type exn of exception values. The constructors are declared as for a definition of a variant type.
The form exception constr-decl generates a new exception, distinct from all other exceptions in the system. The form exception constr-name = constr gives an alternate name to an existing exception.
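Both forms are shown in the sketch below. The exception names Parse_error and Missing and the function classify are illustrative assumptions; Not_found is the predefined exception.

```ocaml
(* A new exception, distinct from all others. *)
exception Parse_error of string * int

(* An alternate name for the existing exception Not_found. *)
exception Missing = Not_found

let classify f =
  try
    f ();
    "ok"
  with
  | Parse_error (msg, line) -> Printf.sprintf "%s at line %d" msg line
  | Not_found -> "missing"
```

Raising Missing is caught by the Not_found handler, since both names denote the same constructor.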
This chapter describes the OCaml high-performance native-code compiler ocamlopt, which compiles OCaml source files to native code object files and links these object files to produce standalone executables.
The native-code compiler is only available on certain platforms. It produces code that runs faster than the bytecode produced by ocamlc, at the cost of increased compilation time and executable code size. Compatibility with the bytecode compiler is extremely high: the same source code should run identically when compiled with ocamlc and ocamlopt.
It is not possible to mix native-code object files produced by ocamlopt with bytecode object files produced by ocamlc: a program must be compiled entirely with ocamlopt or entirely with ocamlc. Native-code object files produced by ocamlopt cannot be loaded in the toplevel system ocaml.
The ocamlopt command has a command-line interface very close to that of ocamlc. It accepts the same types of arguments, and processes them sequentially, after all options have been processed:
The implementation is checked against the interface file x.mli (if it exists) as described in the manual for ocamlc (chapter 8).
The output of the linking phase is a regular Unix or Windows executable file. It does not need ocamlrun to run.
The following command-line options are recognized by ocamlopt. The options -pack, -a, -shared, -c and -output-obj are mutually exclusive.
If -cclib or -ccopt options are passed on the command line, these options are stored in the resulting .cmxa library. Then, linking with this library automatically adds back the -cclib and -ccopt options as if they had been provided on the command line, unless the -noautolink option is given.
The environment variable OCAML_COLOR is considered if -color is not provided. Its values are auto/always/never as above.
If the given directory starts with +, it is taken relative to the standard library directory. For instance, -I +labltk adds the subdirectory labltk of the standard library to the search path.
The -opaque option, available since 4.04, disables cross-module optimization information for the currently compiled unit. When compiling .mli interface, using -opaque marks the compiled .cmi interface so that subsequent compilations of modules that depend on it will not rely on the corresponding .cmx file, nor warn if it is absent. When the native compiler compiles a .ml implementation, using -opaque generates a .cmx that does not contain any cross-module optimization information.
Using this option may degrade the quality of generated code, but it reduces compilation time, both on clean and incremental builds. Indeed, with the native compiler, when the implementation of a compilation unit changes, all the units that depend on it may need to be recompiled – because the cross-module information may have changed. If the compilation unit whose implementation changed was compiled with -opaque, no such recompilation needs to occur. This option can thus be used, for example, to get faster edit-compile-test feedback loops.
Unix: See the Unix manual page for gprof(1) for more information about the profiles. Full support for gprof is only available for certain platforms (currently: Intel x86 32 and 64 bits under Linux, BSD and MacOS X). On other platforms, the -p option will result in a less precise profile (no call graph information, only a time profile).
Windows: The -p option does not work under Windows.
ocamlopt -pack -o P.cmx A.cmx B.cmx C.cmx
generates compiled files P.cmx, P.o and P.cmi describing a compilation unit having three sub-modules A, B and C, corresponding to the contents of the object files A.cmx, B.cmx and C.cmx. These contents can be referenced as P.A, P.B and P.C in the remainder of the program.
The .cmx object files being combined must have been compiled with the appropriate -for-pack option. In the example above, A.cmx, B.cmx and C.cmx must have been compiled with ocamlopt -for-pack P.
Multiple levels of packing can be achieved by combining -pack with -for-pack. Consider the following example:
ocamlopt -for-pack P.Q -c A.ml
ocamlopt -pack -o Q.cmx -for-pack P A.cmx
ocamlopt -for-pack P -c B.ml
ocamlopt -pack -o P.cmx Q.cmx B.cmx
The resulting P.cmx object file has sub-modules P.Q, P.Q.A and P.B.
The warning-list argument is a sequence of warning specifiers, with no separators between them. A warning specifier is one of the following:
Warning numbers and letters which are out of the range of warnings that are currently defined are ignored. The warnings are as follows.
The default setting is -w +a-4-6-7-9-27-29-32..39-41..42-44-45-48-50. It is displayed by ocamlopt -help. Note that warnings 5 and 10 are not always triggered, depending on the internals of the type checker.
Note: it is not recommended to use warning sets (i.e. letters) as arguments to -warn-error in production code, because this can break your build when future versions of OCaml add some new warnings.
The default setting is -warn-error -a+31 (only warning 31 is fatal).
The IA32 code generator (Intel Pentium, AMD Athlon) supports the following additional option:
The AMD64 code generator (64-bit versions of Intel Pentium and AMD Athlon) supports the following additional options:
The Sparc code generator supports the following additional options:
The default is to generate code for SPARC version 7, which runs on all SPARC processors.
The compiler command line can be modified “from the outside” with the following mechanisms. These are experimental and subject to change. They should be used only for experimental and development work, not in released packages.
The error messages are almost identical to those of ocamlc. See section 8.4.
Executables generated by ocamlopt are native, stand-alone executable files that can be invoked directly. They do not depend on the ocamlrun bytecode runtime system nor on dynamically-loaded C/OCaml stub libraries.
During execution of an ocamlopt-generated executable, the following environment variables are also consulted:
This section lists the known incompatibilities between the bytecode compiler and the native-code compiler. Except on those points, the two compilers should generate code that behaves identically.
See also the following language extensions: integer literals for types int32, int64 and nativeint, quoted strings and extension literals.
The syntactic class of constants comprises literals from the four base types (integers, floating-point numbers, characters, character strings), and constant constructors from both normal and polymorphic variants, as well as the special constants false, true, (), [], and [||], which behave like constant constructors, and begin end, which is equivalent to ().
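A few of these constants are shown below, bound to illustrative names so that each can be inspected in the toplevel.

```ocaml
let an_int = 42
let a_float = 3.5
let a_char = 'c'
let a_string = "text"
let a_bool = true
let a_constructor = None      (* constant constructor of a normal variant *)
let a_variant = `Red          (* constant polymorphic variant constructor *)
let empty_list = []
let unit_1 = ()
let unit_2 = begin end        (* equivalent to () *)
```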
Module types are the module-level equivalent of type expressions: they specify the general shape and type properties of modules.
See also the following language extensions: recovering the type of a module, substitution inside a signature, type-level module aliases, attributes, extension nodes and generative functors.
The expression modtype-path is equivalent to the module type bound to the name modtype-path. The expression ( module-type ) denotes the same type as module-type.
Signatures are type specifications for structures. Signatures sig … end are collections of type specifications for value names, type names, exceptions, module names and module type names. A structure will match a signature if the structure provides definitions (implementations) for all the names specified in the signature (and possibly more), and these definitions meet the type requirements given in the signature.
An optional ;; is allowed after each specification in a signature. It serves as a syntactic separator with no semantic meaning.
A specification of a value component in a signature is written val value-name : typexpr, where value-name is the name of the value and typexpr its expected type.
The form external value-name : typexpr = external-declaration is similar, except that it requires in addition the name to be implemented as the external function specified in external-declaration (see chapter 19).
A specification of one or several type components in a signature is written type typedef { and typedef } and consists of a sequence of mutually recursive definitions of type names.
Each type definition in the signature specifies an optional type equation = typexpr and an optional type representation = constr-decl … or = { field-decl … }. The implementation of the type name in a matching structure must be compatible with the type expression specified in the equation (if given), and have the specified representation (if given). Conversely, users of that signature will be able to rely on the type equation or type representation, if given. More precisely, we have the following four situations:
The specification exception constr-decl in a signature requires the matching structure to provide an exception with the name and arguments specified in the definition, and makes the exception available to all users of the structure.
A specification of one or several classes in a signature is written class class-spec { and class-spec } and consists of a sequence of mutually recursive definitions of class names.
Class specifications are described more precisely in section 6.9.4.
A specification of one or several class types in a signature is written class type classtype-def { and classtype-def } and consists of a sequence of mutually recursive definitions of class type names. Class type specifications are described more precisely in section 6.9.5.
A specification of a module component in a signature is written module module-name : module-type, where module-name is the name of the module component and module-type its expected type. Modules can be nested arbitrarily; in particular, functors can appear as components of structures and functor types as components of signatures.
For specifying a module component that is a functor, one may write
instead of
A module type component of a signature can be specified either as a manifest module type or as an abstract module type.
An abstract module type specification module type modtype-name allows the name modtype-name to be implemented by any module type in a matching signature, but hides the implementation of the module type from all users of the signature.
A manifest module type specification module type modtype-name = module-type requires the name modtype-name to be implemented by the module type module-type in a matching signature, but makes the equality between modtype-name and module-type apparent to all users of the signature.
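The difference between the two forms can be sketched as follows. All module and module type names here (ORDERED, CONTAINER, C, Int_key) are hypothetical.

```ocaml
module type ORDERED = sig
  type t
  val compare : t -> t -> int
end

module type CONTAINER = sig
  module type ELT             (* abstract: implementation hidden *)
  module type KEY = ORDERED   (* manifest: equality with ORDERED visible *)
end

module C : CONTAINER = struct
  module type ELT = sig type t end
  module type KEY = ORDERED
end

(* Users of [C] know that [C.KEY] is [ORDERED], so it can be used and
   constrained like [ORDERED] itself. Nothing is known about [C.ELT]. *)
module Int_key : C.KEY with type t = int = struct
  type t = int
  let compare = compare
end
```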
The expression open module-path in a signature does not specify any components. It simply affects the parsing of the following items of the signature, allowing components of the module denoted by module-path to be referred to by their simple names name instead of path accesses module-path . name. The scope of the open stops at the end of the signature expression.
The expression include module-type in a signature performs textual inclusion of the components of the signature denoted by module-type. It behaves as if the components of the included signature were copied at the location of the include. The module-type argument must refer to a module type that is a signature, not a functor type.
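For example, in the sketch below (with illustrative names) the signature PRINTABLE_EQ contains the components of PRINTABLE as if they had been copied in, followed by one additional value.

```ocaml
module type PRINTABLE = sig
  type t
  val to_string : t -> string
end

module type PRINTABLE_EQ = sig
  include PRINTABLE            (* brings in [t] and [to_string] *)
  val equal : t -> t -> bool
end

module Int_printable : PRINTABLE_EQ with type t = int = struct
  type t = int
  let to_string = string_of_int
  let equal = ( = )
end
```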
The module type expression functor ( module-name : module-type1 ) -> module-type2 is the type of functors (functions from modules to modules) that take as argument a module of type module-type1 and return as result a module of type module-type2. The module type module-type2 can use the name module-name to refer to type components of the actual argument of the functor. If the type module-type2 does not depend on type components of module-name, the module type expression can be simplified with the alternative short syntax module-type1 -> module-type2 . No restrictions are placed on the type of the functor argument; in particular, a functor may take another functor as argument (“higher-order” functor).
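A functor type whose result depends on a type component of its argument can be sketched as follows. The names SHOW, LIST_SHOW, List_show and Int_show are assumptions made for this example.

```ocaml
module type SHOW = sig
  type t
  val show : t -> string
end

(* The result signature refers to the argument's type component [X.t],
   so the long syntax with a named argument is needed. *)
module type LIST_SHOW =
  functor (X : SHOW) -> sig
    val show_list : X.t list -> string
  end

module List_show : LIST_SHOW =
  functor (X : SHOW) -> struct
    let show_list xs = "[" ^ String.concat "; " (List.map X.show xs) ^ "]"
  end

module Int_show = struct
  type t = int
  let show = string_of_int
end

module Int_list_show = List_show (Int_show)
```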
Assuming module-type denotes a signature, the expression module-type with mod-constraint { and mod-constraint } denotes the same signature where type equations have been added to some of the type specifications, as described by the constraints following the with keyword. The constraint type [type-parameters] typeconstr = typexpr adds the type equation = typexpr to the specification of the type component named typeconstr of the constrained signature. The constraint module module-path = extended-module-path adds type equations to all type components of the sub-structure denoted by module-path, making them equivalent to the corresponding type components of the structure denoted by extended-module-path.
For instance, if the module type name S is bound to the signature
sig type t module M: (sig type u end) end
then S with type t=int denotes the signature
sig type t=int module M: (sig type u end) end
and S with module M = N denotes the signature
sig type t module M: (sig type u=N.u end) end
A functor taking two arguments of type S that share their t component is written
functor (A: S) (B: S with type t = A.t) ...
Constraints are added left to right. After each constraint has been applied, the resulting signature must be a subtype of the signature before the constraint was applied. Thus, the with operator can only add information on the type components of a signature, but never remove information.
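The constrained signatures above can be exercised with a short sketch. The signature S is the one from the text; the modules N, A, B, Pair and P, and the values forty_two and greeting, are assumptions introduced for illustration.

```ocaml
module type S = sig
  type t
  module M : sig type u end
end

(* Hypothetical structure used on the right-hand side of "with module". *)
module N = struct type u = string end

(* S with type t = int: outside A, the type A.t is known to be int. *)
module A : S with type t = int = struct
  type t = int
  module M = struct type u = bool end
end

(* S with module M = N: outside B, the type B.M.u is known to be N.u. *)
module B : S with module M = N = struct
  type t = float
  module M = N
end

let forty_two : A.t = 42         (* accepted because A.t = int *)
let greeting : B.M.u = "hello"   (* accepted because B.M.u = N.u = string *)

(* Functor taking two arguments of type S that share their t component. *)
module Pair (X : S) (Y : S with type t = X.t) = struct
  (* Well-typed only because the constraint makes Y.t equal to X.t. *)
  let same (x : X.t) : Y.t = x
end

module P = Pair (A) (A)
```

Without the constraints, A.t and B.M.u would be abstract and the two annotated value definitions would be rejected by the type-checker.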
The ocamldep command scans a set of OCaml source files (.ml and .mli files) for references to external compilation units, and outputs dependency lines in a format suitable for the make utility. This ensures that make will compile the source files in the correct order, and recompile the files that need it when a source file is modified.
The typical usage is:
ocamldep options *.mli *.ml > .depend
where *.mli *.ml expands to all source files in the current directory and .depend is the file that should contain the dependencies. (See below for a typical Makefile.)
Dependencies are generated both for compiling with the bytecode compiler ocamlc and with the native-code compiler ocamlopt.
The following command-line options are recognized by ocamldep.
With the -modules option, ocamldep outputs raw dependencies of the form
filename: Module1 Module2 ... ModuleN
where Module1, …, ModuleN are the names of the compilation units referenced within the file filename, but these names are not resolved to source file names. Such raw dependencies cannot be used by make, but can be post-processed by other tools such as Omake.
Here is a template Makefile for an OCaml program.
OCAMLC=ocamlc
OCAMLOPT=ocamlopt
OCAMLDEP=ocamldep
INCLUDES=                 # all relevant -I options here
OCAMLFLAGS=$(INCLUDES)    # add other options for ocamlc here
OCAMLOPTFLAGS=$(INCLUDES) # add other options for ocamlopt here

# prog1 should be compiled to bytecode, and is composed of three
# units: mod1, mod2 and mod3.

# The list of object files for prog1
PROG1_OBJS=mod1.cmo mod2.cmo mod3.cmo

prog1: $(PROG1_OBJS)
	$(OCAMLC) -o prog1 $(OCAMLFLAGS) $(PROG1_OBJS)

# prog2 should be compiled to native-code, and is composed of two
# units: mod4 and mod5.

# The list of object files for prog2
PROG2_OBJS=mod4.cmx mod5.cmx

prog2: $(PROG2_OBJS)
	$(OCAMLOPT) -o prog2 $(OCAMLFLAGS) $(PROG2_OBJS)

# Common rules
.SUFFIXES: .ml .mli .cmo .cmi .cmx

.ml.cmo:
	$(OCAMLC) $(OCAMLFLAGS) -c $<

.mli.cmi:
	$(OCAMLC) $(OCAMLFLAGS) -c $<

.ml.cmx:
	$(OCAMLOPT) $(OCAMLOPTFLAGS) -c $<

# Clean up
clean:
	rm -f prog1 prog2
	rm -f *.cm[iox]

# Dependencies
depend:
	$(OCAMLDEP) $(INCLUDES) *.mli *.ml > .depend

include .depend
If you use module aliases to give shorter names to modules, you need to change the above definitions. Assuming that your map file is called mylib.mli, here are minimal modifications.
OCAMLFLAGS=$(INCLUDES) -open Mylib

mylib.cmi: mylib.mli
	$(OCAMLC) $(INCLUDES) -no-alias-deps -w -49 -c $<

depend:
	$(OCAMLDEP) $(INCLUDES) -map mylib.mli $(PROG1_OBJS:.cmo=.ml) > .depend
Note that in this case you should not compute dependencies for mylib.mli together with the other files, hence the need to pass explicitly the list of files to process. If mylib.mli itself has dependencies, you should compute them using -as-map.
sig
type t = t
val equal : t -> t -> bool
val hash : t -> int
val compare : t -> t -> int
val output : out_channel -> t -> unit
val print : Format.formatter -> t -> unit
end