You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
717 lines
32 KiB
717 lines
32 KiB
\documentclass{article}
|
|
|
|
\usepackage[english]{babel}
|
|
|
|
\usepackage{fontspec} % encoding
|
|
\usepackage{hyperref} % internal and external links (e.g. mail, www)
|
|
\usepackage{authblk} % author affiliation handling
|
|
\renewcommand\Affilfont{\itshape\small} %% chktex 6
|
|
\usepackage{bbding}
|
|
\usepackage{fancyvrb} % source code
|
|
\input{styledef.tex} % source code
|
|
\usepackage[most]{tcolorbox} % boxes (for source code or figures)
|
|
\usepackage{minted} % source code
|
|
\usepackage{csquotes}
|
|
|
|
\usepackage[
|
|
autolang=other,
|
|
backend=biber, % choix de l'outil de traitement
|
|
backref=true, % liens dans la bibliographie pour remonter dans le texte
|
|
backrefstyle=none, % afficher toutes les utilisations de la référence
|
|
bibstyle=alphabetic, % style pour les clés des références dans la bibliographie : [initialesAnnée]
|
|
citestyle=alphabetic, % style pour les clés des références dans le texte : [initialesAnnée]
|
|
%datamodel=software, % swh
|
|
sorting=ynt, % bibliographie triée par année, nom, titre
|
|
]{biblatex}
|
|
|
|
\definecolor{bg}{HTML}{282c34}
|
|
\definecolor{lightbg}{HTML}{D7D3CB}
|
|
\tcbuselibrary{minted}
|
|
\tcbset{listing engine=minted,colback=bg,enhanced,frame hidden}
|
|
\setminted{autogobble=true,bgcolor=bg,style=one-dark}
|
|
|
|
\setmainfont{Linux Libertine O}
|
|
\setsansfont{TeX Gyre Heros}
|
|
\setmonofont[Scale=MatchLowercase]{RobotoMono Nerd Font}
|
|
|
|
\newtcblisting{wast}{listing only, minted language=wast}
|
|
\newtcblisting{ocaml}{listing only, minted language=ocaml}
|
|
|
|
\title{Wasocaml: a compiler from OCaml to WebAssembly}
|
|
|
|
\author[1,2]{Léo Andrès}
|
|
\author[1]{Pierre Chambart}
|
|
|
|
\affil[1]{OCamlPro SAS, 21 rue de Châtillon, 75014 Paris, France}
|
|
\affil[2]{Université Paris-Saclay, CNRS, ENS Paris-Saclay, Inria, Laboratoire Méthodes Formelles, 91190 Gif-sur-Yvette, France}
|
|
|
|
\bibliography{bib}
|
|
|
|
\begin{document}
|
|
|
|
\maketitle
|
|
|
|
\tableofcontents
|
|
|
|
\begin{abstract}
|
|
In this presentation, we explore the compilation of OCaml to WebAssembly (Wasm).
|
|
The limitations of JavaScript as the default language of the web led to the
|
|
development of Wasm, a secure and predictable-performance modular language.
|
|
However, compiling garbage-collected languages to Wasm presents challenges,
|
|
including the need to compile or re-implement the runtime.
|
|
Some Wasm extensions such as Wasm-GC are developed by the Wasm working groups to facilitate
|
|
the compilation of garbage-collected languages.
|
|
We present Wasocaml, an OCaml to Wasm-GC compiler.
|
|
It is the first compiler for a real-world functional programming language targeting Wasm-GC.
|
|
Wasocaml confirms the adequacy of the Wasm-GC proposal for a functional language and
|
|
had an impact on the design of the proposal.
|
|
Moreover, the compilation strategies developed within Wasocaml are applicable to other
|
|
compilers and languages.
|
|
Indeed, two compilers already used a design similar to our.
|
|
Finally, we describe how we plan to handle the C/JavaScript FFIs and effects
|
|
handlers, in order to allow developers to easily deploy programs mixing C,
|
|
JavaScript and OCaml code to the web, while maintaining good performances.
|
|
\end{abstract}
|
|
|
|
% TODO:
|
|
% I wish you said a little more about:
|
|
% int31ref (you mention it in places, but never explain what it is; it sounds like an important point)
|
|
|
|
% TODO:
|
|
% dire que i31 plutôt que i63 parce qu'on s'attend à avoir des runtimes avec du pointer-compression (comme V8 (et firefox ?)) qui eux utilisent du 32 bits en interne
|
|
|
|
\section{Context}
|
|
\subsection{WebAssembly}
|
|
JavaScript is widely thought as being the de facto language of the
|
|
web. However, it does show its limitations when it comes to
|
|
performance, security and safety. In order to remedy this, WebAssembly (Wasm for short)~\cite{Haa+17} has been developped as a secure modular language of predictable
|
|
performance. Its usage is expanding beyond the web: finding
|
|
applications in the cloud (Fastly, Cloudflare) and in the creation of
|
|
portable binaries.
|
|
|
|
The first version of Wasm was meant to compile libraries in languages
|
|
such as C, C++ or Rust to expose them for consumption by a JavaScript
|
|
program. It was designed with the explicit aim that future versions
|
|
would catter to the specific needs of other languages and uses. This version of Wasm can be seen as simplified C. Here is an example:
|
|
|
|
\begin{wast}
|
|
(func $fact (param $x i32) (result i32)
|
|
(if (i32.eq (local.get $x) (i32.const 0))
|
|
(then (i32.const 1))
|
|
(else
|
|
(i32.mul
|
|
(local.get $x)
|
|
(call $fact (i32.sub (local.get $x) (i32.const 1)))))))
|
|
\end{wast}
|
|
|
|
A Wasm module is executed inside what is called an \emph{embedder} or a \emph{host}. In the context of the web, it is a browser and the host language
|
|
is JavaScript. A Wasm program only manipulates scalar values (integers and floats)
|
|
but not garbage collected values coming from the host language, such as JavaScript objects.
|
|
This would be useful in a browser context, but requires some interaction
|
|
with the GC of the embedder. This led to various extensions that we describe
|
|
in the next section.
|
|
|
|
\subsection{Wasm Extensions}
|
|
|
|
We discuss only the most notable ones, that is to say the tail call~\cite{Web17c}, reference types~\cite{Web18}, typed function references~\cite{Web20}, GC~\cite{Web17b}, and exception handling~\cite{Web17a} extensions.
|
|
|
|
\subsubsection{References and GC}
|
|
|
|
The aim of these proposals is to allow the program to manipulate
|
|
references to different kinds of values such as opaque objects, functions,
|
|
and garbage-collected values. Since Wasm has to be memory safe, those values
|
|
are not exposed as raw pointers. Instead, the type system
|
|
guarantees the proper manipulation of such values.
|
|
|
|
\begin{wast}
|
|
(type $f1 (func (param i32) (result i32)))
|
|
(type $t1 (array i31ref))
|
|
(type $t2 (struct
|
|
((field $a (ref $t1))
|
|
(field $b (ref $f1))
|
|
(field $c i32))))
|
|
|
|
(func (param $x (ref $t2)) (result i32)
|
|
(i31.get_u
|
|
(array.get $t1
|
|
(i32.const 0)
|
|
(struct.get $t2 $a (local.get $x)))))
|
|
\end{wast}
|
|
|
|
% TODO: explain why in the previous code example we cannot manipulate the pointer and how they are hidden
|
|
|
|
However, Wasm is rather a compilation target than a programming
|
|
language. It needs to be able to represent the values of any kind of
|
|
source language. It was not deemed possible to have a type system
|
|
powerful enough to do that. Instead, the decision was to have a simple
|
|
type system, but to rely on dynamic cast to fill the gaps, and guarantee that
|
|
those casts are efficient.
|
|
|
|
There is a sub-typing relation that controls which casts are
|
|
allowed. Some types are predefined by Wasm: \mintinline{wast}{anyref},
|
|
\mintinline{wast}{eqref}, \mintinline{wast}{externref},
|
|
\mintinline{wast}{funcref} and \mintinline{wast}{i31ref}.
|
|
Others are user defined: array and struct. There is a structural
|
|
sub-typing relation between user types. Upcasts are implicit, downcasts
|
|
are explicit and incorrect ones lead to runtime errors.
|
|
It is possible to dynamically test for type compatibility.
|
|
|
|
\begin{tcolorbox}[colback=lightbg]
|
|
\begin{center}
|
|
\includegraphics[width=20em]{figure_type_hierarchy.mps}
|
|
\end{center}
|
|
\end{tcolorbox}
|
|
|
|
\paragraph{Contribution to the proposals.}
|
|
|
|
The general rule for the Wasm committee is to only include features
|
|
with a demonstrated use case. As there are currently very few
|
|
compilers targeting the GC-proposal, some features were lacking
|
|
conclusive evidence of their usefulness. An example is the \mintinline{wast}{i31ref}
|
|
type that is not required by the Dart compiler (the only one targeting the GC-proposal at the time).
|
|
Wasocaml demonstrates the usefulness of \mintinline{wast}{i31ref}.
|
|
It also validates the GC-proposal on a functional language.
|
|
We presented Wasocaml to the Wasm-GC working group~\cite{AC23}.
|
|
It helped in convincing the working group to keep \mintinline{wast}{i31ref} in the proposal.
|
|
|
|
% TODO: at this point, readers have no idea what the i31 ref type is
|
|
% the wording suggests that it's important
|
|
% maybe an explanation would be useful ?
|
|
|
|
\subsubsection{Exception}
|
|
|
|
Another serious limitation of Wasm1 is the absence of an exception
|
|
mechanism. This is not a problem for compiling either C or Rust, but
|
|
C++ does require them. It is of course possible to encode exception, but
|
|
this has a significant runtime cost and limits the interactions
|
|
with JavaScript (that has exceptions).
|
|
|
|
\subsubsection{Tail Calls}
|
|
|
|
Wasm1 does not allow tail-calls to be optimized. This is
|
|
a show-stopper when compiling functional languages. In the tail-call proposal, new instructions such as \mintinline{wast}{return_call} guarantee that a tail-call will be optimized.
|
|
|
|
\section{Value Representation}
|
|
|
|
There have been attempts at targeting the early versions of Wasm from OCaml.
|
|
The first one simply compiles the OCaml runtime (including the GC)
|
|
and interpreter using emscripten~\cite{The14} (a C to Wasm compiler). The second one is a smarter and more efficient version of the same approach. Both are part of the WASICaml~\cite{Sto21} project.
|
|
|
|
These approaches suffer from the same problem as C programs running on Wasm: they
|
|
cannot natively manipulate external values, such as DOM objects in the
|
|
browser. Indeed, in the first versions of Wasm, the only types
|
|
are scalars (\mintinline{wast}{i32}, \mintinline{wast}{i64}, \mintinline{wast}{f32} and
|
|
\mintinline{wast}{f64}). There is no way to directly
|
|
manipulate values from the embedder (\emph{e.g.} JavaScript objects). It could be
|
|
possible to identify objects with an integer and map them to their
|
|
corresponding objects on the embedder side. This approach comes with
|
|
the usual limitations of having two runtimes manipulating (indirectly)
|
|
garbage-collected values: it is tedious and easily leads to memory leaks
|
|
(cycles between the two runtimes cannot be collected).
|
|
|
|
To properly interact with the embedder, we need to leverage the reference extension.
|
|
This extension adds new types to the language, e.g.\ \mintinline{wast}{externref} which
|
|
is an opaque type representing a value from the embedder. References cannot be stored in
|
|
the linear memory of Wasm thus they cannot appear inside OCaml values when using the
|
|
previously described compilation scheme.
|
|
|
|
In order to use references, we require a completely different compilation strategy.
|
|
We do not use the linear memory.
|
|
Our strategy is close to the native OCaml one, which we describe now.
|
|
|
|
\subsection{Native OCaml Value Representation}
|
|
|
|
OCaml uses a uniform representation of values known as pointer
|
|
tagging. One bit is used to indicate if the value is a small scalar or
|
|
a pointer to a heap-allocated block. Granted we are working on $n$
|
|
bits values, we have to choose one bit that will be the \emph{tag
|
|
bit}. For instance, if we choose the least significant bit, we start
|
|
by testing its value:
|
|
|
|
\begin{tcolorbox}[colback=lightbg]
|
|
\begin{center}
|
|
\includegraphics[width=10em]{figure_ocaml_bit1.mps}
|
|
\end{center}
|
|
\end{tcolorbox}
|
|
|
|
If $b_{0} = 0$, then the whole value is a pointer:
|
|
|
|
\begin{tcolorbox}[colback=lightbg]
|
|
\begin{center}
|
|
\includegraphics[width=20em]{figure_ocaml_bit2.mps}
|
|
\end{center}
|
|
\end{tcolorbox}
|
|
|
|
%TODO: split in two (add accolade en bas à gauche pour montrer la taille du pointeur/scalaire et enlever la partie droite)
|
|
|
|
If $b_{1} = 1$, then the $n - 1$ most significant bits are a small scalar and $b_{0}$ is ignored:
|
|
|
|
\begin{tcolorbox}[colback=lightbg]
|
|
\begin{center}
|
|
\includegraphics[width=20em]{figure_ocaml_bit3.mps}
|
|
\end{center}
|
|
\end{tcolorbox}
|
|
|
|
In the second case, we talk about \emph{small scalars} instead of
|
|
scalars because we can only represents $2^{n - 1}$ values instead of
|
|
the $2^n$ that are representable when all bits are available. For
|
|
pointers, we do not lose anything as they need to be \emph{aligned}
|
|
anyway and the least significant bit is always zero.
|
|
|
|
\begin{ocaml}
|
|
let array1 = [| 1; 2; 3 |]
|
|
let array2 = [| 42; 99; 666 |]
|
|
let pair = (array1, array2)
|
|
let x = 42
|
|
\end{ocaml}
|
|
|
|
\begin{tcolorbox}[colback=lightbg]
|
|
\begin{center}
|
|
\includegraphics[width=13em]{figure_ocaml_layout.mps}
|
|
\end{center}
|
|
\end{tcolorbox}
|
|
|
|
\subsection{OCaml Value Representation in Wasm}
|
|
|
|
As in native OCaml, we use a uniform representation. We cannot
|
|
use integers, as they cannot be used as pointers. We need to use a
|
|
reference type. The most general one is \mintinline{wast}{anyref}.
|
|
We can be more precise and use \mintinline{wast}{eqref}.
|
|
This is the type of all values that can be tested for physical equality.
|
|
It is a super type of OCaml values. This also allows us to test for
|
|
equality without requiring an additional cast.
|
|
|
|
\paragraph{Small scalars.}
|
|
|
|
All OCaml values that are represented as integers in the native
|
|
OCaml are represented as \mintinline{wast}{i31ref}. This includes
|
|
the \mintinline{ocaml}{int} and \mintinline{ocaml}{char} types,
|
|
but also all constant constructors of sum types.
|
|
|
|
\paragraph{Arrays.}
|
|
|
|
The OCaml array type is directly represented as a Wasm array.
|
|
|
|
\paragraph{Blocks.}
|
|
|
|
For other kinds of blocks, there are two possible choices: we can
|
|
either use a struct with a field for the tag and one field per OCaml
|
|
value field, or use arrays of \mintinline{wast}{eqref}
|
|
with the tag stored at position $0$.
|
|
|
|
\subsubsection{Blocks as Structs}
|
|
|
|
A block of size one is represented using the type \mintinline{wast}{$block1}
|
|
while for size two it is \mintinline{wast}{$block2}:
|
|
|
|
\begin{wast}
|
|
(type $block1 (struct
|
|
(field $tag i8)
|
|
(field $field0 eqref)))
|
|
|
|
(type $block2 (sub $block1) (struct
|
|
(field $tag i8)
|
|
(field $field0 eqref)
|
|
(field $field1 eqref)))
|
|
\end{wast}
|
|
|
|
The type \mintinline{wast}{$block2} is declared as a subtype of
|
|
\mintinline{wast}{$block1} because in the OCaml IRs, the primitive
|
|
for accessing a block field is untyped:
|
|
when accessing the $n$-th field of the block the only available information
|
|
is that the block has size at least $n + 1$.
|
|
It could be possible to propagate size metadata in the IRs but that wouldn't be sufficient because of untyped primitives such as \mintinline{ocaml}{Obj.field}
|
|
that we need to support in order to be compatible with existing code.
|
|
|
|
% TODO: more explicit example:
|
|
\begin{wast}
|
|
(func $snd (param $x eqref) (result eqref)
|
|
(struct.get $block2 1
|
|
(ref.cast $block2 (local.get $x))))
|
|
\end{wast}
|
|
|
|
\subsubsection{Blocks as Arrays}
|
|
|
|
We represent blocks with an array of eqref:
|
|
|
|
\begin{wast}
|
|
(type $block (array eqref))
|
|
\end{wast}
|
|
|
|
The tag is stored in the cell at index 0 of the array.
|
|
Reading its value is implemented by getting the cell and casting it to an integer:
|
|
|
|
\begin{wast}
|
|
(func $tag (param $x eqref) (result i32)
|
|
(i31.get_u (ref.cast i31ref
|
|
(array.get $block
|
|
(i32.const 0)
|
|
(ref.cast $block (local.get $x))))))
|
|
\end{wast}
|
|
|
|
Thus accessing the field $n$ of the OCaml block amounts
|
|
to accessing the field $n + 1$ of the array:
|
|
|
|
\begin{wast}
|
|
(func $snd (param $x eqref) (result eqref)
|
|
(array.get $block
|
|
(i32.const 2)
|
|
(ref.cast $block (local.get $x))))
|
|
\end{wast}
|
|
|
|
\subsubsection{Block Representation Tradeoffs}
|
|
|
|
The array representation is simpler but requires (implicit) bound checks at each
|
|
field access and a cast to read the tag.
|
|
|
|
On the other hand, the struct representation requires a more complex cast for the
|
|
Wasm runtime (a subtyping test). A compiler propagating more types
|
|
could use finer Wasm type information, providing a precise type for
|
|
each struct field. This would allow fewer casts.
|
|
|
|
The V8 runtime allows to consider casts as no-ops
|
|
(only for test purposes). The speedup we measured is around 10\%.
|
|
That gives us an upper bound on the actual runtime cost of casts.
|
|
|
|
OCaml blocks can be arbitrarily large, and very large ones do occur
|
|
in practice. In particular, modules are compiled as blocks and tend
|
|
to be on the larger side. We have seen in the wild some examples
|
|
containing thousands of fields. Wasm only allows a subtyping chain
|
|
of length $64$, which is far from sufficient for the struct encoding.
|
|
|
|
% TODO: explain more
|
|
% But a variant can be devised to
|
|
% circumvent that limitation: it is possible to represent those as trees
|
|
% of structs. We have not yet tried this solution and instead use the one described next.
|
|
|
|
\subsubsection{Boxed Numbers}
|
|
|
|
Raw Wasm scalars are not subtypes of \mintinline{wast}{eqref} thus they
|
|
cannot be used directly to represent OCaml boxed numbers. We need to box them inside a struct in order to make them compatible with
|
|
our representation:
|
|
|
|
\begin{wast}
|
|
(type $boxed_float (struct (field $v f64)))
|
|
(type $boxed_int64 (struct (field $v i64)))
|
|
\end{wast}
|
|
|
|
\subsubsection{Closures}
|
|
|
|
Wasm \mintinline{wast}{funcref} are functions, not closures, hence we need to produce
|
|
values containing both the function and its environment. The only Wasm
|
|
type construction that can contain both \mintinline{wast}{funcref} and other values are
|
|
structs. Thus, a closure is a struct containing a \mintinline{wast}{funcref} and the captured
|
|
variables. As an example, here is the type of a closure with two captured variables:
|
|
|
|
% TODO: explain more (why closure1 )
|
|
% TODO: est-ce que pour chaque fermeture on génère un type différent
|
|
|
|
\begin{wast}
|
|
(type $closure1 (struct
|
|
(field funcref)
|
|
(field $v1 eqref)
|
|
(field $v2 eqref)))
|
|
\end{wast}
|
|
|
|
In order to reduce casts and to handle mutually recursive functions,
|
|
the actual representation is a bit more complex. This is the only place where we
|
|
need recursive Wasm types.
|
|
|
|
\section{Compilation}
|
|
|
|
We use the Flambda IR of the OCaml compiler as input for the Wasm generation.
|
|
This is a step of the compilation chain where most of the
|
|
high-level OCaml-specific optimisations are already applied. Also in
|
|
this IR, the closure conversion pass is already performed. Most of the
|
|
constructions of this IR maps quite directly to Wasm ones.
|
|
|
|
\subsection{Control flow}
|
|
|
|
Control flow and continuations have a direct equivalent with Wasm \mintinline{wast}{block},
|
|
\mintinline{wast}{loop}, \mintinline{wast}{br_table}, and \mintinline{wast}{if} instructions.
|
|
Low level OCaml primitives to handle exceptions are quite similar to Wasm ones.
|
|
In OCaml, it is possible to generate new exceptions at runtime by using \emph{e.g.} the
|
|
\mintinline{ocaml}{let exception} syntax or functors and first-class modules.
|
|
This is not possible in the Wasm exception proposal.
|
|
Thus, we use the same Wasm exception everywhere and manage the rest on the side by ourselves,
|
|
using an identifier to discriminate between different exceptions.
|
|
|
|
\subsection{Currification}
|
|
The main difference revolves around functions. In OCaml, functions
|
|
take only one argument. However, in pratice, functions look like they have
|
|
more than one. Without any special management this would mean that
|
|
most of the code would be functions producing closures that would be
|
|
immediately applied. To handle that, internally, OCaml does have
|
|
functions taking multiple arguments, with some special handling for
|
|
times when they are partially applied. This information is explicit
|
|
at the Flambda level. In the native OCaml compiler, the transformation handling that
|
|
occurs in a later step called cmmgen.
|
|
Hence, we have to duplicate this in Wasocaml. Compiling this
|
|
requires some kind of structural subtyping on closures such that,
|
|
closures for functions of arity one are supertypes of all the other
|
|
closures. Thankfully there are easy encodings for that in Wasm.
|
|
% TODO: explicit closures sub-typing depending on the arity and the encoding in wasm
|
|
|
|
%% TODO example ml partial apply
|
|
|
|
\subsection{Stack Representation}
|
|
|
|
Most of the remaining work revolves around
|
|
translating from let bound expressions to a stack based language
|
|
without producing overly naive code. Also, we do not need to care too much
|
|
about low level Wasm specific optimisations as we rely on Binaryen~\cite{Web15}
|
|
(a quite efficient Wasm to Wasm optimizer) for those.
|
|
|
|
\subsection{Unboxing}
|
|
|
|
The main optimisation available in native OCaml that we are missing is
|
|
number unboxing. As OCaml values have a uniform representation, only
|
|
small scalars can fit in a true OCaml value. This means that the
|
|
types \mintinline{ocaml}{nativeint}, \mintinline{ocaml}{int32},
|
|
\mintinline{ocaml}{int64} and \mintinline{ocaml}{float} have to be
|
|
boxed. In numerical code, lots of intermediate values of type float
|
|
are created, and in that case, the allocation time to box
|
|
numbers completely dominates the actual computation time.
|
|
To limit that problem, there is an optimisation called unboxing performed
|
|
during the cmmgen pass that tends to eliminate most of the
|
|
useless allocations. As this pass is performed after Flambda, and was
|
|
not required to produce a complete working compiler, this was left
|
|
for future work. Note that the end plan is to use the next version of
|
|
Flambda, which does a much better job at unboxing. For the time being,
|
|
we can expect Binaryen to perform local unboxing in some cases.
|
|
|
|
\subsection{FFI}
|
|
|
|
A lot of OCaml code in the wild interacts with C or JavaScript code through bindings written using dedicated FFI mechanisms. We plan to allow re-use of existing bindings when compiling to Wasm. Currently, using C bindings when targeting the browser is not possible. People have to rewrite their code in pure OCaml or using JavaScript bindings. Compiling C bindings directly to Wasm allows to re-use them as-is.
|
|
|
|
\paragraph{C.}
|
|
|
|
Some extensions recently added to Clang~\cite{Lat08} allow to
|
|
compile C code to Wasm in a way that makes reusing existing
|
|
OCaml bindings to C code possible with almost no change.
|
|
We would only have to provide a modified version of the OCaml native FFI headers files, replacing the usual macros with hand written Wasm
|
|
functions. The only limitation that we forsee is that the \mintinline{c}{Field} macro will not be an l-value anymore; a new \mintinline{c}{Set_field} macro will be needed
|
|
instead (as it was originally proposed for OCaml 5).
|
|
|
|
\paragraph{JavaScript.}
|
|
|
|
To re-use existing bindings of OCaml to JavaScript code, we need one more extension: the reference-typed strings proposal~\cite{Web22}. Almost all external calls in the JavaScript FFI goes through the
|
|
\mintinline{ocaml}{Js.Unsafe.meth_call} function
|
|
(of type \mintinline{ocaml}{'a -> string -> any array -> 'b})
|
|
which can be exposed to the Wasm module by the embedder as a function of type:
|
|
|
|
\begin{wast}
|
|
(func $meth_call
|
|
(param $obj externref)
|
|
(param $method stringref)
|
|
(param $args $anyarray)
|
|
(result externref))
|
|
\end{wast}
|
|
|
|
This calls a method of name \mintinline{wast}{$method} on the object \mintinline{wast}{$obj} with the
|
|
arguments \mintinline{wast}{$args}. The JavaScript side manages all the dynamic typing.
|
|
|
|
\subsection{Expected Performances}
|
|
|
|
We cannot yet produce benchmarks on real sized programs,
|
|
Wasocaml is still a prototype and is not yet integrated
|
|
into existing build systems. We are able to run the
|
|
classical functional micro benchmarks using the experimental
|
|
branch of V8 and the results are quite convincing. While
|
|
we do not expect to be competitive with the native code
|
|
compiler, the performance degradation seems to be almost
|
|
constant (around twice slower).
|
|
|
|
% TODO: give a ratio for wasm/bytecode ??
|
|
|
|
We are able to compile an OCaml implementation of the
|
|
Knuth-Bendix algorithm~\cite{KB83}.
|
|
For now, it leads to runtime errors.
|
|
We have not been able to reproduce the error on a smaller case.
|
|
It is unclear whether our generated Wasm is incorrect or if we are
|
|
hitting a bug in the experimental V8 support for all Wasm extensions
|
|
we require.
|
|
|
|
% TODO: this is quite a minimal form of performance evaluation...
|
|
|
|
Compared to a JavaScript VM, a Wasm compiler is a much simpler beast
|
|
that can compile ahead of time. For this reason, various Wasm engines
|
|
are expected to behave quite similarly. They do not show any of the wild
|
|
impredictability that browsers tend to demonstrate with JavaScript.
|
|
Indeed, compiling OCaml to JS using jsoo leads to results that
|
|
are usually also twice as slow as native code in the best cases, but
|
|
can sometimes be much slower in an unpredictable fashion.
|
|
|
|
Currently there is no other Wasm runtime supporting all the extensions we require.
|
|
SpiderMonkey does not have tail-call. The reference interpreter implementation of
|
|
the various extensions are split in separate repository and merging them requires
|
|
some work. Nevertheless, we expect performances of Wasm-GC to vary across implementations in
|
|
a not too different way than they do for C-compiled Wasm programs.
|
|
Although the implementation choices space is larger when it comes to a full GC
|
|
than when implementing what is needed in Wasm1 (\emph{e.g.} register allocation).
|
|
|
|
\section{Perspectives}
|
|
|
|
As the first version of the implementation was intended as a
|
|
demonstration, it remains a bit rough around the edges. In particular only a
|
|
fraction of the externals from the standard library externals are
|
|
provided as handwritten Wasm.
|
|
The only unsupported part of the language is the objects fragment (by
|
|
lack of time rather than due to any specific complexity).
|
|
The source code of Wasocaml is publicly available~\cite{AC22}.
|
|
|
|
\subsection{Effect Handlers Support}
|
|
|
|
Our compiler is based on OCaml 4.14. So effect handlers were out of
|
|
the scope. There are three strategies to handle them.
|
|
|
|
\paragraph{CPS Compilation.}
|
|
|
|
It is possible to represent effect handlers as continuations,
|
|
using whole-program CPS transformation~\cite{Hil+17}.
|
|
However it is not ideal for two reasons. It requires the program to contain
|
|
OCaml code only. Moreover, it prevents the use of the stack of the
|
|
target language and instead stores that stack in the heap, with a non-negligible
|
|
cost. It can be mitigated by only applying the transformation to
|
|
code that cannot be proved to not use effect handlers but then it is no longer compatible with separate compilation and the optimisation needs to be done as a whole program transformation.
|
|
This is the choice made by js\_of\_ocaml.
|
|
|
|
% TODO: does it work with language interop ?
|
|
|
|
\paragraph{Wasm Stack Switching.}
|
|
|
|
There is an ongoing proposal, called stack switching~\cite{Web21b}, that exactly
|
|
matches the needs for OCaml effect handlers. With it the translation
|
|
would be quite straightforward.
|
|
|
|
\paragraph{JavaScript Promise Integration.}
|
|
|
|
Another strategy is available when targeting runtimes providing
|
|
both Wasm and JavaScript. There is another ongoing proposal called
|
|
JavaScript promise integration~\cite{Web21a}.
|
|
With this proposal, effects handlers can be implemented using a JavaScript device.
|
|
This proposal is likely to land before the proper stack switching one.
|
|
It might be a temporary solution.
|
|
|
|
\section{Related Work}
|
|
|
|
\subsection{Garbage Collected Languages to Wasm}
|
|
|
|
\subsubsection{Targeting Wasm1}
|
|
|
|
\paragraph{Go.}
|
|
Go compiles to Wasm1~\cite{Mus18}.
|
|
In order to be able to re-use the Go GC, it must be able to inspect the stack.
|
|
This is not possible within Wasm.
|
|
Thus it has to maintain the stack by itself using the memory,
|
|
which is much slower than using the Wasm stack.
|
|
% TODO: talk about coroutines ?
|
|
% TODO: talk about tinygo ?
|
|
|
|
\paragraph{Haskell.}
|
|
Haskell also compiles to Wasm~\cite{Sha22}.
|
|
It has constraints similar to Go and uses similar solutions.
|
|
In particular, it also uses the memory to manage the stack.
|
|
% TODO: explain that they had two backends and that Asterius was deprecated ?
|
|
|
|
\subsubsection{Targeting Wasm-GC}
|
|
|
|
\paragraph{Dart.}
|
|
Dart~\cite{The22} compiles to Wasm-GC.
|
|
It does not uses \mintinline{wast}{i31ref} and boxes scalars in
|
|
a \mintinline{wast}{struct}.
|
|
% TODO: maybe explain this better... (it's not clear how integers are represented natively)
|
|
|
|
\paragraph{Scheme.}
|
|
|
|
The last addition to the small family of compiler targeting Wasm-GC
|
|
is the Guile Scheme compiler. Scheme has many similar constraints as
|
|
OCaml and Guile uses many similar solutions. The compiler was
|
|
presented to the Wasm-GC working group~\cite{Win23b}. A more detailed
|
|
explanation is published~\cite{Win23a}. The implementation
|
|
and its in-depth technical description are available~\cite{Win23c}.
|
|
|
|
\subsection{OCaml Web Compilers}
|
|
|
|
The history of OCaml compilers targeting web languages is quite
|
|
crowded. Surely thanks to the great pleasure that is writing compilers in
|
|
OCaml.
|
|
|
|
\subsubsection{Targeting JavaScript}
|
|
|
|
There are multiple OCaml compilers targeting JavaScript. Many
|
|
approaches where experimented, the main two live ones are jsoo and
|
|
melange. Naive compilation of OCaml to JavaScript is quite simple as
|
|
it can almost be summarized as ``type erasure''. There are some
|
|
limitations in JavaScript that prevent that to be a complete compilation
|
|
strategy, and, of course, a proper compiler producing efficient and
|
|
small JavaScript code is quite more complex.
|
|
|
|
\paragraph{Jsoo.} Jsoo~\cite{VB14} tries to be as close as possible from the native semantics. It
|
|
compiles OCaml bytecode programs to JavaScript by decompiling it back
|
|
to a simple untyped lambda calculus then performs many optimisation
|
|
and minimisation passes.
|
|
|
|
\paragraph{Melange.} Melange~\cite{Mon22} tries to produce readable JavaScript, with a closer
|
|
integration in the JavaScript module system. Melange starts from a
|
|
modified version of the lambda IR which provides more type
|
|
information. This allows the use of JavaScript features that match the
|
|
uses of source features at the cost of some small semantical differences.
|
|
|
|
%% TODO Parler de tous les autres sans en parler ?
|
|
|
|
\subsubsection{Targeting Wasm}
|
|
|
|
\paragraph{OCamlrun WebAssembly.}
|
|
OCamlrun WebAssembly~\cite{Mar17} compiles
|
|
the OCaml runtime and interpreter to Wasm. They are both written in C so it
|
|
is possible without any Wasm extension.
|
|
In some cases, it allows to start executing code faster than when compilin
|
|
OCaml to Wasm.
|
|
This is because there is less WebAssembly code to compile for the embedder.
|
|
On the other hand, the execution is slower as it is interpreting OCaml
|
|
bytecode inside Wasm instead of running a compiled version.
|
|
|
|
\paragraph{WASICaml.}
|
|
WASICaml~\cite{Sto21} is quite similar to OCamlrun WebAssembly.
|
|
The main difference being that WASICaml does not interpret the bytecode directly
|
|
but translates it partially to Wasm.
|
|
It leads to a faster execution.
|
|
|
|
\paragraph{Wasm\_of\_ocaml.}
|
|
Wasm\_of\_ocaml~\cite{Vou23} is a fork of Js\_of\_ocaml.
|
|
It compiles the bytecode to Wasm-GC and re-uses many of the techniques developed by Wasocaml.
|
|
It does not benefit from the optimisations provided by Flambda.
|
|
It is also quite restricted in its value representation choice and must use a representation
|
|
based on arrays.
|
|
|
|
\section{Conclusion}
|
|
|
|
The original goal of this implementation was to validate the Wasm-GC
|
|
proposal and discuss its limits. This part was successful. No major
|
|
changes were required for OCaml (mainly keeping the \mintinline{wast}{i31ref}
|
|
type and changes to the maximal length of subtyping chains). And we are quite
|
|
happy to claim that Wasm-GC is, to our opinion a very good design,
|
|
that is a perfect compilation target for a garbage collected
|
|
functional language like OCaml. And we think that the experience for
|
|
most other garbage-collected languages will probably be similar.
|
|
|
|
As a side-effect, we now have an OCaml to Wasm-GC compiler.
|
|
It is not yet usable because there are no user-side Wasm-GC
|
|
runtime available. In order to test our compiler we are using V8 with various flags
|
|
to enable experimental features such as the GC support.
|
|
The design of the various extensions required by Wasocaml is stable
|
|
and quite close to completion, but some details are still in flux.
|
|
For this reason, we cannot expect it them be widely available soon.
|
|
On the other hand, it means that our compiler will be ready when
|
|
browsers start deploying new Wasm extensions.
|
|
|
|
Our future plans are to complete the Wasm implementations of OCaml externals, to implement the various FFI
|
|
mechanisms, support effect handlers and to move Wasocaml to Flambda2. All of these would allow to easily
|
|
deploy multi-language software on the browser while having good and predictable performances.
|
|
|
|
% TODO:
|
|
% For the conclusion, I really want to hear about lessons learned, moving forward, etc. Especially since there's no practical use for this work at this time, why should the broader OCaml community be interested in it? Is there something unique that OCaml offers wasm? Are there lessons that we--the OCaml community of developers--can learn from your experience? Were there specific limitations of OCaml that we should be aware of and work on? Are specific strengths of OCaml that distinguish your effort from those working in other languages and ecosystems?
|
|
\printbibliography{}
|
|
|
|
\end{document}
|
|
|