Section compilation

Pierre Chambart 2023-07-13 10:30:22 +02:00
parent 2c9947ed7e
commit 7f1820d84c

@ -292,16 +292,61 @@ The actual representation is a bit more complex, to reduce casts and
handle mutually recursive functions. This is the only place where we
need recursive WASM types.
\subsubsection{code representation}
\subsubsection{compilation}
The translation from OCaml expressions to WASM is quite
straightforward. Most of the work revolves around translating
let-bound expressions to a stack-based language without producing
overly naive code. In practice the fine details of the code don't matter
much, as we can use Binaryen (a quite efficient WASM-to-WASM optimizer)
to clean up dirty code.
The input for WASM generation is the Flambda IR of the OCaml
compiler. At this point in the compilation chain, most of the
high-level OCaml-specific optimisations have already been applied, and
the closure conversion pass is already done. Most of the
constructions of this IR map quite directly to WASM ones:
WASM exceptions can be used quite directly to represent OCaml ones.
\begin{itemize}
\item Control flow and continuations have direct equivalents in WASM: blocks, loops, br\_table and if
\item Low-level OCaml exception constructs are almost indistinguishable from WASM ones
\end{itemize}
\paragraph{currying}
The main difference revolves around functions. In OCaml, functions
take only one argument, but in practice most functions take more
than one. Without any special handling, this would mean that
most of the code would consist of functions producing closures that
are immediately applied. To avoid that, OCaml internally does have
functions taking multiple arguments, with some special handling for
the cases where they are partially applied. This information is explicit
at the Flambda level. In the native OCaml compiler, the transformation
that handles it would normally occur in a later pass, called cmmgen.
Hence we have to duplicate it in Wasocaml. Compiling this
requires some kind of structural subtyping on closures, such that
closures for functions of arity one are supertypes of all the other
closures. Thankfully there are easy encodings for that in WASM.
%% TODO example ml partial apply
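A small OCaml sketch of the situation described above (the names are purely illustrative and not taken from Wasocaml): the same function is fully applied in one place and partially applied in another, so a compiler must support both a direct call with two arguments and a generic closure of arity one.

```ocaml
(* Curried in the source language, but typically compiled as a
   function of arity two. *)
let add x y = x + y

(* Full application: both arguments are supplied at once, so no
   intermediate closure is needed. *)
let three = add 1 2

(* Partial application: only the first argument is supplied, so the
   result has to be a closure expecting one more argument. *)
let add_one = add 1

(* The partial application is completed later through a call with a
   single argument. *)
let four = add_one 3

let () = Printf.printf "%d %d\n" three four
```

Since a caller that receives a closure does not always know its arity, treating every closure as a closure of arity one is what makes the last call valid in all cases.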
\paragraph{stack representation}
From that point, the translation to WASM is quite
straightforward. Most of the remaining work revolves around
translating let-bound expressions to a stack-based language
without producing overly naive code. We also don't need to care much
about low-level WASM-specific optimisations, as we rely on Binaryen
(a quite efficient WASM-to-WASM optimizer) for those.
%% TODO find a reference to cite for Binaryen
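To illustrate what translating let-bound expressions to a stack-based language involves, here is a toy sketch in OCaml. It is not Wasocaml's actual code: the expression type, the instruction type and the function compile are all hypothetical, and the sketch assumes that every bound variable has a unique name. It shows the naive scheme, where each let stores its value in a local and each use reloads it.

```ocaml
(* Hypothetical miniature source language with let bindings. *)
type expr =
  | Const of int
  | Var of string
  | Add of expr * expr
  | Let of string * expr * expr

(* Hypothetical subset of stack instructions, named after the
   corresponding WASM ones. *)
type instr =
  | I32_const of int
  | Local_get of string
  | Local_set of string
  | I32_add

(* Naive translation: operands are pushed in order, and every
   let-bound value goes through a local. *)
let rec compile (e : expr) : instr list =
  match e with
  | Const n -> [ I32_const n ]
  | Var x -> [ Local_get x ]
  | Add (a, b) -> compile a @ compile b @ [ I32_add ]
  | Let (x, def, body) -> compile def @ [ Local_set x ] @ compile body

(* let x = 1 + 2 in x + x *)
let example = Let ("x", Add (Const 1, Const 2), Add (Var "x", Var "x"))
let code = compile example
```

A less naive translation could leave a value on the stack when it is used exactly once, immediately after its definition; this is the kind of cleanup that can otherwise be delegated to an optimizer.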
\paragraph{unboxing}
The main optimisation available in native OCaml that we are missing is
number unboxing. As OCaml values have a uniform representation, only
small scalars can fit in a direct OCaml value. This means that values
of types int64, nativeint, int32 and float have to be boxed. Numerical
code creates lots of intermediate values of type float, and in
that case the time spent allocating boxes would completely
dominate the actual computation time. To limit that problem, there is
an optimisation called unboxing, performed during the cmmgen pass, that
tries to eliminate most of the obviously useless allocations. As this
pass is performed after Flambda, and was not required to produce a
complete working compiler, it was left for future work. Note that
the end plan is to use the next version of Flambda, which does a much
better job at unboxing.
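As an illustration of the kind of numerical code concerned, consider the following OCaml sketch (the function name is hypothetical). With a uniform representation and no unboxing, each intermediate float, such as the product and the updated sum, would in principle be allocated as a separate box at every iteration.

```ocaml
(* Sum of squares of a float array. The arithmetic is trivial; without
   unboxing, most of the cost would come from allocating boxes for the
   intermediate floats. *)
let sum_squares (a : float array) : float =
  let acc = ref 0.0 in
  for i = 0 to Array.length a - 1 do
    let x = a.(i) in
    (* x *. x and !acc +. (x *. x) are both intermediate floats. *)
    acc := !acc +. (x *. x)
  done;
  !acc

let () = Printf.printf "%g\n" (sum_squares [| 1.0; 2.0; 3.0 |])
```

An unboxing pass aims to keep such intermediate values as raw floating-point numbers and to allocate a box only where a value escapes as an ordinary OCaml value.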
\subsubsection{FFI}