Dario Reviews Poor English II: The Revenge of Queen Elisabeth
parent
f501ba9359
commit
1c9ef9a23d
71
wasocaml.tex
@ -152,7 +152,7 @@ It is possible to dynamically test for type compatibility.
|
||||
|
||||
The general rule for the Wasm committee is to only include features
|
||||
with a demonstrated use case. As there are currently very few
|
||||
compilers targeting the GC-proposal, some features where lacking
|
||||
compilers targeting the GC-proposal, some features were lacking
|
||||
conclusive evidence of their usefulness. An example is the \mintinline{wast}{i31ref}
|
||||
type that is not required by the Dart compiler (the only one targeting the GC-proposal at the time). Wasocaml demonstrates the usefulness of \mintinline{wast}{i31ref}. It also validates the GC-proposal on a functional language. We presented Wasocaml to the Wasm-GC working group~\cite{AC23}. It helped in convincing the working group to keep \mintinline{wast}{i31ref} in the proposal.
|
||||
|
||||
@ -183,8 +183,8 @@ These approaches suffer from the same problem as C programs running on Wasm: the
|
||||
cannot natively manipulate external values, such as DOM objects in the
|
||||
browser. Indeed, in the first versions of Wasm, the only types
|
||||
are scalars (\mintinline{wast}{i32}, \mintinline{wast}{i64}, \mintinline{wast}{f32} and
|
||||
\mintinline{wast}{f64}). There's no way to manipulate
|
||||
directly values from the embedder (\emph{e.g.} JavaScript objects). It could be
|
||||
\mintinline{wast}{f64}). There is no way to directly
|
||||
manipulate values from the embedder (\emph{e.g.} JavaScript objects). It could be
|
||||
possible to identify objects with an integer and map them to their
|
||||
corresponding objects on the embedder side. This approach comes with
|
||||
the usual limitations of having two runtimes manipulating (indirectly)
|
||||
@ -192,12 +192,12 @@ garbage-collected values: it is tedious and easily leads to memory leaks
|
||||
(cycles between the two runtimes cannot be collected).
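
As a minimal sketch of the integer-handle approach described above, the following OCaml code keeps embedder objects in a table indexed by integers. All names here (\mintinline{ocaml}{handle_table}, \mintinline{ocaml}{register}, \mintinline{ocaml}{lookup}, \mintinline{ocaml}{release}) are hypothetical and are not taken from Wasocaml. It only illustrates that an object stays reachable until its handle is released explicitly, which is where leaks tend to appear.

\begin{minted}{ocaml}
(* Hypothetical handle table: maps integer handles to embedder objects. *)
type 'a handle_table = {
  objects : (int, 'a) Hashtbl.t;
  mutable next : int; (* next fresh handle *)
}

let create () = { objects = Hashtbl.create 16; next = 0 }

(* Store an object and return the integer that identifies it. *)
let register t obj =
  let h = t.next in
  t.next <- h + 1;
  Hashtbl.replace t.objects h obj;
  h

(* Retrieve the object behind a handle, if it is still registered. *)
let lookup t h = Hashtbl.find_opt t.objects h

(* The object is kept alive by the table until this is called. *)
let release t h = Hashtbl.remove t.objects h
\end{minted}

Forgetting a call to \mintinline{ocaml}{release} keeps the object alive forever, since neither runtime can see that the handle is no longer used.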
|
||||
|
||||
To properly interact with the embedder, we need to leverage the reference extension.
|
||||
This extension adds new types to the language, e.g.\ \mintinline{wast}{externref} that
|
||||
This extension adds new types to the language, e.g.\ \mintinline{wast}{externref} which
|
||||
is an opaque type representing a value from the embedder. References cannot be stored in
|
||||
the linear memory of Wasm, thus they cannot appear inside OCaml values when using the
|
||||
previously described compilation scheme.
|
||||
|
||||
In order to use references, we require a completely different compilation strategy. We do not use the linear memory. Our strategy is close to the native OCaml one, which we describe now.
|
||||
In order to use references, we require a completely different compilation strategy; we do not use the linear memory. Our strategy is close to the native OCaml one, which we describe now.
|
||||
|
||||
\subsection{Native OCaml Value Representation}
|
||||
|
||||
@ -274,9 +274,9 @@ The OCaml array type is directly represented as a Wasm array.
|
||||
|
||||
\paragraph{Blocks.}
|
||||
|
||||
For other kinds of blocks, there are two possible choices. We can
|
||||
For other kinds of blocks, there are two possible choices: we can
|
||||
either use a struct with a field for the tag and one field per OCaml
|
||||
value field. The other choice is to use arrays of \mintinline{wast}{eqref}
|
||||
value field, or use arrays of \mintinline{wast}{eqref}
|
||||
with the tag stored at position $0$.
|
||||
|
||||
\subsubsection{Blocks as Structs}
|
||||
@ -319,7 +319,7 @@ We represent blocks with an array of eqref:
|
||||
\end{wast}
|
||||
|
||||
The tag is stored in the cell at index 0 of the array.
|
||||
Reading its value is implemented by getting the cell and casting to an integer:
|
||||
Reading its value is implemented by getting the cell and casting it to an integer:
|
||||
|
||||
\begin{wast}
|
||||
(func $tag (param $x eqref) (result i32)
|
||||
@ -348,14 +348,14 @@ field access and a cast to read the tag.
|
||||
|
||||
On the other hand, the struct representation requires a more complex cast for the
|
||||
Wasm runtime (a subtyping test). A compiler propagating more types
|
||||
could use finer Wasm type information, providing a precise type to
|
||||
could use finer Wasm type information, providing a precise type for
|
||||
each struct field. This would allow fewer casts.
|
||||
|
||||
The V8 runtime allows casts to be treated as no-ops
|
||||
(only for test purpose). The speedup we measured is around 10\%.
|
||||
(only for test purposes). The speedup we measured is around 10\%.
|
||||
That gives us an upper bound on the actual runtime cost of casts.
|
||||
|
||||
OCaml blocks can be arbitrarily large, and very large ones do appear
|
||||
OCaml blocks can be arbitrarily large, and very large ones do occur
|
||||
in practice. In particular, modules are compiled as blocks and tend
|
||||
to be on the larger side. We have seen in the wild some examples
|
||||
containing thousands of fields. Wasm only allows a subtyping chain
|
||||
@ -382,7 +382,7 @@ our representation:
|
||||
Wasm \mintinline{wast}{funcref} are functions, not closures, hence we need to produce
|
||||
values containing both the function and its environment. The only Wasm
|
||||
type construction that can contain both \mintinline{wast}{funcref} and other values are
|
||||
the structs. Thus, a closure is a struct containing a \mintinline{wast}{funcref} and the captured
|
||||
structs. Thus, a closure is a struct containing a \mintinline{wast}{funcref} and the captured
|
||||
variables. As an example, here is the type of a closure with two captured variables:
|
||||
|
||||
% TODO: explain more (why closure1 )
|
||||
@ -395,8 +395,8 @@ variables. As an example, here is the type of a closure with two captured variab
|
||||
(field $v2 eqref)))
|
||||
\end{wast}
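
To make the idea concrete, here is a hedged OCaml sketch of closure conversion for a function capturing two variables. The record type \mintinline{ocaml}{closure} and the names \mintinline{ocaml}{adder_code}, \mintinline{ocaml}{make_adder_converted} and \mintinline{ocaml}{apply} are hypothetical illustrations, not the representation actually generated by Wasocaml.

\begin{minted}{ocaml}
(* Source-level function: the inner function captures v1 and v2. *)
let make_adder v1 v2 = fun x -> x + v1 + v2

(* Hypothetical closure-converted form: a code pointer plus the
   captured variables. The code receives the closure as its environment. *)
type closure = {
  code : closure -> int -> int;
  v1 : int;
  v2 : int;
}

(* The function body, reading captured variables from the environment. *)
let adder_code env x = x + env.v1 + env.v2

(* Building the closure stores the code pointer and the captured values. *)
let make_adder_converted v1 v2 = { code = adder_code; v1; v2 }

(* Applying a closure passes the closure itself as the first argument. *)
let apply clos x = clos.code clos x
\end{minted}

In this sketch the type \mintinline{ocaml}{closure} refers to itself through its \mintinline{ocaml}{code} field, because the code takes the closure as its first argument.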
|
||||
|
||||
The actual representation is a bit more complex, to reduce casts and
|
||||
to handle mutually recursive functions. This is the only place where we
|
||||
In order to reduce casts and to handle mutually recursive functions,
|
||||
the actual representation is a bit more complex. This is the only place where we
|
||||
need recursive Wasm types.
|
||||
|
||||
\section{Compilation}
|
||||
@ -404,7 +404,7 @@ need recursive Wasm types.
|
||||
We use the Flambda IR of the OCaml compiler as input for the Wasm generation.
|
||||
This is a step of the compilation chain where most of the
|
||||
high-level OCaml-specific optimisations are already applied. Also in
|
||||
this IR, the closure conversion pass is already performed. Most of the
|
||||
this IR, the closure conversion pass has already been performed. Most of the
|
||||
constructions of this IR map quite directly to Wasm ones:
|
||||
|
||||
\begin{itemize}
|
||||
@ -416,7 +416,7 @@ constructions of this IR maps quite directly to Wasm ones:
|
||||
|
||||
\subsection{Currification}
|
||||
The main difference revolves around functions. In OCaml, functions
|
||||
have only one argument. However, in pratice, functions look like they have
|
||||
take only one argument. However, in practice, functions look like they have
|
||||
more than one. Without any special management this would mean that
|
||||
most of the code would be functions producing closures that would be
|
||||
immediately applied. To handle that, internally, OCaml does have
|
||||
@ -436,7 +436,7 @@ closures. Thankfully there are easy encodings for that in Wasm.
|
||||
|
||||
Most of the remaining work revolves around
|
||||
translating from let-bound expressions to a stack-based language
|
||||
without producing too naive code. Also, we do not need to care too much
|
||||
without producing overly naive code. Also, we do not need to care too much
|
||||
about low-level Wasm-specific optimisations as we rely on Binaryen~\cite{Web15}
|
||||
(a quite efficient Wasm-to-Wasm optimizer) for those.
|
||||
|
||||
@ -473,7 +473,7 @@ instead (as it was originally proposed for OCaml 5).
|
||||
|
||||
\paragraph{JavaScript.}
|
||||
|
||||
To re-use existing OCaml bindings to JavaScript code, we need one more extension: the reference-typed strings proposal~\cite{Web22}. Almost all external calls in the OCaml JavaScript FFI goes through the
|
||||
To re-use existing bindings of OCaml to JavaScript code, we need one more extension: the reference-typed strings proposal~\cite{Web22}. Almost all external calls in the OCaml JavaScript FFI go through the
|
||||
\mintinline{ocaml}{Js.Unsafe.meth_call} function
|
||||
(of type \mintinline{ocaml}{'a -> string -> any array -> 'b})
|
||||
which can be exposed to the Wasm module by the embedder as a function of type:
|
||||
@ -502,7 +502,7 @@ constant (around twice slower).
|
||||
|
||||
Compared to a JavaScript VM, a Wasm compiler is a much simpler beast
|
||||
that can compile ahead of time. For this reason, various Wasm engines
|
||||
are expected to behave quite similarly. They don't have none of the wild
|
||||
are expected to behave quite similarly. They do not show any of the wild
|
||||
unpredictability that browsers tend to demonstrate with JavaScript.
|
||||
Indeed, compiling OCaml to JS using jsoo leads to results that
|
||||
are usually also twice as slow as native code in the best cases, but
|
||||
@ -516,17 +516,17 @@ Currently there is no other Wasm runtime supporting all the extensions we requir
|
||||
SpiderMonkey does not have tail-call. The reference interpreter implementation of
|
||||
the various extensions is split across separate repositories and merging them requires
|
||||
some work. Nevertheless, we expect performances of Wasm-GC to vary across implementations in
|
||||
a not too different way they vary for C-compiled Wasm programs.
|
||||
a way not too different from how they vary for C-compiled Wasm programs.
|
||||
However, the space of implementation choices is larger when it comes to a full GC
|
||||
than when implementing what is needed in Wasm1 (\emph{e.g.} register allocation).
|
||||
|
||||
\section{Perspectives}
|
||||
|
||||
As the first version of the implementation was meant as a
|
||||
demonstrator, it is a bit rough on the edges. In particular only a
|
||||
As the first version of the implementation was intended as a
|
||||
demonstration, it remains a bit rough around the edges. In particular only a
|
||||
fraction of the externals from the standard library are
|
||||
provided as hand written Wasm.
|
||||
The only unsuported part of the language is the objects fragment (by
|
||||
provided as handwritten Wasm.
|
||||
The only unsupported part of the language is the objects fragment (for
|
||||
lack of time rather than due to any specific complexity).
|
||||
The source code of Wasocaml is publicly available~\cite{AC22}.
|
||||
|
||||
@ -580,7 +580,7 @@ It might be a temporary solution.
|
||||
|
||||
\paragraph{Scheme.}
|
||||
|
||||
The last addition to the small familly of compiler targeting Wasm-GC
|
||||
The latest addition to the small family of compilers targeting Wasm-GC
|
||||
is the Guile Scheme compiler. Scheme has many of the same constraints as
|
||||
OCaml and Guile uses many similar solutions. The compiler was
|
||||
presented to the Wasm-GC working group~\cite{Win23b}. A more detailed
|
||||
@ -590,7 +590,7 @@ and its in-depth technical description are available~\cite{Win23c}.
|
||||
\subsection{OCaml Web Compilers}
|
||||
|
||||
The history of OCaml compilers targeting web languages is quite
|
||||
crowded. Maybe thanks to the great pleasure of writing compilers in
|
||||
crowded. Surely thanks to the great pleasure that is writing compilers in
|
||||
OCaml.
|
||||
|
||||
\subsubsection{Targeting JavaScript}
|
||||
@ -599,8 +599,8 @@ There are multiple OCaml compilers targeting JavaScript. Many
|
||||
approaches were tried; the two main active ones are jsoo and
|
||||
melange. Naive compilation of OCaml to JavaScript is quite simple as
|
||||
it can almost be summarized as ``type erasure''. There are some
|
||||
limitations in JavaScript that prevent that to be complete compilation
|
||||
strategy, and of course a proper compiler producing efficient and
|
||||
limitations in JavaScript that prevent this from being a complete compilation
|
||||
strategy, and, of course, a proper compiler producing efficient and
|
||||
small JavaScript code is considerably more complex.
|
||||
|
||||
\paragraph{Jsoo.} Jsoo~\cite{VB14} tries to be as close as possible to the native semantics. It
|
||||
@ -610,10 +610,9 @@ and minimisation passes.
|
||||
|
||||
\paragraph{Melange.} Melange~\cite{Mon22} tries to produce readable JavaScript, with a closer
|
||||
integration in the JavaScript module system. Melange starts from a
|
||||
modified version of the lambda IR providing more type
|
||||
informations. This allows to use JavaScript features that matches the
|
||||
uses of the sources ones, at the cost of some small differences in the
|
||||
semantics.
|
||||
modified version of the lambda IR which provides more type
|
||||
information. This allows the use of JavaScript features that match the
|
||||
uses of source features, at the cost of some small semantic differences.
|
||||
|
||||
%% TODO Parler de tous les autres sans en parler ?
|
||||
|
||||
@ -641,14 +640,14 @@ type and changes to the maximal length of subtyping chains). And we are quite
|
||||
happy to claim that Wasm-GC is, in our opinion, a very good design,
|
||||
that is a perfect compilation target for a garbage-collected
|
||||
functional language like OCaml. And we think that the experience for
|
||||
most other garbage collected languages will probably be similar.
|
||||
most other garbage-collected languages will probably be similar.
|
||||
|
||||
As a side effect, we now have an OCaml to Wasm-GC compiler.
|
||||
This is not usable today because there are no user available Wasm-GC
|
||||
runtimes. In order to test our compiler we are using V8 with various flags
|
||||
It is not yet usable because there are no user-side Wasm-GC
|
||||
runtimes available. In order to test our compiler, we are using V8 with various flags
|
||||
to enable experimental features such as the GC support.
|
||||
The design of the various extensions required by Wasocaml is stable
|
||||
and quite to completion, but some details are still in flux.
|
||||
and quite close to completion, but some details are still in flux.
|
||||
For this reason, we cannot expect them to be widely available soon.
|
||||
On the other hand, it means that our compiler will be ready when
|
||||
browsers start deploying new Wasm extensions.
|
||||
|