From 1c9ef9a23d22968f4bd8403f05102940fffb71e5 Mon Sep 17 00:00:00 2001 From: RadioPotin Date: Fri, 21 Jul 2023 14:20:53 +0200 Subject: [PATCH] Dario Reviews Poor English II: The Revenge of Queen Elisabeth --- wasocaml.tex | 71 ++++++++++++++++++++++++++-------------------------- 1 file changed, 35 insertions(+), 36 deletions(-) diff --git a/wasocaml.tex b/wasocaml.tex index 12c244b..fa653e5 100644 --- a/wasocaml.tex +++ b/wasocaml.tex @@ -152,7 +152,7 @@ It is possible to dynamically test for type compatibility. The general rule for the Wasm comittee is to only include features with a demonstrated use case. As there are currently very few -compilers targeting the GC-proposal, some features where lacking +compilers targeting the GC-proposal, some features were lacking conclusive evidence of their usefulness. An example is the \mintinline{wast}{i31ref} type that is not required by the Dart compiler (the only one targeting the GC-proposal at the time). Wasocaml demonstrates the usefulness of \mintinline{wast}{i31ref}. It also validates the GC-proposal on a functional language. We presented Wasocaml to the Wasm-GC working group~\cite{AC23}. It helped in convincing the working group to keep \mintinline{wast}{i31ref} in the proposal. @@ -183,8 +183,8 @@ These approaches suffer from the same problem as C programs running on Wasm: the cannot natively manipulate external values, such as DOM objects in the browser. Indeed, in the first versions of Wasm, the only types are scalars (\mintinline{wast}{i32}, \mintinline{wast}{i64}, \mintinline{wast}{f32} and -\mintinline{wast}{f64}). There's no way to manipulate -directly values from the embedder (\emph{e.g.} JavaScript objects). It could be +\mintinline{wast}{f64}). There is no way to directly +manipulate values from the embedder (\emph{e.g.} JavaScript objects). It could be possible to identify objects with an integer and map them to their corresponding objects on the embedder side. 
This approach comes with the usual limitations of having two runtimes manipulating (indirectly) @@ -192,12 +192,12 @@ garbage-collected values: it is tedious and easily leads to memory leaks (cycles between the two runtimes cannot be collected). To properly interact with the embedder, we need to leverage the reference extension. -This extension adds new types to the language, e.g.\ \mintinline{wast}{externref} that +This extension adds new types to the language, e.g.\ \mintinline{wast}{externref} which is an opaque type representing a value from the embedder. References cannot be stored in the linear memory of Wasm thus they cannot appear inside OCaml values when using the previously described compilation scheme. -In order to use references, we require a completely different compilation strategy. We do not use the linear memory. Our strategy is close to the native OCaml one, which we describe now. +In order to use references, we require a completely different compilation strategy; we do not use the linear memory. Our strategy is close to the native OCaml one, which we describe now. \subsection{Native OCaml Value Representation} @@ -274,9 +274,9 @@ The OCaml array type is directly represented as a Wasm array. \paragraph{Blocks.} -For other kinds of blocks, there are two possible choices. We can +For other kinds of blocks, there are two possible choices: we can either use a struct with a field for the tag and one field per OCaml -value field. The other choice is to use arrays of \mintinline{wast}{eqref} +value field, or use arrays of \mintinline{wast}{eqref} with the tag stored at position $0$. \subsubsection{Blocks as Structs} @@ -319,7 +319,7 @@ We represent blocks with an array of eqref: \end{wast} The tag is stored in the cell at index 0 of the array. 
-Reading its value is implemented by getting the cell and casting to an integer: +Reading its value is implemented by getting the cell and casting it to an integer: \begin{wast} (func $tag (param $x eqref) (result i32) @@ -348,14 +348,14 @@ field access and a cast to read the tag. On the other hand, the struct representation requires a more complex cast for the Wasm runtime (a subtyping test). A compiler propagating more types -could use finer Wasm type information, providing a precise type to +could use finer Wasm type information, providing a precise type for each struct field. This would allow fewer casts. The V8 runtime allows to consider casts as no-ops -(only for test purpose). The speedup we measured is around 10\%. +(only for test purposes). The speedup we measured is around 10\%. That gives us an upper bound on the actual runtime cost of casts. -OCaml blocks can be arbitrarily large, and very large ones do appear +OCaml blocks can be arbitrarily large, and very large ones do occur in practice. In particular, modules are compiled as blocks and tend to be on the larger side. We have seen in the wild some examples containing thousands of fields. Wasm only allows a subtyping chain @@ -382,7 +382,7 @@ our representation: Wasm \mintinline{wast}{funcref} are functions, not closures, hence we need to produce values containing both the function and its environment. The only Wasm type construction that can contain both \mintinline{wast}{funcref} and other values are -the structs. Thus, a closure is a struct containing a \mintinline{wast}{funcref} and the captured +structs. Thus, a closure is a struct containing a \mintinline{wast}{funcref} and the captured variables. As an example, here is the type of a closure with two captured variables: % TODO: explain more (why closure1 ) @@ -395,8 +395,8 @@ variables. 
As an example, here is the type of a closure with two captured variab (field $v2 eqref))) \end{wast} -The actual representation is a bit more complex, to reduce casts and -to handle mutually recursive functions. This is the only place where we +In order to reduce casts and to handle mutually recursive functions, +the actual representation is a bit more complex. This is the only place where we need recursive Wasm types. \section{Compilation} @@ -404,7 +404,7 @@ need recursive Wasm types. We use the Flambda IR of the OCaml compiler as input for the Wasm generation. This is a step of the compilation chain where most of the high-level OCaml-specific optimisations are already applied. Also in -this IR, the closure conversion pass is already performed. Most of the +this IR, the closure conversion pass has already been performed. Most of the constructions of this IR maps quite directly to Wasm ones: \begin{itemize} @@ -416,7 +416,7 @@ constructions of this IR maps quite directly to Wasm ones: \subsection{Currification} The main difference revolves around functions. In OCaml, functions -have only one argument. However, in pratice, functions look like they have +take only one argument. However, in practice, functions look like they have more than one. Without any special management this would mean that most of the code would be functions producing closures that would be immediately applied. To handle that, internally, OCaml does have @@ -436,7 +436,7 @@ closures. Thankfully there are easy encodings for that in Wasm. Most of the remaining work revolves around translating from let bound expressions to a stack based language -without producing too naive code. Also, we do not need to care too much +without producing overly naive code. Also, we do not need to care too much about low level Wasm specific optimisations as we rely on Binaryen~\cite{Web15} (a quite efficient Wasm to Wasm optimizer) for those. @@ -473,7 +473,7 @@ instead (as it was originally proposed for OCaml 5). 
\paragraph{JavaScript.} -To re-use existing OCaml bindings to JavaScript code, we need one more extension: the reference-typed strings proposal~\cite{Web22}. Almost all external calls in the OCaml JavaScript FFI goes through the +To re-use existing bindings of OCaml to JavaScript code, we need one more extension: the reference-typed strings proposal~\cite{Web22}. Almost all external calls in the OCaml JavaScript FFI go through the \mintinline{ocaml}{Js.Unsafe.meth_call} function (of type \mintinline{ocaml}{'a -> string -> any array -> 'b}) which can be exposed to the Wasm module by the embedder as a function of type: @@ -502,7 +502,7 @@ constant (around twice slower). Compared to a JavaScript VM, a Wasm compiler is a much simpler beast that can compile ahead of time. For this reason, various Wasm engines -are expected to behave quite similarly. They don't have none of the wild +are expected to behave quite similarly. They do not show any of the wild impredictability that browsers tend to demonstrate with JavaScript. Indeed, compiling OCaml to JS using jsoo leads to results that are usually also twice as slow as native code in the best cases, but @@ -516,17 +516,17 @@ Currently there is no other Wasm runtime supporting all the extensions we requir SpiderMonkey does not have tail-call. The reference interpreter implementation of the various extensions are split in separate repository and merging them requires some work. Nevertheless, we expect performances of Wasm-GC to vary across implementations in -a not too different way they vary for C-compiled Wasm programs. +a way not too different from how they do for C-compiled Wasm programs. Although the implementation choices space is larger when it comes to a full GC than when implementing what is needed in Wasm1 (\emph{e.g.} register allocation). \section{Perspectives} -As the first version of the implementation was meant as a -demonstrator, it is a bit rough on the edges. 
In particular only a +As the first version of the implementation was intended as a +demonstration, it remains a bit rough around the edges. In particular only a fraction of the externals from the standard library externals are -provided as hand written Wasm. -The only unsuported part of the language is the objects fragment (by +provided as handwritten Wasm. +The only unsupported part of the language is the objects fragment (by lack of time rather than due to any specific complexity). The source code of Wasocaml is publicly available~\cite{AC22}. @@ -580,7 +580,7 @@ It might be a temporary solution. \paragraph{Scheme.} -The last addition to the small familly of compiler targeting Wasm-GC +The last addition to the small family of compilers targeting Wasm-GC is the Guile Scheme compiler. Scheme has many similar constraints as OCaml and Guile uses many similar solutions. The compiler was presented to the Wasm-GC working group~\cite{Win23b}. A more detailed @@ -590,7 +590,7 @@ and its in-depth technical description are available~\cite{Win23c}. \subsection{OCaml Web Compilers} The history of OCaml compilers targeting web languages is quite -crowded. Maybe thanks to the great pleasure of writing compilers in +crowded. Surely thanks to the great pleasure that is writing compilers in OCaml. \subsubsection{Targeting JavaScript} @@ -599,8 +599,8 @@ There are multiple OCaml compilers targeting JavaScript. Many approaches where experimented, the main two live ones are jsoo and melange. Naive compilation of OCaml to JavaScript is quite simple as it can almost be summarized as ``type erasure''. There are some -limitations in JavaScript that prevent that to be complete compilation -strategy, and of course a proper compiler producing efficient and +limitations in JavaScript that prevent that from being a complete compilation +strategy, and, of course, a proper compiler producing efficient and small JavaScript code is quite more complex. 
\paragraph{Jsoo.} Jsoo~\cite{VB14} tries to be as close as possible from the native semantics. It @@ -610,10 +610,9 @@ and minimisation passes. \paragraph{Melange.} Melange~\cite{Mon22} tries to produce readable JavaScript, with a closer integration in the JavaScript module system. Melange starts from a -modified version of the lambda IR providing more type -informations. This allows to use JavaScript features that matches the -uses of the sources ones, at the cost of some small differences in the -semantics. +modified version of the lambda IR which provides more type +information. This allows the use of JavaScript features that match the +uses of source features at the cost of some small semantic differences. %% TODO Parler de tous les autres sans en parler ? @@ -641,14 +640,14 @@ type and changes to the maximal length of subtyping chains). And we are quite happy to claim that Wasm-GC is, to our opinion a very good design, that is a perfect compilation target for a garbage collected functional language like OCaml. And we think that the experience for -most other garbage collected languages will probably be similar. +most other garbage-collected languages will probably be similar. As a side-effect, we now have an OCaml to Wasm-GC compiler. -This is not usable today because there are no user available Wasm-GC -runtimes. In order to test our compiler we are using V8 with various flags +It is not yet usable because there are no user-side Wasm-GC +runtimes available. In order to test our compiler we are using V8 with various flags to enable experimental features such as the GC support. The design of the various extensions required by Wasocaml is stable -and quite to completion, but some details are still in flux. +and quite close to completion, but some details are still in flux. For this reason, we cannot expect it them be widely available soon. On the other hand, it means that our compiler will be ready when browsers start deploying new Wasm extensions.