|
| 1 | +--- |
| 2 | +title: A Preview of Universal Libraries in Dune |
| 3 | +date: 2024-04-10 |
| 4 | +author: Antonio Monteiro |
| 5 | +gravatar: 45c2052f50f561b9dc2cae59c777aecd794f57269fa317f9c9c3365c2e00d16f |
| 6 | +twitter: '@_anmonteiro' |
| 7 | +--- |
| 8 | + |
| 9 | +I recently shared a [2024 progress |
| 10 | +update](whats-2024-brought-to-melange-so-far) about our work on Melange. In |
| 11 | +that message, I briefly wrote about "universal libraries" in Dune, the ability |
| 12 | +to write a shared OCaml / Melange codebase while varying specific module |
| 13 | +implementations, flags, preprocessing steps, etc. according to the compilation |
| 14 | +target. |
| 15 | + |
| 16 | +I also promised to dive deeper into what "universal libraries" are all about, |
| 17 | +and the new use cases that they unlock in Dune. Keep reading for an in-depth |
| 18 | +look at the history behind this new feature rolling out in Dune 3.16. |
| 19 | + |
| 20 | +--- |
| 21 | + |
| 22 | +## The Bird's-eye View |
| 23 | + |
| 24 | +Let's walk backwards from our end goal: having a shared OCaml / Melange |
| 25 | +codebase that can render React.js components on the server, so that the |
| 26 | +[Ahrefs](https://ahrefs.com) website can be rendered on the server without |
| 27 | +JavaScript, and then having React.js hydrate the server-rendered HTML in |
| 28 | +the browser. [Dave](https://twitter.com/davesnx) explains the motivation behind |
| 29 | +this goal in more depth [in his blog |
| 30 | +](https://sancho.dev/blog/server-side-rendering-react-in-ocaml). |
| 31 | + |
| 32 | +To look at a specific example, we'll start with a Melange codebase already |
| 33 | +using [`reason-react`](https://github.com/reasonml/reason-react). Our goal is |
| 34 | +to get those `reason-react` components to compile server-side with the OCaml |
| 35 | +compiler, where we'll use |
| 36 | +[`server-reason-react`](https://github.com/ml-in-barcelona/server-reason-react) |
| 37 | +as a drop-in replacement for `reason-react`. |
| 38 | + |
| 39 | + |
| 40 | +What gets in our way is that: |
| 41 | + |
| 42 | +- not everything is supported on both sides: some Melange modules use APIs that |
| 43 | + don't exist in OCaml (and extensive shimming is undesirable). |
| 44 | + - and vice versa: some OCaml code can't be compiled by Melange, especially code that calls into C bindings. |
| 45 | +- we can't choose which implementation to use inside a module, or conditionally |
| 46 | + apply different preprocessing steps and/or flags. |
| 47 | + |
| 48 | +In summary, we would like to vary specific module implementations across the |
| 49 | +same library, based on their compilation target. Once we try this in a |
| 50 | +real-world codebase, we also find the need to vary preprocessing |
| 51 | +definitions, compilation flags, and the set of modules belonging to the |
| 52 | +library: effectively, most fields in the `(library ..)` stanza. |
| 53 | + |
| 54 | +## A First ~~Hack~~ Approach |
| 55 | + |
| 56 | +We concluded that it would be desirable to write two library definitions. That |
| 57 | +would allow us to configure each `(library ..)` stanza field separately, |
| 58 | +achieving our goal. |
| 59 | + |
| 60 | +But Dune doesn't allow you to have two libraries with the same name. How could |
| 61 | +it? Dune derives the artifact directory for libraries from their `(name ..)` |
| 62 | +field, so two conflicting names would compete for the same artifact directory. |
| 63 | + |
| 64 | +So we first tried to work around that, and set up: |
| 65 | + |
| 66 | +- unwrapped (`(wrapped false)`) Dune libraries with different names |
| 67 | + - with unwrapped libraries, we could share modules across compilation |
| 68 | + targets, e.g. `react.ml` originating from both `reason-react` and |
| 69 | + `server-reason-react`; |
| 70 | +- defined in different directories; |
| 71 | +- `(copy_files ..)` from one of the directories into the other, duplicating |
| 72 | + shared modules. |
| 73 | + - modules with the same name but different implementations remain specific to |
| 74 | + each directory. |
| 75 | + |
| 76 | +```clj |
| 77 | +;; native/dune |
| 78 | +(library |
| 79 | + (name native_lib) |
| 80 | + (wrapped false) |
| 81 | + (modules a b c)) |
| 82 | + |
| 83 | +;; melange/dune |
| 84 | +(library |
| 85 | + (name melange_lib) |
| 86 | + (wrapped false) |
| 87 | + (modes melange) |
| 88 | + (modules a b c)) |
| 89 | + |
| 90 | +;; Copy modules `A` and `B` from `../native` |
| 91 | +(copy_files# ../native |
| 92 | + (files {a,b}.ml{,i})) |
| 93 | + |
| 94 | +;; module `C` has a specific Melange implementation |
| 95 | +(rule |
| 96 | + (with-stdout-to c.ml |
| 97 | + (echo "let backend = \"melange\""))) |
| 98 | +``` |
| 99 | + |
| 100 | +This worked until it didn't: we quickly ran into a limitation in `(copy_files |
| 101 | +..)` ([dune#9709](https://github.com/ocaml/dune/issues/9709)). Because this |
| 102 | +stanza operates in the build directory, it was impossible to exclude some of |
| 103 | +the build artifacts generated with `.ml{,i}` extensions from the copy glob |
| 104 | +– Dune uses extensions such as `.pp.ml` and `.re.pp.ml` as targets of its |
| 105 | +[dialect](https://dune.readthedocs.io/en/stable/overview.html#term-dialect) and |
| 106 | +[preprocessing](https://dune.readthedocs.io/en/stable/reference/preprocessing-spec.html) |
| 107 | +phases. |
| 108 | + |
| 109 | +## Limiting `(copy_files ..)` to Source Files Only |
| 110 | + |
| 111 | +What we want from `copy_files` in our scenario is the ability to limit |
| 112 | +copying to files that are present in the source tree. That way we can address all |
| 113 | +the `.re{,i}` and `.ml{,i}` files in source directories without worrying about |
| 114 | +polluting our target directories with intermediate Dune targets. |
| 115 | + |
| 116 | +In [dune#9827](https://github.com/ocaml/dune/pull/9827), we added a new option |
| 117 | +to `copy_files` that allows precisely that: if the field `(only_sources |
| 118 | +<optional_boolean_language>)` is present, Dune will only match files in the |
| 119 | +source directory, and won't apply the glob to the targets of rules. |
| 120 | + |
| 121 | +After this change, our Dune file needs just one more line: |
| 122 | + |
| 123 | +```diff |
| 124 | + ;; Copy modules `A` and `B` from `../native` |
| 125 | + (copy_files# ../native |
| 126 | ++ (only_sources) |
| 127 | + (files {a,b}.ml{,i})) |
| 128 | +``` |
| 129 | + |
| 130 | + |
| 131 | +## Checkpoint |
| 132 | + |
| 133 | +Our Dune file now lets us move forward: we are able to define multiple |
| 134 | +libraries that share common implementations across native code and Melange. |
| 135 | +Library names still need to be different, though. And, overall, we still face |
| 136 | +some other glaring limitations: |
| 137 | + |
| 138 | +- The `(wrapped false)` requirement makes it impossible to namespace these |
| 139 | + libraries; |
| 140 | +- Defining libraries in different directories and using `copy_files` places |
| 141 | + extra separation between common implementations, and adds extra build |
| 142 | + configuration overhead; |
| 143 | +- Publishing a library with `(modes :standard melange)` adds a non-optional |
| 144 | + dependency on Melange, which should really be optional for native-only |
| 145 | + consumers; |
| 146 | +- Extensive usage of `(copy_files ..)` as shared in the example above breaks |
| 147 | + editor integration and "jump to definition"; Merlin and OCaml-LSP don't track |
| 148 | + the original source in this scenario. |
| 149 | + |
| 150 | +## Testing a New Solution |
| 151 | + |
| 152 | +We became intent on removing these limitations, and realized at some point |
| 153 | +that our use case is somewhat similar to cross-compilation, which Dune [already |
| 154 | +supports well](https://dune.readthedocs.io/en/stable/cross-compilation.html). |
| 155 | +The key insight, which we shared in a Dune proposal |
| 156 | +([dune#10222](https://github.com/ocaml/dune/issues/10222)), is that we could |
| 157 | +share library names as long as they resolved to a single library per [build |
| 158 | +context](https://dune.readthedocs.io/en/stable/reference/dune-workspace/context.html). |
| 159 | + |
| 160 | +After making the proposed changes to Dune |
| 161 | +([dune#10220](https://github.com/ocaml/dune/pull/10220), |
| 162 | +[dune#10307](https://github.com/ocaml/dune/pull/10307), |
| 163 | +[dune#10354](https://github.com/ocaml/dune/pull/10354), |
| 164 | +[dune#10355](https://github.com/ocaml/dune/pull/10355)) we found ourselves |
| 165 | +having implemented support for: |
| 166 | + |
| 167 | +- Dune libraries with the same name; |
| 168 | +- which may be defined in the same directory; |
| 169 | +- as long as they don't conflict in the same context. |
| 170 | + - to achieve that, we use e.g. `(enabled_if (= %{context_name} melange))`. |
| 171 | + |
| 172 | +Putting it all together, our example can be adapted to look like: |
| 173 | + |
| 174 | +```clj |
| 175 | +;; src/dune |
| 176 | +(library |
| 177 | + (name a) |
| 178 | + (modules a b c) |
| 179 | + (enabled_if |
| 180 | + (= %{context_name} default))) |
| 181 | + |
| 182 | +;; can also be defined in src/dune(!) |
| 183 | +(library |
| 184 | + (name a) |
| 185 | + (modes melange) |
| 186 | + (modules a b c) |
| 187 | + (enabled_if |
| 188 | + (= %{context_name} melange))) |
| 189 | +``` |
| 190 | + |
| 191 | +In other words, we define two libraries named `a`, each in their own build |
| 192 | +context (with build artifacts ending up in `_build/default` and |
| 193 | +`_build/melange`). In the `melange` context, the library has `(modes melange)`. |
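| | + |
| | +For those `enabled_if` conditions to mean anything, the workspace has to define |
| | +both contexts. Here is a minimal `dune-workspace` sketch, assuming the file sits |
| | +at the project root and that the second context is simply named `melange` to |
| | +match the conditions above. The file placement and language version are |
| | +assumptions for illustration, so adjust them to your setup: |
| | + |
| | +```clj |
| | +;; dune-workspace (assumed to live at the project root) |
| | +(lang dune 3.16) |
| | + |
| | +;; the regular context; artifacts end up in _build/default |
| | +(context default) |
| | + |
| | +;; a second context named `melange`; artifacts end up in _build/melange |
| | +(context |
| | + (default |
| | +  (name melange))) |
| | +``` |
| | + |
| | +With a workspace along these lines, `%{context_name}` expands to `default` or |
| | +`melange` depending on which context a stanza is being evaluated in. |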
| 194 | + |
| 195 | +Both libraries contain modules `A`, `B` and `C` like before. Their |
| 196 | +corresponding source files can live in a single directory, no copying required. |
| 197 | +If we need to vary `C`'s implementation, we can express that in Dune rules: |
| 198 | + |
| 199 | +```clj |
| 200 | +(rule |
| 201 | + (target c.ml) |
| 202 | + (deps c.native.ml) |
| 203 | + (action |
| 204 | + (with-stdout-to |
| 205 | + %{target} |
| 206 | + (echo "let backend = \"OCaml\""))) |
| 207 | + (enabled_if |
| 208 | + (= %{context_name} default))) |
| 209 | + |
| 210 | +(rule |
| 211 | + (target c.ml) |
| 212 | + (deps c.melange.ml) |
| 213 | + (action |
| 214 | + (with-stdout-to |
| 215 | + %{target} |
| 216 | + (echo "let backend = \"Melange\""))) |
| 217 | + (enabled_if |
| 218 | + (= %{context_name} melange))) |
| 219 | +``` |
| 220 | + |
| 221 | +In short, both libraries get a module `C`. `c.ml`'s contents vary according to |
| 222 | +the build context. The example above is currently illustrative, even if |
| 223 | +functional. We're still working on the developer experience of multi-context |
| 224 | +libraries. This might not be the best setup for editor support, which we'll |
| 225 | +find out as we take it for a spin. |
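| | + |
| | +The same `enabled_if` split extends to the other `(library ..)` fields we set |
| | +out to vary. As a rough sketch, not a tested configuration, each context could |
| | +depend on a different React implementation and preprocessor. The library and |
| | +ppx names below (`server-reason-react.react`, `server-reason-react.ppx`, |
| | +`reason-react`, `reason-react-ppx`, `melange.ppx`) are assumptions about how |
| | +those packages expose themselves, so check each project's documentation before |
| | +relying on them: |
| | + |
| | +```clj |
| | +;; src/dune |
| | +;; native: compile the shared modules against server-reason-react (assumed names) |
| | +(library |
| | + (name a) |
| | + (modules a b c) |
| | + (libraries server-reason-react.react) |
| | + (preprocess |
| | +  (pps server-reason-react.ppx)) |
| | + (enabled_if |
| | +  (= %{context_name} default))) |
| | + |
| | +;; Melange: compile the same modules against reason-react (assumed names) |
| | +(library |
| | + (name a) |
| | + (modes melange) |
| | + (modules a b c) |
| | + (libraries reason-react) |
| | + (preprocess |
| | +  (pps melange.ppx reason-react-ppx)) |
| | + (enabled_if |
| | +  (= %{context_name} melange))) |
| | +``` |
| | + |
| | +Because only one of the two stanzas is enabled in any given context, each |
| | +field can be configured independently without the two definitions conflicting. |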
| 226 | + |
| 227 | +## Missing Pieces |
| 228 | + |
| 229 | +After migrating some of the libraries to the new configuration, we proved that |
| 230 | +compiling libraries with the same name in different contexts can work. |
| 231 | + |
| 232 | +Before deploying such a major change at scale, we need to get the developer |
| 233 | +experience right. To illustrate some examples: |
| 234 | + |
| 235 | +- [Dune](https://github.com/ocaml/dune/pull/10324) and |
| 236 | + [`ocaml-lsp`](https://github.com/ocaml/ocaml-lsp/pull/1238) must support |
| 237 | + selecting the context to know where to look for compiled artifacts; |
| 238 | +- Editor plugins must have commands or configuration associating certain files |
| 239 | + with their respective context; |
| 240 | +- Dune can do a better job of [showing the |
| 241 | + context](https://github.com/ocaml/dune/issues/10378) to which errors belong. |
| 242 | + |
| 243 | +We will need some additional time to let all the pieces fall into place |
| 244 | +before we can start recommending compiling Melange code in a separate Dune |
| 245 | +context. Before that happens, we wanted to share the problems we faced, how we |
| 246 | +ended up lifting some interesting limitations in a composable way, and the new |
| 247 | +constructs that will be available in Dune 3.16. |
| 248 | + |