Fix Zinc worker performance regression #5710
Conversation
…erformance-regression-minimized
…r-performance-regression-minimized

```
# Conflicts:
#	integration/feature/startup-shutdown/src/StartupShutdownTests.scala
#	libs/daemon/server/src/mill/server/Server.scala
```
```diff
-  os.write.append(daemonDir / DaemonFiles.serverLog, s"$s\n", createFolders = true)
+  try os.write.append(daemonDir / DaemonFiles.serverLog, s"$s\n", createFolders = true)
+  catch {
+    case _: ClosedByInterruptException => // write was interrupted with a thread interrupt
+  }
```
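For context, here is a minimal standalone sketch of the pattern in this hunk, using a raw `FileChannel` instead of Mill's `os.write.append` (the helper name is illustrative, not Mill's code). When a thread's interrupt flag is set, interruptible NIO channel writes fail with `ClosedByInterruptException`, which this helper swallows so an interrupted shutdown doesn't spam the log:

```scala
import java.nio.ByteBuffer
import java.nio.channels.{ClosedByInterruptException, FileChannel}
import java.nio.file.{Path, StandardOpenOption}

// Illustrative helper, not Mill's actual code: append a log line, but
// tolerate the write being cut short by a thread interrupt.
def appendIgnoringInterrupt(log: Path, s: String): Unit =
  try {
    val ch = FileChannel.open(
      log,
      StandardOpenOption.CREATE,
      StandardOpenOption.WRITE,
      StandardOpenOption.APPEND
    )
    try ch.write(ByteBuffer.wrap((s + "\n").getBytes("UTF-8")))
    finally ch.close()
  } catch {
    case _: ClosedByInterruptException => // interrupted mid-write: drop this log line
  }
```

The trade-off is that a log line written during shutdown may be silently lost, which is acceptable for diagnostic logging.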
@lihaoyi does this look sensible, silencing the exception in case the thread was interrupted?
Not sure, does this happen regularly? If so, skipping the logging seems fine.
I've noticed it when running the Netty tests; I was getting spammed in the console with random-looking stack traces, which led to 14a2986. So, pretty regularly in bigger builds.
https://github.com/arturaz/mill into fix/5693-zinc-worker-performance-regression-minimized
Open problems:

* `lint-autofix` seems like it takes forever before crashing in https://github.com/com-lihaoyi/mill/actions/runs/17039983244/job/48301519852?pr=5713 and https://github.com/com-lihaoyi/mill/actions/runs/17039983244/job/48304434864?pr=5713, while on `1.0.3` it takes ~6min (https://github.com/com-lihaoyi/mill/actions/runs/17038734067/job/48297600736?pr=5703).
* The crash seems to be related to `java.lang.OutOfMemoryError: Compressed class space`, but there's also this stack trace that indicates it's failing in the `semanticDbData` task:

```
integration.ide[bsp-server].semanticDbData java.lang.IllegalStateException: RPC wire has broken (ZincWorker,TCP localhost/127.0.0.1:45263 -> /127.0.0.1:47582). The server probably crashed.
    mill.rpc.MillRpcClient$.mill$rpc$MillRpcClient$$$_$awaitForResponse$1(MillRpcClient.scala:60)
    mill.rpc.MillRpcClient$$anon$1.apply$$anonfun$1(MillRpcClient.scala:103)
    mill.rpc.MillRpcClient$.mill$rpc$MillRpcClient$$$_$withRequestId$1(MillRpcClient.scala:35)
    mill.rpc.MillRpcClient$$anon$1.apply(MillRpcClient.scala:101)
    mill.javalib.worker.JvmWorkerImpl$$anon$3.compileMixed$$anonfun$1(JvmWorkerImpl.scala:426)
    mill.javalib.worker.JvmWorkerImpl.$anonfun$9(JvmWorkerImpl.scala:390)
    mill.util.Timed$.apply(Timed.scala:14)
    mill.javalib.worker.JvmWorkerImpl.withRpcClient$1$$anonfun$2$$anonfun$1$$anonfun$2(JvmWorkerImpl.scala:390)
    mill.client.ServerLauncher.runWithConnection(ServerLauncher.java:103)
    mill.javalib.worker.JvmWorkerImpl.withRpcClient$1$$anonfun$2$$anonfun$1(JvmWorkerImpl.scala:362)
    scala.util.Using$Manager.scala$util$Using$Manager$$manage(Using.scala:202)
    scala.util.Using$Manager$.apply(Using.scala:255)
    mill.javalib.worker.JvmWorkerImpl.withRpcClient$1$$anonfun$2(JvmWorkerImpl.scala:346)
    mill.util.CachedFactoryWithInitData.withValue(CachedFactory.scala:54)
    mill.javalib.worker.JvmWorkerImpl.mill$javalib$worker$JvmWorkerImpl$$_$withRpcClient$1(JvmWorkerImpl.scala:345)
    mill.javalib.worker.JvmWorkerImpl$$anon$3.compileMixed(JvmWorkerImpl.scala:420)
    mill.javalib.worker.JvmWorkerImpl.$anonfun$6(JvmWorkerImpl.scala:74)
    mill.util.Timed$.apply(Timed.scala:14)
    mill.javalib.worker.JvmWorkerImpl.compileMixed(JvmWorkerImpl.scala:74)
    mill.scalalib.ScalaModule.semanticDbData$$anonfun$1$$anonfun$1(ScalaModule.scala:601)
    mill.api.Task$Named.evaluate(Task.scala:338)
    mill.api.Task$Named.evaluate$(Task.scala:323)
    mill.api.Task$Computed.evaluate(Task.scala:349)
```

`integration.ide[bsp-server]` does not have `jvmId` defined, so I would not expect it to run via `ZincWorkerMain`. But `SemanticDbJavaModule.javacOptionsTask` adds a bunch of `-J--add-exports` options, which causes the `javaRuntimeOptions.options.isEmpty && javaHome.isEmpty` check to fail, which causes all `semanticDbData` task compilation to run in a `ZincWorkerMain` subprocess.

There are two areas to investigate:

1. `semanticDbData` should probably not force every compilation to run in a subprocess. Not sure what flags to tweak, but we should tweak something.
   * Do we actually need all those `-J--add-exports` flags? If we don't, we can remove them, which would remove the need for forking a `ZincWorkerMain`.
   * If we do need them, we could move them onto the `MillDaemonMain`/`MillNoDaemonMain` process itself, again so we don't need to fork a `ZincWorkerMain` to run `compileMixed` in `semanticDbData`.
2. Even if we run everything in the subprocess, it should run about as fast and stably as it does in-process.
   * It should not be causing massive slowdowns or massive resource-footprint increases and running out of memory just because it's running in a subprocess.
   * #5710 seems to have fixed it for Java modules, but it seems there's some other performance bug compiling Scala modules.

We may not be properly re-using classloaders (causing the slowdown), or we may be leaking them (causing the `OutOfMemory`).

Apart from the Mill repo, we can also use the `example/thirdparty/gatling/` build as a benchmark, and ensure that `./mill clean && ./mill __.compile` and `./mill clean && ./mill __.semanticDbData` on that project have similar performance whether or not a custom `def jvmId` is provided on its submodules.
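To make the forking condition above concrete, here is a hypothetical model of the check (names assumed, not Mill's actual code) showing how the `-J--add-exports` flags added by `SemanticDbJavaModule.javacOptionsTask` flip compilation from in-process to a `ZincWorkerMain` subprocess:

```scala
// Hypothetical model of the check described above, not Mill's actual code:
// in-process compilation is only possible when no custom JVM runtime
// options and no custom javaHome are requested.
final case class JvmConfig(javaRuntimeOptions: Seq[String], javaHome: Option[String]) {
  def canRunInProcess: Boolean = javaRuntimeOptions.isEmpty && javaHome.isEmpty
}

// No overrides: stays in-process.
// JvmConfig(Nil, None).canRunInProcess == true

// semanticDbData adds -J--add-exports runtime flags, so it gets forked:
// JvmConfig(Seq("--add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED"), None)
//   .canRunInProcess == false
```

Under this model, either dropping the flags or hoisting them onto the daemon process itself would make the condition hold again and keep `semanticDbData` compilation in-process.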
Fixes the delay when using the Mill RPC infrastructure and sending large messages over it.

The cause was `PipedInputStream`/`PipedOutputStream` waiting for one second when its internal buffer fills up. Because the `ProxyStream` protocol used a very small (126-byte) buffer, things got slow very fast once a large message was sent.

This PR removes the dependence on the piped streams and uses socket streams directly. It also ditches the `ProxyStream` protocol for Mill RPC and instead sends heartbeats as empty lines.
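As a rough illustration of the empty-line heartbeat scheme (a sketch under my own assumptions, not Mill's actual wire format): if heartbeats are blank lines, the reader can simply skip them and treat the next non-empty line as the message, with no framing protocol needed:

```scala
import java.io.{BufferedReader, StringReader}

// Illustrative reader, not Mill's actual protocol code: empty lines are
// heartbeats (keep-alives) and are skipped; non-empty lines are messages.
// Returns None once the stream is exhausted.
def readMessage(in: BufferedReader): Option[String] = {
  var line = in.readLine()
  while (line != null && line.isEmpty) line = in.readLine() // skip heartbeats
  Option(line)
}
```

The sender can then emit a bare `"\n"` periodically to prove liveness without interfering with message delivery, as long as real messages are never empty.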
Other changes:
* `Server` is now a basic server; the `ProxyStream`-specific logic was moved to `ProxyStreamServer`.
* Logs are written to `out/mill/javalib/JvmWorkerModule/internalWorker.dest/jvm-worker.log` and `out/mill/javalib/JvmWorkerModule/internalWorker.dest/zinc-worker/DAEMON_HASH/daemon/server.log` respectively.
* `DebuggingInputStream` and `DebuggingOutputStream` were created and left in the codebase in case we need to debug something in the future.
* `mill.util.Timed` was introduced as a utility to time code blocks.

This has been tested manually:
The first run has to spawn a Zinc subprocess, so it takes 10s, but subsequent compilations are fast, as we expect them to be.
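The `mill.util.Timed` utility mentioned above could look roughly like this (a hypothetical sketch; the actual API in the PR may differ):

```scala
// Illustrative sketch of a timing utility in the spirit of mill.util.Timed
// (the real API may differ): runs a block and captures how long it took.
final case class Timed[A](value: A, durationMs: Long)

object Timed {
  def apply[A](block: => A): Timed[A] = {
    val start = System.nanoTime()
    val result = block
    Timed(result, (System.nanoTime() - start) / 1000000L)
  }
}
```

Such a helper keeps ad-hoc `System.nanoTime()` bookkeeping out of the worker code while still returning the block's result.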
Fixes #5693