Julia sysimage for fast LLM-driven scripting
May 16, 2026
Using julia with LLMs
There is a lot of debate on which programming language is better suited for AI (I tried a few things myself), but the debate is mostly about software engineering and development, not academic research.
After using a coding agent for a grand total of one year, here is what has been useful:
- Easy to read: I can review code, skim it and understand its general structure.
- Good tooling: environments, project management, tests, modules, …
- Portable: can it also run on my coauthors’ machines without too much fussing?
- Easy to run short scripts: claude tends to run lots of short scripts to poke at the data¹ or to test out parts of a bigger script.
- Large available codebase for training
Some of these are idiosyncratic (easy to read? for whom?); the others are pretty much common sense. I can hear my Stata friends arguing that readability trumps everything else. I disagree!
Practically, julia checks all the boxes (for me). Julia is a great language for economics, with good libraries for data and statistics. It is reasonably easy to learn, has good tooling, and it’s portable.
Its downsides come from its relative lack of popularity with the general public (at least relative to its obvious competitor, python), which likely means a smaller training sample. It also suffers from an infamous time-to-first-plot problem that comes with its JIT nature: rapid iteration on small scripts can get a little slower than in other languages.
I can’t do much about the first problem, but it turns out there is a solution to the second. Some people propose running a daemon, but I am not sure it’s well adapted to llm coding.
Something easier is to compile your own julia sysimage for fast startup of a few essential packages that get reused again and again in a project (CSV, DataFrames, etc.).
Honestly, setting this up had never been a priority and it seemed too complicated … until claude needed it and was willing to figure it out for me.
Official documentation is available here.
A simple Julia sysimage
Some packages are large and can take a while to load each time we start a julia instance (DataFrames, CSV, DuckDB, FixedEffectModels, …).
At a high level, a custom Julia image bakes these packages into a single shared library that Julia loads at startup.
After that, using DataFrames is essentially free, and short scripts can be as fast as they would be in a compiled language.
The setup can be integrated into your project or kept separate in its own small environment.
For example, I created a sysimage directory with 4 main files:
sysimage/
├── Project.toml # deps for the build itself
├── packages.jl # the list of packages to bake in
├── precompile.jl # a representative workload to trace
└── build.jl # the build script
- `Project.toml` is the standard Julia environment file listing the dependencies needed for the build (the packages to bake in, plus `PackageCompiler` itself); without it, `Pkg.instantiate()` has nothing to install.
- `packages.jl` is just the list of packages I want baked in. Keep it to the heavy, stable ones; skip dev tooling (`Revise`, `OhMyREPL`, …) and project-internal packages that change all the time.

  ```julia
  const SYSIMAGE_PACKAGES = [
      :DataFrames, :DataFramesMeta, :DataPipes,
      :CSV, :Parquet2, :DuckDB, :JSON, :XLSX,
      :HTTP, :CategoricalArrays, :PrettyTables,
      :FixedEffectModels, :RegressionTables,
  ]
  ```
- `precompile.jl` is a small script that exercises the packages the way you actually use them: read a CSV, run a `@chain` pipe, fit a regression, write a parquet. `PackageCompiler` traces this to decide which methods to AOT-compile. The closer it matches real code, the bigger the speedup. You can ask claude to generate it based on what your code base looks like.
- `build.jl` ties it together:

  ```julia
  using Pkg
  Pkg.instantiate()
  using PackageCompiler
  include(joinpath(@__DIR__, "packages.jl"))
  create_sysimage(
      SYSIMAGE_PACKAGES;
      sysimage_path = "munis_sys.dylib",
      precompile_execution_file = joinpath(@__DIR__, "precompile.jl"),
      incremental = false,
  )
  ```
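For reference, a minimal `precompile.jl` could look like the sketch below. The workload, column names, and temp-file path are illustrative (they assume the DataFrames/CSV/FixedEffectModels stack from the package list above); the point is to mirror whatever your own short scripts actually do:

```julia
# precompile.jl — a representative workload for PackageCompiler to trace.
# Illustrative only: adapt it to the shape of your own scripts.
using DataFrames, DataFramesMeta, CSV, Statistics
using FixedEffectModels

# Round-trip a small CSV, the way short scripts usually start.
tmp = tempname() * ".csv"
CSV.write(tmp, DataFrame(x = rand(100), y = rand(100), g = rand(1:5, 100)))
df = CSV.read(tmp, DataFrame)

# Exercise the @chain / @combine macros.
s = @chain df begin
    groupby(:g)
    @combine :mx = mean(:x) :my = mean(:y)
end

# Fit a fixed-effects regression so reg() gets compiled too.
reg(df, @formula(y ~ x + fe(g)))

rm(tmp; force = true)
```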
Then compile it from the shell:
$ julia --project=sysimage -t 4 sysimage/build.jl
This takes a while (10-20 minutes on my machine). The artifact is not portable: it is tied to your CPU and Julia version, so if you work on different hosts you will have to build it on every one of them.
To use it, start Julia with the -J flag:
$ julia -J sysimage/munis_sys.dylib --project=. my_script.jl
It is faster!
$ SCRIPT='using DataFrames, DataFramesMeta, FixedEffectModels, Statistics
df = DataFrame(x = rand(10_000), y = rand(10_000), g = rand(1:50, 10_000))
s = @chain df begin
groupby(:g)
@combine :mx = mean(:x) :my = mean(:y)
end
reg(df, @formula(y ~ x + fe(g)))'
$ /usr/bin/time julia -e "$SCRIPT"
# 6.16 real 6.00 user 1.96 sys
$ /usr/bin/time julia -J munis_sys.dylib -e "$SCRIPT"
# 1.07 real 1.13 user 1.80 sys
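One convenience: the -J flag is easy to forget, and a coding agent will not add it on its own. A small shell wrapper can make the sysimage the default and fall back to plain julia when the image has not been built yet; a sketch, where the name jl is hypothetical and the path assumes the layout above:

```shell
# jl: hypothetical wrapper around julia that uses the sysimage when present.
jl() {
    sysimg="sysimage/munis_sys.dylib"
    if [ -f "$sysimg" ]; then
        # sysimage built: load it for fast startup
        julia -J "$sysimg" --project=. "$@"
    else
        # no sysimage yet: fall back to a plain call
        julia --project=. "$@"
    fi
}
```

Drop it in your shell profile, or point claude at it in the project instructions, so short scripts take the fast path automatically.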
1. Claude’s insistence on writing a 30-line script to find the fields of a csv (or even a json) is the reason I push it to use my cli data tools, or run duckdb using the cli. ↩︎