Julia sysimage for fast LLM-driven scripting
May 16, 2026
Using julia with LLMs
There is a lot of debate on which programming language is better suited for AI (I tried a few things myself), but the debate is mostly about software engineering and development, not academic research.
After using a coding agent for a grand total of one year, here is what has been useful:
- Easy to read: I can review code, skim it and understand its general structure.
- Good tooling: environments, project management, tests, modules, …
- Portable: can it also run on my coauthors’ machines without too much fussing?
- Easy to run short scripts: claude tends to run lots of short scripts to poke at the data¹ or to test out parts of a bigger script.
- Large available codebase for training
Some of these are idiosyncratic (easy to read? for whom?); the others are pretty much common sense. I can hear my Stata friends arguing that readability trumps everything else. I disagree!
Practically, julia checks all the boxes (for me). Julia is a great language for economics, with good libraries for data and statistics. It is reasonably easy to learn, has good tooling, and it’s portable.
Its downsides come from its relative lack of popularity with the general public (at least relative to its obvious competitor, python), which likely means a smaller training sample. It also suffers from an infamous time-to-first-plot problem that comes with its JIT nature: rapid iteration on small scripts can get a little slower than in other languages.
I can’t do much about the first problem, but it turns out there is a solution to the second. Some people propose running a daemon, but I am not sure it’s well adapted to llm coding.
Something easier is to compile your own julia sysimage for fast startup of a few essential packages that get reused again and again in a project (CSV, DataFrames, etc.).
Honestly, setting this up had never been a priority and it seemed too complicated … until claude needed it and was willing to figure it out for me.
Official documentation is available here.
A simple Julia sysimage
Some packages are large and can take a while to load each time we start a julia instance (DataFrames, CSV, DuckDB, FixedEffectModels, …).
At a high level, a custom Julia image bakes these packages into a single shared library that Julia loads at startup.
After that, using DataFrames is essentially free, and short scripts can be as fast as they would be in a compiled language.
The setup can be integrated into your project or kept separate in its own small environment.
For example, I created a sysimage directory with 4 main files:
sysimage/
├── Project.toml # deps for the build itself
├── packages.jl # the list of packages to bake in
├── precompile.jl # a representative workload to trace
└── build.jl # the build script
- `Project.toml` is the standard Julia environment file listing the dependencies needed for the build (the packages to bake in, plus `PackageCompiler` itself); without it, `Pkg.instantiate()` has nothing to install.
- `packages.jl` is just the list of packages I want baked in. Keep it to the heavy, stable ones; skip dev tooling (`Revise`, `OhMyREPL`, …) and project-internal packages that change all the time.

  ```julia
  const SYSIMAGE_PACKAGES = [
      :DataFrames, :DataFramesMeta, :DataPipes,
      :CSV, :Parquet2, :DuckDB, :JSON, :XLSX,
      :HTTP, :CategoricalArrays, :PrettyTables,
      :FixedEffectModels, :RegressionTables,
  ]
  ```
- `precompile.jl` is a small script that exercises the packages the way you actually use them: read a CSV, run a `@chain` pipe, fit a regression, write a parquet. `PackageCompiler` traces this to decide which methods to AOT-compile. The closer it matches real code, the bigger the speedup. You can ask claude to generate it based on what your code base looks like.
- `build.jl` ties it together:

  ```julia
  using Pkg
  Pkg.instantiate()
  using PackageCompiler
  include(joinpath(@__DIR__, "packages.jl"))
  create_sysimage(
      SYSIMAGE_PACKAGES;
      sysimage_path = "munis_sys.dylib",
      precompile_execution_file = joinpath(@__DIR__, "precompile.jl"),
      incremental = false,
  )
  ```
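For reference, a minimal `precompile.jl` could look like the sketch below. The workload, column names, and temp-file path are illustrative (they assume the DataFrames/CSV/FixedEffectModels stack from the package list above); the point is to mirror whatever your own short scripts actually do:

```julia
# precompile.jl — a representative workload for PackageCompiler to trace.
# Illustrative only: adapt it to the shape of your own scripts.
using DataFrames, DataFramesMeta, CSV, Statistics
using FixedEffectModels

# Round-trip a small CSV, the way short scripts usually start.
tmp = tempname() * ".csv"
CSV.write(tmp, DataFrame(x = rand(100), y = rand(100), g = rand(1:5, 100)))
df = CSV.read(tmp, DataFrame)

# Exercise the @chain / @combine macros.
s = @chain df begin
    groupby(:g)
    @combine :mx = mean(:x) :my = mean(:y)
end

# Fit a fixed-effects regression so reg() gets compiled too.
reg(df, @formula(y ~ x + fe(g)))

rm(tmp; force = true)
```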
Then compile it from the shell:
$ julia --project=sysimage -t 4 sysimage/build.jl
This takes a while (10-20 minutes on my machine). The artifact is not portable: it is tied to your CPU and Julia version, so if you work on different hosts you will have to build it on every one of them.
To use it, start Julia with the -J flag:
$ julia -J sysimage/munis_sys.dylib --project=. my_script.jl
It is faster!
$ SCRIPT='using DataFrames, DataFramesMeta, FixedEffectModels, Statistics
df = DataFrame(x = rand(10_000), y = rand(10_000), g = rand(1:50, 10_000))
s = @chain df begin
groupby(:g)
@combine :mx = mean(:x) :my = mean(:y)
end
reg(df, @formula(y ~ x + fe(g)))'
$ /usr/bin/time julia -e "$SCRIPT"
# 6.16 real 6.00 user 1.96 sys
$ /usr/bin/time julia -J munis_sys.dylib -e "$SCRIPT"
# 1.07 real 1.13 user 1.80 sys
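One convenience: the -J flag is easy to forget, and a coding agent will not add it on its own. A small shell wrapper can make the sysimage the default and fall back to plain julia when the image has not been built yet; a sketch, where the name jl is hypothetical and the path assumes the layout above:

```shell
# jl: hypothetical wrapper around julia that uses the sysimage when present.
jl() {
    sysimg="sysimage/munis_sys.dylib"
    if [ -f "$sysimg" ]; then
        # sysimage built: load it for fast startup
        julia -J "$sysimg" --project=. "$@"
    else
        # no sysimage yet: fall back to a plain call
        julia --project=. "$@"
    fi
}
```

Drop it in your shell profile, or point claude at it in the project instructions, so short scripts take the fast path automatically.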
1. Claude’s insistence on writing a 30-line script to find the fields of a csv (or even a json) is the reason I push it to use my cli data tools, or run duckdb using the cli. ↩︎