Show HN: Nucleus – A security-hardened, Nix-native container runtime

github.com

33 points by 0kenx a day ago

Hi HN, I've been building Nucleus, a lightweight Linux container runtime focused on two workloads: ephemeral AI-agent sandboxes and declarative NixOS services. It's a single Rust binary, no daemon.

It is not a Docker replacement and not a strict subset of Docker either. I dropped the entire image-and-distribution half (no Dockerfile, no layers, no registry, no pull/push, no persistent storage layer) in exchange for going deeper on isolation and reproducibility. The rootfs is either a directory copied into tmpfs (agent mode) or a Nix-built closure mounted read-only (production mode). If your mental model is "run my image instead of docker run," this won't fit. If it's "run untrusted or ephemeral workloads with stronger, auditable isolation on a single host," that's the target.

Things that I think are interesting:

  - Defense-in-depth defaults. All capabilities dropped, ~100-syscall seccomp allowlist (vs Docker's ~300), up to 8 namespaces including time/cgroup, Landlock LSM path ACLs per service.
  - Deny-by-default egress. Outbound traffic is denied unless you allow specific CIDRs or DNS-resolved domains. Enforced with namespace-local iptables rules.
  - Externalized, hash-pinned security policies. seccomp (JSON), capabilities (TOML), and Landlock (TOML) live as separate SHA-256-verified files, decoupled from the rootfs build. There's a nucleus seccomp generate that records syscalls in trace mode and emits a minimal profile.
  - gVisor as a first-class integrated runtime, not an add-on. Explicit network modes including a gvisor-host mode that's intentionally separate from native host networking.
  - Nix-native production path. nucleus.lib.mkRootfs builds locked-down closures; rootfs attestation verifies a per-file SHA-256 manifest at startup; first-class NixOS module.
  - Formal verification. TLA+ specs for the isolation/resource/filesystem/security/gVisor subsystems, checked with Apalache, plus property-based tests that drive the Rust implementation against the specs.

Honest tradeoffs:

  - Linux x86_64 only. No macOS/Windows/BSD, no plans.
  - No CNI, no overlay networks, no cluster orchestration. nucleus compose is a single-host TOML DAG over systemd, not Swarm/K8s.
  - Ephemeral-by-default storage. Persistence is opt-in via explicit --volume binds.
  - Agent mode applies several mechanisms best-effort by design (warn-and-continue on seccomp/Landlock failure). For fail-closed isolation on ephemeral workloads use --service-mode strict-agent; for long-running services use production mode.
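
To make the deny-by-default egress item above concrete, here is a minimal Python sketch of the allowlist decision only. It is not Nucleus code (the post says enforcement is done with namespace-local iptables rules); the EgressPolicy class and the example CIDRs (documentation and private ranges) are hypothetical and chosen purely for illustration.

```python
import ipaddress


class EgressPolicy:
    """Hypothetical deny-by-default egress check.

    A destination is allowed only if it falls inside one of the
    explicitly allowed CIDRs; an empty allowlist denies everything.
    """

    def __init__(self, allowed_cidrs):
        # strict=False tolerates entries like "10.1.2.3/8" with host bits set
        self.networks = [ipaddress.ip_network(c, strict=False) for c in allowed_cidrs]

    def allows(self, destination: str) -> bool:
        addr = ipaddress.ip_address(destination)
        # An IPv4 address is never "in" an IPv6 network (and vice versa),
        # so mixed-version entries simply do not match.
        return any(addr in net for net in self.networks)


policy = EgressPolicy(["10.0.0.0/8", "203.0.113.0/24"])
print(policy.allows("203.0.113.7"))         # inside an allowed CIDR
print(policy.allows("198.51.100.1"))        # not listed, so denied
print(EgressPolicy([]).allows("10.1.2.3"))  # empty allowlist denies all
```

The point of the sketch is the default: nothing matches unless a rule was written for it, which is the opposite of a blocklist.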

Cold-start is ~12ms in the native runtime. Postgres 18 pgbench numbers under Nucleus are within noise of bare metal in our harness (full results in benches/).

waterfisher 18 hours ago

Please, guys, I beg of you: even if you're going to let LLMs generate whole wheel-reinventing GitHub repositories for you (I've let them generate many!), at least write your Hacker News posts yourself. The ability to write a Hacker News post without LLM assistance non-trivially relates to the ability to develop good software, because it boils down to skills conceptualising the project in a way that makes sense to humans, such that the project is product-shaped, rather than loose-blob-of-proper-nouns shaped. It's just very difficult to invest trust in a piece of software doing the right thing when it's not clear someone on the other end has enough ability to express their own ends in writing to make clear what that right thing is.

  • mpalmer 18 hours ago

        If your mental model is "run my image instead of docker run," this won't fit. If it's "run untrusted or ephemeral workloads with stronger, auditable isolation on a single host," that's the target.
    
    This in particular is barely coherent.

wallzero 15 hours ago

This is neat! Is it rootless? Could it pair with devenv?

I've just gone down a rabbit hole with Fedora atomic desktop (Kinoite), Flatpak Zed, devcontainers with podman compose using the Debian container and nix feature, and devenv.

It allows me to keep an immutable OS while still having an infrastructure-as-code development experience. Also, team members on macOS or Windows can choose to use devcontainers to wrap devenv, or just skip devcontainers and the extra isolation. It's pretty portable.

  • 0kenx 6 hours ago

    Yes, it's rootless and can pair with devenv. macOS is unfortunately not supported because seccomp is not available.

  • lifeisstillgood 13 hours ago

    >>> devcontainers with podman compose using the Debian container and nix feature, and devenv.

    Can you expand on that please?

    • wallzero 4 hours ago

      Sure!

      Side note: Unfortunately, VSCode devcontainers aren't open source and do not work with VSCodium. Upvote if you'd like VSCode devcontainers open sourced. [1] This example should still work with VSCode and the devcontainer CLI, though.

      Also, Zed has some issues around Podman and SELinux with an open PR. [2] And unfortunately Podman Compose does not currently work with Flatpak Zed. [3]

      To enable Podman in Zed, add the following to Zed's 'settings.json':

        "use_podman": true
      
      Then we're just mostly following the guide:

      https://containers.dev/guide/dockerfile

      Create '.devcontainer/devcontainer.json':

        {
          "name": "projectName",
          "runArgs": ["--name", "projectName"],
          "dockerComposeFile": "docker-compose.yml",
          "service": "devcontainer",
          "features": {
            "ghcr.io/devcontainers/features/nix:1": {
              "packages": "devenv"
            }
          },
          "workspaceFolder": "/workspaces/${localWorkspaceFolderBasename}",
          "onCreateCommand": "nix-env -iA nixpkgs.devenv",
          "postCreateCommand": "git config --global user.name \"${GIT_USER_NAME}\" && git config --global user.email \"${GIT_USER_EMAIL}\" && git config --global --add --bool push.autoSetupRemote true && echo 'eval \"$(devenv hook bash)\"' | tee -a ~/.bashrc"
        
          // If compose isn't needed use the following:
          // "image": "mcr.microsoft.com/devcontainers/base:debian",
          // "containerEnv": {
          //   "GIT_USER_NAME": "${localEnv:GIT_USER_NAME}",
          //   "GIT_USER_EMAIL": "${localEnv:GIT_USER_EMAIL}",
          //   "SSH_AUTH_SOCK": "/run/host-services/ssh-auth.sock",
          // },
          // "mounts": [
          //   "source=${localEnv:XDG_RUNTIME_DIR}/ssh-agent.socket,target=/run/host-services/ssh-auth.sock,type=bind",
          // ],
        }
      
      Then create '.devcontainer/docker-compose.yml':

        name: projectName
        services:
          devcontainer:
            image: mcr.microsoft.com/devcontainers/base:debian
            command: sleep infinity
            userns_mode: keep-id
            environment:
              SSH_AUTH_SOCK: /run/host-services/ssh-auth.sock
              GIT_USER_EMAIL: ${GIT_USER_EMAIL?err}
              GIT_USER_NAME: ${GIT_USER_NAME?err}
              POSTGRES_DB: ${POSTGRES_DB:-projectName}
              POSTGRES_USER: ${POSTGRES_USER:-postgres}
              POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-postgres}
            ports:
              # To connect to postgres running inside the container
              - target: 5432
                published: 5432
                protocol: tcp
                host_ip: 127.0.0.1
                mode: host
            volumes:
              - ${XDG_RUNTIME_DIR}/ssh-agent.socket:/run/host-services/ssh-auth.sock:bind
              - ..:/workspaces/projectName:cached
      
      And lastly create 'devenv.nix':

        { pkgs, config, ... }: {
          env.GREET = "determinism";
        
          enterShell = ''
            echo hello ${config.env.GREET}
          '';
        
          packages = [
            pkgs.nodejs
            pkgs.yarn
          ];
        
          services = {
            postgres = {
              enable = true;
              listen_addresses = "0.0.0.0";
              hbaConf = ''
                # TYPE      DATABASE      USER      ADDRESS       METHOD
                  local       all         all                     peer
                  host        all         all       127.0.0.1/32  trust
                  host        all         all       0.0.0.0/0     md5
              '';
              initialDatabases = [
                {
                  name = "postgres";
                }
                {
                  name = "projectName";
                }
                {
                  name = "projectName_auth";
                }
              ];
              initialScript = ''
                CREATE ROLE postgres SUPERUSER LOGIN PASSWORD 'postgres';
                CREATE ROLE api LOGIN PASSWORD 'api';
                CREATE ROLE auth LOGIN PASSWORD 'auth';
              '';
              settings = {
                wal_level = "logical";
              };
            };
          };
        
          scripts = {
            drizzle.exec = "npx lerna run --scope @projectName/drizzle \"$@\"";
            better-auth.exec = "npx lerna run --scope @projectName/better-auth \"$@\"";
          };
        }
      
      On Linux with SELinux, until the PR [2] is merged, a workaround for Zed needs to be applied:

        # ~/.config/containers/containers.conf
        [containers]
        label = false
      
      After this you can work within a podman container, connect to adjacent compose services, and use nix and devenv. If a collaborator wants to skip containers, they can just run devenv locally. Though I think devcontainers running devenv is actually the easier route, provided that they are set up and working on your OS.

      And this all works pretty much out of the box without root on an immutable OS like Fedora Silverblue/Kinoite.

      ---

      [1](https://github.com/microsoft/vscode-remote-release/issues/11...)

      [2](https://github.com/zed-industries/zed/pull/58500)

      [3](https://github.com/flathub/dev.zed.Zed/pull/342#issuecomment...)

lavaman131 13 hours ago

Very cool to see more security focused tools being built here for the Nix ecosystem. What were some of the biggest roadblocks or challenges you hit when building this?

alberand 13 hours ago

Isn't it the same as using systemd-nspawn? containers.<name> lets you declare containers with nspawn. What's the difference?

  • 0kenx 6 hours ago

    my main reason for building this is gvisor/seccomp/capability/landlock

yjftsjthsd-h 15 hours ago

> rootfs attestation verifies a per-file SHA-256 manifest at startup;

What threat model does this protect against? Certainly nice, especially for free, but wondering about utility.

  • 0kenx 6 hours ago

    it's a simple integrity check for catching deployment drift/tampering.
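
    For readers wondering what a per-file SHA-256 manifest check looks like in practice, here is a minimal Python sketch under stated assumptions. It is not Nucleus's implementation or manifest format: build_manifest, verify_manifest, and the relative-path-to-hex-digest layout are all hypothetical. It records a digest per file, then reports files that are missing, modified, or unexpected.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large files are not loaded into memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return a sorted list of drift findings; an empty list means the tree matches."""
    current = build_manifest(root)
    problems = []
    for rel, digest in manifest.items():
        if rel not in current:
            problems.append(f"missing: {rel}")
        elif current[rel] != digest:
            problems.append(f"modified: {rel}")
    for rel in current.keys() - manifest.keys():
        problems.append(f"unexpected: {rel}")
    return sorted(problems)


def demo() -> list[str]:
    """Build a tiny tree, pin it, tamper with it, and report the drift."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "bin").mkdir()
        (root / "bin" / "app").write_bytes(b"v1")
        (root / "etc.conf").write_text("port=5432\n")
        manifest = build_manifest(root)
        (root / "bin" / "app").write_bytes(b"v2-tampered")  # simulated tampering
        (root / "extra").write_text("x")                    # simulated drift
        return verify_manifest(root, manifest)


print(demo())  # ['modified: bin/app', 'unexpected: extra']
```

    A check like this only catches changes made after the manifest was produced, and only if the manifest itself is stored somewhere the attacker cannot rewrite, which matches the "deployment drift/tampering" framing above.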