KitOps v1.14.0: SLSA Provenance and Composable Datasets Make ML Packaging Grow Up

KitOps just shipped v1.14.0, and it brings two features that make ML packaging feel less like wrestling a bear and more like actual engineering: SLSA Provenance attestations for your imports and composable dataset references that let you link ModelKits together like LEGO bricks.

Imagine you run a bakery. Every morning, you get deliveries of flour, sugar, butter, and eggs from different suppliers. Now imagine if every time you wanted to make a cake, you had to re-order all those ingredients individually, even though you already had them in the pantry. That is roughly what managing ML datasets and models felt like before this release. KitOps v1.14.0 fixes that, and throws in a supply-chain security bonus on top.

What is KitOps?

KitOps is a CNCF sandbox project that packages machine learning models, datasets, code, and configuration into a single portable artifact called a ModelKit. Think of it as a container image, but purpose-built for ML workloads. It uses the OCI standard (the same one behind Docker images), so your ML artifacts can live in the same registries you already use for containers.

The problem it solves? ML projects are messy. You have model weights in one place, training data in another, a notebook somewhere else, and maybe a YAML file nobody has updated in three months. KitOps bundles all of that into one versioned, pushable, pullable package. Platform engineers use it to deploy models. Data scientists use it to share reproducible experiments. If you have ever tried to move a model from a laptop to a registry to a production cluster and wanted to scream, KitOps is for you.

v1.14.0 is worth your attention because it adds SLSA Provenance attestations for supply-chain security and remote ModelKit references that let you compose datasets across multiple ModelKits without duplicating data. Both are genuine capability upgrades, not just polish.

What is New

SLSA Provenance Attestations for kit import

Supply-chain security has become a central concern in cloud-native, and for good reason. If you cannot prove where your artifacts came from and how they were built, you cannot trust them. KitOps now generates SLSA Provenance v1 predicates when you run kit import, using the new --attestation-output flag.

What this means in practice: every time you import an ML artifact, you get a machine-readable receipt documenting exactly what was imported and how. That attestation can then be signed and pushed to your OCI registry alongside the ModelKit using tools like cosign.

kit import ./my-model --attestation-output attestation.json
cosign attest --predicate attestation.json my-registry.io/my-model:latest

For anyone building compliant ML pipelines (and these days, who is not?), this is a meaningful step forward. You now have tamper-evident provenance for your imports without adding bespoke tooling. PR #1175
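To make the "receipt" idea concrete, here is a minimal Python sketch of how a SLSA Provenance v1 predicate gets bound to a specific artifact inside an in-toto Statement. Tools like cosign do this wrapping for you when you attest; the sketch only shows the shape. The predicate contents, the example.com URIs, and the wrap_predicate helper are all hypothetical, and the fields KitOps actually writes to attestation.json may differ.

```python
import hashlib
import json

SLSA_PROVENANCE_V1 = "https://slsa.dev/provenance/v1"
IN_TOTO_STATEMENT_V1 = "https://in-toto.io/Statement/v1"


def wrap_predicate(predicate: dict, subject_name: str, subject_bytes: bytes) -> dict:
    """Bind a provenance predicate to one artifact, identified by its SHA-256 digest."""
    subject_digest = hashlib.sha256(subject_bytes).hexdigest()
    return {
        "_type": IN_TOTO_STATEMENT_V1,
        "subject": [{"name": subject_name, "digest": {"sha256": subject_digest}}],
        "predicateType": SLSA_PROVENANCE_V1,
        "predicate": predicate,
    }


# Hypothetical predicate: placeholder values, not what KitOps necessarily emits.
example_predicate = {
    "buildDefinition": {
        "buildType": "https://example.com/kit-import/v1",  # placeholder URI
        "externalParameters": {"source": "./my-model"},
    },
    "runDetails": {"builder": {"id": "https://example.com/kit"}},  # placeholder URI
}

statement = wrap_predicate(example_predicate, "my-registry.io/my-model", b"manifest bytes")
print(json.dumps(statement, indent=2))
```

The important part is the subject digest: the provenance claim only applies to the exact bytes that hash to that value, which is what makes a signed attestation tamper-evident.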

Remote ModelKit References for Datasets

This is the headliner. Previously, the remotePath field in a Kitfile only supported S3 URLs for datasets. In v1.14.0, remotePath now accepts ModelKit references too. This means you can point a dataset layer at another ModelKit stored in an OCI registry instead of bundling the data locally.

Why does this matter? Consider a Kitfile that references a dataset stored as its own ModelKit:

manifestVersion: v1.0.0
model:
  path: my-model.safetensors
datasets:
- path: my-dataset/
  remotePath: docker.io/jozu/my-dataset:latest

Here is what happens at each stage:

  • Packing: The contents of my-dataset/ are ignored. No dataset layer is created in the manifest.
  • Pushing: Only the model layer gets pushed. Bandwidth saved, storage saved.
  • Pulling: The referenced ModelKit docker.io/jozu/my-dataset:latest is also pulled automatically.
  • Unpacking: With the --include-remote flag, the dataset contents land in my-dataset/ as if they were always there.

This is essentially dependency management for ML artifacts. You can now share a single dataset ModelKit across multiple model ModelKits, update the dataset in one place, and have every downstream model that references a moving tag such as latest pick up the change on its next pull. If you manage large ML environments with shared training data, this will save you real time and real bandwidth. PR #1185
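The pack and pull stages described above can be modeled in a few lines of Python. This is a simplified sketch, not KitOps code: plan_pack and plan_pull are hypothetical helpers, the Kitfile is represented as a plain dict mirroring the YAML example, and every remotePath is assumed to be a ModelKit reference.

```python
def plan_pack(kitfile: dict) -> list[str]:
    """Return the local paths that would become layers when packing."""
    layers = []
    model = kitfile.get("model")
    if model and "path" in model:
        layers.append(model["path"])
    for dataset in kitfile.get("datasets", []):
        if "remotePath" in dataset:
            continue  # remote dataset: local contents are ignored, no layer created
        layers.append(dataset["path"])
    return layers


def plan_pull(kitfile: dict) -> dict[str, str]:
    """Map each referenced ModelKit to the local directory it would unpack into."""
    return {
        dataset["remotePath"]: dataset["path"]
        for dataset in kitfile.get("datasets", [])
        if "remotePath" in dataset
    }


# Dict form of the Kitfile shown above.
kitfile = {
    "manifestVersion": "v1.0.0",
    "model": {"path": "my-model.safetensors"},
    "datasets": [
        {"path": "my-dataset/", "remotePath": "docker.io/jozu/my-dataset:latest"}
    ],
}

print(plan_pack(kitfile))
print(plan_pull(kitfile))
```

Only the model path shows up in the pack plan, while the pull plan lists the dataset ModelKit and the directory its contents would land in.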

Deeper Control with kit init --depth

The kit init command analyzes your project directory and generates a starter Kitfile, grouping files into datasets, models, code, and documentation layers. The new --depth flag lets you control how far into subdirectories this analysis goes.

With --depth=0 (the default), init behaves like it always did: it groups whole directories. Increase the depth, and init will drill into subdirectories to create more granular layers. This is useful when a subdirectory contains large files that really should not be lumped into a single layer. PR #1176
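To picture what a depth setting does to grouping, here is a small Python illustration. It is not the algorithm kit init uses (the release notes do not spell that out); group_by_depth is a hypothetical helper that groups file paths by their first depth + 1 directory components, and the file names are made up.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_depth(files: list[str], depth: int = 0) -> dict[str, list[str]]:
    """Group files by their leading directories, going `depth` levels below the top."""
    groups: dict[str, list[str]] = defaultdict(list)
    for file in files:
        dirs = PurePosixPath(file).parts[:-1]  # directory components only
        if not dirs:
            key = file  # a top-level file stands on its own
        else:
            key = "/".join(dirs[: depth + 1]) + "/"
        groups[key].append(file)
    return dict(groups)


files = [
    "README.md",
    "data/train/a.parquet",
    "data/train/b.parquet",
    "data/eval/c.parquet",
    "models/weights.safetensors",
]

print(sorted(group_by_depth(files, depth=0)))
print(sorted(group_by_depth(files, depth=1)))
```

At depth 0 everything under data/ is one group; at depth 1 it splits into data/train/ and data/eval/, which is the kind of finer-grained layering the flag is meant to enable.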

OCI Created Annotations on Pack

KitOps now sets the org.opencontainers.image.created annotation on manifests during kit pack. This is a small but correct change: OCI-compliant registries and tooling expect this annotation, and having it means your ModelKits play better with the broader container ecosystem. The trade-off is that re-packing the same ModelKit will produce a different manifest digest, because the timestamp changes each time (individual layers keep their reproducible digests). PR #1139 (first contribution by @itniuma2026)
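Why a timestamp changes the manifest digest but not the layer digests is easy to see in a toy model. The Python sketch below is a simplification: a real OCI manifest has more fields and its digest is taken over the exact serialized bytes, whereas here build_manifest and digest are hypothetical helpers that hash a canonical JSON dump, and the timestamps are made up.

```python
import hashlib
import json


def digest(obj: dict) -> str:
    """Hash a canonical JSON encoding of obj (stand-in for hashing manifest bytes)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return "sha256:" + hashlib.sha256(canonical).hexdigest()


def build_manifest(layer_blobs: list[bytes], created: str) -> dict:
    """Build a toy manifest: content-addressed layers plus a created annotation."""
    return {
        "schemaVersion": 2,
        "layers": [
            {"digest": "sha256:" + hashlib.sha256(blob).hexdigest(), "size": len(blob)}
            for blob in layer_blobs
        ],
        "annotations": {"org.opencontainers.image.created": created},
    }


blobs = [b"model weights", b"training data"]
first = build_manifest(blobs, "2025-01-01T00:00:00Z")
second = build_manifest(blobs, "2025-01-02T00:00:00Z")

print(first["layers"] == second["layers"])  # layers depend only on content
print(digest(first) == digest(second))      # manifest also covers the annotation
```

Layer digests depend only on the blob contents, so they match across packs; the manifest includes the annotation, so its digest moves with the timestamp. If you pin ModelKits by manifest digest, pin the one you actually pushed rather than expecting a re-pack to reproduce it.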

Broader Tool Auto-Detection for Skills Unpacking

When unpacking ModelKits as skills, KitOps can now auto-detect more tools and agents. If you are using KitOps to distribute AI agent configurations, this means less manual configuration and more “it just works” out of the box. PR #1181

Bug Fixes

Version notifications are now less annoying. Instead of nagging you every time you run a command on an outdated version, KitOps will only notify you once every 24 hours or when a new version is actually released. A small quality-of-life fix that your terminal will thank you for. PR #1188
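The rule described here (once every 24 hours, or sooner when a new version appears) can be written as a small predicate. This Python sketch is an illustration of that rule, not the KitOps implementation: should_notify and its parameters are hypothetical, and versions are compared only for equality rather than parsed and ordered.

```python
from datetime import datetime, timedelta
from typing import Optional

NOTIFY_INTERVAL = timedelta(hours=24)


def should_notify(
    current: str,
    latest: str,
    last_notified_at: Optional[datetime],
    last_notified_version: Optional[str],
    now: datetime,
) -> bool:
    """Decide whether to show an update notice on this command run."""
    if latest == current:
        return False  # already up to date
    if last_notified_at is None:
        return True  # never notified before
    if latest != last_notified_version:
        return True  # a release appeared since the last notice
    return now - last_notified_at >= NOTIFY_INTERVAL


now = datetime(2025, 1, 2, 12, 0)
an_hour_ago = now - timedelta(hours=1)
two_days_ago = now - timedelta(days=2)

print(should_notify("v1.13.0", "v1.14.0", an_hour_ago, "v1.14.0", now))
print(should_notify("v1.13.0", "v1.14.0", two_days_ago, "v1.14.0", now))
```

The first call stays quiet because the same release was announced an hour earlier; the second speaks up again because more than 24 hours have passed.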

KitOps v1.14.0 is the release where ML packaging grew up a little: verifiable provenance for your imports and composable dataset references that treat your OCI registry like the package manager it always wanted to be.

Learn More