Add composefs-ostree and some basic CLI tools by alexlarsson · Pull Request #144 · composefs/composefs-rs

alexlarsson · 2025-06-16T16:42:07Z

Based on ideas from #141

This is an initial version of ostree support. This allows pulling
from local and remote ostree repos, which will create a set of
regular file content objects, as well as a blob containing all the
remaining ostree objects. From the blob we can create an image.

When pulling a commit, a base blob (i.e. "the previous version") can be
specified. Any objects in that base blob will not be downloaded. If a
name is given for the pulled commit, then pre-existing blobs with the
same name will automatically be used as a base blob.
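As a rough illustration of the base blob idea (not the code from this PR), skipping objects that the base already has amounts to a set difference over checksums. The `Checksum` alias and the `objects_to_fetch` helper are hypothetical names used only for this sketch:

```rust
use std::collections::HashSet;

/// An ostree object checksum (sha256) as raw bytes. Hypothetical alias.
type Checksum = [u8; 32];

/// Returns the objects that still have to be downloaded: everything the new
/// commit needs that is not already present in the base blob.
/// The order of `wanted` is preserved and duplicates are dropped.
fn objects_to_fetch(wanted: &[Checksum], base: &HashSet<Checksum>) -> Vec<Checksum> {
    let mut seen = HashSet::new();
    wanted
        .iter()
        .filter(|c| !base.contains(*c) && seen.insert(**c))
        .copied()
        .collect()
}
```

With a base that already contains most of the new commit, the returned list is short, which is where the savings on incremental pulls would come from.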

This is an initial version and there are several things missing:

* Pull operations are completely serial
* There is no support for ostree summary files
* There is no support for ostree delta files
* There is no caching of local file availability (other than base blob)
* Local ostree repos only support archive mode

allisonkarlitskaya

I love this! Thanks for working on it!

I made some comments on the first round of commits. Feel free to adjust those and PR them separately: we can merge those now without further discussion.

The blobs thing is going to need a call.

I didn't review the crate addition in any detail at all. That's probably also going to need a call :)

alexlarsson · 2025-06-19T07:41:11Z

Hmmm, thinking more about this. We probably want a "content type" magic thing in the splitstream header as well, so we can error out if the wrapped thing is of the wrong type.
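A minimal sketch of what such a check could look like, assuming a header that starts with an 8-byte file magic followed by a little-endian 64-bit content type. The layout, the magic bytes and the tag values below are all invented for illustration and are not the actual splitstream format:

```rust
/// Hypothetical file magic; the real value is not given in this thread.
const SPLITSTREAM_MAGIC: [u8; 8] = *b"SplitStr";
/// Hypothetical content-type tags.
const CONTENT_TYPE_OSTREE_COMMIT: u64 = 1;
const CONTENT_TYPE_OBJECT_MAP: u64 = 2;

#[derive(Debug, PartialEq)]
enum HeaderError {
    TooShort,
    BadMagic,
    WrongContentType { expected: u64, found: u64 },
}

/// Checks the (assumed) 16-byte prefix: file magic, then content type.
fn check_header(header: &[u8], expected: u64) -> Result<(), HeaderError> {
    if header.len() < 16 {
        return Err(HeaderError::TooShort);
    }
    if header[..8] != SPLITSTREAM_MAGIC[..] {
        return Err(HeaderError::BadMagic);
    }
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&header[8..16]);
    let found = u64::from_le_bytes(raw);
    if found != expected {
        // The stream is a valid splitstream, but wraps the wrong thing.
        return Err(HeaderError::WrongContentType { expected, found });
    }
    Ok(())
}
```

Keeping the file magic and the content type as separate checks lets a caller tell "not a splitstream at all" apart from "a splitstream wrapping something else".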

alexlarsson · 2025-06-26T12:07:47Z

Ok. Reworked this to use splitstreams for object maps and commits. And, by using an object mapping to find the object map we make the content of the splitstream for the commit be just the commit data, and thus the sha256 of that splitstream matches the ostree commit id.

alexlarsson · 2025-06-26T12:08:17Z

@allisonkarlitskaya There is still lots to do here. But have a look at this approach and see what you think.

alexlarsson · 2025-06-27T16:19:58Z

Added some further changes. We now validate all objects when pulling and all non-file objects when creating images. It's hard to efficiently validate file objects during create-image, though: we would like to avoid re-reading the external object files to compute the sha256.

Remaining things to do:

* Stream larger objects into repo
* Support summaries and summary branches for remote repos
* Support deltas when remote pulling
* Parallelize downloads of objects
* Report pull progress in some sane way
* Use some kind of local cache for available objects other than just those from "previous version"
* Handle GPG validation of commit objects

alexlarsson · 2025-06-30T14:38:26Z

I started working on the delta support, but it failed because of an issue in gvariant-rs.

allisonkarlitskaya

It occurs to me that it might be interesting not to sort the table of fs-verity references, and it might also be interesting to permit duplicate items.

On the topic of deferring writing of objects to a background thread, this would allow us to write "external object #123" based on a sequential index to the splitstream without actually knowing the hash value yet, and then fill in the actual values in the header at the end when we're writing: it helps there that the fs-verity references aren't compressed and therefore not part of the stream...
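One way to picture the deferred-reference idea is a table that hands out sequential indexes first and gets the digests filled in later. This is only a sketch under that assumption; `RefTable` and its methods are hypothetical and not part of the splitstream code:

```rust
/// Placeholder for an fs-verity digest. Hypothetical alias.
type ObjectId = [u8; 32];

/// Hands out sequential reference indexes before the digests are known.
struct RefTable {
    slots: Vec<Option<ObjectId>>,
}

impl RefTable {
    fn new() -> Self {
        RefTable { slots: Vec::new() }
    }

    /// Reserve "external object #N"; the stream body only records N.
    fn reserve(&mut self) -> usize {
        self.slots.push(None);
        self.slots.len() - 1
    }

    /// Called once the background writer has computed the digest.
    fn resolve(&mut self, index: usize, id: ObjectId) {
        self.slots[index] = Some(id);
    }

    /// Produce the header table in reservation order: unsorted, and
    /// duplicates are allowed. Returns None if any slot is still unresolved.
    fn finish(self) -> Option<Vec<ObjectId>> {
        self.slots.into_iter().collect()
    }
}
```

Because the table is kept in reservation order rather than sorted, an index written into the stream body stays valid regardless of which digest eventually lands in that slot.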

cgwalters · 2025-09-05T12:58:46Z

It seems like we should get in the splitstream changes in 0f6d69e at least sooner rather than later? Can you file a separate PR?

This changes the splitstream format a bit, with the goal of allowing splitstreams to support ostree files as well (see composefs#144).

The primary differences are:

* The header is not compressed
* All referenced fs-verity objects are stored in the header, including external chunks, mapped splitstreams and (a new feature) references that are not used in chunks.
* The mapping table is separate from the reference table (and generally smaller), and indexes into it.
* There is a magic value to detect the file format.
* There is a magic content type to detect the type wrapped in the stream.
* We store a tag for what ObjectID format is used
* The total size of the stream is stored in the header.

The ability to reference file objects in the repo even if they are not part of the splitstream "content" will be useful for the ostree support to reference file content objects.

This change also allows more efficient GC enumeration, because we don't have to parse the entire splitstream to find the referenced objects.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
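To make the relationship between the two tables concrete, here is a small in-memory sketch, assuming the mapping table stores an index into the reference table. The `Header` struct and its field names are hypothetical, and this says nothing about the on-disk encoding:

```rust
/// Placeholder for an fs-verity digest. Hypothetical alias.
type ObjectId = [u8; 32];

/// In-memory sketch of the header tables; field names are hypothetical.
struct Header {
    /// Every fs-verity object referenced by the stream, whether or not a
    /// chunk in the stream body uses it.
    refs: Vec<ObjectId>,
    /// Mapped splitstreams: (content digest, index into `refs`).
    mappings: Vec<([u8; 32], usize)>,
}

impl Header {
    /// Resolve a mapped stream's digest to the object that stores it.
    fn lookup_mapping(&self, digest: &[u8; 32]) -> Option<&ObjectId> {
        self.mappings
            .iter()
            .find(|(d, _)| d == digest)
            .and_then(|&(_, i)| self.refs.get(i))
    }

    /// GC enumeration only needs the header, not the stream body.
    fn referenced_objects(&self) -> &[ObjectId] {
        &self.refs
    }
}
```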

cgwalters · 2026-01-29T14:18:54Z

> I also think the fact that it has nothing to do with OCI is great.

I think unless we prove out that composefs can be a very good way to store OCI, then it is not worth investing in. Thankfully that's not the case - I think it is (and I believe you do too!).

So it's not that it has "nothing to do with OCI" (right?) - how about "has the capability to easily/natively store any type of content that one would want to represent as read-only immutable versioned filesystem trees".

For example, today Android as far as I know uses fsverity on single zip files, and they've made it work quite well, but it's harder to get deduplication across apps that way, and maybe someday they go to a composefs-like model.

alexlarsson · 2026-01-29T14:59:12Z

I rebased this, let's see if CI passes now.

allisonkarlitskaya · 2026-01-29T15:23:34Z

> > I also think the fact that it has nothing to do with OCI is great.
>
> I think unless we prove out that composefs can be a very good way to store OCI, then it is not worth investing in. Thankfully that's not the case - I think it is (and I believe you do too!).
>
> So it's not that it has "nothing to do with OCI" (right?) - how about "has the capability to easily/natively store any type of content that one would want to represent as read-only immutable versioned filesystem trees".

Just to be clear, when I said "it has nothing to do with OCI" I specifically meant composefs-ostree, not composefs-rs generally (which very clearly was designed with OCI in mind).

Very obviously the main target of composefs-rs right now is bootc (OCI), probably followed by container storage (obviously also OCI). flatpak is probably a distant third at the moment, and indeed, even that has something to do with OCI (the current flatpak demo only works with OCI, in fact)...

> For example, today Android as far as I know uses fsverity on single zip files, and they've made it work quite well, but it's harder to get deduplication across apps that way, and maybe someday they go to a composefs-like model.

Ya, that's sort of what I meant... it would be cool to show that you can really do a lot of different things with this stuff...

cgwalters

Just an initial pass

cgwalters · 2026-01-29T19:49:45Z

+        if filetype.is_symlink() {
+            Ok((zlib_header, Box::new(empty())))
+        } else {
+            let fd_path = format!("/proc/self/fd/{}", path_fd.as_fd().as_raw_fd());


Tangential to this but I'd like to use https://docs.rs/crate/rustix-linux-procfs/latest I think

alexlarsson · 2026-04-20T16:46:27Z

I rebased this and fixed some comments. Still some work to do though.

Signed-off-by: Alexander Larsson <alexl@redhat.com>

This lets you look up a ref digest from the splitstream by index and is needed by the ostree code. Signed-off-by: Alexander Larsson <alexl@redhat.com>

This is basically ensure_object_from_fd(), but for anything implementing Read. ensure_object_from_fd() is now reimplemented on top of it. We will need this in the ostree support code for streaming a zlib compressed file to the repo. Signed-off-by: Alexander Larsson <alexl@redhat.com>

alexlarsson · 2026-06-16T15:27:56Z

Ok, I updated this to the latest version and added streaming creation of repo files and parallelized fetching. Plus some other cleanups.

alexlarsson · 2026-06-17T13:23:45Z

Ok, I spent some time on this, it's now much more like the "cfsctl oci" commands and behavior, and it does parallel fetches. I also added various integration tests. I think this is pretty complete for what it does (i.e. imports ostree commits into composefs and lets you mount it).

There are some TODOs for summary and delta support, but those are not necessarily super important for the basic functionality.

cgwalters · 2026-06-17T15:29:21Z

+                ref ostree_ref,
+                base_name,
+            } => {
+                eprintln!("Fetching {ostree_ref}");


Don't log via eprintln!; we have the progress API now.

Also on that topic...I think we should expose a varlink API for this now, right?

I guess neither of these need to strictly block merging though.

🤔 I guess actually...if we go down this varlink path, perhaps in theory we could have both the oci and ostree fetchers be extension binaries i.e. something like /usr/libexec/composefs/ext/oci is automatically cfsctl oci? That could be interesting...and would actually force us to have a good "core" varlink api.

cgwalters · 2026-06-17T16:08:15Z

+    for i in 1..256 {
+        // Bucket ends are (non-strictly) increasing
+        if buckets[i] < buckets[i - 1] {


In general in Rust many array accesses can be done more elegantly and more safely than just direct indexing. In this specific case I think https://doc.rust-lang.org/stable/std/primitive.slice.html#method.array_windows is what we want

array_windows is unstable though, do we really want to use that?

I used regular .windows() instead. Also, I spent some time in general rustifying the code and cleaning it up.
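For reference, the stable `slice::windows` form of the monotonicity check from the diff hunk above could look roughly like this. The function name and the `u32` element type are assumptions; the thread does not show the final code:

```rust
/// Returns true if the 256 bucket end offsets are (non-strictly) increasing.
/// `windows(2)` yields every adjacent pair, so no manual indexing is needed.
fn buckets_are_sorted(buckets: &[u32; 256]) -> bool {
    buckets.windows(2).all(|w| w[0] <= w[1])
}
```

Unlike the unstable `array_windows`, `windows(2)` yields slices rather than fixed-size arrays, so the pair is still read with `w[0]` and `w[1]`, but the loop bounds can no longer be off by one.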

cgwalters · 2026-06-17T16:11:12Z

+        // until the queue is drained and all in-flight fetches have completed.
+        let mut join_set: JoinSet<Result<FetchResult<ObjectID>>> = JoinSet::new();
+
+        loop {


We can interleave metadata and data fetches, it's what libostree does. Is it worth the added complexity? Maybe not.

probably not. This thing is actually surprisingly fast as is:

$ time target/debug/cfsctl --repo repo ostree pull https://dl.flathub.org/repo runtime/org.gnome.Platform/x86_64/50
Fetching runtime/org.gnome.Platform/x86_64/50
█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ 16288/16288
commit f6fb972824514aefc06b23d7d591192c9cba2ad72648bf0473d06a565c40c264
verity a2e16123b2310d65b6b886e89e4fc45ec61c47efd0b4daac05bd3132ac3d8f78890fcbb22d0975411917742d17e86ad52709c8ec1ff33a62b97e94f38528242a
image 079ade7aba4e5fef51b30a80e3d1805fd5f79d9a48f240340c7a94f98497d8265edee857563308c00574a1bab792168caa0152af04ef47ac15d8a935ca291e15
tagged runtime/org.gnome.Platform/x86_64/50
objects 2752 metadata + 16288 files fetched

real 0m13,006s
user 0m12,066s
sys 0m1,663s
$ du -csh repo
1,1G repo
1,1G total

cgwalters · 2026-06-17T16:18:07Z

+ *
+ * Commit splitstreams are mappings from a set of ostree sha256
+ * digests into the content for that ostree object. The content is
+ * defined as some data, and an optional ObjectID referencing an


I read this thing twice and didn't fully understand it. "The content is defined as some data" is vague 😄

Our goal is conceptually to define a serialization of an ostree commit into a single "stream", right? And then splitting out content objects as externals.

Hmm...why wouldn't it work to basically do what we do with tar, walk the commit in a depth-first manner, serializing metadata + externals as we go?

Well, the "some data" is just generally the ostree object data for the digest.
We don't just want to have a serialization, because we also want to use the commit as a way to efficiently look up ostree objects in the commit. We use this during pull to avoid pulling objects that were in the previous version of the commit.

So, it's more like a hash table from sha256 digests to objects that are optionally external object ids.

And it's not actually "inline data OR external refs", we sometimes have to have both, because we need to store metadata as well. So, more like we store the archive-z2 file header in the data, and then content in the external ref.
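A sketch of the "table from sha256 digests to objects" idea, assuming entries sorted by digest plus 256 bucket end offsets keyed on the first digest byte (the bucket layout is inferred from the `buckets` check in the earlier diff hunk). `Entry`, `ObjectIndex` and the field names are hypothetical and do not describe the actual encoding:

```rust
type Sha256 = [u8; 32];
type ObjectId = [u8; 32];

/// One ostree object: inline bytes plus an optional external reference.
struct Entry {
    digest: Sha256,
    /// Inline bytes, e.g. a metadata object or the header of a file object.
    data: Vec<u8>,
    /// Repo object holding the file content, if any.
    external: Option<ObjectId>,
}

struct ObjectIndex {
    /// Entries sorted by digest.
    entries: Vec<Entry>,
    /// bucket_ends[b] = number of entries whose first digest byte is <= b.
    bucket_ends: [usize; 256],
}

impl ObjectIndex {
    fn new(mut entries: Vec<Entry>) -> Self {
        entries.sort_by(|a, b| a.digest.cmp(&b.digest));
        let mut bucket_ends = [0usize; 256];
        for e in &entries {
            bucket_ends[e.digest[0] as usize] += 1;
        }
        // Turn per-bucket counts into running end offsets.
        let mut total = 0;
        for end in bucket_ends.iter_mut() {
            total += *end;
            *end = total;
        }
        ObjectIndex { entries, bucket_ends }
    }

    /// Narrow to one bucket by first byte, then binary search inside it.
    fn lookup(&self, digest: &Sha256) -> Option<&Entry> {
        let b = digest[0] as usize;
        let start = if b == 0 { 0 } else { self.bucket_ends[b - 1] };
        let bucket = &self.entries[start..self.bucket_ends[b]];
        bucket
            .binary_search_by(|e| e.digest.cmp(digest))
            .ok()
            .map(|i| &bucket[i])
    }
}
```

A file object would carry both its header in `data` and the content digest in `external`, while a dirmeta or dirtree object would only have `data`.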

> We don't just want to have a serialization, because we also want to use the commit as a way to efficiently look up ostree objects in the commit.

Would parsing all of the metadata be expensive though?

I mean, would it be impossible to change it to something else? For sure not, but what is the point? Allison and I spent a fair amount of time creating the new splitstream format specifically for this use. So it is the format we have, and it's intended to be efficient for what we use it for.

OK, fair! But can you spend some tokens clarifying the docs a bit at least?

I get the efficiency idea, but one thing that seems odd to me right now is that because we store this ostree-specific thing in the split stream content, it ends up zstd compressed. So we're at least reading the whole thing into RAM, we can't mmap etc.

With putting tar in split stream, this all made sense because we basically don't look at the tar stream unless we're copying the image out.

Also, while I get that it was nontrivial to design the format, there's also the traditional "cost of maintenance > cost of writing" to consider. Splitstream is a good bit of complexity on its own, but I think it's turned out mostly OK because for the OCI case it basically is a wrapper for a very well known thing - tar (ok well tar is a mess too, but it's a well-known mess). This work here is combining split stream with two entirely different more bespoke formats (splitstream-ostree and ostree).

I guess one way I'd say this is if you have a data format, it should have the ability to be converted to JSON, have a "structure checker" like fsck etc.

I'm aware I ~lost this argument before but e.g. https://cbor.io is pretty widely used. Does pulling in cbor for just this secondary bespoke binary format have the right cost/benefit? Perhaps not. (But, since we already need it: why not gvariant?)

Dunno. This is a discussion, nothing I am saying here is blocking.

I'll update the docs to be more readable, comprehensive and documenting the final/current state of things. And, I agree that having them zstd compressed does make it a bit weird for this to claim "efficiency", although I sort of agree with Allison's more modern view of mmap and its problems.

That said, I fundamentally think a "bucket of sha256 indexed objects" is the more correct format for an ostree thing. Serializing an ostree commit tree just feels wrong. Like, would you then duplicate things that were shared in many places (like dirmetas, or hardlinks)?

> I'll update the docs to be more readable, comprehensive and documenting the final/current state of things.

Thanks.

> That said, I fundamentally think a "bucket of sha256 indexed objects" is the more correct format for an ostree thing. Serializing an ostree commit tree just feels wrong. Like, would you then duplicate things that were shared in many places (like dirmetas, or hardlinks)?

No, I think the obvious flattened serialization would just have "each object is emitted once" semantics. Hardlinks are implicit in the ostree format - the data doesn't have st_nlink.
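The "each object is emitted once" semantics can be sketched as a depth-first walk with a visited set. The object graph here is a toy `HashMap` from an id to its children, standing in for commit, dirtree, dirmeta and file objects; it is an illustration of the proposal, not code from either side of this discussion:

```rust
use std::collections::{HashMap, HashSet};

/// Toy object id standing in for an ostree checksum.
type Id = u32;

/// Depth-first walk from `root` that emits every reachable object exactly
/// once, even if several parents share it (as with dirmetas or hardlinks).
fn serialize_once(root: Id, children: &HashMap<Id, Vec<Id>>) -> Vec<Id> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(id) = stack.pop() {
        if !seen.insert(id) {
            continue; // already emitted via another parent
        }
        out.push(id);
        if let Some(kids) = children.get(&id) {
            // Push in reverse so the first child is visited first.
            stack.extend(kids.iter().rev().copied());
        }
    }
    out
}
```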

I added doc/ostree.md which has more detailed documentation on the format, including some general faffing about ostree and how this is supposed to be used.

cgwalters · 2026-06-18T08:41:32Z

@allisonkarlitskaya You have a "changes requested" here which blocks merges

Based on ideas from composefs#141

This is an initial version of ostree support. This allows pulling from local and remote ostree repos, which will create a set of regular file content objects, as well as a commit splitstream containing all the remaining ostree objects and file data. From the splitstream we can create an image.

When pulling a commit, base commits (i.e. "the previous version") can be specified, either manually and/or added automatically based on parent commit or previous commit for the pulled ref. Any objects in that base commit will not be downloaded.

Commits are splitstreams named ostree-commit-xxxx, and refs that point to these are refs/ostree/$ref.

erofs images are automatically created for pulled commits, and they can be mounted with "cfsctl ostree mount". There are also some other subcommands that are similar to those of oci:

* dump
* compute-id
* inspect
* tag
* untag
* images

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Assisted-by: Claude Code (Opus 4.6)

alexlarsson force-pushed the ostree-support branch 2 times, most recently from e0e827f to 9c5b086 Compare June 17, 2025 06:54

allisonkarlitskaya requested changes Jun 17, 2025


alexlarsson mentioned this pull request Jun 17, 2025

Various repository fixes #146

Merged

alexlarsson force-pushed the ostree-support branch from 9c5b086 to cd067c5 Compare June 18, 2025 14:17

alexlarsson force-pushed the ostree-support branch 2 times, most recently from 2ed83a2 to c041afe Compare June 19, 2025 09:11

alexlarsson force-pushed the ostree-support branch from c041afe to dd0bf65 Compare June 26, 2025 12:06

alexlarsson force-pushed the ostree-support branch from dd0bf65 to d6a5b39 Compare June 27, 2025 16:02

alexlarsson force-pushed the ostree-support branch 4 times, most recently from 481e604 to e88573d Compare June 30, 2025 14:26

allisonkarlitskaya reviewed Jul 4, 2025


alexlarsson mentioned this pull request Sep 29, 2025

Preparatory splitstream format changes for ostree support #185

Merged

alexlarsson force-pushed the ostree-support branch 2 times, most recently from c788da2 to 2ee193a Compare October 6, 2025 14:58

alexlarsson force-pushed the ostree-support branch from 2ee193a to da310b0 Compare November 26, 2025 17:38

alexlarsson force-pushed the ostree-support branch from da310b0 to 8b32f51 Compare January 19, 2026 09:42

alexlarsson force-pushed the ostree-support branch from cc33c5f to 7ac06a0 Compare January 29, 2026 14:58

alexlarsson force-pushed the ostree-support branch from 7ac06a0 to 0deb546 Compare January 29, 2026 15:12

alexlarsson force-pushed the ostree-support branch 2 times, most recently from 1b98032 to 515fb7f Compare January 29, 2026 16:10

cgwalters reviewed Jan 29, 2026


alexlarsson force-pushed the ostree-support branch from 515fb7f to 9b1060f Compare April 20, 2026 16:46

alexlarsson force-pushed the ostree-support branch 2 times, most recently from 5fba232 to 1228e9b Compare June 2, 2026 15:49

alexlarsson added 3 commits June 16, 2026 15:32

Expose ErrnoFilter for other crates

bc8ca87

Signed-off-by: Alexander Larsson <alexl@redhat.com>

SplitStreamReader: Add lookup_external_ref()

8e523e6

This lets you look up a ref digest from the splitstream by index and is needed by the ostree code. Signed-off-by: Alexander Larsson <alexl@redhat.com>

alexlarsson force-pushed the ostree-support branch from 1228e9b to 8d8c6b2 Compare June 16, 2026 15:26

alexlarsson force-pushed the ostree-support branch from 8d8c6b2 to 5837fb4 Compare June 17, 2026 13:19

alexlarsson force-pushed the ostree-support branch 7 times, most recently from 188640d to efba46d Compare June 17, 2026 16:20

cgwalters reviewed Jun 17, 2026


alexlarsson force-pushed the ostree-support branch from efba46d to e966ce5 Compare June 18, 2026 11:04
