CONCEPT Cited by 1 source
File vs. Object Semantics¶
File semantics (the OS filesystem contract applications have been written against for 50 years) and object semantics (the S3-style immutable-blob-with-HTTP-API contract) differ on almost every axis that matters when you try to fuse them. The 2026 S3 Files post enumerates these asymmetries explicitly — and the design lesson from its six abandoned convergence attempts is that the differences are load-bearing, not cosmetic. You can't collapse them without breaking applications built against either side.
(Source: sources/2026-04-07-allthingsdistributed-s3-files-and-the-changing-face-of-s3)
The five axes of incongruity¶
1. Mutation granularity¶
Files:
- In-place sub-region updates — "update a record in a database in
place, or append data to a log"
- Concurrent writers + readers observe "changes almost
instantaneously, to an arbitrary sub-region of the file"
- mmap() as shared persistent data that can mutate at a very fine
granularity as if it were in-memory structures
- Rich, mutation-heavy, semantically deep.
Objects:
- "Writing to the middle of an object while someone else is accessing it is more or less sacrilege"
- Immutability is "cooked into APIs and applications"
- Whole-object replacement only; versioning as the answer to "preserve old copies"
- Focused, narrow, immutable.
See concepts/immutable-object-storage.
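The contrast above can be made concrete with a minimal sketch. The file side uses Python's standard `mmap` module to overwrite four bytes in the middle of a file. The object side uses `ToyObjectStore`, a hypothetical in-memory stand-in (not a real S3 client), where the only way to change bytes is to read the whole blob, rebuild it, and put it back.

```python
import mmap
import os
import tempfile

# File side: overwrite 4 bytes in the middle of a file, leaving the rest untouched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.bin")
    with open(path, "wb") as f:
        f.write(b"AAAABBBBCCCC")

    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as view:
            view[4:8] = b"XXXX"  # in-place sub-region update
            view.flush()

    with open(path, "rb") as f:
        file_result = f.read()


class ToyObjectStore:
    """Hypothetical in-memory model of whole-object PUT/GET (an assumption)."""

    def __init__(self):
        self._blobs = {}

    def put(self, key: str, body: bytes) -> None:
        self._blobs[key] = bytes(body)  # always replaces the whole value

    def get(self, key: str) -> bytes:
        return self._blobs[key]


# Object side: the same logical change requires read, rebuild, full replace.
store = ToyObjectStore()
store.put("records.bin", b"AAAABBBBCCCC")
old = store.get("records.bin")
store.put("records.bin", old[:4] + b"XXXX" + old[8:])
object_result = store.get("records.bin")
```

Both sides end with the same bytes, but the object path moved the entire blob to change four bytes, and a concurrent reader would see either the old object or the new one, never a partially updated one.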
2. Atomicity primitives¶
Files:
- Atomic rename (the canonical "make a change visible all at once" primitive, used everywhere from compilers emitting a binary to daemons rotating logs)
- Atomic directory rename (the same primitive, scaled)
- Both are first-class, cheap, synchronous
Objects:
- No native rename: renaming means copy + delete
- No native directory rename: S3 has no directories, just key-prefix conventions
- Conditional PUT (compare-and-set) partially covers some rename-as-atomic-switch patterns but is not the same semantic
- See patterns/conditional-write
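A minimal sketch of the two sides, under stated assumptions. `publish_atomically` is the usual write-to-temp-then-rename idiom built on `os.replace`, which is atomic on POSIX when both paths are on the same filesystem. `ToyObjectStore` and `rename_object` are hypothetical in-memory stand-ins that model rename as copy + delete; they are not an S3 API.

```python
import os
import tempfile


def publish_atomically(path: str, data: bytes) -> None:
    """Write to a temp file in the target's directory, then rename over the target."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # readers see the old file or the new one
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


class ToyObjectStore:
    """Hypothetical in-memory object store (an assumption, not a real client)."""

    def __init__(self):
        self.blobs = {}

    def copy(self, src: str, dst: str) -> None:
        self.blobs[dst] = self.blobs[src]

    def delete(self, key: str) -> None:
        del self.blobs[key]


def rename_object(store, src, dst, observe=None):
    """Rename as copy + delete; `observe` sees the state between the two steps."""
    store.copy(src, dst)
    if observe is not None:
        observe(sorted(store.blobs))
    store.delete(src)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "app.bin")
    publish_atomically(target, b"v1")
    publish_atomically(target, b"v2")
    with open(target, "rb") as f:
        published = f.read()
    leftover = os.listdir(tmp)

store = ToyObjectStore()
store.blobs["build/tmp"] = b"v2"
seen = []
rename_object(store, "build/tmp", "build/app.bin", observe=seen.append)
```

The `seen` list records the intermediate state in which both keys exist at once. That window is exactly what a file rename never exposes.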
3. Authorization model¶
Files:
- Permissions attached to inodes
- Handle-based: once opened, authorization is off the data path; "often even if the file is renamed, moved, and even deleted" the handle still works
- Directory-traversal-gated: you can't reach a file without permission on every directory in its path
- Hard links complicate all of this: one inode, many paths
Objects:
- IAM policies: prefix-scoped, network-conditioned, request-property-conditioned
- Rich: "you can further constrain those permissions based on things like the network or properties of the request itself"
- Also much more expensive to evaluate than file permissions: "file systems have spent years getting things like permission checks off of the data path, often evaluating up front and then using a handle for persistent future access"
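The handle-based behaviour on the file side can be observed directly. The sketch below assumes a POSIX system (on Windows, renaming or deleting an open file generally fails). It opens a file once, then renames and unlinks the path while continuing to read through the same handle. The file names are illustrative only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "data.txt")
    moved = os.path.join(tmp, "renamed.txt")
    with open(original, "w") as f:
        f.write("hello")

    handle = open(original, "r")   # path lookup and permission check happen here
    os.rename(original, moved)     # the path changes underneath the open handle
    after_rename = handle.read()   # still readable: the handle refers to the inode

    handle.seek(0)
    os.unlink(moved)               # the last name for the inode is removed
    after_unlink = handle.read()   # data stays reachable until the handle closes
    path_still_exists = os.path.exists(moved)
    handle.close()
```

An object store has no equivalent of this long-lived handle: each request names a key and is authorized on its own.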
4. Namespace and path semantics¶
- Filesystems have first-class / path separators.
- S3 has / only as a suggestion: "S3's LIST command allows you to specify anything you want to be parsed as a path separator and there are a handful of customers who have built remarkable multi-dimensional naming structures that embed multiple different separators in the same paths and pass a different delimiter to LIST."
- Object keys can end with / (looks like a directory but is an object; the team briefly called these "filerectories" before abandoning the idea).
- Object keys can contain characters not valid in POSIX filenames.
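To illustrate the delimiter-as-suggestion point, here is a toy model of delimiter grouping in an S3-style LIST. `list_keys` is a hypothetical helper, not the S3 API: it ignores pagination and limits, and only mimics the rule that keys containing the delimiter after the prefix are rolled up into a common prefix ending at the first occurrence of that delimiter. The key names are made up.

```python
def list_keys(keys, prefix="", delimiter=None):
    """Toy model of LIST with a caller-chosen delimiter (no pagination)."""
    contents, common_prefixes = [], []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter) if delimiter else -1
        if cut == -1:
            contents.append(key)  # no delimiter after the prefix: a plain key
        else:
            rolled = prefix + rest[: cut + len(delimiter)]
            if rolled not in common_prefixes:
                common_prefixes.append(rolled)
    return contents, common_prefixes


keys = [
    "logs/",  # a key ending in "/": looks like a directory, is an object
    "logs/2026-04-07#eu/app.log",
    "logs/2026-04-07#us/app.log",
    "logs/2026-04-08#us/app.log",
]

by_slash = list_keys(keys, prefix="", delimiter="/")
by_hash = list_keys(keys, prefix="logs/", delimiter="#")
```

The same flat key set yields one "directory" when listed with `/` and two date-shaped groups when listed with `#`, which is why no single hierarchy can be assumed from the keys alone.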
5. Namespace performance shape¶
Filesystems:
- Metadata is data-dependent: accessing a file also accesses (and sometimes updates) the directory record.
- Many ops traverse all directory records along a path.
- Fast distributed filesystems co-locate per-directory metadata on one host to keep these chains fast.
Object systems:
- Namespace is completely flat.
- Optimised for highly parallel point queries/updates.
- "There are many cases in S3 where individual 'directories' have billions of objects in them and are being accessed by hundreds of thousands of clients in parallel."
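A deliberately simplified sketch of the difference in access shape, using nested dictionaries as a stand-in for directory records and a single dictionary as a stand-in for a flat key space. Both structures and the `resolve_hierarchical` helper are illustrative assumptions; real systems add caching, locking, and distribution that this ignores.

```python
def resolve_hierarchical(root, path):
    """Walk one directory record per path component; return (value, lookups)."""
    node, lookups = root, 0
    for part in path.strip("/").split("/"):
        node = node[part]  # each step depends on the result of the previous one
        lookups += 1
    return node, lookups


# Hierarchical namespace: reaching the file means a chain of dependent lookups.
tree = {"data": {"2026": {"04": {"report.csv": b"rows"}}}}
value, lookups = resolve_hierarchical(tree, "/data/2026/04/report.csv")

# Flat namespace: the full key is a single independent point query.
flat = {"data/2026/04/report.csv": b"rows"}
flat_value = flat["data/2026/04/report.csv"]
```

The dependent chain is why co-locating a directory's metadata helps filesystems, and the independent point query is what lets a flat namespace spread billions of keys under one "directory" across many hosts.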
Notification-driven pipelines — a specifically-object primitive¶
S3's at-least-once object-creation notifications (300 billion per day) are the trigger for:
- S3 Cross Region Replication
- Log processing
- Image transcoding
- Any "do something when a new blob lands" pattern
These pipelines assume whole-object creation. They have no analog in a sub-region-mutation filesystem, and forcing file semantics onto S3 would silently break them at massive scale. This is the clearest example in the post of why the semantics must remain distinct to preserve an enormous ecosystem of existing applications.
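Because delivery is at-least-once, a consumer of these notifications has to tolerate duplicates. A minimal sketch of an idempotent handler follows. The `ObjectCreated` event shape, its `etag` field, and the `IdempotentHandler` class are hypothetical and chosen for illustration; they are not the S3 event format. A real pipeline would also keep the processed set in durable storage rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ObjectCreated:
    """Hypothetical 'a new blob landed' event (not the real S3 event schema)."""
    key: str
    etag: str  # assumed to identify one specific whole-object write


@dataclass
class IdempotentHandler:
    processed: set = field(default_factory=set)
    outputs: list = field(default_factory=list)

    def handle(self, event: ObjectCreated) -> bool:
        """Process each (key, etag) once; return False for a duplicate delivery."""
        identity = (event.key, event.etag)
        if identity in self.processed:
            return False
        self.outputs.append(f"transcoded:{event.key}")  # stand-in for real work
        self.processed.add(identity)
        return True


handler = IdempotentHandler()
events = [
    ObjectCreated("img/cat.png", "abc"),
    ObjectCreated("img/cat.png", "abc"),  # duplicate delivery of the same write
    ObjectCreated("img/dog.png", "def"),
]
results = [handler.handle(e) for e in events]
```

The deduplication key only makes sense because each event corresponds to one complete, immutable object; with sub-region mutation there would be no single "creation" to identify.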
The design-lesson generalisation¶
The 2026 S3 Files post is the most detailed public articulation of this incongruity in the AWS literature. The design lesson:
"There is actually a pretty profound boundary between files and objects… this boundary that separated them was what we really needed to pay attention to, and that rather than trying to hide it, the boundary itself was the feature we needed to build."
See concepts/boundary-as-feature for the generalised principle and patterns/explicit-boundary-translation for the implementation pattern. The S3 Files stage-and-commit mechanism (concepts/stage-and-commit) is how file-side mutations translate across to object-side whole-object atomicity.
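The general idea of staging fine-grained writes and committing them as one whole-object replacement can be sketched as follows. This is a toy illustration only and makes no claim about how S3 Files is implemented: `ToyObjectStore`, `StagedFile`, `write_at`, and `commit` are all hypothetical names, and the staging buffer is simply held in memory.

```python
class ToyObjectStore:
    """Hypothetical in-memory store with whole-object PUT/GET (an assumption)."""

    def __init__(self):
        self.blobs = {}

    def put(self, key: str, body: bytes) -> None:
        self.blobs[key] = bytes(body)

    def get(self, key: str) -> bytes:
        return self.blobs[key]


class StagedFile:
    """Accepts file-style sub-region writes; publishes them as one object PUT."""

    def __init__(self, store: ToyObjectStore, key: str):
        self.store, self.key = store, key
        self.buffer = bytearray(store.blobs.get(key, b""))  # stage a private copy

    def write_at(self, offset: int, data: bytes) -> None:
        end = offset + len(data)
        if end > len(self.buffer):
            self.buffer.extend(b"\0" * (end - len(self.buffer)))  # grow if needed
        self.buffer[offset:end] = data

    def commit(self) -> None:
        self.store.put(self.key, self.buffer)  # single whole-object replacement


store = ToyObjectStore()
store.put("log.txt", b"hello world")

staged = StagedFile(store, "log.txt")
staged.write_at(6, b"there")  # overwrite in the middle
staged.write_at(11, b"!")     # append past the old end
before_commit = store.get("log.txt")
staged.commit()
after_commit = store.get("log.txt")
```

Object-side readers see the old bytes until `commit` runs and the new bytes afterwards, so the boundary between the two semantics is crossed at one explicit point.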
Seen in¶
- sources/2026-04-07-allthingsdistributed-s3-files-and-the-changing-face-of-s3 — exhaustive enumeration of file/object semantic asymmetries; design payoff from admitting rather than hiding them.