DASL — Data-Addressed Structures & Links
What is this?
DASL ("dazzle") is a small set of simple, standard primitives for working with content-addressed, linked data. It builds on content addressing, a proven approach used in Git and IPFS to create reliable content identifiers (known as CIDs) through cryptographic hashing. Content addressing enables robust data integrity checks and efficient networking: systems can verify they received exactly what they asked for and avoid downloading the same content twice. The linked data part lets you link to stuff by its hash. You can build very big graphs with these primitives, such as the graph behind Bluesky.
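To make the idea concrete, here is a minimal sketch of a content-addressed store. It is not DASL itself: the `ContentStore` class and its methods are hypothetical names made up for illustration, and it uses a bare SHA-256 hex digest as the address instead of a real CID. It only shows the three properties described above: verification, deduplication, and linking by hash.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (illustrative, not a DASL API)."""

    def __init__(self):
        self._blocks = {}  # hex digest -> bytes

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself.
        key = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is stored once.
        self._blocks.setdefault(key, data)
        return key

    def has(self, key: str) -> bool:
        # Lets a client skip fetching content it already holds.
        return key in self._blocks

    def get(self, key: str) -> bytes:
        data = self._blocks[key]
        # Integrity check: re-hash and compare with the address we asked for.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("content does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._blocks)


store = ContentStore()
leaf = store.put(b"hello")
same = store.put(b"hello")  # same bytes, same address, no second copy
# Linking: the parent refers to the child by its hash.
parent = store.put(b"child=" + leaf.encode("ascii"))
```

Because the parent embeds the child's hash, changing the child would change its address and, in turn, the parent's. That is what makes large linked graphs verifiable piece by piece.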
We call DASL "data-addressed" because it supports a data serialization component that makes content-addressing sweet and easy when working with data. The design is inspired by subcomponents of the IPFS universe, but simplified to improve interoperability, decrease costs, and work well with the web. More specifically, our priorities are:
- pave the cowpaths: we focus on supporting what people trying to solve real-world problems actually use. This takes precedence over any consideration of engineering ideals or theoretical purity. We're retconning the spec to what people actually implement, as it should be.
- extensibility vs optionality: extensibility is important for long-lived distributed systems, because the world will happen and you will need to change. But introducing optionality reduces interoperability and increases the cost of both implementation and adoption. So rather than require support for many options now, we keep extension points but deliberately don't use their full range.
- don't make me think: you don't want to be thinking about content addressing. You want to grab this off the shelf and have something that works out of the box. Nothing weird, no impedance mismatch with the systems you know and love (or maybe know and hate, but whatever, it just works).
- lightweight loading: some people like JavaScript, others don't. We don't care, we just want things that work. What's certain is that you can't ignore it and be relevant. The ability to ship small code to the browser is critical.
- Unix philosophy: all of our specs are tiny and meant to compose together in simple ways that can be implemented independently from one another.
This is intended to work for the community, to grow support for what we need. If you have thoughts, don't be shy: submit an issue! There are no stupid questions, and don't assume everyone else has context that you don't. If this page isn't enough to understand DASL, then we're the ones who screwed up.
How
This section describes how to use DASL patterns. It's a work in progress!
Implementations
DASL is a strict subset of IPFS CIDs and IPLD, so existing IPFS and IPLD implementations will just read DASL CIDs and dCBOR42 without so much as a hiccup. Some implementations also specifically target a DASL subset.
Here are some implementations that partially or fully support DASL:
- atcute (JS/TS): a collection of lightweight packages to make working with Bluesky and the ATmosphere easy.
- dag-cbrrr (Python): fast DAG-CBOR implementation.
- python-libipld (Python): a Python wrapper around Rust, focused on the ATmosphere.
- ipld-core (Rust): fast Rust implementation.
- rust_cid_npm (Rust): fast and tiny Rust library, CLI tool, and npm package to generate CIDs without a full IPFS client.
- go-cid (Go): implements the CID spec.
- Kubo/Boxo (Go): the Swiss-Army chainsaw of all things IPFS.
- Helia (JS/TS): a browser- and CDN-friendly, modular, "import only what you need" JS implementation of IPFS.
Specifications
- Content Identifiers (CIDs)
  - CIDs (Content IDs) are identifiers used for addressing resources by their contents, essentially a hash with limited metadata.
- Deterministically-serialized CBOR with Tag 42 (dCBOR42)
  - dCBOR42 is a serialization format that is deterministic (so that the same data will have the same CID) and that features native support for using CIDs as links.
- RASL — Retrieval of Arbitrary Structures & Links (Early Draft)
  - RASL is a URL scheme used to identify content-addressed DASL resources along with a simple HTTP-based retrieval method.
- Content-Addressable aRchives (CAR)
  - The CAR format offers a serialized representation of a set of content-addressed resources in one single concatenated stream, alongside a header that describes that content.
- Big DASL (BDASL)
  - BDASL extends DASL CIDs with a new hash type that works better for large files but isn't available by default in browsers, so it is not an appropriate option in most situations.
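As a rough illustration of "a hash with limited metadata", the sketch below builds and checks a CID using only the Python standard library. The byte layout is an assumption based on our reading of the CID spec, so check the spec before relying on it: a version byte (`0x01`), a codec byte (`0x55` for raw bytes or `0x71` for dCBOR42 data), the SHA-256 hash code (`0x12`), the digest length (`0x20`), then the digest, all encoded as lowercase unpadded base32 with a `b` prefix. The function names `make_cid` and `verify_cid` are hypothetical and not taken from any of the implementations listed above.

```python
import base64
import hashlib

# Assumed constants, per our reading of the CID spec.
CID_VERSION = 0x01    # CIDv1
CODEC_RAW = 0x55      # raw bytes
CODEC_DCBOR42 = 0x71  # CBOR-encoded data (the DAG-CBOR codec code)
HASH_SHA256 = 0x12    # SHA-256
DIGEST_LEN = 0x20     # 32-byte digest


def make_cid(data: bytes, codec: int = CODEC_RAW) -> str:
    """Build a CID string for `data` (hypothetical helper)."""
    digest = hashlib.sha256(data).digest()
    binary = bytes([CID_VERSION, codec, HASH_SHA256, DIGEST_LEN]) + digest
    # Lowercase base32 without padding, prefixed with "b".
    b32 = base64.b32encode(binary).decode("ascii").lower().rstrip("=")
    return "b" + b32


def verify_cid(cid: str, data: bytes) -> bool:
    """Return True if `data` hashes to the digest carried by `cid`."""
    if not cid.startswith("b"):
        return False
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore padding for the decoder
    try:
        binary = base64.b32decode(body)
    except ValueError:
        return False
    if len(binary) != 4 + DIGEST_LEN:
        return False
    version, codec, hash_code, length = binary[:4]
    if (version, hash_code, length) != (CID_VERSION, HASH_SHA256, DIGEST_LEN):
        return False
    if codec not in (CODEC_RAW, CODEC_DCBOR42):
        return False
    return binary[4:] == hashlib.sha256(data).digest()


cid = make_cid(b"hello world")
ok = verify_cid(cid, b"hello world")
```

Under these assumptions the four metadata bytes are fixed for a given codec, so every CID for raw bytes starts with the same few characters and only the digest part varies. That is the "extension points that we deliberately don't use to their full range" idea in practice.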