RASL — Retrieval of Arbitrary Structures & Links

date	2025-06-25
editors	Robin Berjon <robin@berjon.com> Juan Caballero <bumblefudge@learningproof.xyz>
abstract	RASL is a URL scheme used to identify content-addressed DASL resources along with a simple HTTP-based retrieval method.

Introduction

Content-addressed resources are "self-certifying," which is to say that you don't need any external authority to certify that the content you have when you resolve the identifier is correct: because the identifier contains a hash, you can (and should) verify that you obtained the right content yourself ([ipfs-principles]). The identifier is enough to certify the content. This has several implications, but two are particularly relevant for this specification:

When resolving a content-addressed identifier, you can obtain the content from anyone. It doesn't have to be the content's author. You can even obtain it from entirely untrusted sources — given that you can always certify it, you don't need to trust whoever gives it to you. As a result, the authority part of a URL — the part that can certify the content you get, which is the domain part in an https URL — is the CID itself ([cid]).
Because it doesn't matter where you get content from, content-addressed URLs are inherently transport-independent. There are benefits to agreeing on transport (if only so that people can find one another's content) but as a client, if you know of several potential ways of obtaining a CID you are free to use whichever you prefer or to try several in whatever order.

Taking these aspects into consideration, this specification defines a URL scheme in which the CID is the authority, along with optional hints of potential look-up locations, and defines a retrieval method but does not mandate that RASL retrieval rely on it.

The `web+rasl` URL Scheme

RASL URLs look like this: web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com,bsky.app/. This breaks down into the following components:

The web+rasl scheme. This uses the web+ prefix to facilitate registration of the scheme in browser contexts so that RASL URLs can be used on the web directly.
An authority composed of:
- a DASL CID bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4 and
- (optionally) a comma-separated list of URI-encoded hostnames (any authority parts that are acceptable in HTTP) from which retrieval can be attempted.
A path (here just /) that is empty or / for raw CIDs like the above but can contain a complete path if the CID resolves to MASL data ([masl]).
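
As an illustration of the components above, here is a minimal Python sketch that assembles a RASL URL from a CID, optional hints, and a path. The function name build_rasl_url is hypothetical and not part of this specification, and urllib.parse.quote is used here as one possible way to URI-encode each hint.

```python
from typing import Sequence
from urllib.parse import quote


def build_rasl_url(cid: str, hints: Sequence[str] = (), path: str = "/") -> str:
    """Assemble a web+rasl URL (hypothetical helper, not defined by the spec)."""
    authority = cid
    if hints:
        # Each hint is URI-encoded on its own, then the hints are joined
        # with commas and attached to the CID after a ';'.
        authority += ";" + ",".join(quote(hint, safe="") for hint in hints)
    return f"web+rasl://{authority}{path}"


url = build_rasl_url(
    "bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4",
    ["berjon.com", "bsky.app"],
)
print(url)
```

Encoding each hint separately means a hint that carries a port (such as example.com:8443) does not introduce a bare colon into the authority.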

Use the following steps to parse a RASL URL:

Accept a string url and parse it according to the URL Standard ([url]).
If that's a failure, return the failure.
Read the host part of the parsed URL up to either the first ; character or to the end of the string. Store that in cid.
If cid is not a valid CID ([cid]), return failure.
If there was no ; then hints is an empty array. Otherwise:
1. Split the remainder of the string on ,.
2. Apply decodeURIComponent() to each part.
3. The result is the hints array.
Return the URL's parts as well as cid and hints.
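
The parsing steps above can be sketched in Python as follows. This is a rough illustration under stated assumptions rather than a reference implementation: urllib.parse.urlsplit stands in for the URL Standard parser (it is not a conforming WHATWG parser), urllib.parse.unquote stands in for decodeURIComponent(), and is_valid_cid is a hypothetical helper that assumes a DASL CID is a lowercase base32 string with a b prefix encoding version 1, a raw (0x55) or DRISL (0x71) codec, and a 32-byte SHA-256 digest. Consult [cid] for the authoritative rules.

```python
import base64
from urllib.parse import unquote, urlsplit


def is_valid_cid(cid: str) -> bool:
    """Loose CID check based on the layout assumed in the text above."""
    if not cid.startswith("b") or cid != cid.lower():
        return False
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # b32decode requires padded input
    try:
        raw = base64.b32decode(body)
    except ValueError:
        return False
    # version, codec, hash function, digest length, then the digest itself
    return (
        len(raw) == 36
        and raw[0] == 0x01
        and raw[1] in (0x55, 0x71)
        and raw[2] == 0x12
        and raw[3] == 0x20
    )


def parse_rasl_url(url: str) -> dict:
    """Return cid, hints, and path, or raise ValueError on failure."""
    parts = urlsplit(url)
    if parts.scheme != "web+rasl":
        raise ValueError("not a web+rasl URL")
    # Everything before the first ';' in the host part is the CID.
    cid, sep, rest = parts.netloc.partition(";")
    if not is_valid_cid(cid):
        raise ValueError("invalid CID")
    # No ';' means no hints; otherwise split on ',' and decode each part.
    hints = [unquote(part) for part in rest.split(",")] if sep else []
    return {"cid": cid, "hints": hints, "path": parts.path}


parsed = parse_rasl_url(
    "web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com,bsky.app/"
)
print(parsed["cid"], parsed["hints"], parsed["path"])
```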

Fetching RASL

A user agent may retrieve a CID in whichever way it prefers. This section provides a simple standard for HTTP-based CID retrieval, to make it easy for authors to publish content to their own sites and have it retrieved, without having to worry about operating any infrastructure beyond the web server they already have.

Use the following steps to fetch a RASL URL:

Accept a string url and parse it according to the steps to parse a RASL URL.
Construct a request using cid from the url as well as hints that may be from the URL or from elsewhere (this is entirely up to you):
1. For each hint, construct a request URL that is the concatenation of https://, the hint as host, /.well-known/rasl/, and the cid.
2. Prepare the request such that it has a method of either GET or HEAD, that it is stateless (no cookies, no credentials of any kind), and that it uses no content negotiation.
Fetch the requests. How these get prioritised is entirely up to the implementation. It is common to run them all in parallel and abort the remaining ones at the first successful response. Note that the .well-known path may redirect, so be ready to handle that. This makes it possible to create sites that are published the usual way and to have a RASL endpoint that is simply a redirect to the resource. For instance, you may have an existing https://berjon.com/kitten.jpg, the CID for which is bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4. This can be published as this RASL URL: web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com/. A client can retrieve it by constructing a request to this URL: https://berjon.com/.well-known/rasl/bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4. In turn, the latter may simply respond with a 307 redirect back to https://berjon.com/kitten.jpg. (Yes, this is HTTP with extra steps, but the extra steps get you self-certifying content.)
If the response is a redirect but not a 307, the client should treat it as if it had been a 307 anyway.
If none of the responses are successful, return failure.
Set the response's media type to application/octet-stream. (The server should have done that already, but may not have done so, notably if it relied on a redirect.) The purpose of RASL is to retrieve data in ways that are independent of the server, so any media type processing must take place at another layer. Without this, we lose the self-certifying nature of the system. (Note that servers are encouraged to enforce this media type so as not to have their RASL endpoints used for general-purpose web serving, which can be a security vector depending on where the data being served came from.)
Produce a CID for the retrieved data. If that CID does not match the requested cid, return failure.
Return the data.
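
The fetching steps above can be sketched in Python as follows. All function names here (request_urls, cid_for, codec_of, urllib_get, fetch_rasl) are hypothetical. The sketch tries hints one after another rather than in parallel, takes the transport as a parameter so that any HTTP client can be plugged in, and assumes CIDs that are lowercase base32 with a b prefix, version 1, and a SHA-256 digest (see [cid] for the authoritative rules). It returns only the bytes, which amounts to ignoring whatever media type the server sent.

```python
import base64
import hashlib
import urllib.request
from typing import Callable, Iterable, List, Optional


def request_urls(cid: str, hints: Iterable[str]) -> List[str]:
    """One request URL per hint: https:// + hint + /.well-known/rasl/ + cid."""
    return [f"https://{hint}/.well-known/rasl/{cid}" for hint in hints]


def cid_for(data: bytes, codec: int = 0x55) -> str:
    """Produce a CID for data under the assumed layout."""
    header = bytes([0x01, codec, 0x12, 0x20])
    encoded = base64.b32encode(header + hashlib.sha256(data).digest())
    return "b" + encoded.decode("ascii").lower().rstrip("=")


def codec_of(cid: str) -> int:
    """Read the codec byte back out of an already validated CID."""
    body = cid[1:].upper()
    return base64.b32decode(body + "=" * (-len(body) % 8))[1]


def urllib_get(url: str) -> Optional[bytes]:
    """Plain GET without cookies or credentials; urlopen follows redirects."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.read()
    except OSError:  # covers URLError and HTTPError
        return None


def fetch_rasl(
    cid: str,
    hints: Iterable[str],
    http_get: Callable[[str], Optional[bytes]],
) -> bytes:
    """Fetch via the first hint that answers, then verify the bytes."""
    for url in request_urls(cid, hints):
        data = http_get(url)
        if data is None:
            continue  # this hint failed, try the next one
        # Never trust the server: recompute the CID and compare.
        if cid_for(data, codec_of(cid)) != cid:
            raise ValueError("retrieved data does not match the requested CID")
        return data
    raise LookupError("no hint produced a successful response")
```

A caller with network access might invoke this as fetch_rasl(cid, hints, urllib_get). Passing a different callable makes it straightforward to swap in another HTTP client or a test double.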

RASL Pathing

The pathname component of a RASL URL can only be interpreted in the context of the content that is retrieved. It is never transmitted to the RASL server and is purely interpreted on the client side.

As of this time, the only case in which pathing is defined is when the CID has a 0x71 codec prefix (i.e. is marked as a [drisl] object), the contents retrieved from that CID parse as valid MASL ([masl]), and that MASL has a resources map defining resources for each available path. Carrying out that resolution happens outside of RASL fetching.

References

[cid]: Robin Berjon & Juan Caballero. Content IDs (CIDs). 2025-06-25. URL: https://dasl.ing/cid.html
[drisl]: Robin Berjon & Juan Caballero. DRISL — Deterministic Representation for Interoperable Structures & Links. 2025-06-25. URL: https://dasl.ing/drisl.html
[ipfs-principles]: Robin Berjon. IPFS Principles. March 2023. URL: https://specs.ipfs.tech/architecture/principles/
[masl]: Robin Berjon & Juan Caballero. MASL — Metadata for Arbitrary Structures & Links. 2025-06-25. URL: https://dasl.ing/masl.html
[url]: WHATWG. URL. Living Standard. URL: https://url.spec.whatwg.org/
