RASL — Retrieval of Arbitrary Structures & Links

date: 2025-01-17
editors: Robin Berjon <robin@berjon.com>, Juan Caballero <bumblefudge@learningproof.xyz>
Abstract

RASL is a URL scheme for identifying content-addressed DASL resources, together with a simple HTTP-based method for retrieving them.

Introduction

Content-addressed resources are "self-certifying," which is to say that you don't need any external authority to certify that the content you have when you resolve the identifier is correct: because the identifier contains a hash, you can (and should) verify that you obtained the right content yourself ([ipfs-principles]). The identifier is enough to certify the content. This has several implications, but two are particularly relevant for this specification:

Taking these aspects into consideration, this specification defines a URL scheme in which the CID is the authority, along with optional hints of potential look-up locations, and defines a retrieval method but does not mandate that RASL retrieval rely on it.
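
As a rough illustration of the self-certification described above, the following Python sketch checks retrieved bytes against the hash carried in a CID. It assumes the CID layout described in [cid] (a base32 string with a b prefix that decodes to a version byte, a codec byte, a SHA-256 marker and length, and a 32-byte digest) and handles only that case. The names cid_digest and verify are hypothetical and not part of this specification.

```python
import base64
import hashlib


def cid_digest(cid: str) -> bytes:
    """Return the SHA-256 digest embedded in a CID string (assumed layout)."""
    if not cid.startswith("b"):
        raise ValueError("expected the base32 multibase prefix 'b'")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that CIDs omit
    raw = base64.b32decode(body)    # raises a ValueError subclass on bad input
    # Assumed header: version 1, codec raw (0x55) or dag-cbor (0x71),
    # hash type SHA-256 (0x12), digest length 32 (0x20).
    if (
        len(raw) != 36
        or raw[0] != 0x01
        or raw[1] not in (0x55, 0x71)
        or raw[2:4] != b"\x12\x20"
    ):
        raise ValueError("not a CIDv1 carrying a SHA-256 digest")
    return raw[4:]


def verify(cid: str, content: bytes) -> bool:
    """True if the content hashes to the digest named by the CID."""
    return hashlib.sha256(content).digest() == cid_digest(cid)
```

A client would call verify(cid, body) on whatever bytes it obtained, from any source, and discard them if the check fails.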

The web+rasl URL Scheme

RASL URLs look like this: web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com,bsky.app/. This breaks down into the following components:

  • The web+rasl scheme. This uses the web+ prefix to facilitate registration of the scheme in browser contexts so that RASL URLs can be used on the web directly.
  • An authority composed of:
    • a DASL CID (here bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4) and
    • (optionally) a comma-separated list of hostnames (any authority parts that are acceptable in HTTP) from which retrieval can be attempted.
  • A path (here just /) that is empty or / for raw CIDs like the one above, but that can contain a complete path if the CID resolves to MASL data ([masl]).

Need to specify parsing based on URL spec:

  • Use regular URL parsing
  • Extract CID from host up to ; or end
  • Return failure if it's not a valid [cid]
  • If there was a ;, process the rest as a comma-separated list
  • Each item in the list is a hint; hints don't get validated here
  • Use CID as authority, provide hints separately
  • Specify origin tuple using the authority
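
The parsing steps listed above could be sketched in Python roughly as follows. This is only an illustration under several assumptions: urllib.parse.urlsplit stands in for "regular URL parsing", the CID check is a shape-only placeholder (a b prefix followed by lowercase base32 characters) rather than full validation against [cid], and the names RaslUrl and parse_rasl_url are hypothetical. The origin tuple step is not covered.

```python
import re
from typing import NamedTuple
from urllib.parse import urlsplit


class RaslUrl(NamedTuple):
    cid: str          # used as the authority
    hints: list[str]  # provided separately, not validated
    path: str


# Placeholder check only: 'b' followed by lowercase base32 characters.
_CID_SHAPE = re.compile(r"b[a-z2-7]+")


def parse_rasl_url(url: str) -> RaslUrl:
    parts = urlsplit(url)
    if parts.scheme != "web+rasl":
        raise ValueError("not a web+rasl URL")
    # Extract the CID from the host up to ';' or the end.
    cid, _, rest = parts.netloc.partition(";")
    if not _CID_SHAPE.fullmatch(cid):
        raise ValueError("authority does not start with something CID-shaped")
    # Whatever follows ';' is a comma-separated list of hints.
    hints = [item for item in rest.split(",") if item]
    return RaslUrl(cid=cid, hints=hints, path=parts.path)
```

Applied to the example URL from this section, parse_rasl_url returns the CID, the hints berjon.com and bsky.app, and the path /.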

RASL Retrieval

A user agent may retrieve a CID in whichever way it prefers. This section provides a simple standard for HTTP-based CID retrieval, to make it easy for authors to publish content to their own sites and have it retrieved, without having to worry about operating any infrastructure beyond the web server they already have.

RASL retrieval works this way:

  • Obtain the [cid] by extracting the authority from the URL (or whatever other way).
  • If there are hints, you can use them as hosts to construct a retrieval request from. But you don't have to.
  • Constructing a request works by constructing an HTTPS URL this way:
    • Always use https
    • Use the host you have (from hint or yours)
    • Path is /.well-known/rasl/${cid}
    • No further pathing information is provided
  • Use that URL to make a stateless HTTP request (no cookies, nothing gets saved), don't use conneg, just the most vanilla side-effect free GET that money can buy.
  • The .well-known path may redirect, so be ready to handle that. This makes it possible to create sites that are published the usual way and to have a RASL that is simply a redirect to the resource. So for instance, you may have an existing https://berjon.com/kitten.jpg, the CID for which is bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4. This can be published as this RASL URL: web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com/. A client can retrieve it by constructing a request to this URL: https://berjon.com/.well-known/rasl/bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4. In turn, the latter may simply 307 back to https://berjon.com/kitten.jpg. (Yes, this is HTTP with extra steps, but the extra steps get you self-certifying content.)
  • If there's a redirect and it's not a 307, the client should treat it as a 307 anyway.
  • Note that the response media type for ALL RASL requests is application/octet-stream. This is done explicitly to avoid people using RASL endpoints to serve sites directly.
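
The retrieval steps above might be sketched in Python along these lines. This is a minimal sketch, not a reference client: retrieval_url, fetch_bytes and retrieve are hypothetical names, the 10-second timeout is an arbitrary choice, and trying hosts in order until one answers is just one possible strategy. It relies on urllib.request.urlopen, which by default stores no cookies and follows redirects by issuing another GET. The caller is still expected to check the returned bytes against the CID.

```python
import urllib.request
from typing import Callable, Iterable, Optional


def retrieval_url(cid: str, host: str) -> str:
    """Always https, the chosen host, and the well-known RASL path."""
    return f"https://{host}/.well-known/rasl/{cid}"


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Plain GET: no cookie handling is installed and no Accept header is set."""
    request = urllib.request.Request(url, method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def retrieve(
    cid: str,
    hosts: Iterable[str],
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> Optional[bytes]:
    """Try each host in turn; return the first body obtained, or None."""
    for host in hosts:
        try:
            return fetch(retrieval_url(cid, host))
        except OSError:  # urllib's URLError and HTTPError derive from OSError
            continue
    return None
```

For the kitten example above, retrieval_url(cid, "berjon.com") yields the https://berjon.com/.well-known/rasl/… URL shown in the list. The fetch parameter exists so that a different transport can be swapped in.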

RASL Pathing

Mention that the path is only meaningful if the CID is a dCBOR42 one AND if the content is MASL AND if it has a resources map. If it is, look it up. If it's there, return that, with the right headers. If not, then return 404. If the conditions don't match, we need to pick the right 4xx error.

References

[cid]
Robin Berjon & Juan Caballero. Content IDs (CIDs). 2025-01-17. URL: https://dasl.ing/cid.html
[ipfs-principles]
Robin Berjon. IPFS Principles. March 2023. URL: https://specs.ipfs.tech/architecture/principles/
[masl]
Robin Berjon & Juan Caballero. MASL — Metadata for Arbitrary Structures & Links. 2025-01-17. URL: https://dasl.ing/masl.html