Content-Addressable aRchives (CAR)
| date | 2025-10-20 | 
|---|---|
| editors | Robin Berjon <robin@berjon.com>, Juan Caballero <bumblefudge@learningproof.xyz> |
| abstract | The CAR format offers a serialized representation of a set of content-addressed resources in a single concatenated stream, alongside a header that describes that content. |
Introduction
The CAR format (Content-Addressable aRchives) is used to store a series of content-addressable objects as a sequence of bytes. It packages that stream of objects with a header.
Much of the content of this specification was initially developed as part of the IPLD project. This specification was written in response to community demand for a single, simplified document. Note that a CARv2 specification was later developed to add support for an index trailer, but it met with limited adoption and so was not considered when bringing CAR into DASL.
Parsing CAR
The CAR format is made of a Header followed by a Body. The Header is a length-prefixed chunk of DRISL ([drisl]) and the Body is a sequence of zero or more length-prefixed blocks, each of which contains a tuple of a DASL CID ([cid]), which is always 36 bytes long, and the data addressed by that CID. The length prefix in a CAR is encoded as an unsigned variable-length integer ([varint], a variant of LEB128). This integer specifies the number of remaining bytes, excluding the bytes used to encode the integer, but including the CID for Body blocks.
|------ Header -----| |------------------- Body -------------------|
[ int | DRISL block ] [ int | CID | data ] [ int | CID | data ] …
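To make the length prefix concrete, here is a minimal Python sketch of unsigned varint encoding and decoding. The function names encode_varint and decode_varint are illustrative choices, not part of this specification, and the 9-byte cap is taken from the practical limit given in [varint]. The sketch does not reject non-minimal encodings.

```python
MAX_VARINT_BYTES = 9  # practical limit taken from the unsigned-varint spec


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer, least-significant 7-bit group first."""
    if value < 0:
        raise ValueError("varints are unsigned")
    out = bytearray()
    while True:
        group = value & 0x7F
        value >>= 7
        if value:
            out.append(group | 0x80)  # continuation bit: more groups follow
        else:
            out.append(group)  # final group has the high bit clear
            return bytes(out)


def decode_varint(buf: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode one varint from buf at offset; return (value, next_offset)."""
    value = 0
    for i in range(MAX_VARINT_BYTES):
        if offset + i >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[offset + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, offset + i + 1
    raise ValueError("varint longer than 9 bytes")
```

For instance, a Body block holding a 36-byte CID and 3 bytes of data has a length of 39, which fits in the single byte 0x27, while a length of 300 takes the two bytes 0xAC 0x02.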
      
The steps to parse a CAR are:
- Accept a byte stream bytes that is consumed with every step that reads from it.
- Run the steps to parse a CAR header with bytes to obtain metadata.
- Set up array blocks and run these substeps:
  - If bytes is empty, terminate these substeps.
  - Run the steps to parse a CAR block header with bytes to obtain cid and data size.
  - Read data size bytes from bytes and store the result in data.
  - Push an entry onto blocks containing cid, data size, and data.
  - Return to the beginning of these substeps.
- Return metadata and blocks.
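The top-level steps above can be sketched in Python as a small driver loop. This is an illustration under stated assumptions: ByteReader and parse_car are hypothetical names, the input is held fully in memory, and the two sub-procedures (header parsing and block header parsing) are passed in as callables rather than implemented here.

```python
from typing import Callable


class ByteReader:
    """Hypothetical cursor over in-memory bytes; every read consumes input."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0

    def read(self, n: int) -> bytes:
        end = self._pos + n
        if end > len(self._data):
            raise ValueError("unexpected end of CAR stream")
        chunk = self._data[self._pos:end]
        self._pos = end
        return chunk

    def is_empty(self) -> bool:
        return self._pos >= len(self._data)


def parse_car(
    reader: ByteReader,
    parse_header: Callable[[ByteReader], object],
    parse_block_header: Callable[[ByteReader], tuple[bytes, int]],
) -> tuple[object, list[tuple[bytes, int, bytes]]]:
    """Run the top-level steps; the two callables stand in for the sub-steps."""
    metadata = parse_header(reader)
    blocks = []
    while not reader.is_empty():
        cid, data_size = parse_block_header(reader)
        data = reader.read(data_size)
        blocks.append((cid, data_size, data))
    return metadata, blocks
```

Collecting every block into a list mirrors the steps as written. An implementation that streams would more likely yield each block as it is read.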
 
Note that the CAR header contains a near-arbitrary DRISL object that is to be treated as metadata ([drisl]). For historical reasons, there are two constraints on the header:
- The object MUST contain a version map entry, the value of which is always integer-type 1. Version numbers in data formats are an anti-pattern, and as a result this number is guaranteed never to change.
- The object MUST contain a roots entry, which MUST be of type array. It MAY be empty, but if it isn't then it must be an array of CIDs encoded using tag 42 ([cid]). A CAR can be used to contain one or more DAGs of [drisl] content and the purpose of the roots is to list one or more roots for those DAGs. The array may be empty if you do not care about encoding DAGs.
Some implementations will only return version and roots, but it is RECOMMENDED that they make the entire metadata object available. A best practice for authors is to use the metadata to capture MASL content, which is able to provide metadata and a pathing mapping for the entire content of the CAR stream if needed ([masl]).
The steps to parse a CAR header are:
- Accept a byte stream bytes.
- Read an unsigned varint length from bytes ([varint]).
- If length is 0, throw an error.
- Read length bytes from bytes and decode them as DRISL ([drisl]) into metadata. If metadata is not a map, throw an error.
- If metadata does not have a version key entry with integer value 1, throw an error. Otherwise, store version in version.
- If metadata does not have a roots key entry that is an array, or if that array contains anything other than DASL CIDs, throw an error. Otherwise, store roots in roots.
- Return metadata. (For implementations that only report version and roots, return those.)
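The validation part of these steps can be sketched as follows, assuming the header bytes have already been decoded by a DRISL decoder into plain Python values. The Cid class is a hypothetical stand-in for whatever such a decoder produces for tag 42, and check_header is an illustrative name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cid:
    """Hypothetical stand-in for a decoded tag 42 value (a DASL CID)."""
    raw: bytes


def check_header(metadata: object) -> tuple[int, list]:
    """Apply the header constraints to decoded metadata; return (version, roots)."""
    if not isinstance(metadata, dict):
        raise ValueError("CAR header must be a map")
    version = metadata.get("version")
    # bool is a subclass of int in Python, so compare the exact type.
    if type(version) is not int or version != 1:
        raise ValueError("CAR header version must be the integer 1")
    roots = metadata.get("roots")
    if not isinstance(roots, list):
        raise ValueError("CAR header roots must be an array")
    if not all(isinstance(root, Cid) for root in roots):
        raise ValueError("CAR header roots must contain only CIDs")
    return version, roots
```

The caller keeps the full metadata object, which matches the recommendation that implementations expose the entire header rather than only version and roots.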
After its header, a CAR contains a series of blocks, each of which is length-prefixed and has a small header capturing a CID, followed by the block's body data.
The steps to parse a CAR block header are:
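The block layout described earlier (a varint length covering a 36-byte CID plus the data) suggests the following Python sketch of block header parsing. It is an illustration inferred from that layout rather than normative text, and read_varint, parse_block_header, and CID_LENGTH are names chosen here.

```python
import io
from typing import BinaryIO

CID_LENGTH = 36  # DASL CIDs are always 36 bytes long


def read_varint(stream: BinaryIO) -> int:
    """Read one unsigned varint (at most 9 bytes) from the stream."""
    value = 0
    for i in range(9):
        raw = stream.read(1)
        if not raw:
            raise ValueError("truncated varint")
        value |= (raw[0] & 0x7F) << (7 * i)
        if not raw[0] & 0x80:
            return value
    raise ValueError("varint longer than 9 bytes")


def parse_block_header(stream: BinaryIO) -> tuple[bytes, int]:
    """Return the raw CID bytes and the size of the data that follows them."""
    length = read_varint(stream)
    if length < CID_LENGTH:
        raise ValueError("block is too short to hold a CID")
    cid = stream.read(CID_LENGTH)
    if len(cid) != CID_LENGTH:
        raise ValueError("truncated CID")
    # The length prefix counts the CID, so the data is whatever remains.
    return cid, length - CID_LENGTH


# Usage: a block whose length prefix is 39 (36-byte CID plus 3 data bytes).
example = io.BytesIO(bytes([39]) + b"\x01\x55\x12\x20" + bytes(32) + b"abc")
example_cid, example_size = parse_block_header(example)
```

The sketch returns the CID as raw bytes and leaves checking its internal structure to the caller.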
Additional Considerations
Conformance
A CAR stream must only feature DASL CIDs.
A CAR stream must have CIDs that match the data body that follows them. A CAR implementation should verify that CIDs match block body data, though it may delegate verification to other components. (Keep in mind that not verifying at all negates the value of content addressing.)
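Verifying that a CID matches its block might look like the sketch below. It assumes the binary CID layout described in [cid]: a version byte 0x01, a codec byte (0x55 for raw or 0x71 for DRISL), the SHA-256 hash code 0x12, the digest length 0x20, then the 32-byte digest. The name verify_block is illustrative.

```python
import hashlib

RAW_CODEC = 0x55
DRISL_CODEC = 0x71
SHA256_PREFIX = b"\x12\x20"  # hash code for SHA-256, then digest length 32


def verify_block(cid: bytes, data: bytes) -> bool:
    """Return True if cid is a well-formed DASL CID whose digest matches data."""
    if len(cid) != 36:
        return False
    if cid[0] != 0x01 or cid[1] not in (RAW_CODEC, DRISL_CODEC):
        return False
    if cid[2:4] != SHA256_PREFIX:
        return False
    return hashlib.sha256(data).digest() == cid[4:]
```

A parser could call this on each block as it is read, or hand the (cid, data) pairs to a separate component, as the text allows.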
A CAR stream's stated roots must match CIDs contained in the Body. However, implementations frequently operate in a streaming fashion such that they have no way of knowing whether a CAR stream conforms to this requirement before having processed the entire stream. Checking correctness with respect to this requirement may therefore be more readily performed via a warning (at end of processing) or a dedicated validator.
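One way to defer this check to the end of a streaming pass is sketched below. RootsTracker is a hypothetical helper, not something this specification defines, and it treats CIDs as raw bytes so they can be held in a set.

```python
import warnings
from typing import Iterable


class RootsTracker:
    """Hypothetical helper: remembers which stated roots have not been seen."""

    def __init__(self, roots: Iterable[bytes]) -> None:
        self._pending = set(roots)

    def observe(self, cid: bytes) -> None:
        """Call once per block as it streams past."""
        self._pending.discard(cid)

    def finish(self) -> set[bytes]:
        """Call at end of stream; warns and returns any roots never seen."""
        if self._pending:
            warnings.warn(
                f"{len(self._pending)} root(s) listed in the header "
                "did not appear in the body"
            )
        return set(self._pending)
```

Memory use is bounded by the number of roots rather than the number of blocks, which suits streaming use.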
Determinism
Deterministic CAR creation is not covered by this specification. However, deterministic generation of a CAR from a given graph is possible and is relied upon by certain uses of the format, most notably, Filecoin. dCAR may be the topic of a future specification.
Care regarding the ordering of the roots array in the Header and avoidance of duplicate blocks may also be required for strict determinism.
Security & Verifiability
The roots specified by the Header of a CAR are expected to appear somewhere in its Body section; however, there is no requirement that the roots define entire DAGs, nor that all blocks in a CAR be part of DAGs described by the root CIDs in the Header. Therefore, the roots must not be used alone to determine or differentiate the contents of a CAR.
The CAR format contains no internal means, beyond the blocks and their CIDs, to verify or differentiate contents. Where such a requirement exists, this must be performed externally, such as by creating a digest of the entire CAR (and referring to it using a CID).
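As an illustration of such an external check, the sketch below hashes a whole CAR stream and expresses the result as a CID string. It assumes the conventions in [cid] (raw codec 0x55, SHA-256, lowercase base32 without padding, and a leading "b"), and car_digest_cid is a name chosen here.

```python
import base64
import hashlib
from typing import BinaryIO


def car_digest_cid(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Hash an entire CAR stream and return a raw-codec CID string for it."""
    hasher = hashlib.sha256()
    while chunk := stream.read(chunk_size):
        hasher.update(chunk)  # hash incrementally so large CARs need not fit in memory
    # Assumed layout: version 1, raw codec, SHA-256 code, 32-byte digest length.
    cid_bytes = b"\x01\x55\x12\x20" + hasher.digest()
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded
```

Two CAR files that hold the same blocks in a different order yield different digests under this scheme, so it identifies a particular byte stream rather than a particular set of blocks.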
Appendix: Media Type
The media type for CAR is application/vnd.ipld.car.
The conventional file extension for CAR is .car.
References
- [cid]
 - Robin Berjon & Juan Caballero. Content IDs (CIDs). 2025-10-20. URL: https://dasl.ing/cid.html
 - [drisl]
 - Robin Berjon & Juan Caballero. DRISL — Deterministic Representation for Interoperable Structures & Links. 2025-10-20. URL: https://dasl.ing/drisl.html
 - [masl]
 - Robin Berjon & Juan Caballero. MASL — Metadata for Arbitrary Structures & Links. 2025-10-20. URL: https://dasl.ing/masl.html
 - [varint]
 - unsigned varint. URL: https://github.com/multiformats/unsigned-varint