IPLD / Multiformats

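Merkle tree forest
Evolution of the web: the browser will remain, but backend protocols will diversify
The web as data / data-driven services / what removing data lock-in means
Service development > data + rendering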

GPN19 - Foundations for Decentralization: Data with IPLD

Understanding IPFS in Depth (2/6): What is InterPlanetary Linked Data (IPLD)? (Medium)

The following is extracted from the article linked below.

Understanding IPFS in Depth (4/6): What is MultiFormats? (Medium)

Multiformats

Why do we Need Multiformats?
Ok, I think we need it. But What is it?
This seems great. Tell me how to use it?

Allowing systems to evolve and grow without introducing breaking changes is important.

Why do we Need Multiformats?

To sum up, we face a number of problems:

Introducing breaking changes to update systems with better security.
Introducing breaking changes due to some unforeseen issues.
Making trade-offs among several options, each with a desirable trait, when only one can be chosen.

The Multiformats Project introduces a set of standards/protocols that embrace this fact and allow multiple protocols to co-exist, so that even if there is a breaking change, the ecosystem still supports all versions of the protocol.

What are Multiformats?

The Multiformats Project is a collection of protocols which aim to future-proof systems, today. They do this mainly by enhancing format values with self-description. This allows interoperability, protocol agility, and helps us avoid lock-in.

The self-describing aspects of the protocols have a few stipulations:

They MUST be in-band (with the value); not out-of-band (in context).
They MUST avoid lock-in and promote extensibility.
They MUST be compact and have a binary-packed representation.
They MUST have a human-readable representation.

Multiformat protocols

Currently, we have the following multiformat protocols:

Multihash: Self-describing hashes
Multiaddr: Self-describing network addresses
Multibase: Self-describing base encodings
Multicodec: Self-describing serialization
Multistream: Self-describing stream network protocols
Multistream-select: Friendly protocol multiplexing
Multigram (WIP): Self-describing packet network protocols
Multikey: Cryptographic keys and artifacts

Each of the projects has its list of implementations in various languages.

Multihash: fn code + length prefix

You can find a number of multihash implementations in multiple languages.
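The "fn code + length prefix" layout can be sketched by hand. The following is a minimal illustration for Node.js, not a replacement for the multihash libraries: the helper names `sha256Multihash` and `decodeMultihash` are hypothetical, and the sketch assumes both prefix values stay below 128 so that each fits in one varint byte. The code 0x12 is the multihash table entry for sha2-256.

```javascript
const crypto = require('node:crypto')

// 0x12 is the code registered for sha2-256 in the multihash table.
const SHA2_256 = 0x12

// Build <fn code><digest length><digest>.
// Both prefix values are below 128 here, so each is a single varint byte.
function sha256Multihash(data) {
  const digest = crypto.createHash('sha256').update(data).digest()
  return Buffer.concat([Buffer.from([SHA2_256, digest.length]), digest])
}

// Split a multihash back into its parts (single-byte prefixes only).
function decodeMultihash(mh) {
  const code = mh[0]
  const length = mh[1]
  const digest = mh.subarray(2)
  if (digest.length !== length) throw new Error('digest length mismatch')
  return { code, length, digest }
}

const mh = sha256Multihash('multihash')
console.log(mh.toString('hex').slice(0, 4)) // 1220
console.log(decodeMultihash(mh).length)     // 32
```

Because the function code and the digest length travel with the digest, a reader can tell which hash function produced the value without any outside context.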

Multiaddr

/ip4/127.0.0.1/udp/9090/quic
/ip6/::1/tcp/3217
/ip4/127.0.0.1/tcp/80/http/baz.jpg
/dns4/foo.com/tcp/80/http/bar/baz.jpg
/dns6/foo.com/tcp/443/https
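To show how such an address decomposes into self-describing layers, here is a minimal parsing sketch. It is not the multiaddr library: `parseMultiaddr` and the `PROTOCOLS` table are hypothetical, the table covers only a small subset of protocols, and the examples above that carry a trailing path (such as `/http/baz.jpg`) are deliberately not handled.

```javascript
// Hypothetical subset of a multiaddr protocol table:
// true = the protocol is followed by a value, false = it stands alone.
const PROTOCOLS = new Map([
  ['ip4', true], ['ip6', true], ['dns4', true], ['dns6', true],
  ['tcp', true], ['udp', true],
  ['quic', false], ['https', false],
])

// Turn "/ip4/127.0.0.1/udp/9090/quic" into a list of { protocol, value } pairs.
function parseMultiaddr(addr) {
  if (!addr.startsWith('/')) throw new Error('a multiaddr must start with "/"')
  const parts = addr.slice(1).split('/')
  const components = []
  for (let i = 0; i < parts.length; i++) {
    const protocol = parts[i]
    if (!PROTOCOLS.has(protocol)) throw new Error(`unknown protocol: ${protocol}`)
    let value = null
    if (PROTOCOLS.get(protocol)) {
      i += 1 // consume the value that follows the protocol name
      if (i >= parts.length) throw new Error(`missing value for ${protocol}`)
      value = parts[i]
    }
    components.push({ protocol, value })
  }
  return components
}

console.log(parseMultiaddr('/ip4/127.0.0.1/udp/9090/quic'))
console.log(parseMultiaddr('/dns6/foo.com/tcp/443/https'))
```

Each component names its own protocol, so transports can be stacked in any order the table allows without changing the address syntax.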

Multicodec

Multicodec is a self-describing multiformat: it wraps other formats with a tiny bit of self-description. A multicodec identifier is a varint.

A chunk of data identified by multicodec will look like this:

<multicodec><encoded-data>
# To reduce the cognitive load, we sometimes might write the same line as:
<mc><data>

Multicodec uses "protocol tables" to agree upon the mapping from each multicodec code to the format it identifies. These tables can be application specific, though, as with other multiformats, we will keep a globally agreed upon table with common protocols and formats.
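The `<multicodec><encoded-data>` layout and the protocol table can be sketched as follows. This is an illustration under assumptions, not the multicodec library: `encodeVarint`, `decodeVarint`, `prefixWithCodec`, and `splitCodec` are hypothetical helpers, and the three table entries are quoted from the shared table as best understood here, so verify them against the current table before relying on them. The varint helpers are only meant for small codes.

```javascript
// Assumed excerpt of a protocol table (name -> code).
const TABLE = new Map([
  ['raw', 0x55],
  ['dag-cbor', 0x71],
  ['json', 0x0200],
])

// Unsigned varint: 7 bits per byte, least significant group first,
// high bit set on every byte except the last. Small values only.
function encodeVarint(n) {
  const bytes = []
  while (n >= 0x80) {
    bytes.push((n & 0x7f) | 0x80)
    n >>>= 7
  }
  bytes.push(n)
  return Buffer.from(bytes)
}

// Returns [value, bytesRead].
function decodeVarint(buf) {
  let value = 0
  let shift = 0
  for (let i = 0; i < buf.length; i++) {
    value += (buf[i] & 0x7f) * 2 ** shift
    if ((buf[i] & 0x80) === 0) return [value, i + 1]
    shift += 7
  }
  throw new Error('truncated varint')
}

// <multicodec><encoded-data>
function prefixWithCodec(name, data) {
  if (!TABLE.has(name)) throw new Error(`unknown codec: ${name}`)
  return Buffer.concat([encodeVarint(TABLE.get(name)), data])
}

// Read the varint code, look it up in the table, and return the payload.
function splitCodec(buf) {
  const [code, read] = decodeVarint(buf)
  const entry = [...TABLE].find(([, c]) => c === code)
  if (!entry) throw new Error(`unknown code: ${code}`)
  return { name: entry[0], data: buf.subarray(read) }
}

const tagged = prefixWithCodec('json', Buffer.from('{}'))
console.log(tagged.toString('hex')) // 80047b7d
console.log(splitCodec(tagged).name) // json
```

Codes below 128 cost a single byte, while 0x0200 needs two (80 04), which is why the compact codes are usually reserved for the most common formats.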

To enable self-descriptive data formats or streams that can be described dynamically, without the formal step of adding a binary-packed code to a table, we have multistream. With it, applications can adopt multiple data formats for their streams and build different protocols on top of them.

Multistream

Motivation

Multistreams are self-describing protocol/encoding streams (note that a file is a stream). They are designed to address a perennial problem:

I have a bitstring, what codec is the data coded with?

To decode an incoming stream of data, a program must either

(1) know the format of the data a priori, or
(2) learn the format from the data itself.

Option (1) precludes running protocols that may provide one of many kinds of formats without prior agreement on which one. Multistream makes option (2) neat by using self-description.

Moreover, this self-description allows straightforward layering of protocols without having to implement support in the parent (or encapsulating) one.

How does the protocol work?

Multistream is a self-describing multiformat: it wraps other formats with a tiny bit of self-description:

<varint-len>/<codec>\n<encoded-data>

For example, let’s encode a JSON doc:

// encode some JSON (`multistream` and `multicodec` are assumed to be imported already)
const buf = Buffer.from(JSON.stringify({ hello: 'world' }))
const prefixedBuf = multistream.addPrefix('json', buf) // prepends multicodec ('json')
console.log(prefixedBuf)
// <Buffer 06 2f 6a 73 6f 6e 2f 7b 22 68 65 6c 6c 6f 22 3a 22 77 6f 72 6c 64 22 7d>
console.log(prefixedBuf.toString('hex'))
// 062f6a736f6e2f7b2268656c6c6f223a22776f726c64227d

// let's get the codec and then get the data back
const codec = multicodec.getCodec(prefixedBuf)
console.log(codec)
// json
console.log(multistream.rmPrefix(prefixedBuf).toString())
// {"hello":"world"}

So, prefixedBuf is:

hex:   062f6a736f6e2f7b2268656c6c6f223a22776f726c64227d
ascii: /json\n"{\"hello\":\"world\"}"

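Note that the ASCII version does not show the varint at the beginning (0x06), so account for it when reading the bytes. Also note that in the hex dump the byte after "json" is 0x2f ("/"), whereas the ASCII line and the layout above show a newline (0x0a).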
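The `<varint-len>/<codec>\n<encoded-data>` layout can also be reproduced by hand, which makes the "learn the format from the data itself" step concrete. This is a minimal sketch, not the library API used in the example: `addStreamHeader`, `readStreamHeader`, and the `decoders` registry are hypothetical, and the header is assumed to be shorter than 128 bytes so that its length fits in one varint byte.

```javascript
// Build <varint-len>/<codec>\n<encoded-data>.
// The length counts the "/<codec>\n" header and must fit in one varint byte here.
function addStreamHeader(codec, data) {
  const header = Buffer.from(`/${codec}\n`)
  if (header.length >= 0x80) throw new Error('header too long for this sketch')
  return Buffer.concat([Buffer.from([header.length]), header, data])
}

// Read the header back and return the codec name plus the remaining payload.
function readStreamHeader(buf) {
  const len = buf[0]
  const header = buf.subarray(1, 1 + len).toString()
  if (!header.startsWith('/') || !header.endsWith('\n')) {
    throw new Error('malformed header')
  }
  return { codec: header.slice(1, -1), data: buf.subarray(1 + len) }
}

// Hypothetical decoder registry keyed by codec name.
const decoders = { json: (bytes) => JSON.parse(bytes.toString()) }

const framed = addStreamHeader('json', Buffer.from(JSON.stringify({ hello: 'world' })))
console.log(framed.subarray(0, 7).toString('hex')) // 062f6a736f6e0a

const parsed = readStreamHeader(framed)
console.log(parsed.codec, decoders[parsed.codec](parsed.data)) // json { hello: 'world' }
```

The receiver needs no prior agreement about the format: it reads the header, picks a decoder by name, and hands over the rest of the bytes.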

You can find a multistream-select tutorial here.
