The Decentralized Storage space is rapidly evolving. Filecoin is at an important moment – and in this blog we propose both areas for the ecosystem to double down on and ways we can track that progress. It is by no means exhaustive, but written from the vantage point of having been embedded in the Filecoin ecosystem for many years, gathering feedback from users, builders and the community, and having thought deeply about what is needed as the network moves forward. 


The blog is organized in the following sections: 

  • What matters for Filecoin in 2024
  • Why these matter and how to measure progress

It is our hope that with the right north star, teams will be able to better coordinate and identify convergences between project-level interests & ecosystem interests. The proposed framework and metrics should make it easier for capital and resource allocators in the ecosystem to evaluate the level of impact each team is creating, and distribute capital and resources accordingly. For startups, this can help frame where broader ecosystem efforts may dovetail into your roadmap and releases.


WHAT MATTERS IN 2024

  1. Accelerating conversions to paid deals: Helping Filecoin providers increase their paid services (storage, retrieval, compute) is critical for driving cashflows into Filecoin and to support sustainable funding of its hardware outside of token incentives.

  2. Growing on-chain activity: Filecoin is not aiming to be just another L1 fighting over the same use cases. But it does have a unique value proposition as a base layer with “real world” services anchored into it. This enables new use cases (programmable services, DeFi around cash flows, etc.) that are unique to Filecoin. Building out and growing adoption of these services can help prove that Filecoin is not just “a storage layer”, but an economy with a stable set of cash flows.

  3. Making Filecoin indispensable to others: Bull cycles mean velocity is critical – as is making Filecoin an indispensable part of the stack for more teams. There are many emerging themes to capitalize on (Chain Archival, Compute, AI) – and Filecoin positioning itself matters. The ecosystem collectively wins when more participants leverage Filecoin as a core part of their story. For individual teams, this means that shipping to your users matters. At the ecosystem level, it means orienting efforts to unblock the teams closest to driving integrations and building services on Filecoin.

The verticals in our framework remain relatively high-level – and many of these objectives will have their own set of tasks. But it is more critical, first, for the ecosystem to align on this as the right set of verticals to progress against. Below, we dive into each vertical and some tangible metrics that the ecosystem should start tracking.



WHY THESE MATTER AND HOW TO MEASURE PROGRESS

1) Accelerating conversion to paid deals

As a storage network – Filecoin should maximize the cashflows it can bring into its economy. Having incentives as an accelerant is fine – but without a steady (and growing) ramp of paid deals, Filecoin can’t achieve its maximum potential.

Paid deals (when settled on-chain) are a net capital inflow into the Filecoin economy that can be the substrate for use cases uniquely possible in our ecosystem. DeFi as an example has a real opportunity to provide actual services to businesses (e.g. converting currencies to pay for storage).

There are two main paths through which we can drive growth of paid services:

  • Drive growth in existing services (data archival)
  • Expand to new markets with additional services (hot storage, compute, indexing, etc.)


In both cases, there’s work to be done to reduce friction for paid on-ramps or ship new features that raise the floor (as informed by on-ramps and projects trying to bring Filecoin services to market). It is critical that the Filecoin ecosystem collectively prioritizes the right efforts to make Filecoin services sellable, and allocates resources accordingly.

There are already a number of teams making substantial progress on this front (CID.Gravity, Seal Storage, Holon, Banyan, Lighthouse.storage, Web3Mine, Basin, among others) – and we can best measure progress by helping reduce their friction and helping drive their success.


We propose measuring success for this vertical in two forms:

  1. Dollars and Total Data Stored for Paid Deals (self reported)
  2. Dollars and Total Data Stored for Paid Deals (on-chain)

There are a number of initiatives from public goods teams along these efforts for the quarter (Q2 2024) which include: 

  • FilOz: Working on a FIP for new proofs to reduce storage costs and dramatically improve retrieval speeds
  • DeStor: Helping drive enterprise adoption for business-ready on-ramps
  • Ansa Research, Filecoin Foundation, etc.: Web3 BD support for ecosystem builders
  • Targeted grant funding for efforts that directly support growth of sustainable on-chain paid deal activity


2) Growing on-chain activity

Filecoin, as an L1, has more than just its storage service. Building a robust on-chain economy is critical for accelerating the services and tooling with which others can compose. In the Filecoin ecosystem, we have a unique opportunity in that there are real economic flows to enable via paid on-chain deals.

Centering our on-chain economy around supporting those flows – be it from automating renewals, designing incentives for retrievals, creating endowments for perpetual storage, or building economic efficiency for the operators of the network – can lead to compounding growth as it creates a flywheel.

As Filecoin owns more of its own economic activity on-chain, value will accrue for the token – enabling ecosystem users to use Filecoin in more productive ways, generating real demand for services inside the ecosystem. 

We propose the following metrics for us to collectively measure success: 

  1. Contract calls
  2. Active Filecoin addresses
  3. Volume of on-chain payments

There are notable builders already seeding the on-chain infrastructure to leverage some of these primitives (teams like GLIF working on liquid staking, Lighthouse on storage endowments, and teams like Fluence enabling compute).

There’s a set of improvements that can dramatically reduce friction for driving on-chain activity, and there are several efforts prioritized against this for Q2 2024:

  • FilOz: F3 to bring fast finality to Filecoin can both improve the bridging experience, and enable more “trade” between Filecoin and other economies (e.g. native payments from other ecosystems for services in Filecoin). 
  • FilOz: Refactoring how deals work on Filecoin to enable more flexible payment (e.g. with stablecoins)
  • FilPonto, FilOz: Reducing EVM tech debt to substantially reduce friction for builders porting Solidity contracts onto Filecoin (and hardening the surrounding infrastructure for more stable services)



3) Making Filecoin indispensable to others

This vertical is broad, but we would argue that there are two key ways to consider the impact that the Filecoin ecosystem is driving:

  1. The first is high-profile integrations, where Filecoin is critical to the success of the customer and its proposition. It is especially critical for the ecosystem to provide the necessary support for these cross-chain integrations.
  2. The second is specific verticals where there is a large and growing trend in activity. Filecoin is uniquely positioned to provide value here, both in terms of the primitives it has and in its cost profile and scale.
    • Opportunities are brimming in Web3 at the moment, and the ecosystem should rally workstreams around on-ramps that are making Filecoin integral to narratives such as Compute, DePIN (sensors), Social, Gaming, AI, and Chain Archival.


We propose the following metric to evaluate Filecoin’s indispensability:

  1. Number of partnerships and integrations

There are a number of efforts from ecosystem teams aimed at helping onramps succeed on this front in the quarter (Q2 2024): 

  • Ansa Research, Filecoin Foundation, DeStor and others: Forming a new working group to accelerate shared ecosystem BD and marketing resources
    • Shared BD resources for builders in the Filecoin ecosystem
    • Shared Marketing resources and amplification (#ecosystem-amplification-requests in the Filecoin slack) to help signal boost ecosystem wins
    • Community Discord to help expand accessibility, visibility, and drive community engagement



FINAL THOUGHTS

After reading the above, we hope that the direction of Filecoin in the coming year is clearer. Filecoin is at a pivotal moment where many of its pieces are coming together. Protocols and ecosystems naturally evolve, and each stage calls for different priorities and strategies for the next leg of growth. By focusing its efforts, we believe the Filecoin ecosystem can make its resources and support go that much further.

We are excited for what is to come and how Filecoin can continue to expand the pie for what can be done on Web3 rails. Moving forward, Ansa Research will post periodic updates on the key metrics for Filecoin’s ecosystem progress.

To stay updated on the latest Filecoin happenings, follow the @Filecointldr handle.

Disclaimer: This information is for informational purposes only and is not intended to constitute investment, financial, legal, or other advice. This information is not an endorsement, offer, or recommendation to use any particular service, product, or application.

As we’ve written about previously, Filecoin is building an economy of open data services. While today, Filecoin’s economy is primarily oriented around storage services, as other services (retrieval, compute) come online, the utility of the Filecoin network will compound as they all anchor in and can be triggered from the same block space.

The Filecoin Virtual Machine (FVM) allows us to compose these services together, along with other on-chain services (e.g. financial services), to create more sophisticated offerings. This is similar to how composability in DeFi enables the construction of key financial market services in a permissionless manner (e.g. auto-investment capabilities like Yearn, which build on liquidity pools like Curve and lending protocols like Compound). The FVM is an important milestone for Filecoin, as it allows anyone to build protocols to improve the Filecoin network and build valuable services for other participants. Smart contracts on Filecoin are unique in that they pair web3 offerings with real world services like storage and compute, provided by an open market.

In this blogpost, we’ll unpack a sample use case and its supporting components for the FVM, how these services might compose together, and the potential business opportunities behind them. One of the neat artifacts of what the FVM enables is modularity between solutions, meaning components built for one protocol can be reused for others. While designing these solutions, hopefully builders (potentially you!) keep this in mind to maximize the customer set.

This is only a subset of the opportunities that the broader Filecoin community has put forward here, but the aim is to show how these services might intertwine and how the Filecoin economy might evolve.

Note: Over time, it’s likely that a number of these services will migrate to subnets via Interplanetary Consensus — but for this blogpost we want to paint a more detailed picture of what the Filecoin economy might look like early on.

The rest of the blog is laid out as follows:

  • Motivating Use Case
    – Perpetual Storage
  • Managing Flows of Data
    – Aggregator / Caching Nodes
    – Trustless Notaries
    – Retrievability Oracles
  • Managing Flows of Funds
    – Staking
    – Cross Chain Messaging
    – Automated Market Makers (AMMs)

Motivating Use Case

Perpetual Storage

Perpetual storage is a useful jumping-off point — as it motivates a number of the other services (both infrastructure and economic) in the network. Permanent storage (which we’ve argued previously is a subset of perpetual storage) is a market valued at ~$300 million.

The basic goal of perpetual storage is straightforward: enable users to specify terms for how their datasets should be stored (e.g. ensure there are always at least 10 copies of this data on the network) — without having to run additional infrastructure to manage repair or renewal of deals. As long as the funds exist to pay for storage services, the perpetual storage protocol should automatically incentivize the repair and restoration of any under-replicated dataset to meet the specified terms.

This tweet thread shares a mental model for how one might create and fund such a contract. In the simplest form, you can boil down a perpetual storage contract to a minimum set of requirements <cid, number of copies, USDC, FIL, rules for an auction>, and the primitives to verify proper execution. Filecoin’s proofs are critical — as they can tell us when data is under-replicated and allow us to trigger auctions to bring replication back to a minimum threshold.
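The minimum requirements above can be sketched as a small state object with a trigger that fires when proofs show under-replication. Here is a hypothetical Python sketch (the class, field names, and numbers are illustrative assumptions, not an actual FVM contract):

```python
from dataclasses import dataclass, field

@dataclass
class PerpetualStorageContract:
    """Illustrative minimum state for a perpetual storage deal:
    <cid, number of copies, USDC, FIL, rules for an auction>."""
    cid: str                    # content identifier of the dataset
    min_copies: int             # e.g. "always at least 10 copies"
    usdc_balance: float         # fiat-denominated endowment
    fil_balance: float          # FIL-denominated endowment
    max_price_per_copy: float   # simple stand-in for "rules for an auction"
    active_copies: set = field(default_factory=set)  # providers with live proofs

    def on_proof_update(self, live_providers: set) -> int:
        """Called when storage proofs land on-chain; returns how many
        replacement copies an auction must source."""
        self.active_copies = live_providers
        return max(self.min_copies - len(live_providers), 0)

contract = PerpetualStorageContract("bafy...example", 10, 500.0, 120.0, 4.0)
# Proofs show only 8 live copies -> trigger an auction for 2 more.
print(contract.on_proof_update({f"sp{i}" for i in range(8)}))  # 2
```

The point of the sketch is that the proofs act as the sensor and the auction as the actuator: the contract itself only needs to compute the replication deficit.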

In order to build the above, a number of services are required. While one protocol could try and solve for all the services required in a monolithic architecture, modular solutions would allow for re-use in other protocols. Below we’ll cover some of the middleware services that might exist to help enable the full end-to-end flow.

Managing the Flow of Data

A sample flow for how data may move across the Filecoin economy.

Aggregator / Caching Nodes

In our perpetual storage protocol, the client specifies some data that should be replicated. This leads to an interesting UX question — in many cases, users don’t want to have to wait for storage proofs to land on-chain to know the data will be stored and replicated. Instead, users might prefer to have their data persisted by an incentivized actor with guarantees that all other services will occur, similar to the role that data on-ramps play (like Estuary and NFT.Storage).

Note: One of the nice things about content addressing is that relying on incentivized actors is totally optional! Users can wait for their data to land on-chain themselves if they’d like — or send their data to an incentivized network (as described here) that manages this onboarding process for them.

One solution to this UX question might be to design a protocol for incentivized IPFS nodes, operating with weaker storage guarantees, to act as incentivized caches. These nodes might lock some collateral (to ensure good behavior and enact penalties if services are not rendered properly), and when data is submitted, return a commitment to store the data on Filecoin according to the specified requirements of the client. This commitment might include a merkle proof (showing the client’s data was included inside of a larger set of data that might be stored in aggregate), a max block height by which the deal would start, etc.
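The inclusion commitment described above is a standard merkle proof. A minimal Python sketch of how a client could verify that its piece was included in an aggregator’s committed root (the two-leaf tree and hash layout are illustrative, not Filecoin’s actual piece-aggregation format):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_inclusion(leaf: bytes, proof: list, root: bytes) -> bool:
    """Check that `leaf` is included under `root`, given a merkle path.
    Each proof step is (sibling_hash, sibling_is_left)."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

# Aggregator commits to a root over two client pieces; client A verifies inclusion.
piece_a, piece_b = b"client-A-data", b"client-B-data"
root = h(h(piece_a) + h(piece_b))
print(verify_inclusion(piece_a, [(h(piece_b), False)], root))  # True
```

This is what lets clients hand data to an aggregator without trusting it blindly: the commitment is cheap to verify and can later be checked against the on-chain deal.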

Revenue Model: One neat feature of this design is that aggregator services can charge small fees on both sides — a small fee from clients (pricing the temporary storage costs, compute for aggregation, bandwidth costs, etc.), and potentially an onboarding bounty from an auction protocol (an example is described in the Trustless Notaries section below).

Trustless Notaries (Auction Protocols)

To actually make the deal on Filecoin, we might want to automate the process of using Filecoin Plus. Filecoin has two types of storage deals — verified deals and unverified deals. Verified deals refer to storage deals done via the Filecoin Plus program, and are advantageous for data clients as they leverage Filecoin’s network incentives to help reduce the cost of storage.

Today, Filecoin Plus uses DataCap (allocated by Notaries) to help imbue individual clients with the ability to store fixed amounts of data on the network. Notaries help add a layer of social trust to publicly verify that clients are authentic and prevent sybils from malicious actors. This works when clients are human — but it leaves an open question on how one can verify non-human (e.g. smart contract!) actors.

One solution would be to design a trustless notary: a smart contract designed so that it would be economically irrational to attempt to sybil the deal-making process.

A basic flow of how a trustless notary interacts with clients and storage providers.

What might this look like? A trustless notary might be an on-chain auction, where all participants (clients, storage providers) are required to lock some collateral (proportional to the onboarding rate) to participate. When the auction is run, storage providers can submit valid bids (even negative ones!) accommodating the requirements of the client. By running an auction via a smart contract — everyone can verify that the winning bidder(s) came from a transparent process. Economic collateral (both from the clients and storage providers) can be used to disincentivize malicious actors and ensure closed auctions result in on-chain deals. The auction process might also allow for more sophisticated negotiations between a prospective client and storage provider — not just on the terms of the deal, but on the structure of the payment as well. A client looking to control costs might offer a payment in fiat (to cover a storage provider’s opex) along with a loan in Filecoin (and in return expect a large share of the resulting block rewards).
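A toy sketch of that auction logic in Python. The bid shape, the collateral rule, and the selection criterion are all assumptions for illustration; note that negative bids (providers paying to win a deal, e.g. to earn block rewards) clear naturally under a lowest-price rule:

```python
def run_storage_auction(bids, client_requirements, copies_needed):
    """Hypothetical on-chain auction. Each bid is a tuple:
    (provider_id, price_per_gib, collateral_locked, meets_terms).
    Under-collateralized or non-conforming bids are filtered out,
    then the cheapest (possibly negative) bids win."""
    valid = [b for b in bids
             if b[2] >= client_requirements["min_collateral"] and b[3]]
    valid.sort(key=lambda b: b[1])  # cheapest (most negative) first
    return [b[0] for b in valid[:copies_needed]]

bids = [
    ("sp1",  0.02, 100, True),
    ("sp2", -0.01, 100, True),   # negative bid: provider subsidizes storage
    ("sp3",  0.01,  10, True),   # under-collateralized, filtered out
]
winners = run_storage_auction(bids, {"min_collateral": 50}, 2)
print(winners)  # ['sp2', 'sp1']
```

Because the filtering and sorting rules live in a contract, anyone can re-run them and confirm the winners came from a transparent process.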

Revenue Model: For running the auction, the notary maintainer might collect some portion of fees for the deal clearing, collect a fee on locked collateral (e.g. if staked FIL is used as the collateral some slice of the yield), or some combination of both. One nice artifact about running a transparent auction is it can also allow for negative prices for storage (which can be used to fund an insurance fund for datasets, bounties for teams that help onboard new clients, distributed to tokenholders who participate in governance of the trustless notary, etc).

Note: Trustless notaries (if designed correctly) have a distinct advantage of being permissionless — where they can support any number of use cases that might not want humans in the loop (e.g. ETL pipelines that want to automatically store derivative datasets). Today, 393 PiB of data have been stored via verified deals.

In our perpetual storage use case, we’d likely want to be able to leverage the trustless notary to trigger the deal renewals and auctions any time a dataset is under replicated. On the first iteration, this means that storage providers might grab the data out of the caching nodes and on subsequent iterations from other storage providers who have copies of the data.

Retrievability Oracles

For both the deals struck by the trustless notaries, as well as for the caching done by the aggregators — we need to ensure data is properly transferred and protect clients against price gouging. One solution to this problem is retrievability oracles.

Retrievability oracles are consortiums that allow a storage provider to commit to a maximum retrieval price for the data stored. The basic mechanism is as follows:

  • When striking a deal with a client, a storage provider additionally can commit to retrieval terms.
  • In doing so, the storage provider locks collateral with the retrievability oracle along with relevant commitment (e.g. max price to charge per GiB for some duration).
  • In normal operation, the client and the storage provider continue to store and retrieve data as normal.
  • In the event the storage provider refuses to serve data (against whatever terms previously agreed), the client can appeal to the retrievability oracle who can request the data from the storage provider.
    → If the storage provider serves the data to the oracle, the data is forwarded to the client.
    → If the storage provider doesn’t serve the data, the storage provider is slashed.

Revenue Model: For running the retrieval oracles, the consortium may collect fees (either from storage providers for using the service, fees for accepting different forms of collateral, yield from staked collateral, or perhaps upon retrieval of data on behalf of the client).

By including a retrievability oracle in this loop, we can ensure incentives exist for the proper transfer of data at the relevant points in the lifecycle of our perpetual storage protocol.

Building the Economic Loop

With all of the above, we’ve effectively created incentivized versions of the relevant components for the dataflows. Now with this out of the way, it’s worthwhile to focus on the economic flows and how we can ensure that our perpetual storage protocol can fully fund the operations above.

A sample view on the economic flows for enabling perpetual storage. Note other DeFi primitives might help mitigate risk and volatility (e.g. options, perpetuals)

Aside from the initial onboarding costs, the remainder of the costs will come down to storage and repairs. While there are many approaches to calculating the upfront “price”, a conservative strategy will likely involve the perpetual storage protocol generating revenue in the same currencies (fiat, Filecoin) as the liabilities incurred due to storage (i.e. the all-in costs of storage). This approach relies on the fact that storage on Filecoin has two types of (fairly predictable) expenses:

  • The Filecoin portion (the cost of a loan in FIL, the FIL required to send messages over the network) and
  • The fiat portion (the cost of the harddrive, running the proofs, electricity)

A perpetual storage protocol that builds an endowment in the same mix of currencies as its liabilities can ensure that its costs are fully covered despite the volatility of a single token (as might be the case if the endowment was backed by a single currency). In addition, by putting the capital to work and generating yield, the upfront cost for the client can be reduced.
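A back-of-the-envelope sketch of sizing such a currency-matched endowment, assuming steady annual costs and steady yields in each currency (all figures are hypothetical; a real protocol would also buffer for yield variance and repair spikes):

```python
def required_endowment(fiat_cost_per_year: float,
                       fil_cost_per_year: float,
                       fiat_yield: float,
                       fil_yield: float):
    """Hypothetical endowment sizing: hold each currency in proportion to
    the liabilities denominated in it, sized so that yield alone covers
    the annual cost in perpetuity (principal = cost / yield)."""
    return {
        "fiat_principal": fiat_cost_per_year / fiat_yield,
        "fil_principal": fil_cost_per_year / fil_yield,
    }

# e.g. $20/yr of fiat costs at 4% stablecoin yield,
# and 1.5 FIL/yr of network costs at 10% staking yield.
print(required_endowment(20.0, 1.5, 0.04, 0.10))
```

Matching the currency mix of the endowment to the currency mix of the liabilities is what removes single-token volatility from the solvency question.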

Staking

To generate yield in Filecoin, the natural place to focus would be on the base protocol of Filecoin itself. Storage providers are required to lock FIL as collateral in order to onboard capacity, and by running Filecoin’s consensus, they earn a return in the form of block rewards and transaction fees. The collateral requirements for the now ~4,000 storage providers on the Filecoin network create a high demand to borrow the FIL token. FIL staking would allow a holder of Filecoin to lend FIL — locking their capital with a storage provider and receiving yield by sharing in the rewards of the storage provider.

Today, programs exist with companies like Anchorage, Darma, and Coinlist to deploy Filecoin with some storage providers, but these programs can service only a subset of the storage providers and don’t support protocols (such as our perpetual storage protocol) that might be looking to generate yield.

Staking protocols can uniquely solve this problem — allowing for permissionless aggregation (allowing smart contracts to generate yield), and deployment of Filecoin to all storage providers directly on-chain. Similar to Lido or Rocketpool in Ethereum, these protocols could also create tokenized representations of the yield bearing versions of Filecoin — further allowing these tokenized representations to be used as collateral in other services (e.g. the trustless notary, retrievability oracles listed above).

Revenue Model: Staking protocols can monetize in a number of ways — including taking a percentage of the yield generated from deployed assets.

Note: Today, roughly 34% of the circulating supply of Filecoin is locked as collateral, less than half of some proof of stake networks (e.g. 69% for Solana, 71% for Cardano).

Cross Chain Messaging

The other portion of the storage costs (the fiat denominated portion) will need to generate yield — and while it makes sense that some of these solutions might be deployed on the FVM, it’s worth discussing the areas where DeFi in other ecosystems might be used to fund operations on Filecoin.

Cross chain messaging could connect the Filecoin economy to other ecosystems allowing perpetual storage protocols to create pools for their non-Filecoin assets (e.g. USDC) on other networks. This would allow these protocols to generate yield on stablecoins in deeper markets (e.g. Ethereum) and bridge back portions as needed to Filecoin when renewing deals. Perpetual storage protocols can offer differentiated sources of recurring demand for these lending protocols, as they likely will have a much more stable profile in terms of their deployment capital given their cost structure — similar to pension funds in the traditional economy.

Given an early source of demand for many perpetual storage protocols include communal data assets (e.g. NFTs, DeSci datasets) which primarily involve on-chain entities, it’s likely that over time we’ll see steady demand for these cross chain services. For cross chain messaging protocols, this offers a unique opportunity to capture value between the “trade” of these different economies — as services are rendered on either side.

Automated Market Makers (AMMs)

One last component worth mentioning in the value flow for our perpetual storage protocol is a need for AMMs. The protocols listed above offer solutions for yield generation, but at the moment of payment, conversion of assets will likely need to happen (e.g. converting from a staked representation of Filecoin to Filecoin itself). This is where AMMs can help!

Outside of helping convert staked representations of Filecoin to Filecoin, AMMs can also be useful for allowing perpetual storage protocols to accept multiple types of currencies for payment (e.g. allowing ETH or SOL to be swapped into the appropriate amounts of FIL and stablecoins to fund the perpetual storage protocol). These conversions might happen on other chains as well — but similar to the traditional economy, it’s likely that over time we’ll see trade balances emerge between these economies and swaps happening on both sides.
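For intuition, here is the standard constant-product swap math a typical AMM would use for such a conversion (pool sizes, the stFIL/FIL pair, and the fee are illustrative assumptions):

```python
def amm_swap_out(x_reserve: float, y_reserve: float,
                 dx: float, fee: float = 0.003) -> float:
    """Constant-product AMM (x * y = k): output amount of y for an
    input of dx units of x, after deducting the pool fee."""
    dx_after_fee = dx * (1 - fee)
    k = x_reserve * y_reserve
    new_x = x_reserve + dx_after_fee
    return y_reserve - k / new_x   # y must shrink so x * y stays at k

# Pool of 10,000 stFIL / 10,000 FIL; swap 100 stFIL for FIL (no fee).
out = amm_swap_out(10_000, 10_000, 100, fee=0.0)
print(round(out, 2))  # 99.01
```

The slippage visible even in this small trade is why deep pools matter for protocols making recurring payments.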

Conclusion

These examples are a subset of the use cases and business models that the FVM enables. While I focused on tracing the flow of data and value through the Filecoin economy, it’s worth underscoring this is just a single use case — many of these components could be re-used for other data flows as well. Given all services require remuneration, these naturally will tie to economic flows as well.

As Filecoin launches its retrieval markets and compute infrastructure, the network will support more powerful interactions — as well as creating new business opportunities to connect services to support higher order use cases. Furthermore, as more primitives are built out on the base layer of Filecoin (e.g. verifiable credentials around the greenness of storage providers, certifications for HIPAA compliance), you can imagine permutations of each of the base primitives built above allowing for more control at the user level over the service offerings they wish to consume (e.g. a perpetual storage protocol that will only allow green storage providers to participate in storing their data).

The FVM dramatically increases the design space for developers looking to build on the hardware and services in the Filecoin economy — and can hopefully also provide tangible benefits for all of web3, allowing for existing protocols and services to now be connected to a broader set of use cases.

Disclaimer: Personal views and not reflective of my employer, nor should they be treated as “official”. This information is intended to be used for informational purposes only. It is not investment advice. Although we try to ensure all information is accurate and up to date, occasionally unintended errors and misprints may occur.

Special thanks to @duckie_han for helping shape this.


Overview

There is an overwhelming amount of work going on in the Filecoin ecosystem, and it can be difficult to see how all the pieces fit together. In this blog post, I’m going to explain the structure of Filecoin and various components of the roadmap to hopefully simplify navigating the ecosystem. This blog is organized into the following sections:

  • What is Filecoin?
  • Diving into the Major Components
  • Final Thoughts

This post is intended to be a primer on the major goings-on in Filecoin land; it is by no means exhaustive of everything happening! Hopefully, this post serves as a useful anchor and the embedded links are jumping-off points for the intrepid reader.

What is Filecoin?

My short answer: Filecoin is enabling open services for data, built on top of the IPFS protocol.

IPFS allows data to be uncoupled from specific servers — reducing the siloing of data to specific machines. In IPFS land, the goal is to allow permanent references to data — and do things like compute, storage, and transfer — without relying on specific devices, cloud providers, or storage networks. Why content addressing is super powerful and what CIDs unlock is a separate topic — worthy of its own blog post — that I won’t get into here.

Filecoin is an incentivized network on top of IPFS — in that it allows you to contract out services around data on an open market.

Today, Filecoin focuses primarily on storage as an open service — but the vision includes the infrastructure to store, distribute, and transform data. Looking at Filecoin through this lens, the path the project is pursuing and the bets/tradeoffs that are being taken become clearer.

It’s easier to bucket Filecoin into a few major components:

There are 3 core pillars of Filecoin, enabled by 2 critical protocol upgrades:
  • Storage Market(s): Exists today (cold storage), improvements in progress.
  • Retrieval Market(s): In progress
  • Compute over Data (Off-chain Compute): In progress
  • FVM (Programmable Applications): In progress
  • Interplanetary Consensus (Scaling): In progress

Diving into the Major Components

Storage Market(s)

Storage is the bread and butter of the Filecoin economy. Filecoin’s storage network is an open market of storage providers — all offering capacity on which storage clients can bid. To date, there are 4000+ storage providers around the world offering 17EiB (and growing) of storage capacity.

Filecoin is unique in that it uses two types of proofs (both related to storage space and data) for its consensus: Proof-of-Replication (PoRep) and Proof-of-Spacetime (PoSt).

  • PoRep allows a miner to prove both that they’ve allocated some amount of storage space AND that there is a unique encoding of some data (could be empty space, could be a user’s data) into that storage space. This proves that a specific replica of data is being stored on the network.
  • PoST allows a miner to prove to the network that data from sets of storage space are indeed still intact (the entire network is checked every 24 hrs). This proves that said data is being stored (space) over time.
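For intuition on the proving cadence: mainnet Filecoin partitions each 24-hour proving period into 48 half-hour windows, and each group of sectors must submit its proof within its assigned window. A toy sketch of that partitioning (function and parameter names are illustrative, not protocol APIs):

```python
def proving_deadline(sector_deadline_index: int,
                     seconds_into_day: int,
                     window_secs: int = 30 * 60,
                     windows_per_day: int = 48) -> bool:
    """Return True if the current time falls inside the half-hour
    window assigned to this sector group's deadline index."""
    current_window = (seconds_into_day // window_secs) % windows_per_day
    return current_window == sector_deadline_index

# A sector group assigned to window 3 is due between 01:30 and 02:00.
print(proving_deadline(3, 95 * 60))  # True  (01:35 falls in window 3)
```

Staggering the deadlines this way spreads proof submissions (and chain load) evenly across the day while still checking the whole network every 24 hours.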

These proofs are tied to economic incentives to reward miners who reliably store data (block rewards) and severely penalize those who lose data (slashing). One can think of these incentives like a cryptographically enforced service-level agreement, except rather than relying on the reputation of a service provider — we use cryptography and protocols to ensure proper operation.

In summary, the Filecoin blockchain is a verifiable ledger of attestations about what is happening to data and storage space on the network.

A few features of the architecture that make this unique:

  • The Filecoin Storage Network (total storage capacity) is 17EiB of data — yet the Filecoin blockchain is still verifiable on commodity hardware at home. This gives the Filecoin blockchain properties similar to that of an Ethereum or a Bitcoin, but with the ability to manage internet-scale capacity for the services anchoring into the blockchain.
  • This ability is uniquely enabled by the fact that Filecoin uses SNARKs for its proofs — rather than storing data on-chain. In the same way zk-rollups can use proofs to assert the validity of some batched transactions, Filecoin’s proofs can be used to verify the integrity of data off-chain.
  • Filecoin is able to repurpose the “work” that storage providers would normally do to secure our chain via consensus to also store data. As a result, storage users on the network are subsidized by block rewards and other fees (e.g. transaction fees for sending messages) on the network. The net result is Filecoin’s storage price is super cheap (best represented in scientific notation per TiB/year).
  • Filecoin gets regular “checks” via its proofs about data integrity on the network (the entire network is checked every 24 hrs!). These verifiable statements are important primitives that can lead to unique applications and programs being built on Filecoin itself.
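To make the on-chain/off-chain split concrete, here is a minimal Python sketch. It is illustrative only — real Filecoin commitments are SNARK-verified values (CommR/CommD), not the bare hashes used here:

```python
import hashlib

# Toy sketch of the on-chain/off-chain split: the chain stores only a small
# commitment per sector rather than the data itself, so verifying the chain
# stays cheap no matter how large the network's stored data grows.

class Chain:
    def __init__(self):
        self.commitments = {}  # sector_id -> 32-byte commitment

    def commit(self, sector_id: str, data: bytes):
        self.commitments[sector_id] = hashlib.sha256(data).digest()

    def verify(self, sector_id: str, data: bytes) -> bool:
        return self.commitments.get(sector_id) == hashlib.sha256(data).digest()

chain = Chain()
chain.commit("sector-1", b"17 EiB of data lives off-chain")
# The chain holds only 32 bytes per sector. In this toy, full verification
# still needs the data; a real SNARK proof lets the chain check integrity
# without ever seeing it.
assert chain.verify("sector-1", b"17 EiB of data lives off-chain")
```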

While this architecture has many advantages (scalability! verifiability!), it comes at the cost of additional complexity — the storage-providing process is more involved and writing data into the network can take time. This complexity makes Filecoin (as it is today) best suited for cold storage. Many folks using Filecoin today are likely doing so through a developer on-ramp (Estuary.tech, NFT.Storage, Web3.Storage, Chainsafe’s SDKs, Textile’s Bidbot, etc) which couples hot caching in IPFS with cold archival in Filecoin. Those using Filecoin alone are typically storing large-scale archives.

However, as improvements land both to the storage providing process and the proofs, expect more hot storage use cases to be enabled. Some major advancements to keep an eye on:

  • SnapDeals — coupled with the below, storage providers can turn the mining process into a pipeline, injecting data into existing capacity on the network to dramatically reduce the time for data to land on-chain.
  • 🔄 Sealing-as-a-service / SNARKs-as-a-service — allowing storage providers to focus on data storage and outsource expensive computations to a market of specialized providers.
  • 🔄 Proofs optimizations — tuning hardware to optimize for the generation of Filecoin proofs.
  • 🔄 More efficient cryptographic primitives — reducing the footprint or complexity of proof generation.

Note: All of this is separate from the “read” flow — techniques for faster reads exist today via unsealed copies. However, for Filecoin to get to web2 latency, we will need Retrieval Market(s), discussed in the next section.

Retrieval Market(s)

The thesis with retrieval markets is straightforward: at scale, caching data at the edge via an open market can solve the speed-of-light problem and deliver performant content at lower costs than traditional infrastructure.

Why might this be the case? The argument is as follows:

  • The magic of content addressing (using fingerprints of content as the canonical reference) means data is verifiable.
  • This maps neatly to building a permissionless CDN — meaning anyone can supply infrastructure and serve content — as end users can always verify that the content they receive back is the content they requested (even from an untrusted computer).
  • If anyone can supply infrastructure into this permissionless network, a CDN can be created from a market of edge-caching nodes (rather than centrally planning where to put these nodes) and use incentive mechanisms to bootstrap hardware — leading to the optimal tradeoff on performance and cost.
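The verification property that makes a permissionless CDN possible can be sketched as follows. This is a toy model using plain SHA-256 hex digests, not real IPFS CIDs:

```python
import hashlib

# Toy sketch of content addressing: the request IS the hash of the content,
# so any untrusted cache can serve it and the client verifies locally.

def cid(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()

untrusted_cache = {}

def fetch(requested_cid: str) -> bytes:
    content = untrusted_cache[requested_cid]
    if cid(content) != requested_cid:  # verify before trusting
        raise ValueError("cache returned corrupted content")
    return content

original = b"hello from the edge"
untrusted_cache[cid(original)] = original
assert fetch(cid(original)) == original

# A tampering cache is caught immediately:
untrusted_cache[cid(original)] = b"tampered"
try:
    fetch(cid(original))
except ValueError:
    pass  # corruption detected client-side
```

Because correctness is checked by the client, the network never needs to trust the node that served the bytes — which is exactly what lets anyone supply hardware.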

The way retrieval markets are being designed on Filecoin, the aim is not to mandate a specific network — but rather to let an ecosystem evolve (e.g. Magmo, Ken Labs, Myel, Filecoin Saturn, and more) to solve the components involved in building a retrieval market.

Source: https://www.youtube.com/watch?v=acqTSORhdoE&ab_channel=Filecoin (From April ‘22)

This video is a good primer on the structure and approach of the working group and one can follow progress here.

Note: Given latency requirements, retrievals happen off-chain, but the settlement for payment for the services can happen on-chain.

Compute over Data (Off-chain Compute)

Compute over data is the third piece of the open services puzzle. When one thinks of what needs to be done with data, it’s typically not just storage and retrieval — users also want to be able to transform the data. The goal with these compute-over-data protocols is generally to perform computation over IPLD.

For the unfamiliar, IPLD aims to be the data layer for content-addressed systems. It can be used to describe a filesystem (like UnixFS, which IPFS uses), Ethereum data, Git data — really anything that is hash-linked. This video might be a helpful primer.

The neat thing about IPLD being generic is that it can be an interface for all sorts of data — and by building computation tools that interact with IPLD, we reduce the complexity for teams building these tools to have their networks be compatible with a wide range of underlying types of data.
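A toy sketch of hash-linked data may help illustrate why one generic interface can cover so many data shapes. Names here are invented and this is not the IPLD spec (real IPLD uses CIDs and codecs like dag-cbor):

```python
import hashlib
import json

# Toy sketch of hash-linked (IPLD-style) data: nodes reference each other
# by hash, so one generic traversal works over any shape of data.

store = {}  # hash -> node

def put(node: dict) -> str:
    h = hashlib.sha256(json.dumps(node, sort_keys=True).encode()).hexdigest()
    store[h] = node
    return h

def resolve(root: str, path: list) -> object:
    """Walk a path, following hash links between nodes where they exist."""
    node = store[root]
    for segment in path:
        value = node[segment]
        node = store[value] if value in store else value
    return node

leaf = put({"size": 42})
root = put({"name": "example.txt", "meta": leaf})
assert resolve(root, ["meta", "size"]) == 42
```

A compute tool written against this one traversal interface works the same whether the underlying graph describes a filesystem, a Git repo, or chain state.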

Note: This should be exciting for any network building on top of IPFS / IPLD (e.g. Celestia, Gala Games, Audius, Ceramic, etc)

Of course, not all compute is created equal — and for different use cases, different types of compute will be needed. For some use cases, there might be stricter requirements for verifiability — and one may want a zk proof along with the result to know the output was correctly calculated. For others, one might want to keep the data entirely private — and so instead might require fully homomorphic encryption. For others, one may want to just run batch processing like on a traditional cloud (and rely on economic collateral or reputational guarantees for correctness).

Source: https://www.youtube.com/watch?v=-d4iJm-RbyA&t=537s&ab_channel=ProtocolLabs

There are a bunch of teams working on different types of compute — from large scale parallel compute (e.g. Bacalhau), to cryptographically verifiable compute (e.g. Lurk), to everything in between.

One interesting feature of Filecoin is that the storage providers have compute resources (GPUs, CPUs — as a function of needing to run the proofs) colocated with their data. Critically, this feature sets up the network well to allow compute jobs to be moved to data — rather than moving the data to external compute nodes. Given that data has gravity, this is a necessary step to set the network up to support use cases for compute over large datasets.

Filecoin is set up well to have compute layers be deployed on top as L2s.

One can follow the compute over data working group here.

FVM (Programmable Applications)

Up until this point, I’ve talked about three services (storage, retrieval, and compute) that are related to the data stored on the Filecoin network. These services and their composability can lead to compounding demand for the services of the network — all of which ultimately anchor into the Filecoin blockchain and generate demand for block space.

But how can these services be enhanced?

Enter the FVM — Filecoin’s Virtual Machine.

The FVM will enable computation over Filecoin’s state. This service is critical — as it gives the network all the powers of smart contracts from other networks — but with the unique ability to interact with and trigger the open services mentioned above.

With the FVM, one can build bespoke incentive systems to make more sophisticated offerings on the network.

Filecoin’s virtual machine is a WebAssembly (WASM) VM designed like a hypervisor. The vision for the FVM is to support many foreign runtimes, starting with the Ethereum Virtual Machine (EVM). This means Filecoin will support multiple VMs on the same network — contracts designed for the EVM, MoveVM, and more can all be deployed.

By allowing for many VMs, Filecoin developers can deploy hardened contracts from other ecosystems to build up the on-chain infrastructure in the Filecoin economy, while also making it easier for other ecosystems to natively bridge into the services on the Filecoin network. Multiple VM support also allows for more native interactions between the Filecoin economy and other L1 economies.
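The hypervisor design can be sketched as a host that routes each deployed contract to the runtime it was built for. This is a toy illustration — the class names and dispatch mechanism are invented, not the actual FVM API:

```python
# Toy sketch of a hypervisor-style VM: the host tracks which runtime each
# contract targets and dispatches calls accordingly.

class EvmRuntime:
    def execute(self, bytecode: bytes) -> str:
        return f"evm ran {len(bytecode)} bytes"

class WasmRuntime:
    def execute(self, bytecode: bytes) -> str:
        return f"wasm ran {len(bytecode)} bytes"

class Hypervisor:
    def __init__(self):
        self.runtimes = {"evm": EvmRuntime(), "wasm": WasmRuntime()}
        self.contracts = {}  # address -> (runtime_kind, bytecode)

    def deploy(self, address: str, kind: str, bytecode: bytes):
        self.contracts[address] = (kind, bytecode)

    def call(self, address: str) -> str:
        kind, bytecode = self.contracts[address]
        return self.runtimes[kind].execute(bytecode)

vm = Hypervisor()
vm.deploy("0xabc", "evm", b"\x60\x00")   # a Solidity-compiled contract
vm.deploy("f0123", "wasm", b"\x00asm")   # a native WASM actor
assert vm.call("0xabc") == "evm ran 2 bytes"
```

The point of the pattern is that contracts from different ecosystems coexist on one chain and can share state and services through the host.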

Note the ipld-wasm module — the generalized version of this will be the IPVM work (which could be backported here). Source: https://fvm.filecoin.io

The FVM is critical as it provides the expressiveness for people to deploy and trigger custom data services from the Filecoin network (storage, retrieval, and compute). This allows more sophisticated offerings to be built on Filecoin’s base primitives, expanding the surface area for broader adoption.

Note: For a flavor of what might be possible, this tweet thread might help elucidate how one might use smart contracts and the base primitives of Filecoin to build more sophisticated offerings.

Most importantly, the FVM also sets the stage for the last major pillar to be covered in this post: interplanetary consensus.

One can follow progress on the FVM here, and find more details on the FVM here.

Interplanetary Consensus (Scaling)

Before diving into what interplanetary consensus is, it’s worth restating what Filecoin is aiming to build: open services for data (storage, retrieval, compute) as credible alternatives to the centralized cloud.

To do this, the Filecoin network needs to operate at a scale orders of magnitude above what blockchains are currently delivering:

Product requirements for the Filecoin network.

Looking at the above requirements, it may seem contradictory for one chain to target all of these properties. And it is! Rather than trying to force all these properties at the base layer, Filecoin is aiming to deliver these properties across the network.

With interplanetary consensus, the network allows recursive subnets to be created on the fly. This framework allows each subnet to tune its own trade-off between security and scalability (and recursively spin up subnets of its own) — while still checkpointing information to its parent subnet.

This setup means that while Filecoin’s base layer can be highly secure (allowing many folks to verify at home on commodity hardware), Filecoin can have natively connected subnets that make different trade-offs, allowing more use cases to be unlocked.

In this diagram, the root would be the Filecoin base layer. Source: https://research.protocol.ai/blog/2022/scaling-blockchains-with-hierarchical-consensus/

A few interesting properties based on how interplanetary consensus is being designed:

  • Each subnet can spin up their own subnets (enabling recursive subnets)
  • Native messaging up, down, and across the tree — meaning any of these subnets can communicate with each other
  • Tunable trade-offs between security and scalability (each subnet can choose its own consensus model and can choose to maintain its own state tree).
  • Firewall-esque security guarantees from children to parents (from the parent chain’s perspective, each subnet is like a limited-liability chain, bounded by the tokens injected into it).
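The checkpointing relationship between a fast child subnet and its slower, more secure parent can be sketched as follows (a toy model — all class and method names are invented, not the actual IPC implementation):

```python
import hashlib

# Toy sketch of hierarchical checkpointing: a fast child subnet processes
# many transactions locally, then periodically anchors a small digest of
# its state into its parent.

class Subnet:
    def __init__(self, name: str, parent=None):
        self.name = name
        self.parent = parent
        self.state = []        # local transaction log
        self.checkpoints = []  # digests anchored by child subnets

    def apply(self, tx: str):
        self.state.append(tx)

    def checkpoint(self):
        digest = hashlib.sha256("".join(self.state).encode()).hexdigest()
        if self.parent:
            self.parent.checkpoints.append((self.name, digest))

root = Subnet("filecoin-base")
fast = Subnet("chat-subnet", parent=root)
for i in range(1000):  # hyper-fast local consensus on the child
    fast.apply(f"msg-{i}")
fast.checkpoint()      # one small anchor lands on the base layer
assert len(root.checkpoints) == 1
```

Thousands of child transactions cost the parent a single constant-size write — which is what lets a slow, highly verifiable base layer sit underneath performant subnets.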

To double click on some of the things interplanetary consensus sets Filecoin up for:

  • Because subnets can have different consensus mechanisms, interplanetary consensus opens the door for subnets that allow for native communication with other ecosystems (e.g. a Tendermint subnet for Cosmos).
  • Enabling subnets to tune between scalability and security (and enabling communications to subnets that make different trade offs) means Filecoin can have different regions of the network with different properties. Performant subnets can get hyper fast local consensus (to enable things like chat apps) — while allowing for results to checkpoint into the highly secure (and verifiable and slow) Filecoin base layer.
  • In a very high throughput subnet (a single data center, running a few nodes) — the FVM/IPVM work could be used simply to schedule tasks and execute computation directly “on-chain” — with native messaging and payment bubbling back up to more secure base layers.

Learn more by reading this blogpost and following the progress of ConsensusLab. This Github discussion may also be useful to contextualize IPC vs L2s.

Final Thoughts

So, after reading all the above, it’s hopefully clearer what Filecoin is — and how it’s not exactly like any other protocol out there. Filecoin’s ambition is not just to be a storage network (as Tesla’s ambition was not to just ship the Roadster) — the goal is to facilitate a fully decentralized web powered by open services.

Compared to most other web3 infra plays, Filecoin is aiming to be substantially more than a single service. Compared to most L1s, Filecoin is targeting a set of use cases that are uniquely enabled through the architecture of the network. Excitingly, this means rather than competing for the same use cases, Filecoin can uniquely expand the pie for what can actually be done on crypto rails.

Disclaimer: Personal views, not reflective of my employer nor should be treated as “official”. This is my distillation of what Filecoin is and what makes it different based on my time in the ecosystem. Thanks to @duckie_han and @yoitsyoung for helping shape this.

🇨🇳 A Summary of Filecoin’s Current State and Direction (Chinese translation of this post)