Crypto Reading List

A curated list for getting up to speed on crypto and decentralized networks.

The top-level page contains what we consider essential reading. Child pages contain deeper, topic-specific information to review afterward.

The lists here are a work in progress. We welcome any feedback or criticism! Please open a PR/issue here or reach out to crypto-research@jumptrading.com with any suggestions, or to report any errors.

Nothing in this repo constitutes financial or legal advice.

Why is crypto important?

We'd recommend starting your exploration by trying to understand what problems crypto is trying to solve.

In a few words, we'd say it is:

  • enabling a decentralized ledger-based currency system
    • decentralized means it is extremely difficult for bad actors to forge transactions that take your holdings
  • enabling a decentralized network of computation / decentralized state transition machine
    • decentralized means it is extremely difficult for bad actors to enact state changes not defined in the source code
  • enabling an open network of APIs that can be leveraged to build increasingly advanced apps
  • enabling an incentive model for these open networks to grow via crypto tokens

Here's the list:

More: see in-depth page: Why

Blockchain mechanics & innovations

We think it's essential reading to understand how bitcoin works, and how smart contracts (pioneered by Ethereum) work.

DeFi primitives

In-depth page: DeFi

Next, let's try to understand the major kinds of financial dApps on the blockchain. Although there are many types, we'd say the two most common are:

  1. Lending protocol (a decentralized bank, i.e. a smart contract where you can lend your assets for yield, or borrow while paying interest). Example: Aave
  2. Decentralized exchange (most commonly an Automated Market Maker (AMM), a smart contract with two pools of assets that allows swapping from one asset to the other). Example: Uniswap
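To make the "two pools of assets" idea concrete, here is a minimal sketch of a swap against a constant-product pool (the x * y = k rule popularized by Uniswap v2). The text above does not commit to a specific formula, so the pricing rule, the `Pool` type, and the `fee_bps` parameter are assumptions made for illustration only.

```rust
/// A two-asset pool. Hypothetical type, not taken from any real protocol.
#[derive(Debug, Clone, Copy)]
pub struct Pool {
    pub reserve_a: u128,
    pub reserve_b: u128,
}

impl Pool {
    /// Swap `amount_in` of asset A for asset B under the constant-product rule.
    /// `fee_bps` is a hypothetical fee in basis points (30 = 0.30%) and is
    /// assumed to be at most 10_000. Returns the amount of B paid out.
    pub fn swap_a_for_b(&mut self, amount_in: u128, fee_bps: u128) -> u128 {
        // Only the post-fee part of the input counts towards pricing.
        let in_after_fee = amount_in * (10_000 - fee_bps);
        // out = reserve_b * dx' / (reserve_a + dx'), scaled by 10_000 to stay in integers.
        let amount_out =
            self.reserve_b * in_after_fee / (self.reserve_a * 10_000 + in_after_fee);
        self.reserve_a += amount_in;
        self.reserve_b -= amount_out;
        amount_out
    }
}
```

Because the output is rounded down and the fee stays in the pool, the product of the reserves never decreases after a swap. Larger trades relative to the reserves receive a worse price.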

A third, which can be thought of as a competitor to (1) of sorts, is:

  3. Decentralized stablecoin issuer (a protocol allowing you to deposit assets (e.g. Eth) and borrow a decentralized stablecoin (minted by the protocol) against them). It is a competitor of sorts to (1) in that the lender is the protocol itself. Example: MakerDAO

Initial reading material on these categories:

For much more, see our in-depth page on DeFi

NFTs & digital identity

In-depth page: NFT

DAOs & Governance

In-depth page: DAO

Byzantine Fault Tolerance & Proof-of-Stake algos

At this point, we'd recommend learning about alternative smart contract blockchains.

A fundamental design decision in blockchains is the mechanism by which block producers (miners in Bitcoin and Eth 1.0) come to consensus on the next block. Reaching agreement in a distributed system with a variety of actors--some of whom may be sending intentionally confusing or destabilizing messages to their peers--is the key problem to solve in order to progress the blockchain.

Bitcoin and Eth 1.0 accomplish this by proof of work ("Nakamoto consensus"), but most other blockchains use variants of a different family of algorithms referred to as Byzantine Fault Tolerant (BFT) algorithms.
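As a rough illustration of proof of work, the sketch below searches for a nonce whose hash has a required number of leading zero bits. It is a toy under loud assumptions: `DefaultHasher` from the standard library stands in for a cryptographic hash (it is not one, and Bitcoin uses double SHA-256 compared against a target), and `toy_hash`, `is_valid`, and `mine` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for a cryptographic hash. NOT secure; for illustration only.
fn toy_hash(block_data: &str, nonce: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    block_data.hash(&mut hasher);
    nonce.hash(&mut hasher);
    hasher.finish()
}

/// A nonce is valid if the hash starts with at least `difficulty_bits` zero bits.
/// Verification costs a single hash.
pub fn is_valid(block_data: &str, nonce: u64, difficulty_bits: u32) -> bool {
    toy_hash(block_data, nonce).leading_zeros() >= difficulty_bits
}

/// Mining is a brute-force search: about 2^difficulty_bits hashes are expected.
pub fn mine(block_data: &str, difficulty_bits: u32) -> u64 {
    (0..=u64::MAX)
        .find(|&nonce| is_valid(block_data, nonce, difficulty_bits))
        .expect("no valid nonce found")
}
```

The asymmetry is the point: finding a nonce is expensive, while checking one is cheap, so peers can verify a producer's work without redoing it.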

L1s

In-depth page: L1

At this point you might want to dig into different L1 blockchains--both their protocol designs and their ecosystems. See in-depth pages below:

L2s

In-depth page: L2

Trading mechanics

In-depth page: TradingDynamics

In-depth page: MEV/Arbitrage

Smart contract programming

In-depth page: Development

Economic design

In-depth page: EconDesign

Tools & Analytics

In-depth page: Tools

Exercises

Check your understanding with these thought questions and exercises.

Other references

Other lists/directories

In-depth page: Other Lists

Original research

In-depth page: Researchers

Online courses

Why crypto?

In addition to the list on the main page, we recommend:

  • Fat Protocol thesis
    • oft-cited piece arguing for the value of network effects in blockchain infrastructure

DeFi

Overview

Lending protocols

DEX trading

AMM (swap according to a formula) DEXes

Serum (full order book)

  • Serum whitepaper
  • A technical introduction to the Serum DEX
    • describes the orderbook interactions and data structures, as well as how they're implemented. Note the section on the request queue is outdated
  • Serum core
    • a next-gen orderbook that decouples the matching engine from SPL tokens. Also known as the agnostic orderbook (AOB)
    • DEX v4 is a new version of Serum DEX that builds on top of the AOB

Derivatives

Stablecoins

Bridges

Oracles

Indexing

  • Intro to The Graph
    • The Graph is a decentralized protocol for indexing and querying blockchain data
      • It's like a search engine framework, and each 'subgraph' is a domain-specific search engine implementation
    • GraphQL API
      • tutorial on how to query a subgraph using GraphQL

Insurance

Governance wars and DeFi

  • CRV/CVX part 1 (June 2021)
    • describing the innovation of Convex (CVX) as a method for controlling valuable CRV governance votes, as well as the more general question of protocol design with respect to 'meta-protocols' like CVX
  • CRV/CVX part 2 (Sep 2021)
    • an in-depth look at the CRV governance war which TokenBrice correctly predicted in part 1
  • Mochi scam and the Curve Wars (Nov 2021)

Tools

High-level TVL/usage stats

Portfolio

Use these to track your (or anyone else's) balances/activity across various DeFi protocols.

Yield Farming Analytics

Smart Contract Security

  • apesafe.io - contract comparison
  • app.unrekt.net - allowance checker
  • RugDoc - AMM emergency withdrawal tool, honeypot checker, LP breaker, etc
  • defiyield.info - tools for reviewing where you approved ERC20 tokens, timelocks, etc

Airdrops

Fun

  • fees.wtf - check how much you have spent on ETH fees
  • il.wtf - check your cumulative IL

NFT Reading

Background

Fractionalization

Social Tokens

Minting

Other Lists

NFT Tools

Ethereum

Marketplaces/Trading:

Analytics:

Rarity:

Solana

See Awesome Solana NFTs (curated list of resources on Solana NFTs)

Marketplaces:

Analytics:

Rarity:

Tezos

DAOs / Decentralized Governance

Why

Challenges

  • Moving beyond coin voting governance (Aug 2021, Vitalik) - discussing challenges in existing simple coin-weighted voting schemes, as well as some high-profile governance attacks like the hostile takeover of Steem

How

Lists of DAOs

Tools

L1 Details

Child pages

Overviews

Historical / academic background

Cryptography

Distributed consensus

  • Practical Byzantine Fault Tolerance (Castro and Liskov, 1999)
    • a seminal algorithm for state machine replication / transaction ordering in a distributed system in which fewer than 1/3 of the nodes are malicious
  • Distributed Systems
    • excellent 101-level lecture notes. presumes some comfort with math notation, particularly predicate logic. distributed systems concepts can help answer questions like "why is Solana's "Proof of History" an innovation?"
  • CAP Theorem
    • fundamental trade-off between staying consistent vs. staying available when there's a network partition, i.e. nodes can't communicate
  • FLP Result
    • why distributed consensus cannot be guaranteed if even one node may go down. In practice, this is solved by relaxing the constraints of the FLP model
  • Proof of useful work
    • using collective computing power to search for prime number chains. an academic application but interesting concept
  • HotStuff BFT
    • an iteration on PBFT attempting to optimize for variable network delays
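The "1/3 malicious nodes" bound for PBFT-style protocols can be worked out with a little arithmetic. The sketch below assumes the standard requirement n >= 3f + 1 and a quorum size chosen so that any two quorums overlap in at least one honest node; the function names are hypothetical.

```rust
/// Largest number of Byzantine nodes `f` tolerated by `n` nodes, from n >= 3f + 1.
pub fn max_faulty(n: u32) -> u32 {
    n.saturating_sub(1) / 3
}

/// Smallest quorum such that any two quorums share at least f + 1 nodes,
/// i.e. at least one honest node: q = floor((n + f) / 2) + 1.
/// When n = 3f + 1 this reduces to the familiar 2f + 1.
pub fn quorum(n: u32) -> u32 {
    (n + max_faulty(n)) / 2 + 1
}
```

For example, 4 nodes tolerate 1 faulty node with a quorum of 3, and 7 nodes tolerate 2 with a quorum of 5.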

Bonus

  • P vs. NP - deep theoretical question on the very nature of computation: whether easy to verify implies (=>) easy to solve
  • Elliptic curve cryptography - in-depth blog post with plenty of diagrams on how elliptic curve cryptography works

Bitcoin

Ethereum

Blockchain

  • Ethereum whitepaper - building on bitcoin to get smartcontracts; also a good explanation of Bitcoin
  • How does ethereum work anyway (2017) - a clear description of the Eth network as a state transition machine
  • Patricia tree - an important Ethereum data structure. Intuitively, a Patricia tree is a Merkle tree + trie + compression. The linked page includes diagrams and a small example worth working through
  • EIP-1559 Analysis (June 2020) - EIP-1559 was a major UX change to how Ethereum would set gas price. At the time there was much debate about whether miners would accept these changes which would reduce their revenue. Hasu and Georgios correctly predicted the result (see also this article)
  • Mastering Ethereum (Andreas Antonopoulos and Gavin Wood, 2018) - a book for devs about Ethereum, published by O'Reilly but available for free on github

See also: L2.md for a description of rollups.

Programming

See Dev/Solidity.md

Ecosystem

Avalanche

Avalanche defines a consensus mechanism called Avalanche Consensus, which replaces the linear blockchain with a DAG and uses random subsampling for conflict resolution within conflict sets.

Snowman Consensus is the linear version of Avalanche Consensus. Validator coordination and smart contract execution in Avalanche use this because they require total ordering.

There are at least 3 chains in Avalanche:

  • Exchange (X) Chain: uses Avalanche Consensus; used for creating and exchanging assets. Assets are tracked using the UTXO model in order to leverage the speedup from the DAG.
  • Platform (P) Chain: uses Snowman Consensus; used for validator consensus and definition of other subnets
  • Contract (C) Chain: uses Snowman Consensus; runs smart contracts (including those written for EVM).

Blockchain

Ecosystem

Binance Smart Chain

Blockchain

Ecosystem

Smart contract programming

Cosmos

  • Tendermint is a PoS consensus mechanism.
  • ABCI is an interface for how application-specific state will be replicated by a consensus algo. Tendermint complies with ABCI.
  • IBC is a protocol for how blockchains communicate with each other
  • The Cosmos SDK is a library with implementations of Tendermint and IBC, for use in building a new blockchain.
  • The Cosmos Hub is a specific blockchain intended to serve as a master connector of many blockchains that use IBC

Consensus Mechanism

Cosmos Network

  • Cosmos whitepaper (2016)
    • Cosmos network whitepaper (describing Cosmos zones/IBC and validator behavior - a network built on Tendermint consensus algo). See "Consensus systems" for a good summary of Tendermint vs PBFT and other consensus algos.
  • Cosmos & Polkadot comparison (2019)

Ecosystem

NEAR

Blockchain

NEAR uses Doomslug PoS for consensus and features dynamic resharding ("Nightshade") for horizontal scalability. It supports EVM programs via Aurora.

Ecosystem

Polkadot

Blockchain

Ecosystem

Polygon

Polygon is a sidechain that uses a modified Tendermint PoS consensus mechanism for transaction ordering/inclusion.

It features two chains:

  • Heimdall Chain - responsible for selecting Bor validator sets and checkpointing balances back to the main chain (Ethereum)
  • Bor Chain - responsible for executing smart contracts and producing blocks.

Blockchain

Ecosystem

Solana

Blockchain

Ecosystem

see also: DeFi/Serum and NFT/Solana

Serum

Programming

See SolanaProgramming

Terra

Protocol

Ecosystem

A few specific dApps

  • Anchor - lending protocol offering high yields; see this thread for an insightful explanation
  • Mirror - mirror real-world assets (e.g. $mAAPL) on Terra and Ethereum / BSC via Shuttle. note: currently under investigation by the SEC
  • Nebula - a protocol for "ETFs", where the target asset mix is dynamically changing via smart contracts (there are incentives to make actual asset mix = target). Do Kwon asserts a fascinating hypothesis: young investors want to make bets, not passively invest.

L2

Intro / Survey

  • L2 for Beginners - describing a mental model of an L2 as a chain which writes enough state back to Ethereum that no one (including the L2's miners/validators) can send back a fraudulent state
  • Optimistic rollups vs ZK-rollups - a recent assessment of the state of various rollup projects
  • Vitalik's post on rollups - a fairly technical summary of the SOTW of rollups
  • Validating Bridges (Sep 2021) - a fairly thorough assessment of L2 solutions (including Oasis?) as well as open research questions
  • Eth docs on L2 Rollups - see the bottom of this page for further links

Optimistic rollups

zk-Rollups

Primary source of info for zkps should be awesome-zero-knowledge-proofs, a git-based list of learnings curated by Matter Labs.

Our shortlist:

  • Validity Proofs vs Fraud Proofs (2019) - StarkWare's summary of fraud proofs (typically employed by optimistic rollups) vs validity proofs (not feasible on optimistic rollups)
  • zk-STARKS vs zk-SNARKS
  • ethhub ZK-STARKS
  • StarkNet (Jan 2021) - StarkWare's description of their STARK-powered zk-rollup
  • Awesome StarkNet - curated list of StarkNet resources and tools from @gakonst and others
  • zkSync (June 2020) - Matter Labs' description of their "SNORK"-powered zk-rollup. SNORKS are SNARKs with a universal and updateable trusted setup.
  • Volitions: best of all worlds (Aug 2021) - describing the volition, a hybrid system where users can choose between storing their data as a rollup (on-chain data) or as a validium (off-chain).
    • See also StarkWare's proposal (June 2020) for the volition concept, and Matter Labs' description of its volition implementation zkSync 2.0 (Apr 2021)

zk-Proofs

Analytics

  • L2beat - comparison of current state of various L2s

Other Lists / Acknowledgements

  • L2 skill tree - this tweetstorm by @likebeckett was instrumental in assembling this page

Trading Dynamics

Crypto Products

Theoretical products

Products that haven't been built yet which people frequently talk about:

Automated trading

MEV

MEV is a measure of the profit a block producer (miner, validator, etc) can make due to their ability to arbitrarily include, exclude, or re-order transactions within a block.
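To see why ordering power is worth money, here is a toy "sandwich" on a constant-product pool. Everything in it is an assumption for illustration: the pool has no fees, the `Pool` type and function names are hypothetical, and real MEV extraction involves gas costs and competition that this ignores.

```rust
/// Fee-less constant-product pool holding assets X and Y (hypothetical).
#[derive(Clone, Copy)]
pub struct Pool {
    pub x: u64,
    pub y: u64,
}

impl Pool {
    /// Sell `dx` of X, receive Y.
    pub fn sell_x(&mut self, dx: u64) -> u64 {
        let dy = self.y * dx / (self.x + dx);
        self.x += dx;
        self.y -= dy;
        dy
    }

    /// Sell `dy` of Y, receive X.
    pub fn sell_y(&mut self, dy: u64) -> u64 {
        let dx = self.x * dy / (self.y + dy);
        self.y += dy;
        self.x -= dx;
        dx
    }
}

/// The victim's trade executes alone. Returns the Y received.
pub fn victim_only(mut pool: Pool, victim_dx: u64) -> u64 {
    pool.sell_x(victim_dx)
}

/// The block producer orders its own trades around the victim's.
/// Returns (Y received by the victim, producer's profit in X).
pub fn sandwich(mut pool: Pool, victim_dx: u64, attacker_dx: u64) -> (u64, i64) {
    let attacker_y = pool.sell_x(attacker_dx); // front-run: push the price up
    let victim_y = pool.sell_x(victim_dx); // victim trades at the worse price
    let attacker_x_back = pool.sell_y(attacker_y); // back-run: unwind
    (victim_y, attacker_x_back as i64 - attacker_dx as i64)
}
```

With reserves of 1,000 / 1,000 and both trades sized at 100, the victim receives 75 instead of 90 and the producer nets 16 X, purely from choosing the order of the three transactions.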

  • Interview with a Searcher - the single best discussion of MEV and how searchers (arbitrageurs) uncover arbs and submit them to the flashbots private relay. Includes discussion of the role flashbots plays in 'democratizing' MEV and providing DOS protection for miners.
  • awesome-MEV-resources - curated list of MEV articles/podcasts/etc

History

  • Ethereum is a dark forest (Dan Robinson, Aug 2020) - an example of a frontrunning incident to illustrate the adversarial nature of arbitrage on the blockchain
  • Flash Boys 2.0 - Phil Daian's seminal paper on frontrunning and backrunning prior to the introduction of flashbots
  • MEV SOTW Feb 2021 - Charlie Noyes' assessment of the state of MEV
  • MEV Roast - virtual conference on MEV

It's not easy / examples

Tools

  • explore.flashbots.net - dashboard of recent transactions going thru flashbots + estimate of profit (not very accurate)

Developing on the blockchain

Skill tree

Child pages

Programming in Solidity

Learning

Security

Broader page: Security.md

Dev tools

Contract references

Topics

Solana programming

Blockchain core concepts

See L1/Solana.md

SolDev.app

  • SolDev.app - excellent collection of resources, getting better by the day

Tutorials

Learning Rust

Essential

Bonus

Reference

Security

See also: Security.md

Devtools

  • https://www.sollet.io/
  • https://www.spl-token-ui.com/#/
  • Essential CLI tools:
    • solana
    • spl-token
    • anchor

Walkthrough

Please feel free to directly edit: fix inaccuracies, expand content, condense sections, etc.

Goal of this section: condense the above resources and compile useful tidbits (tips, gotchas) to reference in the future.

Let's also take a look at the Serum code itself. The escrow tutorial is excellent, but it would be interesting to look at a directly relevant real-world program.

Overview

Framing

Let's start by framing what Solana programming is. Doing this will help us form the mental model of what exactly we're programming.

At the highest level, you have two kinds of Solana programs:

  • an on-chain program. Analogous to "smart contract". Literally runs on the blockchain and executes "Solana computer instructions".
  • an off-chain app. Doesn't run on the blockchain but interacts with the programs that do.

I find this analogous to any client-server interaction model, where the server (backend) is the on-chain program and the client (frontend) is an app that interacts with it. In fact, writing Solana programs is a lot like implementing RPC servers, REST endpoints, [insert distributed / IPC thing here]. On-chain programs can interact with other on-chain programs, just like how any server can also be a client.

Tooling

Solana supports any programming language that can compile to BPF bytecode. However, the vast majority of the ecosystem and tooling is in Rust. All code in this tutorial is written in Rust, unless otherwise specified.

Relevant Rust crates:

  • solana-program for writing on-chain programs
    • in all likelihood, you will also depend on existing programs in the Solana program library which are published as their own crates, e.g. spl-token
  • solana-sdk, solana-client for off-chain programs

On-chain programs

The solana-program crate exposes a macro aptly named entrypoint!

  • It follows the framework over library model, i.e. they call you

  • So it expects you to pass in a function with a very specific signature. This function is where you will define your logic that ultimately runs on the chain once deployed (more on that later).

  • The inputs to this deployed function are passed in via a transaction

  • Solana program flow:

    • Process the input, including deserialization to determine the user's instructions
    • Do your logic / algorithm
    • Return the serialized outputs in a ProgramResult, a thin wrapper around std::result::Result.
  • See the helloworld example
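The three-step flow above can be sketched without pulling in any crates. The types below are local stand-ins so the example compiles on its own: `ProgramError` and `ProgramResult` only imitate the shape of their solana-program namesakes, and `CounterInstruction`, its byte layout, and the `&mut u64` state are hypothetical. A real program would receive `program_id`, `accounts`, and `instruction_data` via the entrypoint! macro instead.

```rust
/// Local stand-ins, NOT the real solana-program types.
#[derive(Debug, PartialEq)]
pub enum ProgramError {
    InvalidInstructionData,
    ArithmeticOverflow,
}
pub type ProgramResult = Result<(), ProgramError>;

/// Hypothetical instruction set for a toy counter program.
#[derive(Debug, PartialEq)]
pub enum CounterInstruction {
    Increment(u64),
    Reset,
}

impl CounterInstruction {
    /// Hypothetical wire format: one tag byte, then an optional little-endian u64.
    pub fn unpack(input: &[u8]) -> Result<Self, ProgramError> {
        let (&tag, rest) = input
            .split_first()
            .ok_or(ProgramError::InvalidInstructionData)?;
        match tag {
            0 => {
                if rest.len() != 8 {
                    return Err(ProgramError::InvalidInstructionData);
                }
                let mut buf = [0u8; 8];
                buf.copy_from_slice(rest);
                Ok(Self::Increment(u64::from_le_bytes(buf)))
            }
            1 => Ok(Self::Reset),
            _ => Err(ProgramError::InvalidInstructionData),
        }
    }
}

pub fn process_instruction(counter: &mut u64, instruction_data: &[u8]) -> ProgramResult {
    // 1. Process the input: deserialize the user's instruction.
    let instruction = CounterInstruction::unpack(instruction_data)?;
    // 2. Do the logic.
    match instruction {
        CounterInstruction::Increment(n) => {
            *counter = counter
                .checked_add(n)
                .ok_or(ProgramError::ArithmeticOverflow)?;
        }
        CounterInstruction::Reset => *counter = 0,
    }
    // 3. Report success or failure through the result type.
    Ok(())
}
```

Malformed input surfaces as an `Err` rather than a panic, which keeps the failure visible to the caller.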

Anchor

Echo

  • If you've worked through the "Echo program" from the Solana bootcamp, you might be curious how the same program might be written in Anchor.
    • See here for a quick-and-dirty implementation of the above.

Testing

In general, it's a good practice to take a layered approach to testing. For example:

  • for fast and lightweight tests, use regular Rust unit tests. Optionally, use solana-program-test and BanksClient.
  • for something heavier but closer to the "real deal", consider standing up SolanaTestValidator inside your integration.rs.
    • with Anchor, consider also writing "end-to-end" tests using (anchor test), which will also test the IDL.
  • finally, deploy to devnet and testnet before mainnet-beta.

Rust's convention is unit tests (#[cfg(test)]) and integration tests (separate tests subdirectory). In addition to this, Anchor supports end-to-end integration tests via anchor test.

Highlights

Here are some differences to highlight about Anchor:

  • Instead of passing a list of AccountInfo and processing them with next_account_info, all the account-passing is handled by Anchor via struct definitions.
    • Notice how they #[derive(Accounts)]: Anchor does the wiring of account passing and parsing for us. We instead just pass in Context<T>.
  • A lot of the checks / constraints are now pulled out of the actual "processing" and pushed into Anchor macros.
    • This allows us to separate business logic from administrative logic.

Reification

  • The process of turning all the account parsing / processing logic into structs is known as reification. We've transformed code into data, thereby reifying it.
  • The inverse is Church encoding. Turning data into code.
  • Read more here if this excites you.

Anchor has a sample Escrow program. Their version implements an escrow program that is Paulx++: it supports cancellation, and makes use of more advanced and idiomatic Rust. See the code.

Walkthrough: Serum (WIP)

Let's take a look at something more complicated than Escrow. How about the Serum dex itself!

To begin, let's examine the project structure. It looks like a monorepo: both the on-chain program and the app live in the same codebase.

Ok, now let's look for the entrypoint to the on-chain program. Remember that entrypoint! macro mentioned earlier? It's being called in dex/src/lib.rs:


#[cfg(all(feature = "program", not(feature = "no-entrypoint")))]
use solana_program::entrypoint;
#[cfg(feature = "program")]
use solana_program::{account_info::AccountInfo, entrypoint::ProgramResult, pubkey::Pubkey};

#[cfg(feature = "program")]
#[cfg(not(feature = "no-entrypoint"))]
entrypoint!(process_instruction);
#[cfg(feature = "program")]
fn process_instruction(
    program_id: &Pubkey,
    accounts: &[AccountInfo],
    instruction_data: &[u8],
) -> ProgramResult {
    Ok(state::State::process(
        program_id,
        accounts,
        instruction_data,
    )?)
}

Simple enough. Continuing to follow the white rabbit into the process method of state.rs:


#[cfg_attr(not(feature = "program"), allow(unused))]
impl State {
    #[cfg(feature = "program")]
    pub fn process(program_id: &Pubkey, accounts: &[AccountInfo], input: &[u8]) -> DexResult {
        let instruction = MarketInstruction::unpack(input).ok_or(ProgramError::InvalidArgument)?;
        match instruction {
            MarketInstruction::InitializeMarket(ref inner) => Self::process_initialize_market(
                account_parser::InitializeMarketArgs::new(program_id, inner, accounts)?,
            )?,
            MarketInstruction::NewOrder(_inner) => {
                unimplemented!()
            }
            MarketInstruction::NewOrderV2(_inner) => {
                unimplemented!()
            }
            MarketInstruction::NewOrderV3(ref inner) => {
                account_parser::NewOrderV3Args::with_parsed_args(
                    program_id,
                    inner,
                    accounts,
                    Self::process_new_order_v3,
                )?
            }
        ...

Interesting! So it looks like they deserialize the instructions then pattern match to figure out what the client wants to do.

As you might expect, these are standard instructions you'd run on an order book.

APIs and the bigger picture

One thing that stands out here is the NewOrder, NewOrderV2, NewOrderV3 definitions, with the first two unimplemented. It looks a little funky. This appears to be a way of handling API versioning / compatibility.

Here is a key point: a Solana program is ultimately an API. As with all APIs, it's important to think carefully about the interface you're providing.

  • What are you going to do when your program logic changes?
  • How are you going to add features (API additions)?
  • How are you going to deprecate functionality (breaking changes)?

Ok, so why is this important?

When there is an entire ecosystem of other programs or client apps that depend on your on-chain program, it's especially critical to consider these points.

Imagine a dApp developer building a UI on top of the Serum interface. How would they feel if Serum was constantly changing the way they implemented order submission, breaking their app? They'd probably stop using Serum.

In the spirit of decentralization, you should consider it your responsibility to think carefully about your consumers every time you make a change to your program. Did the public API change? Is it really encapsulated from the client? A culture of constant breaking changes will damage morale and eventually drive developers away from the ecosystem.

All this said, as with anything there is a trade-off. Solana programming is still very new. Development is rapid and ongoing. That comes hand-in-hand with breaking changes.

In my opinion, the best thing to do is think for yourself: carefully consider the trade-offs and make an informed decision. Don't blindly break APIs, thinking that's ok just because Solana development is new. Don't chain yourself to previous iterations of your code either -- if your app has evolved, your codebase should evolve with it. Solana explicitly spells out some backwards compatibility guidelines.

API Evolution in Serum (WIP)

Take a look at this pull request releasing Dex V3, a breaking change: https://github.com/project-serum/serum-dex/pull/97. Here's the description:

The primary change is to immediately match incoming orders against the book, instead of first buffering them in the request queue. The request queue still exists to reduce breakage, but is always empty. Because of this, we're forced to remove support for the old order placement and cancellation instructions, since they don't provide the bids and asks accounts which would be necessary in order to process them in the new model.

Some more context on the purpose of the above change:

  • orders used to go into a request queue, not the order book directly.
  • a "user" (i.e. either another on-chain program or client app) would explicitly send a "crank turn" instruction to Serum
  • this "crank turn" pulled requests off the queue and matched them in the order book

PR Observations:

  • All affected instructions implemented a corresponding new version "V2" or "V3", e.g. MarketInstruction::CancelOrderV2
  • The old versions of the instruction were then removed by replacing the body with unimplemented!
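The versioning pattern in these observations can be sketched in isolation. Note the assumptions: `OrderInstruction`, `DexError`, and the byte layout are hypothetical and are not Serum's real definitions, and the sketch returns an explicit error for retired versions where the PR uses unimplemented!. Keeping the retired variants in the enum presumably also keeps the tags of later variants stable for existing clients.

```rust
/// Hypothetical instruction enum with retired and current versions.
#[derive(Debug, PartialEq)]
pub enum OrderInstruction {
    NewOrder,   // tag 0, retired
    NewOrderV2, // tag 1, retired
    NewOrderV3 { limit_price: u64, max_qty: u64 }, // tag 2, current
}

#[derive(Debug, PartialEq)]
pub enum DexError {
    InvalidInstruction,
    Retired,
}

impl OrderInstruction {
    /// Hypothetical wire format: tag byte, then two little-endian u64 values for V3.
    pub fn unpack(input: &[u8]) -> Option<Self> {
        let (&tag, rest) = input.split_first()?;
        match tag {
            0 => Some(Self::NewOrder),
            1 => Some(Self::NewOrderV2),
            2 => {
                if rest.len() != 16 {
                    return None;
                }
                let mut price = [0u8; 8];
                let mut qty = [0u8; 8];
                price.copy_from_slice(&rest[..8]);
                qty.copy_from_slice(&rest[8..]);
                Some(Self::NewOrderV3 {
                    limit_price: u64::from_le_bytes(price),
                    max_qty: u64::from_le_bytes(qty),
                })
            }
            _ => None,
        }
    }
}

/// Old tags still decode, but only the current version is executed.
pub fn process(input: &[u8]) -> Result<(u64, u64), DexError> {
    match OrderInstruction::unpack(input).ok_or(DexError::InvalidInstruction)? {
        OrderInstruction::NewOrder | OrderInstruction::NewOrderV2 => Err(DexError::Retired),
        OrderInstruction::NewOrderV3 { limit_price, max_qty } => Ok((limit_price, max_qty)),
    }
}
```

A client still sending the old tag gets a clear "retired" signal instead of having its bytes misread as a different instruction.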

Solana programming pros and cons

LISP programmers know the value of everything and the cost of nothing

Solana programs are stateless. What does that mean? Programs cannot hold state, so it's passed in via process_instruction.

The consequence is the same input always results in the same output. If you're familiar with the "pure functional programming" paradigm, you'll feel right at home.
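A small sketch of what "state is passed in" looks like in practice. It assumes a hypothetical account layout (an 8-byte little-endian counter at the start of the data) and a hypothetical `increment` function; the real runtime hands account data over through `AccountInfo` rather than a bare slice.

```rust
/// The "program" owns no state. It reads and writes the bytes it is handed.
/// Assumed layout: the first 8 bytes of `account_data` are a little-endian u64 counter.
pub fn increment(account_data: &mut [u8]) -> Result<(), &'static str> {
    if account_data.len() < 8 {
        return Err("account data too small");
    }
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&account_data[..8]);
    let next = u64::from_le_bytes(buf)
        .checked_add(1)
        .ok_or("counter overflow")?;
    account_data[..8].copy_from_slice(&next.to_le_bytes());
    Ok(())
}
```

Running `increment` on two identical byte buffers always leaves them identical, which is the determinism described above.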

Not every chain enforces statelessness. What are the pros / cons to Solana doing this?

Pros:

  • Parallel processing. Sealevel (Solana runtime) knows exactly what data a program depends on and can safely parallelize
  • Statelessness and determinism make it easier to reason about programs, because of referential transparency

Cons:

  • Debugging can be difficult because a lot of data lives outside your program that you have to fetch with RPC
  • APIs for passing around accounts are not that friendly: they're passed as an array, so you have to remember the position-order

Anchor aims to solve some of the cons described. See the above Anchor escrow tutorial as well as angkor wat.

Accounts (WIP)

  • Accounts are just bytearrays &[u8].
  • Accounts have a data field for you to store arbitrary information
  • Executable accounts are programs
  • Account "types" in Anchor: Account, Program, Sysvar, and the different macros
  • Go into how Serum parses the account array into the literal information it needs to execute the order instruction

Highlights

Solana programs are stateless

If the program needs to store state between transactions, it does so using accounts

Programs are constrained to run deterministically, so random numbers are not available

The basic operational unit on Solana is an instruction. An instruction is one call into a program. One or more instructions can be bundled into a message. A message plus an array of signatures constitutes a transaction.
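The nesting of instruction, message, and transaction can be pictured as plain data types. These structs are simplified stand-ins written for this page: the real solana-sdk types carry more fields, and every field name here is an assumption.

```rust
/// One call into a program (simplified, hypothetical fields).
pub struct Instruction {
    pub program_id: [u8; 32],
    pub accounts: Vec<[u8; 32]>,
    pub data: Vec<u8>,
}

/// One or more instructions bundled together.
pub struct Message {
    pub instructions: Vec<Instruction>,
}

/// A message plus an array of signatures.
pub struct Transaction {
    pub signatures: Vec<[u8; 64]>,
    pub message: Message,
}

impl Transaction {
    /// Number of program calls carried by this transaction.
    pub fn instruction_count(&self) -> usize {
        self.message.instructions.len()
    }
}
```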

It's worth reviewing the on-chain programming docs or at least the FAQ. It'll save you a debugging headache.

Gotchas

  • Don't use std::collections::HashMap. You'll get an obscure error because of the "no-randomness" constraint
    • Reason: HashMap<K, V, S = RandomState>. Notice the generic type S defaults to RandomState
    • It may be possible to use HashMap by substituting a different, non-random S - see the with_hasher constructor (I have not tried this myself)
  • Relatedly, don't use the rand crate. If a crate you depend on transitively depends on rand, follow this guide
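For reference, this is what substituting a non-random S looks like with the standard library alone. It compiles off-chain, but whether it avoids the on-chain error has not been verified here either, so treat it as a sketch. `DeterministicMap`, `new_deterministic_map`, and `tally` are hypothetical names. A fixed-key hasher also gives up the hash-flooding protection that RandomState provides.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashMap};
use std::hash::BuildHasherDefault;

/// HashMap whose hasher is built with fixed keys instead of RandomState.
pub type DeterministicMap<K, V> = HashMap<K, V, BuildHasherDefault<DefaultHasher>>;

pub fn new_deterministic_map<K, V>() -> DeterministicMap<K, V> {
    HashMap::with_hasher(BuildHasherDefault::default())
}

/// Alternative that needs no hasher at all: an ordered map.
pub fn tally(words: &[&str]) -> BTreeMap<String, u64> {
    let mut counts = BTreeMap::new();
    for word in words {
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}
```

BTreeMap is often the simpler choice when deterministic iteration order is also desirable.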

Smart contract security

Education / best practices

Solana-specific

See SolanaProgramming/Security

Bughunting challenges:

Real-life vulnerabilities

Summaries

Notable issues and incidents, explained

This is only a sampling! We'd recommend that smart contract devs review all major exploits (the rekt leaderboard is a great starting point) to learn from previous failures.

Re-entrancy

Re-entrancy is a famous and common issue where the attacker can recursively call back into a function before its previous invocation has finished, leaving the contract's state variables in an unexpected state.
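Re-entrancy is usually shown in Solidity, but the control flow can be simulated in plain Rust. In this sketch the "external call" is a callback that receives the vault and may call back into it. The `Vault` type, its fields, and the attack helpers are all hypothetical.

```rust
pub struct Vault {
    pub attacker_balance: u64, // what the vault believes it owes the caller
    pub funds: u64,            // what the vault actually holds
    pub paid_out: u64,         // total sent to the caller so far
}

impl Vault {
    /// Vulnerable: hands control to the caller BEFORE updating its own state.
    pub fn withdraw_vulnerable(&mut self, on_receive: &mut dyn FnMut(&mut Vault)) {
        let amount = self.attacker_balance;
        if amount == 0 || self.funds < amount {
            return;
        }
        self.funds -= amount;
        self.paid_out += amount;
        on_receive(self); // external call; balance is still non-zero here
        self.attacker_balance = 0; // too late
    }

    /// Safer ordering: update state first, then make the external call.
    pub fn withdraw_safe(&mut self, on_receive: &mut dyn FnMut(&mut Vault)) {
        let amount = self.attacker_balance;
        if amount == 0 || self.funds < amount {
            return;
        }
        self.attacker_balance = 0;
        self.funds -= amount;
        self.paid_out += amount;
        on_receive(self);
    }
}

/// The attacker's callback re-enters `withdraw_vulnerable` up to `depth` times.
pub fn attack_vulnerable(vault: &mut Vault, depth: u32) {
    if depth == 0 {
        return;
    }
    vault.withdraw_vulnerable(&mut |v: &mut Vault| attack_vulnerable(v, depth - 1));
}

/// The same attack against the safe version.
pub fn attack_safe(vault: &mut Vault, depth: u32) {
    if depth == 0 {
        return;
    }
    vault.withdraw_safe(&mut |v: &mut Vault| attack_safe(v, depth - 1));
}
```

With a balance of 10 and three nested calls, the vulnerable version pays out 30 while the safe version pays out 10.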

Oracle attacks

Some AMMs provide on-chain oracle functions (i.e. to compute asset prices from the current state of their pools). Unfortunately, this could allow an attacker to manipulate the state of a pool (especially using a flash loan), then do something else on a different protocol which depends on that oracle price. Developers of protocols that depend on on-chain oracles for pricing should be especially cognizant of this.
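A toy calculation shows how little it takes. The sketch assumes a fee-less constant-product pool and a hypothetical lending rule that lets users borrow up to 50% of their collateral's value at the pool's spot price. `Pool`, `max_borrow_usd`, and the numbers are made up for illustration.

```rust
/// Fee-less constant-product pool of TOKEN against USD (hypothetical).
pub struct Pool {
    pub reserve_token: f64,
    pub reserve_usd: f64,
}

impl Pool {
    /// The on-chain "oracle": price implied by the current reserves.
    pub fn spot_price(&self) -> f64 {
        self.reserve_usd / self.reserve_token
    }

    /// Buy TOKEN with USD (e.g. funded by a flash loan). Returns TOKEN received.
    pub fn buy_token_with_usd(&mut self, usd_in: f64) -> f64 {
        let token_out = self.reserve_token * usd_in / (self.reserve_usd + usd_in);
        self.reserve_usd += usd_in;
        self.reserve_token -= token_out;
        token_out
    }
}

/// Hypothetical lending rule on a DIFFERENT protocol that trusts the spot price.
pub fn max_borrow_usd(pool: &Pool, collateral_tokens: f64) -> f64 {
    0.5 * collateral_tokens * pool.spot_price()
}
```

Starting from reserves of 1,000 TOKEN and 1,000 USD, a single 1,000 USD purchase moves the spot price from 1 to 4, so the same 100 TOKEN of collateral supports a borrow of 200 USD instead of 50.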

  • Cream Finance 130M hack (Oct 2021)
    • oracle attack on a lending protocol due to a flawed custom oracle for yearn assets (see also: cream hack analysis)
  • PancakeBunny reward overmint (May 2021)
    • oracle manipulation attack on PancakeBunny AMM
    • attacker gets way too many BUNNY reward tokens for LPing by unstaking in the middle of a massive mispricing from a flashloan
  • Enzyme finance custom oracle bug
    • an issue showing an interesting interaction between a governance token's custom oracle and its support for flashloans
  • Visor finance pricing exploit (Nov 2021)
    • reliance on spot prices for issuing shares
  • Rari pool attack - TWAP manipulation of VUSD
    • a specific pool was seemingly misconfigured to point at a pool with only concentrated liquidity
    • thread includes discussion of how it is easier to manipulate a pool with only concentrated liquidity because trading loss is relatively small
    • discussion of how TWAPs are still vulnerable because single huge input can move average a lot
  • Harvest Finance exploit (Oct 2020)
    • exploiter moved USDT/USDC on Curve up before depositing USDT into Harvest Finance, then down before withdrawing
    • pool share calc uses market price as oracle instead of 1
  • Oracle vulnerabilities
    • samczsun discussion of some famous oracle attacks

Other interesting economic attacks

  • bZx 2020 exploit (Feb 2020)
    • lending protocol bZx allowed fancier functionality than a typical lending protocol, specifically allowing a user to put on a leveraged equity/debt position by routing to an AMM
    • a missing check caused the protocol to be fooled into taking a negative-value position while moving an AMM price way out of line
    • attacker made money by arbing the AMM back into line outside of the lending protocol, while abandoning the negative-value vault
    • another great description of this issue
    • another great description
  • Spartan Protocol LP share value calc issue (May 2021)
    • mechanical flaw in calculation of LP share value in a synthetic asset protocol

Bridge attacks

Bridges are complex because they involve multiple chains, and interaction with a third party. Also, from the perspective of a single chain, transfers to that chain just involve unlocking tokens (or minting claim tokens) from the bridge contract.

Missing checks

Unauthorized access

Frontend attacks

  • BadgerDAO Cloudflare exploit (Dec 2021)
    • frontend attack arising from Cloudflare bug which allowed attackers to preregister API keys by email address without email verification
    • attacker used access to inject malicious scripts that prompted users to authorize tokens via MetaMask.

Logic bugs

Arguably all bugs are logic bugs, but some seem like pure logic issues...

  • Compound overdistribution of governance token (Sep 2021)
  • Popsicle Finance exploit
    • bug in computing users' share of fees when LP shares are transferred
    • notable in that bug had been repeatedly exploited in other contracts, but was missed by creators and auditors
  • MonoX hack (Nov 2021)
    • vAMM protocol for trading synthetics
    • when user swaps A for B, vAMM updates price of A to be lower than before, then updates price of B to be higher than before
    • MonoX didn't prevent corner case where A == B, so user could use this to increase price of B
    • attacker used this repeatedly to pump internal price of MONO token, then swap MONO into a lot of real value
  • Opyn bug (Aug 2020)
    • bug stemming from special case for ETH transfers
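
The MonoX corner case above (A == B) can be illustrated with a toy model. This is a sketch only: the function names, the `prices` dictionary, and the fixed 10% price impact are assumptions made for illustration and are not taken from the MonoX contracts. It shows how computing both new prices from stale reads and then writing them in sequence lets the second write overwrite the first when both tokens are the same.

```python
def swap_price_update(prices, token_in, token_out, impact=0.1):
    """Toy price update: token_in gets cheaper, token_out gets pricier."""
    # Both new prices are computed from values read before any write happens.
    new_in = prices[token_in] * (1 - impact)
    new_out = prices[token_out] * (1 + impact)
    prices[token_in] = new_in
    # If token_in == token_out, this write overwrites the decrease above.
    prices[token_out] = new_out


def safe_swap_price_update(prices, token_in, token_out, impact=0.1):
    """Same update, but rejects the self-swap corner case."""
    if token_in == token_out:
        raise ValueError("token_in and token_out must differ")
    swap_price_update(prices, token_in, token_out, impact)


prices = {"MONO": 1.0, "USDC": 1.0}
for _ in range(10):
    swap_price_update(prices, "MONO", "MONO")  # repeated self-swaps
pumped_price = prices["MONO"]  # grows by a factor of 1.1 per self-swap
```

In this toy model a single equality check is enough to close the hole; a real protocol would need the check wherever prices are updated.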

Other

Not financial or legal advice. Use all tools at your own risk. Please contact crypto-research@jumptrading.com for suggestions or to report errors.

Coin info

Price, official websites, stats, volume by venue, etc:

Charting

Charting for CEXes

Charting for coins only on DEXes

  • Dex.guru - charting for Eth and BSC coins
  • Poocoin - charting for BSC, Matic, and KCC coins
  • Dextools - for BSC, Matic, and KCC coins

CEX order book data

  • Okitoki - side-by-side viewer of multiple order books

Analytics

Address/token info

On-chain analytics

  • Dune Analytics
    • analytics platform for Ethereum, featuring various datasets built from processing blockchain data
  • Glassnode
    • on-chain analytics platform, featuring various datasets built from processing blockchain data
  • Nansen
    • wallet analytics (coin movements, large holders, big transactions, etc) for Ethereum. Also some NFT tools.
  • ETH Gas.watch
    • stats on historical/current Eth gas prices

Macro analytics

  • The Block
    • various timeseries graphs showing state of network utilization, volume, DeFi protocol utilization, marketshare by venue, etc.

DeFi

For DeFi-related tools see DeFi.md

NFTs

For NFT-related tools see NFT.md

VC Activity

Acknowledgement

This twitter thread from @0xShual was extremely useful as a starting point for this page.

Exercises

A collection of reading questions and exercises to help check for understanding.

Blockchain mechanics

Fundamentals

  • What's a cryptographic hash? What are its important properties?
  • What's a digital signature? Properties?
  • They both seem to take some input and scramble it to some output. Can we reuse the same function for both?
  • How does public-key cryptography relate?
  • Given a cryptographic hash function h, can you define an associative hash function H using another function f?
    • You pick f
    • H(a, b) = f(h(a), h(b))
    • H should generally satisfy H(a, H(b, c)) <=> H(H(a, b), c)

Can we reuse the same function?

The answer is no, we can't. To start, a cryptographic hash function should be irreversible, whereas a digital signature has to be reversible (otherwise how would you determine what was "signed"?)

That's a bit literal, so a higher-level answer is that cryptographic hash functions and digital signatures have different requirements, which in turn require the mathematical functions they use to have different properties.

Reversibility happens to be one property directly in conflict.

Associative hash function H

The intuition is to pick an f you already know is associative.

Let f be the associative string concatenation operator (||):

H(a, H(b, c))
<=> f(h(a), f(h(b), h(c)))
<=> h(a) || (h(b) || h(c))
<=> (h(a) || h(b)) || h(c)  # by the associative property of `||`
<=> f(f(h(a), h(b)), h(c))
<=> H(H(a, b), c)

There may be other choices of f: concatenation is used here because it is simple and intuitive. It is also directly relevant to the exercise below on Merkle trees. Merkle trees hash the concatenation itself: h(h(a) || h(b)).
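
As a quick check of the argument, here is a minimal sketch that applies f to already-hashed values. SHA-256 as h and the helper name `merkle_combine` are assumptions for illustration. It confirms that plain concatenation of digests is associative, and that the Merkle-style combination, which hashes the concatenation, is not.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def f(x: bytes, y: bytes) -> bytes:
    # Plain concatenation of digests: associative by construction.
    return x + y


def merkle_combine(x: bytes, y: bytes) -> bytes:
    # Merkle-style combination: hash the concatenation into a fixed-size digest.
    return h(x + y)


ha, hb, hc = h(b"a"), h(b"b"), h(b"c")

concat_is_associative = f(ha, f(hb, hc)) == f(f(ha, hb), hc)
merkle_is_associative = (
    merkle_combine(ha, merkle_combine(hb, hc))
    == merkle_combine(merkle_combine(ha, hb), hc)
)
```

Note that the concatenated output grows with every combination, while the Merkle-style output stays at a fixed size.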

Follow-up: is concatenating two hashes cryptographically secure? Can H be a cryptographic hash function?

One way to answer is to revisit the above question "important properties of a cryptographic hash function" and think about whether concatenation would violate any of them.

Data structures

Describe a blockchain (data structure). What's the use-case?

A blockchain is a linked list with hash pointers. Its components:

  • .prev pointer
  • .prev_hash pointer (cryptographic hash of the previous node aka "block")

The use-case is tamper detection. If any node in the blockchain is altered, we'll know because the hash will no longer match. Therefore, you can always check that the blockchain is valid by iterating from the head of the list.

Pseudocode: for every node curr, hash prev and check that result hash equals curr.prev_hash
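
The pseudocode above could be fleshed out roughly as follows. This is a minimal sketch: the `Block` class, the `data` payload field, the choice of SHA-256, and hashing `prev_hash + data` are all assumptions made for illustration, not a description of any particular chain's block format.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    data: bytes
    prev: Optional["Block"] = None  # ordinary pointer to the previous block
    prev_hash: bytes = b""          # hash of the previous block


def block_hash(block: Block) -> bytes:
    # Covers the payload and the stored hash pointer, so a change to any
    # ancestor propagates to every later block's hash.
    return hashlib.sha256(block.prev_hash + block.data).digest()


def append(head: Optional[Block], data: bytes) -> Block:
    if head is None:
        return Block(data)
    return Block(data, prev=head, prev_hash=block_hash(head))


def is_valid(head: Optional[Block]) -> bool:
    # For every node curr, hash prev and compare with curr.prev_hash.
    curr = head
    while curr is not None and curr.prev is not None:
        if block_hash(curr.prev) != curr.prev_hash:
            return False
        curr = curr.prev
    return True


head = None
for payload in (b"genesis", b"tx-1", b"tx-2"):
    head = append(head, payload)

chain_ok_before = is_valid(head)
head.prev.prev.data = b"tampered"  # alter the oldest block
chain_ok_after = is_valid(head)
```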

What's a Merkle tree? Use-case?

A Merkle tree is a clever data structure that reduces time complexity by leveraging the fact that hashes are composable. You can combine two hashes h1 and h2 via H(h1, h2) to produce a third hash h3. If either h1 or h2 changes, h3 changes.

The use-case is efficient verification. It takes O(n) time to verify a block is part of a blockchain, where n is the number of blocks. This is because you have to start from the head and check the hashes until you get to that given block.

With a Merkle tree, you overlay a tree on the blockchain such that all the leaves of the tree correspond to the original blocks. Each parent is a composite hash, and the parent's parent is a composite of composite hashes, and so on. Diagram.

Now, to verify a block is part of the chain, you only need a path through the tree. This is O(h), where h is the height of the tree. Then, by keeping the tree balanced, the complexity is logarithmic.
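
The path-based check can be sketched as below. The function names, SHA-256, and the rule of duplicating the last node on odd-sized levels are assumptions chosen to keep the example short; real systems differ in these details. The proof holds one sibling hash per level, so its length grows with the height of the tree rather than with the number of blocks.

```python
import hashlib
from typing import List, Tuple


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree, from hashed leaves up to the root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels (assumed convention)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1  # the other node in the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


blocks = [f"block-{i}".encode() for i in range(5)]
levels = build_levels(blocks)
root = levels[-1][0]
proof = prove(levels, 4)
```

Here `verify(blocks[4], proof, root)` recomputes only the hashes along one path, while a changed block or a wrong root makes the final comparison fail.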

Design a crypto protocol

Doxxcoin / Anoncoin

You are the designer of a currency Doxxcoin, and you want to implement a protocol Anoncoin that allows Doxxcoin holders to "anonymize" their coins.

Rules

  • The definition of anonymize is to disassociate a given Doxxcoin from all its prior transactions and addresses.
  • Doxxcoin works like Bitcoin, so you can trace all of a given coin's addresses using the transaction ledger.
  • You have the ability to mint and burn Doxxcoin and Anoncoin.
  • Users should still be able to transact with Doxxcoin after anonymization.
  • For the sake of simplicity, disregard units or amounts.

Assume also you have the means to construct a zero-knowledge proof that satisfies the predicate Zk(f, x): ∃x, f(x) ∈ {s1, s2, ... , sn}

  • Predicate: "There exists an x such that f(x) is in the set S = {s1, s2, ..., sn}"
  • Zero-knowledge proof: "I know x such that f(x) is in S, without giving away x"
  • You can pick what f and x are.
  • The Zk-proof can be treated as a black-box and used anywhere in the scheme.

Can you come up with a cryptographic scheme Anoncoin that anonymizes Doxxcoin?

Hint

commitment, escrow pool, burn to mint, trust-less

Hint 2

  • What if you pooled money together?
  • How can you leverage your treasury superpowers? You can issue (mint) or remove (burn) currency from circulation.
  • Can this scheme be completely trust-less? How might the Zk-proof help?

Solution

The key idea is to anonymize by pooling Doxxcoin together into a collective Anoncoin escrow pool then redeeming Doxxcoin from that pool. The coin you get out is not the same coin you put in. More importantly, the coin you get out cannot be associated in any way with the coin you put in.

Imagine you put physical cash in an envelope along with some proof of ownership (identity / public key), and add it to a pool of envelopes. This physical cash has your fingerprints, as well as those of everyone before you (analogous to public keys and tx input / output addresses).

The safest way to ensure anonymity is to put the envelopes in a vault, taking that money out of circulation ("burning") and minting new money to replace it. You can't just shuffle the envelopes and give back someone else's cash, because that still leaks information about history.

Now you have some freshly-minted money which has no transaction history. It's still Doxxcoin so you can spend it just like you otherwise would.

In order for this to work, the owner / custodian of the pool needs some way to determine whether you have added money before handing out new Doxxcoin.

The obvious solution is that the custodian opens the envelope when you ask to redeem Doxxcoin from the pool (recall you stored your identity in the envelope to indicate that it belongs to you). They do this to verify your ownership before crediting you with newly-minted Doxxcoin. The problem is that this requires trust: the custodian has now linked the newly-minted Doxxcoin with your original, and it's no longer anonymous. Centralization and trust are what we want to avoid!

This is where ZK-proofs come in. The ZK-proof allows you to prove you own one of the envelopes without ever opening it.

The other thing we need to do is model this notion of an "envelope" with cryptography. We can do this using a commitment. Indeed, a commitment is often explained using this envelope analogy.

The Anoncoin scheme is as follows:

  1. Anonymize:
    1. Create a transaction with two inputs: commit(identity) and a Doxxcoin.
    2. Mint an Anoncoin which represents the commitment commit(identity) and add it to a pool of Anoncoin.
    3. The input Doxxcoin is burned.
  2. Redeem:
    1. Construct a Zk-proof Zk(f, x), where f = commit and x = identity
    2. Then, the proof shows "I know x, where f(x) = commit(identity) and f(x) is in the pool of commitments {s1, s2, ...}" (see problem statement)
    3. Here the commitment pool is in fact a pool of Anoncoin. The Zk-proof proves you minted one of the Anoncoin without revealing your identity (x).
    4. The protocol (pool custodian) verifies Zk(f, x).
    5. The protocol then outputs newly-minted Doxxcoin. Because it's zero-knowledge, the protocol does not learn what your identity (x) is at any point, thus anonymizing.

The beauty of Zk-proofs is everything can be done with pure cryptography. No trusted third-party needed.

This scheme is based on the Zerocoin protocol, with details omitted for simplicity.

Follow-up

There is a problem with the scheme described in the solution above. Can you spot it?

Hint

UTXO

Solution

There's a double-spend problem. I can keep constructing Zk-proofs, and getting newly-minted Doxxcoin.

This is because the Zk-proof makes it such that my envelope is never opened: in fact the protocol doesn't even know which envelope is mine. I could have multiple envelopes, so it can't just restrict me to redeem / spend once.

The solution is to include as part of the commitment a unique mint_id.

  • Anonymize: the mint_id parameter is added to commit, so the commitment becomes commit(mint_id, identity)
  • Redeem: the mint_id is also included with the Zk-proof. The Anoncoin protocol records every redeemed mint_id and rejects any redeem whose mint_id has been used before
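
The bookkeeping for this fix can be sketched as follows. Everything here is hypothetical scaffolding: `AnoncoinPool`, `mock_zk_proof`, and the SHA-256 based `commit` are stand-ins chosen for illustration, and the mock proof is not zero-knowledge at all. It only plays the role of the black-box Zk(f, x) so that the mint_id tracking can be exercised.

```python
import hashlib
import secrets


def commit(mint_id: bytes, identity: bytes) -> bytes:
    # Hash-based stand-in for a commitment (an assumption, not Zerocoin's
    # actual construction). mint_id has a fixed length of 16 bytes here.
    return hashlib.sha256(mint_id + identity).digest()


def mock_zk_proof(identity: bytes):
    # Placeholder for the black-box Zk(f, x). A real proof would convince
    # the verifier without the verifier's code ever touching `identity`.
    def proof(mint_id: bytes, commitments: set) -> bool:
        return commit(mint_id, identity) in commitments
    return proof


class AnoncoinPool:
    def __init__(self):
        self.commitments = set()     # the pool of Anoncoin
        self.spent_mint_ids = set()  # mint_ids seen in earlier redeems

    def anonymize(self, commitment: bytes) -> None:
        # The input Doxxcoin would be burned at this point.
        self.commitments.add(commitment)

    def redeem(self, mint_id: bytes, proof) -> bool:
        if mint_id in self.spent_mint_ids:
            return False  # double-spend attempt
        if not proof(mint_id, self.commitments):
            return False  # no matching commitment in the pool
        self.spent_mint_ids.add(mint_id)
        return True       # newly-minted Doxxcoin would be issued here


pool = AnoncoinPool()
identity = b"alice-secret"
mint_id = secrets.token_bytes(16)
pool.anonymize(commit(mint_id, identity))

first_redeem = pool.redeem(mint_id, mock_zk_proof(identity))
second_redeem = pool.redeem(mint_id, mock_zk_proof(identity))
forged_redeem = pool.redeem(secrets.token_bytes(16), mock_zk_proof(identity))
```

The first redeem succeeds, the repeat with the same mint_id is rejected, and a redeem with a mint_id that was never committed finds no matching commitment.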

Other reading lists

Here are some other helpful compendiums of crypto knowledge:

Directories of crypto projects

Researchers

Here are a few places to read interesting original research:

News