Reading List

Ordering

Consensus

Paxos optimizations/variants

  • 2/11: Dan RK Ports, Jialin Li, Vincent Liu, Naveen Kr Sharma, and Arvind Krishnamurthy. Designing distributed systems using approximate synchrony in data center networks. In 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI 15). 2015.
  • 2/13: Seo Jin Park and John Ousterhout. 2019. Exploiting Commutativity For Practical Fast Replication. In Proceedings of the 16th Symposium on Networked Systems Design and Implementation (NSDI ’19).
  • Optional:
    • Leslie Lamport. 2005. Fast Paxos. MSR-TR-2005-112.
    • Leslie Lamport. 2005. Generalized Consensus and Paxos.
    • Aishwarya Ganesan, Ramnatthan Alagappan, Andrea Arpaci-Dusseau, and Remzi Arpaci-Dusseau. Exploiting Nil-Externality for Fast Replicated Storage. SOSP 2021.
    • Charapko, Aleksey, Ailidani Ailijiang, and Murat Demirbas. "Linearizable quorum reads in Paxos." In 11th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 19). 2019.
    • Moraru, Iulian, David G. Andersen, and Michael Kaminsky. "Paxos quorum leases: Fast reads without sacrificing writes." In Proceedings of the ACM Symposium on Cloud Computing, pp. 1-13. 2014.
    • Manos Kapritsos, Yang Wang, Vivien Quema, Allen Clement, Lorenzo Alvisi, and Mike Dahlin. 2012. All About Eve: Execute-verify Replication for Multi-core Servers. In Proceedings of the 10th Symposium on Operating Systems Design and Implementation (OSDI ’12).
    • Iulian Moraru, David G. Andersen, and Michael Kaminsky. There is more consensus in egalitarian parliaments. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles (SOSP), 2013.

Byzantine

Consensus: Practice

Consistency models

Distributed transactions

Shared logs

Student presentations

New hardware: RDMA, disaggregated memory

  • Optional: other emerging hardware
      • Wang et al. Replicating Persistent Memory Key-Value Stores with Efficient RDMA Abstraction. OSDI 2023.
      • J. Li, E. Michael, and D. R. K. Ports. Eris: Coordination-free consistent transactions using in-network concurrency control. SOSP 17.
      • Henry N. Schuh, Weihao Liang, Ming Liu, Jacob Nelson, and Arvind Krishnamurthy. 2021. Xenic: SmartNIC-Accelerated Distributed Transactions. SOSP 21.
      • Thomas E. Anderson, Marco Canini, Jongyul Kim, Dejan Kostic, Youngjin Kwon, Simon Peter, Waleed Reda, Henry N. Schuh, and Emmett Witchel. Assise: Performance and Availability via Client-local NVM in a Distributed File System. OSDI 20.
      • Jongyul Kim, Insu Jang, Waleed Reda, Jaeseong Im, Marco Canini, Dejan Kostić, Youngjin Kwon, Simon Peter, and Emmett Witchel. LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism. SOSP 21.
  • Virtual Machine Fault-Tolerance

    Fault-tolerant computing

    Fault-tolerant Training