Main Page

From CS 260r 2014
Revision as of 08:45, 19 April 2014 by Eddie Kohler (Talk | contribs)

CS260r 2014

CS260r is Topics and Close Readings in Computer Systems, spring 2014. Our topic: Cloud Big Data Systems.

Reading list

  • 4/14 Fabric: A platform for secure distributed computation and storage (alternate link). Jed Liu, Michael D. George, K. Vikram, Xin Qi, Lucas Waye, and Andrew C. Myers. Proc. SOSP’09. (Lucas presented)
    • 4/14 Question: To improve Fabric’s resiliency to failure, one could implement storage nodes using Paxos or VR groups. New object versions would be stored in a replicated log. Contrast this hypothetical VR log with the distributed transaction logs that Fabric uses to commit transactions. Specifically, describe the contents of entries in the hypothetical VR log, and the contents of entries in the Fabric distributed transaction logs.
  • 4/2 Transaction chains: achieving serializability with low latency in geo-distributed systems. Yang Zhang, Russell Power, Siyuan Zhou, Yair Sovran, Marcos K. Aguilera, and Jinyang Li. Proc. SOSP’13. (Scott presented)
    • 4/2 Question: Transaction chains, like Spanner, can fall back on two-phase locking and two-phase commit for transaction execution. Explain how Spanner’s two-phase locking differs from that of transaction chains. For instance, when does each system use two-phase locking, and in each system what granularity of data is locked?
  • 3/31 Spanner: Google’s globally distributed database. 26 authors. Proc. OSDI’12. (Eddie presented)
    • 3/31 Question: Describe two different ways Spanner’s user-visible behavior would change if the TrueTime API’s ε value (the error bound) changed a lot.
  • 3/26 In search of an understandable consensus algorithm. Diego Ongaro and John Ousterhout. Technical report. (Nate presented)
    • 3/26 Question: Describe a scenario (e.g., number of servers, failure pattern, client requests) where Raft would do something substantively different than viewstamped replication (e.g., commit a different client request, recover a different server). Explain what Raft would do and what VR would do.
  • 3/12 Transactional storage for geo-replicated systems. Yair Sovran, Russell Power, Marcos K. Aguilera, and Jinyang Li. Proc. ACM SOSP’11. (Marco presented)
    • 3/12 Question: Describe an execution of two or more transactions, written for an application like the Walter ReTwis (using their description as a reference), that produces a result under Walter’s PSI that is impossible under normal snapshot isolation.
  • 2/26 Fast crash recovery in RAMCloud. Diego Ongaro, Stephen M. Rumble, Ryan Stutsman, John Ousterhout, and Mendel Rosenblum. Proc. ACM SOSP’11. (Stephen presented)
    • 2/26 Question: Section 3.5.2 discusses how RAMCloud ensures that a recovery master can detect that a recovered log is complete (has at least one copy of every segment). The design uses a recovery digest, and a protocol that occurs every time a new segment replica is added: a new digest is inserted in the new replica and marked “active”; once this is persisted, the previous active digest is marked as inactive. Does this design require that RAMCloud block client requests while waiting for a disk write? If yes, then how often does this happen? If no, then why not?
  • 2/12 Background: Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems. Amar Phanishayee, Elie Krevat, Vijay Vasudevan, David G. Andersen, Gregory R. Ganger, Garth A. Gibson, and Srinivasan Seshan. Proc. FAST’08 (USENIX Conference on File and Storage Technologies), Feb. 2008.
  • 2/12 Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication. Vijay Vasudevan, Amar Phanishayee, Hiral Shah, Elie Krevat, David G. Andersen, Gregory R. Ganger, Garth A. Gibson, and Brian Mueller. Proc. SIGCOMM’09, Aug. 2009. (Bob presented)
  • 2/12 Data Center TCP (DCTCP). Mohammad Alizadeh, Albert Greenberg, David A. Maltz, Jitendra Padhye, Parveen Patel, Balaji Prabhakar, Sudipta Sengupta, and Murari Sridharan. Proc. SIGCOMM’10, Aug. 2010. (Lucas presented)
    • 2/12 Question: The Vasudevan et al. paper (call it “fine-grained RTO”) and the Alizadeh et al. paper (“DCTCP”) both address the incast problem, but they address it at different scales. What are those scales? Does either paper dominate the other with respect to the incast problem? Does either paper dominate the other overall?
  • 2/10 Background: Events Can Make Sense. Maxwell Krohn, Eddie Kohler, and M. Frans Kaashoek. Proc. 2007 USENIX Annual Technical Conference, June 2007, 87-100.
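
For readers working on the 2/26 question, the digest handoff it describes can be sketched as below. This is a minimal illustration of the ordering stated in the question, not RAMCloud's code: the names `Digest`, `Replica`, `add_segment`, and `is_complete` are hypothetical, the `persisted` flag stands in for a durable disk write, and the idea that a digest lists the IDs of every segment in the log is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Digest:
    # Assumption: a digest records the ID of every segment in the log.
    segment_ids: frozenset
    active: bool = False


@dataclass
class Replica:
    segment_id: int
    digest: Optional[Digest] = None
    persisted: bool = False  # stand-in for "written durably to disk"


def add_segment(log: List[Replica], segment_id: int) -> Replica:
    """Add a segment replica using the ordering given in the question."""
    previous = next(
        (r for r in log if r.digest is not None and r.digest.active), None
    )
    ids = frozenset(r.segment_id for r in log) | {segment_id}

    # Step 1: insert a new digest in the new replica and mark it active.
    new = Replica(segment_id, Digest(ids, active=True))
    log.append(new)

    # Step 2: wait until the new replica (and its digest) is persisted.
    new.persisted = True

    # Step 3: only then mark the previously active digest inactive.
    if previous is not None:
        previous.digest.active = False
    return new


def is_complete(digest: Digest, found_segment_ids) -> bool:
    """A recovered log is complete if every listed segment was found."""
    return digest.segment_ids <= set(found_segment_ids)
```

Between steps 1 and 3 two digests are marked active at once; the sketch does not model a crash in that window or how recovery would choose between them.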

Notes

Code

Background:

Paxos Benchmarks:

Chubby Implementations: