A brief introduction to the concept of distributed systems. The goal
is to explain the importance of a course about theory of distributed systems;
in this sense, we will show that the design and implementation of effective
distributed systems is not easy; issues related to ``imperfect'' computation
and communication makes it substantially more difficult than designing
centralized algorithms.
Lectures: 21/9
Slides: [pdf]
Additional notes: [pdf]
Reading material:
This lecture presents the basic abstractions we use to model a distributed system composed of computational entities (processes) communicating by exchanging messages
Lectures: 23/9, 28/9
Slides: [pdf]
Reading material and topics:
- Abstracting computation [CDK05, Section 2.2],[GR04, Chapter 2].
- Abstracting communication [CDK05, Section 2.3],[GR04, Chapter 2].
- Abstracting time [CDK05, 2.4],[GR04, Chapter 2].
- Distributed models [CDK05, 2.4],[GR04, Chapter 2].
- Specifications: liveness and safety [AS87]
- A Modular Approach to Fault-Tolerant Broadcasts and Related Problems [HT93, Section 2]
- Logical time and logical clocks [BM93]
This lecture introduces the concept of logical time, showing how it can
be used to order distributed events and debug distributed applications.
Lectures: 28/9, 30/9
Slides: [pdf]
Reading material and topics: [BM93]
- Global states
- Distributed debugging
This lecture explains how to obtain reliable communication ``many-to-many'', from
multiple nodes to multiple nodes. The relation between this problem and more
complex forms of coordination is explained.
Lectures: 5/10, 7/10
Slides: [pdf]
Reading material and topics: [HT93]
- Reliable Broadcast, Uniform Reliable Broadcast
- Ordering Properties (FIFO, Causal)
- Atomic Broadcast
This set of lectures will show how biology-inspired techniques (epidemics)
can be used to obtain protocols characterized by efficiency and robustness,
with the only drawback that guarantees are not deterministic any more, but
only probablistic.
Lectures: 7/10
Slides: [pdf]
Reading material and topics: [DGH+87]
- Anti-entropy protocols
- Gossip protocols
This set of lectures is devoted to one of the main problems in distributed computing:
how to reach an agreement among a collection of distributed processes. We will first
show that Consensus cannot be reached in an asynchronous system [FLP85], and we will
later shows how consensus can be solved ``in practice'' in real systems [CT96].
Lectures: 19/10, 21/10
Reading material and topics:
- Impossibility of Consensus [FLP85]
Slides: Lecture at the blackboard [pdf]
- Consensus: Beyond Impossibility [CT96,MR99,AT98]
Slides: [pdf]
Additional reading material:
- [Tus96] proves that "◊W" is the weakest failure detector
to solve Consensus
- [Gue00] introduces the concept of ``indulgent algorithms'',
i.e. algorithms that tolerate unreliable failure detectors.
- [AT98] shows how the Consensus problem can be
solved through an hybrid approach (FD + randomization)
- [BO83] is the original paper of Ben-Or about
randomized Consensus.
- [Asp03] is a survey about randomized Consensus
protocols.
This lecture introduces the concept of aggregation, which is the
collective name of a set of functions that provide statistical
information about a system.
Lectures: 26/10
Slides: [pdf]
Reading material and topics:
A transaction is the execution of a sequence of actions that must be either
entirely completed or aborted, independently of other transactions.
A distributed transaction involves several servers and/or databases,
and needs special protocols to be implemented.
Lectures: 28/10
Slides: [pdf]
Reading material and topics: [BT93]
- Recap about transactional model
- Atomic commitment
This lecture introduces the concept of consistency policies
for replicated systems.
Lectures: 2/11, 4/11
Slides: [pdf]
Reading material and topics: [Tv07, Chapter 7], [Gho06, Chapter 16]
- Consistency models
- Passive vs active replication
- Primary-backup [BMST93]
- State machine [Sch93]
Additional reading material:
- T. Bressoud and F. Schneider. Hypervisor-Based Fault Tolerance. ACM TOCS, 14(1):80-107, 1996. [pdf]
- B. Cully et al. Remus: High Availability via Asynchronous Virtual Machine Replication. In proceedings of NSDI'08, 2008. [pdf]
This lecture discusses a middleware tool called group communication,
where multiple groups of processes or objects cooperate through simple
primitives to provide replicated services to clients.
Lectures: 4/11
Slides: [pdf]
Reading material:
- Group Communication [CKV01]
This lecture introduces the peer sampling problem, which is
the large-scale equivalent of group membership in traditional systems.
Lectures: 9/11
Slides: [pdf]
Topics and Reading material:
This lecture introduces the Peersim simulator, which will be used for
the laboratory projects.
Lectures: 11/11
Slides: [pdf]
Additional material: Peersim web site
This lecture discusses the differences between structured and unstructured
P2P systems and briefly introduces some well-known protocols.
Lectures: 18/11, 23/11)
Slides:
[pdf],
[pdf],
[pdf]
Reading material and topics:
This short lecture discusses Paxos, a very practical Consensus protocol
which is actually used in several real projects - including the Google
file system.
Lectures: 30/11
Slides: [pdf]
Reading material: [Lam01]
Additional reading material
- A paper about the Chubby lock service using Paxos [].
- A nice paper about the fine details of implementing Paxos in
a real system [CGR07].
Lectures: 9/12
Slides: [pdf]
Reading material: [GPJ10]
Additional reading material: [BvS08]
We give a look to a paper that has recently re-started the research
in the area of byzantine fault tolerance.
Lectures: 2/12
Slides: [pdf]
Reading material: [CL99]
Additional reading material:
- A paper introducing Zyzzyva, a speculative byzantine protocol
that further optimized PBFT in the common case [KCW+07].
- A paper introducing Aardvark, a novel approach to byzantine
protocol trying to make them highly available even under attack.
In rollback-recovery systems, processes achieve fault-tolerance by
using ``stable storage'' to periodically save recovery information
that can be used to re-start execution in case of failures.
Lectures: 14/12
Slides: [pdf]
Reading material: [EAWJ02]
This lecture introduces novel results on peer-to-peer video streaming..
Lectures: 16/12 (last year)
Slides:
[pdf]
Reading material: [DBGM02,ZLLY05]
Additional reading material: [hCRSZ02,SF06]
- CDK05
-
George Coulouris, Jean Dollimore, and Tim Kindberg.
Distributed systems (4th ed.): concepts and design.
Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2005.
[Bibtex].
- GR04
-
Rachid Guerraoui and Louis Rodrigues.
Reliable Distributed Programming.
Springer-Verlag, Berlin, Germany, 2004.
[Bibtex].
- Tv07
-
Andrew Tanenbaum and Maarten van Steen.
Distributed Systems: Principles and Paradigms (2nd ed.).
Prentice Hall, 2007.
[Bibtex].
- AS87
-
Bowen Alpern and Fred B. Schneider.
Recognizing safety and liveness.
Distributed Computing, 2(3):117-126, 1987.
[PDF],
[Bibtex].
- HT93
-
Vassos Hadzilacos and Sam Toueg.
A modular approach to fault-tolerant broadcasts and related problems.
In Sape Mullender, editor, Distributed Systems (2nd edition).
Addison-Wesley, 1993.
Chapter 5.
[PDF],
[Bibtex].
- BM93
-
Ozalp Babaoglu and Keith Marzullo.
Consistent global states of distributed systems: Fundamental concepts
and mechanisms.
In Sape Mullender, editor, Distributed Systems (2nd edition).
Addison-Wesley, 1993.
Chapter 4.
[PDF],
[Bibtex].
- DGH+87
-
Alan J. Demers, Daniel H. Greene, Carl Hauser, Wes Irish, John Larson, Scott
Shenker, Howard E. Sturgis, Daniel C. Swinehart, and Douglas B. Terry.
Epidemic algorithms for replicated database maintenance.
In In Proceedings of the 6th annual ACM Symposium on Principles
of Distributed Computing Systems (PODC'87), pages 1-12, 1987.
[PDF],
[Bibtex].
- FLP85
-
Michael J. Fischer, Nancy A. Lynch, and M. S. Paterson.
Impossibility of distributed consensus with one faulty processor.
Journal of the ACM, 32(2):374-382, April 1985.
[PDF],
[Bibtex].
- CT96
-
Tushar Deepak Chandra and Sam Toueg.
Unreliable failure detectors for reliable distributed systems.
Journal of the ACM, 43(2):225-267, 1996.
[PDF],
[Bibtex].
- MR99
-
Achour Mostéfaoui and Michel Raynal.
Solving consensus using chandra-toueg's unreliable failure detectors:
A general quorum-based approach.
In In Proceedings of the 13th International Symposium on
Distributed Computing (DISC'00), pages 49-63, Bratislava, Slavak Republic,
1999.
[PDF],
[Bibtex].
- AT98
-
Marcos Kawazoe Aguilera and Sam Toueg.
Correctness proof of ben-or’s randomized consensus algorithm.
Technical Report TR98-1682, Cornell University, 1998.
[PDF],
[Bibtex].
- Tus96
-
Vassos Hadzilacos Sam Toueg Tushar Deepak Chandra.
The weakest failure detector for solving consensus.
Journal of the ACM, 43(4):685-722, 1996.
[PDF],
[Bibtex].
- Gue00
-
Rachid Guerraoui.
Indulgent algorithms.
In In Proceedings of the 19th annual ACM Symposium on Principles
of Distributed Computing Systems (PODC'00), pages 49-63, Portland, OR,
2000.
[PDF],
[Bibtex].
- AT98
-
Marcos Kawazoe Aguilera and Sam Toueg.
Failure detection and randomization: A hybrid approach to solve
consensus.
SIAM Journal of Computing, 1998.
[PDF],
[Bibtex].
- BO83
-
Michael Ben-Or.
Another advantage of free choice: Completely asynchronous agreement
protocols (extended abstract).
In In Proceedings of the 2nd annual ACM Symposium on Principles
of Distributed Computing Systems (PODC'83), pages 27-30, 1983.
[PDF],
[Bibtex].
- Asp03
-
James Aspnes.
Randomized protocols for asynchronous consensus.
Distributed Computing, 16(2-3):165-175, September 2003.
[PDF],
[Bibtex].
- JMB05
-
Márk Jelasity, Alberto Montresor, and Ozalp Babaoglu.
Gossip-based aggregation in large dynamic networks.
ACM Trans. Comput. Syst., 23(1):219-252, August 2005.
[PDF],
[Bibtex].
- vRBV03
-
Robbert van Renesse, Kenneth Birman, and Werner Vogels.
Astrolabe: A robust and scalable technology for distributed system
monitoring, management, and data mining.
ACM Trans. Comput. Syst., 21(2):164-206, May 2003.
[PDF],
[Bibtex].
- BT93
-
Ozalp Babaoglu and Sam Toueg.
Understanding non-blocking atomic commitment.
In Sape Mullender, editor, Distributed Systems (2nd edition).
Addison-Wesley, 1993.
Chapter 6.
[PDF],
[Bibtex].
- Gho06
-
Sukumar Ghosh.
Distributed Systems: An Algorithmic Approach.
Chapman & Hall/CRC, 2006.
[Bibtex].
- BMST93
-
Navin Budhiraja, Keith Marzullo, Fred Schneider, and Sam Toueg.
The primary-backup approach.
In Sape Mullender, editor, Distributed Systems (2nd edition).
Addison-Wesley, 1993.
Chapter 8.
[PDF],
[Bibtex].
- Sch93
-
Fred Schneider.
Replication management using the state machine approach.
In Sape Mullender, editor, Distributed Systems (2nd edition).
Addison-Wesley, 1993.
Chapter 7.
[PDF],
[Bibtex].
- CKV01
-
Gregory Chockler, Idit Keidar, and Roman Vitenberg.
Group communication specifications: a comprehensive study.
ACM Comput. Surv., 33(4):427-469, 2001.
[PDF],
[Bibtex].
- JGKv04
-
Márk Jelasity, Rachid Guerraoui, Anne-Marie Kermarrec, and Maarten van
Steen.
The peer sampling service: Experimental evaluation of unstructured
gossip-based implementations.
In Hans-Arno Jacobsen, editor, Proceedings of Middleware 2004,
ACM/IFIP/USENIX International Middleware Conference, volume 3231 of Lecture Notes in Computer Science, pages 79-98, Toronto, Canada, 2004.
Springer.
[PDF],
[Bibtex].
- VGv05
-
Spyros Voulgaris, Daniela Gavidia, and Maarten van Steen.
CYCLON: Inexpensive membership management for unstructured p2p
overlays.
J. Network Syst. Manage., 13(2), 2005.
[PDF],
[Bibtex].
- JHv07
-
Gian Paolo Jesi, David Hales, and Maarten van Steen.
Identifying malicious peers before it's too late: A decentralized
secure peer sampling service.
In Proceedings of the First International Conference on
Self-Adaptive and Self-Organizing Systems (SASO'07), pages 237-246, Boston,
MA, 2007. IEEE Computer Society.
[PDF],
[Bibtex].
- SMK+01
-
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari
Balakrishnan.
Chord: A scalable peer-to-peer lookup service for internet
applications.
In Proceedings of the 2001 Conference on Applications,
Technologies, Architectures, and Protocols for Computer Communications
(SIGCOMM), pages 149-160, San Diego, CA, 2001. ACM, ACM Press.
[PDF],
[Bibtex].
- MM02
-
P. Maymounkov and D. Mazieres.
Kademlia: A peer-to-peer information system based on the XOR
metric.
In Proceedings of the 1st International Workshop on Peer-to-Peer
Systems (IPTPS’02), pages 258-263. Springer, 2002.
[PDF],
[Bibtex].
- Coh03
-
B. Cohen.
Incentives Build Robustness in BitTorrent.
In Proc. of the Workshop on Economics of Peer-to-Peer Systems,
2003.
[PDF],
[Bibtex].
- Lam01
-
L. Lamport.
Paxos made simple.
ACM SIGACT News, 32(4):18-25, 2001.
[PDF],
[Bibtex].
- CGR07
-
Tushar Deepak Chandra, Robert Griesemer, and Joshua Redstone.
Paxos made live: an engineering perspective.
In Indranil Gupta and Roger Wattenhofer, editors, Proceedings of
the Twenty-Sixth Annual ACM Symposium on Principles of Distributed Computing
(PODC'07), pages 398-407, Portland, Oregon, USA, August 2007. ACM.
[PDF],
[Bibtex].
- GPJ10
-
Maarten van Steen Gian Paolo Jesi, Alberto Montresor.
Secure peer sampling.
Elsevier Computer Networks – Special Issue on Collaborative
Peer-to-Peer Systems, 2010.
accepted.
[PDF],
[Bibtex].
- BvS08
-
Arno Bakker and Maarten van Steen.
Puppetcast: A secure peer sampling protocol.
In Proc. of the European Conference onComputer Network Defense,
pages 3-10. IEEE Computer Society, 2008.
[PDF],
[Bibtex].
- CL99
-
Miguel Castro and Barbara Liskov.
Practical byzantine fault tolerance.
In Proc. of the 3rd Symposium on Operating Systems Design and
Implementation (OSDI'99), New Orleans, USA, February 1999. USENIX
Association.
[PDF],
[Bibtex].
- KCW+07
-
R. Kotla, A. Clement, E. Wong, L. Alvisi, and M. Dahlin.
Zyzzyva: Speculative byzantine fault tolerance.
In Proc. of the ACM Symposium on Operating Systems Principles
(SOSP'07), Stevenson, WA, October 2007. ACM.
[PDF],
[Bibtex].
- EAWJ02
-
E. N. Elnozahy, Lorenzo Alvisi, Yi-Min Wang, and David B. Johnson.
A survey of rollback-recovery protocols in message-passing systems.
ACM Computing Surveys, 34(3):375-408, 2002.
[PDF],
[Bibtex].
- DBGM02
-
Hrishikesh Deshpande, Mayank Bawa, and Hector Garcia-Molina.
Streaming live media over peers.
Technical Report 2002-21, Stanford InfoLab, 2002.
[PDF],
[Bibtex].
- ZLLY05
-
Xinyan Zhang, Jiangchuan Liu, Bo Li, and Y.-S.P. Yum.
Coolstreaming/donet: a data-driven overlay network for peer-to-peer
live media streaming.
volume 3, pages 2102-2111 vol. 3, March 2005.
[Bibtex].
- hCRSZ02
-
Yang hua Chu, S.G. Rao, S. Seshan, and Hui Zhang.
A case for end system multicast.
Selected Areas in Communications, IEEE Journal on,
20(8):1456-1471, October 2002.
[Bibtex].
- SF06
-
Thomas Silverston and Olivier Fourmaux.
P2p iptv measurement: A comparison study.
CoRR, abs/cs/0610133, 2006.
[PDF],
[Bibtex].
Alberto Montresor
2009-12-12