awesome-distributed-computing.md

Resources on Distributed Computing and Distributed Systems

This document contains a big list of resources on distributed computing and distributed systems, including people, conferences/journals, lecture notes, open courses, videos, and so on.

People

People on Fundamentals of Distributed Computing Theory

My research interests are programming models to program distributed, parallel, or concurrent systems conveniently, efficiently, and correctly.

People on Fundamentals of Multiprocessor Programming Theory

Victor Luchangco works in the Scalable Synchronization Group of Oracle Labs. His research focuses on developing algorithms and mechanisms to support concurrent programming on large-scale distributed systems.

Moir's main research interests concern practical and theoretical aspects of concurrent, distributed, and real-time systems, particularly hardware and software support for programming constructs that facilitate scalable synchronization in shared memory multiprocessors.

On memory models.

People on Programming Languages (including Consistency Models and (Weak) Memory Models)

My research interests are in methods and tools for developing correct concurrent and distributed software.

My research interests are programming models to program distributed, parallel, or concurrent systems conveniently, efficiently, and correctly.

My research interests lie in the broad area of distributed systems. Some more specific topics that I am (currently) interested in include fault-tolerance, blockchain and distributed ledgers, cloud computing security and distributed storage.

Michael’s research enables the construction of reliable software by developing the foundations for effective programming abstractions and informative program analysis tools.

People on the Theory of Distributed Systems

I am a software engineer at Google. Before that, I was a principal research scientist at Yahoo! Research. Before that I was an assistant professor at Georgia Tech, and before that I was a PhD student at Stanford.

Research Groups

Conferences, Journals, Workshops, and Magazines (By Topics)

SIGs:

General Theory of Computer Science

Conferences

Journals

Distributed Computing Theory

Programming Languages and Concurrency Theory

Distributed Systems (and More General)

Formal Methods (Logic)

Database Systems

Databases

Journals

Workshops

January 20–25, 2013, Dagstuhl Seminar 13042

This workshop will bring together experts in the field (and some exceptional graduate students and postdocs) to discuss fundamental distributed computing problems whose computational complexities have not been resolved and the limitations of current techniques for obtaining lower bounds for these problems.

Prizes and Awards

The European Association for Programming Languages and Systems has established a Best Dissertation Award in the international research area of programming languages and systems.

Reports

Courses & Paper Reading Lists

This course introduces the principles of distributed computing, emphasizing the fundamental issues underlying the design of distributed systems and networks: communication, coordination, fault-tolerance, locality, parallelism, self-organization, symmetry breaking, synchronization, uncertainty. We explore essential algorithmic ideas and lower bound techniques, basically the "pearls" of distributed computing.

It will present abstractions and implementation techniques for engineering distributed systems. Topics include multithreading, remote procedure call, client/server designs, peer-to-peer designs, consistency, fault tolerance, and security, as well as several case studies of distributed systems.

The lecture notes (ppt) are elegant. Topics will include the majority (we are going to shoot for all and see what happens) of the following: global states and event ordering; logical clocks; vector clocks; consistent cuts and global property detection; rollback-recovery and message-logging protocols; state machine approach; agreement protocols; failure detectors; replication and consistency; Byzantine fault tolerance; atomic commit.

This course studies the organization of cloud computing systems and surveys research problems in this area.

Big Ideas. Big Money. Big Data.

The primary emphasis is on operating systems and distributed systems. A secondary emphasis is on protocol implementation and next-generation network protocols. The focus when covering these topics is the extent to which they impact end-system design and implementation.

Lecture notes: Robust Concurrent Computing. It also provides a list of papers to read.

CPS 212 is a graduate-level course dealing with techniques for storing and sharing information in computer networks, large and small. We will cover a range of core distributed systems topics, with an emphasis on the issues faced by networked utility services, scalable Internet services, and enterprise storage systems.

This class will examine file system implementation, low-level database storage techniques, and distributed programming. Lectures will cover basic file system structures, journaling and logging, I/O system performance, RAID, the RPC abstraction, and numerous systems illustrating these concepts.

This course broadly examines distributed storage systems in their many manifestations. It explores how to harness and maintain the collective storage capabilities in storage systems from global-scale enterprises and cloud computing to peer-to-peer, ad hoc, and home networks.

Principles, techniques, and examples related to the design, implementation, and analysis of distributed and parallel computer systems.

Computer Science Ph.D. Thesis

Tools

Abstraction-based parameterized TLA+ checker: Bringing state-of-the-art model checking to TLA+

Blogs

English

Chinese

Other Articles

Videos

From Leslie Lamport

  • What is Computation: Dr. Leslie Lamport, Microsoft
  • Thinking Above the Code: Architects draw detailed blueprints before a brick is laid or a nail is hammered. Programmers and software engineers seldom do. A blueprint for software is called a specification. The need for extremely rigorous specifications before coding complex or critical systems should be obvious, especially for concurrent and distributed systems. This talk explains why some sort of specification should be written for any software.

Books

Synthesis Lectures on Distributed Computing Theory is edited by Jennifer Welch of Texas A&M University and Nancy Lynch of the Massachusetts Institute of Technology. The series publishes 50- to 150-page publications on topics pertaining to distributed computing theory. The scope largely follows the purview of premier information and computer science conferences, such as ACM PODC, DISC, SPAA, OPODIS, CONCUR, DialM-POMC, ICDCS, SODA, Sirocco, SSS, and related conferences. Potential topics include, but are not limited to: distributed algorithms and lower bounds, algorithm design methods, formal modeling and verification of distributed algorithms, and concurrent data structures.

Miscellaneous