https://github.com/arpit20adlakha/computer-science-papers-for-system-design



# Computer-Science-Papers

## Storage systems
- Haystack (https://lnkd.in/gSZYcmmB)
- f4: Facebook’s Warm BLOB Storage System (https://lnkd.in/gMEfTpAh)
- The Hadoop Distributed File System (https://lnkd.in/gSUqafDg)
- The Google File System (https://lnkd.in/giUResea)
- Facebook's Tectonic Filesystem: Efficiency from Exascale (https://lnkd.in/geg7-ub9)
- Pelican: A Building Block for Exascale Cold Data Storage (https://lnkd.in/gSse26YK)
- CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data (https://lnkd.in/gUbnK4rH)
- RADOS: a scalable, reliable storage service for petabyte-scale storage (https://lnkd.in/gKwbmzTx)
- Megastore: Providing Scalable, Highly Available Storage for Interactive Services (https://lnkd.in/gT7mSDQN)
- The Design and Implementation of a Log-Structured File System (https://lnkd.in/gVuka_Ym)
- The RAMCloud Storage System (https://lnkd.in/gC3SQccF)

## Analytics

- Monarch: Google's Planet-Scale In-Memory Time Series Database (https://lnkd.in/gbqa7HNa)
- Gorilla: A Fast, Scalable, In-Memory Time Series Database (https://lnkd.in/gd_nUJbu)
- Scuba: Diving into Data at Facebook (https://lnkd.in/gfBrJcge)
- The Unified Logging Infrastructure for Data Analytics at Twitter (https://lnkd.in/gwhNUMnF)
- Cubrick: Indexing Millions of Records per Second for Interactive Analytics (https://lnkd.in/g-n9GUMD)
- Shark: SQL and Rich Analytics at Scale (https://lnkd.in/gqXHq5BG)
- Realtime Data Processing at Facebook (https://lnkd.in/gQdMN4kP)

## Cluster management and scheduling

- Large-scale cluster management at Google with Borg (https://lnkd.in/gT7bG2SF)
- Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing (https://lnkd.in/gEEdRmcD)
- Apache Hadoop YARN: Yet Another Resource Negotiator (https://lnkd.in/g9SVx_Ft)
- Twine: A Unified Cluster Management System for Shared Infrastructure (https://lnkd.in/gbnuqutm)

## Stream processing

- MillWheel: Fault-Tolerant Stream Processing at Internet Scale (https://lnkd.in/gC7VjCfG)
- The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing (https://lnkd.in/g-PyJUPa)
- Apache Flink™: Stream and Batch Processing in a Single Engine (https://lnkd.in/gpzRA6v3)
- Drizzle: Fast and Adaptable Stream Processing at Scale (https://lnkd.in/g9Hbnvp7)
- Kafka, Samza and the Unix Philosophy of Distributed Data (https://lnkd.in/grtHkFWN)
- Discretized Streams: Fault-Tolerant Streaming Computation at Scale (https://lnkd.in/gbzc3_Ke)
- Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark (https://lnkd.in/gnQQP2UY)
- Noria: dynamic, partially-stateful data-flow for high-performance web applications (https://lnkd.in/gYtpef34)

## Pub/sub

- Kafka: a Distributed Messaging System for Log Processing (https://lnkd.in/dkfPsFwH)
- Scribe: Transporting petabytes per hour via a distributed, buffered queueing system (https://lnkd.in/dTyTBE_t)
- LogDevice: a distributed data store for logs (https://lnkd.in/dvVTBz46)
- Scalog: Seamless Reconfiguration and Total Order in a Scalable Shared Log (https://lnkd.in/d7xmexrQ)
- CORFU: A Shared Log Design for Flash Clusters (https://lnkd.in/dxiquk5h)
- The FuzzyLog: A Partially Ordered Shared Log (https://lnkd.in/da4ikmEa)
- Ubiq: A Scalable and Fault-tolerant Log Processing Infrastructure (https://lnkd.in/dQTfCDwH)

## Graph processing in a distributed setting

- Pregel: A System for Large-Scale Graph Processing (https://lnkd.in/ggpew7yq)
- PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs (https://lnkd.in/g6f9Mjzk)
- GraphX: Graph Processing in a Distributed Dataflow Framework (https://lnkd.in/gixUZP46)
- Gemini: A Computation-Centric Distributed Graph Processing System (https://lnkd.in/gCs2R5EJ)
- TAO: Facebook’s Distributed Data Store for the Social Graph (https://lnkd.in/gfesm_Hn)

## Consensus and replicated state machines
- Paxos Made Simple (https://lnkd.in/gk6nxyVj)
- Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial (https://lnkd.in/gPwNde-i)
- The Chubby lock service for loosely-coupled distributed systems (https://lnkd.in/gFXKTrXR)
- ZooKeeper: Wait-free coordination for Internet-scale systems (https://lnkd.in/gWTYBxQN)
- In Search of an Understandable Consensus Algorithm (https://lnkd.in/gqrKhvsK)
- Virtual Consensus in Delos (https://lnkd.in/g5bitkdM)

## Peer-to-peer systems and information dissemination
- Gossip-Based Broadcast (https://lnkd.in/gT74Zb8Z)
- Gossiping in Distributed Systems (https://lnkd.in/g55DFbuP)
- Peer-to-peer membership management for gossip-based protocols (https://lnkd.in/g_XE4TiE)
- Gossip-based Peer Sampling (https://lnkd.in/gSPwEkaW)
- SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol (https://lnkd.in/gxZtR3Nh)
- Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems (https://lnkd.in/gyURBizm)
- Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications (https://lnkd.in/grVF9crk)

Additional papers, some of which may repeat entries above. These will be categorized later.

| # | Short Name | Title | Link | Extra links |
|---| -------- | ---------- |-----|------------|
|1 | Apache Kafka | Kafka: A Distributed Messaging System for Log Processing | (https://notes.stephenholiday.com/Kafka.pdf) ||
|2 | Apache Cassandra | Cassandra - A Decentralized Structured Storage System | (https://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf) ||
|3 | Apache Flink | Apache Flink: Stream and Batch Processing in a Single Engine | (https://asterios.katsifodimos.com/assets/publications/flink-deb.pdf)||
|4 | Apache Spark | Spark: Cluster Computing with Working Sets | (https://www.usenix.org/legacy/event/hotcloud10/tech/full_papers/Zaharia.pdf) ||
|5 | Apache Zookeeper | ZooKeeper: Wait-free coordination for Internet-scale systems | (https://www.usenix.org/legacy/event/atc10/tech/full_papers/Hunt.pdf) ||
|6 | BigTable | Bigtable: A Distributed Storage System for Structured Data | (https://research.google.com/archive/bigtable-osdi06.pdf) ||
|8 | Apache Impala | Apache Impala: A Modern, Open-Source SQL Engine for Hadoop | (https://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf) ||
|9 | Apache Druid | Druid: A Real-time Analytical Data Store | (http://static.druid.io/docs/druid.pdf) ||
|10 | Timer Wheel | Hashed and Hierarchical Timing Wheels | (http://www.cs.columbia.edu/~nahum/w6998/papers/sosp87-timing-wheels.pdf) ||
|11 | MillWheel | MillWheel: Fault-Tolerant Stream Processing at Internet Scale | (https://research.google.com/pubs/archive/41378.pdf) ||
|12 | Dynamo | Dynamo: Amazon’s Highly Available Key-value Store | (https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf) ||
|13 | Google File System | The Google File System | (https://research.google.com/archive/gfs-sosp2003.pdf) ||
|14 | MapReduce | MapReduce: Simplified Data Processing on Large Clusters | (https://research.google.com/archive/mapreduce-osdi04.pdf) ||
|15 | Spanner | Spanner: Google’s Globally-Distributed Database | (https://research.google.com/archive/spanner-osdi2012.pdf) ||
|16 | Zab | Zab: High-performance broadcast for primary-backup systems | (http://www.cs.cornell.edu/courses/cs6452/2012sp/papers/zab-ieee.pdf) ||
|17 | Paxos | Paxos Made Simple | (https://lamport.azurewebsites.net/pubs/paxos-simple.pdf) ||
|18 | Chubby | The Chubby lock service for loosely-coupled distributed systems | (https://research.google.com/archive/chubby-osdi06.pdf) ||
|19 | Dremel | Dremel: Interactive Analysis of Web-Scale Datasets | (https://research.google/pubs/pub36632/) ||
|20 | Megastore | Megastore: Providing Scalable, Highly Available Storage for Interactive Services | (https://research.google/pubs/pub36971.pdf) ||
|21 | Raft | In Search of an Understandable Consensus Algorithm (Extended Version) | (https://raft.github.io/raft.pdf) ||
|22 | Flexible Paxos | Flexible Paxos: Quorum Intersection Revisited | (https://arxiv.org/abs/1608.06696) ||
|23 | Thrift | Thrift: Scalable Cross-Language Services Implementation | (https://thrift.apache.org/static/files/thrift-20070401.pdf) ||
|24 | Maglev | Maglev: A Fast and Reliable Software Network Load Balancer | (https://research.google.com/pubs/archive/44824.pdf) ||
|25 | LSM | The Log-Structured Merge-Tree (LSM-Tree) | (https://www.cs.umb.edu/~poneil/lsmtree.pdf) ||
|26 | Chord | Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications | (https://pdos.csail.mit.edu/papers/chord:sigcomm01/chord_sigcomm.pdf) ||
|27 | Kademlia | Kademlia: A Peer-to-peer Information System Based on the XOR Metric | (https://www.scs.stanford.edu/~dm/home/papers/kpos.pdf) ||
|28 | Mesa | Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing | (https://research.google/pubs/pub42851/) ||
|29 | SCRIBE | SCRIBE: A large-scale and decentralized application-level multicast infrastructure | https://rowstron.azurewebsites.net/PAST/jsac.pdf ||
|30 | PAST | Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility | https://people.mpi-sws.org/~druschel/publications/PAST-hotos.pdf ||
|31 | Pastry | Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems | https://www.cs.cornell.edu/people/egs/615/pastry.pdf ||
|32 | Linearizability | Linearizability: A Correctness Condition for Concurrent Objects | http://cs.brown.edu/~mph/HerlihyW90/p463-herlihy.pdf ||
|33 | Time and Clocks | Time, Clocks, and the Ordering of Events in a Distributed System | http://lamport.azurewebsites.net/pubs/time-clocks.pdf ||
|34 | CRDTs | CRDTs: Consistency without concurrency control | http://hal.archives-ouvertes.fr/docs/00/39/79/81/PDF/RR-6956.pdf ||
|35 | Photon | Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams | https://research.google/pubs/pub41318/ ||
|36 | TAO | TAO: Facebook’s Distributed Data Store for the Social Graph | https://www.usenix.org/system/files/conference/atc13/atc13-bronson.pdf ||
|37 | Pregel | Pregel: A System for Large-Scale Graph Processing | https://15799.courses.cs.cmu.edu/fall2013/static/papers/p135-malewicz.pdf ||
|38 | Dapper | Dapper, a Large-Scale Distributed Systems Tracing Infrastructure | https://research.google/pubs/pub36356.pdf ||
|39 | Raft Refloated | Raft Refloated: Do We Have Consensus? | https://www.cl.cam.ac.uk/~ms705/pub/papers/2015-osr-raft.pdf ||
|40 | Percolator | Large-scale Incremental Processing Using Distributed Transactions and Notifications | https://research.google/pubs/pub36726.pdf ||
|41 | Monarch | Monarch: Google’s Planet-Scale In-Memory Time Series Database | https://research.google/pubs/pub50652/ ||
|42 | Borg | Large-scale cluster management at Google with Borg | https://research.google/pubs/pub43438.pdf ||
|43 | Borg - Next | Borg: the Next Generation | https://research.google/pubs/pub49065.pdf ||
|44 | Amazon Aurora | Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases | https://web.stanford.edu/class/cs245/readings/aurora.pdf ||
|45 | Gorilla | Gorilla: A Fast, Scalable, In-Memory Time Series Database | http://www.vldb.org/pvldb/vol8/p1816-teller.pdf ||
|46 | HDFS | The Hadoop Distributed File System | https://storageconference.us/2010/Papers/MSST/Shvachko.pdf ||
|47 | Autopilot | Autopilot: workload autoscaling at Google | https://dl.acm.org/doi/10.1145/3342195.3387524 ||
|48 | Consistent hashing | Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web | https://dl.acm.org/doi/pdf/10.1145/258533.258660 ||
|49 | SEDA | SEDA: An Architecture for Well-Conditioned, Scalable Internet Services | http://www.sosp.org/2001/papers/welsh.pdf ||
|50 | Bitcask | Bitcask: A Log-Structured Hash Table for Fast Key/Value Data | https://riak.com/assets/bitcask-intro.pdf ||
|51 | DynamoDB | Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service | https://www.usenix.org/system/files/atc22-elhemali.pdf ||
|52 | Isolation levels | A critique of ANSI SQL isolation levels | https://dl.acm.org/doi/pdf/10.1145/223784.223785 ||
|54 | Deletable Bloom Filter | The deletable bloom filter | https://arxiv.org/pdf/1005.0352 ||
|55 | Hash Coding | Space/Time Trade-offs in Hash Coding with Allowable Errors | https://dl.acm.org/doi/pdf/10.1145/362686.362692 ||
|56 | Expedite Byzantine | Shifting Gears: Changing Algorithms on the Fly to Expedite Byzantine Agreement | https://www.sciencedirect.com/science/article/pii/089054019290035E ||
|57 | Scalability cost | Scalability! But at what COST? | https://www.usenix.org/system/files/conference/hotos15/hotos15-paper-mcsherry.pdf ||
|58 | Foundation DB | FoundationDB: A Distributed Unbundled Transactional Key Value Store | https://www.foundationdb.org/files/fdb-paper.pdf ||
|59 | Monolith | Monolith: Real Time Recommendation System With Collisionless Embedding Table | https://arxiv.org/pdf/2209.07663 ||
|60 | Memcache at Facebook | Scaling Memcache at Facebook | https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final170_update.pdf ||
|61 | MilliSampler | A microscopic view of bursts, buffer contention, and loss in data centers | https://dl.acm.org/doi/pdf/10.1145/3517745.3561430 | https://engineering.fb.com/2023/04/17/networking-traffic/millisampler-network-traffic-analysis/ |
|62 | FlexiRaft | FlexiRaft: Flexible Quorums with Raft | https://www.cidrdb.org/cidr2023/papers/p83-yadav.pdf ||
|63 | Minesweeper | Scalable Statistical Root Cause Analysis on App Telemetry | https://arxiv.org/abs/2010.09974 ||
|64 | Shard Manager | Shard Manager: A Generic Shard Management Framework for Geo-distributed Applications | ||
|65 | FlumeJava | FlumeJava: Easy, Efficient Data-Parallel Pipelines | https://research.google/pubs/pub35650.pdf ||
|66 | Heron | Twitter Heron: Stream Processing at Scale | https://dl.acm.org/doi/pdf/10.1145/2723372.2742788 ||
|67 | Dataflow | The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing | https://research.google/pubs/pub43864.pdf ||
|68 | Flink | State Management in Apache Flink | http://www.vldb.org/pvldb/vol10/p1718-carbone.pdf ||
|69 | Dgraph | Dgraph: Synchronously Replicated, Transactional and Distributed Graph Database |||