https://github.com/rberenguel/identity-graphs
Presentation about Graphframes and how we handle graphs with more than 2 billion nodes at Hybrid Theory
- Host: GitHub
- URL: https://github.com/rberenguel/identity-graphs
- Owner: rberenguel
- Created: 2021-04-15T14:55:45.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2021-06-21T18:17:43.000Z (over 3 years ago)
- Last Synced: 2023-03-30T03:31:16.090Z (almost 2 years ago)
- Topics: graphframes, spark
- Homepage:
- Size: 29.4 MB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Keeping identity graphs in sync with Apache Spark
Presentation I ([@berenguel](https://twitter.com/berenguel)) gave at the [Data Love Conference](https://datalove.konfy.care) in April 2021
and, in May, at the [Data+AI Summit](https://databricks.com/session_na21/keeping-identity-graphs-in-sync-with-apache-spark) to explain how we manage a 2 billion node graph at [Hybrid Theory](https://www.hybridtheory.com). You can find the slides
[here](https://github.com/rberenguel/identity-graphs/releases/download/0.2.0/identity-graphs.pdf)
(some images might look slightly blurry). I recommend you check the version with
[presenter notes](https://github.com/rberenguel/identity-graphs/releases/download/0.2.0/identity-graphs-with-notes.pdf),
which is only available here. You can also head over to the _releases_ tab in case I have a more recent version and forgot to update this README.

If you want additional information about Spark in general, I gave an
`introduction to Spark` talk with [Carlos Peña](http://twitter.com/crafty_coder)
that you can find [here](https://github.com/rberenguel/WelcomeToApacheSpark).

---
The video from Data Love is available [here](https://www.youtube.com/watch?v=xL8uFgXLEQY&list=PLBqWQH1MiwBS8f0PhhDeQuBVCjxC_i0X5&index=23). Don't miss the whole [playlist](https://www.youtube.com/playlist?list=PLBqWQH1MiwBS8f0PhhDeQuBVCjxC_i0X5) of videos of the conference.
You can watch the recording from the Data+AI Summit by registering for the event and selecting "On Demand" [here](https://databricks.com/session_na21/keeping-identity-graphs-in-sync-with-apache-spark).
---
This presentation is formatted in Markdown and prepared to be used with
[Deckset](https://www.decksetapp.com/). The drawings were done on an iPad Mini 5
using [Procreate](https://procreate.art).

---