https://github.com/01110011011101010110010001101111/tigergraph_pyspark_aml
How to Use pySpark with TigerGraph's AML Sim Fraud Graph
big-data fraud fraud-detection graph-database pyspark spark tigergraph
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/01110011011101010110010001101111/tigergraph_pyspark_aml
- Owner: 01110011011101010110010001101111
- Created: 2022-06-29T18:33:39.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2022-07-04T02:48:41.000Z (over 3 years ago)
- Last Synced: 2025-01-31T11:50:11.485Z (10 months ago)
- Topics: big-data, fraud, fraud-detection, graph-database, pyspark, spark, tigergraph
- Language: Python
- Homepage:
- Size: 2.93 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# TigerGraph PySpark Sample
A sample project that reads data from TigerGraph with PySpark.
## Quickstart
1. Install Spark and Scala (`brew install apache-spark && brew install scala`)
1. Clone this project and enter the directory
1. Create a Python virtual environment (`python3 -m venv venv`) and enter the environment (`source venv/bin/activate`)
1. Install PySpark and pypandoc (`pip3 install pyspark pypandoc`)
1. Load an on-premise TigerGraph AMLSim graph
1. Download the latest `.jar` file of the TigerGraph JDBC Driver (the run command in the next step assumes version 1.3.0; adjust the filename if yours differs)
1. Run the project (`spark-submit --jars tigergraph-jdbc-driver-1.3.0.jar index.py`)
## Overview
This repository shows how to read TigerGraph data with PySpark. It covers three ways to do so: retrieving vertices, retrieving edges, and running installed queries.
For a thorough walkthrough of this project (setup, code explanation, etc.), see [this Medium article](https://medium.com/datadriveninvestor/an-introduction-to-pyspark-and-tigergraph-9c3396835bc2).
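
As a rough illustration of the three access patterns named in this overview, the sketch below builds the options that Spark's generic JDBC reader would be given for a vertex read, an edge read, and a query call. It is not the repository's `index.py`. The connection URL, credentials, graph name (`AMLSim`), and the vertex, edge, and query names in the usage note are all assumptions to replace with your own. The option keys (`driver`, `url`, `graph`, `dbtable`) and the `vertex ...` / `edge ...` / `query ...` table syntax follow the conventions described in the TigerGraph JDBC driver's documentation as best understood here, so verify them against the driver version you downloaded.

```python
from typing import Dict, Optional

# Assumed connection settings for a local TigerGraph instance; replace as needed.
TG_URL = "jdbc:tg:http://127.0.0.1:14240"
TG_DRIVER = "com.tigergraph.jdbc.Driver"


def vertex_table(vertex_type: str) -> str:
    """Build the dbtable string for reading vertices of one type."""
    return f"vertex {vertex_type}"


def edge_table(edge_type: str) -> str:
    """Build the dbtable string for reading edges of one type."""
    return f"edge {edge_type}"


def query_table(query_name: str, **params) -> str:
    """Build the dbtable string for calling an installed query with parameters."""
    args = ", ".join(f"{key}={value}" for key, value in params.items())
    return f"query {query_name}({args})"


def tigergraph_options(
    dbtable: str,
    graph: str = "AMLSim",          # assumed graph name
    username: str = "tigergraph",   # assumed default credentials
    password: str = "tigergraph",
    extra: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Assemble the option dict passed to Spark's JDBC reader."""
    options = {
        "driver": TG_DRIVER,
        "url": TG_URL,
        "graph": graph,
        "username": username,
        "password": password,
        "dbtable": dbtable,
    }
    if extra:
        # Driver-specific extras (for example a row limit) go here.
        options.update(extra)
    return options


def read_tigergraph(spark, dbtable: str, **kwargs):
    """Load one TigerGraph 'table' into a DataFrame using an existing SparkSession."""
    return spark.read.format("jdbc").options(**tigergraph_options(dbtable, **kwargs)).load()
```

With a `SparkSession` named `spark`, a script could then call `read_tigergraph(spark, vertex_table("Account"))`, `read_tigergraph(spark, edge_table("Send_Transaction"))`, or `read_tigergraph(spark, query_table("some_query", limit=5))`, where `Account`, `Send_Transaction`, and `some_query` are hypothetical names standing in for whatever your AMLSim schema defines. Edge reads may need additional driver options (such as a source vertex), which can be passed through `extra`. Keeping the option-building separate from the Spark call makes the strings easy to inspect without a running cluster.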