https://github.com/RoheLab/aPPR

Approximate Personalized Page Rank

Host: GitHub
URL: https://github.com/RoheLab/aPPR
Owner: RoheLab
License: other
Created: 2019-08-21T22:25:14.000Z (almost 6 years ago)
Default Branch: main
Last Pushed: 2024-06-27T15:45:27.000Z (12 months ago)
Last Synced: 2024-08-13T07:14:53.703Z (10 months ago)
Language: R
Homepage: https://rohelab.github.io/aPPR/
Size: 9.98 MB
Stars: 15
Watchers: 6
Forks: 3
Open Issues: 11
Metadata Files:
- Readme: README.Rmd
- License: LICENSE

Awesome Lists containing this project

jimsghstars - RoheLab/aPPR - Approximate Personalized Page Rank (R)

README

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%",
  error = TRUE
)
```

# aPPR

[![R-CMD-check](https://github.com/RoheLab/aPPR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/RoheLab/aPPR/actions/workflows/R-CMD-check.yaml)

[![Codecov test coverage](https://codecov.io/gh/RoheLab/aPPR/branch/main/graph/badge.svg)](https://app.codecov.io/gh/RoheLab/aPPR?branch=main)

`aPPR` helps you calculate approximate personalized pageranks from large graphs, including those that can only be queried via an API. `aPPR` additionally performs degree correction and regularization, allowing you to recover blocks from stochastic blockmodels.

To learn more about `aPPR` you can:

1. Glance through slides from the [JSM2021](https://github.com/alexpghayes/JSM2021) talk

2. Read the accompanying [paper][chen]
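For intuition about what "approximate" means here, the following is a minimal base R sketch of the push-style approximation described by Andersen, Chung, and Lang (reference 2 below), followed by the degree correction and regularization described in the accompanying paper. It is not `aPPR`'s implementation: the function `approx_ppr()`, the dense adjacency matrix `A`, and the choice of `tau` as the mean in-degree are all assumptions made for illustration, and the sketch only handles unweighted graphs without self-loops.

```r
# Illustrative sketch only, not the aPPR implementation.
# A: 0/1 adjacency matrix without self-loops; seed: index of the seed node.
approx_ppr <- function(A, seed, alpha = 0.15, epsilon = 1e-4) {
  stopifnot(all(A %in% c(0, 1)), all(diag(A) == 0))
  out_degree <- rowSums(A)
  stopifnot(all(out_degree > 0))

  n <- nrow(A)
  p <- numeric(n)  # current PageRank approximation
  r <- numeric(n)  # residual mass not yet pushed
  r[seed] <- 1

  repeat {
    # nodes whose residual is still large relative to their degree
    to_push <- which(r >= epsilon * out_degree)
    if (length(to_push) == 0) break

    u <- to_push[1]
    neighbors <- which(A[u, ] == 1)
    mass <- r[u]

    # keep a share alpha at u, spread half of the rest over the
    # out-neighbors, and leave the other half at u (lazy random walk)
    p[u] <- p[u] + alpha * mass
    r[neighbors] <- r[neighbors] + (1 - alpha) * mass / (2 * out_degree[u])
    r[u] <- (1 - alpha) * mass / 2
  }

  in_degree <- colSums(A)
  tau <- mean(in_degree)  # assumed regularization parameter for this sketch

  data.frame(
    node = seq_len(n),
    p = p,
    r = r,
    degree_adjusted = p / in_degree,
    regularized = p / (in_degree + tau)
  )
}

# a ring of six nodes, seeded at node 1
n <- 6
A <- matrix(0, n, n)
for (i in seq_len(n)) {
  j <- i %% n + 1
  A[i, j] <- 1
  A[j, i] <- 1
}

ppr <- approx_ppr(A, seed = 1)
ppr[order(-ppr$regularized), ]
```

Each push moves mass from the residual `r` into the estimate `p` without creating or destroying any, so `sum(p) + sum(r)` stays at 1. The loop stops once every residual is below `epsilon` times the node's out-degree, which is why a smaller `epsilon` means more pushes and a closer approximation.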

### Installation

You can install the development version from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("RoheLab/aPPR")
```

### Find the personalized pagerank of a node in an `igraph` graph

```{r igraph-example, message = FALSE}
library(aPPR)
library(igraph)

set.seed(27)

erdos_renyi_graph <- sample_gnp(n = 100, p = 0.5)

erdos_tracker <- appr(
  erdos_renyi_graph,   # the graph to work with
  seeds = "5",         # name of seed node (character)
  epsilon = 0.0005     # desired approximation quality (see ?appr)
)

erdos_tracker
```

You can access the Personalized PageRanks themselves via the `stats` field of `Tracker` objects.

```{r}
erdos_tracker$stats
```

To limit computation time, you can cap the number of unique nodes visited during approximation with `max_visits`:

```{r igraph-example2}
limited_visits_tracker <- appr(
  erdos_renyi_graph,
  seeds = "5",
  epsilon = 1e-10,
  max_visits = 20      # max unique nodes to visit during approximation
)

limited_visits_tracker
```

### Find the personalized pagerank of a Twitter user using `rtweet`

```{r rtweet-example}
ftrevorc_ppr <- appr(
  rtweet_graph(),
  "ftrevorc",
  epsilon = 1e-4,
  max_visits = 5
)

ftrevorc_ppr
```

### Logging

`aPPR` uses [`logger`](https://daroczig.github.io/logger/) for displaying information to the user. By default, `aPPR` is quite verbose. You can control verbosity by loading `logger` and setting the logging threshold.

```{r logging-example-1, eval = FALSE}
library(logger)

# hide basically all messages (not recommended)
log_threshold(FATAL, namespace = "aPPR")

appr(
  erdos_renyi_graph,   # the graph to work with
  seeds = "5",         # name of seed node (character)
  epsilon = 0.0005     # desired approximation quality (see ?appr)
)
```
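To bring messages back later in the same session, one option is to set the threshold for the `aPPR` namespace again. The sketch below assumes that `INFO`, which is `logger`'s default level, is the verbosity you want to return to:

```r
library(logger)

# show informational messages from the aPPR namespace again
log_threshold(INFO, namespace = "aPPR")

# calling log_threshold() without a level returns the current threshold
current_threshold <- log_threshold(namespace = "aPPR")
current_threshold
```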

If you submit a bug report, please please please include a log file using the TRACE threshold. You can set up this kind of detailed logging via the following:

```{r log-file-example, eval = FALSE}
library(aPPR)
library(logger)

set.seed(528491)  # be sure to set seed for bug reports

# write aPPR log messages to a file
log_appender(
  appender_file(
    "/path/to/logfile.log"  ## TODO: choose a path to log to
  ),
  namespace = "aPPR"
)

# record everything, including the most detailed messages
log_threshold(TRACE, namespace = "aPPR")

tracker <- appr(
  rtweet_graph(),
  seeds = c("hadleywickham", "gvanrossum"),
  epsilon = 1e-6
)
```

### Ethical considerations

People have a right to choose how public and discoverable their information is. `aPPR` will often lead you to accounts that are interesting, but also small and out of sight. Do not change the public profile of, or the attention directed towards, the people running these accounts, or any other accounts, without their permission.

### References

1. Chen, Fan, Yini Zhang, and Karl Rohe. “Targeted Sampling from Massive Block Model Graphs with Personalized PageRank.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 82, no. 1 (February 2020): 99–126. https://doi.org/10.1111/rssb.12349. [arxiv][chen]

2. Andersen, Reid, Fan Chung, and Kevin Lang. “Local Graph Partitioning Using PageRank Vectors.” In 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS’06), 475–86. Berkeley, CA, USA: IEEE, 2006. https://doi.org/10.1109/FOCS.2006.44.

[chen]: https://arxiv.org/abs/1910.12937
