{"id":27099497,"url":"https://github.com/puzzlef/pagerank","last_synced_at":"2025-04-06T12:35:45.911Z","repository":{"id":109077670,"uuid":"365698481","full_name":"puzzlef/pagerank","owner":"puzzlef","description":"Design of PageRank algorithm for link analysis.","archived":false,"fork":false,"pushed_at":"2024-07-22T13:58:26.000Z","size":288,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-07-22T16:42:42.899Z","etag":null,"topics":["arrays","contribution","experiment","pagerank","pull","push","single-threaded"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/puzzlef.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-05-09T07:57:48.000Z","updated_at":"2024-07-22T13:58:29.000Z","dependencies_parsed_at":"2023-05-31T21:00:48.698Z","dependency_job_id":null,"html_url":"https://github.com/puzzlef/pagerank","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/puzzlef%2Fpagerank","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/puzzlef%2Fpagerank/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/puzzlef%2Fpagerank/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/puzzlef%2Fpagerank/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/puzzlef","download_url":"https://codeload.
github.com/puzzlef/pagerank/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247485270,"owners_count":20946397,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arrays","contribution","experiment","pagerank","pull","push","single-threaded"],"created_at":"2025-04-06T12:35:45.242Z","updated_at":"2025-04-06T12:35:45.891Z","avatar_url":"https://github.com/puzzlef.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"Design of **PageRank algorithm** for link analysis.\n\nAll *seventeen* graphs used in the experiments below are stored in the\n*MatrixMarket (.mtx)* file format, and obtained from the *SuiteSparse*\n*Matrix Collection*. These include: *web-Stanford, web-BerkStan,*\n*web-Google, web-NotreDame, soc-Slashdot0811, soc-Slashdot0902,*\n*soc-Epinions1, coAuthorsDBLP, coAuthorsCiteseer, soc-LiveJournal1,*\n*coPapersCiteseer, coPapersDBLP, indochina-2004, italy_osm,*\n*great-britain_osm, germany_osm, asia_osm*. The experiments are implemented\nin *C++*, and compiled using *GCC 9* with *optimization level 3 (-O3)*.\nThe system used is a *Dell PowerEdge R740 Rack server* with two *Intel*\n*Xeon Silver 4116 CPUs @ 2.10GHz*, *128GB DIMM DDR4 Synchronous Registered*\n*(Buffered) 2666 MHz (8x16GB) DRAM*, and running *CentOS Linux release*\n*7.9.2009 (Core)*. The execution time of each test case is measured using\n*std::chrono::high_resolution_clock*. This is done *5 times* for each\ntest case, and timings are *averaged (AM)*. The number of *iterations* taken by\neach test case is also measured. 
`500` is the *maximum number of iterations* allowed.\nStatistics of each test case are printed to *standard output (stdout)*, and\nredirected to a *log file*, which is then processed with a *script* to\ngenerate a *CSV file*, with each *row* representing the details of a\n*single test case*. This *CSV file* is imported into *Google Sheets*,\nand necessary tables are set up with the help of the *FILTER* function\nto create the *charts*.\n\n\u003cbr\u003e\n\n\n### Comparing with Push computation\n\nThere are two ways (algorithmically) to think of the pagerank calculation.\n1. Find pagerank by **pushing contribution** to *out-vertices*.\n2. Find pagerank by **pulling contribution** from *in-vertices*.\n\nThis experiment ([approach-push]) was to try both of these approaches on a\nnumber of different graphs, running each approach 5 times per graph to get a\ngood time measure. The **push** method is somewhat easier to implement, and is\ndescribed in [this lecture]. However, it requires multiple writes per source\nvertex. On the other hand, the **pull** method requires 2 additional\ncalculations, i.e., the non-teleport contribution of each vertex, and the\ntotal teleport contribution (to all vertices). However, it requires only 1 write\nper destination vertex.\n\nWhile it might seem that the pull method would be a clear winner, the results\nindicate that although **pull** is always **faster** than the *push* approach,\nthe difference between the two depends on the nature of the graph. Note\nthat neither approach makes use of *SIMD instructions* which are available\non all modern hardware.\n\n[approach-push]: https://github.com/puzzlef/pagerank/tree/approach-push\n[this lecture]: https://www.youtube.com/watch?v=ke9g8hB0MEo\n\n\u003cbr\u003e\n\n\n### Comparing with Direct (class) computation\n\nThis experiment ([approach-class]) was for comparing the performance between:\n1. Find pagerank using C++ `DiGraph` **class** directly.\n2. 
Find pagerank using **CSR** representation of DiGraph.\n\nBoth of these approaches were tried on a number of different graphs, running\neach approach 5 times per graph to get a good time measure. Using a **CSR**\n(Compressed Sparse Row) representation has the potential for **performance**\n**improvement** for both the methods due to information on vertices and edges\nbeing stored contiguously. Note that neither approach makes use of *SIMD*\n*instructions* which are available on all modern hardware.\n\n[approach-class]: https://github.com/puzzlef/pagerank/tree/approach-class\n\n\u003cbr\u003e\n\n\n### Comparing with Ordered approach\n\nWe generally compute PageRank by initializing the rank of each vertex (to say\n`1/N`, where *N* is the total number of vertices in the graph), and iteratively\nupdating the ranks such that the new rank of each vertex is dependent upon the\nranks of its in-neighbors in the previous iteration. We are calling this the\n**unordered** approach, since we can alter the vertex processing order without\naffecting the result (order *does not* matter). We use *two* rank vectors\n(previous and current) with the *unordered* approach.\n\nIn a standard multi-threaded implementation, we split the workload of updating\nthe ranks of vertices among the threads. Each thread operates on the ranks of\nvertices in the previous iteration, and all threads *join* together at the end\nof each iteration. Hemalatha Eedi et al. [(1)][eedi] discuss barrierless\nnon-blocking implementations of the PageRank algorithm, where threads *do not*\n*join* together, and thus may be on different iterations at any given time. A\nsingle rank vector is used, and the rank of each vertex is updated once it is\ncomputed (so we should call this the **ordered** approach, where the processing\norder of vertices *does* matter).\n\nIn this experiment ([approach-ordered]), we compare the **ordered** and\n**unordered** approaches with a sequential PageRank implementation. 
We use a\ndamping factor of `α = 0.85`, a tolerance of `τ = 10⁻⁶`, and limit the maximum\nnumber of iterations to `L = 500`. The error between iterations is calculated\nwith *L1 norm*, and the error between the two approaches is also calculated with\n*L1 norm*. The unordered approach is considered the *gold standard*, as it has\nbeen described in the original paper by Larry Page et al. [(2)][page]. *Dead*\n*ends* in the graph are handled by always teleporting any vertex in the graph at\nrandom (*teleport* approach [(3)][teleport]). The teleport contribution to all\nvertices is calculated *once* (for all vertices) at the beginning of each\niteration.\n\nFrom the results, we observe that the **ordered approach is faster** than the\n*unordered* approach, **in terms of the number of iterations**. This seems to\nmake sense, as using newer ranks of vertices may accelerate convergence\n(especially in case of long chains). However, the **ordered** approach is **only**\n**slightly faster in terms of time**. Why does this happen? This might be due to\nhaving to access two different vectors (*factors* `f = α/d`, where *d* is the\nout-degree of each vertex; and *ranks* `r`), when compared to the *unordered*\napproach where we need to access a single vector (*contributions* `c = αr/d`,\nwhere *r* denotes rank of each vertex in the previous iteration). This suggests\nthat **ordered** approach **may not be significantly faster** than the unordered\napproach.\n\n[approach-ordered]: https://github.com/puzzlef/pagerank/tree/approach-ordered\n\n\u003cbr\u003e\n\n\n### Adjusting Damping factor\n\nAdjustment of the *damping factor α* is a delicate balancing act. For\nsmaller values of *α*, the convergence is fast, but the *link structure*\n*of the graph* used to determine ranks is less true. Slightly different\nvalues for *α* can produce *very different* rank vectors. 
Moreover, as\nα → 1, convergence *slows down drastically*, and *sensitivity issues*\nbegin to surface.\n\nFor this experiment ([adjust-damping-factor]), the **damping factor** `α` (which\nis usually `0.85`) is **varied** from `0.50` to `1.00` in steps of `0.05`. This\nis done to compare the performance variation with each *damping factor*. The\ncalculated error is the *L1 norm* with respect to default PageRank (`α = 0.85`).\nThe PageRank algorithm used here is the *standard power-iteration (pull)*\nbased PageRank. The rank of a vertex in an iteration is calculated as\n`c₀ + αΣrₙ/dₙ`, where `c₀` is the *common teleport contribution*, `α` is the\n*damping factor*, `rₙ` is the *previous rank of vertex* with an incoming edge,\n`dₙ` is the *out-degree* of the incoming-edge vertex, and `N` is the *total*\n*number of vertices* in the graph. The *common teleport contribution* `c₀`,\ncalculated as `(1-α)/N + αΣrₙ/N`, includes the *contribution due to a teleport*\n*from any vertex* in the graph due to the damping factor `(1-α)/N`, and\n*teleport from dangling vertices* (with *no outgoing edges*) in the graph\n`αΣrₙ/N`. This is because a random surfer jumps to a random page upon visiting a\npage with *no links*, in order to avoid the *rank-sink* effect.\n\nResults indicate that **increasing the damping factor α beyond** `0.85`\n**significantly increases convergence time**, and lowering it below\n`0.85` decreases convergence time. As the *damping factor* `α` increases\n*linearly*, the iterations needed for PageRank computation *increase*\n*almost exponentially*. On average, using a *damping factor* `α = 0.95`\nincreases *iterations* needed by `190%` (`~2.9x`), and using a *damping*\n*factor* `α = 0.75` *decreases* it by `41%` (`~0.6x`), compared to\n*damping factor* `α = 0.85`. 
Note that a higher *damping factor* implies\nthat a random surfer follows links with *higher probability* (and jumps\nto a random page with lower probability).\n\n[adjust-damping-factor]: https://github.com/puzzlef/pagerank/tree/adjust-damping-factor\n\n\u003cbr\u003e\n\n\n### Adjusting Tolerance function\n\nIt is observed that a number of *error functions* are in use for checking\nconvergence of PageRank computation. Although [L1 norm] is commonly used\nfor convergence check, it appears [nvGraph] uses [L2 norm] instead. Another\nperson on Stack Overflow seems to suggest the use of *per-vertex tolerance*\n*comparison*, which is essentially the [L∞ norm]. The **L1 norm** `||E||₁`\nbetween two *(rank) vectors* `r` and `s` is calculated as `Σ|rₙ - sₙ|`, or\nas the *sum* of *absolute errors*. The **L2 norm** `||E||₂` is calculated\nas `√Σ|rₙ - sₙ|²`, or as the *square-root* of the *sum* of *squared errors*\n(*euclidean distance* between the two vectors). The **L∞ norm** `||E||∞`\nis calculated as `max(|rₙ - sₙ|)`, or as the *maximum* of *absolute errors*.\n\nThis experiment ([adjust-tolerance-function]) was for comparing the performance\nbetween PageRank computation with *L1, L2* and *L∞ norms* as convergence check,\nfor *damping factor* `α = 0.85`, and *tolerance* `τ = 10⁻⁶`. The PageRank\nalgorithm used here is the *standard power-iteration (pull)* based PageRank. The\nrank of a vertex in an iteration is calculated as `c₀ + αΣrₙ/dₙ`, where `c₀` is\nthe *common teleport contribution*, `α` is the *damping factor*, `rₙ` is the\n*previous rank of vertex* with an incoming edge, `dₙ` is the *out-degree* of the\nincoming-edge vertex, and `N` is the *total number of vertices* in the graph.\nThe *common teleport contribution* `c₀`, calculated as `(1-α)/N + αΣrₙ/N`,\nincludes the *contribution due to a teleport from any vertex* in the graph due\nto the damping factor `(1-α)/N`, and *teleport from dangling vertices* (with *no*\n*outgoing edges*) in the graph `αΣrₙ/N`. 
This is because a random surfer jumps to\na random page upon visiting a page with *no links*, in order to avoid the\n*rank-sink* effect.\n\nFrom the results it is clear that PageRank computation with **L∞ norm**\n**as convergence check is the fastest**, closely followed by *L2 norm*,\nand finally *L1 norm*. Thus, when comparing two or more approaches for an\niterative algorithm, it is important to ensure that all of them use the same\nerror function as convergence check (and the same parameter values). This\nwould help ensure a level ground for a good relative performance comparison.\n\nAlso note that PageRank computation with **L∞ norm** as convergence check\n**completes in a single iteration for all the road networks** *(ending with*\n*_osm)*. This is likely because it is calculated as `||E||∞ = max(|rₙ - sₙ|)`,\nand depending upon the *order (number of vertices)* `N` of the graph (those\ngraphs are quite large), the maximum rank change for any single vertex does not\nexceed the *tolerance* `τ` value of `10⁻⁶`.\n\n[adjust-tolerance-function]: https://github.com/puzzlef/pagerank/tree/adjust-tolerance-function\n\n\u003cbr\u003e\n\n\n### Adjusting Tolerance\n\nSimilar to the *damping factor* `α` and the *error function* used for\nconvergence check, **adjusting the value of tolerance** `τ` can have a\nsignificant effect. This experiment ([adjust-tolerance]) was for comparing the\nperformance between PageRank computation with *L1, L2* and *L∞ norms* as\nconvergence check, for various *tolerance* `τ` values ranging from `10⁻⁰` to\n`10⁻¹⁰` (`10⁻⁰`, `5×10⁻⁰`, `10⁻¹`, `5×10⁻¹`, ...). The PageRank algorithm used\nhere is the *standard power-iteration (pull)* based PageRank. 
The rank of a\nvertex in an iteration is calculated as `c₀ + αΣrₙ/dₙ`, where `c₀` is the\n*common teleport contribution*, `α` is the *damping factor*, `rₙ` is the\n*previous rank of vertex* with an incoming edge, `dₙ` is the *out-degree* of the\nincoming-edge vertex, and `N` is the *total number of vertices* in the graph.\nThe *common teleport contribution* `c₀`, calculated as `(1-α)/N + αΣrₙ/N` ,\nincludes the *contribution due to a teleport from any vertex* in the graph due\nto the damping factor `(1-α)/N`, and *teleport from dangling vertices* (with *no*\n*outgoing edges*) in the graph `αΣrₙ/N`. This is because a random surfer jumps to\na random page upon visiting a page with *no links*, in order to avoid the\n*rank-sink* effect.\n\nFor various graphs it is observed that PageRank computation with *L1*, *L2*,\nor *L∞ norm* as *convergence check* suffers from **sensitivity issues**\nbeyond certain (*smaller*) tolerance `τ` values, causing the computation to\nhalt at maximum iteration limit (`500`) without convergence. As *tolerance*\n`τ` is decreased from `10⁻⁰` to `10⁻¹⁰`, *L1 norm* is the *first* to suffer\nfrom this issue, followed by *L2 and L∞ norms (except road networks)*. This\n*sensitivity issue* was recognized by the fact that a given approach *abruptly*\ntakes `500` *iterations* for the next lower *tolerance* `τ` value.\n\nIt is also observed that PageRank computation with *L∞ norm* as convergence\ncheck **completes in just one iteration** (even for *tolerance* `τ ≥ 10⁻⁶`)\nfor large graphs *(road networks)*. This again, as mentioned above, is likely\nbecause the maximum rank change for any single vertex for *L∞ norm*, and\nthe sum of squares of total rank change for all vertices, is quite low for\nsuch large graphs. 
Thus, it does not exceed the given *tolerance* `τ` value,\ncausing a single iteration convergence.\n\nOn average, PageRank computation with **L∞ norm** as the error function is the\n**fastest**, quickly **followed by** **L2 norm**, and **then** **L1 norm**. This\nis the case with both geometric mean (GM) and arithmetic mean (AM) comparisons\nof iterations needed for convergence with each of the three error functions. In\nfact, this trend is observed with each of the individual graphs separately.\n\nBased on **GM-RATIO** comparison, the *relative iterations* between\nPageRank computation with *L1*, *L2*, and *L∞ norm* as convergence check\nis `1.00 : 0.30 : 0.20`. Hence *L2 norm* is on *average* `70%` *faster*\nthan *L1 norm*, and *L∞ norm* is `33%` *faster* than *L2 norm*. This\nratio is calculated by first finding the *GM* of *iterations* based on\neach *error function* for each *tolerance* `τ` value separately. These\n*tolerance* `τ` specific means are then combined with *GM* to obtain a\n*single mean value* for each *error function (norm)*. The *GM-RATIO* is\nthen the ratio of each *norm* with respect to the *L∞ norm*. The variation\nof *tolerance* `τ` specific means with *L∞ norm* as baseline for various\n*tolerance* `τ` values is shown below.\n\nOn the other hand, based on **AM-RATIO** comparison, the *relative*\n*iterations* between PageRank computation with *L1*, *L2*, and *L∞ norm*\nas convergence check is `1.00 : 0.39 : 0.31`. Hence, *L2 norm* is on\n*average* `61%` *faster* than *L1 norm*, and *L∞ norm* is `26%` *faster*\nthan *L2 norm*. This ratio is calculated in a manner similar to that of\n*GM-RATIO*, except that it uses *AM* instead of *GM*. 
The variation of\n*tolerance* `τ` specific means with *L∞ norm* as baseline for various\n*tolerance* `τ` values is shown below as well.\n\n[adjust-tolerance]: https://github.com/puzzlef/pagerank/tree/adjust-tolerance\n\n\u003cbr\u003e\n\n\n### Adjusting Tolerance (Ordered approach)\n\n**Unordered PageRank** is the *standard* way of performing PageRank computation,\nwhere *two different rank vectors* are maintained; one representing the\n*current* ranks of vertices, and the other representing the *previous* ranks.\nConversely, **ordered PageRank** uses *a single rank vector* for the current\nranks of vertices [(1)][pagerank]. This is similar to the barrier-free non-blocking\nimplementation of PageRank by Hemalatha Eedi et al. [(2)][eedi]. Since ranks are\nupdated in the same vector (with each iteration), the order in which vertices\nare processed *affects* the final outcome (hence the modifier *ordered*).\nNonetheless, as PageRank is a converging algorithm, ranks obtained with either\napproach are *mostly the same*.\n\nIn this experiment ([adjust-tolerance-ordered]), we perform ordered PageRank\nwhile adjusting the tolerance `τ` from `10^-1` to `10^-14` with three different\ntolerance functions: `L1-norm`, `L2-norm`, and `L∞-norm`. We also compare it\nwith unordered PageRank for the same tolerance and tolerance function.  We use a\ndamping factor of `α = 0.85` and limit the maximum number of iterations to\n`L = 500`. The error between the two approaches is calculated with *L1-norm*. The\nunordered approach is considered to be the *gold standard*, as it has been\ndescribed in the original paper by Larry Page et al. [(3)][page]. *Dead ends* in the\ngraph are handled by always teleporting any vertex in the graph at random\n(*teleport* approach [(4)][teleport]). 
The teleport contribution to all vertices is\ncalculated *once* (for all vertices) at the beginning of each iteration.\n\nFrom the results, we observe that the **ordered approach is always faster** than\nthe *unordered* approach, **in terms of the number of iterations**. This seems\nto make sense, as using newer ranks of vertices may accelerate convergence\n(especially in case of long chains). However, the **ordered** approach is\n**not** **always faster in terms of time**. When `L2-norm` is used for\nconvergence check, ordered approach is generally faster for a tolerance less\nthan `τ = 10^-4`, and when `L∞-norm` is used, it is generally faster for a\ntolerance less than `τ = 10^-6`. It looks like a **suitable value of tolerance**\nwith any tolerance function **for the ordered approach** would be\n`τ ∈ [10^-6, 10^-11]`. This could be due to the ordered approach having to access\ntwo different vectors (*factors* `f = α/d`, where *d* is the out-degree of each\nvertex; and *ranks* `r`), when compared to the *unordered* approach where we\nneed to access a single vector (*contributions* `c = αr/d`, where *r* denotes\nrank of each vertex in the previous iteration). This suggests that **ordered**\n**approach is better than the unordered approach when tighter tolerance is**\n**used (but not too tight)**.\n\n[adjust-tolerance-ordered]: https://github.com/puzzlef/pagerank/tree/adjust-tolerance-ordered\n\n\u003cbr\u003e\n\n\n### Adjusting Iteration scaling\n\n[nvGraph PageRank] appears to use [L2-norm per-iteration scaling]. This is\n(probably) required for finding a solution to **eigenvalue problem**. However,\nas the *eigenvalue* for PageRank is `1`, this is not necessary. This experiment\n([adjust-iteration-scaling]) was for observing if this was indeed true, and that\nany such *per-iteration scaling* doesn't affect the number of *iterations*\nneeded to converge. 
PageRank was computed with **L1**, **L2**, or **L∞-norm**\nand the effect of **L1** or **L2-norm** *scaling of ranks* was compared with\n**baseline (L0)**.\n\nResults match the above assumptions, and indeed no performance benefit\nis observed (except a reduction in a single iteration for *web-Google*\nand *web-NotreDame* graphs).\n\n[adjust-iteration-scaling]: https://github.com/puzzlef/pagerank/tree/adjust-iteration-scaling\n[nvGraph PageRank]: https://github.com/rapidsai/nvgraph/blob/main/cpp/src/pagerank.cu\n[L2-norm per-iteration scaling]: https://github.com/rapidsai/nvgraph/blob/main/cpp/src/pagerank.cu#L145\n\n\u003cbr\u003e\n\u003cbr\u003e\n\n\n## References\n\n- Estimating PageRank on graph streams\n- Fast Distributed PageRank Computation\n- Reducing Pagerank Communication via Propagation Blocking\n- Distributed PageRank computation based on iterative aggregation-disaggregation methods\n- Incremental Query Processing on Big Data Streams\n- PageRank on an evolving graph\n- Streaming graph partitioning for large distributed graphs\n- Auto-parallelizing stateful distributed streaming applications\n- Time-evolving graph processing at scale\n- Towards large-scale graph stream processing platform\n- An FPGA architecture for the Pagerank eigenvector problem\n- Towards Scaling Fully Personalized PageRank: Algorithms, Lower Bounds, and Experiments\n- Parallel personalized pagerank on dynamic graphs\n- Fast personalized PageRank on MapReduce\n- A Dynamical System for PageRank with Time-Dependent Teleportation\n- Fast PageRank Computation on a GPU Cluster\n- Accelerating PageRank using Partition-Centric Processing\n- Evaluation of distributed stream processing frameworks for IoT applications in Smart Cities\n- The power of both choices: Practical load balancing for distributed stream processing engine\n- A Survey on PageRank Computing\n- Benchmarking Distributed Stream Processing Platforms for IoT Applications\n- Toward Efficient Hub-Less Real Time Personalized 
PageRank\n- Identifying Key Users in Online Social Networks: A PageRank Based Approach\n- RIoTBench: An IoT benchmark for distributed stream processing systems\n- Efficient PageRank Tracking in Evolving Networks\n- Efficient Computation of PageRank\n- Deeper Inside PageRank\n- DISTINGER: A distributed graph data structure for massive dynamic graph processing\n- The PageRank Problem, Multiagent Consensus, and Web Aggregation: A Systems and Control Viewpoint\n- Cognitive spammer: A Framework for PageRank analysis with Split by Over-sampling and Train by Under-fitting\n- FrogWild! -- Fast PageRank Approximations on Graph Engines\n- RIoTBench: A Real-time IoT Benchmark for Distributed Stream Processing Platforms\n- Approximate Personalized PageRank on Dynamic Graphs\n- LBSNRank: personalized pagerank on location-based social networks\n- DataMPI: Extending MPI to Hadoop-Like Big Data Computing\n- Performance evaluation of big data frameworks for large-scale data analytics\n- SP-Partitioner: A novel partition method to handle intermediate data skew in spark streaming\n- Practice of Streaming Processing of Dynamic Graphs: Concepts, Models, and Systems\n- The applications of graph theory to investing\n- Scalability! But at what COST?\n- Mapreduce is Good Enough? If All You Have is a Hammer, Throw Away Everything That's Not a Nail!\n- G-Store: High-Performance Graph Store for Trillion-Edge Processing\n- X-Stream: edge-centric graph processing using streaming partitions\n- [PageRank Algorithm, Mining massive Datasets (CS246), Stanford University](https://www.youtube.com/watch?v=ke9g8hB0MEo)\n- [SuiteSparse Matrix Collection](https://sparse.tamu.edu)\n\n\u003cbr\u003e\n\u003cbr\u003e\n\n\n[![](https://i.imgur.com/89cRRdY.jpg)](https://www.youtube.com/watch?v=iMdq5_5eib0)\n\n\n[Prof. Dip Sankar Banerjee]: https://sites.google.com/site/dipsankarban/\n[Prof. 
Kishore Kothapalli]: https://faculty.iiit.ac.in/~kkishore/\n[SuiteSparse Matrix Collection]: https://sparse.tamu.edu/\n[this lecture]: https://www.youtube.com/watch?v=ke9g8hB0MEo\n[eedi]: https://ieeexplore.ieee.org/document/9407114\n[page]: https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.5427\n[pagerank]: https://github.com/puzzlef/pagerank\n[teleport]: https://gist.github.com/wolfram77/94c38b9cfbf0c855e5f42fa24a8602fc\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpuzzlef%2Fpagerank","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpuzzlef%2Fpagerank","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpuzzlef%2Fpagerank/lists"}