{"id":16275944,"url":"https://github.com/gvegayon/parallel","last_synced_at":"2025-10-12T05:18:41.035Z","repository":{"id":13575288,"uuid":"16267791","full_name":"gvegayon/parallel","owner":"gvegayon","description":"PARALLEL: Stata module for parallel computing","archived":false,"fork":false,"pushed_at":"2023-10-17T18:19:20.000Z","size":8052,"stargazers_count":130,"open_issues_count":32,"forks_count":26,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-10-12T05:18:39.985Z","etag":null,"topics":["bootstrap","hpc","parallel","parallelization","simulation","stata"],"latest_commit_sha":null,"homepage":"https://rawgit.com/gvegayon/parallel/master/ado/parallel.html","language":"Stata","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gvegayon.png","metadata":{"files":{"readme":"README.Rmd","changelog":"ChangeLog","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null},"funding":{"github":"gvegayon","patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"lfx_crowdfunding":null,"custom":null}},"created_at":"2014-01-27T02:58:41.000Z","updated_at":"2025-10-08T16:39:32.000Z","dependencies_parsed_at":"2023-01-11T20:21:01.737Z","dependency_job_id":"2f44867f-a038-4211-b940-f93942b219f9","html_url":"https://github.com/gvegayon/parallel","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/gvegayon/parallel","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gvegayon%2Fparallel","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gvegayon%2Fparallel/tags","releases_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gvegayon%2Fparallel/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gvegayon%2Fparallel/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gvegayon","download_url":"https://codeload.github.com/gvegayon/parallel/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gvegayon%2Fparallel/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279010338,"owners_count":26084738,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-12T02:00:06.719Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bootstrap","hpc","parallel","parallelization","simulation","stata"],"created_at":"2024-10-10T18:46:16.394Z","updated_at":"2025-10-12T05:18:41.001Z","avatar_url":"https://github.com/gvegayon.png","language":"Stata","funding_links":["https://github.com/sponsors/gvegayon"],"categories":[],"sub_categories":[],"readme":"---\ntitle: \"PARALLEL: Stata module for parallel computing\"\nauthor: \"\"\ndate: \"\"\noutput: \n  md_document:\n    variant: \"markdown_github\"\n---\n\n\u003c!-- This file was originally used to create README.md, but is currently stale (README.md has been updated directlyed). 
--\u003e\n\n# PARALLEL: Stata module for parallel computing\n\nParallel lets you **run Stata faster**, sometimes faster than MP itself. By splitting your job across several Stata instances, parallel gives you out-of-the-box parallel computing. Using the `parallel` prefix, you can get **faster simulations, bootstrapping, reshaping big data**, etc. without having to know a thing about parallel computing. With **no need to have Stata/MP** installed on your computer, parallel has been shown to speed up computations dramatically, by two, four, or more times depending on how many processors your computer has.\n\nSee also the HTML version of the program [help file](https://rawgit.com/gvegayon/parallel/master/ado/parallel.html).\n\nStata 2017 conference presentation: \u003chttps://github.com/gvegayon/parallel/blob/master/talks/20170727_stata_conference/20170727_stata_conference_handout.pdf\u003e\n\nSSC at Boston College: \u003chttp://ideas.repec.org/c/boc/bocode/s457527.html\u003e (though the SSC version is a bit out-of-date, see below)\n\n1. [Installation](#installation)\n2. [Minimal examples](#minimal-examples)\n3. [Authors](#authors)\n\nCitation {#citation}\n=======\n\nWhen using `parallel`, please cite the following:\n\nVega Yon GG, Quistorff B. parallel: A command for parallel computing. The Stata Journal. 2019;19(3):667-684. doi:10.1177/1536867X19874242\n\nOr use the following BibTeX entry:\n\n```bib\n@article{\n  VegaYon2019,\n  author = {George G. 
{Vega Yon} and Brian Quistorff},\n  title ={parallel: A command for parallel computing},\n  journal = {The Stata Journal},\n  volume = {19},\n  number = {3},\n  pages = {667-684},\n  year = {2019},\n  doi = {10.1177/1536867X19874242},\n  URL = {https://doi.org/10.1177/1536867X19874242},\n  eprint = {https://doi.org/10.1177/1536867X19874242}\n}\n```\n\nInstallation {#installation}\n=======\n\nIf you already have `parallel` installed from a different source (SSC, specific folder, specific URL), uninstall it first. After installing, it is suggested that you restart Stata. \n\nSSC\n---\n\nTo install the SSC version of parallel:\n\n``` stata\nssc install parallel, replace\nmata mata mlib index\n```\n\nDevelopment Version (Latest/Master)\n--------------------------\n\nTo install the latest development version of parallel (from this repository) using Stata version \\\u003e=13:\n\n``` stata\nnet install parallel, from(https://raw.github.com/gvegayon/parallel/master/) replace\nmata mata mlib index\n```\n\nFor Stata version \\\u003c13, download the repository as a zip file, unzip it, and then replace the above `net install` with\n\n``` stata\nnet install parallel, from(full_local_path_to_files) replace\n```\n\nDevelopment Version (Other Releases)\n------------------------------------\n\nAccess other development releases via the [Releases Page](https://github.com/gvegayon/parallel/releases). You can use the release tag to install over the internet. For example,\n\n``` stata\nnet install parallel, from(https://raw.github.com/gvegayon/parallel/v1.15.8.19/) replace\nmata mata mlib index\n```\n\nOr you can download the release and install locally (for Stata \\\u003c13).\n\nMinimal examples {#minimal-examples}\n===============\n\nThe following minimal examples introduce how to use the module. 
Please note that the only examples actually designed to show potential speed gains are [parfor](#examples-parfor) and [bootstrap](#examples-bootstrap).\n\nThe examples have been executed on a Dell Vostro 3300 notebook running Ubuntu 14.04 with an Intel Core i5 CPU M 560 (2 physical cores) with 8 GB of RAM, using Stata/IC 12.1 for Unix (Linux 64-bit x86-64).\n\nFor more examples and details please refer to the module's help file or the wiki [Gallery page](https://github.com/gvegayon/parallel/wiki/Gallery).\n\n```{r setup, echo=FALSE}\nknitr::opts_chunk$set(autodep = TRUE, echo=FALSE, comment='')\n```\n\n## Simple parallelization of egen\n\nParallelizing `egen` can be useful when it is computed over groups. In the following example we show how to use `parallel` with `by: egen`.\n\n```{stata}\nparallel setclusters 2, f\nsysuse auto\nparallel, by(foreign): egen maxp = max(price)\ntab maxp\n```\n\nThis is the parallel equivalent of:\n\n```{stata}\nsysuse auto\nbysort foreign: egen maxp = max(price)\ntab maxp\n```\n\n\n## Bootstrapping {#examples-bootstrap}\n\nIn this example we'll evaluate a regression model using bootstrapping which, together with simulations, is one of the best ways to use parallel.\n\n```{stata}\nsysuse auto, clear\nparallel setclusters 4, f\ntimer on 1\nparallel bs, reps(5000): reg price c.weig##c.weigh foreign rep\ntimer off 1\ntimer list\n```\n\nThis is the parallel equivalent of:\n\n```{stata}\nsysuse auto, clear\ntimer on 2\nbs, reps(5000) nodots: reg price c.weig##c.weigh foreign rep\ntimer off 2\ntimer list\n```\n\n\n## Simulation\n\nFrom the `simulate` Stata command:\n\n```{stata}\nparallel setclusters 2, f\nprogram define lnsim, rclass\n  version 12.1\n  syntax [, obs(integer 1) mu(real 0) sigma(real 1) ]\n  drop _all\n  set obs `obs'\n  tempvar z\n  gen `z' = exp(rnormal(`mu',`sigma'))\n  summarize `z'\n  return scalar mean = r(mean)\n  return scalar Var  = r(Var)\nend\nparallel sim, expr(mean=r(mean) var=r(Var)) reps(10000): lnsim, 
obs(100)\n\nsumm\n```\n\nThis is the parallel equivalent of:\n\n```{stata}\nprogram define lnsim, rclass\n  version 12.1\n  syntax [, obs(integer 1) mu(real 0) sigma(real 1) ]\n  drop _all\n  set obs `obs'\n  tempvar z\n  gen `z' = exp(rnormal(`mu',`sigma'))\n  summarize `z'\n  return scalar mean = r(mean)\n  return scalar Var  = r(Var)\nend\nsimulate mean=r(mean) var=r(Var), reps(10000) nodots: lnsim, obs(100)\n\nsumm\n```\n\n## parfor {#examples-parfor}\n\nIn this example we create a short program (`parfor`) intended to work like a `parfor` construct, that is, looping through 1/N in parallel.\n\n```{stata}\n// Cleaning working space\nclear all\ntimer clear\n\n// Set up\nset seed 123\nlocal n = 5e6\nset obs `n'\ngen x = runiform()\ngen y_pll = .\nclonevar y_ser = y_pll\n\n// Loop replacement function\nprog def parfor\n\targs var\n\tforval i=1/`=_N' {\n\t\tqui replace `var' = sqrt(x) in `i'\n\t}\nend\n\n// Running the algorithm in parallel fashion\ntimer on 1\nparallel setclusters 4, f\nparallel, prog(parfor): parfor y_pll\ntimer off 1\n\n// Running the algorithm in a serial way\ntimer on 2\nparfor y_ser\ntimer off 2\n\n// Is there any difference?\nlist in 1/10\ngen diff = y_pll != y_ser\ntab diff\n\n// Comparing time\ntimer list\ndi \"Parallel is `=round(r(t2)/r(t1),.1)' times faster\"\n\n```\n\nBuilding {#building}\n================\nIf you need to use `parallel` on an older version of Stata than the one we build for here, you can build and install the package locally.\n\nYou will need to install [Stata devtools](https://github.com/gvegayon/devtools) to build the package and `log2html` to build the HTML version of the help file.\n\nThen you can go to `ado/` and either `do compile.do` or `do compile_and_install.do` depending on whether you want to just build the package (`.mlib`) or also install it. There are also several build checks in the `makefile` that can easily be run from Linux.\n\nAuthors {#authors}\n======\nGeorge G. 
Vega [aut,cre]\ng.vegayon %at% gmail\n\nBrian Quistorff [aut]\nbrian-work %at% quistorff . com\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgvegayon%2Fparallel","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgvegayon%2Fparallel","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgvegayon%2Fparallel/lists"}