{"id":24605855,"url":"https://github.com/r0f1/sashelp","last_synced_at":"2026-01-05T04:02:46.988Z","repository":{"id":116256241,"uuid":"68597328","full_name":"r0f1/sashelp","owner":"r0f1","description":"SAS code snippets","archived":false,"fork":false,"pushed_at":"2019-12-26T09:46:16.000Z","size":90,"stargazers_count":38,"open_issues_count":0,"forks_count":11,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-01-24T16:38:14.406Z","etag":null,"topics":["biostatistics","datascience","sas","statisics"],"latest_commit_sha":null,"homepage":"","language":"SAS","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/r0f1.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2016-09-19T10:59:26.000Z","updated_at":"2023-12-01T17:59:48.000Z","dependencies_parsed_at":null,"dependency_job_id":"ae08c16d-0120-4f91-9361-56c7a9cc3719","html_url":"https://github.com/r0f1/sashelp","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/r0f1%2Fsashelp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/r0f1%2Fsashelp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/r0f1%2Fsashelp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/r0f1%2Fsashelp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/r0f1","download_url":"https://codeload.github.com/r0f1/sashelp/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244198358,"owners_count":204
14445,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["biostatistics","datascience","sas","statisics"],"created_at":"2025-01-24T16:32:07.359Z","updated_at":"2026-01-05T04:02:46.914Z","avatar_url":"https://github.com/r0f1.png","language":"SAS","funding_links":[],"categories":[],"sub_categories":[],"readme":"### Investigating N, Distribution\r\n\r\n```SAS\r\n* print all names of available variables ;\r\nproc contents data=alldat nodetails varnum;\r\nrun;\r\n\r\n* print all names of available variables with custom macro %parse(dataset, regex) ;\r\n%put All available variables: %parse(alldat, /.*/); \r\n\r\n\r\n* print distribution ;\r\nproc means data=alldat maxdec=2 nolabels missing n nmiss mean std;\r\n    var age;\r\n    class exposure;\r\nrun;\r\n\r\n* print more detailed information about distribution ;\r\nproc means data=alldat nolabels missing n nmiss mean median min max p1 p5 q1 q3 p95 p99 std;\r\n    var bmi n_cigarettes alc_grams;\r\n    class period exposure;\r\nrun;\r\n\r\n* cross-tabulating ;\r\nproc freq data=alldat noprint;\r\n    where gender=1;\r\n    tables (age height bmi)*year / missing out=result;\r\nrun;\r\n\r\n* sum over a column;\r\nproc summary data=alldat;\r\n    var nbirths;\r\n    output out=births sum=;\r\nrun;\r\nproc print data=births noobs; run; \r\n```\r\n\r\n\r\n### Investigating Interesting Observations\r\n\r\n```SAS\r\n* print all observations satisfying certain criteria ;\r\nproc print data=alldat;\r\n    where bmi \u003e 25;\r\n    var name gender smoking;\r\nrun;\r\n\r\n* print the first 20 observations ;\r\nproc print 
data=alldat(obs=20);\r\nrun;\r\n\r\n* print the observations 50 through 70 ;\r\nproc print data=alldat(firstobs=50 obs=70);\r\nrun;\r\n\r\n* print a random subset of the data ;\r\nproc surveyselect data=alldat method=srs rep=1 sampsize=50 seed=1 out=random_sample;\r\nrun;\r\nproc print data=random_sample;\r\nrun;\r\n\r\n* print to a specific output file ;\r\nproc printto print='path/to/my/file.sasoutput' new; run;\r\n    * call proc freq, proc means, etc.;\r\nproc printto; run;\r\n```\r\n\r\n\r\n### Importing/Exporting Datasets\r\n\r\n```SAS\r\n* import ;\r\n* read from csv file separated by commas ;\r\nproc import datafile=\"/path/to/data.csv\" out=alldat dbms=csv replace;\r\n    getnames=yes;\r\nrun;\r\n\r\n* read from csv file separated by other separator ;\r\noptions locale=de_AT dflang=locale; \r\nproc import datafile=\"/path/to/data.csv\" out=alldat dbms=dlm replace;\r\n    getnames=yes;\r\n    delimiter=\";\"; /* use delimiter=\"|\" for pipe separated files */\r\n    guessingrows=max;\r\nrun;\r\noptions locale=en_US dflang=locale; \r\n\r\n* export ;\r\n* write to excel spreadsheet ;\r\n%include \"export_excel.sas\";\r\n%export_excel(alldat, keep=year gender age, where=bmi le 30,\r\n                folder=\"/path/to/folder\", filename=\"filename.xlsx\");\r\n```\r\n\r\n### Renaming / Deleting\r\n\r\n```SAS\r\n* delete libnames, filenames ;\r\nlibname  mylib  clear;\r\nfilename myfile clear;\r\n\r\n* rename a dataset ;\r\nproc datasets nolist;\r\n    change myoldname = newname;\r\nquit; run;\r\n\r\n* delete datasets by enumeration or by common prefix (here _tmp_) ;\r\nproc datasets nolist nowarn nodetails;\r\n    delete olddata _tmp_: ;\r\nquit; run;\r\n\r\n* delete entire library ;\r\nproc datasets library=work kill nolist; \r\nquit; run;\r\n```\r\n\r\n### Just some test data\r\n\r\n```SAS\r\ndata testdata;\r\n    y=1;\r\n    do i=1 to 30;\r\n        x=i+1;\r\n        y=y+x;\r\n        z=y+4;\r\n        output;\r\n    end;\r\nrun;\r\n```\r\n\r\n### Graphs and 
Figures\r\n\r\n```SAS\r\n* preamble ;\r\nods listing style=statistical gpath=\"/path/to/my/folder\"; \r\nods graphics on / reset=all imagename=\"my_filename\" height=720px;\r\n\r\n* proc sgplot, univariate, etc. ;\r\n\r\nods _all_ close;\r\n\r\n\r\n* histogram ;\r\nproc univariate data=alldat;\r\n    histogram height / normal; * normal overlays a fitted normal density curve ;\r\nrun;\r\n* cumulative distribution function ;\r\nproc univariate data=alldat;\r\n    cdfplot height / normal noecdf; * noecdf suppresses the empirical cdf, leaving only the fitted normal cdf ;\r\nrun;\r\n* barchart (grouped) where you want to hide a fake observation that has obsweight=0 ;\r\nproc sgplot data=\u0026dataset. pctlevel=group;\r\n    vbar year / group=subtype stat=percent legendlabel=\"\" name=\"a\" weight=obsweight;\r\n    keylegend \"a\";\r\nrun;\r\n```\r\n\r\n### Proc SQL\r\n\r\nDeriving new datasets\r\n\r\n```SAS\r\n* summing/grouping over all age groups ;\r\nproc sql noprint;\r\n\tcreate table autpop as \r\n\t\tselect year, gender, sum(population) as population\r\n\t\tfrom austrian_population \r\n\t\tgroup by year, gender\r\n\t\torder by year, gender;\r\nquit; run;\r\n\r\n\r\n\r\n* left joining ;\r\nproc sql noprint;\r\n    create table alldat as\r\n        select * \r\n        from rate_by_agegroup e left join population_by_agegroup a \r\n            on e.gender=a.gender and e.year=a.year and e.ag=a.ag;\r\nquit; run;\r\n\r\n```\r\n\r\nDeriving macro variables\r\n\r\n```SAS\r\n\r\n* select minimum, maximum into macro variables ;\r\nproc sql noprint;\r\n    select min(rate), max(rate) into :min_y, :max_y \r\n    from alldat; \r\nquit; run;\r\n\r\n* select distinct values into a macro variable ;\r\nproc sql noprint;\r\n    select distinct(stage) into :stages separated by \" \"\r\n    from alldat;\r\nquit;\r\n\r\n* select number of different values into a macro variable ;\r\nproc sql noprint;\r\n\tselect count(distinct(group)) into :n\r\n\tfrom alldat;\r\nquit; run;\r\n\r\n\r\n\r\n* 
select variables of a dataset into a macro variable in alphabetical order then print the dataset ;\r\nproc sql noprint;                               \r\n    select distinct name into :varlist separated by ','              \r\n    from dictionary.columns                      \r\n    where libname='WORK' and memname='ALLDAT'\r\n    order by name;\r\nquit; run;\r\nproc sql noprint;                               \r\n    create table printme as select \u0026varlist from alldat;\r\nquit; run;\r\n* printme already has its columns in alphabetical order, no var statement needed (var does not accept a comma separated list) ;\r\nproc print data=printme; \r\nrun;\r\n\r\n```\r\n### proc sort - Sort Dataset\r\n\r\n```SAS\r\n* out= option is optional, without it the input dataset is sorted in place ;\r\n* here: sort from lightest to heaviest people and, within equal weight, from oldest to youngest ;\r\nproc sort data=alldat out=alldat_sorted;\r\n    by weight descending age;\r\nrun;\r\n\r\n* keep only the first observation per id, in the order stored in the dataset ;\r\nproc sort data=alldat nodupkey;\r\n    by id;\r\nrun;\r\n\r\n* use noduprecs (alias nodup) instead, if you only want to throw out exact duplicate records ;\r\n```\r\n\r\n### proc rank - Creating Quantiles\r\n\r\nCreate tertiles, quartiles, quintiles, deciles, etc.\r\n\r\n```SAS\r\n* out= names the dataset that receives the ranks, here alldat is overwritten ;\r\nproc rank data=alldat out=alldat groups=4;\r\n    var bmi;\r\n    ranks bmi_q;\r\nrun;\r\n```\r\n\r\n\r\n### proc mi - Missingness Patterns\r\n\r\nInvestigate missing values\r\n\r\n```SAS\r\nproc mi data=alldat nimpute=0;\r\n    var age height bmi;\r\n    ods select misspattern;\r\nrun;\r\n```\r\n\r\n### proc transpose - Transposing a Dataset\r\n\r\n```SAS\r\nproc sort data=pop; by agegrp gender; run;\r\n\r\n* prefix= and id are optional, they determine the names of the transposed columns ;\r\nproc transpose data=pop out=lexis_pop prefix=period_;\r\n    id period;\r\n    var population;\r\n    by agegrp gender;\r\nrun;\r\n\r\n```\r\n\u003cdetails\u003e\r\n\u003csummary\u003eTables before and after transposing (click to 
expand)\u003c/summary\u003e\r\n\r\n**before**\r\n\r\n|agegrp|gender|period|population|\r\n|---|---|---|---|\r\n|1|1|2000|1|\r\n|1|1|2001|2|\r\n|1|1|2002|3|\r\n|1|1|2003|4|\r\n|1|2|2000|5|\r\n|1|2|2001|6|\r\n|1|2|2002|7|\r\n|1|2|2003|8|\r\n|2|1|2000|9|\r\n|2|1|2001|10|\r\n|2|1|2002|11|\r\n|2|1|2003|12|\r\n|2|2|2000|13|\r\n|2|2|2001|14|\r\n|2|2|2002|15|\r\n|2|2|2003|16|\r\n  \r\n**after**\r\n\r\n|agegrp|gender|\\_name\\_|period_2000|period_2001|period_2002|period_2003|\r\n|---|---|---|---|---|---|---|\r\n|1|1|population|1|2|3|4|\r\n|1|2|population|5|6|7|8|\r\n|2|1|population|9|10|11|12|\r\n|2|2|population|13|14|15|16|\r\n\u003c/details\u003e\r\n\r\n\r\n### proc stdrate - Age Adjusting\r\n\r\n```SAS\r\nods _all_ close;\r\n\r\nproc stdrate data=alldat\r\n\t\t\t refdata=eustd\r\n\t\t\t method=direct\r\n\t\t\t stat=rate effect\r\n\t\t\t plots=none;\r\n\t\tby year;\r\n\t\tpopulation group=gender event=cancer total=population;\r\n\t\treference  total=eu_population;\r\n\t\tstrata     agegrp / stats effect;\r\n\t\tods output strataeffect=inc_std stdrate=inc_std2;\r\nrun;\r\n```\r\n\r\n### proc loess - Scatter Plot Smoothing\r\n\r\n```SAS\r\nproc sort data=breast_stage; by stage year; run;\r\n\r\nproc loess data=breast_stage plots=none;\r\n    model rate=year;\r\n    by stage;\r\n    output out=breast_stage_pred;\r\nrun;\r\n```\r\n\r\n### proc reg - Linear Regression\r\n\r\n```SAS\r\nproc sort data=alldat; by gender yeargrp; run;\r\n\r\nods _all_ close;\r\n\r\nproc reg data=alldat tableout outest=test_est;\r\n    model stdrate = year graph inter;\r\n    by gender yeargrp;\r\nrun;\r\n```\r\n\r\n### proc genmod - Generalized Linear Models\r\n\r\nThe class of generalized linear models is an extension of traditional linear models that allows the mean of a population to depend on a linear predictor through a nonlinear link function and allows the response probability distribution to be any member of an exponential family of distributions. 
Many widely used statistical models are generalized linear models. These include classical linear models with normal errors, logistic and probit models for binary data, and log-linear models for multinomial data.\r\n\r\n```SAS\r\n* reslik represents the likelihood residual for identifying poorly fitted observations;\r\n\r\nproc genmod data=alldat;\r\n    model score = calories;\r\n    output out=alldat reslik=resscore;\r\nrun;\r\n```\r\n\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fr0f1%2Fsashelp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fr0f1%2Fsashelp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fr0f1%2Fsashelp/lists"}