{"id":16481442,"url":"https://github.com/wsmelton/stackdb","last_synced_at":"2026-04-11T22:52:50.178Z","repository":{"id":147902294,"uuid":"77767705","full_name":"wsmelton/stackdb","owner":"wsmelton","description":"PowerShell module to build a SQL Server database from the StackExchange data archives","archived":false,"fork":false,"pushed_at":"2019-08-21T12:17:12.000Z","size":86,"stargazers_count":1,"open_issues_count":1,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-28T21:42:57.400Z","etag":null,"topics":["database","powershell","sql-server-database","stackexchange","stackoverflow"],"latest_commit_sha":null,"homepage":"https://archive.org/details/stackexchange","language":"PowerShell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wsmelton.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"wsmelton","patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"custom":"paypal.me/wshawnmelton"}},"created_at":"2017-01-01T05:59:01.000Z","updated_at":"2020-07-22T16:16:51.000Z","dependencies_parsed_at":"2023-05-27T21:15:33.463Z","dependency_job_id":null,"html_url":"https://github.com/wsmelton/stackdb","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/wsmelton/stackdb","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wsmelton%2Fstackdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositori
es/wsmelton%2Fstackdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wsmelton%2Fstackdb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wsmelton%2Fstackdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wsmelton","download_url":"https://codeload.github.com/wsmelton/stackdb/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wsmelton%2Fstackdb/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31698152,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-11T21:17:31.016Z","status":"ssl_error","status_checked_at":"2026-04-11T21:17:24.556Z","response_time":54,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","powershell","sql-server-database","stackexchange","stackoverflow"],"created_at":"2024-10-11T13:07:27.688Z","updated_at":"2026-04-11T22:52:50.160Z","avatar_url":"https://github.com/wsmelton.png","language":"PowerShell","funding_links":["https://github.com/sponsors/wsmelton","paypal.me/wshawnmelton"],"categories":[],"sub_categories":[],"readme":"# Summary\n\nPowerShell module to build an SQL Server database(s) from the [StackExchange Archives](https://archive.org/details/stackexchange). 
You can use this to create the database and tables, and then import the data.\n\n\u003ctable\u003e\n  \u003ctbody\u003e\n    \u003ctr\u003e\n      \u003ctd\u003e\u003cimg align=\"left\" src=\"https://wshawnmelton.visualstudio.com/_apis/public/build/definitions/640c5abb-34bd-4423-9e10-8f7e92e7f918/2/badge\"\u003e\u003c/td\u003e\n\u003ctd\u003eDev Build Status\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\u003ctd\u003e\u003cimg align=\"left\" src=\"https://wshawnmelton.visualstudio.com/_apis/public/build/definitions/640c5abb-34bd-4423-9e10-8f7e92e7f918/1/badge\"\u003e\u003c/td\u003e\u003ctd\u003eCI Status\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\n\n# Example\n\nThe example below shows the general process for creating a database from StackExchange data dumps:\n\n```powershell\nInstall-Module stackdb\n\nC:\\\u003e Get-StackArchive -SiteName woodworking -ListSite | Select-Object TinyName, Name, Total* | Format-Table\n\nTinyName      Name             TotalQuestions TotalAnswers TotalUsers TotalComments TotalTags\n--------      ----             -------------- ------------ ---------- ------------- ---------\nwoodworking   Woodworking      2142           4531         4852       12451         205\nwoodworkingme Woodworking Meta 124            228          335        596           72\n```\n\nI want to create a database for the woodworking data dump.\n\n```powershell\nC:\\\u003e Get-StackArchive -SiteName woodworking -DownloadPath c:\\temp\n[Get-StackArchive] Downloading https://archive.org/download/stackexchange/woodworking.stackexchange.com.7z to c:\\temp\\woodworking.stackexchange.com.7z][Get-StackArchive] Download completed!\nC:\\\u003e Get-ChildItem C:\\temp\\woodworking.stackexchange.com.7z\n\n    Directory: C:\\temp\n\n\nMode                LastWriteTime         Length Name\n----                -------------         ------ ----\n-a----       2018-06-08   8:47 PM        7038655 woodworking.stackexchange.com.7z\n```\n\nI need 
to expand that 7z file to access the XML files of the data dump. This assumes that you have 7z installed using the Windows installer. If you happen to have just the executable (exe), you can set its path at the module level using `Set-StackdbConfig -Name zpp.7zippath -Value 'C:\\whatever\\7z.exe'`.\n\n```powershell\nC:\\\u003e Expand-StackArchive -FileName C:\\temp\\woodworking.stackexchange.com.7z -ExportPath c:\\temp\n\n7-Zip 18.05 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2018-04-30\n\nScanning the drive for archives:\n1 file, 7038655 bytes (6874 KiB)\n\nExtracting archive: C:\\temp\\woodworking.stackexchange.com.7z\n--\nPath = C:\\temp\\woodworking.stackexchange.com.7z\nType = 7z\nPhysical Size = 7038655\nHeaders Size = 306\nMethod = BZip2\nSolid = +\nBlocks = 1\n\nEverything is Ok\n\nFiles: 8\nSize:       37051719\nCompressed: 7038655\n```\n\nThe next thing I need is a database to import the data into. I use containers for most of my testing, and all of my database files reside under `C:\\sqlfiles`:\n\n```powershell\nC:\\\u003e New-StackDatabase -SqlServer 'localhost,1416' -DatabaseName 'woodworkingse' -DataPath c:\\sqlfiles -LogPath c:\\sqlfiles\n[20:59:35][New-StackDatabase] woodworkingse created on [localhost,1416]\nC:\\\u003e Get-DbaDatabase -SqlInstance 'localhost,1416' -Database 'woodworkingse' | ft\n\nComputerName InstanceName SqlInstance  Name          Status IsAccessible RecoveryModel LogReuseWaitStatus SizeMB Compatibility Collation\n------------ ------------ -----------  ----          ------ ------------ ------------- ------------------ ------ ------------- ---------\n67D218C1FB41 MSSQLSERVER  67D218C1FB41 woodworkingse Normal         True          Full            Nothing    175    Version130 SQL_Latin1_Gen...\n```\n\nNow we just need to import all the data into the pre-built tables.\n\n```powershell\nC:\\\u003e Import-StackArchive -Folder C:\\temp\\woodworking.stackexchange.com\\ -SqlServer 'localhost,1416' -Database woodworkingse -Schema 
'dbo'\nC:\\\u003e Get-DbaTable -SqlInstance 'localhost,1416' -Database woodworkingse | select database, schema, name, rowcount\nDatabase      Schema Name                  RowCount\n--------      ------ ----                  --------\nwoodworkingse dbo    Badges                    9377\nwoodworkingse dbo    CloseReasonIdDesc            5\nwoodworkingse dbo    Comments                 12451\nwoodworkingse dbo    PostHistory              17261\nwoodworkingse dbo    PostHistoryTypeIdDesc       22\nwoodworkingse dbo    PostLinks                  889\nwoodworkingse dbo    PostLinkTypeIdDesc           2\nwoodworkingse dbo    Posts                     6903\nwoodworkingse dbo    PostsTypeIdDesc              2\nwoodworkingse dbo    Tags                       205\nwoodworkingse dbo    Users                     4852\nwoodworkingse dbo    Votes                    33576\nwoodworkingse dbo    VoteTypeIdDesc              13\n```\n\n## ToDo\n\n- Build out `Invoke-StackDatabase`, a wrapper function that calls all supported commands in the proper sequence. It can use splatting to handle all of the required parameters.\n    1. Get-StackArchive\n    2. Expand-StackArchive\n    3. New-StackDatabase (handle the case where the database does not exist, or where it exists but the tables don't)\n    4. Import-StackArchive (all of it)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwsmelton%2Fstackdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwsmelton%2Fstackdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwsmelton%2Fstackdb/lists"}