{"id":24430757,"url":"https://github.com/gusanmaz/psuedo_big_data","last_synced_at":"2026-04-28T10:31:23.809Z","repository":{"id":189134614,"uuid":"615048416","full_name":"gusanmaz/psuedo_big_data","owner":"gusanmaz","description":"pseudo_big_data is a Python package that generates mock-up datasets for various data types and sizes for testing and development purposes.","archived":false,"fork":false,"pushed_at":"2023-03-16T21:22:20.000Z","size":3361,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-20T14:57:36.424Z","etag":null,"topics":["algorithms","algorithms-and-data-structures","faker-generator","mock-data","mock-data-generator","pseudo-data"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gusanmaz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-03-16T21:05:49.000Z","updated_at":"2024-05-17T19:51:46.000Z","dependencies_parsed_at":"2023-08-20T11:03:24.126Z","dependency_job_id":null,"html_url":"https://github.com/gusanmaz/psuedo_big_data","commit_stats":null,"previous_names":["gusanmaz/psuedo_big_data"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gusanmaz%2Fpsuedo_big_data","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gusanmaz%2Fpsuedo_big_data/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gusanmaz%2Fpsuedo_big_data/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gusanmaz%2
Fpsuedo_big_data/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gusanmaz","download_url":"https://codeload.github.com/gusanmaz/psuedo_big_data/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243466987,"owners_count":20295309,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithms","algorithms-and-data-structures","faker-generator","mock-data","mock-data-generator","pseudo-data"],"created_at":"2025-01-20T14:57:43.118Z","updated_at":"2025-12-29T10:40:25.464Z","avatar_url":"https://github.com/gusanmaz.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Pseudo Big Data Generator\n\nThis project makes it easier to generate certain kinds of pseudo (mock-up) data.\n\n## Motivation\n\nThis program generates mock-up data for a Data Structures and Algorithms class that the author co-teaches. The author believes that assignments encouraging students to test their algorithms on larger, more complex, and more realistic mock-up data would benefit students. In Data Structures and Algorithms classes, students often test complicated algorithms on small inputs, which makes it hard for them to see the real-world benefits of more sophisticated algorithms with lower time and space complexity. Furthermore, working with plain data may not be as meaningful to students as working with data that is closer to real-world scenarios. 
For example, sorting a list of bare numbers may feel less relevant than sorting bank accounts by balance.\n\n## Installation\n\n1. Clone the repository to your local machine:\n\n```bash\ngit clone https://github.com/gusanmaz/psuedo_big_data.git\n```\n\n2. Navigate to the project directory:\n\n```bash\ncd psuedo_big_data\n```\n\n3. Create and activate a new virtual environment:\n\n```bash\npython3 -m venv env\nsource env/bin/activate\n```\n\n4. Install the required packages:\n\n```bash\npip install -r requirements.txt\n```\n\n5. Run the main script:\n\n```bash\npython main.py\n```\n\nIn the main function, you will find calls to the functions that generate mock-up data. Each call uses default parameter values, which you can change to suit your needs.\n\n## Examples of Pseudo Data for Different Data Types\n\nFor now, you can generate data similar to the following samples:\n\n### Tabular Data\n\n   * University Student Data `tabular/student.py`\n\nName         |  Surname      |  Student ID  |  GPA   |  Department                              |  Enrollment Year  |  Date of Birth\n-------------|---------------|--------------|--------|------------------------------------------|-------------------|---------------\nKimberly     |  Glass        |  2194452203  |  1.28  |  Environmental Engineering               |  2022             |  1954-06-26\nMargaret     |  Dillon       |  2518248525  |  1.63  |  Mechanical Engineering                  |  2016             |  2004-11-12\nMegan        |  Jackson      |  1064912807  |  2.54  |  Psychology                              |  2012             |  1962-04-21\nMaria        |  Wiggins      |  2138736257  |  3.4   |  Sociology                               |  2021             |  1964-06-07\nRobin        |  Barrera      |  2787155251  |  2.16  |  Tourism                                 |  2020             |  
1957-10-20\nJessica      |  Thompson     |  2450628649  |  2.47  |  Psychology                              |  2021             |  1973-03-13\n\n* Bank Account Data `tabular/bank.py` \n\nCustomer ID                           |  SSN Number   |  Customer Name  |  Customer Surname  |  Email                             |  Address Street                       |  Address State   |  Address City              |  Address Country  |  Address Zipcode  |  Phone Number            |  Account Type  |  Account Balance  |  Customer Age  |  Date of Account Opening  |  Customer Gender  |  Customer Occupation                                          |  Customer Education Level\n--------------------------------------|---------------|-----------------|--------------------|------------------------------------|---------------------------------------|------------------|----------------------------|-------------------|-------------------|--------------------------|----------------|-------------------|----------------|---------------------------|-------------------|---------------------------------------------------------------|--------------------------\n7e245be4-f6e4-4359-a431-f01cd2c24156  |  317-40-7798  |  Connie         |  Williams          |  mitchellgolden@example.net        |  02380 Miguel Square                  |  Massachusetts   |  Gatesberg                 |  USA              |  81082            |  +1-281-370-1004x261     |  Euro          |  78344.42         |  45            |  2002-12-04               |  Female           |  Drilling engineer                                            |  PhD\n22dd6103-74a5-4bc0-8ac4-8b48be17befa  |  032-73-1963  |  Brian          |  Sandoval          |  mthompson@example.net             |  9743 Jennifer Grove                  |  Alaska          |  Porterport                |  USA              |  12222            |  +1-326-554-5687         |  TL            |  77012.4          |  32            |  2000-06-02               |  Male             |  
Mining engineer                                              |  PhD\nead46a32-1230-410c-8602-b4f1aa7119be  |  252-73-4109  |  Richard        |  Larsen            |  johnroberts@example.com           |  8915 Willis Stravenue                |  Colorado        |  West Matthew              |  USA              |  45542            |  +1-959-994-5360x97310   |  Dollar        |  87172.63         |  75            |  2007-10-01               |  Female           |  Archaeologist                                                |  PhD\n5dc232a6-847e-4d96-b197-f254dae9c41a  |  138-49-1931  |  Jason          |  Welch             |  dennispierce@example.com          |  56381 Fox Hills                      |  Florida         |  Davidville                |  USA              |  58107            |  269-514-3995            |  TL            |  95361.65         |  73            |  2013-06-30               |  Male             |  Volunteer coordinator                                        |  Bachelor\n3c4fa81d-53ae-47d4-bee5-5dfd62d66670  |  451-72-7652  |  Joseph         |  Barrera           |  david19@example.net               |  99215 Ross Roads                     |  North Carolina  |  Wallside                  |  USA              |  66723            |  997-884-2464            |  Euro          |  20685.92         |  63            |  2003-09-11               |  Female           |  Lobbyist                                                     |  High School\nabd33426-8267-40e5-a504-39cfb6dc0a89  |  808-55-2605  |  Joseph         |  Long              |  daniel38@example.net              |  469 Jason Fall                       |  North Dakota    |  Patrickville              |  USA              |  96204            |  +1-161-482-2051x714     |  Euro          |  54050.17         |  46            |  2005-11-10               |  Male             |  Intelligence analyst                                         |  Bachelor\nf421be78-9c39-4b5d-97b7-6e37a2f490ef  |  403-43-7916  |  Paula          |  
Conner            |  clarkerin@example.com             |  6029 Rose Ways                       |  Nebraska        |  North Davidside           |  USA              |  55425            |  394.422.3793x317        |  TL            |  85434.34         |  27            |  2017-02-08               |  Male             |  Meteorologist                                                |  Bachelor\n\n  * Earthquake Data `tabular/earthquake.py`\n\nDate        |  City            |  Magnitude  |  Depth (km)  |  Duration (s)  |  Latitude  |  Longitude\n------------|------------------|-------------|--------------|----------------|------------|-----------\n2015-10-20  |  Bayburt         |  5.6        |  10          |  3.7           |  38.76     |  36.99\n2019-08-01  |  Zonguldak       |  4.4        |  65          |  18.1          |  41.45     |  39.54\n2019-11-24  |  Manisa          |  6.8        |  5           |  3.3           |  38.07     |  34.15\n2022-04-18  |  Eskişehir       |  6.6        |  15          |  4.2           |  40.14     |  28.05\n2014-10-22  |  Tokat           |  3.2        |  45          |  9.9           |  40.42     |  35.79\n2023-03-14  |  Kastamonu       |  7.0        |  45          |  3.7           |  37.38     |  35.02\n2015-09-04  |  Nevşehir        |  4.6        |  5           |  10.2          |  40.42     |  40.51\n\n\n### Graph Data \n\n   * Train Routes Data `graph/train_routes.py`\n\nSource              |  Destination         |  Distance\n--------------------|----------------------|----------\nNew Jeffreyborough  |  Rodriguezville      |  94\nNew Jeffreyborough  |  South Donnahaven    |  19\nNew April           |  Rodriguezville      |  99\nCherylfurt          |  South Donnahaven    |  68\nNew Wandachester    |  South Donnahaven    |  64\nNew Jeffreyborough  |  West Kimberly       |  38\nAyersmouth          |  Cherylfurt          |  103\nNew April           |  New Wandachester    |  146\nNew Wandachester    |  West Kimberly       |  24\nAyersmouth       
   |  New April           |  81\nAyersmouth          |  Rodriguezville      |  60\nBrianfort           |  New Wandachester    |  62\nNew Jeffreyborough  |  New Wandachester    |  80\nCherylfurt          |  New April           |  29\nAyersmouth          |  South Donnahaven    |  50\nAyersmouth          |  New Wandachester    |  28\nCherylfurt          |  New Jeffreyborough  |  103\nBrianfort           |  South Donnahaven    |  56\nAyersmouth          |  Brianfort           |  23\nNew Catherine       |  New Wandachester    |  71\nNew Catherine       |  Rodriguezville      |  90\nRodriguezville      |  South Donnahaven    |  90\nNew April           |  West Kimberly       |  66\nBrianfort           |  New April           |  53\nNew April           |  New Catherine       |  147\nBrianfort           |  West Kimberly       |  69\nBrianfort           |  Cherylfurt          |  26\n\n  \n   * Social Media Data `graph/social_media.py`\n\nHandle           |  Name      |  Surname    |  Age  |  Email                        |  Total Post Count  |  Following\n-----------------|------------|-------------|-------|-------------------------------|--------------------|-----------\nmitchellkaitlyn  |  Chelsea   |  Estrada    |  21   |  brenda62@example.com         |  254               |  5\nmitchellkaitlyn  |  Chelsea   |  Estrada    |  21   |  brenda62@example.com         |  254               |  4\nmitchellkaitlyn  |  Chelsea   |  Estrada    |  21   |  brenda62@example.com         |  254               |  3\nmitchellkaitlyn  |  Chelsea   |  Estrada    |  21   |  brenda62@example.com         |  254               |  6\nmitchellkaitlyn  |  Chelsea   |  Estrada    |  21   |  brenda62@example.com         |  254               |  9\nlynn87           |  Jennifer  |  Wiggins    |  53   |  nelsonjennifer@example.com   |  998               |  6\nlynn87           |  Jennifer  |  Wiggins    |  53   |  nelsonjennifer@example.com   |  998               |  1\nlynn87           |  Jennifer  |  Wiggins    
|  53   |  nelsonjennifer@example.com   |  998               |  2\nlynn87           |  Jennifer  |  Wiggins    |  53   |  nelsonjennifer@example.com   |  998               |  5\nmark20           |  Patricia  |  Lewis      |  32   |  conniehernandez@example.com  |  144               |  0\nmark20           |  Patricia  |  Lewis      |  32   |  conniehernandez@example.com  |  144               |  3\nmark20           |  Patricia  |  Lewis      |  32   |  conniehernandez@example.com  |  144               |  1\nmark20           |  Patricia  |  Lewis      |  32   |  conniehernandez@example.com  |  144               |  2\nmark20           |  Patricia  |  Lewis      |  32   |  conniehernandez@example.com  |  144               |  5\ncolebrandy       |  Kayla     |  Vargas     |  39   |  brett70@example.com          |  676               |  7\ncolebrandy       |  Kayla     |  Vargas     |  39   |  brett70@example.com          |  676               |  5\nvpatel           |  Tracy     |  Castaneda  |  45   |  weberchristina@example.com   |  393               |  4\nvpatel           |  Tracy     |  Castaneda  |  45   |  weberchristina@example.com   |  393               |  9\nvpatel           |  Tracy     |  Castaneda  |  45   |  weberchristina@example.com   |  393               |  1\nvpatel           |  Tracy     |  Castaneda  |  45   |  weberchristina@example.com   |  393               |  5\nruth11           |  Shannon   |  Smith      |  69   |  timothygarza@example.com     |  466               |  9\nruth11           |  Shannon   |  Smith      |  69   |  timothygarza@example.com     |  466               |  7\nfrederickdeleon  |  Joshua    |  Johnson    |  38   |  monique65@example.com        |  21                |  2\nfrederickdeleon  |  Joshua    |  Johnson    |  38   |  monique65@example.com        |  21                |  8\nfrederickdeleon  |  Joshua    |  Johnson    |  38   |  monique65@example.com        |  21                |  7\nwalkersandra     |  Amber     |  Brown      | 
 77   |  heathersmith@example.com     |  937               |  6\n\n\n## Contribution\n\nWe welcome contributions to this project! If you're interested in contributing, please follow these steps:\n\n1. Fork the project repository.\n2. Create a new branch for your changes.\n3. Make your changes and commit them with descriptive commit messages.\n4. Push your changes to your forked repository.\n5. Submit a pull request to the original repository.\n\nAdditionally, if you have an idea for a new kind of mock-up data that a function could generate, feel free to start a discussion on our GitHub Discussions page. If you encounter any issues or bugs, please report them on our Issues page. We appreciate your contributions!\n\n## License\n\nThe code in this project is licensed under the MIT License.\n\n## Author\n\nThis project was created by Güvenç USANMAZ. If you have any questions, comments, or suggestions, you can contact me through the project's GitHub page.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgusanmaz%2Fpsuedo_big_data","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgusanmaz%2Fpsuedo_big_data","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgusanmaz%2Fpsuedo_big_data/lists"}