{"id":37704275,"url":"https://github.com/appeler/clean-names","last_synced_at":"2026-01-16T13:06:17.650Z","repository":{"id":26543776,"uuid":"29997228","full_name":"appeler/clean-names","owner":"appeler","description":"Deduplicate and parse list of `dirty names'","archived":false,"fork":false,"pushed_at":"2020-11-04T08:28:23.000Z","size":42,"stargazers_count":23,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-09-09T09:30:19.041Z","etag":null,"topics":["firstname","lastname"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"flar2/marlin","license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/appeler.png","metadata":{"files":{"readme":"ReadMe.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-01-29T01:19:04.000Z","updated_at":"2025-05-11T22:11:25.000Z","dependencies_parsed_at":"2022-07-25T15:32:12.924Z","dependency_job_id":null,"html_url":"https://github.com/appeler/clean-names","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/appeler/clean-names","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/appeler%2Fclean-names","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/appeler%2Fclean-names/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/appeler%2Fclean-names/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/appeler%2Fclean-names/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/appeler","download_url":"https://codeload.github.com/appeler/clean-names/tar.gz/refs/heads/master","sbom_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/appeler%2Fclean-names/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28478915,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T11:59:17.896Z","status":"ssl_error","status_checked_at":"2026-01-16T11:55:55.838Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["firstname","lastname"],"created_at":"2026-01-16T13:06:13.360Z","updated_at":"2026-01-16T13:06:17.641Z","avatar_url":"https://github.com/appeler.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"### Clean Names\n\n[![Build Status](https://travis-ci.org/appeler/clean-names.svg?branch=master)](https://travis-ci.org/appeler/clean-names)\n[![Build status](https://ci.appveyor.com/api/projects/status/k4ktm279ldl60aeq?svg=true)](https://ci.appveyor.com/project/appeler/clean-names)\n\nThe script takes a CSV file with a column 'Name' containing 'dirty names', that is, names in many different formats: lastname firstname, firstname lastname, middlename lastname firstname, etc. (see [sample input file](sample_input.csv)). It produces a CSV file that has all the columns of the original CSV file plus the following columns: 'uniqid', 'FirstName', 'MiddleInitial/Name', 'LastName', 'RomanNumeral', 'Title', 'Suffix'. 
The script removes duplicate names by default (see [sample output file](sample_output.csv)).\n\n#### Application\nThe script was used to fix names in CF-Scores from the [Database on Ideology, Money in Politics, and Elections](http://data.stanford.edu/dime). The processed database with clean names is posted on [Harvard DVN](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/28949).\n\n#### Installation\n\n1. Clone this repository: `git clone https://github.com/soodoku/clean-names.git`\n\n2. Navigate to `clean-names`\n\n3. Run `python setup.py install`\n\n#### Using Clean Names\n\nUsage: `process_names.py [options]`\n\n#### Command Line Options\n```\n  -h, --help            show this help message and exit\n  -o OUTFILE, --out=OUTFILE\n                        Output file in CSV (default: sample_output.csv)\n  -c COLUMN, --column=COLUMN\n                        Column name in CSV that contains Names (default: Name)\n  -a, --all             Export all names (do not take duplicate names out) (default: False)\n```\n\n#### Example\n\u003cpre\u003e\u003ccode\u003e python process_names.py -a sample_input.csv \u003c/code\u003e\u003c/pre\u003e\n\n### License\nScripts are released under the [MIT License](https://opensource.org/licenses/MIT).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fappeler%2Fclean-names","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fappeler%2Fclean-names","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fappeler%2Fclean-names/lists"}