{"id":19162763,"url":"https://github.com/centre-for-humanities-computing/gender-identification","last_synced_at":"2025-10-14T17:10:52.633Z","repository":{"id":240930949,"uuid":"803781825","full_name":"centre-for-humanities-computing/gender-identification","owner":"centre-for-humanities-computing","description":"Code and pipeline for gender identification based on names.","archived":false,"fork":false,"pushed_at":"2024-05-21T13:10:46.000Z","size":7,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-10T00:01:46.070Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/centre-for-humanities-computing.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-21T11:29:08.000Z","updated_at":"2024-05-21T13:10:50.000Z","dependencies_parsed_at":"2024-05-21T14:08:56.559Z","dependency_job_id":null,"html_url":"https://github.com/centre-for-humanities-computing/gender-identification","commit_stats":null,"previous_names":["centre-for-humanities-computing/gender-identification"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/centre-for-humanities-computing/gender-identification","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/centre-for-humanities-computing%2Fgender-identification","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/centre-for-humanities-computing%2Fgender-identification/tags","releases_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/centre-for-humanities-computing%2Fgender-identification/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/centre-for-humanities-computing%2Fgender-identification/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/centre-for-humanities-computing","download_url":"https://codeload.github.com/centre-for-humanities-computing/gender-identification/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/centre-for-humanities-computing%2Fgender-identification/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279020087,"owners_count":26086805,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-14T02:00:06.444Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-09T09:13:06.169Z","updated_at":"2025-10-14T17:10:52.590Z","avatar_url":"https://github.com/centre-for-humanities-computing.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# gender-identification\nCode and pipeline for gender identification based on names.\nThe repo contains a CLI and a package for easily adding a gender column to tabular data.\n\n## Usage\n\nInstall the package:\n```bash\npip install gender-identification\n```\n\nIf you have some tabular 
data in CSV, TSV or JSONL format, you can add a `gender` and a `gender_confidence` column to it in place using the CLI.\n\n```bash\npython3 -m gender_identification data.csv --name_column \"first_name\"\n```\n\nAlternatively, you can save the result to a different file:\n\n```bash\npython3 -m gender_identification data.csv --name_column \"first_name\" -o results.csv\n```\n\nYou can also use the package directly in Python:\n```python\nimport pandas as pd\n\nfrom gender_identification import add_gender\n\ndf = pd.DataFrame({\"name\": [\"Peter Jørgensen\", \"Malte Larsen\"]})\n\ndf = add_gender(df, name_column=\"name\", remove_last_name=True)\n```\n\n## Parameters\n\n| Parameter         | Flag(s)             | Description                                                                                         | Default Value             |\n|-------------------|---------------------|-----------------------------------------------------------------------------------------------------|---------------------------|\n| `in_file`         |                     | Input file path.                                                                                    | -                      |\n| `name_column`     | `--name_column`, `-n` | Column where names are contained.                                                                   | -                      |\n| `out_file`        | `--out_file`, `-o`  | Output file path. If not specified, the original file will be overwritten.                           | None                      |\n| `remove_last_name`| `--remove_last_name`, `-r` | Indicates whether last names should be removed.                                                      | `False`                   |\n| `drop_confidence` | `--drop_confidence`, `-d` | Indicates whether to drop the column indicating the model's confidence in its predictions.            | `False`                   |\n| `batch_size`      | `--batch_size`, `-b` | Size of the batches to do inference in.                                                  
            | `32`                      |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcentre-for-humanities-computing%2Fgender-identification","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcentre-for-humanities-computing%2Fgender-identification","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcentre-for-humanities-computing%2Fgender-identification/lists"}