{"id":41847080,"url":"https://github.com/camel-lab/gender-reinflection","last_synced_at":"2026-01-25T10:03:08.756Z","repository":{"id":93673105,"uuid":"247649939","full_name":"CAMeL-Lab/gender-reinflection","owner":"CAMeL-Lab","description":"Code, models, and data for \"Gender-Aware Reinflection using Linguistically Enhanced Neural Models\". COLING 2020, GeBNLP.","archived":false,"fork":false,"pushed_at":"2024-07-25T10:57:20.000Z","size":77928,"stargazers_count":6,"open_issues_count":2,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-09-09T22:06:32.779Z","etag":null,"topics":["arabic","deep-learning","nlp"],"latest_commit_sha":null,"homepage":"https://aclanthology.org/2020.gebnlp-1.12","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CAMeL-Lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-03-16T08:35:13.000Z","updated_at":"2024-11-16T13:25:33.000Z","dependencies_parsed_at":"2023-07-31T06:30:34.448Z","dependency_job_id":null,"html_url":"https://github.com/CAMeL-Lab/gender-reinflection","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/CAMeL-Lab/gender-reinflection","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CAMeL-Lab%2Fgender-reinflection","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CAMeL-Lab%2Fgender-reinflection/tags","releases_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CAMeL-Lab%2Fgender-reinflection/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CAMeL-Lab%2Fgender-reinflection/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CAMeL-Lab","download_url":"https://codeload.github.com/CAMeL-Lab/gender-reinflection/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CAMeL-Lab%2Fgender-reinflection/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28751065,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-25T09:58:17.166Z","status":"ssl_error","status_checked_at":"2026-01-25T09:55:56.104Z","response_time":113,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arabic","deep-learning","nlp"],"created_at":"2026-01-25T10:03:07.816Z","updated_at":"2026-01-25T10:03:08.747Z","avatar_url":"https://github.com/CAMeL-Lab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Gender-Aware Reinflection using Linguistically Enhanced Neural Models:\nThis repo contains code to reproduce the results in our paper [Gender-Aware Reinflection using Linguistically Enhanced Neural Models](https://www.aclweb.org/anthology/2020.gebnlp-1.12.pdf)\n\n## Requirements:\nThe code was written 
for Python\u003e=3.6 and PyTorch 1.3, although newer versions of PyTorch might work just fine. You will need a few additional packages. Here's how you can set up the environment using conda (assuming you have conda and CUDA installed):\n\n```bash\ngit clone https://github.com/CAMeL-Lab/gender-reinflection.git\ncd gender-reinflection\n\nconda create -n gender_reinflection python=3.6\nconda activate gender_reinflection\n\npip install -r requirements.txt\n```\n\n## Training the model:\n\nTo train the best joint reinflection and identification (`joint+morph`) model we describe in our paper, you need to run `sbatch scripts/train_seq2seq.sh`. Training the model should take around 3 hours on a single GPU, although this may vary based on the GPU you're using. Once the training is done, the trained PyTorch model will be saved in `saved_models/`.\n\n## Inference:\n\nTo get the gender reinflected sentences based on the trained seq2seq model, you would need to run `sbatch scripts/inference_seq2seq.sh`. The inference script will produce 3 files: .beam (beam search with beam size=10), .inf (greedy search), and .beam_greedy (beam search with beam size=1, i.e. greedy search).\u003c/br\u003e\nTo get the gender reinflected sentences based on the bigram MLE model, you would need to run `sbatch scripts/mle_inference.sh`. \u003c/br\u003e\u003c/br\u003e\n\nRefer to [logs/reinflection](/logs/reinflection) to get the reinflected sentences for all the experiments we report on in our paper.\n\n## Reinflection Evaluation:\n\nWe use the M\u003csup\u003e2\u003c/sup\u003e scorer and SacreBLEU in our evaluation. To run the evaluation for the MLE, do-nothing, and joint models we report on in the paper, you would need to run `sbatch scripts/run_eval_norm_joint.sh`. Make sure to change the path of the inference data you want to evaluate on (refer to `SYSTEM_HYP` in `scripts/run_eval_norm_joint.sh`). 
\u003c/br\u003e\u003c/br\u003e\nTo run the evaluation for the disjoint models, you would need to run `sbatch scripts/run_eval_norm_disjoint.sh`. Note that we merge the masculine disjoint system output (`logs/reinflection/disjoint_models/arin.to.M/dev.disjoint+morph.inf.norm`) and the feminine disjoint system output (`logs/reinflection/disjoint_models/arin.to.F/dev.disjoint+morph.inf.norm`) and we evaluate on the merged output (`logs/reinflection/disjoint_models/dev.disjoint+morph.inf.norm`). This is the same as reporting the average of both systems. \u003c/br\u003e\u003c/br\u003e\n\n\n## Gender Identification Evaluation:\n\nTo get the results of gender identification we report for our experiments in the paper, you would need to run `sbatch scripts/gender_identification.sh`. Make sure to change the `inference_data` path and the `inference_mode` based on the experiment you're running. Throughout all experiments, we report the average F\u003csub\u003e1\u003c/sub\u003e score over the masculine and feminine data. \u003c/br\u003e\u003c/br\u003e\nRefer to [logs/gender_id](/logs/gender_id) to get the gender id logs based on how we defined gender identification in our paper.\n\n## Error Analysis:\n\nWe also conduct a simple error analysis to indicate which words changed during inference. This helped us in conducting a more thorough manual error analysis which we reported in the paper. We did the error analysis on the results of our best model (`joint+morph`) on the dev set on the feminine and masculine data separately. To run the error analysis script, you would need to run `sbatch scripts/error_analysis`. Make sure to change the `EXPERIMENT_NAME` to `arin.to.F` to run the error analysis over the feminine dev set results and to `arin.to.M` to run the error analysis over the masculine dev set.  \u003c/br\u003e\u003c/br\u003e\n\nRefer to [logs/error_analysis](/logs/error_analysis) to get the error analysis logs.\n\n## License:\n\nThis repo is available under the MIT license. 
See the [LICENSE file](/LICENSE) for more info.\n\n## Citation:\n\nIf you find the code or data in this repo helpful, please cite [our paper](https://www.aclweb.org/anthology/2020.gebnlp-1.12.pdf):\n\n```bibtex\n@inproceedings{alhafni-etal-2020-gender,\n    title = \"Gender-Aware Reinflection using Linguistically Enhanced Neural Models\",\n    author = \"Alhafni, Bashar  and\n      Habash, Nizar  and\n      Bouamor, Houda\",\n    booktitle = \"Proceedings of the Second Workshop on Gender Bias in Natural Language Processing\",\n    month = dec,\n    year = \"2020\",\n    address = \"Barcelona, Spain (Online)\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://www.aclweb.org/anthology/2020.gebnlp-1.12\",\n    pages = \"139--150\",\n    abstract = \"In this paper, we present an approach for sentence-level gender reinflection using linguistically enhanced sequence-to-sequence models. Our system takes an Arabic sentence and a given target gender as input and generates a gender-reinflected sentence based on the target gender. We formulate the problem as a user-aware grammatical error correction task and build an encoder-decoder architecture to jointly model reinflection for both masculine and feminine grammatical genders. We also show that adding linguistic features to our model leads to better reinflection results. The results on a blind test set using our best system show improvements over previous work, with a 3.6{\\%} absolute increase in M2 F0.5.\",\n}\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcamel-lab%2Fgender-reinflection","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcamel-lab%2Fgender-reinflection","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcamel-lab%2Fgender-reinflection/lists"}