{"id":17718553,"url":"https://github.com/linuxscout/naftawayh","last_synced_at":"2025-09-23T08:32:37.858Z","repository":{"id":138755297,"uuid":"130862727","full_name":"linuxscout/naftawayh","owner":"linuxscout","description":"Naftawayh: arabic word tagger","archived":false,"fork":false,"pushed_at":"2020-08-27T19:21:17.000Z","size":3132,"stargazers_count":12,"open_issues_count":0,"forks_count":3,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-09-20T14:49:16.566Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/linuxscout.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"patreon":"linuxscout"}},"created_at":"2018-04-24T13:58:07.000Z","updated_at":"2022-11-12T08:45:26.000Z","dependencies_parsed_at":null,"dependency_job_id":"7c175d92-f3c5-4c26-9011-02ddca034d66","html_url":"https://github.com/linuxscout/naftawayh","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linuxscout%2Fnaftawayh","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linuxscout%2Fnaftawayh/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linuxscout%2Fnaftawayh/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linuxscout%2Fnaftawayh/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/linuxscout","download_url":"
https://codeload.github.com/linuxscout/naftawayh/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":233957782,"owners_count":18757147,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-25T14:54:50.431Z","updated_at":"2025-09-23T08:32:31.631Z","avatar_url":"https://github.com/linuxscout.png","language":"Python","funding_links":["https://patreon.com/linuxscout"],"categories":[],"sub_categories":[],"readme":"# نفطويه: تصنيف الكلمات العربية\n## Naftawayh: Arabic Word Tagger\nNaftawayh is a Python library for Arabic word tagging: it classifies words into types (nouns, verbs, stopwords), which is useful in language processing, especially text mining. Naftawayh relies on the structure of the Arabic word and guesses the word class from certain surface signs. For example, a word that ends with Teh Marbuta is a noun, and a word that contains Hamza below Alef is classed as a noun. Many kinds of words can be identified by patterns, especially present-tense verbs and definite words (those carrying the article Alef-Lam). \n\nنفطويه هو برنامج ومكتبة لتصنيف الكلمات إلى أنواعها (اسم، فعل، حرف)، ويفيد في المعالجة الآلية للغة وخصوصا التنقيب عن المعلومات، ومبدأه يعمل على بنية الكلمة العربية، وقدرتنا على تخمين نوعها، من خلال علامات معينة. فمثلا كل كلمة تنتهي بتاء مربوطة فهي اسم، وكل كلمة فيها همزة تحت الألف اسم. ويمكننا التعرف على كثير من الكلمات المعرّفة بالألف واللام، وبعض أنماط الأفعال المضارعة. 
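The surface signs described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Naftawayh's actual implementation: the function name `guess_is_noun` and the three constants are hypothetical, and a plain rule set like this will misclassify some real words.

```python
# Hypothetical sketch of the surface-sign heuristic described in the README.
# None of these names come from the naftawayh package.
TEH_MARBUTA = 'ة'        # word-final Teh Marbuta
ALEF_HAMZA_BELOW = 'إ'   # Alef with Hamza below
DEFINITE_ARTICLE = 'ال'  # Alef-Lam prefix of definite words


def guess_is_noun(word):
    # Sign 1: a word ending with Teh Marbuta is taken as a noun.
    if word.endswith(TEH_MARBUTA):
        return True
    # Sign 2: a word containing Hamza below Alef is taken as a noun.
    if ALEF_HAMZA_BELOW in word:
        return True
    # Sign 3: a word starting with the definite article is taken as a noun
    # (assumption: require some letters after the article).
    if word.startswith(DEFINITE_ARTICLE) and len(word) > 3:
        return True
    # No sign found: this sketch cannot decide, so it answers False.
    return False
```

A `False` result here only means that none of the three signs was found; it does not mean the word is a verb or a stopword.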
\n\n\n  Developers:  Taha Zerrouki: http://tahadz.com\n    taha dot zerrouki at gmail dot com\n\nFeatures |   value\n---------|---------------------------------------------------------------------------------\nAuthors  | Taha Zerrouki: http://tahadz.com,  taha dot zerrouki at gmail dot com\nRelease  | 0.3\nLicense  |[GPL](https://github.com/linuxscout/naftawayh/master/LICENSE)\nTracker  |[linuxscout/naftawayh/Issues](https://github.com/linuxscout/naftawayh/issues)\nWebsite  |[https://pypi.python.org/pypi/naftawayh](https://pypi.python.org/pypi/naftawayh)\nDoc  |[Package documentation](http://pythonhosted.org/naftawayh/)\nSource  |[GitHub](http://github.com/linuxscout/naftawayh)\nDownload  |[pypi.python.org](https://pypi.python.org/pypi/naftawayh)\nFeedback  |[Comments](https://github.com/linuxscout/naftawayh/issues)\nAccounts  |[@Twitter](https://twitter.com/linuxscout)  [@Sourceforge](http://sourceforge.net/projects/naftawayh/)\n\n\n\n## Citation\nIf you cite it in academic work, please use this citation:\n```\nT. 
Zerrouki, Naftawayh,  Arabic Word Tagger,\n  https://pypi.python.org/pypi/naftawayh/, 2010\n```\nor in BibTeX format:\n\n```bibtex\n@misc{zerrouki2012naftawayh,\n  title={Naftawayh : Arabic Word Tagger},\n  author={Zerrouki, Taha},\n  url={https://pypi.python.org/pypi/naftawayh},\n  year={2010}\n}\n```\n\n\n### Applications\n* Text mining.\n* Text summarizing.\n* Sentence identification.\n* Grammar analysis.\n* Morphological analysis acceleration.\n* Extraction of ngrams.\n\n### تطبيقات\n* التنقيب عن المعلومات.\n* تلخيص النص.\n* التعرف على الجمل.\n* التحليل النحوي.\n* تسريع التحليل الصرفي.\n* استخراج المصطلحات والمسكوكات والمتلازمات.\n\n### من هو نفطويه Who is Naftawayh\n\n![Who is Naftawayh?](images/naftawayh_sample.png \"Who is Naftawayh?\")\n\n\n### Demo جرّب\n\nيمكن التجربة على [موقع مشكال](http://tahadz.com/mishkal)\n، اختر أدوات، ثم استخلاص ثم تصنيف\n\nYou can test it on the [Mishkal Site](http://tahadz.com/mishkal): choose Tool \u003e extraction \u003e Classify.\n![Naftawayh Demo](images/naftawayh_demo.png \"Naftawayh Demo\")\n\n\n\n### Installation\n\n```\npip install naftawayh\n```\n\n### Usage\n\n```python\nimport naftawayh.wordtag as wordtag\n```\n\n* Test a word list\n\n```python\n\u003e\u003e\u003e import naftawayh.wordtag\n\u003e\u003e\u003e word_list=(u'بالبلاد', u'بينما', u'أو', u'انسحاب', u'انعدام', \n... u'انفجار', u'البرنامج', u'بانفعالاتها', u'العربي', u'الصرفي', \n... u'التطرف', u'اقتصادي', )\n\u003e\u003e\u003e tagger = naftawayh.wordtag.WordTagger()\n\u003e\u003e\u003e # test all words\n\u003e\u003e\u003e list_tags = tagger.word_tagging(word_list)\n\u003e\u003e\u003e for word, tag in zip(word_list, list_tags):\n...     print(word, tag)\nبالبلاد n\nبينما vn3\nأو t\nانسحاب n\nانعدام n\nانفجار n\nالبرنامج n\nبانفعالاتها n\nالعربي n\nالصرفي n\nالتطرف n\nاقتصادي n\n```\n* Test word by word\n\n```python\n\u003e\u003e\u003e import naftawayh.wordtag \n\u003e\u003e\u003e word_list=(u'بالبلاد', u'بينما', u'أو', u'انسحاب', u'انعدام', 
\n... u'انفجار', u'البرنامج', u'بانفعالاتها', u'العربي', u'الصرفي', \n... u'التطرف', u'اقتصادي', )\n\u003e\u003e\u003e tagger = naftawayh.wordtag.WordTagger()\n\u003e\u003e\u003e # test word by word\n\u003e\u003e\u003e for word in word_list:\n...     if tagger.is_noun(word):\n...         print(u'%s is noun'%word)\n...     if tagger.is_verb(word):\n...         print(u'%s is verb'%word)\n...     if tagger.is_stopword(word):\n...         print(u'%s is stopword'%word)\nبالبلاد is noun\nبينما is noun\nبينما is verb\nأو is noun\nأو is verb\nأو is stopword\nانسحاب is noun\nانعدام is noun\nانفجار is noun\nالبرنامج is noun\nبانفعالاتها is noun\nالعربي is noun\nالصرفي is noun\nالتطرف is noun\nاقتصادي is noun\n\n```\n* Test word in context\n\n```python\n\u003e\u003e\u003e import naftawayh.wordtag \n\u003e\u003e\u003e word_list=(u'بالبلاد', u'بينما', u'أو', u'انسحاب', u'انعدام', \n... u'انفجار', u'البرنامج', u'بانفعالاتها', u'العربي', u'الصرفي', \n... u'التطرف', u'اقتصادي', )\n\u003e\u003e\u003e tagger = naftawayh.wordtag.WordTagger()\n\u003e\u003e\u003e previous_word=\"\"\n\u003e\u003e\u003e print (\" **** test words in context***\")\n\u003e\u003e\u003e # test words in context\n\u003e\u003e\u003e for word in word_list:\n...     tag=tagger.context_analyse(previous_word,word)\n...     print(u\"%s from context is %s \"%(word,tag))\n...     previous_word=word\n**** test words in context***\nبالبلاد from context is vn \nبينما from context is vn \nأو from context is vn \nانسحاب from context is vn \nانعدام from context is vn \nانفجار from context is vn \nالبرنامج from context is vn \nبانفعالاتها from context is vn \nالعربي from context is vn \nالصرفي from context is vn \nالتطرف from context is vn \nاقتصادي from context is vn 
\n\n```\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinuxscout%2Fnaftawayh","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flinuxscout%2Fnaftawayh","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinuxscout%2Fnaftawayh/lists"}