{"id":27483354,"url":"https://github.com/hectorpulido/peque-nlu","last_synced_at":"2025-04-16T15:46:14.236Z","repository":{"id":207484504,"uuid":"637275837","full_name":"HectorPulido/peque-nlu","owner":"HectorPulido","description":"Peque-NLU (Natural Language Understanding) is a Python library that allows to parse sentences written in natural language and extracts intends, features and information.","archived":false,"fork":false,"pushed_at":"2024-01-26T22:12:12.000Z","size":37,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-01-26T23:26:12.896Z","etag":null,"topics":["ai","data-science","intent","intent-classification","machine-learning","nlp","nlu","pequesoft","text-classification"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HectorPulido.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-05-07T03:40:02.000Z","updated_at":"2024-01-26T23:26:14.647Z","dependencies_parsed_at":"2024-01-26T23:26:13.911Z","dependency_job_id":"95f3755f-4411-4de1-babf-38ecac444ab9","html_url":"https://github.com/HectorPulido/peque-nlu","commit_stats":null,"previous_names":["hectorpulido/peque-nlu"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HectorPulido%2Fpeque-nlu","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HectorPulido%2Fpeque-nlu/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HectorPulido%2Fpeque-nlu/rel
eases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HectorPulido%2Fpeque-nlu/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/HectorPulido","download_url":"https://codeload.github.com/HectorPulido/peque-nlu/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249256510,"owners_count":21238988,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","data-science","intent","intent-classification","machine-learning","nlp","nlu","pequesoft","text-classification"],"created_at":"2025-04-16T15:46:13.459Z","updated_at":"2025-04-16T15:46:14.210Z","avatar_url":"https://github.com/HectorPulido.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Peque NLU - Natural Language Understanding with Machine Learning\nPeque-NLU (Natural Language Understanding) is a Python library that parses sentences written in natural language and extracts intents, features, and other information.\n\nFor example: `quiero conocer el ultimo blogpost de unity`\n\nResult: Timing -\u003e latest, Technology -\u003e unity, Intention -\u003e search\n\n## Table of Contents\n\n- [Features](#features)\n- [Use Cases](#use-cases)\n- [Getting Started](#getting-started)\n    - [Prerequisites](#prerequisites)\n    - [Installation](#installation)\n- [Usage](#usage)\n- [Contributing](#contributing)\n- [License](#license)\n- [Contact](#contact)\n\n## Features\n- Feature extraction from text\n- Algorithm-agnostic: you can use SGD, MLNN, LLMs, Word2Vec, etc.\n- 100% Free and 
Open source\n\n## Use cases\n- Chatbots: detect the user's intent and extract features\n- Search engines: extract keywords and intent from semantic information\n- Data mining: classify text and unstructured data without boilerplate\n\n\n## Getting Started\n\n### Prerequisites\n\n- Python 3.6+\n\n### Installation\n\n\u003e [!WARNING]\n\u003e Installation via pip is coming soon.\n\n1. Clone this repo\n```\ngit clone git@github.com:HectorPulido/peque-nlu.git\n```\n2. Install the requirements\n```\npip install -r requirements.txt\n```\n3. Use the library\n```py\nfrom peque_nlu.intent_engines import SGDIntentEngine\nfrom peque_nlu.intent_classifiers import ModelIntentClassifier\n\n# Path to your intents dataset; the repository's example file is assumed here\nDATASET_PATH = \"intents_example.json\"\n\n# Build a Spanish intent classifier on top of the SGD engine and train it\nintent_engine = SGDIntentEngine(\"spanish\")\nmodel = ModelIntentClassifier(\"spanish\", intent_engine)\nmodel.fit(DATASET_PATH)\n\n# Predict the intent of several sentences at once\nprediction = model.multiple_predict(\n    [\n        \"Hola como te encuentras?\",\n        \"Quiero aprender sobre lo último de python\",\n        \"describeme usando un meme\",\n    ]\n)\n\n# One result per input sentence; these results carry no \"features\" key\nassert len(prediction) == 3\nfirst_prediction = prediction[0]\nassert \"intent\" in first_prediction\nassert \"probability\" in first_prediction\nassert \"text\" in first_prediction\nassert \"features\" not in first_prediction\n\nassert first_prediction[\"intent\"] == \"small_talk\"\n```\n\n## Usage\nBefore you start, you need to provide the algorithm with a dataset. You can use [this example](https://github.com/HectorPulido/peque-nlu/blob/main/intents_example.json) as a base.\n```json\n{\n    \"intents\": {\n        \"small_talk\": [\n            \"hola\",\n            ...\n\n        ],\n        \"fun_phrases\": [\n            \"eres gracioso\",\n            ...\n        ],\n        \"meme\": [\n            \"¿conoces algun buen meme?\",\n            ...\n        ],\n        \"thanks\": [\n            \"gracias\",\n            ...\n        ]\n    },\n    \"entities\": {\n        \"technology\": [\n            \"python\",\n            ...\n        ],\n        \"timing\": [\n            \"recient\",\n            ...\n 
       ]\n    }\n}\n```\n\nOnce your dataset is in this format, you can load it and fit the model.\n```py\nintent_engine = SGDIntentEngine(\"spanish\")\nmodel = ModelIntentClassifier(\"spanish\", intent_engine)\nmodel.fit(DATASET_PATH)\n```\n\nYou can also save and load trained models to avoid retraining.\n```py\n# Save the fitted engine\nsaver = PickleSaver()\nsaver.save(intent_engine, PICKLE_PATH)\n\n# Load a previously saved engine\nintent_engine_loaded = SGDIntentEngine(\"spanish\")\nintent_engine_loaded = saver.load(PICKLE_PATH)\n```\n\nThen you can predict the intent of a text and extract its features:\n```py\nprediction = model.predict(\"quiero conocer el ultimo blogpost de unity\")\n```\n\nResponse:\n```\n{\n    \"intent\": \"search\",\n    \"features\": [\n        {\n            \"word\": \"ultimo\",\n            \"entity\": \"timing\",\n            \"similarities\": 1\n        },\n        {\n            \"word\": \"otro_ejemplo\",\n            \"entity\": \"otra_entidad\",\n            \"similarities\": 0.9\n        }\n    ]\n}\n```\n\n## Contributing\n\nYour contributions are greatly appreciated! Please follow these steps:\n\n1. Fork the project\n2. Create your feature branch: `git checkout -b feature/MyFeature`\n3. Commit your changes: `git commit -m \"my cool feature\"`\n4. Push to the branch: `git push origin feature/MyFeature`\n5. 
Open a Pull Request\n\n## License\n\nAll code I wrote for this project is released under the MIT license.\n\n## Contact\n\n\u003chr\u003e\n\u003cdiv align=\"center\"\u003e\n\u003ch3 align=\"center\"\u003eLet's connect 😋\u003c/h3\u003e\n\u003c/div\u003e\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://www.linkedin.com/in/hector-pulido-17547369/\" target=\"blank\"\u003e\n\u003cimg align=\"center\" width=\"30px\" alt=\"Hector's LinkedIn\" src=\"https://www.vectorlogo.zone/logos/linkedin/linkedin-icon.svg\"/\u003e\u003c/a\u003e \u0026nbsp; \u0026nbsp;\n\u003ca href=\"https://twitter.com/Hector_Pulido_\" target=\"blank\"\u003e\n\u003cimg align=\"center\" width=\"30px\" alt=\"Hector's Twitter\" src=\"https://www.vectorlogo.zone/logos/twitter/twitter-official.svg\"/\u003e\u003c/a\u003e \u0026nbsp; \u0026nbsp;\n\u003ca href=\"https://www.twitch.tv/hector_pulido_\" target=\"blank\"\u003e\n\u003cimg align=\"center\" width=\"30px\" alt=\"Hector's Twitch\" src=\"https://www.vectorlogo.zone/logos/twitch/twitch-icon.svg\"/\u003e\u003c/a\u003e \u0026nbsp; \u0026nbsp;\n\u003ca href=\"https://www.youtube.com/channel/UCS_iMeH0P0nsIDPvBaJckOw\" target=\"blank\"\u003e\n\u003cimg align=\"center\" width=\"30px\" alt=\"Hector's YouTube\" src=\"https://www.vectorlogo.zone/logos/youtube/youtube-icon.svg\"/\u003e\u003c/a\u003e \u0026nbsp; \u0026nbsp;\n\u003ca href=\"https://pequesoft.net/\" target=\"blank\"\u003e\n\u003cimg align=\"center\" width=\"30px\" alt=\"Pequesoft website\" src=\"https://github.com/HectorPulido/HectorPulido/blob/master/img/pequesoft-favicon.png?raw=true\"/\u003e\u003c/a\u003e \u0026nbsp; \u0026nbsp;\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhectorpulido%2Fpeque-nlu","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhectorpulido%2Fpeque-nlu","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhectorpulido%2Fpeque-nlu/lists"}