{"id":19232065,"url":"https://github.com/0xibra/fluxify","last_synced_at":"2026-04-14T15:33:45.384Z","repository":{"id":53531052,"uuid":"223007968","full_name":"0xIbra/fluxify","owner":"0xIbra","description":"A micro python library that retrieves and organizes data from a yaml mapping.","archived":false,"fork":false,"pushed_at":"2021-03-25T23:16:25.000Z","size":167,"stargazers_count":2,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-01-05T00:16:45.429Z","etag":null,"topics":["csv","data-flow","data-flow-control","data-manipulation","data-mapper","data-structure","json","mapping","python","xml","yaml","yaml-mapping"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/0xIbra.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-11-20T19:07:02.000Z","updated_at":"2021-02-24T16:23:04.000Z","dependencies_parsed_at":"2022-09-02T11:01:07.892Z","dependency_job_id":null,"html_url":"https://github.com/0xIbra/fluxify","commit_stats":null,"previous_names":["ibragim64/fluxify"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xIbra%2Ffluxify","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xIbra%2Ffluxify/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xIbra%2Ffluxify/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xIbra%2Ffluxify/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/0xIbra","download_url":"https://codeload.
github.com/0xIbra/fluxify/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240307384,"owners_count":19780815,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","data-flow","data-flow-control","data-manipulation","data-mapper","data-structure","json","mapping","python","xml","yaml","yaml-mapping"],"created_at":"2024-11-09T16:05:14.965Z","updated_at":"2026-04-14T15:33:45.358Z","avatar_url":"https://github.com/0xIbra.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Fluxify\n\u003e A Python package that eases the process of retrieving, organizing and altering data.\n\n####  Required packages\n- **pandas**\n- **imperium**\n- **ijson**\n\n## Installation\n```bash\npip install fluxify\n```\n\n## Main classes\n#####  `fluxify.mapper.Mapper`\nThis class is used to read and quickly process files whose data is small enough to be loaded into memory.\n\n##### `fluxify.lazy_mapper.LazyMapper`\nAs you have probably guessed, this class is used to iterate over large data files, whether the format is CSV,\nJSON or XML. 
\n\n## Usage\nRetrieve data from a simple CSV file:\n```csv\nid,brand,price,state,published_at\n938,Xaomi,390.90,used,2020-01-03 12:32:29\n04593,iPhone,1299.90,new,2020-01-02 09:48:12\n```\n#### Mapper implementation\n```python\nfrom fluxify.mapper import Mapper\nimport yaml\n\n# Could also be loaded from a file\nyamlmapping = \"\"\"\nbrand:\n    col: 1\nprice:\n    col: 2\nstate:\n    col: 3\npublish_date:\n    col: 4\n    transformations:\n        - { transformer: 'date', in_format: '%Y-%m-%d %H:%M:%S', out_format: '%H:%M %d/%m/%Y' }\nis_new:\n    conditions:\n        -\n            condition: \"subject['state'] == 'new'\"\n            returnOnSuccess: True\n            returnOnFail: False\n\"\"\"\n\nMap = yaml.load(yamlmapping, Loader=yaml.FullLoader)\nmapper = Mapper(_type='csv')\ndata = mapper.map('path/to/csvfile.csv', Map)\nprint(data)\n```\n**Output**\n```bash\n[\n    {\n        'brand': 'Xaomi',\n        'price': '390.90',\n        'state': 'used',\n        'publish_date': '12:32 03/01/2020',\n        'is_new': False\n    },\n    {\n        'brand': 'iPhone',\n        'price': '1299.90',\n        'state': 'new',\n        'publish_date': '09:48 02/01/2020',\n        'is_new': True\n    }\n]\n```\n\n#### LazyMapper implementation\nThe `LazyMapper` does not return all the mapped data at the end.  \nInstead, it maps the data in batches of a size you specify, so that memory is not exhausted.\n\n```python\nfrom fluxify.lazy_mapper import LazyMapper\nimport yaml\n\n# Could also be loaded from a file\nyamlmapping = \"\"\"\nbrand:\n    col: 1\nprice:\n    col: 2\nstate:\n    col: 3\npublish_date:\n    col: 4\n    transformations:\n        - { transformer: 'date', in_format: '%Y-%m-%d %H:%M:%S', out_format: '%H:%M %d/%m/%Y' }\nis_new:\n    conditions:\n        -\n            condition: \"subject['state'] == 'new'\"\n            returnOnSuccess: True\n            returnOnFail: False\n\"\"\"\n\nMap = yaml.load(yamlmapping, Loader=yaml.FullLoader)\nmapper = 
LazyMapper(_type='csv', error_tolerance=True, bulksize=500)\n\ndef some_callback(results):\n    for item in results:\n        pass # Perform some action\n\nmapper.set_callback(some_callback)\n\nmapper.map('path/to/csvfile.csv', Map)\n```\nAs you can see, in this example the mapper will call the callback function every time it accumulates 500 mapped items.\n\n### Mapping settings\nThe `col` key specifies the column number or attribute name from which the value is retrieved.  \nTo use the entire input data as the retrieved value, set `col` to `_all_`.\n\nThe `transformations` key is used to apply transformations to the retrieved value.  Available transformers are listed below.\n\nThe `conditions` key is used to apply conditions and alter the retrieved value.  \nThese conditions are in Python syntax, but you may not use all of Python's native functions.  \nAvailable functions are listed below.\n\nThe `default` key defines a default value for when a retrieved value is **null**.  \n**Warning**: If the `default` key is defined with a value, it will be applied before applying transformations\nand conditions.\n\n##### Special cases for JSON and XML\n**XML**  \nSet `multiple` to `true` if you want to retrieve data from multiple XML tags with the same name.  \nUse the `index` key with `multiple: true` if you wish to retrieve only one value from a number of XML tags.  \n\nWhen retrieving an XML value, the default behaviour is to retrieve the `.text` value of the tag.  \nIf you wish to change this, to retrieve a tag containing many other tags, use the `raw` key and set it to `false`.  \nThis returns an object of type `xml.etree.Element`; you can later apply transformations to this object to alter,\n organize and retrieve the data.\n\n**JSON**  \nUse the `index` key to retrieve a specific value from an array.  
\nThis only works if the retrieved value is of type **array**.\n\n### Supported formats\n\nFormat      | CSV | JSON | XML | TXT\n------------|-----|------|-----|-----\nSupported   | YES | YES  | YES | NO\n\n## Transformers\nFluxify has built-in transformers that can alter/modify the data.\n\nFunction        | Arguments                         | Description\n----------------|-----------------------------------|--------------\n**number**      | stringvalue                       | Parses a string to an **integer** or **float** value.\n**split**       | delimiter, index                  | Splits a string on a **delimiter** and returns the list of parts if the **index** argument is not defined.\n**date**        | in_format, out_format             | Lets you format a date string to the desired format.\n**replace**     | search, new                       | Replaces the **search** value with the **new** value in the string.\n**boolean**     | No arguments                      | Parses a string to Boolean if the string contains `true`, `false`, `1` or `0`.\n**equipments_from_string** | delimiter              | Custom usage\n**options_from_string**    | delimiter              | Custom usage\n\n## Exceptions\nFluxify has different exception classes for different error conditions.\nThey reside in the **exceptions** sub-package ```fluxify.exceptions```\n\nClass                                   | Arguments             | Description\n----------------------------------------|-----------------------|-------------\n**ArgumentNotFoundException**           | message               | This exception is raised whenever an argument is not found.\n**InvalidArgumentException**            | message               | This exception is raised when a passed parameter/argument is invalid.\n**ConditionNotFoundException**          | message               | This exception is raised when the \"condition\" key is not defined in the mapping.\n**UnsupportedTransformerException**     | message               | 
This exception is raised when a transformer other than the ones defined above is used.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F0xibra%2Ffluxify","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F0xibra%2Ffluxify","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F0xibra%2Ffluxify/lists"}