{"id":15612364,"url":"https://github.com/euberdeveloper/datasets-merger","last_synced_at":"2025-03-29T15:13:12.482Z","repository":{"id":97419524,"uuid":"298838704","full_name":"euberdeveloper/datasets-merger","owner":"euberdeveloper","description":"An npm package to quickly merge datasets for machine learning","archived":false,"fork":false,"pushed_at":"2020-09-26T15:53:47.000Z","size":30882,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-05T12:09:01.752Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/euberdeveloper.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-09-26T15:04:13.000Z","updated_at":"2020-09-26T15:53:49.000Z","dependencies_parsed_at":"2023-04-24T02:19:08.430Z","dependency_job_id":null,"html_url":"https://github.com/euberdeveloper/datasets-merger","commit_stats":{"total_commits":6,"total_committers":2,"mean_commits":3.0,"dds":"0.16666666666666663","last_synced_commit":"571bb2f978cbffb9f3b594aebc42f469ed569227"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euberdeveloper%2Fdatasets-merger","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euberdeveloper%2Fdatasets-merger/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euberdeveloper%2Fdatasets-merger/releases","manifests_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euberdeveloper%2Fdatasets-merger/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/euberdeveloper","download_url":"https://codeload.github.com/euberdeveloper/datasets-merger/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246200323,"owners_count":20739566,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-03T06:42:00.657Z","updated_at":"2025-03-29T15:13:12.461Z","avatar_url":"https://github.com/euberdeveloper.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# datasets-merger\nAn npm package to quickly merge datasets for machine learning\n\n## Install\n\nTo install datasets-merger as a local module:\n\n```bash\n$ npm install datasets-merger\n```\n\nTo install datasets-merger as a global module:\n\n```bash\n$ npm install -g datasets-merger\n```\n\n## Purpose\n\nThis package merges two or more **datasets** for machine learning that follow a specific format:\n* Each dataset is a **directory**\n* Each dataset contains a `classes.txt` file\n* Each `classes.txt` file contains a simple **list of classes** (such as objects in a photo) separated by a **newline**\n* Each dataset can contain some `.png` files\n* Each dataset can contain `.txt` files other than `classes.txt`, ideally one for each `.png` file. These files contain multiple rows. 
Each row begins with a **number**, which is the zero-based **index** of the **corresponding** object found in the photo and listed in the `classes.txt` file. This index may be followed by other numbers (such as the coordinates of the objects), but these do not matter for merging.\n\nThe package will simply merge the given datasets, creating a new dataset in the specified destination directory.\n\n## Usage (local module)\n\n```javascript\nconst datasetsMerger = require('datasets-merger');\n\nconst datasetsPaths = [\n    './first_dataset',\n    './second_dataset',\n    './third_dataset'\n];\nconst destination = './destination';\n\ndatasetsMerger(datasetsPaths, destination);\n```\n\n## Usage (global module)\n\n```bash\n$ ds-merger merge --datasets ./first_dataset ./second_dataset --dest ./destination\n```\n\n## Example\n\nThere is an example in this repository, in the path `/example`.\n\nTo run it, go to that folder and execute:\n\n```bash\n$ node mains\n```\n\nIt will create the `destination` folder, which will be the result of the merging operation on the other two folders.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feuberdeveloper%2Fdatasets-merger","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feuberdeveloper%2Fdatasets-merger","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feuberdeveloper%2Fdatasets-merger/lists"}