{"id":19834315,"url":"https://github.com/orangesi/tinypandas","last_synced_at":"2025-08-20T11:15:06.251Z","repository":{"id":201241064,"uuid":"211987934","full_name":"orangeSi/tinypandas","owner":"orangeSi","description":"easy-to-use data structures and data analysis tools ( still be in draft, inspired by Python Pandas )","archived":false,"fork":false,"pushed_at":"2020-09-27T05:41:20.000Z","size":1801,"stargazers_count":9,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-05-01T17:32:09.803Z","etag":null,"topics":["crystal","crystal-lang","csv","dataframe","pandas-python","vcf"],"latest_commit_sha":null,"homepage":"","language":"Crystal","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/orangeSi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-10-01T01:07:22.000Z","updated_at":"2024-05-27T20:58:27.000Z","dependencies_parsed_at":null,"dependency_job_id":"3b9b3b36-7c24-43f7-be0f-8c86d768048e","html_url":"https://github.com/orangeSi/tinypandas","commit_stats":null,"previous_names":["orangesi/tinypandas"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/orangeSi/tinypandas","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/orangeSi%2Ftinypandas","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/orangeSi%2Ftinypandas/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/orangeSi%2Ftinypandas/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/orangeSi%2Ftinypandas/manifests","owner_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/owners/orangeSi","download_url":"https://codeload.github.com/orangeSi/tinypandas/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/orangeSi%2Ftinypandas/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266409697,"owners_count":23924288,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-21T11:47:31.412Z","response_time":64,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crystal","crystal-lang","csv","dataframe","pandas-python","vcf"],"created_at":"2024-11-12T12:03:54.944Z","updated_at":"2025-07-22T01:35:18.068Z","avatar_url":"https://github.com/orangeSi.png","language":"Crystal","funding_links":[],"categories":[],"sub_categories":[],"readme":"# tinypandas\n\nTODO: Write a description here\n\n## Installation\n\n1. Add the dependency to your `shard.yml`:\n\n   ```yaml\n   dependencies:\n     tinypandas:\n       github: orangeSi/tinypandas\n   ```\n\n2. Run `shards install`\n\n## Features\n```\n1. 
supports tab-separated, CSV, and VCF format files\n```\n## Usage\n\nTest code is in ```example/test.cr```, like this:\n```crystal\nrequire \"tinypandas\"\n\npd = Tinypandas.new\n\n## supports tab-separated files\ndf = pd.read_table(ifile, sep: \"\\t\") # def read_table(filepath_or_buffer : String, sep = \"\\t\", delimiter : String = \"\\n\", header : HeaderType = 0, index_col : IndexColType = 0, comment : String|Regex = \"#\", skiprows : SkiprowsType = false, skip_blank_lines : Bool = true)\n\nputs \"df is #{df}\\n\"\n\nputs \"df.to_str is\\n#{df.to_str}\\n\"\n\nputs \"df[A2][B3] is #{df[\"A2\"][\"B3\"]}\\n\"\n\nputs \"df[df[A2]\u003e=5].to_str is\"\nputs df[df[\"A2\"]\u003e=5].to_str\n\nputs \"df[df[A3]==9][A2].to_str is \"\nputs df[df[\"A3\"]==9][\"A2\"].to_str\n\nputs \"df[df[A3]\u003e=3][A2].to_str is \"\nputs df[df[\"A3\"]\u003e=3][\"A2\"].to_str\n\nt = df[\"A2\"]\nputs \"t = df[A2]is #{t}\"\nputs \"t\u003e2 is #{t\u003e2}\"\n\nputs \"df.t.to_str is\\n#{df.t.to_str}\"\n\nputs \"df.t[B3][A1] is \"\nputs df.t[\"B3\"][\"A1\"]\n\n\n## supports VCF format files\ndf = pd.load_vcf(\"demo.vcf\")\nputs \"df.head(1).to_s is\\n\"\nputs df.head(1).to_s\nputs \"\\n\"\n\n## supports CSV format files\ndf = pd.load_csv(\"sample.csv\")\nputs \"df is #{df}\\n\"\nputs \"df.to_str is\\n#{df.to_str}\\n\"\nputs \"df[col2][2] is #{df[\"col2\"][\"2\"]}\\n\"\n\n\n## convert Array(Array) to DataFrame\ndata = [[1,2,3],[4,5,6],[6,7,8]]\ndf = DataFrame.new(data, columns: [\"c1\",\"c2\",\"c3\"]) # read_array_by_row: true\nputs \"\\nArray(Array()):#{data} to DataFrame:\\n#{df.to_s}\"\n\n## read Hash(String, Array()) as DataFrame\ndata = {\"c1\"=\u003e[1,2,3], \"c2\"=\u003e[4,5,6], \"c3\"=\u003e[6,7,8]}\ndf = DataFrame.new(data)\nputs \"\\nHash(String, Array()):#{data} to DataFrame:\\n#{df.to_s}\"\n\n```\nThen build the example: ```cd example; crystal build test.cr --release```\n```\n$ cat demo.xls\n# note\n\tA1\tA3\tA2\nB1\t1\t3\t2\nB2\t7\t2\t8\nB3\t4\t9\t5\n```\nthen ```./test 
demo.xls``` or ```./test demo.xls.gz```\nwill get this:\n```\n## support seprate by tab format file\nintpu file demo.xls\n\ndf is DataFrame(@dict={\"A1\" =\u003e Series(@dict={\"B1\" =\u003e 1, \"B2\" =\u003e 7, \"B3\" =\u003e 4}), \"A3\" =\u003e Series(@dict={\"B1\" =\u003e 3, \"B2\" =\u003e 2, \"B3\" =\u003e 9}), \"A2\" =\u003e Series(@dict={\"B1\" =\u003e 2, \"B2\" =\u003e 8, \"B3\" =\u003e 5})}, @index=[\"B1\", \"B2\", \"B3\"], @columns=[\"A1\", \"A3\", \"A2\"])\n\ndf.to_str is\n\tA1\tA3\tA2\nB1\t1\t3\t2\nB2\t7\t2\t8\nB3\t4\t9\t5\n\ndf[A2][B3] is 5\ndf[df[A2]\u003e=5].to_str is\n\tA1\tA3\tA2\nB2\t7\t2\t8\nB3\t4\t9\t5\n\ndf[df[A3]==9][A2].to_str is \nB3\t5\n\ndf[df[A3]\u003e=3][A2].to_str is \nB1\t2\nB3\t5\nt = df[A2]is Series(@dict={\"B1\" =\u003e 2, \"B2\" =\u003e 8, \"B3\" =\u003e 5})\nt\u003e2 is Series(@dict={\"B2\" =\u003e 8, \"B3\" =\u003e 5})\n\ndf.t.to_str is\n\tB1\tB2\tB3\nA1\t1\t7\t4\nA3\t3\t2\t9\nA2\t2\t8\t5\n\ndf.t[B3][A1] is \n4\n\n## support vcf format file\ndf.head(1).to_s is\n\t#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tHG00096\tHG00097\tHG00099\n0\tMT\t10\t.\tT\tC\t100\tfa\tVT=S;AC=3\tGT\t0\t0\t0\n\n## support csv format file\ndf is DataFrame(@dict={\"date\" =\u003e Series(@dict={\"0\" =\u003e \"2020-02-01 12:00:02\", \"1\" =\u003e \"2020-02-01 12:00:07\", \"2\" =\u003e \"2020-02-01 12:00:12\", \"3\" =\u003e \"2020-02-01 12:00:17\", \"4\" =\u003e \"2020-02-01 12:00:22\", \"5\" =\u003e \"2020-02-01 12:00:27\", \"6\" =\u003e \"2020-02-01 12:00:32\", \"7\" =\u003e \"2020-02-01 12:00:37\"}), \"col1\" =\u003e Series(@dict={\"0\" =\u003e 66808, \"1\" =\u003e 66873, \"2\" =\u003e 66875, \"3\" =\u003e 66874, \"4\" =\u003e 66881, \"5\" =\u003e 66858, \"6\" =\u003e 66905, \"7\" =\u003e 66885}), \"col2\" =\u003e Series(@dict={\"0\" =\u003e 0.68, \"1\" =\u003e 0.67, \"2\" =\u003e 0.65, \"3\" =\u003e 0.67, \"4\" =\u003e 0.67, \"5\" =\u003e 0.66, \"6\" =\u003e 0.64, \"7\" =\u003e 0.66}), \"col3\" =\u003e Series(@dict={\"0\" =\u003e \"TRUE\", 
\"1\" =\u003e \"FALSE\", \"2\" =\u003e \"TRUE\", \"3\" =\u003e \"FALSE\", \"4\" =\u003e \"TRUE\", \"5\" =\u003e \"FALSE\", \"6\" =\u003e \"TRUE\", \"7\" =\u003e \"FALSE\"}), \"col4\" =\u003e Series(@dict={\"0\" =\u003e \"str1\", \"1\" =\u003e \"str2\", \"2\" =\u003e \"str3\", \"3\" =\u003e \"str4\", \"4\" =\u003e \"str5\", \"5\" =\u003e \"str6\", \"6\" =\u003e \"str7\", \"7\" =\u003e \"str8\"})}, @index=[\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\"], @columns=[\"date\", \"col1\", \"col2\", \"col3\", \"col4\"])\ndf.to_str is\n\tdate\tcol1\tcol2\tcol3\tcol4\n0\t2020-02-01 12:00:02\t66808\t0.68\tTRUE\tstr1\n1\t2020-02-01 12:00:07\t66873\t0.67\tFALSE\tstr2\n2\t2020-02-01 12:00:12\t66875\t0.65\tTRUE\tstr3\n3\t2020-02-01 12:00:17\t66874\t0.67\tFALSE\tstr4\n4\t2020-02-01 12:00:22\t66881\t0.67\tTRUE\tstr5\n5\t2020-02-01 12:00:27\t66858\t0.66\tFALSE\tstr6\n6\t2020-02-01 12:00:32\t66905\t0.64\tTRUE\tstr7\n7\t2020-02-01 12:00:37\t66885\t0.66\tFALSE\tstr8\n\ndf[col2][2] is 0.65\n\nArray(Array()):[[1, 2, 3], [4, 5, 6], [6, 7, 8]] to DataFrame:\n\tc1\tc2\tc3\n0\t1\t2\t3\n1\t4\t5\t6\n2\t6\t7\t8\n\nHash(String, Array()):{\"c1\" =\u003e [1, 2, 3], \"c2\" =\u003e [4, 5, 6], \"c3\" =\u003e [6, 7, 8]} to DataFrame:\n\tc1\tc2\tc3\n0\t1\t4\t6\n1\t2\t5\t7\n2\t3\t6\t8\n\n```\n\nTODO: Write usage instructions here\n\n## Development\n\nTODO: Write development instructions here\n\n## Contributing\n\n1. Fork it (\u003chttps://github.com/orangeSi/tinypandas/fork\u003e)\n2. Create your feature branch (`git checkout -b my-new-feature`)\n3. Commit your changes (`git commit -am 'Add some feature'`)\n4. Push to the branch (`git push origin my-new-feature`)\n5. 
Create a new Pull Request\n\n## Contributors\n\n- [orangeSi](https://github.com/orangeSi) - creator and maintainer\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Forangesi%2Ftinypandas","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Forangesi%2Ftinypandas","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Forangesi%2Ftinypandas/lists"}