{"id":15384693,"url":"https://github.com/bblodfon/elixir-task","last_synced_at":"2026-05-18T00:31:42.973Z","repository":{"id":90675187,"uuid":"430666240","full_name":"bblodfon/elixir-task","owner":"bblodfon","description":"Python development task for an ELIXIR Engineer position","archived":false,"fork":false,"pushed_at":"2021-11-24T21:10:53.000Z","size":50,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-11T05:53:25.909Z","etag":null,"topics":["elixir"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bblodfon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-11-22T10:45:10.000Z","updated_at":"2021-11-24T21:10:56.000Z","dependencies_parsed_at":null,"dependency_job_id":"c053689e-d8b6-4f5e-84f1-566add096e35","html_url":"https://github.com/bblodfon/elixir-task","commit_stats":{"total_commits":40,"total_committers":2,"mean_commits":20.0,"dds":"0.025000000000000022","last_synced_commit":"8ed918b8dceaa90fa56bfd96c1789280e074ab9b"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bblodfon/elixir-task","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bblodfon%2Felixir-task","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bblodfon%2Felixir-task/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bblodfon%2Felixir-task/releases","manifests_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/bblodfon%2Felixir-task/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bblodfon","download_url":"https://codeload.github.com/bblodfon/elixir-task/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bblodfon%2Felixir-task/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33160455,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-17T22:39:12.733Z","status":"ssl_error","status_checked_at":"2026-05-17T22:39:10.741Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["elixir"],"created_at":"2024-10-01T14:43:05.660Z","updated_at":"2026-05-18T00:31:42.958Z","avatar_url":"https://github.com/bblodfon.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# elixir-task\n\n[![](https://github.com/bblodfon/elixir-task/actions/workflows/pytest-ci.yml/badge.svg)](https://github.com/bblodfon/elixir-task/actions)\n\nDevelopment task for Elixir.no position candidates.\n\n## Run\n\nFrom the root of this repo, run the simple test examples as follows:\n```\npython src/task.py test/X.s test/Y.s\npython src/task.py test/X.f test/Y.f\npython src/task.py test/X.f test/Y.s\npython src/task.py test/X.s test/Y.f\n```\n\nFor the larger test files given in the task (not available in this repo due to their large 
size), you can run for example:\n```\npython src/task.py testfile_a.s testfile_b.f\n```\n\n## Things to consider\n\n- Make separate child classes of a main (maybe abstract?) `ElixirTask`, each one implementing a subsequent task/method? (be able to expand to more tasks as separate classes vs now that's more like adding new methods to the same class)\n- Optimize calculation of Pearson's correlation coefficient?\n- Faster reading of FUNCTION files?\n- Best alternative to `warnings.warn()` used?\n\n## Inputs\n\nGiven two types of text files of genomic information, using the **SEGMENT** format and the **FUNCTION** format, as defined below:\n\n### SEGMENT (file suffix: \".s\")\n\n2 columns tab-separated file containing the coordinates of a set of regions (or segments/intervals) located along the reference DNA of some creature.\nThe 1st column is the start coordinate (starting from 0) and the 2nd column is the end coordinate of the region (end-exclusive).\nExample:\n```\n100\t200\n300\t400\n```\n\nIn this task, the regions within one SEGMENT file are not allowed to overlap. i.e.:\n\n```\n100\t200\n150\t250\n```\n\nis not allowed in one file. But\n\n```\n100\t200\n200\t300\n```\n\nis allowed (because of the end-exclusiveness of column 2). \n\nThe SEGMENT files are also always in sorted order.\n\n### FUNCTION (file suffix: \".f\")\n\nOne floating point number per position along the genome (i.e. per genome base pair), starting at position 0. 
E.g.:\n\n```\n25.0\n26.0\n10.0\n11.0\n...\n```\n\n### The \"genome\"\n\nThe \"genome\" in this test is just a line of positions from 0 to 10.000.000 (end exclusive).\nAll FUNCTION files must be fully defined, with one value per position.\nA FUNCTION file thus always has 10 million lines.\nA SEGMENT file, on the other hand, may have varying number of lines.\n\n## Task overview\n\nThe program should take two input files.\nBased on the types of file (SEGMENT or FUNCTION), the program should calculate a value as follows:\n\n- **2 SEGMENT files**: calculate the overlap (in number of positions) of the regions from file X.s with regions from file Y.s.\n\nExample:\n\nfile X.s:\n```\n1\t2\n3\t6\n```\n\nfile Y.s:\n```\n0\t1\n1\t5\n```\n\nhas an overlap of 3 (i.e. for positions 1, 3 and 4).\n\nNote that the example files shown here have a \"genome\" of length 7\n\n- **2 FUNCTION files**: calculate the sample *Pearson correlation coefficient* of the two number lists ([see here](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient#For_a_sample) for the formula)\n\nExample: the two files\n\nX.f\n```\n10.0\n11.0\n12.0\n13.0\n14.0\n15.0\n16.0\n```\n\nY.f\n```\n10.5\n11.5\n12.0\n13.0\n13.5\n15.0\n14.0\n```\n\nHave a Pearson correlation of 0.9452853.\n\n- **1 SEGMENT and 1 FUNCTION file**: The mean of the numbers in the FUNCTION file whose positions are covered by the regions in the SEGMENT file.\nThat is, the regions in the SEGMENT file refer to positions on the genome and hence to the index of the lines in the FUNCTION file.\n\nExample: for files X.s and Y.f, the covered numbers are (11.5, 13.0, 13.5, 15.0), which are on lines with index 1,3,4 and 5, the ones covered by the SEGMENT regions.\nThese numbers have a mean of 13.25.\n\n## Notes on Implementation\n\nThe program should be written in Python, with limited use of external libraries.\nThe goal is to write quick code that still tries to follow good programming practices as regards object-oriented programming, system 
architecture, unit testing, and such.\nThis is to be regarded as pilot code designed in such a way as to support the possible expansion with other file types and analyses in a simple manner.\nThe code does not need to be fully polished, but please comment places where improvements may be made.\n\nPerformance is not an issue; a basic Python implementation is enough.\nHowever, all combinations of the input files provided should be runnable.\nSome thought should also be given to algorithmic performance.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbblodfon%2Felixir-task","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbblodfon%2Felixir-task","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbblodfon%2Felixir-task/lists"}