{"id":23757759,"url":"https://github.com/broadinstitute/pe2loaddata","last_synced_at":"2025-09-05T04:33:06.964Z","repository":{"id":29965667,"uuid":"46865418","full_name":"broadinstitute/pe2loaddata","owner":"broadinstitute","description":"Script to parse a Phenix metadata XML file and generate a .CSV for CellProfiler's loaddata module","archived":false,"fork":false,"pushed_at":"2025-01-03T18:05:01.000Z","size":74502,"stargazers_count":2,"open_issues_count":13,"forks_count":7,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-04-05T22:51:11.535Z","etag":null,"topics":["carpenter-lab"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/broadinstitute.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-11-25T14:09:34.000Z","updated_at":"2024-11-20T22:48:29.000Z","dependencies_parsed_at":"2024-09-14T02:31:50.410Z","dependency_job_id":"879f5917-b051-4d65-b9d0-6996c5a6f456","html_url":"https://github.com/broadinstitute/pe2loaddata","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/broadinstitute/pe2loaddata","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/broadinstitute%2Fpe2loaddata","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/broadinstitute%2Fpe2loaddata/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/broadinstitute%2Fpe2loaddata/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/broadinstitute%2Fpe2loaddata/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/broadinstitute","download_url":"https://codeload.github.com/broadinstitute/pe2loaddata/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/broadinstitute%2Fpe2loaddata/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273713365,"owners_count":25154609,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-05T02:00:09.113Z","response_time":402,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["carpenter-lab"],"created_at":"2024-12-31T19:50:51.178Z","updated_at":"2025-09-05T04:33:01.947Z","avatar_url":"https://github.com/broadinstitute.png","language":"Python","readme":"# pe2loaddata\nScript to parse a Phenix metadata XML file and generate a .CSV for CellProfiler's loaddata module.\nTested on XML files produced by Harmony versions V1, V5, and V7; as far as we know, it supports all versions.\n\nTo install:\n\n```\ngit clone https://github.com/broadinstitute/pe2loaddata.git\ncd pe2loaddata/\npip install -e .\n```\n\nTo run CSV creation based on the XML file:\n\n    pe2loaddata --index-directory \u003cindex-directory\u003e config.yml output.csv\n\nwhere \\\u003cindex-directory\\\u003e is the directory containing the Index.idx.xml or Index.xml file and the images (any image set that is not 
complete will not be written to the CSV), config.yml is the LoadData configuration file and output.csv is the CSV that will be generated.\n\nThe config.yml file lets you name the channels you want to save and lets you pull metadata out of the image. An example:\n\n    channels:\n        HOECHST 33342: OrigDNA\n        Alexa 568: OrigAGP\n        Alexa 647: OrigMito\n        Alexa 488: OrigER\n        488 long: OrigRNA\n    metadata:\n        Row: Row\n        Col: Col\n        FieldID: FieldID\n        PlaneID: PlaneID\n        ChannelID: ChannelID\n        ChannelName: ChannelName\n        ImageResolutionX: ImageResolutionX\n        ImageResolutionY: ImageResolutionY\n        ImageSizeX: ImageSizeX\n        ImageSizeY: ImageSizeY\n        BinningX: BinningX\n        BinningY: BinningY\n        MaxIntensity: MaxIntensity\n        PositionX: PositionX\n        PositionY: PositionY\n        PositionZ: PositionZ\n        AbsPositionZ: AbsPositionZ\n        AbsTime: AbsTime\n        MainExcitationWavelength: MainExcitationWavelength\n        MainEmissionWavelength: MainEmissionWavelength\n        ObjectiveMagnification: ObjectiveMagnification\n        ObjectiveNA: ObjectiveNA\n        ExposureTime: ExposureTime\n\nIn the above example, \"HOECHST 33342\" is the label for the DNA channel and\nif you load the .csv file in LoadData, you will get an image named \"OrigDNA\" in\nyour image set.\n\nThe metadata section selects items out of the image metadata and allows\nyou to rename them as metadata. In addition, pe2loaddata automatically\npopulates the plate, well and site metadata entries.\n\npe2loaddata now supports experiments with multiple planes per field as long as the `PlaneID` field \nhas been set in the config file.\n\npe2loaddata also supports creating CSVs based on index-file and index-directory locations on AWS S3; note that the --search-subdirectories flag is mandatory for running on AWS. 
See `pe2loaddata --help` for more information on tunable parameters.\n\n    pe2loaddata --index-directory s3://cellpainting-gallery/cpg0001-cellpainting-protocol/source_4/images/2020_06_19_Stain2_Batch1/images/BR00113255__2020-06-19T11_13_27-Measurement2/Images/ --index-file s3://cellpainting-gallery/cpg0001-cellpainting-protocol/source_4/images/2020_06_19_Stain2_Batch1/images/BR00113255__2020-06-19T11_13_27-Measurement2/Images/Index.idx.xml config_remote.yml output.csv --search-subdirectories\n\n\n------\n\nTo run CSV creation based on the XML file, AND to append illumination columns (note that this requires \nthe CellProfiler names in your config file to start with `Orig`, which will be replaced by `Illum`)\n\n    pe2loaddata --index-directory \u003cindex-directory\u003e config.yml output.csv --illum --illum-directory \u003cillum-directory\u003e --plate-id \u003cplate-id\u003e --illum-output output_with_illum.csv\n\nwhere \\\u003cillum-directory\\\u003e is the directory where illumination files are (or will be) and \\\u003cplate-id\\\u003e is the plate ID that will be used by CellProfiler in your illumination files' names.\n    \nIf you've already generated `output.csv` and want to only add the illum files to it, you can run with \n\n    pe2loaddata --index-directory \u003cindex-directory\u003e config.yml output.csv --illum-only --illum-directory \u003cillum-directory\u003e --plate-id \u003cplate-id\u003e --illum-output output_with_illum.csv\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbroadinstitute%2Fpe2loaddata","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbroadinstitute%2Fpe2loaddata","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbroadinstitute%2Fpe2loaddata/lists"}