{"id":40545627,"url":"https://github.com/jmvanel/dwca-rdf","last_synced_at":"2026-01-20T23:37:57.196Z","repository":{"id":146299527,"uuid":"288957827","full_name":"jmvanel/dwca-rdf","owner":"jmvanel","description":null,"archived":false,"fork":false,"pushed_at":"2020-12-15T18:56:13.000Z","size":20,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-01-26T10:37:01.591Z","etag":null,"topics":["darwin-core","dwca","jena","rdf"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jmvanel.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2020-08-20T09:04:58.000Z","updated_at":"2020-12-15T18:56:15.000Z","dependencies_parsed_at":null,"dependency_job_id":"a38db266-5aae-4962-a723-5dfa3c097ba9","html_url":"https://github.com/jmvanel/dwca-rdf","commit_stats":{"total_commits":13,"total_committers":1,"mean_commits":13.0,"dds":0.0,"last_synced_commit":"c192eb932250c1ff1d82cfa73d8b5b81ba505858"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jmvanel/dwca-rdf","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmvanel%2Fdwca-rdf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmvanel%2Fdwca-rdf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmvanel%2Fdwca-rdf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmvanel%2Fdwca-rdf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jmvanel","download_url":"https:
//codeload.github.com/jmvanel/dwca-rdf/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmvanel%2Fdwca-rdf/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28618803,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-20T22:24:05.405Z","status":"ssl_error","status_checked_at":"2026-01-20T22:20:31.342Z","response_time":117,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["darwin-core","dwca","jena","rdf"],"created_at":"2026-01-20T23:37:57.140Z","updated_at":"2026-01-20T23:37:57.191Z","avatar_url":"https://github.com/jmvanel.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Convert a Darwin Core Archive into Darwin Core RDF\nThis is a work in progress.\n\nIt builds on the existing GBIF Java library\nhttps://github.com/gbif/dwca-io\nand Apache Jena.\n\nI tried it on this Darwin Core Archive, which contains plants in my département:\nhttps://www.gbif.org/dataset/85a97a5f-751a-49ca-ad7d-238a80c0d30c\n\nTESTED in Scala REPL with\n```\nrunMain jmvanel.DWCA2RDF /home/jmv/data/Biologie/GBIF.org/Flore_Ain_0039246-200613084148143.zip\n```\n\nSee also the JSON-LD project for the GBIF.org API:\nhttps://github.com/jmvanel/rdf-convert/tree/master/gbif.org\n\n## Current result sample\n\n```turtle\n# graph size 34489\n# person Map size 
28\n\u003chttps://api.gbif.org/v1/occurrence/2487147795\u003e \u003chttp://rs.tdwg.org/dwc/iri/recordedBy\u003e \u003chttps://api.gbif.org/v1/person/A_Cl__BOLOMIER_-\u003e .\n\u003chttps://api.gbif.org/v1/occurrence/2487147795\u003e \u003curn:taxonKey\u003e \u003chttps://api.gbif.org/v1/species/5366215\u003e .\n\u003chttps://api.gbif.org/v1/occurrence/2487147795\u003e \u003chttp://rs.tdwg.org/dwc/terms/scientificName\u003e \"Potentilla crantzii (Crantz) Fritsch\" .\n\u003chttps://api.gbif.org/v1/occurrence/2487147795\u003e \u003chttp://rs.tdwg.org/dwc/terms/decimalLatitude\u003e \"46.15952\" .\n\u003chttps://api.gbif.org/v1/occurrence/2487147795\u003e \u003chttp://rs.tdwg.org/dwc/terms/decimalLongitude\u003e \"5.39777\" .\n\u003chttps://api.gbif.org/v1/occurrence/2487147795\u003e \u003chttp://rs.tdwg.org/dwc/iri/toTaxon\u003e \u003chttp://taxref.mnhn.fr/lod/taxon/139270/12.0\u003e .\n\u003chttps://api.gbif.org/v1/occurrence/2487147795\u003e \u003chttp://www.w3.org/1999/02/22-rdf-syntax-ns#type\u003e \u003chttp://rs.tdwg.org/dwc/terms/Occurrence\u003e .\n\u003chttps://api.gbif.org/v1/occurrence/2487147795\u003e \u003chttp://www.w3.org/1999/02/22-rdf-syntax-ns#type\u003e \u003chttp://rs.tdwg.org/dwc/terms/HumanObservation\u003e .\n\n\u003chttps://api.gbif.org/v1/person/A_Cl__BOLOMIER_-\u003e \u003chttp://xmlns.com/foaf/0.1/name\u003e \"A.Cl. 
BOLOMIER -\" .\n\u003chttps://api.gbif.org/v1/person/A_Cl__BOLOMIER_-\u003e \u003chttp://www.w3.org/1999/02/22-rdf-syntax-ns#type\u003e \u003chttp://xmlns.com/foaf/0.1/Person\u003e .\n```\n\nNOTES\n- I expect that soon :) the GBIF API URLs will be dereferenceable RDF URIs; the GBIF API does not currently serve RDF, but it could be made RDF later with JSON-LD: https://api.gbif.org/v1/occurrence/\n- the same goes for persons, but here I don't know what the API for persons is, if any; I just used the prefix https://api.gbif.org/v1/person/ for now\n- detecting whether a \"recordedBy\" value is a person or an organization is not trivial; I have made no attempt yet ...\n- taxonKey is very important: it is the global GBIF ID for the taxon; a permanent dereferenceable URI has to be defined\n- coordinates should be xsd:float values; same predicates as geo:lat, geo:long\n- as far as I know, the collectors do not have a GBIF ID\n- the \"modified\" key should be used\n- the \"identifier\" key should be used; is there an API for this?\n- the \"eventID\" key should be used; is there an API for this?\n- \"nameAccordingTo\": \"TAXREF v12\" was taken for granted; it should be processed; the case of other taxon registries remains to be studied\n- applies a flat RDF structure (except for persons); it's not fully compliant with DSW ... 
but simple\n\nDONE\n- given \"basisOfRecord\": \"HUMAN_OBSERVATION\", the class dwc:HumanObservation should also be assigned\n\nHere is the GBIF API result for this (observation) occurrence\n```json\nwget -O - https://api.gbif.org/v1/occurrence/2487147795 |jq .\n{\n  \"key\": 2487147795,\n  \"datasetKey\": \"85a97a5f-751a-49ca-ad7d-238a80c0d30c\",\n  \"publishingOrgKey\": \"1928bdf0-f5d2-11dc-8c12-b8a03c50a862\",\n  \"installationKey\": \"07ea29ef-e386-4278-ae0f-095778a1b061\",\n  \"publishingCountry\": \"FR\",\n  \"protocol\": \"DWC_ARCHIVE\",\n  \"lastCrawled\": \"2020-06-03T01:11:30.173+0000\",\n  \"lastParsed\": \"2020-06-08T16:39:04.529+0000\",\n  \"crawlId\": 1,\n  \"extensions\": {},\n  \"basisOfRecord\": \"HUMAN_OBSERVATION\",\n  \"taxonKey\": 5366215,\n  \"kingdomKey\": 6,\n  \"phylumKey\": 7707728,\n  \"classKey\": 220,\n  \"orderKey\": 691,\n  \"familyKey\": 5015,\n  \"genusKey\": 8079058,\n  \"speciesKey\": 8285546,\n  \"acceptedTaxonKey\": 8370002,\n  \"scientificName\": \"Potentilla crantzii (Crantz) Fritsch\",\n  \"acceptedScientificName\": \"Potentilla crantzii subsp. 
crantzii\",\n  \"kingdom\": \"Plantae\",\n  \"phylum\": \"Tracheophyta\",\n  \"order\": \"Rosales\",\n  \"family\": \"Rosaceae\",\n  \"genus\": \"Potentilla\",\n  \"species\": \"Potentilla crantzii\",\n  \"genericName\": \"Potentilla\",\n  \"specificEpithet\": \"crantzii\",\n  \"taxonRank\": \"SPECIES\",\n  \"taxonomicStatus\": \"SYNONYM\",\n  \"decimalLongitude\": 5.39777,\n  \"decimalLatitude\": 46.15952,\n  \"coordinateUncertaintyInMeters\": 5000,\n  \"issues\": [\n    \"GEODETIC_DATUM_ASSUMED_WGS84\",\n    \"RECORDED_DATE_INVALID\"\n  ],\n  \"modified\": \"2019-02-28T00:00:00.000+0000\",\n  \"lastInterpreted\": \"2020-06-08T16:39:04.529+0000\",\n  \"license\": \"http://creativecommons.org/licenses/by-nc/4.0/legalcode\",\n  \"identifiers\": [],\n  \"media\": [],\n  \"facts\": [],\n  \"relations\": [],\n  \"geodeticDatum\": \"WGS84\",\n  \"class\": \"Magnoliopsida\",\n  \"countryCode\": \"FR\",\n  \"recordedByIDs\": [],\n  \"identifiedByIDs\": [],\n  \"country\": \"France\",\n  \"identifier\": \"82e3ca10-dcbf-24f6-e053-2614a8c008ee\",\n  \"eventID\": \"82e3ca10-dcbf-24f6-e053-2614a8c008ee\",\n  \"dataGeneralizations\": \"Géographie transmise soumise à floutage (grille avec mailles de 10x10km) pour le grand public » en conformité avec les règles de diffusion du SINP | Geographic information generalized during aggregation (grid with 10x10km cells) for the general public, according to SINP communication rules\",\n  \"county\": \"01\",\n  \"identificationVerificationStatus\": \"Control could not be conclusive due to insufficient knowledge\",\n  \"gbifID\": \"2487147795\",\n  \"occurrenceID\": \"82e3ca10-dcbf-24f6-e053-2614a8c008ee\",\n  \"taxonID\": \"139270\",\n  \"occurrenceStatus\": \"Présent\",\n  \"recordedBy\": \"A.Cl. 
BOLOMIER - (Non renseigné)\",\n  \"locationRemarks\": \"Data isn’t the original geo referenced one, but attached to the nearest 10x10 km grid cell\",\n  \"institutionCode\": \"Non renseigné\",\n  \"originalNameUsage\": \"Potentilla verna\",\n  \"datasetID\": \"4A9DDA1F-B72E-3E13-E053-2614A8C02B7C\",\n  \"nameAccordingTo\": \"TAXREF v12\",\n  \"identifiedBy\": \"Non renseigné (Non renseigné)\"\n}\n```\n\n**Links**\n- https://dwc.tdwg.org/rdf/\n- https://dwc.tdwg.org/terms/\n- http://baskauf.blogspot.com/2019/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmvanel%2Fdwca-rdf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjmvanel%2Fdwca-rdf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmvanel%2Fdwca-rdf/lists"}