{"id":32564021,"url":"https://github.com/euroargodev/dmqc_status_and_statistics","last_synced_at":"2025-10-29T03:54:14.306Z","repository":{"id":52785373,"uuid":"343831453","full_name":"euroargodev/DMQC_status_and_statistics","owner":"euroargodev","description":"Figures for DMQC statistics for a given list of floats","archived":false,"fork":false,"pushed_at":"2025-10-17T06:43:35.000Z","size":11583,"stargazers_count":6,"open_issues_count":1,"forks_count":2,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-10-18T09:45:55.719Z","etag":null,"topics":["argo","argo-floats","dmqc","quality-control"],"latest_commit_sha":null,"homepage":"","language":"MATLAB","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/euroargodev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-03-02T16:03:04.000Z","updated_at":"2025-10-17T08:23:10.000Z","dependencies_parsed_at":"2025-03-04T17:26:56.111Z","dependency_job_id":"9a14482c-7e0a-4343-b3ea-ac1ceb010ed2","html_url":"https://github.com/euroargodev/DMQC_status_and_statistics","commit_stats":{"total_commits":106,"total_committers":3,"mean_commits":"35.333333333333336","dds":"0.23584905660377353","last_synced_commit":"0d73e4e41b1de9ac88ee2023ce14874d2ceadc5b"},"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/euroargodev/DMQC_status_and_statistics","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euroargodev%2FDMQC_status_and_
statistics","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euroargodev%2FDMQC_status_and_statistics/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euroargodev%2FDMQC_status_and_statistics/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euroargodev%2FDMQC_status_and_statistics/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/euroargodev","download_url":"https://codeload.github.com/euroargodev/DMQC_status_and_statistics/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/euroargodev%2FDMQC_status_and_statistics/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281556916,"owners_count":26521571,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-29T02:00:06.901Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["argo","argo-floats","dmqc","quality-control"],"created_at":"2025-10-29T03:54:13.272Z","updated_at":"2025-10-29T03:54:14.300Z","avatar_url":"https://github.com/euroargodev.png","language":"MATLAB","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DMQC_status_and_statistics\n\n## A. 
Script **get_DMQC_stats.m** \u003cbr /\u003e\n\nThis script computes DMQC statistics for a given list of floats \u003cbr /\u003e\n\u003cbr /\u003e\n**INPUTS**\n - **argo_profile_detailled_index.txt** and \u003cbr /\u003e\n   **argo_synthetic-profile_detailled_index.txt** if i_bgc=1\n - **argo_sensor_exclusion_list.txt**\n - **floats list**: csv file, separated by \";\", with 4 fields:\u003cbr /\u003e\n    WMO; COUNTRY; LAUNCH_DATE; PROGRAM\u003cbr /\u003e\n        COUNTRY is the country in charge of the delayed mode processing (from OceanOPS)\u003cbr /\u003e\n        LAUNCH_DATE must be in the format \"YYYY/MM/DD\".\u003cbr /\u003e\n   e.g.:\u003cbr /\u003e\n   WMO;COUNTRY;LAUNCH_DATE;PROGRAM\u003cbr /\u003e\n   3901496;United Kingdom;2014/10/20;Argo UK Bio\u003cbr /\u003e\n   3901497;United Kingdom;2014/10/24;Argo UK Bio\u003cbr /\u003e\n   These 4 fields can be extracted from the OceanOPS website or directly downloaded from \u003cbr /\u003e\n   https://www.ocean-ops.org/share/Argo/Status/\u003cbr /\u003e\n   In the argo_all.csv from OceanOPS, the corresponding columns are:\u003cbr /\u003e\n         REF (-\u003e WMO), COUNTRY, DEPL_DATE (-\u003e LAUNCH_DATE) and PROGRAM \n - **country_code.csv**: csv file, separated by \";\", with 2 fields:\u003cbr /\u003e\n    COUNTRY; COUNTRY_CODE\u003cbr /\u003e\n       COUNTRY should follow the OceanOPS conventions\u003cbr /\u003e\n       COUNTRY_CODE is a 3-letter code that will be used on graphs and outputs.\u003cbr /\u003e\n    The default template country_code_template.csv from the script directory can be used.\u003cbr /\u003e\n    This default contains all the countries associated with Argo in OceanOPS with the\u003cbr /\u003e\n    corresponding official NATO (https://en.wikipedia.org/wiki/List_of_NATO_country_codes) \u003cbr /\u003e\n    3-letter codes (list of relevant countries extracted on 2023/07/07 from the OceanOPS database)\u003cbr /\u003e\n\u003cbr /\u003e\n\n **CONFIGURATION PARAMETERS**\n Since the 
V3.5 release, the configuration has been externalised in the **get_DMQC_status_config.txt** file. Besides the paths \n for the input files mentioned above, it lets you configure the following parameters:\n - **i_descending_profile**: 1 means descending profiles are considered, \u003cbr /\u003e\n                         0 means descending profiles are not considered.\n - **profile_age_method** = 'date' or 'days': choose the method to filter \"old\" profiles\n - **profile_age_min_days**: \"old profiles\" threshold (in days) for float and observation statistics.\n - **profile_age_max_date**: \"old profiles\" threshold (as a date, format yyyy/mm/dd) for float and observation statistics.\n - **i_bgc**: 1 means BGC profiles/parameters are considered (the detailed argo synthetic index will be \u003cbr /\u003e\n            read for BGC parameters, the detailed argo index will be read for CTD)\u003cbr /\u003e\n          0 means core information (CTD) is analysed from the detailed argo index.\n - **input_list_of_parameters_to_treat** is the list of core parameters to analyse\n - **input_list_of_BGC_parameters_to_treat** is the list of BGC parameters to analyse\n - **print_svg**: 1 means figures will be saved in .svg format as well\u003cbr /\u003e\n (useful for high-quality output, but slightly slower to save).\n - **output_graphs_per_float**: flag to indicate if graphs per float should be\u003cbr /\u003e\n   recorded. (Graphs will be recorded in batches of at most 40 floats. 
When processing a large\u003cbr /\u003e\n   number of floats, this may not be relevant)\n - **n_max_float_per_graph**: maximum number of floats per graph, used with output_graphs_per_float.\n - **i_group_AB_profQC**: 1 means that on QC-related plots profile QC A and profile QC B will be grouped\n\u003cbr /\u003e\n\n**OUTPUTS**\n - **Figures**   saved in folder output_files_yyyy-mm-dd_hhmmss/Plots \n - **Analyses**  saved in folder output_files_yyyy-mm-dd_hhmmss/Syntheses \n - **Copy of input files** saved in folder output_files_yyyy-mm-dd_hhmmss\n\n **Auxiliary functions needed**\n  - read_csv\n  - get_data_from_index \n  - plotBarStackGroups\n  - grep (Matlab grep equivalent function)\n  - load_configuration.m\n\n **WARNING 1** : Profile_QC for PRES information is not yet available in the Argo detailed index. \u003cbr /\u003e\n It is filled with qc=\"X\" in the script for the moment, and plots related to PRES profile\u003cbr /\u003e\n QC are skipped.\u003cbr /\u003e\n\n **Nota Bene** : Profile_QC X means the profile contained no data.\n\n **Author**: Euro-Argo ERIC (contact@euro-argo.eu)\u003cbr /\u003e\n\n **Version**: 3.5 (2025/10/17)\u003cbr /\u003e\n\n **History**:\u003cbr /\u003e\n - V1.0 : This script was originally created by Andrea Garcia Juan and Romain\u003cbr /\u003e\n        Cancouët, and updated by Luca Arduini Plaisant.\n - V2.0 (2023/06/19): \n   - The script architecture was reviewed on 2023/06/19 by Delphine Dobler \u003cbr /\u003e\n     to include the processing of BGC floats and to improve performance.\n - V2.01 (2023/07/07): \n   - adding one test for the existence of found indices\n   - only keep BGC parameters that were found in the synthetic index file\n   - output an additional file with this information\n - V2.1  (2023/07/07): \n   - fetch DAC information from the index\n   - record D-profile last update date in output\n   - associate country_codes from a config file\n   - add -f option to final zip to allow overwrite if script is\u003cbr /\u003e\n        launched 
several times on the same day\n - V3.0  (2023/07/10):\n   - merge CTD and BGC processings (no need for the user to know\u003cbr /\u003e\n          which WMO is BGC, which one is not)\n - V3.1  (2023/07/12):\n   - add plot with the number of R/A/D profiles per variable\n   - add plots with information by cycle and by WMO:\n     - R/A/D status\n     - profile QC status\n     - PSAL_adjustement\u003cbr /\u003e\n     =\u003e These plots replace the old get_DMQC_adjustment.m script\n   - stop duplicating graphs for CTD mode\n   - skip plots with PRES profile_QC (information not yet available in index)\n - V3.2 (2023/07/21) :\n   - correct bug at the final zipping step when file does not exist\u003cbr /\u003e\n     (in case i_bgc = 0).\n   - add case All Argo Fleet and manage optimization section for large number of floats.\n   - dealing with float_wmo type for cases when not all WMOs are coded on 7 characters.\n   - dealing with graph layout when n_countries is large\n   - special workaround for float 4900566 that used QC 1 instead of QC A for profile QC.\n   - add quotes in synthese output for program, in case comma is used.\n - V3.3 (2023/10/02) :\n   - add hhmmss in the output directory name\n   - change search for param name in index for a more robust means\n   - add x grid and minor grid for psal adjustment display by wmo.\n   - add an option to group prof QC A and B\n - V3.4 (2024/02/23) :\n   - remove CTD from plot 09 when i_bgc=1.\n   - 1-yr static string was replaced by the config value\n   - add a new graph per profile year with DMQC and profile QC F + \n     output values in a text file\n   - add a log (diary)\n - V3.5 (2025/10/17) :\n   - modify psal adj plot to separate RA mode not QC-F from RA mode QC-F profiles.\n   - correct cycles and descending profiles processing in get_data_from_index.m routine\n   - replace \"grey list\" by \"exclusion list\"\n   - correct issue with print('-dpng') for more recent Matlab versions, which do not support a space.\n   - 
externalise the configuration in a conf file\n   - choose the method to select old profiles/floats: either by profile age in days or by the profile date\n\n\n## B. Graphical outputs for **get_DMQC_stats.m** \nDifferent outputs are produced: graphical and textual. Below are examples of graphical outputs obtained for floats from the MOCCA project.\n\n- __Plot 01 and 02__: State of the DMQC per country, in number of floats and/or profiles (one plot per parameter) \u003cbr /\u003e\n\nThese plots present the number of floats (resp. profiles) by country:\ntotal number, number DMQCed at least once, number older than {sage} year, and number both DMQCed at least once and older than {sage} year. By default, sage = 1 year.\nHere, country refers to the country associated with the float in the OceanOPS database. It is coded using the NATO three-letter code. The correspondence used can be found in the file country_code_{yyy-mm-dd}.txt. N.B.: For the European case and the European BGC case, the countries outside Europe whose floats are decoded at Coriolis were gathered under the 'EXT' code.\n\n\u003cp float=\"left\"\u003e\n\u003cimg \nsrc=\"OUTPUT_examples/MOCCA_case/Plots/01_MOCCA_Fleet_CTD_DMQC_status_nb_floats_by_country_20230713.png\" width=\"400\" /\u003e \n\u003cimg src=\"OUTPUT_examples/MOCCA_case/Plots/02_MOCCA_Fleet_CTD_DMQC_status_nb_profiles_by_country_20230713.png\" width=\"400\" /\u003e\n\u003c/p\u003e\n\n- __Plot 03__: Profile quality (all profiles and only Delayed Mode - i.e. consolidated - profiles) in number of float profiles (one plot per parameter) \u003cbr /\u003e\n\nThis plot presents the number of profiles with respect to the profile_QC code, both for all processing modes and for D-mode only. 
\nProfile QC codes are defined in Reference table 2a of the Argo QC manual (https://archimer.ifremer.fr/doc/00228/33951/32470.pdf) and are recalled below:\nGOOD data = QC flag values of 1, 2, 5 or 8\nBAD data = QC flag values of 3 or 4\n- profile QC A: 100% GOOD data (All the measurement points of the profile are GOOD data);\n- profile QC B: 75% to 100% GOOD data;\n- profile QC C: 50% to 75% GOOD data;\n- profile QC D: 25% to 50% GOOD data;\n- profile QC E:  0% to 25% GOOD data;\n- profile QC F: no GOOD data\n\nThe exact number and relative percentages are also indicated above the bars. \nThe relative percentages for the profiles processed in delayed mode provide a consolidated view.\nA few profiles do not have a profile QC in the index file. This observation deserves further analysis.\nHere is an example for the batch of floats declared as ASD:\n\n\n\u003cp float=\"center\"\u003e\n\u003cimg \nsrc=\"OUTPUT_examples/ASD_case/Plots/03_ASD_Fleet_PSAL_profile_QC_20231002.svg\" width=\"400\" /\u003e \n \u003cimg \nsrc=\"OUTPUT_examples/ASD_case/Plots/03_ASD_Fleet_TEMP_profile_QC_20231002.svg\" width=\"400\" /\u003e \n\u003c/p\u003e\n\n- __Plot 04 and 05__ Profile quality evolution (all profiles and only Delayed Mode - i.e. consolidated - profiles) in percentage of float profiles (one plot per parameter) \u003cbr /\u003e\n\nThe upper panel of this plot is a time evolution view of plot 03 for all profiles (plot 04) and D-profiles only (plot 05), with respect to the float launch year (a quick proxy for sensor generation). To indicate the significance of the statistics, the number of profiles for the corresponding year is also provided on the lower panel. 
Here is an example for the batch of floats declared as ASD:\n\n\u003cp float=\"center\"\u003e\n\u003cimg \nsrc=\"OUTPUT_examples/ASD_case/Plots/04_ASD_Fleet_PSAL_profile_QC_evolution_20231002.svg\" width=\"400\" /\u003e \n \u003cimg \nsrc=\"OUTPUT_examples/ASD_case/Plots/05_ASD_Fleet_PSAL_Dprofile_QC_evolution_20231002.svg\" width=\"400\" /\u003e \n\u003c/p\u003e\n\n- __Plot 06__ Global DMQC status and grey list information (one plot per parameter) \u003cbr /\u003e\n\nThe first four coloured bars provide the same information as Plot 01 but summed for all countries. \nThe three bars on the right side provide information about grey listing (QC3 or QC4):\n   - number of active floats (the legend also gives their share of all active floats)\n   - number of inactive floats that still have profiles not processed in delayed mode\n   - number of inactive floats that have been fully processed in delayed mode.\nHere, active floats are floats that emitted a profile within the last 30 days (with respect to the index file update date). 
\nThis limit is arbitrary and can be tuned.\n\n\n\u003cp float=\"center\"\u003e\n\u003cimg \nsrc=\"OUTPUT_examples/MOCCA_case/Plots/06_MOCCA_Fleet_PSAL_DMQC_status_and_grey_list_20230713.png\" width=\"400\" /\u003e \n\u003c/p\u003e\n\n- __Plot 07 and 09__ DMQC status per profile year, and age histogram of non-DMQC profiles (one plot per parameter) \u003cbr /\u003e\n\nPlot 07 presents the time evolution of the percentage of profiles processed in delayed mode with respect to the profile date.\nPlot 09 presents the age histogram of profiles with no DMQC performed yet.\n\n\u003cp float=\"center\"\u003e\n\u003cimg \nsrc=\"OUTPUT_examples/MOCCA_case/Plots/07_MOCCA_Fleet_CTD_prof_DMQCstatus_byyear_20230713.png\" width=\"400\" /\u003e \n \u003cimg \nsrc=\"OUTPUT_examples/MOCCA_case/Plots/08_MOCCA_Fleet_CTD_prof_DMQCstatus_agehist_20230713.png\" width=\"400\" /\u003e \n\u003c/p\u003e\n\n- __Plot 08 (new v3.4)__ DMQC and profile -F status per profile year\u003cbr /\u003e\n\u003cp float=\"center\"\u003e\n\u003cimg \nsrc=\"OUTPUT_examples/MOCCA_case/Plots/08_MOCCA_Fleet_PSAL_prof_DMQC-and-F_status_byyear_20240731.png\" width=\"400\" /\u003e \n\u003c/p\u003e\n\n- __Plot 10__ R-A-D status for all parameters \u003cbr /\u003e\nPlot 10 presents by parameter (x axis), the number of R-profiles, A-profiles and D-profiles.\n\n\u003cp float=\"center\"\u003e\n\u003cimg \nsrc=\"OUTPUT_examples/MOCCA_case/Plots/09_MOCCA_Fleet_prof_RAD_mode_per_param_20230713.png\" width=\"400\" /\u003e \n\u003c/p\u003e\n\n- __Plot 11 and 12__ DMQC and quality profile status by batch of WMOs per cycle (one plot per parameter) \u003cbr /\u003e\nThese plots show the DMQC (plot 11) and quality profile status (plot 12) by batch of WMOs per cycle (one plot per parameter).\nThese plots are output only on demand. 
The number of WMOs shown per graph can be tuned.\n\n\u003cp float=\"center\"\u003e\n \u003cimg \nsrc=\"OUTPUT_examples/MOCCA_case/Plots/10_MOCCA_Fleet_CTD_RAD_mode_per_wmo_per_cycle_001_20230713.png\" width=\"400\" /\u003e \n \u003cimg \nsrc=\"OUTPUT_examples/MOCCA_case/Plots/11_MOCCA_Fleet_PSAL_profile_QC_per_wmo_per_cycle_001_20230713.png\" width=\"400\" /\u003e \n\u003c/p\u003e\n\n- __Plot 13__ PSAL adjustment by batch of WMOs per cycle  \u003cbr /\u003e\nThis plot shows PSAL_adjustment by batch of WMOs per cycle.\nThis plot is output only on demand. The number of WMOs shown per graph can be tuned.\nIn grey: the profiles that are not yet processed in delayed mode and that are not profile QC F.\nIn black: the profiles (either real-time or delayed mode) that are QC-F.\nIn the jet colour scale: the value of the PSAL adjustment for D-profiles, bounded by [-0.07 0.07]. The same bounds are used for all plots so that they are easier to compare and so that the \"no adjustment\" case always appears in green.\n\n/!\\ WARNING 2: There is an issue with the Argo detailed index: for a few {floats,cycles}, PSAL_adjustment is not computed in the detailed index (https://gitlab.ifremer.fr/coriolis/actions/actions-argo/-/issues/63).\n\n\u003cp float=\"center\"\u003e\n \u003cimg \nsrc=\"OUTPUT_examples/ASD_case/Plots/12_ASD_Fleet_PSAL_PSAL_adj_per_wmo_per_cycle_001_20231002.svg\" width=\"500\" /\u003e \n\u003c/p\u003e\n\n## C. Syntheses outputs for **get_DMQC_stats.m** \n\nThere are 5 kinds of syntheses produced by the script:\n\n-  Additional_Info_{yyyymmdd}.txt\n This file indicates which WMOs, if any, were not found in the Argo detailed index, and in the Argo detailed synthetic index if i_bgc = 1.\nIf i_bgc=1, it also indicates which BGC parameters, if any, were not found in the Argo detailed synthetic index for the input list of WMOs. 
Conversely, it also indicates which BGC parameters were found but not requested.\n\n-  DMQC_status_per_country_for_{param}_{yyyymmdd}.txt\n This file contains the numerical data behind plots 01 and 02.\n\n-  DMQC_status_per_wmo_for_{param}_{yyyymmdd}.txt is a synthesis by WMO. This output can be used to define priorities and the list of floats to be treated in DMQC with respect to some given criteria. It contains a fairly large number of indicators, and more can be added if needed. The associated parameter is recalled in the first column to make sure that the information is understood as being relative to this parameter only. The column names are mostly self-explanatory, but a few deserve more detail:\n   - The nb_prof_QC_X and nb_prof_DM_QC_X refer to profiles with no value for profile QC (see Plot 03 comment above).\n   - DM_done column refers to the fact that this WMO has been seen in delayed mode at least once. To get the delayed mode completeness, refer to percentage_DM_prof column.\n   - greylist means that the float was put on greylist with QC3 or QC4 for the given parameter.\n \n- DMQC_status_per_profile_year_for_{param}_{yyyymmdd}.txt (NEW V3.4) contains the numerical data behind the new plot 08.\n\n- The DMQC_status_warnings_per_wmo_for_{param}_{yyyymmdd}.txt lists conditions that deserve attention, such as:\n  - there is at least one profile_QC set to F (prof_QC=F_once)\n  - there is at least one profile_QC with no value (prof_QC_not_filled_once)\n  - the DMQC was never performed whereas the float is older than 1 year (No_DMQC_older_1yr) (Nota bene: this criterion may be a little too restrictive ...)\n  - the float is on the grey list.   \n\n## D. 
TOOLBOX\nName \u0026 Description of the auxiliary functions:\n- read_csv: reads a csv file and generates a struct with the file variables\n- get_data_from_index: gets chosen variables from the index file for a given list of floats\n- plotBarStackGroups: draws a bar plot with grouped stacked bars at each graph tick.\n- grep : Matlab equivalent of the unix grep command (less performant).\n- load_configuration: reads the configuration file\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feuroargodev%2Fdmqc_status_and_statistics","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feuroargodev%2Fdmqc_status_and_statistics","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feuroargodev%2Fdmqc_status_and_statistics/lists"}