{"id":14976743,"url":"https://github.com/gyanz/pydsstools","last_synced_at":"2025-04-09T14:15:54.813Z","repository":{"id":46281345,"uuid":"118285406","full_name":"gyanz/pydsstools","owner":"gyanz","description":"Python library for simple HEC-DSS functions","archived":false,"fork":false,"pushed_at":"2024-08-07T16:33:25.000Z","size":194419,"stargazers_count":83,"open_issues_count":15,"forks_count":36,"subscribers_count":18,"default_branch":"master","last_synced_at":"2024-10-30T09:09:57.534Z","etag":null,"topics":["ctypes","cython","database","engineering","hec-dss","linux","python","windows10"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gyanz.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.MD","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-01-20T22:20:27.000Z","updated_at":"2024-10-14T02:52:43.000Z","dependencies_parsed_at":"2024-11-13T04:13:06.272Z","dependency_job_id":null,"html_url":"https://github.com/gyanz/pydsstools","commit_stats":{"total_commits":135,"total_committers":9,"mean_commits":15.0,"dds":"0.28148148148148144","last_synced_commit":"206cd7b505d547497323268f1cd000687646b2e3"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gyanz%2Fpydsstools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gyanz%2Fpydsstools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gyanz%2Fpydsstools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/repositories/gyanz%2Fpydsstools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gyanz","download_url":"https://codeload.github.com/gyanz/pydsstools/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248054194,"owners_count":21039952,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ctypes","cython","database","engineering","hec-dss","linux","python","windows10"],"created_at":"2024-09-24T13:54:21.032Z","updated_at":"2025-04-09T14:15:54.792Z","avatar_url":"https://github.com/gyanz.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"About pydsstools\n===\n\npydsstools is an experimental Cython-based Python library for manipulating [HEC-DSS](http://www.hec.usace.army.mil/software/hec-dssvue/) database files. It supports regular and irregular time-series, paired data series, and spatial grid records. It is compatible with 64-bit Python on Windows 10 and Ubuntu-like Linux distributions. For the latter, the zlib, math, quadmath, and gfortran libraries must be installed. The [dssvue](https://github.com/gyanz/dssvue) Python library provides a graphical user interface for HEC-DSS. There is also a Rust binding, [hecdss](https://github.com/gyanz/hecdss-rs), for HEC-DSS.    \n\nAbout HEC-DSS \u003csup\u003e[1]\u003c/sup\u003e\n===\n\nHEC-DSS is designed to be optimal for storing and retrieving large sets, or series, of data. 
HEC-DSS is not a relational database, but a database that is designed to retrieve and store large amounts of data quickly that are not necessarily interlinked to other sets of data, like relational databases are. Additionally, HEC-DSS provides a flexible set of utility programs and is easy to add to a user's application program. These are the features that distinguish HEC-DSS from most commercial relational database programs and make it optimal for scientific applications.\n\nHEC-DSS uses a block of sequential data as the basic unit of storage. Each block contains a series of values of a single variable over a time span appropriate for most applications. The basic concept underlying HEC-DSS is the organization of data into records of continuous, applications-related elements as opposed to individually addressable data items. This approach is more efficient for scientific applications than a relational database system because it avoids the processing and storage overhead required to assemble an equivalent record from a relational database.\n\nData is stored in blocks, or records, within a file and each record is identified by a unique name called a \"pathname.\" Each time data is stored or retrieved from the file, its pathname is used to access its data. Information about the record (e.g., units) is stored in a \"header array.\" This includes the name of the program writing the data, the number of times the data has been written to, and the last written date and time. HEC-DSS documents stored data completely via information contained in the pathname and stored in the header so no additional information is required to identify it. One data set is not directly related to another so there is no need to update other areas of the database when a new data set is stored. 
The self-documenting nature of the database allows information to be recognized and understood months or years after it was stored.\n\nBecause of the self-documenting nature of the pathname and the conventions adopted, there is no need for a data dictionary or data definition file as required with other database systems. In fact, there are no database creation tasks or any database setup. Both HEC-DSS utility programs and applications that use HEC-DSS will generate and configure HEC-DSS database files automatically. There is also no pre-allocation of space; the software automatically expands the file size as needed.\n\nHEC-DSS references data sets, or records, by their pathnames. A pathname may consist of up to 391 characters and is, by convention, separated into six parts, which may be up to 64 characters each. Each part is delimited by a slash \"/\", and is labeled \"A\" through \"F\", as follows: /A/B/C/D/E/F/.\n\nA list of the pathnames in a DSS file is called a \"catalog.\" In version 6, the catalog was a separate file; in version 7, the catalog is constructed directly from pathnames in the file.\n\nMulti-user access mode is handled automatically by HEC-DSS. The user does not need to do anything to turn it on. Multi-user access allows multiple users or multiple processes to read and write to the same HEC-DSS file at the same time. This is true for a network drive as well as a local drive. You can have a shared network HEC-DSS file that has several processes reading and writing to it at the same time. The only drawback is that file access may be slower, depending on the operating system.\n\n 1. USACE, Hydrologic Engineering Center (July, 2019). 
HEC Data Storage System Guide (Draft).\n\nChanges\n===\n\n[**changelog**][changelog]\n\n   [changelog]: https://github.com/gyanz/pydsstools/blob/master/CHANGES.MD\n\nUsage\n===\n\nA sample DSS file is available in the examples folder.\n\n### Example 1\nWrite regular time-series data to example.dss\n\nNotes:\n     The interval must be [any] integer greater than 0 for regular time-series.\n     The actual time-series interval is implied from the E-Part of the pathname.\n     The values attribute can be a list, array, or NumPy array.\n\n```\nfrom datetime import datetime\nfrom pydsstools.heclib.dss import HecDss\nfrom pydsstools.core import TimeSeriesContainer,UNDEFINED\n\ndss_file = \"example.dss\"\npathname = \"/REGULAR/TIMESERIES/FLOW//1HOUR/Ex1/\"\ntsc = TimeSeriesContainer()\ntsc.pathname = pathname\ntsc.startDateTime = \"15JUL2019 19:00:00\"\ntsc.numberValues = 7\ntsc.units = \"cfs\"\ntsc.type = \"INST\"\ntsc.interval = 1\ntsc.values = [100,UNDEFINED,500,5000,10000,24.1,25]\n\nfid = HecDss.Open(dss_file)\nfid.deletePathname(tsc.pathname)\nfid.put_ts(tsc)\nts = fid.read_ts(pathname)\nfid.close()\n```\n\n### Example 2\nRead and plot regular time-series\n```\nfrom pydsstools.heclib.dss import HecDss\nimport matplotlib.pyplot as plt\nimport numpy as np\n\ndss_file = \"example.dss\"\npathname = \"/REGULAR/TIMESERIES/FLOW//1HOUR/Ex1/\"\nstartDate = \"15JUL2019 19:00:00\"\nendDate = \"15AUG2019 19:00:00\"\n\nfid = HecDss.Open(dss_file)\nts = fid.read_ts(pathname,window=(startDate,endDate),trim_missing=True)\n\ntimes = np.array(ts.pytimes)\nvalues = ts.values\nplt.plot(times[~ts.nodata],values[~ts.nodata],\"o\")\nplt.show()\nfid.close()\n```\n\n### Example 3\nWrite irregular time-series data\n\nNotes:\n     The interval must be [any] integer \u003c= 0 for irregular time-series.\n     E-Parts: IR-MONTH, IR-YEAR, IR-DECADE, IR-CENTURY\n```\nfrom datetime import datetime\nfrom pydsstools.heclib.dss import HecDss\nfrom pydsstools.core import TimeSeriesContainer, UNDEFINED\n\ndss_file = 
\"example.dss\"\npathname = \"/IRREGULAR/TIMESERIES/FLOW//IR-DECADE/Ex3/\"\n\ntsc = TimeSeriesContainer()\ntsc.numberValues = 5\ntsc.pathname = pathname\ntsc.units =\"cfs\"\ntsc.type = \"INST\"\ntsc.interval = -1\ntsc.values = [100,UNDEFINED,500,5000,10000]\n\n\ntsc.times = [datetime(1900,1,12),datetime(1950,6,2,12),\n             datetime(1999,12,31,23,0,0),datetime(2009,1,20),\n             datetime(2019,7,15,5,0)]\n\nwith HecDss.Open(dss_file) as fid:\n    status = fid.put_ts(tsc)\n```\n\n### Example 4\nRead irregular time-series data\n```\nfrom pydsstools.heclib.dss import HecDss\n\ndss_file = \"example.dss\"\npathname = \"/IRREGULAR/TIMESERIES/FLOW//IR-DECADE/Ex3/\"\n\nwith HecDss.Open(dss_file) as fid:\n    ts = fid.read_ts(pathname,regular=False,window_flag=0)\n    print(ts.pytimes)\n    print(ts.values)\n    print(ts.nodata)\n    print(ts.empty)\n```\n\n### Example 5 \nWrite paired data series\n```\nimport numpy as np\nfrom pydsstools.heclib.dss import HecDss\nfrom pydsstools.core import PairedDataContainer\n\ndss_file = \"example.dss\"\npathname =\"/PAIRED/DATA/FREQ-FLOW///Ex5/\"\n\npdc = PairedDataContainer()\npdc.pathname = pathname\npdc.curve_no = 2\npdc.independent_axis = list(range(1,10))\npdc.data_no = 9\npdc.curves = np.array([[5,50,500,5000,50000,10,100,1000,10000],\n                       [11,11,11,11,11,11,11,11,11]],dtype=np.float32)\npdc.labels_list = ['Column 1','Elevens']\npdc.independent_units = 'Number'\npdc.dependent_units = 'Feet'\n\nfid = HecDss.Open(dss_file)\nfid.put_pd(pdc)\nfid.close()\n``` \n\n### Example 6 \nRead paired data-series\n\nNotes:\n    Row and column/curve indices start at 1 (not zero)\n```\nfrom pydsstools.heclib.dss import HecDss\n\ndss_file = \"example.dss\"\npathname =\"/PAIRED/DATA/FREQ-FLOW///Ex5/\"\n\n#labels_list = ['Column 1','Elevens']\n\nwith HecDss.Open(dss_file) as fid:\n    read_all = fid.read_pd(pathname)\n\n    row1,row2 = (2,4)\n    col1,col2 = (1,2)\n    read_partial = 
fid.read_pd(pathname,window=(row1,row2,col1,col2))\n```\n\n### Example 7 \nPre-allocate paired data-series\n```\nfrom pydsstools.heclib.dss import HecDss\n\ndss_file = \"example.dss\"\npathname =\"/PAIRED/PREALLOCATED DATA/FREQ-FLOW///Ex7/\"\n\nwith HecDss.Open(dss_file) as fid:\n    rows = 10\n    curves = 15\n    fid.preallocate_pd((rows,curves),pathname=pathname)\n```\n\n### Example 8 \nWrite individual curve data in pre-allocated paired data-series \n```\nfrom pydsstools.heclib.dss import HecDss\n\ndss_file = \"example.dss\"\npathname =\"/PAIRED/PREALLOCATED DATA/FREQ-FLOW///Ex7/\"\n\nwith HecDss.Open(dss_file) as fid:\n    curve_index = 5\n    curve_label = 'Column 5'\n    curve_data = [10,20,30,40,50,60,70,80,90,100]\n    fid.put_pd(curve_data,curve_index,pathname=pathname,labels_list=[curve_label])\n\n    curve_index = 2\n    curve_label = 'Column 2'\n    curve_data = [41,56,60]\n    row1,row2 = (5,7)\n    fid.put_pd(curve_data,curve_index,window = (row1,row2),\n            pathname=pathname,labels_list=[curve_label])\n```\n\n### Example 9 \nRead Spatial Grid \n```\nfrom pydsstools.heclib.dss.HecDss import Open\n\ndss_file = \"example.dss\"\n\npathname = \"/GRID/RECORD/DATA/01jan2001:1200/01jan2001:1300/Ex9/\"\n\nwith Open(dss_file) as fid:\n    dataset = fid.read_grid(pathname)\n    grid_array = dataset.read()\n    profile = dataset.profile\n    # if rasterio library is installed\n    # raster attribute is available for dataset object\n    # save grid as geotiff with epsg 2868 for coordinate reference system\n    try:\n        dataset.raster.save_tiff(r'grid_dataset.tif', {'crs': 2868})\n    except Exception:\n        # raster attribute unavailable or save failed; skip the export\n        pass\n    else:\n        print('grid data saved as grid_dataset.tif')\n```\n\n### Example 10 \nWrite Spatial Grid record\n \n```\nimport numpy as np\nimport numpy.ma as ma\nfrom affine import Affine\nfrom pydsstools.heclib.dss.HecDss import Open\nfrom pydsstools.heclib.utils import gridInfo\n\ndss_file = 
\"/GRID/RECORD/DATA/01jan2019:1200/01jan2019:1300/Ex10a/\"\npathname_out2 = \"/GRID/RECORD/DATA/01jan2019:1200/01jan2019:1300/Ex10b/\"\n\nwith Open(dss_file) as fid:\n    # Type 1: data is numpy array\n    # np.nan is considered as nodata\n    data = np.reshape(np.array(range(100),dtype=np.float32),(10,10))\n    data[0] = np.nan # assign nodata to first row\n    grid_info = gridInfo()\n    cellsize = 100 # feet\n    xmin,ymax = (1000,5000) # grid top-left corner coordinates\n    affine_transform = Affine(cellsize,0,xmin,0,-cellsize,ymax)\n    grid_info.update([('grid_type','specified'),\n                      ('grid_crs','unknown'),\n                      ('grid_transform',affine_transform),\n                      ('data_type','per-aver'),\n                      ('data_units','mm'),\n                      ('opt_time_stamped',False)])\n    fid.put_grid(pathname_out1,data,grid_info)\n            \n    # Type 2: data is numpy masked array, where masked values are considered nodata\n    data = np.reshape(np.array(range(100),dtype=np.float32),(10,10))\n    data = ma.masked_where((data \u003e= 10) \u0026 (data \u003c30),data) # mask second and third rows\n    fid.put_grid(pathname_out2,data,grid_info)\n\n```\n\n### Example 11 \nRead DSS-6 Spatial Grid record\nCopy DSS-6 Grid to DSS-7 file \n\n```\nfrom pydsstools.heclib.dss.HecDss import Open\nfrom pydsstools.heclib.utils import dss_logging\ndss_logging.config(level='Diagnostic')\n\ndss6_file = \"example_dss6.dss\"\ndss7_file = \"example.dss\"\n\npathname_in = \"/SHG/MAXTEMP/DAILY/08FEB1982:0000/08FEB1982:2400/PRISM/\"\npathname_out = \"/SHG/MAXTEMP/DAILY/08FEB1982:0000/08FEB1982:2400/Ex11/\"\n\nwith Open(dss6_file) as fid:\n    dataset = fid.read_grid(pathname_in)\n    data = dataset.read()\n    profile = dataset.profile\n\nwith Open(dss6_file) as fidin, Open(dss7_file) as fidout:\n    dataset = fidin.read_grid(pathname_in)\n    fidout.put_grid(pathname_out,dataset,compute_range = True) # recomputing range limit 
table\n\n```\n\n### Example 12 \nRead pathname catalog\n```\nfrom pydsstools.heclib.dss.HecDss import Open\n\ndss_file = \"example.dss\"\n\npathname_pattern =\"/PAIRED/*/*/*/*/*/\"\n\nwith Open(dss_file) as fid:\n    path_list = fid.getPathnameList(pathname_pattern,sort=1)\n    print('list = %r' % path_list)\n```\n\n### Example 13 \nCopy dss record\n```\nfrom pydsstools.heclib.dss.HecDss import Open\n\ndss_file = \"example.dss\"\n\npathname_in =\"/PAIRED/DATA/FREQ-FLOW///Ex5/\"\npathname_out =\"/PAIRED/DATA/FREQ-FLOW///Ex12/\"\n\nwith Open(dss_file) as fid:\n    fid.copy(pathname_in,pathname_out)\n```\n\n### Example 14 \nDelete dss record\n```\nfrom pydsstools.heclib.dss.HecDss import Open\n\ndss_file = \"example.dss\"\n\npathname =\"/PAIRED/DATA/FREQ-FLOW///Ex12/\"\n\nwith Open(dss_file) as fid:\n    fid.deletePathname(pathname)\n```\n\n### Example 15 \nSpatial Analysis on grid\n```\n# Notes\n# Experimental geospatial methods for grid\n# Not 100% sure about gridinfo that is computed for the cropped grid esp. 
for SHG and HRAP\n# User feedback on this is appreciated\n# This example code was tested using the following libraries\n# gdal 3.2.2\n# matplotlib 3.4.4\n# rasterio 1.2.1\n# Potential rasterio issue with CRS\n# https://github.com/mapbox/rasterio/blob/master/docs/faq.rst#why-cant-rasterio-find-projdb\n# Unset PROJ_LIB environment variable (i.e., SET PROJ_LIB= )\n\nfrom pydsstools.heclib.dss.HecDss import Open\nfrom pydsstools.heclib.utils import BoundingBox\n\ndss_file = \"example.dss\"\n\npathname = r\"/SHG/LCOLORADO/PRECIP/02JAN2020:1500/02JAN2020:1600/Ex15/\"\npathname_out = r\"/SHG/LCOLORADO/PRECIP/02JAN2020:1500/02JAN2020:1600/Ex15 OUT/\"\n\nfid = Open(dss_file)\nds0 = fid.read_grid(pathname)\n\nif getattr(ds0,'raster',None) is not None:\n    ds0.raster.plot(mask_zeros = True, title = 'Original Spatial Grid')\n    bbox = BoundingBox(-50000,6*10**5,50000,7*10**5)\n    ds1 = ds0.raster.mask(bbox,crop = False)\n    ds1.raster.plot(mask_zeros = True, title = 'Clipped Spatial Grid')\n    ds2 = ds1.raster.mask(bbox,crop = True)\n    ds2.raster.plot(mask_zeros = True, title = 'Cropped Spatial Grid')\n    fid.put_grid(pathname_out,ds2)\n\nfid.close()\n```\n\n\u003cimg src=\"extra/Ex15_Fig1.JPG\" width=\"400\"\u003e\u003cimg src=\"extra/Ex15_Fig2.JPG\" width=\"400\"\u003e\u003cimg src=\"extra/Ex15_Fig3.JPG\" width=\"400\"\u003e\n\nDependencies\n===\n\n- [NumPy](https://www.numpy.org)\n- [pandas](https://pandas.pydata.org/)\n- [affine](https://pypi.org/project/affine/)\n- [MS Visual C++ Redistributable for VS 2015 - 2019](https://aka.ms/vs/16/release/vc_redist.x64.exe)\n\nBuild from source\n===\nDownload the source files, open the command prompt in the root directory, and enter the following command. 
Note that the command prompt must be set up with the build tools and a Python environment.\n```\npython -m build \n```\n\nInstallation from [PyPI](https://pypi.org/project/pydsstools/)\n===\n```\npip install pydsstools\n```\n\nContributing\n===\nAll contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome.\nFeel free to send questions by [email](mailto:gyanBasyalz@gmail.com).\n\n\nLicense\n===\nThis program is free software: you can modify and/or redistribute it under the [MIT](LICENSE) license. \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgyanz%2Fpydsstools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgyanz%2Fpydsstools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgyanz%2Fpydsstools/lists"}