{"id":37078298,"url":"https://github.com/vganjali/pcwa","last_synced_at":"2026-01-14T09:05:27.451Z","repository":{"id":65352106,"uuid":"391213804","full_name":"vganjali/PCWA","owner":"vganjali","description":"A highly parallel and fast event detector based on CWT transform.","archived":false,"fork":false,"pushed_at":"2023-03-21T01:33:31.000Z","size":11949,"stargazers_count":10,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-21T12:10:24.016Z","etag":null,"topics":["cwt","event-detection","multiscale","parallel-computing","peak-detection","python","wavelet"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vganjali.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2021-07-31T00:03:03.000Z","updated_at":"2025-08-13T07:31:48.000Z","dependencies_parsed_at":"2023-01-22T21:31:22.356Z","dependency_job_id":"ba4a607b-8579-49c9-be58-53a7d59e0e9f","html_url":"https://github.com/vganjali/PCWA","commit_stats":{"total_commits":56,"total_committers":1,"mean_commits":56.0,"dds":0.0,"last_synced_commit":"2222b5c69179a0a5f7686384fb842c448cf5f4a1"},"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/vganjali/PCWA","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vganjali%2FPCWA","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vganjali%2FPCWA/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vganjali%2FPCWA/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/vganjali%2FPCWA/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vganjali","download_url":"https://codeload.github.com/vganjali/PCWA/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vganjali%2FPCWA/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28414741,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T08:38:59.149Z","status":"ssl_error","status_checked_at":"2026-01-14T08:38:43.588Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cwt","event-detection","multiscale","parallel-computing","peak-detection","python","wavelet"],"created_at":"2026-01-14T09:05:26.753Z","updated_at":"2026-01-14T09:05:27.443Z","avatar_url":"https://github.com/vganjali.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PCWA\nA highly parallel and fast event detector based on the continuous wavelet transform (CWT). *PCWA* is a multiscale approach to finding events of any shape using a mother wavelet that matches the event shape (details are provided in the *Nature Communications* paper). 
Unlike previous CWT-based peak finders, *PCWA* works with any user-defined mother wavelet function, \u003cimg src=\"https://render.githubusercontent.com/render/math?math=\\psi(u,s)\"\u003e, by grouping and clustering initial candidate points (local maxima). Clustering involves *macro-* and *micro-* (*u-*) clustering steps that break large data sets into smaller *M-clusters*. These steps use the **x-axis**, **scale-axis**, and **coefficient values** together to improve the accuracy of the located events.\n\n\n## Requirements\n- Python \u003e= 3.8.5\n- numpy \u003e= 1.19.2\n- scipy \u003e= 1.6.2\n- matplotlib \u003e= 3.3.4\n- h5py \u003e= 2.10.0\n- pandas \u003e= 1.2.1\n\nOlder versions (Python \u003e 3) will most likely work, but they had not been tested at the time of writing.\n\n## Installing PCWA\n*PCWA* is available on **PyPI** and can be installed via `pip install --upgrade pcwa`.\n\n## How to use PCWA\n*PCWA* is designed as a Python class and must be initialized. Import *pcwa* and create a new instance:\n\n```python\nimport pcwa as pcwa\n\npcwa_analyzer = pcwa.PCWA(parallel=False)\n# pcwa_analyzer.show_wavelets = True\npcwa_analyzer.w, pcwa_analyzer.h = 1.5, 1.5\npcwa_analyzer.selectivity = 0.7\npcwa_analyzer.use_scratchfile = False\n```\nProperties can be set during or after initialization. 
The available properties are listed below:\n\n## Properties\n```python\ndt = 1e-5                              # sampling period of the signal in s\nparallel = True                        # enable/disable multiprocessing\nmcluster = True                        # enable/disable macro-clustering\nlogscale = True                        # enable/disable logarithmic scale for scale-axis\nwavelet = ['ricker']                   # list of wavelet function names\nscales = [0.01e-3,0.1e-3,30]           # scale range in s, followed by the number of scales\nselectivity = 0.5                      # minimum number of candidates in a valid micro-cluster\nw = 2                                  # spreading factor in x-axis\nh = 6                                  # spreading factor in y-axis (scale-axis)\nextent = 1                             # global extent along x and y axis, used in macro-clustering\ntrace = None                           # trace (data) variable, a 1D numpy vector\nevents = []                            # list of detected events (valid after calling detect_events())\ncwt = {}                               # dictionary of CWT coefficients\nwavelets = {}                          # dictionary of generated, scaled and normalized 1D wavelet arrays\nshow_wavelets = False                  # plot wavelet functions\nupdate_cwt = True                      # if False, reuses the current CWT coefficients to detect events, saving time when tuning threshold parameters\nkeep_cwt = False                       # if False, uses less memory by running conv() and local_maxima() at the same time. 
Otherwise, generates the entire CWT coefficient matrix before looking for local maxima (conventional method)\nuse_scratchfile = False                # stores CWT coefficients in a scratch file (HDF5 format)\n```\n\n## Event Detection\nAfter initialization, events can be detected by calling the `detect_events()` method.\n\n```python\nevents = pcwa_analyzer.detect_events(trace=data,wavelet=['ricker'],scales=[0.1e-3,1.0e-3,50],threshold=3)\ntpr, fdr = pcwa.tprfdr(truth,events['loc'],e=7e-3/1e-5,MS=True) # e is the error tolerance for the event location, here 7 ms / 0.01 ms (in data points), where 0.01 ms is the bin size\n```\nSome *PCWA* properties can be overridden when calling `detect_events()` by passing the following parameters:\n- `trace`:        overrides the trace\n- `wavelet`:      overrides wavelet functions\n- `scales`:       overrides scales\n\n`threshold` is the only parameter required at each call.\n\n\n## Example Code\nThe example below shows how to use *PCWA* to detect peaks in a simulated mass spectroscopy dataset. 
\n\n```python\nimport numpy as np\nimport pandas as pd\nimport pcwa as pcwa\nimport matplotlib.pyplot as plt\n\n# read the raw mass spectroscopy data and truth values\ndf_raw = pd.read_csv('n100sig66_dataset_1_25/Dataset_14/RawSpectra/noisy22.txt',sep=' ')\ndf_true = pd.read_csv('n100sig66_dataset_1_25/Dataset_14/truePeaks/truth22.txt',sep=' ')\n\n# create pcwa_analyzer object and set the desired parameters\npcwa_analyzer = pcwa.PCWA()\npcwa_analyzer.trace = df_raw['Intensity']\npcwa_analyzer.dt = 1\npcwa_analyzer.scales = [10,100,100]\npcwa_analyzer.wavelet = ['ricker']\npcwa_analyzer.keep_cwt = False\npcwa_analyzer.w, pcwa_analyzer.h = 0.2, 1\npcwa_analyzer.show_wavelets = False\npcwa_analyzer.use_scratchfile = False\n\n# detect events (peaks)\nevents = pcwa_analyzer.detect_events(threshold=200)\n\n# fine-tune the location of each detected peak: take the intensity maximum within +/- one scale of the detected location\nloc = [int(e-events['scale'][n]+np.argmax(df_raw['Intensity'][int(e-events['scale'][n]):int(e+events['scale'][n])])) for n,e in enumerate(events['loc'])]\n\nfig, ax = plt.subplots(3,1,figsize=(16,4),dpi=96,sharex=True,gridspec_kw={'height_ratios': [12,1,1]})\nplt.subplots_adjust(hspace=0,wspace=0)\nl0, = ax[1].plot(df_true['Mass'],df_true['Particles']*0, '|',markersize=10,color='gray',label='Truth')\nax[0].plot(df_raw['Mass'],df_raw['Intensity'],color='blue')\nl1, = ax[2].plot(df_raw['Mass'].iloc[loc],[0]*len(loc),'|',markersize=10,color='red',label='PCWA')\nax[1].set_yticks([])\nax[1].set_ylim(0,0)\nax[2].set_yticks([])\nax[2].set_ylim(0,0)\nax[0].set_ylabel('Intensity')\nax[-1].set_xlabel('m/z')\nplt.legend(handles=[l0,l1], bbox_to_anchor=(1.0, 4), loc='upper left')\nplt.show()\n```\n\nOnce the analysis is finished, the plot window should show the results as shown below:\n\u003e![example_0](images/example_0_output.png)\n\nTo calculate the **TPR** and **FDR** values, which are useful for *ROC* plots, *PCWA* provides a function for that.\n```python\ntrue_peaks = np.sort(df_true['Mass'].to_numpy())\ndetected_peaks = 
np.sort(df_raw['Mass'].iloc[loc].to_numpy())\ntpr, fdr = pcwa.tprfdr(true_peaks, detected_peaks, e=0.01, MS=True)\nprint(f\"TPR={tpr:.3f}, FDR={fdr:.3f}\")\n```\n\u003e TPR=0.864, FDR=0.014\n\nThe command window should show the *TPR* and *FDR* values, computed from the ground-truth values and the acceptable error range (1% here). The `MS` parameter determines how the acceptable error is applied: for mass spectroscopy data, the error is taken relative to the mass value (\u003cimg src=\"https://render.githubusercontent.com/render/math?math=e \\times Mass\"\u003e). If `MS=False`, the error is treated as an absolute value.\nThe full example file is provided in this repository ([ms_example.py](https://github.com/vganjali/PCWA/blob/main/ms_example.py)).\n\n## Citation\nIf you use *PCWA* or want to cite this work, please cite our [paper](https://www.nature.com/articles/s41467-022-28703-z) published in _Nature Communications_:\n\n```bibtex\n@article{ganjalizadehFastCustomWavelet2022,\n  title = {Fast Custom Wavelet Analysis Technique for Single Molecule Detection and Identification},\n  author = {Ganjalizadeh, Vahid and Meena, Gopikrishnan G. and Wall, Thomas A. and Stott, Matthew A. and Hawkins, Aaron R. and Schmidt, Holger},\n  year = {2022},\n  journal = {Nature Communications},\n  volume = {13},\n  number = {1},\n  pages = {1--9},\n  publisher = {{Nature Publishing Group}},\n  doi = {10.1038/s41467-022-28703-z},\n  issn = {2041-1723}\n}\n```\n\n## Reference\nThe provided dataset is a subset taken from the simulated Mass Spectroscopy dataset (DOI: [10.1093/bioinformatics/bti254](https://doi.org/10.1093/bioinformatics/bti254)).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvganjali%2Fpcwa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvganjali%2Fpcwa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvganjali%2Fpcwa/lists"}