{"id":28376347,"url":"https://github.com/nikothomas/nppes","last_synced_at":"2025-06-26T10:31:17.678Z","repository":{"id":295982855,"uuid":"991898071","full_name":"nikothomas/nppes","owner":"nikothomas","description":"A comprehensive Rust library for working with National Plan and Provider Enumeration System (NPPES) healthcare provider data.","archived":false,"fork":false,"pushed_at":"2025-05-30T14:07:43.000Z","size":108,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-16T01:09:21.027Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nikothomas.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-28T10:07:36.000Z","updated_at":"2025-06-07T08:41:18.000Z","dependencies_parsed_at":"2025-05-28T11:43:41.302Z","dependency_job_id":"c1416254-3490-42af-b016-800679f2922e","html_url":"https://github.com/nikothomas/nppes","commit_stats":null,"previous_names":["nikothomas/nppes"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nikothomas/nppes","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nikothomas%2Fnppes","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nikothomas%2Fnppes/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nikothomas%2Fnppes/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nikothomas%2Fnppes/manifests","owne
r_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nikothomas","download_url":"https://codeload.github.com/nikothomas/nppes/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nikothomas%2Fnppes/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260080036,"owners_count":22955809,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-30T00:06:31.561Z","updated_at":"2025-06-26T10:31:17.669Z","avatar_url":"https://github.com/nikothomas.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# NPPES Data Library\n\nA comprehensive Rust library for working with National Plan and Provider Enumeration System (NPPES) healthcare provider data.\n\n## Overview\n\nThe NPPES dataset contains information about healthcare providers in the United States, including:\n- ~8 million healthcare provider records\n- 330+ data columns including NPI numbers, provider information, taxonomy codes\n- Entity types: Individual providers (code 1) vs Organizations (code 2) \n- Healthcare provider taxonomy codes for specialties\n- Geographic information and licensing data\n\n## Features\n\n- **Type-safe data structures** for all NPPES file formats\n- **CSV parsing and loading** with validation and error handling  \n- **Analytics and querying** functionality for data analysis\n- **Schema validation** against official NPPES documentation\n- **Support for all NPPES reference files** (Other Names, Practice Locations, Endpoints)\n\n## NPPES Data Files Supported\n\n### 
Main Data File\n- File: `npidata_pfile_yyyymmdd-yyyymmdd.csv` (9.9GB)\n- Contains: ~8M healthcare provider records with 330+ columns\n\n### Reference Files\n- **Other Name Reference**: Additional organization names for Type 2 NPIs\n- **Practice Location Reference**: Non-primary practice locations\n- **Endpoint Reference**: Healthcare endpoints associated with NPIs  \n- **Taxonomy Reference**: Healthcare provider classification codes (NUCC)\n\n## Installation\n\nAdd this to your `Cargo.toml`:\n\n```toml\n[dependencies]\nnppes = \"0.0.3\"\n```\n\nThe CLI binary is called `npcli`.\n\n## Usage\n\n### Basic Usage\n\n```rust\nuse nppes::prelude::*;\n\n// Load main NPPES data\nlet reader = NppesReader::new();\nlet providers = reader.load_main_data(\"data/npidata_pfile_20050523-20250511.csv\")?;\n\nprintln!(\"Loaded {} providers\", providers.len());\n\n// Load taxonomy reference data\nlet taxonomy_data = reader.load_taxonomy_data(\"data/nucc_taxonomy_250.csv\")?;\n```\n\n### Command Line Interface (CLI)\n\nYou can use the CLI tool `npcli` to download, query, and export NPPES data.\n\n#### Example: Download the latest NPPES data\n\n```sh\nnpcli download --out-dir ./data\n```\n\n#### Example: Show statistics for a dataset\n\n```sh\nnpcli stats --data-dir ./data\n```\n\n#### Example: Query providers by state and specialty\n\n```sh\nnpcli query --data-dir ./data --state CA --specialty Cardiology\n```\n\n#### Example: Export data to JSON\n\n```sh\nnpcli export --data-dir ./data --output ca_cardiologists.json --format json --state CA --specialty Cardiology\n```\n\n### Analytics and Querying\n\n```rust\nuse nppes::prelude::*;\n\n// Create analytics engine\nlet analytics = NppesAnalytics::new(\u0026providers)\n    .with_taxonomy_reference(\u0026taxonomy_data);\n\n// Get dataset statistics\nlet stats = analytics.dataset_stats();\nstats.print_summary();\n\n// Find providers by state\nlet ca_providers = analytics.find_by_state(\"CA\");\nprintln!(\"California providers: {}\", 
ca_providers.len());\n\n// Find providers by taxonomy code\nlet physicians = analytics.find_by_taxonomy_code(\"208600000X\");\nprintln!(\"Internal Medicine physicians: {}\", physicians.len());\n\n// Complex queries with builder pattern\nlet query_results = ProviderQuery::new(\u0026analytics)\n    .entity_type(EntityType::Individual)\n    .state(\"NY\")\n    .active_only()\n    .execute();\n\nprintln!(\"Active individual providers in NY: {}\", query_results.len());\n```\n\n### Working with Individual Records\n\n```rust\nuse nppes::prelude::*;\n\n// Find a specific provider by NPI\nlet npi = Npi::new(\"1234567890\".to_string())?;\nif let Some(provider) = analytics.find_by_npi(\u0026npi) {\n    println!(\"Provider: {}\", provider.display_name());\n    println!(\"Entity Type: {:?}\", provider.entity_type);\n    println!(\"Active: {}\", provider.is_active());\n    \n    // Get primary taxonomy\n    if let Some(primary_taxonomy) = provider.primary_taxonomy() {\n        println!(\"Primary specialty: {}\", primary_taxonomy.code);\n    }\n}\n```\n\n### Data Enrichment\n\n```rust\nuse nppes::prelude::*;\n\n// Enrich providers with human-readable taxonomy descriptions\nlet enriched_providers = analytics.enrich_with_taxonomy_descriptions()?;\n\nfor enriched in enriched_providers.iter().take(10) {\n    println!(\"Provider: {}\", enriched.provider.display_name());\n    \n    for taxonomy in \u0026enriched.enriched_taxonomies {\n        if let Some(display_name) = \u0026taxonomy.display_name {\n            println!(\"  Specialty: {}\", display_name);\n        }\n    }\n}\n```\n\n### Advanced Analytics\n\n```rust\nuse nppes::prelude::*;\n\n// Get top states by provider count\nlet top_states = analytics.top_states_by_provider_count(10);\nfor (state, count) in top_states {\n    println!(\"{}: {} providers\", state, count);\n}\n\n// Get top specialties\nlet top_specialties = analytics.top_taxonomy_codes_by_provider_count(10);\nfor (code, count) in top_specialties {\n    if let 
Some(taxonomy_ref) = analytics.get_taxonomy_description(\u0026code) {\n        if let Some(display_name) = \u0026taxonomy_ref.display_name {\n            println!(\"{}: {} providers\", display_name, count);\n        }\n    }\n}\n\n// Date-based queries\nuse chrono::NaiveDate;\nlet start_date = NaiveDate::from_ymd_opt(2023, 1, 1).unwrap();\nlet end_date = NaiveDate::from_ymd_opt(2023, 12, 31).unwrap();\n\nlet new_providers = analytics.providers_enumerated_between(start_date, end_date);\nprintln!(\"Providers enumerated in 2023: {}\", new_providers.len());\n```\n\n## Configuration Options\n\n### Reader Configuration\n\n```rust\nuse nppes::prelude::*;\n\nlet reader = NppesReader::new()\n    .with_header_validation(true)  // Validate CSV headers (default: true)\n    .with_skip_invalid_records(false); // Skip invalid records (default: false)\n```\n\n### Error Handling\n\nThe library uses a comprehensive error system:\n\n```rust\nuse nppes::prelude::*;\n\nmatch reader.load_main_data(\"invalid_path.csv\") {\n    Ok(providers) =\u003e println!(\"Loaded {} providers\", providers.len()),\n    Err(NppesError::FileNotFound(path)) =\u003e {\n        eprintln!(\"File not found: {}\", path);\n    }\n    Err(NppesError::CsvParse(msg)) =\u003e {\n        eprintln!(\"CSV parsing error: {}\", msg);\n    }\n    Err(NppesError::DataValidation(msg)) =\u003e {\n        eprintln!(\"Data validation error: {}\", msg);\n    }\n    Err(e) =\u003e eprintln!(\"Other error: {}\", e),\n}\n```\n\n## Data Structures\n\n### Core Types\n\n- `NppesRecord`: Main provider record with all NPPES data\n- `EntityType`: Individual vs Organization provider type\n- `Npi`: Type-safe NPI number wrapper\n- `TaxonomyCode`: Healthcare specialty/taxonomy information\n- `Address`: Mailing and practice location addresses\n\n### Reference Types\n\n- `TaxonomyReference`: Healthcare taxonomy lookup data\n- `OtherNameRecord`: Additional organization names\n- `PracticeLocationRecord`: Secondary practice locations  \n- 
`EndpointRecord`: Healthcare endpoints\n\n## Performance Considerations\n\n- The main NPPES file is 9.9GB with ~8M records\n- Recommend 16GB+ RAM for full dataset processing\n- Use streaming or chunked processing for memory-constrained environments\n- Consider creating indexes for frequently queried fields\n\n## License\n\nMIT License\n\n## Contributing\n\nContributions welcome! Please see CONTRIBUTING.md for guidelines. ","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnikothomas%2Fnppes","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnikothomas%2Fnppes","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnikothomas%2Fnppes/lists"}