{"id":31962154,"url":"https://github.com/agentmorris/inat-diff","last_synced_at":"2025-10-14T16:36:08.567Z","repository":{"id":317945429,"uuid":"1069470193","full_name":"agentmorris/inat-diff","owner":"agentmorris","description":"Find species that recently occurred for the first time in iNaturalist","archived":false,"fork":false,"pushed_at":"2025-10-04T02:59:21.000Z","size":26,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-04T04:11:49.052Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/agentmorris.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-04T02:06:15.000Z","updated_at":"2025-10-04T02:59:24.000Z","dependencies_parsed_at":"2025-10-04T04:12:43.724Z","dependency_job_id":null,"html_url":"https://github.com/agentmorris/inat-diff","commit_stats":null,"previous_names":["agentmorris/inat-diff"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/agentmorris/inat-diff","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/agentmorris%2Finat-diff","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/agentmorris%2Finat-diff/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/agentmorris%2Finat-diff/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/agentmorris%2Finat-diff/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/agentmorris","download_url":"https://codeload.github.com/agentmorris/inat-diff/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/agentmorris%2Finat-diff/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279019575,"owners_count":26086753,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-14T02:00:06.444Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-14T16:36:07.493Z","updated_at":"2025-10-14T16:36:08.554Z","avatar_url":"https://github.com/agentmorris.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# iNaturalist Difference Detection\n\nA Python library and CLI tool for querying iNaturalist observations to detect species presence patterns across regions and time periods. 
Designed for invasive species monitoring and biodiversity research.\n\nAn example of the output of this system is available [here](http://dmorris.net/misc/tmp/last-30-days-oregon.html); that page shows taxa that were observed in Oregon for the first time in the 30 days prior to 2025.10.03.\n\n## Features\n\n- Query species observations by region and time period\n- Detect potentially \"new\" species in regions (no previous observations)\n- List all species observed in a region during a time period\n- Support for flexible time period formats\n- Command-line interface and Python library\n\n## Installation\n\n```bash\npip install -r requirements.txt\npip install -e .\n```\n\n## Quick Start\n\n### Command Line Interface\n\n```bash\n# Query specific species observations\ninat-diff query \"Panthera leo\" \"last 30 days\" \"Kenya\"\n\n# Find all new species in a region (i.e., species observed in a region for the first time recently)\ninat-diff new-species \"this month\" \"Oregon\" --lookback-years 20\n\n# Check if a specific species is new to a region\ninat-diff new-species \"this year\" \"Florida\" \"Python bivittatus\" --lookback-years 20\n\n# List all species in a region during time period\ninat-diff list-species \"last month\" \"Oregon\"\n```\n\n### Python Library\n\n```python\nfrom inat_diff import SpeciesQuery\n\n# Initialize query engine\nquery = SpeciesQuery()\n\n# Find all new species in a region (main use case)\nresults = query.find_all_new_species_in_period(\n    time_period=\"this month\",\n    region=\"Oregon\",\n    lookback_years=20,\n    rate_limit=1.0,  # seconds between API calls\n    verbose=True\n)\n\nprint(f\"Found {results['new_species_count']} new species:\")\nfor species in results['new_species']:\n    print(f\"  {species['name']} ({species.get('preferred_common_name', 'no common name')})\")\n\n# Check if a specific species is new to a region\nspecific = query.find_new_species_in_period(\n    taxon_name=\"Python bivittatus\",\n    time_period=\"this 
year\",\n    region=\"Florida\",\n    lookback_years=20\n)\n\nprint(f\"New to region: {specific['is_new_to_region']}\")\nprint(f\"Analysis: {specific['analysis']}\")\n```\n\n## Supported Formats\n\n### Time Periods\n- `\"last N days/weeks/months/years\"`\n- `\"past N days/weeks/months/years\"`\n- `\"this month/year\"`\n- `\"YYYY-MM-DD to YYYY-MM-DD\"` (explicit date ranges)\n\n### Regions\niNaturalist supports various place types. The tool works with any place name recognized by iNaturalist:\n\n**Standard Places (maintained by iNaturalist staff):**\n- **Countries**: \"United States\", \"Canada\", \"Kenya\", \"Mexico\" (253 total)\n- **States/Provinces**: \"California\", \"Oregon\", \"British Columbia\", \"Ontario\" (~3,000 total)\n- **Counties/Level 2**: \"Multnomah County\", \"King County\" (~40,000 total)\n- **Continents**: \"North America\", \"Africa\", \"Europe\"\n- **US National Parks**: \"Yellowstone National Park\", \"Yosemite National Park\" (429 parks)\n\n**Community Curated Places:**\n- State parks, wildlife areas, watersheds, and other user-created boundaries\n- Search by name: the tool will find the best match from iNaturalist's database\n\n**Tips:**\n- Use specific names: \"Washington\" (state) vs \"Washington County\"\n- The tool prioritizes: countries → states → counties when multiple matches exist\n- If unsure, check [iNaturalist Places](https://www.inaturalist.org/places) to verify the exact name\n\n### Taxa\n- Latin names in iNaturalist taxonomy: \"Panthera leo\", \"Python bivittatus\"\n- Genus and species format recommended\n\n## Commands\n\n### `query`\nQuery for specific species observations in a region and time period.\n\n```bash\ninat-diff query \"Canis lupus\" \"last 6 months\" \"Montana\"\n```\n\n### `new-species` (Main Use Case)\nFind all species that appear to be new to a region, or check a specific species.\n\n**Find all new species in a region:**\n```bash\ninat-diff new-species \"this month\" \"Oregon\" --lookback-years 20 --rate-limit 
1.0\n```\n\n**Options:**\n- `--lookback-years N`: Years to look back for historical data (default: 5, recommended: 20)\n- `--rate-limit N`: Seconds between API calls (default: 1.0 = 60 req/min, fastest safe setting: 0.6 = 100 req/min)\n\n**Performance:** Approximate run times for Oregon:\n- Last week (~2,000 species): ~35 minutes at default rate\n- Last month (~6,000 species): ~100 minutes at default rate\n- Use `--rate-limit 0.6` to go faster (the fastest rate iNaturalist allows)\n\n### `list-species`\n\nList all species observed in a region during a time period.\n\n```bash\ninat-diff list-species \"this month\" \"Oregon\"\n```\n\n## HTML Visualization\n\nGenerate interactive HTML reports from JSON output:\n\n```bash\n# Save query results to JSON\ninat-diff new-species \"this month\" \"Oregon\" --output-file results.json\n\n# Generate HTML visualization\ninat-diff-visualize results.json report.html\n```\n\nThe HTML report includes:\n- Summary statistics with visual cards\n- Sortable species lists with common and scientific names\n- Direct links to iNaturalist observations for each species\n- Badges showing taxonomic rank, iconic taxon, and \"new\" status\n- Observation counts for current and historical periods\n- Responsive design for mobile and desktop viewing\n\n**Quality Grade Annotations (Optional):**\n```bash\n# Include observation quality grades (Research/Needs ID/Casual)\ninat-diff-visualize results.json report.html --include-quality\n\n# Customize API rate limiting (default: 1.2 seconds between calls)\ninat-diff-visualize results.json report.html --include-quality --rate-limit 0.6\n```\n\nThis fetches the highest available quality grade for each species from iNaturalist's API:\n- Displays \"Best quality: Research Grade\", \"Needs ID\", or \"Casual\" for each species\n- Requires O(N) API calls where N = number of species (can be slow for large datasets)\n- Includes automatic retry logic (3 attempts with exponential backoff) for failed API calls\n- Progress indication shows current 
species being processed\n- Rate limiting respects iNaturalist API guidelines (default: 1.2s = 50 req/min, safe range: 0.6-1.2s)\n- Useful for filtering to high-quality observations for scientific purposes\n- Disabled by default to keep visualization fast and offline\n\n**Example:**\n```bash\n# Complete workflow\ninat-diff new-species \"last month\" \"Delaware\" -o delaware.json\ninat-diff-visualize delaware.json delaware.html\nopen delaware.html  # or xdg-open on Linux\n\n# With quality grades (slower but more informative)\ninat-diff-visualize delaware.json delaware_with_quality.html --include-quality\n```\n\n## JSON Output Format\n\nAll commands support the `--output-file` (`-o`) option to save results as JSON. This is useful for creating visualizations, further analysis, or linking to iNaturalist observations.\n\n### `new-species` Output\n\n```json\n{\n  \"query\": {\n    \"region\": \"Oregon\",\n    \"place_id\": 10,\n    \"time_period\": \"this month\",\n    \"start_date\": \"2025-10-01\",\n    \"end_date\": \"2025-10-03\"\n  },\n  \"lookback_period\": \"2005-09-30 to 2025-09-30\",\n  \"lookback_years\": 20,\n  \"total_species_in_period\": 1234,\n  \"new_species_count\": 5,\n  \"established_species_count\": 1229,\n  \"new_species\": [\n    {\n      \"id\": 12345,\n      \"name\": \"Panthera leo\",\n      \"preferred_common_name\": \"Lion\",\n      \"rank\": \"species\",\n      \"iconic_taxon\": \"Animalia\",\n      \"observation_count\": 3,\n      \"historical_count\": 0\n    }\n  ],\n  \"established_species\": [\n    {\n      \"id\": 67890,\n      \"name\": \"Canis lupus\",\n      \"preferred_common_name\": \"Gray Wolf\",\n      \"rank\": \"species\",\n      \"iconic_taxon\": \"Animalia\",\n      \"observation_count\": 15,\n      \"historical_count\": 142\n    }\n  ],\n  \"rate_limit_seconds\": 1.2\n}\n```\n\n**Field Descriptions:**\n- **`query`**: Metadata about the search parameters\n  - `region`: Region name as provided\n  - `place_id`: iNaturalist place ID 
for the region\n  - `time_period`: Time period string as provided\n  - `start_date`/`end_date`: Parsed date range (YYYY-MM-DD)\n- **`lookback_period`**: Historical date range used for comparison\n- **`lookback_years`**: Years of lookback used\n- **`total_species_in_period`**: Total unique species observed in the current period\n- **`new_species_count`**: Number of species with no historical observations\n- **`established_species_count`**: Number of species with historical observations\n- **`new_species`**: Array of species objects with no prior observations\n- **`established_species`**: Array of species objects with prior observations\n- **`rate_limit_seconds`**: Rate limiting setting used\n\n**Species Object Fields:**\n- `id`: iNaturalist taxon ID (can be used to construct URLs: `https://www.inaturalist.org/taxa/{id}`)\n- `name`: Scientific (Latin) name\n- `preferred_common_name`: Common name in English (may be `null`)\n- `rank`: Taxonomic rank (`\"species\"`, `\"genus\"`, `\"subspecies\"`, etc.)\n- `iconic_taxon`: High-level taxonomic group (`\"Animalia\"`, `\"Plantae\"`, `\"Insecta\"`, `\"Fungi\"`, etc.)\n- `observation_count`: Number of observations in the current period\n- `historical_count`: Number of observations in the lookback period (0 for new species)\n\n### `query` Output\n\n```json\n{\n  \"query\": {\n    \"taxon_name\": \"Panthera leo\",\n    \"taxon_id\": 12345,\n    \"region\": \"Kenya\",\n    \"place_id\": 6986,\n    \"time_period\": \"last 30 days\",\n    \"start_date\": \"2025-09-03\",\n    \"end_date\": \"2025-10-03\"\n  },\n  \"place_info\": {\n    \"id\": 6986,\n    \"name\": \"Kenya\",\n    \"display_name\": \"Kenya\"\n  },\n  \"observations\": {\n    \"total_results\": 42,\n    \"per_page\": 30,\n    \"page\": 1,\n    \"results\": [...]\n  },\n  \"total_results\": 42,\n  \"per_page\": 30,\n  \"page\": 1\n}\n```\n\n### `list-species` Output\n\n```json\n{\n  \"query\": {\n    \"region\": \"Oregon\",\n    \"place_id\": 10,\n    \"time_period\": 
\"last month\",\n    \"start_date\": \"2025-09-03\",\n    \"end_date\": \"2025-10-03\"\n  },\n  \"species_count\": 1234,\n  \"total_observations\": 5678,\n  \"species\": [\n    {\n      \"id\": 12345,\n      \"name\": \"Canis lupus\",\n      \"preferred_common_name\": \"Gray Wolf\",\n      \"rank\": \"species\",\n      \"observation_count\": 15\n    }\n  ]\n}\n```\n\n## Library Components\n\n- **`iNatClient`**: Core API client for iNaturalist\n- **`SpeciesQuery`**: Main query interface\n- **`parse_time_period()`**: Time period parsing utilities\n- **CLI**: Command-line interface\n\n## Use Cases\n\n- **Invasive Species Monitoring**: Detect when non-native species first appear in new regions\n- **Biodiversity Research**: Track species distribution changes over time\n- **Citizen Science**: Analyze iNaturalist observation patterns\n- **Conservation**: Monitor protected species presence\n\n## Implementation Details\n\n### Efficient API Usage\nThe system uses iNaturalist's `species_counts` endpoint for efficient querying:\n1. Fetches all species in the current period (a few API calls with pagination)\n2. For each species, checks historical presence (1 API call per species)\n3. Respects rate limits: 60-100 requests/minute per [iNaturalist API guidelines](https://www.inaturalist.org/pages/api+recommended+practices)\n\n### Rate Limiting\n- Default: 1.0 second between requests (60 req/min)\n- Recommended max: 0.6 seconds (100 req/min)\n- Automatically adjusts on errors with exponential backoff\n\n## Limitations\n\n- \"New\" species detection is relative to available iNaturalist data, not actual species establishment\n- Performance scales with number of unique species in the time period\n- Subject to iNaturalist API rate limits (can take hours for large queries)\n- Geographic boundaries depend on iNaturalist's place database\n- Lookback period is limited to available historical data (iNaturalist started ~2008)\n\n## Contributing\n\nThis is a prototype library. 
Future enhancements could include:\n- Caching for place and taxon lookups\n- Batch processing for multiple species\n- Geographic boundary file support\n- Web-based interface\n- Advanced statistical analysis\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fagentmorris%2Finat-diff","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fagentmorris%2Finat-diff","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fagentmorris%2Finat-diff/lists"}