{"id":32929185,"url":"https://github.com/colehanan1/door-python-toolkit","last_synced_at":"2026-01-20T16:59:09.790Z","repository":{"id":322725989,"uuid":"1090656995","full_name":"colehanan1/door-python-toolkit","owner":"colehanan1","description":" Python toolkit for working with the DoOR (Database of Odorant Responses) database","archived":false,"fork":false,"pushed_at":"2025-11-06T03:04:29.000Z","size":3262,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-11-06T03:07:19.000Z","etag":null,"topics":["door","drosophila","machine-learning","neuroscience","odorant-receptors","olfaction","python","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/colehanan1.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-06T00:50:34.000Z","updated_at":"2025-11-06T03:04:33.000Z","dependencies_parsed_at":"2025-11-06T03:07:22.133Z","dependency_job_id":null,"html_url":"https://github.com/colehanan1/door-python-toolkit","commit_stats":null,"previous_names":["colehanan1/door-python-toolkit"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/colehanan1/door-python-toolkit","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/colehanan1%2Fdoor-python-toolkit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/colehanan1
%2Fdoor-python-toolkit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/colehanan1%2Fdoor-python-toolkit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/colehanan1%2Fdoor-python-toolkit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/colehanan1","download_url":"https://codeload.github.com/colehanan1/door-python-toolkit/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/colehanan1%2Fdoor-python-toolkit/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":283910072,"owners_count":26915128,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-11T02:00:06.610Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["door","drosophila","machine-learning","neuroscience","odorant-receptors","olfaction","python","pytorch"],"created_at":"2025-11-11T11:14:16.042Z","updated_at":"2026-01-20T16:59:09.781Z","avatar_url":"https://github.com/colehanan1.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![PyPI version](https://badge.fury.io/py/door-python-toolkit.svg)](https://badge.fury.io/py/door-python-toolkit)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Python 
3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)\n\n# DoOR Python Toolkit\n\n**Comprehensive Python toolkit for Drosophila olfactory research: DoOR database integration, FlyWire connectomics, pathway analysis, and neural network preprocessing.**\n\nExtract, analyze, and integrate *Drosophila melanogaster* odorant-receptor response data with connectome analysis. No R installation required.\n\n---\n\n## 🚀 Features\n\n**NEW in v1.0.0:** Complete mushroom body circuit validation with ORN→PN→KC→MBON pathway tracing! 🎉\n\n### Core DoOR Integration\n- ✅ **Pure Python** - Extract DoOR R data files without installing R\n- 🚀 **Fast** - Parquet-based caching for quick loading\n- 📊 **693 odorants × 78 receptors** - Comprehensive olfactory data\n- 🔍 **Search \u0026 Filter** - Query by odorant name, receptor, or properties\n\n### FlyWire Connectomics\n- 🧠 **Interglomerular Cross-Talk** - Analyze lateral inhibition pathways\n- 🔬 **NetworkX Graphs** - 108,980+ pathways across 38 glomeruli\n- 📈 **Statistical Analysis** - Hub detection, community detection, asymmetry\n- 🎨 **Publication-Ready Figures** - High-resolution network visualizations\n\n### Mushroom Body Circuit Validation\n- 🎯 **ORN → PN → KC → MBON Tracing** - Complete learning circuit pathways\n- 🧬 **Anatomical Validation** - Validate LASSO-identified receptors in MB circuits\n- 🏆 **Priority Ranking** - Integrate behavioral importance with connectivity\n- 📊 **Circuit Classification** - Appetitive (α/β) vs Aversive (γ) lobe specialization\n- 🔬 **Experimental Design** - Generate priority matrices for optogenetic validation\n\n### Advanced Features\n- 🗺️ **FlyWire Integration** - Map receptors to neural connectivity (100K+ cells)\n- 🛤️ **Pathway Analysis** - Trace Or47b, Or42b, Or92a pathways\n- 🤖 **ML-Ready** - PyTorch/NumPy integration with sparse encoding\n- 🧪 **Experiment Design** - PGCN blocking protocol generation\n- 🎓 **LASSO Behavioral Prediction** - Identify sparse 
receptor circuits from optogenetic data\n\n---\n\n## 📦 Quick Start\n\n### Installation\n\n```bash\n# Core package\npip install door-python-toolkit\n\n# With all features\npip install door-python-toolkit[all]\n\n# Individual feature sets\npip install door-python-toolkit[flywire]      # FlyWire integration\npip install door-python-toolkit[connectomics] # Connectomics module\npip install door-python-toolkit[torch]        # PyTorch support\npip install door-python-toolkit[extract]      # DoOR extraction\n```\n\n### Basic Usage\n\n```python\nfrom door_toolkit import DoOREncoder\n\n# Load encoder\nencoder = DoOREncoder(\"door_cache\")\n\n# Encode single odorant → 78-dim PN activation vector\npn_activation = encoder.encode(\"acetic acid\")\nprint(pn_activation.shape)  # (78,)\n\n# Search odorants\nacetates = encoder.list_available_odorants(pattern=\"acetate\")\nprint(f\"Found {len(acetates)} acetates\")  # 36\n```\n\n### Connectomics Analysis\n\n```python\nfrom door_toolkit.connectomics import CrossTalkNetwork\nfrom door_toolkit.connectomics.pathway_analysis import analyze_single_orn\n\n# Load network\nnetwork = CrossTalkNetwork.from_csv('interglomerular_crosstalk_pathways.csv')\nnetwork.set_min_synapse_threshold(10)\n\n# Analyze DL5 glomerulus\nresults = analyze_single_orn(network, 'ORN_DL5', by_glomerulus=True)\nprint(f\"Found {results.num_pathways} cross-talk pathways\")\n```\n\n---\n\n## 📚 Table of Contents\n\n- [Installation](#installation)\n- [Core DoOR Features](#core-door-features)\n- [Connectomics Module](#connectomics-module)\n- [FlyWire Integration](#flywire-integration)\n- [Mushroom Body Circuit Validation](#mushroom-body-circuit-validation)\n- [Pathway Analysis](#pathway-analysis)\n- [Neural Network Preprocessing](#neural-network-preprocessing)\n- [Command-Line Interface](#command-line-interface)\n- [API Reference](#api-reference)\n- [Examples](#examples)\n- [Citation](#citation)\n- [Contributing](#contributing)\n- [License](#license)\n\n---\n\n## Core DoOR 
Features\n\n### What is DoOR?\n\nThe **Database of Odorant Responses (DoOR)** is a comprehensive collection of odorant-receptor response measurements for *Drosophila melanogaster*.\n\n**Published:** Münch \u0026 Galizia (2016), *Scientific Data* 3:160122\n**Citation:** https://doi.org/10.1038/sdata.2016.122\n\n### Dataset Overview\n\n| Metric | Value |\n|--------|-------|\n| Odorants | 693 compounds |\n| Receptors | 78 ORN types (Or, Ir, Gr) |\n| Measurements | 7,381 odorant-receptor pairs |\n| Sparsity | 86% (typical for chemical screens) |\n| Response Range | [0, 1] normalized |\n\n### Extract DoOR Data\n\n```python\nfrom door_toolkit import DoORExtractor\n\n# Extract R data files to Python formats\nextractor = DoORExtractor(\n    input_dir=\"path/to/DoOR.data/data\",  # Unzipped DoOR R package\n    output_dir=\"door_cache\"\n)\nextractor.run()\n```\n\n### Use in Your Code\n\n```python\nfrom door_toolkit import DoOREncoder\n\n# Load encoder\nencoder = DoOREncoder(\"door_cache\")\n\n# Encode batch\nodors = [\"acetic acid\", \"1-pentanol\", \"ethyl acetate\"]\npn_batch = encoder.batch_encode(odors)\nprint(pn_batch.shape)  # (3, 78)\n\n# Get metadata\nstats = encoder.get_receptor_coverage(\"acetic acid\")\nprint(f\"Active receptors: {stats['n_active']}\")\n```\n\n---\n\n## Connectomics Module\n\nComprehensive tools for analyzing interglomerular cross-talk in the *Drosophila* olfactory system using FlyWire connectome data.\n\n### Key Features\n\n✅ **Network Construction**\n- NetworkX-based directed graph (108,980+ pathways)\n- Hierarchical representation: individual neurons + glomerulus meta-nodes\n- 2,828 neurons across 38 glomeruli\n- Synapse-weighted edges with configurable thresholds\n\n✅ **Four Analysis Modes**\n1. **Single ORN Focus** - All pathways from one ORN/glomerulus\n2. **ORN Pair Comparison** - Bidirectional cross-talk quantification\n3. **Full Network View** - Global topology and statistics\n4. 
**Pathway Search** - Find specific connections\n\n✅ **Statistical Analyses**\n- Hub neuron detection (degree, betweenness, closeness, eigenvector centrality)\n- Community detection (Louvain, greedy modularity, label propagation)\n- Asymmetry quantification\n- Path length distributions\n\n✅ **Biophysical Parameters**\n- Research-based parameters (Wilson, Olsen, Kazama labs)\n- Dale's law enforcement\n- Synaptic time constants for ACh and GABA\n\n### Quick Example\n\n```python\nfrom door_toolkit.connectomics import CrossTalkNetwork\nfrom door_toolkit.connectomics.pathway_analysis import analyze_single_orn, compare_orn_pair\nfrom door_toolkit.connectomics.statistics import NetworkStatistics\nfrom door_toolkit.connectomics.visualization import NetworkVisualizer\n\n# Load network\nnetwork = CrossTalkNetwork.from_csv('interglomerular_crosstalk_pathways.csv')\nnetwork.set_min_synapse_threshold(10)\n\n# Mode 1: Analyze single glomerulus\nresults = analyze_single_orn(network, 'ORN_DL5', by_glomerulus=True)\nprint(f\"Found {results.num_pathways} pathways from DL5\")\n\n# Mode 2: Compare two glomeruli\ncomparison = compare_orn_pair(network, 'ORN_DL5', 'ORN_VA1v', by_glomerulus=True)\nprint(f\"Asymmetry ratio: {comparison.get_asymmetry_ratio():.3f}\")\n\n# Mode 3: Full network analysis\nstats = NetworkStatistics(network)\nhubs = stats.detect_hub_neurons(method='betweenness', threshold_percentile=95)\ncommunities = stats.detect_communities(algorithm='louvain', level='glomerulus')\nprint(f\"Found {len(hubs)} hub neurons, {max(communities.values()) + 1} communities\")\n\n# Mode 4: Pathway search\nfrom door_toolkit.connectomics.pathway_analysis import find_pathways\npathways = find_pathways(network, 'ORN_VM7v', 'ORN_D', by_glomerulus=True)\nprint(f\"Found {pathways['num_pathways']} pathways\")\n\n# Visualization\nvisualizer = NetworkVisualizer(network)\nvisualizer.plot_full_network(output_path='network.png', min_synapse_display=50)\nvisualizer.plot_single_orn_pathways('ORN_DL5', 
output_path='DL5_pathways.png')\nvisualizer.plot_glomerulus_heatmap(output_path='heatmap.png')\n```\n\n### Biological Context\n\nThe antennal lobe processes olfactory information through:\n1. **ORNs** - Express specific odorant receptors, converge into glomeruli\n2. **Local Neurons (LNs)** - GABAergic inhibitory neurons mediating lateral inhibition\n3. **Projection Neurons (PNs)** - Cholinergic neurons to higher brain centers\n\n**Lateral inhibition** mechanisms:\n- **ORN → LN → ORN**: Lateral inhibition between glomeruli (52% of pathways, median 3 synapses)\n- **ORN → LN → PN**: Feedforward inhibition to PNs (16% of pathways)\n- **ORN → PN → feedback**: Feedback loops (20% of pathways, up to 1,018 synapses)\n\n### Key Discoveries\n\nOur analysis revealed:\n- **Hub LNs**: lLN2T_c, lLN2X04, lLN8, LN60b (prime optogenetic targets)\n- **15 functional communities** with one major 22-glomerulus cluster\n- **VM7v acts as convergence hub** receiving from multiple glomeruli\n- **Asymmetric connectivity** patterns suggesting specialized functions\n\n### ORN/Glomerulus Identifier Resolution\n\nThe connectomics module includes a **robust identifier resolution system** that automatically normalizes messy ORN/glomerulus names and maps receptor names to their glomerulus names.\n\n**Key features:**\n- **Format-agnostic**: Accepts `\"DL3\"`, `\"dl3\"`, `\"ORN_DL3\"`, `\"ORN-DL3\"`, `\"Glomerulus DL3\"` - all resolve to `\"ORN_DL3\"`\n- **Receptor-to-glomerulus mapping**: Automatically maps `\"Or7a\"` → `\"ORN_DL5\"`, `\"Ir31a\"` → `\"ORN_VL2p\"`, `\"Gr21a\"` → `\"ORN_V\"`\n- **Complete coverage**: Includes 44 receptors (33 Or, 10 Ir, 1 Gr) mapped to their FlyWire glomeruli\n- **Fuzzy matching**: Suggests alternatives when exact matches fail (ranked by similarity)\n- **Clear errors**: Provides actionable error messages with top 10 suggestions\n\nIn FlyWire, neurons are labeled by glomerulus name (e.g., `ORN_VL2p; Ir31a`), not receptor name. 
The resolver automatically handles this translation so you can use familiar receptor names like `\"Ir31a\"` or `\"Or7a\"` in your code. The system uses normalization (case-insensitive, separator-agnostic) combined with receptor mapping and fuzzy matching to prevent \"non-matching ORN name\" errors. All pathway analysis functions (`analyze_single_orn`, `compare_orn_pair`, `find_pathways`) accept both receptor names and glomerulus names. See [`examples/connectomics/example_orn_identifier_resolution.py`](examples/connectomics/example_orn_identifier_resolution.py) for a complete demonstration.\n\n---\n\n## FlyWire Integration\n\nMap DoOR receptor data to FlyWire neural connectivity and community labels.\n\n### Key Capabilities\n\n- Parse 100K+ FlyWire community labels efficiently\n- Map DoOR receptors to FlyWire root IDs\n- Generate 3D spatial activation maps\n- Export mappings in JSON/CSV formats\n\n#### Namespace Translation \u0026 Diagnostics\n\n- `DoORFlyWireIntegrator.get_connectivity_matrix_door_indexed()` translates FlyWire glomerulus labels (e.g., `ORN_DL5`) into DoOR receptor names (`Or7a`) so tuning and connectivity matrices share the same index before statistical analysis.\n- `scripts/analysis_1_tuning_vs_connectivity.py` now logs detailed overlap diagnostics and generates a diagnostic report if insufficient overlapping receptors are found, making namespace issues easy to detect.\n\n### Python API\n\n```python\nfrom door_toolkit.flywire import FlyWireMapper\n\n# Initialize mapper\nmapper = FlyWireMapper(\n    community_labels_path=\"processed_labels.csv.gz\",\n    door_cache_path=\"door_cache\",\n    auto_parse=True\n)\n\n# Find cells expressing specific receptor\nor42b_cells = mapper.find_receptor_cells(\"Or42b\")\nprint(f\"Found {len(or42b_cells)} Or42b neurons\")\n\n# Map all receptors\nmappings = mapper.map_door_to_flywire()\nprint(f\"Mapped {len(mappings)} receptors\")\n\n# Create spatial activation map\nspatial_map = 
mapper.create_spatial_activation_map(\"ethyl butyrate\")\nprint(f\"Active at {spatial_map.total_cells} locations\")\n\n# Export mappings\nmapper.export_mapping(\"flywire_mapping.json\", format=\"json\")\n```\n\n### CLI Usage\n\n```bash\n# Map receptors to FlyWire\ndoor-flywire --labels processed_labels.csv.gz --cache door_cache --map-receptors\n\n# Find specific receptor\ndoor-flywire --labels processed_labels.csv.gz --find-receptor Or42b\n\n# Create spatial map\ndoor-flywire --labels processed_labels.csv.gz --cache door_cache \\\n  --spatial-map \"ethyl butyrate\" --output spatial_map.json\n```\n\n---\n\n## Mushroom Body Circuit Validation\n\n**NEW!** Validate LASSO-identified receptors using complete FlyWire mushroom body pathways.\n\n### The Challenge\n\nYou've identified important receptors using LASSO regression on behavioral data. But **do these receptors actually connect to the learning circuit?**\n\nThis module answers: *\"Are my receptors anatomically positioned in the mushroom body (MB), and which should I test first?\"*\n\n### Complete Workflow\n\n```\nLASSO Behavioral Prediction → FlyWire Pathway Tracing → Priority Matrix → Optogenetics\n         ↓                              ↓                      ↓                ↓\n   Or67c (weight=0.126)      23 ORNs → 6 PNs → 341 KCs    Final Score: 0.920   TEST FIRST!\n                                        56.7% γ lobe        Circuit: Aversive\n```\n\n### Key Features\n\n✅ **Complete Pathway Tracing**\n- Trace: **ORN → PN → KC → MBON**\n- Synapse-level connectivity (5.3M connections)\n- Cell type classification (137K neurons)\n- Mushroom body compartments (α/β, γ, α'β' lobes)\n\n✅ **Circuit Validation Metrics**\n- **ORN→PN Strength**: % of ORN output reaching PNs (commitment to learning pathway)\n- **KC Coverage**: % of Kenyon Cells contacted (breadth of MB access)\n- **Lobe Specialization**: α/β (appetitive) vs γ (aversive) fraction\n- **Circuit Score**: Composite 0-1 score for \"in learning circuit\"\n\n✅ 
**Integration with Behavioral Data**\n- Load LASSO regression results\n- Combine behavioral importance + anatomical validation\n- Generate experimental priority matrix\n- Export publication-ready figures\n\n✅ **Sensillum Mapping**\n- Automatic mapping: ab2B → Or85a, ab3A → Or22a, ab1A → Or42b\n- Translates sensillum labels to specific Or receptors\n\n### Python API\n\n```python\nfrom door_toolkit.flywire import FlyWireMapper\nfrom door_toolkit.flywire.mushroom_body_tracer import MushroomBodyTracer\n\n# Step 1: Map receptors to FlyWire ORN neurons\nmapper = FlyWireMapper(\"processed_labels.csv.gz\", auto_parse=True)\nor67c_cells = mapper.find_receptor_cells(\"Or67c\")\nprint(f\"Found {len(or67c_cells)} Or67c ORNs\")\n\n# Step 2: Initialize mushroom body tracer\ntracer = MushroomBodyTracer(\n    synapse_path=\"connections_princeton.csv.gz\",\n    cell_types_path=\"consolidated_cell_types.csv.gz\"\n)\n\n# Step 3: Trace complete pathway (ORN → PN → KC → MBON)\npathway = tracer.trace_receptor_pathway(\n    receptor_name=\"Or67c\",\n    orn_ids=[cell[\"root_id\"] for cell in or67c_cells]\n)\n\nprint(f\"Pathway Summary:\")\nprint(f\"  ORNs: {pathway.n_orns}\")\nprint(f\"  PNs: {len(pathway.unique_pns)}\")\nprint(f\"  KCs: {len(pathway.unique_kcs)}\")\nprint(f\"  Synapses (ORN→PN): {pathway.total_orn_to_pn_synapses}\")\nprint(f\"  Synapses (PN→KC): {pathway.total_pn_to_kc_synapses}\")\nprint(f\"  KC compartments: {pathway.kc_compartments}\")\n\n# Step 4: Calculate connectivity metrics\nmetrics = tracer.calculate_connectivity_metrics(pathway)\nprint(f\"\\nConnectivity Metrics:\")\nprint(f\"  ORN→PN strength: {metrics.orn_to_pn_strength:.2%}\")\nprint(f\"  KC coverage: {metrics.kc_coverage:.2%}\")\nprint(f\"  α/β lobe (appetitive): {metrics.alpha_beta_fraction:.2%}\")\nprint(f\"  γ lobe (aversive): {metrics.gamma_fraction:.2%}\")\nprint(f\"  Circuit score: {metrics.circuit_score:.3f}\")\nprint(f\"  Circuit type: {metrics.to_dict()['circuit_type']}\")\n\n# Step 5: Export 
results\ntracer.export_pathway_csv([pathway], \"pathway_summary.csv\")\ntracer.export_metrics_csv([metrics], \"connectivity_metrics.csv\")\n```\n\n### Complete Analysis Pipeline\n\nRun the complete workflow from LASSO results to experimental priorities:\n\n```bash\n# Full pipeline: examples/advanced/flywire_mb_pathway_analysis.py\npython examples/advanced/flywire_mb_pathway_analysis.py\n```\n\n**Output:**\n```\nTop 3 High-Priority Receptors:\n1. Or67c  - Final Score: 0.920  (AVERSIVE, γ lobe)   → TEST FIRST ⭐⭐⭐\n2. Or22b  - Final Score: 0.686  (APPETITIVE, α/β)   → TEST SECOND ⭐⭐\n3. Or85a  - Final Score: 0.658  (APPETITIVE, α/β)   → TEST SECOND ⭐⭐\n\nFiles generated:\n  ✓ final_priority_matrix.csv       - Ranked receptors with all metrics\n  ✓ flywire_pathway_summaries.csv   - ORN→PN→KC pathway stats\n  ✓ flywire_connectivity_metrics.csv - Circuit validation scores\n  ✓ priority_scatter.png             - LASSO vs Connectivity plot\n  ✓ priority_bar.png                 - Priority ranking visualization\n```\n\n### Example Results\n\n**Or67c (Top Candidate)**:\n```\nLASSO Weight: 0.126 (HIGHEST)\nPathway: 23 ORNs → 6 PNs → 341 KCs\nCircuit: 56.7% γ lobe (AVERSIVE learning)\nFinal Score: 0.920\nRecommendation: TEST FIRST - Silencing will impair learned aversive responses\n```\n\n**Or85a (ab2B sensillum)**:\n```\nLASSO Weight: 0.067 (3rd highest)\nPathway: 42 ORNs → 5 PNs → 391 KCs\nCircuit: 55.6% α/β lobe (APPETITIVE learning)\nORN→PN Strength: 84.2% (HIGHEST commitment!)\nFinal Score: 0.658\nRecommendation: TEST SECOND - Strong appetitive circuit\n```\n\n### Biological Interpretation\n\n**Circuit Types:**\n- **Appetitive (α/β lobe)**: Reward/feeding learning (Or22b, Or85a, Or42b)\n- **Aversive (γ lobe)**: Avoidance/punishment learning (Or67c, Or49a)\n\n**Connectivity Metrics:**\n- **High ORN→PN strength** (\u003e70%): Strong commitment to learning pathway\n- **High KC coverage** (\u003e20%): Broad access to memory encoding\n- **Lobe specialization** (\u003e50%): 
Clear circuit type assignment\n- **Circuit score** (\u003e0.80): High confidence in MB circuit membership\n\n### Integration with LASSO\n\n```python\nfrom door_toolkit.pathways import LassoBehavioralPredictor\n\n# Step 1: Run LASSO to identify important receptors\npredictor = LassoBehavioralPredictor(\n    doorcache_path=\"door_cache\",\n    behavior_csv_path=\"reaction_rates_summary.csv\"\n)\n\n# Fit models for different optogenetic conditions\nresults_hex = predictor.fit_behavior(\"opto_hex\")\nresults_eb = predictor.fit_behavior(\"opto_EB\")\nresults_benz = predictor.fit_behavior(\"opto_benz_1\")\n\nprint(f\"Or22b LASSO weight (hexanol): {results_hex.lasso_weights.get('Or22b', 0):.4f}\")\nprint(f\"Or67c LASSO weight (EB): {results_eb.lasso_weights.get('Or67c', 0):.4f}\")\nprint(f\"Or85a LASSO weight (benz): {results_benz.lasso_weights.get('Or85a', 0):.4f}\")\n\n# Step 2: Validate with FlyWire (see above)\n# ...\n\n# Step 3: Generate final priority matrix\n# Combines: 60% behavioral importance + 40% circuit connectivity\n```\n\n### CLI Usage\n\n```bash\n# Run complete mushroom body analysis\npython examples/advanced/flywire_mb_pathway_analysis.py\n\n# Output: flywire_mb_analysis/\n#   ├── final_priority_matrix.csv       # Experimental priorities\n#   ├── flywire_pathway_summaries.csv   # Pathway statistics\n#   ├── flywire_connectivity_metrics.csv # Circuit validation\n#   ├── priority_scatter.png            # Visualization\n#   ├── priority_bar.png                # Rankings\n#   └── UPDATED_SUMMARY.md              # Complete report\n```\n\n### Real-World Example\n\n**Research Question**: \"Which receptors are critical for learned olfactory behavior?\"\n\n**Workflow**:\n1. ✅ **LASSO identifies** Or67c, Or22b, Or85a as important (sparse circuit)\n2. ✅ **FlyWire validates** all 3 reach mushroom body via PN→KC pathways\n3. 
✅ **Circuit analysis** reveals:\n   - Or67c: 56.7% γ lobe → aversive learning\n   - Or22b: 69.5% α/β lobe → appetitive learning\n   - Or85a: 55.6% α/β lobe → appetitive learning\n4. ✅ **Priority matrix** ranks Or67c #1 (score: 0.920)\n5. ✅ **Optogenetic validation** confirms Or67c silencing impairs learning\n\n**Result**: Anatomically validated, prioritized receptor list for experiments! 🎯\n\n---\n\n## Pathway Analysis\n\nQuantitative analysis of olfactory pathways and experiment protocol generation.\n\n### Key Capabilities\n\n- Trace known pathways (Or47b→feeding, Or42b, Or92a→avoidance)\n- Custom pathway analysis\n- Shapley importance computation\n- PGCN experiment protocol generation\n- Behavioral prediction\n\n### Python API\n\n```python\nfrom door_toolkit.pathways import PathwayAnalyzer, BlockingExperimentGenerator, BehavioralPredictor\n\n# Pathway analysis\nanalyzer = PathwayAnalyzer(\"door_cache\")\n\n# Trace Or47b feeding pathway\npathway = analyzer.trace_or47b_feeding_pathway()\nprint(f\"Pathway strength: {pathway.strength:.3f}\")\nprint(f\"Top receptors: {pathway.get_top_receptors(5)}\")\n\n# Custom pathway\ncustom = analyzer.trace_custom_pathway(\n    receptors=[\"Or92a\"],\n    odorants=[\"geosmin\"],\n    behavior=\"avoidance\"\n)\n\n# Shapley importance\nimportance = analyzer.compute_shapley_importance(\"feeding\")\ntop_receptors = sorted(importance.items(), key=lambda x: -x[1])[:10]\n\n# Generate experiment protocol\ngenerator = BlockingExperimentGenerator(\"door_cache\")\nprotocol = generator.generate_experiment_1_protocol()  # Single-unit veto\nprotocol.export_json(\"experiment_protocol.json\")\n\n# Behavioral prediction (heuristic)\npredictor = BehavioralPredictor(\"door_cache\")\nprediction = predictor.predict_behavior(\"hexanol\")\nprint(f\"Valence: {prediction.predicted_valence}\")\nprint(f\"Confidence: {prediction.confidence:.2%}\")\n\n# LASSO behavioral prediction (data-driven)\nfrom door_toolkit.pathways import 
LassoBehavioralPredictor\n\nlasso_predictor = LassoBehavioralPredictor(\n    doorcache_path=\"door_cache\",\n    behavior_csv_path=\"reaction_rates_summary.csv\"\n)\n\n# Fit model for optogenetic condition\nresults = lasso_predictor.fit_behavior(\"opto_hex\")\nprint(f\"R² = {results.cv_r2_score:.3f}\")\nprint(f\"Selected {results.n_receptors_selected} receptors\")\n\n# Get top predictive receptors\nfor receptor, weight in results.get_top_receptors(5):\n    print(f\"  {receptor}: {weight:.4f}\")\n\n# Generate plots\nresults.plot_predictions(save_to=\"opto_hex_predictions.png\")\nresults.plot_receptors(save_to=\"opto_hex_receptors.png\")\n\n# Export results\nresults.export_csv(\"opto_hex_results.csv\")\nresults.export_json(\"opto_hex_model.json\")\n\n# Compare multiple conditions\ncomparison = lasso_predictor.compare_conditions(\n    conditions=[\"opto_hex\", \"opto_EB\", \"opto_benz_1\"],\n    plot=True,\n    save_dir=\"comparison_results\"\n)\n```\n\n### LASSO Behavioral Prediction\n\nThe `LassoBehavioralPredictor` uses sparse regression (LASSO) to identify minimal receptor circuits that predict behavioral responses from optogenetic manipulation experiments:\n\n**Features:**\n- Automatic odorant name matching between behavioral data and DoOR\n- Cross-validated LASSO regression with automatic λ selection\n- Sparse receptor circuit identification (typically 3-10 receptors)\n- Multiple prediction modes: test odorant, trained odorant, or interaction features\n- Visualization: predicted vs actual PER, receptor importance rankings\n- Export to CSV/JSON for downstream analysis\n\n**Workflow:**\n1. Load optogenetic behavioral data (PER responses)\n2. Match odorant names to DoOR receptor profiles\n3. Fit LASSO models with cross-validation\n4. Extract sparse receptor weights\n5. 
Visualize and export results\n\n**Example dataset format** (`reaction_rates_summary.csv`):\n```\ndataset,3-Octonol,Benzaldehyde,Ethyl_Butyrate,Hexanol,Linalool\nopto_hex,0.25,0.00,0.19,0.69,0.19\nopto_EB,0.13,0.00,0.22,0.20,0.00\nopto_benz_1,0.25,0.02,0.44,0.59,0.12\n```\n\n**Biological Interpretation:**\n- Positive weights → receptors associated with higher PER\n- Negative weights → receptors associated with lower PER (potential inhibition)\n- Zero weights → receptors excluded by LASSO (not predictive)\n- Sparse circuits (3-7 receptors) suggest minimal testable hypotheses\n\n**Robustness Analysis:** Two CLI scripts assess circuit robustness. *Ablation* (`lasso_with_ablations.py`) tests necessity by zeroing out receptors and measuring MSE increase. *Focus mode* (`lasso_with_focus_mode.py`) tests sufficiency by refitting LASSO on only the top-N receptors to generate MSE vs N curves.\n\n```bash\n# Ablation: test if removing Or22b/Or49a degrades the model\npython scripts/lasso_with_ablations.py --door_cache door_cache \\\n    --behavior_csv reaction_rates.csv --condition opto_hex \\\n    --ablate Or22b Or49a --ablation_set_mode single --output_dir ablation_out\n\n# Focus: test if top 1-5 receptors are sufficient\npython scripts/lasso_with_focus_mode.py --door_cache door_cache \\\n    --behavior_csv reaction_rates.csv --condition opto_hex \\\n    --topn_list 1 2 3 5 --output_dir focus_out\n```\n\n### CLI Usage\n\n```bash\n# Trace pathways\ndoor-pathways --cache door_cache --trace or47b-feeding\n\n# Custom pathway\ndoor-pathways --cache door_cache --custom-pathway \\\n  --receptors Or92a --odorants geosmin --behavior avoidance\n\n# Shapley importance\ndoor-pathways --cache door_cache --shapley feeding --output importance.json\n\n# Generate experiment\ndoor-pathways --cache door_cache --generate-experiment 1 \\\n  --output exp1_protocol.json --format markdown\n\n# Predict behavior\ndoor-pathways --cache door_cache --predict-behavior \"ethyl butyrate\"\n```\n\n---\n\n## 
Neural Network Preprocessing\n\nPrepare DoOR data for neural network training with sparse encoding and augmentation.\n\n### Key Capabilities\n\n- Sparse KC-like encoding (5% sparsity)\n- Hill equation concentration-response modeling\n- Noise augmentation (Gaussian, Poisson, dropout)\n- PyTorch/NumPy/HDF5 export\n- PGCN-compatible dataset generation\n\n### Python API\n\n```python\nfrom door_toolkit.neural import DoORNeuralPreprocessor\n\n# Initialize preprocessor\npreprocessor = DoORNeuralPreprocessor(\n    \"door_cache\",\n    n_kc_neurons=2000,\n    random_seed=42\n)\n\n# Create sparse encoding\nsparse_data = preprocessor.create_sparse_encoding(sparsity_level=0.05)\nprint(f\"Shape: {sparse_data.shape}\")\nprint(f\"Sparsity: {(sparse_data \u003e 0).mean():.2%}\")\n\n# Generate augmented dataset\naug_orn, aug_kc, labels = preprocessor.generate_noise_augmented_responses(\n    n_augmentations=5,\n    noise_level=0.1\n)\n\n# Export PGCN dataset\npreprocessor.export_pgcn_dataset(\n    output_dir=\"pgcn_dataset\",\n    format=\"pytorch\",  # or \"numpy\", \"h5\"\n    include_sparse=True\n)\n\n# Train/val split\ntrain, val = preprocessor.create_training_validation_split(train_fraction=0.8)\n```\n\n### Concentration-Response Modeling\n\n```python\nimport numpy as np\n\nfrom door_toolkit.neural.concentration_models import ConcentrationResponseModel\n\nmodel = ConcentrationResponseModel()\n\n# Fit Hill equation\nconcentrations = np.array([0.001, 0.01, 0.1, 1.0])\nresponses = np.array([0.1, 0.3, 0.7, 0.9])\nparams = model.fit_hill_equation(concentrations, responses)\n\nprint(f\"EC50: {params.ec50:.3f}\")\nprint(f\"Hill coefficient: {params.hill_coefficient:.3f}\")\n\n# Generate concentration series\nconc, resp = model.generate_concentration_series(params, n_points=50)\n\n# Model odor mixtures\n# params1 and params2 are Hill fits for the two mixture components,\n# each obtained with fit_hill_equation as shown above\nmixture_responses = model.model_mixture_interactions(\n    [params1, params2],\n    concentrations,\n    interaction_type=\"additive\"\n)\n```\n\n### CLI Usage\n\n```bash\n# Sparse encoding\ndoor-neural 
--cache door_cache --sparse-encode --sparsity 0.05 \\\n  --output sparse_data.npy\n\n# Augment dataset\ndoor-neural --cache door_cache --augment --n-augmentations 5 \\\n  --output-dir augmented_data/\n\n# Export PGCN dataset\ndoor-neural --cache door_cache --export-pgcn \\\n  --output-dir pgcn_dataset/ --format pytorch\n\n# Dataset statistics\ndoor-neural --cache door_cache --stats\n```\n\n---\n\n## Command-Line Interface\n\n### Core Commands\n\n```bash\n# Extract DoOR data\ndoor-extract --input DoOR.data/data --output door_cache\n\n# Validate cache contents\ndoor-extract --validate door_cache\n\n# List odorants (optional substring filter)\ndoor-extract --list-odorants door_cache --pattern acetate\n\n# Encode an odorant and show receptor responses\ndoor-extract --cache door_cache --odor \"ethyl butyrate\" --coverage\n\n# Compare multiple odorants\ndoor-extract --cache door_cache --odors \"ethyl butyrate\" \"acetic acid\" \\\n  --top 15 --coverage --save reports/odor-comparison\n\n# Inspect receptor response profiles\ndoor-extract --cache door_cache --receptor Or42b --top 25\n```\n\n### Feature-Specific Commands\n\n```bash\n# FlyWire integration\ndoor-flywire --labels processed_labels.csv.gz --cache door_cache --map-receptors\n\n# Pathway analysis\ndoor-pathways --cache door_cache --trace or47b-feeding\n\n# Neural preprocessing\ndoor-neural --cache door_cache --sparse-encode --sparsity 0.05 --output sparse_data.npy\n```\n\nAdd `--debug` to any command for detailed tracebacks and logging.\n\n**Receptor group shortcuts:**\n- `or` – Odorant receptors (OrXX)\n- `ir` – Ionotropic receptors (IrXX)\n- `gr` – Gustatory receptors (GrXX)\n- `neuron` – Antennal/palp neuron classes (ab*, ac*, pb*)\n\n---\n\n## API Reference\n\n### DoORExtractor\nExtract DoOR R data files to Python formats.\n\n```python\nfrom door_toolkit import DoORExtractor\n\nextractor = DoORExtractor(input_dir, 
output_dir)\nextractor.run()\nextractor.extract_response_matrix()\nextractor.extract_odor_metadata()\n```\n\n### DoOREncoder\nEncode odorant names to neural activation patterns.\n\n```python\nfrom door_toolkit import DoOREncoder\n\nencoder = DoOREncoder(cache_path, use_torch=False)\nencoder.encode(odor_name)\nencoder.batch_encode(odor_names)\nencoder.list_available_odorants(pattern)\nencoder.get_receptor_coverage(odor_name)\nencoder.get_odor_metadata(odor_name)\n```\n\n### CrossTalkNetwork\nMain class for connectomics network analysis.\n\n```python\nfrom door_toolkit.connectomics import CrossTalkNetwork\n\nnetwork = CrossTalkNetwork.from_csv(filepath, config=None)\nnetwork.set_min_synapse_threshold(threshold)\nnetwork.get_pathways_from_orn(orn_identifier, by_glomerulus=False)\nnetwork.get_pathways_between_orns(source, target, by_glomerulus=False)\nnetwork.find_shortest_paths(source, target, max_paths=10)\nnetwork.get_hub_neurons(neuron_category=None, top_n=10)\nnetwork.get_network_statistics()\nnetwork.export_to_graphml(filepath)\nnetwork.export_to_gexf(filepath)\n```\n\n### NetworkStatistics\nStatistical analysis of connectomics networks.\n\n```python\nfrom door_toolkit.connectomics.statistics import NetworkStatistics\n\nstats = NetworkStatistics(network)\nstats.detect_hub_neurons(method='degree', threshold_percentile=90.0)\nstats.detect_communities(algorithm='louvain', level='glomerulus')\nstats.calculate_asymmetry_matrix()\nstats.analyze_path_lengths(source_glomerulus=None)\nstats.generate_full_report()\n```\n\n### Analysis Functions\n\n```python\nfrom door_toolkit.connectomics.pathway_analysis import (\n    analyze_single_orn,\n    compare_orn_pair,\n    find_pathways\n)\n\n# Mode 1: Single ORN\nresults = analyze_single_orn(network, orn_identifier, by_glomerulus=True)\n\n# Mode 2: ORN pair comparison\ncomparison = compare_orn_pair(network, orn1, orn2, by_glomerulus=True)\n\n# Mode 4: Pathway search\npathways = find_pathways(network, source, target, 
by_glomerulus=False)\n```\n\n### Visualization\n\n```python\nfrom door_toolkit.connectomics.visualization import NetworkVisualizer\n\nvisualizer = NetworkVisualizer(network)\nvisualizer.plot_full_network(output_path='network.png', **kwargs)\nvisualizer.plot_single_orn_pathways(orn_identifier, output_path='pathways.png')\nvisualizer.plot_glomerulus_heatmap(output_path='heatmap.png')\n```\n\n### MushroomBodyTracer\n\n**NEW!** Trace complete pathways through mushroom body learning circuits.\n\n```python\nfrom door_toolkit.flywire.mushroom_body_tracer import MushroomBodyTracer\n\n# Initialize tracer\ntracer = MushroomBodyTracer(\n    synapse_path=\"connections_princeton.csv.gz\",\n    cell_types_path=\"consolidated_cell_types.csv.gz\",\n    min_synapse_threshold=1\n)\n\n# Trace pathway: ORN → PN → KC → MBON\npathway = tracer.trace_receptor_pathway(receptor_name, orn_ids)\n\n# Calculate connectivity metrics\nmetrics = tracer.calculate_connectivity_metrics(pathway, total_kcs_in_brain=2000)\n\n# Export results\ntracer.export_pathway_csv([pathway], \"pathway_summary.csv\")\ntracer.export_metrics_csv([metrics], \"connectivity_metrics.csv\")\n```\n\n**Key Classes:**\n- `PathwayStep`: Single synapse connection\n- `MushroomBodyPathway`: Complete ORN→PN→KC pathway\n- `ConnectivityMetrics`: Circuit validation scores\n\n**Attributes:**\n- `pathway.n_orns`: Number of ORN neurons\n- `pathway.n_pns`: Number of PN neurons contacted\n- `pathway.n_kcs`: Number of KC neurons contacted\n- `pathway.kc_compartments`: Dict of KC counts by lobe (α/β, γ, α'β')\n- `metrics.orn_to_pn_strength`: ORN→PN pathway strength (0-1)\n- `metrics.kc_coverage`: Fraction of KCs contacted (0-1)\n- `metrics.alpha_beta_fraction`: Fraction in appetitive lobe (0-1)\n- `metrics.circuit_score`: Overall connectivity score (0-1)\n\n### Mapping Accounting\n\n**IMPORTANT:** Prevents confusion between receptor counts and unique glomerulus counts in many-to-one mappings.\n\n```python\nfrom 
door_toolkit.integration.mapping_accounting import (\n    compute_mapping_stats,\n    format_mapping_summary,\n    log_mapping_stats,\n    write_mapping_stats_json\n)\n\n# Compute comprehensive mapping statistics\nmapping = {'OR82A': 'VA6', 'OR94A': 'VA6', 'OR7A': 'DL5'}  # Example with collision\nstats = compute_mapping_stats(\n    mapping,\n    note=\"Example mapping\",\n    adult_only=False  # Include larval receptors\n)\n\n# Get compact summary\nsummary = format_mapping_summary(stats)\n# \"3 receptors → 2 unique glomeruli (1 collision)\"\n\n# Check for many-to-one collapses\nif stats['collision_count'] \u003e 0:\n    print(f\"Collisions: {stats['collision_summary']}\")\n    # ['VA6: OR82A, OR94A']\n\n# Write JSON artifact for reproducibility\nwrite_mapping_stats_json(\"mapping_stats.json\", stats)\n```\n\n**Key Stats Returned:**\n- `n_receptors_mapped`: Number of receptor genes successfully mapped\n- `n_unique_glomeruli_from_mapped_receptors`: Number of distinct glomeruli (may differ!)\n- `collision_count`: Number of glomeruli with ≥2 receptors (many-to-one)\n- `collisions`: Dict of glomerulus → [receptor list] for collisions\n- `collision_summary`: Human-readable collision descriptions\n\n📚 **See:** [docs/RECEPTOR_GLOMERULUS_MAPPING_ACCOUNTING.md](docs/RECEPTOR_GLOMERULUS_MAPPING_ACCOUNTING.md) for complete documentation on preventing receptor vs glomerulus count confusion.\n\n---\n\n## Examples\n\nComplete working examples are available in the `examples/` directory:\n\n### Basic DoOR Examples\n- `examples/basic/encode_odorants.py` - Encode odorants to PN activations\n- `examples/basic/search_odorants.py` - Search and filter odorants\n- `examples/basic/receptor_analysis.py` - Analyze receptor responses\n\n### Connectomics Examples\n- `examples/connectomics/example_1_single_orn_analysis.py` - Mode 1: Single ORN focus\n- `examples/connectomics/example_2_orn_pair_comparison.py` - Mode 2: ORN pair comparison\n- 
`examples/connectomics/example_3_full_network_analysis.py` - Mode 3: Full network view\n- `examples/connectomics/example_4_pathway_search.py` - Mode 4: Pathway search\n- `examples/connectomics/example_orn_identifier_resolution.py` - Robust identifier resolution demo\n- `examples/connectomics/analyze_data_characteristics.py` - Data quality analysis\n\n### Advanced Examples\n- `examples/advanced/flywire_integration_example.py` - FlyWire mapping\n- `examples/advanced/flywire_mb_pathway_analysis.py` - **NEW!** Mushroom body circuit validation\n- `examples/advanced/pathway_analysis_example.py` - Pathway tracing\n- `examples/advanced/neural_preprocessing_example.py` - Neural network prep\n- `examples/lasso_behavioral_prediction_demo.py` - LASSO regression for behavioral prediction\n\n### Running Examples\n\n```bash\n# Extract DoOR data first\ndoor-extract --input DoOR.data/data --output door_cache\n\n# Run examples\npython examples/basic/encode_odorants.py\npython examples/connectomics/example_1_single_orn_analysis.py\npython examples/advanced/flywire_integration_example.py\n\n# NEW: Mushroom body circuit validation\npython examples/advanced/flywire_mb_pathway_analysis.py\n```\n\n### Complete Workflow Example\n\n**From LASSO to Optogenetics**:\n\n```bash\n# 1. Run LASSO behavioral prediction\npython examples/lasso_behavioral_prediction_demo.py\n\n# 2. Validate receptors with FlyWire mushroom body analysis\npython examples/advanced/flywire_mb_pathway_analysis.py\n\n# Output:\n#   behavioral_prediction_results/\n#     ├── opto_hex_results.csv        # LASSO identified receptors\n#     └── opto_hex_predictions.png\n#\n#   flywire_mb_analysis/\n#     ├── final_priority_matrix.csv   # Experimental priorities\n#     ├── priority_scatter.png\n#     └── UPDATED_SUMMARY.md          # Complete analysis report\n\n# 3. 
Use priority matrix to design optogenetic experiments!\n```\n\n---\n\n## Requirements\n\n### Core Dependencies\n- Python ≥ 3.8\n- pandas ≥ 1.5.0\n- numpy ≥ 1.21.0\n- pyarrow ≥ 12.0.0\n- networkx ≥ 2.8\n- matplotlib ≥ 3.5.0\n- scipy ≥ 1.9.0\n\n### Optional Dependencies\n- **pyreadr ≥ 0.4.7** - Required for DoORExtractor\n- **torch ≥ 2.0.0** - For PyTorch integration\n- **seaborn ≥ 0.11.0** - For heatmaps\n- **python-louvain ≥ 0.16** - For Louvain community detection\n- **plotly ≥ 5.11.0** - For interactive visualizations\n- **h5py ≥ 3.7.0** - For HDF5 export\n\n---\n\n## Installation from Source\n\n```bash\n# Clone repository\ngit clone https://github.com/colehanan1/door-python-toolkit.git\ncd door-python-toolkit\n\n# Create virtual environment\npython -m venv .venv\nsource .venv/bin/activate  # On Windows: .venv\\Scripts\\activate\n\n# Install development dependencies\nmake install-dev\n\n# Extract DoOR data\nmake extract INPUT=path/to/DoOR.data/data OUTPUT=door_cache\n\n# Run tests\nmake test\n\n# Lint and format\nmake lint\nmake format\n```\n\n---\n\n## Data Sources\n\n### DoOR Database\nThis toolkit extracts data from the original DoOR R packages:\n- **DoOR.data** - https://github.com/ropensci/DoOR.data\n- **DoOR.functions** - https://github.com/ropensci/DoOR.functions\n\nDownload DoOR data:\n```bash\nwget https://github.com/ropensci/DoOR.data/archive/refs/tags/v2.0.0.zip\nunzip v2.0.0.zip\ndoor-extract --input DoOR.data-2.0.0/data --output door_cache\n```\n\n### FlyWire Connectome\nFlyWire connectome data is available from:\n- **FlyWire** - https://flywire.ai/\n- **Community labels** - Available through CAVE API\n\n---\n\n## Performance\n\n- **DoOR extraction**: Full dataset in \u003c10 seconds\n- **FlyWire parsing**: 100K+ labels in \u003c30 seconds\n- **Network construction**: 108,980 pathways loaded in \u003c5 seconds\n- **Receptor mapping**: \u003e80% success rate\n- **Sparse encoding**: Maintains 5±1% sparsity\n- **Memory usage**: \u003c2GB for largest 
datasets\n\n---\n\n## Testing\n\nRun the comprehensive test suite:\n\n```bash\n# Install dev dependencies\npip install -e .[dev]\n\n# Run tests\npytest tests/ -v\n\n# With coverage\npytest tests/ --cov=door_toolkit --cov-report=html\n\n# Specific test modules\npytest tests/test_connectomics.py -v\npytest tests/test_encoder.py -v\n```\n\n---\n\n## Receptor Mapping References\n\n1. **Couto, A., et al. (2005)** \"Molecular, Anatomical, and Functional Organization of the Drosophila Olfactory System.\" *Current Biology* 15(17): 1535-1547. DOI: 10.1016/j.cub.2005.07.034\n2. **Hallem, E. A. \u0026 Carlson, J. R. (2006)** \"Coding of Odors by a Receptor Repertoire.\" *Cell* 125(1): 143-160. DOI: 10.1016/j.cell.2006.01.050\n3. **Silbering, A. F., et al. (2011)** \"Complementary Function and Integrated Wiring of the Evolutionarily Distinct Drosophila Olfactory Subsystems.\" *Journal of Neuroscience* 31(38): 13357-13375. DOI: 10.1523/JNEUROSCI.2360-11.2011\n4. **Fishilevich, E. \u0026 Vosshall, L. B. (2005)** \"Genetic and Functional Subdivision of the Drosophila Antennal Lobe.\" *Current Biology* 15(17): 1548-1553. DOI: 10.1016/j.cub.2005.07.066\n5. **Benton, R., et al. (2009)** \"Variant Ionotropic Glutamate Receptors as Chemosensory Receptors in Drosophila.\" *Cell* 136(1): 149-162. 
DOI: 10.1016/j.cell.2008.12.001\n\n## Citation\n\nIf you use this toolkit in your research, please cite:\n\n### This Toolkit\n```bibtex\n@software{door_python_toolkit,\n  author = {Hanan, Cole and Contributors},\n  title = {DoOR Python Toolkit: Comprehensive Tools for Drosophila Olfactory Research},\n  year = {2025},\n  version = {1.0.0},\n  url = {https://github.com/colehanan1/door-python-toolkit},\n  note = {Production-ready toolkit with mushroom body circuit validation and LASSO behavioral prediction}\n}\n```\n\n### Original DoOR Database\n```bibtex\n@article{muench2016door,\n  title={DoOR 2.0--Comprehensive Mapping of Drosophila melanogaster Odorant Responses},\n  author={M{\\\"u}nch, Daniel and Galizia, C Giovanni},\n  journal={Scientific Reports},\n  volume={6},\n  pages={21841},\n  year={2016},\n  publisher={Nature Publishing Group}\n}\n```\n\n### FlyWire Consortium\n```bibtex\n@article{flywire2024,\n  title={FlyWire: online community for whole-brain connectomics},\n  author={FlyWire Consortium and Others},\n  journal={Nature},\n  year={2024}\n}\n```\n\n### Relevant Publications\n- Wilson \u0026 Laurent (2005). Role of GABAergic inhibition in shaping odor-evoked spatiotemporal patterns in the Drosophila antennal lobe. *Journal of Neuroscience*.\n- Olsen \u0026 Wilson (2008). Lateral presynaptic inhibition mediates gain control in an olfactory circuit. *Nature*.\n- Kazama \u0026 Wilson (2009). Origins of correlated activity in an olfactory circuit. *Nature Neuroscience*.\n\n---\n\n## Contributing\n\nContributions welcome! Please:\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Commit changes (`git commit -m 'Add amazing feature'`)\n4. Push to branch (`git push origin feature/amazing-feature`)\n5. 
Open a Pull Request\n\n**Development setup:**\n```bash\ngit clone https://github.com/colehanan1/door-python-toolkit.git\ncd door-python-toolkit\npython -m venv .venv\nsource .venv/bin/activate\nmake install-dev\nmake test\n```\n\n**Code Style:**\n- Follow PEP 8\n- Use Black for formatting (`make format`)\n- Add type hints\n- Write docstrings for public APIs\n- Add tests for new features\n\n---\n\n## Troubleshooting\n\n### DoOR Issues\n\n**\"Odorant not found\"**\n→ Use `encoder.list_available_odorants()` to see exact names (case-insensitive)\n\n**\"Cache not found\"**\n→ Run `DoORExtractor` first to extract R data files\n\n**\"High sparsity\"**\n→ Normal for DoOR (86%). Use `fillna(0.0)` or filter to well-covered receptors\n\n**PyTorch not available**\n→ Install with `pip install door-python-toolkit[torch]`\n\n### Connectomics Issues\n\n**`FileNotFoundError: interglomerular_crosstalk_pathways.csv`**\n→ Ensure the data files are in the correct location, or provide the full path\n\n**`MemoryError` when loading large files**\n→ Increase synapse threshold to reduce network size:\n```python\nnetwork.set_min_synapse_threshold(20)  # Only strong connections\n```\n\n**Visualization is cluttered**\n→ Filter by synapse strength:\n```python\nvisualizer.plot_full_network(min_synapse_display=50, show_individual_neurons=False)\n```\n\n**Community detection fails**\n→ Install python-louvain: `pip install python-louvain`\n\n**Heatmap not showing**\n→ Install seaborn: `pip install seaborn`\n\n**Qt/matplotlib crash**\n→ The module uses the non-interactive 'Agg' backend by default. If issues persist, check your matplotlib configuration.\n\n---\n\n## Acknowledgments\n\n- **DoOR database creators**: Daniel Münch \u0026 C. 
Giovanni Galizia\n- **Original R package**: rOpenSci DoOR project\n- **FlyWire Consortium**: For comprehensive connectome data\n- **Contributors**: Cole Hanan and the *Drosophila* neuroscience community\n- **Raman Lab**: WashU neuroscience research\n\n---\n\n## License\n\nMIT License - see [LICENSE](LICENSE) file for details.\n\n---\n\n## Links\n\n- **PyPI:** https://pypi.org/project/door-python-toolkit/\n- **GitHub:** https://github.com/colehanan1/door-python-toolkit\n- **Documentation:** https://door-python-toolkit.readthedocs.io\n- **Issues:** https://github.com/colehanan1/door-python-toolkit/issues\n- **Original DoOR:** https://github.com/ropensci/DoOR.data\n- **FlyWire:** https://flywire.ai/\n- **Raman Lab:** https://ramanlab.wustl.edu/\n\n---\n\n**Made with ❤️ for the *Drosophila* neuroscience community**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcolehanan1%2Fdoor-python-toolkit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcolehanan1%2Fdoor-python-toolkit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcolehanan1%2Fdoor-python-toolkit/lists"}