{"id":49723495,"url":"https://github.com/aminzibayi/atfc","last_synced_at":"2026-05-09T03:04:02.246Z","repository":{"id":353947965,"uuid":"1221247269","full_name":"AminZibayi/ATFC","owner":"AminZibayi","description":"Technology forecasting toolkit","archived":false,"fork":false,"pushed_at":"2026-05-04T23:29:50.000Z","size":28773,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-05T01:26:31.622Z","etag":null,"topics":["data-analysis","data-visualization","graph","technology-forecasting"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AminZibayi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-04-26T00:14:37.000Z","updated_at":"2026-05-04T23:29:53.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/AminZibayi/ATFC","commit_stats":null,"previous_names":["aminzibayi/atfc"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/AminZibayi/ATFC","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AminZibayi%2FATFC","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AminZibayi%2FATFC/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AminZibayi%2FATFC/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AminZibayi%2FATFC/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AminZibayi","download_url":"https://codeload.github.com/AminZibayi/ATFC/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AminZibayi%2FATFC/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32805514,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-08T08:22:46.396Z","status":"online","status_checked_at":"2026-05-09T02:00:06.633Z","response_time":123,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-visualization","graph","technology-forecasting"],"created_at":"2026-05-09T03:04:00.340Z","updated_at":"2026-05-09T03:04:02.236Z","avatar_url":"https://github.com/AminZibayi.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Technology Forecasting — Additive Manufacturing\n\nBibliometric analysis and technology forecasting of **Additive Manufacturing** using Web of Science publications and patent data. The pipeline covers data collection, NLP preprocessing, LDA topic modeling, trend analysis, and network visualization.\n\n## Monorepo Architecture\n\nThis project is structured as a strict **Monorepo**, seamlessly combining Python data pipelines and a TypeScript/Vite frontend visualization app.\n\n- **Data Locality:** Code and data are strictly separated. Raw data lives in `data_source/`, while all generated artifacts are cached and output to `dist/apps/\u003capp-name\u003e/`.\n\n```text\nTechnology Forecasting/\n├── apps/\n│   ├── bibliometric-pipeline/    # Python pipeline (Data extraction, graph building, visualization)\n│   └── g6-networks/              # TS/Vite frontend (Interactive G6 network visualizations)\n├── libs/\n│   └── shared-python/            # Shared Python utilities (e.g., dynamic workspace path resolution)\n├── data_source/                  # Raw and derived input datasets (not committed by default)\n├── dist/                         # Generated artifacts and build outputs (gitignored)\n│   └── apps/\n│       ├── bibliometric-pipeline/\n│       │   ├── data/             # Generated CSV, JSON, GraphML, and Excel files\n│       │   └── plots/            # Generated static plots and network HTML files\n│       └── g6-networks/          # Compiled Vite frontend and exported G6 JSON data\n├── package.json                  # Root Node.js manifest and Nx plugins\n├── pnpm-workspace.yaml           # pnpm workspace definition\n└── nx.json                       # Nx configuration and caching rules\n```\n\n## Running the Pipeline\n\nAll tasks must be run through Nx to ensure proper caching and dependency resolution. Do not run `uv` or `pnpm` directly inside the app directories.\n\n### Setup\n\nInstall all dependencies (Node and Python) from the workspace root:\n\n```bash\npnpm install\n```\n\n### Full Pipeline Execution\n\nRun the entire pipeline (Extract → Build Networks → Visualize → Export G6 Data → Build Vite App) in one command:\n\n```bash\npnpm nx run-many -t extract build visualize export-data build\n```\n\n### Individual Targets\n\n```bash\n# 1. Extract raw WoS data into canonicalized CSVs\npnpm nx run bibliometric-pipeline:extract\n\n# 2. Build institutional, funding, and journal graphs (GraphML, Excel metrics)\npnpm nx run bibliometric-pipeline:build\n\n# 3. Generate static plots and interactive HTML networks\npnpm nx run bibliometric-pipeline:visualize\n\n# 4. Export graph data to JSON for the G6 frontend\npnpm nx run g6-networks:export-data\n\n# 5. Build the Vite frontend application\npnpm nx run g6-networks:build\n\n# 6. Serve the interactive G6 visualization locally\npnpm nx serve g6-networks\n```\n\n## Datasets\n\nAll datasets reside in `data_source/`. The project uses **two parallel corpora** — academic publications (Web of Science) and patents — processed through the same topic modeling pipeline.\n\n| File                                                | Description                                               | Rows                    | Key Columns                                                                                                                             |\n| --------------------------------------------------- | --------------------------------------------------------- | ----------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |\n| `wos_raw_bibliography.xlsx`                         | Raw WoS export                                            | 126,001                 | Authors, Article Title, Abstract, Cited References, Times Cited, Publication Year, Keywords, Affiliations, Funding Orgs, WoS Categories |\n| `wos_filtered_bibliography.xlsx`                    | Filtered WoS subset (post-2000, with abstracts)           | 93,937                  | Same 73 columns as raw                                                                                                                  |\n| `patents_with_dominant_topic.xlsx`                  | Patent abstracts with NLP columns + LDA topic assignments | 30,281                  | abstract_text, year, clean_text, tokens, bigrams, lemmatized, stemmed, Dominant_Topic, Contribution %                                   |\n| `publication_stemmed_tokens_for_lda.json`           | Stemmed token lists (LDA input for publications)          | 68,867 docs             | List of token lists                                                                                                                     |\n| `publication_lda_topic_keywords.xlsx`               | Top 25 keywords per publication topic (25 topics)         | 25 keywords × 25 topics | Topic 1…Topic 25                                                                                                                        |\n| `patent_lda_topic_keywords.xlsx`                    | Top 25 keywords per patent topic (14 topics)              | 25 keywords × 14 topics | Topic 1…Topic 14                                                                                                                        |\n| `publication_topic_document_distribution.xlsx`      | Document count \u0026 percentage per publication topic         | 25                      | Dominant Topic, Doc_Count, Total_Docs_Perc                                                                                              |\n| `patent_topic_document_distribution.xlsx`           | Document count \u0026 percentage per patent topic              | 14                      | Dominant Topic, Doc_Count, Total_Docs_Perc                                                                                              |\n| `publication_topic_proportions_by_year.xlsx`        | Publication topic proportions over time                   | 25 years                | Year, Topic 1…Topic 25                                                                                                                  |\n| `patent_topic_proportions_by_year.xlsx`             | Patent topic proportions over time                        | 54 years                | Year, Topic 1…Topic 14                                                                                                                  |\n| `patent_topic_proportions_by_year_no_year_col.xlsx` | Same as above, without Year column                        | 54 rows                 | Topic 1…Topic 14                                                                                                                        |\n| `publication_topic_mann_kendall_results.xlsx`       | Mann-Kendall trend test on 25 publication topics          | 26                      | Variable, Trend, h, p-value, Z, Tau, S, Var(S), Sen's Slope, Intercept                                                                  |\n| `patent_topic_mann_kendall_results.xlsx`            | Mann-Kendall trend test on 14 patent topics               | 15                      | Same columns as above                                                                                                                   |\n| `cross_technology_mann_kendall_trends.xlsx`         | MK trend test across 30 technologies                      | 30                      | Technology, Trend, p-value, Slope                                                                                                       |\n| `wos_category_counts.xlsx`                          | WoS category frequency distribution                       | 222                     | Category, Count                                                                                                                         |\n\n## Analysis Pipeline\n\n```\n\nWoS Search (126K) ──► Filter (94K) ──► NLP Preprocessing ──► LDA (25 topics) ──► Trend Computation ──► Mann-Kendall Test\nPatent Search (30K) ─────────────► NLP Preprocessing ──► LDA (14 topics) ──► Trend Computation ──► Mann-Kendall Test\n│\nCross-technology MK test (30 technologies)\n\n```\n\n1. **Data Collection** — WoS search for Additive Manufacturing literature (126K records); patent database search (30K patents)\n2. **Filtering** — Removed pre-2000 records, documents without abstracts, non-article types → 94K records\n3. **NLP Preprocessing** — Text cleaning, tokenization, bigram extraction, lemmatization, stemming\n4. **LDA Topic Modeling** — 25 publication topics, 14 patent topics (Gensim)\n5. **Topic Assignment** — Each document assigned a dominant topic + contribution percentage\n6. **Trend Computation** — Topic proportions calculated by year for both corpora\n7. **Statistical Testing** — Mann-Kendall trend test with Sen's Slope on all topic time series\n\n## Key Findings (from Mann-Kendall Tests)\n\n| Dimension         | Publications (WoS)      | Patents |\n| ----------------- | ----------------------- | ------- |\n| Increasing topics | 9 of 25                 | 6 of 14 |\n| Decreasing topics | 6 of 25                 | 1 of 14 |\n| No trend          | 10 of 25                | 7 of 14 |\n| Top WoS category  | Materials Science (29K) | N/A     |\n\nAll 30 cross-technology trends show \"decreasing\" patterns, consistent with post-peak Hype Cycle behavior.\n\n## Topic Themes\n\n### Publication Topics (25)\n\nMedical/surgical AM, lattice/metamaterial design, microfluidics, bioprinting/hydrogel, WAAM/laser metal, directed energy deposition, FDM/PLA, powder bed fusion/ceramic, flexible electronics, tissue engineering scaffolds, and more.\n\n### Patent Topics (14)\n\nMetal powder/sintering, FDM nozzle/extruder, medical/bone fabrication, general layer deposition, SLM/SLS laser, structural/cavity/scaffold, microfluidics, and more.\n\n## Available Analyses\n\nWith these datasets, the following analyses are supported:\n\n- **Topic evolution visualization** — Stacked area charts / streamgraphs from topic proportion time series\n- **Emerging vs. declining topic identification** — From Mann-Kendall results + Sen's Slope magnitude\n- **Science-technology linkage** — Compare publication vs. patent topic landscapes\n- **Technology life cycle modeling** — S-curve, logistic, or Bass diffusion fitting\n- **Citation analysis** — Times Cited, Cited References, Cited Reference Count available\n- **Co-authorship / collaboration networks** — From Authors, Addresses, Affiliations fields\n- **Keyword co-occurrence networks** — Author Keywords + Keywords Plus for 108K/83K records\n- **Interdisciplinary analysis** — Cross-tabulate topics with 222 WoS categories\n- **Funding landscape mapping** — Funding Orgs available for ~71K records\n- **Hype Cycle positioning** — Cross-technology MK results for 30 technologies\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faminzibayi%2Fatfc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faminzibayi%2Fatfc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faminzibayi%2Fatfc/lists"}