{"id":25990074,"url":"https://github.com/jwalsh/syntree-generator","last_synced_at":"2025-03-05T13:24:26.335Z","repository":{"id":280706342,"uuid":"942893437","full_name":"jwalsh/syntree-generator","owner":"jwalsh","description":"A tool for converting French literary text into S-expression syntax trees for linguistic analysis, with visualization capabilities","archived":false,"fork":false,"pushed_at":"2025-03-04T22:10:22.000Z","size":132,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-04T22:23:13.495Z","etag":null,"topics":["abstract-syntax-tree","constituency-parsing","emacs","french","linguistics","literary-analysis","nlp","org-mode","parser","proust","python","s-expression","spacy","syntax-analysis","syntax-tree"],"latest_commit_sha":null,"homepage":"https://wal.sh","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jwalsh.png","metadata":{"files":{"readme":"README.org","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-04T21:06:12.000Z","updated_at":"2025-03-04T22:10:26.000Z","dependencies_parsed_at":"2025-03-04T22:23:17.609Z","dependency_job_id":"ae49d190-7cab-48c0-bc41-2de4cf2b8d34","html_url":"https://github.com/jwalsh/syntree-generator","commit_stats":null,"previous_names":["jwalsh/syntree-generator"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jwalsh%2Fsyntree-generator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jwalsh%2Fsyntree-generator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jwalsh%2Fsyntree-generator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jwalsh%2Fsyntree-generator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jwalsh","download_url":"https://codeload.github.com/jwalsh/syntree-generator/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242032652,"owners_count":20060831,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["abstract-syntax-tree","constituency-parsing","emacs","french","linguistics","literary-analysis","nlp","org-mode","parser","proust","python","s-expression","spacy","syntax-analysis","syntax-tree"],"created_at":"2025-03-05T13:24:25.903Z","updated_at":"2025-03-05T13:24:26.325Z","avatar_url":"https://github.com/jwalsh.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"#+TITLE: syntree-generator\n#+AUTHOR: Jason Walsh\n#+EMAIL: j@wal.sh\n\n* Syntree Generator\n\nA tool for converting French literary text into S-expression syntax trees for linguistic analysis, with visualization capabilities.\n\n[[./static/screenshots/syntax-tree-ui.png]]\n\n** Overview\n\nSyntree Generator converts natural language text into structured Abstract Syntax Trees (ASTs) represented in S-expression format. It's particularly optimized for analyzing French literary texts, such as Proust's works, by mapping syntactic dependencies to constituent structure.\n\nThe tool breaks down sentences into their grammatical components (nouns, verbs, phrases, clauses) and generates formal representations that can be visualized and studied. It includes a web UI for interactive exploration of the syntax trees and supports extracting samples for use with external visualization tools.\n\n** Features\n\n- Converts text to constituency-based syntax trees in S-expression format\n- Optimized for French literary texts with special focus on complex syntactic structures\n- Web-based visualization of syntax trees\n- Integration with spaCy for linguistic analysis\n- Sample extraction for use with external tools\n- Command-line interface for batch processing\n- Configurable chunking for processing large texts\n- Emacs integration for syntax highlighting and advanced tree visualization\n\n** Installation\n\nThe project uses Poetry for dependency management.\n\n#+BEGIN_SRC bash\n# Clone the repository\ngit clone https://github.com/jwalsh/syntree-generator.git\ncd syntree-generator\n\n# Install dependencies\nmake setup\n#+END_SRC\n\n** Usage\n\n*** Basic usage\n\nTo parse a text file and generate S-expressions:\n\n#+BEGIN_SRC bash\n# Using the shell script\n./_.sh path/to/input.txt path/to/output.lisp\n\n# Or using make\nmake run INPUT_FILE=data/pg15288.txt OUTPUT_FILE=output/proust.lisp\n#+END_SRC\n\n*** Getting samples\n\nTo generate a sample of S-expressions that can be easily loaded into the S-expression Grammar Analyzer:\n\n#+BEGIN_SRC bash\nmake samples SAMPLE_SIZE=5\n#+END_SRC\n\nThis will create a file with the extension ~.sample.lisp~ containing 5 sample S-expressions.\n\n*** Visualization\n\nTo view the syntax trees in the web UI:\n\n#+BEGIN_SRC bash\nmake serve\n# Then visit http://localhost:8765 in your browser\n#+END_SRC\n\n*** Advanced Emacs Integration\n\nFor users with Emacs, the project provides enhanced syntax highlighting and visualization:\n\n#+BEGIN_SRC bash\n# Copy the provided .emacs.d/init.el or load publish.el\nemacs -l publish.el\n\n# Then open any .lisp file in the examples directory\n# Use C-c t to visualize the tree structure\n#+END_SRC\n\n** Literary Texts\n\nThe repository includes org-mode scripts to download various French literary texts for testing and analysis:\n\n#+BEGIN_SRC bash\n# Download all texts\nmake download-texts\n\n# Process specific texts\nmake run INPUT_FILE=data/pg2650.txt OUTPUT_FILE=output/swann.lisp\nmake run INPUT_FILE=data/pg6099.txt OUTPUT_FILE=output/baudelaire.lisp\n#+END_SRC\n\n** Examples\n\nThe ~examples/~ directory contains S-expression examples at various complexity levels:\n\n- Simple examples with basic subject-verb structures\n- Medium examples with prepositional phrases and modifiers\n- Complex examples with relative/subordinate clauses\n- Very complex examples typical of Proust's style\n\n** Documentation\n\nThe documentation is available in the org-mode files and can be published to HTML:\n\n#+BEGIN_SRC bash\n# Generate all documentation\nmake docs\n\n# View the documentation\nopen docs/index.html\n#+END_SRC\n\n** Development\n\n*** Running Tests\n\n#+BEGIN_SRC bash\nmake test\n#+END_SRC\n\n*** Code Formatting\n\n#+BEGIN_SRC bash\nmake format\n#+END_SRC\n\n*** Capturing Screenshots\n\n#+BEGIN_SRC bash\n# Setup shot-scraper\n./setup-shot-scraper.sh\n\n# Capture screenshots of the web UI\nmake screenshots\n#+END_SRC\n\n** License\n\nMIT License\n\nCopyright (c) 2025 Jason Walsh\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjwalsh%2Fsyntree-generator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjwalsh%2Fsyntree-generator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjwalsh%2Fsyntree-generator/lists"}