{"id":13815106,"url":"https://github.com/simonw/symbex","last_synced_at":"2025-04-05T17:09:07.550Z","repository":{"id":176275716,"uuid":"655378976","full_name":"simonw/symbex","owner":"simonw","description":"Find the Python code for specified symbols","archived":false,"fork":false,"pushed_at":"2023-09-05T21:15:37.000Z","size":159,"stargazers_count":231,"open_issues_count":2,"forks_count":6,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-10-18T07:53:41.686Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/simonw.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-06-18T18:01:50.000Z","updated_at":"2024-10-12T21:17:43.000Z","dependencies_parsed_at":"2024-01-15T13:35:35.877Z","dependency_job_id":null,"html_url":"https://github.com/simonw/symbex","commit_stats":null,"previous_names":["simonw/py-grep"],"tags_count":18,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonw%2Fsymbex","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonw%2Fsymbex/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonw%2Fsymbex/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonw%2Fsymbex/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/simonw","download_url":"https://codeload.github.com/simonw/symbex/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","reposit
ories_count":247369952,"owners_count":20927928,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-04T04:02:58.069Z","updated_at":"2025-04-05T17:09:07.506Z","avatar_url":"https://github.com/simonw.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Symbex\n\n[![PyPI](https://img.shields.io/pypi/v/symbex.svg)](https://pypi.org/project/symbex/)\n[![Changelog](https://img.shields.io/github/v/release/simonw/symbex?include_prereleases\u0026label=changelog)](https://github.com/simonw/symbex/releases)\n[![Tests](https://github.com/simonw/symbex/workflows/Test/badge.svg)](https://github.com/simonw/symbex/actions?query=workflow%3ATest)\n[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/symbex/blob/master/LICENSE)\n\nFind the Python code for specified symbols\n\nRead [Symbex: search Python code for functions and classes, then pipe them into a LLM](https://simonwillison.net/2023/Jun/18/symbex/) for background on this project.\n\n## Installation\n\nInstall this tool using `pip`:\n```bash\npip install symbex\n```\nOr using Homebrew:\n```bash\nbrew install simonw/llm/symbex\n```\n## Usage\n\n`symbex` can search for names of functions and classes that occur at the top level of a Python file.\n\nTo search every `.py` file in your current directory and all subdirectories, run like this:\n\n```bash\nsymbex my_function\n```\nYou can search for more than one symbol at a time:\n```bash\nsymbex my_function MyClass\n```\nWildcards are supported - to search for every `test_` 
function run this (note the single quotes to avoid the shell interpreting the `*` as a wildcard):\n```bash\nsymbex 'test_*'\n```\nTo search for methods within classes, use `class.method` notation:\n```bash\nsymbex Entry.get_absolute_url\n```\nWildcards are supported here as well:\n```bash\nsymbex 'Entry.*'\nsymbex '*.get_absolute_url'\nsymbex '*.get_*'\n```\nOr to view every method of every class:\n```bash\nsymbex '*.*'\n```\nTo search within a specific file, pass that file using the `-f` option. You can pass this more than once to search multiple files.\n\n```bash\nsymbex MyClass -f my_file.py\n```\nTo search within a specific directory and all of its subdirectories, use the `-d/--directory` option:\n```bash\nsymbex Database -d ~/projects/datasette\n```\nIf you know that you want to inspect one or more modules that can be imported by Python, you can use the `-m/--module name` option. This example shows the signatures for every symbol available in the `asyncio` package:\n```bash\nsymbex -m asyncio -s --imports\n```\nYou can search the directory containing the Python standard library using `--stdlib`. This can be useful for quickly looking up the source code for specific Python library functions:\n```bash\nsymbex --stdlib -in to_thread\n```\n`-in` is explained below. 
If you provide `--stdlib` without any `-d` or `-f` options then `--silent` will be turned on automatically, since the standard library otherwise produces a number of different warnings.\n\nThe output starts like this:\n```python\n# from asyncio.threads import to_thread\nasync def to_thread(func, /, *args, **kwargs):\n    \"\"\"Asynchronously run function *func* in a separate thread.\n    # ...\n```\nYou can exclude files in specified directories using the `-x/--exclude` option:\n```bash\nsymbex Database -d ~/projects/datasette -x ~/projects/datasette/tests\n```\nIf `symbex` encounters any Python code that it cannot parse, it will print a warning message and continue searching:\n```\n# Syntax error in path/badcode.py: expected ':' (\u003cunknown\u003e, line 1)\n```\nPass `--silent` to suppress these warnings:\n```bash\nsymbex MyClass --silent\n```\n### Filters\n\nIn addition to searching for symbols, you can apply filters to the results.\n\nThe following filters are available:\n\n- `--function` - only functions\n- `--class` - only classes\n- `--async` - only `async def` functions\n- `--unasync` - only non-async functions\n- `--documented` - functions/classes that have a docstring\n- `--undocumented` - functions/classes that do not have a docstring\n- `--public` - functions/classes that are public - don't have a `_name` prefix (or are `__*__` methods)\n- `--private` - functions/classes that are private - have a `_name` prefix and are not `__*__`\n- `--dunder` - functions matching `__*__` - this should usually be used with `*.*` to find all dunder methods\n- `--typed` - functions that have at least one type annotation\n- `--untyped` - functions that have no type annotations\n- `--partially-typed` - functions that have some type annotations but not all\n- `--fully-typed` - functions that have type annotations for every argument and the return value\n- `--no-init` - Exclude `__init__(self)` methods. 
This is useful when combined with `--fully-typed '*.*'` to avoid returning `__init__(self)` methods that would otherwise be classified as fully typed, since `__init__` doesn't need argument or return type annotations.\n\nFor example, to see the signatures of every `async def` function in your project that doesn't have any type annotations:\n\n```bash\nsymbex -s --async --untyped\n```\n\nFor class methods instead of functions, you can combine filters with a symbol search argument of `*.*`.\n\nThis example shows the full source code of every class method in the Python standard library that has type annotations for all of the arguments and the return value:\n\n```bash\nsymbex --fully-typed --no-init '*.*' --stdlib\n```\n\nTo find all public functions and methods that lack documentation, just showing the signature of each one:\n\n```bash\nsymbex '*' '*.*' --public --undocumented --signatures\n```\n\n### Example output\n\nIn a fresh checkout of [Datasette](https://github.com/simonw/datasette) I ran this command:\n\n```bash\nsymbex MessagesDebugView get_long_description\n```\nHere's the output of the command:\n```python\n# File: setup.py Line: 5\ndef get_long_description():\n    with open(\n        os.path.join(os.path.dirname(os.path.abspath(__file__)), \"README.md\"),\n        encoding=\"utf8\",\n    ) as fp:\n        return fp.read()\n\n# File: datasette/views/special.py Line: 60\nclass PatternPortfolioView(View):\n    async def get(self, request, datasette):\n        await datasette.ensure_permissions(request.actor, [\"view-instance\"])\n        return Response.html(\n            await datasette.render_template(\n                \"patterns.html\",\n                request=request,\n                view_name=\"patterns\",\n            )\n        )\n```\n### Just the signatures\n\nThe `-s/--signatures` option will list just the signatures of the functions and classes, for example:\n```bash\nsymbex -s -f symbex/lib.py\n```\n\u003c!-- [[[cog\nimport cog\nfrom 
click.testing import CliRunner\nimport pathlib\nfrom symbex.cli import cli\n\ndef sorted_chunks(text):\n    chunks = text.strip().split(\"\\n\\n\")\n    chunks.sort()\n    return \"\\n\\n\".join(chunks)\n\npath = pathlib.Path(\"symbex\").resolve()\nrunner = CliRunner()\nresult = runner.invoke(cli, [\"-s\", \"-f\", str(path / \"lib.py\")])\ncog.out(\n    \"```python\\n{}\\n```\\n\".format(sorted_chunks(result.output))\n)\n]]] --\u003e\n```python\n# File: symbex/lib.py Line: 107\ndef function_definition(function_node: AST):\n\n# File: symbex/lib.py Line: 13\ndef find_symbol_nodes(code: str, filename: str, symbols: Iterable[str]) -\u003e List[Tuple[(AST, Optional[str])]]:\n\n# File: symbex/lib.py Line: 175\ndef class_definition(class_def):\n\n# File: symbex/lib.py Line: 209\ndef annotation_definition(annotation: AST) -\u003e str:\n\n# File: symbex/lib.py Line: 227\ndef read_file(path):\n\n# File: symbex/lib.py Line: 253\nclass TypeSummary:\n\n# File: symbex/lib.py Line: 258\ndef type_summary(node: AST) -\u003e Optional[TypeSummary]:\n\n# File: symbex/lib.py Line: 304\ndef quoted_string(s):\n\n# File: symbex/lib.py Line: 315\ndef import_line_for_function(function_name: str, filepath: str, possible_root_dirs: List[str]) -\u003e str:\n\n# File: symbex/lib.py Line: 37\ndef code_for_node(code: str, node: AST, class_name: str, signatures: bool, docstrings: bool) -\u003e Tuple[(str, int)]:\n\n# File: symbex/lib.py Line: 71\ndef add_docstring(definition: str, node: AST, docstrings: bool, is_method: bool) -\u003e str:\n\n# File: symbex/lib.py Line: 82\ndef match(name: str, symbols: Iterable[str]) -\u003e bool:\n```\n\u003c!-- [[[end]]] --\u003e\nThis can be combined with other options, or you can run `symbex -s` to see every symbol in the current directory and its subdirectories.\n\nTo include estimated import paths, such as `# from symbex.lib import match`, use `--imports`. 
These will be calculated relative to the directory you specified, or you can pass one or more `--sys-path` options to request that imports are calculated relative to those directories as if they were on `sys.path`:\n\n```bash\n~/dev/symbex/symbex match --imports -s --sys-path ~/dev/symbex\n```\nExample output:\n\u003c!-- [[[cog\nresult = runner.invoke(cli, [\n    \"--imports\", \"-d\", str(path), \"match\", \"-s\", \"--sys-path\", str(path.parent)\n])\ncog.out(\n    \"```python\\n{}\\n```\\n\".format(result.stdout.strip())\n)\n]]] --\u003e\n```python\n# File: symbex/lib.py Line: 82\n# from symbex.lib import match\ndef match(name: str, symbols: Iterable[str]) -\u003e bool:\n```\n\u003c!-- [[[end]]] --\u003e\nTo suppress the `# File: ...` comments, use `--no-file` or `-n`.\n\nSo to both show import paths and suppress File comments, use `-in` as a shortcut:\n```bash\nsymbex -in match\n```\nOutput:\n\u003c!-- [[[cog\nresult = runner.invoke(cli, [\n    \"-in\", \"-d\", str(path), \"match\", \"-s\", \"--sys-path\", str(path.parent)\n])\ncog.out(\n    \"```python\\n{}\\n```\\n\".format(result.stdout.strip())\n)\n]]] --\u003e\n```python\n# from symbex.lib import match\ndef match(name: str, symbols: Iterable[str]) -\u003e bool:\n```\n\u003c!-- [[[end]]] --\u003e\n\nTo include docstrings in those signatures, use `--docstrings`:\n```bash\nsymbex match --docstrings -f symbex/lib.py\n```\nExample output:\n\u003c!-- [[[cog\nresult = runner.invoke(cli, [\"match\", \"--docstrings\", \"-f\", str(path / \"lib.py\")])\ncog.out(\n    \"```python\\n{}\\n```\\n\".format(result.stdout.strip())\n)\n]]] --\u003e\n```python\n# File: symbex/lib.py Line: 82\ndef match(name: str, symbols: Iterable[str]) -\u003e bool:\n    \"Returns True if name matches any of the symbols, resolving wildcards\"\n```\n\u003c!-- [[[end]]] --\u003e\n\n## Counting symbols\n\nIf you just want to count the number of functions and classes that match your filters, use the `--count` option. 
Here's how to count your classes:\n\n```bash\nsymbex --class --count\n```\nOr to count every async test function:\n```bash\nsymbex --async 'test_*' --count\n```\n## Structured output\n\n`symbex` defaults to outputting plain text (actually valid Python code, thanks to the way it uses comments).\n\nYou can request output in CSV, TSV, JSON or newline-delimited JSON instead, using the following options:\n\n- `--json`: a JSON array, `[{\"id\": \"...\", \"code\": \"...\"}]`\n- `--nl`: newline-delimited JSON, `{\"id\": \"...\", \"code\": \"...\"}` per line\n- `--csv`: CSV with `id,code` as the heading row\n- `--tsv`: TSV with `id\\tcode` as the heading row\n\nIn each case the ID will be the path to the file containing the symbol, followed by a colon, followed by the line number of the symbol, for example:\n\n```json\n{\n  \"id\": \"symbex/lib.py:82\",\n  \"code\": \"def match(name: str, symbols: Iterable[str]) -\u003e bool:\"\n}\n```\nIf you pass `-i/--imports` the ID will be the import line instead:\n```json\n{\n  \"id\": \"from symbex.lib import match\",\n  \"code\": \"def match(name: str, symbols: Iterable[str]) -\u003e bool:\"\n}\n```\nPass `--id-prefix 'something:'` to add the specified prefix to the start of each ID.\n\nThis example will generate a CSV file of all of your test functions, using the import style of IDs and a prefix of `test:`:\n\n```bash\nsymbex 'test_*' \\\n  --function \\\n  --imports \\\n  --id-prefix 'test:' \\\n  --csv \u003e tests.csv\n```\n\n## Using with LLM\n\nThis tool is primarily designed to be used with [LLM](https://llm.datasette.io/), a CLI tool for working with Large Language Models.\n\n`symbex` makes it easy to grab a specific class or function and pass it to the `llm` command.\n\nFor example, I ran this in the Datasette repository root:\n\n```bash\nsymbex Response | llm --system 'Explain this code, succinctly'\n```\nAnd got back this:\n\n\u003e This code defines a custom `Response` class with methods for returning HTTP responses. 
It includes methods for setting cookies, returning HTML, text, and JSON responses, and redirecting to a different URL. The `asgi_send` method sends the response to the client using the ASGI (Asynchronous Server Gateway Interface) protocol.\n\nThe structured output feature is designed to be used with [LLM embeddings](https://llm.datasette.io/en/stable/embeddings/index.html). You can generate embeddings for every symbol in your codebase using [llm embed-multi](https://llm.datasette.io/en/stable/embeddings/cli.html#llm-embed-multi) like this:\n\n```bash\nsymbex '*' '*.*' --nl | \\\n  llm embed-multi symbols - \\\n  --format nl --database embeddings.db --store\n```\nThis creates a database in `embeddings.db` containing all of your symbols along with embedding vectors.\n\nYou can then search your code like this:\n```bash\nllm similar symbols -d embeddings.db -c 'test csv' | jq\n```\n\n## Replacing a matched symbol\n\nThe `--replace` option can be used to replace a single matched symbol with content piped in to standard input.\n\nGiven a file called `my_code.py` with the following content:\n```python\ndef first_function():\n    # This will be ignored\n    pass\n\ndef second_function():\n    # This will be replaced\n    pass\n```\nRun the following:\n```bash\necho \"def second_function(a, b):\n    # This is a replacement implementation\n    return a + b + 3\n\" | symbex second_function --replace\n```\nThe result will be an updated-in-place `my_code.py` containing the following:\n\n```python\ndef first_function():\n    # This will be ignored\n    pass\n\ndef second_function(a, b):\n    # This is a replacement implementation\n    return a + b + 3\n```\nThis feature should be used with care! 
I recommend only using this feature against code that is already checked into Git, so you can review changes it makes using `git diff` and revert them using `git checkout my_code.py`.\n\n## Replacing a matched symbol by running a command\n\nThe `--rexec COMMAND` option can be used to replace a single matched symbol by running a command and using its output.\n\nThe command will be run with the matched symbol's definition piped to its standard input. The output of that command will be used as the replacement text.\n\nHere's an example that uses `sed` to add a `# ` to the beginning of each matching line, effectively commenting out the matched function:\n\n```bash\nsymbex first_function --rexec \"sed 's/^/# /'\"\n```\nThis modified the first function in place to look like this:\n```python\n# def first_function():\n#    # This will be ignored\n#    pass\n```\nA much more exciting example uses LLM. This example will use the `gpt-3.5-turbo` model to add type hints and generate a docstring:\n\n```bash\nsymbex second_function \\\n  --rexec \"llm --system 'add type hints and a docstring'\"\n```\nI ran this against this code:\n```python\ndef first_function():\n    # This will be ignored\n    pass\n\ndef second_function(a, b):\n    return a + b + 3\n```\nAnd the second function was updated in place to look like this:\n```python\ndef second_function(a: int, b: int) -\u003e int:\n    \"\"\"\n    Returns the sum of two integers (a and b) plus 3.\n\n    Parameters:\n    a (int): The first integer.\n    b (int): The second integer.\n\n    Returns:\n    int: The sum of a and b plus 3.\n    \"\"\"\n    return a + b + 3\n```\n## Using in CI\n\nThe `--check` option causes `symbex` to return a non-zero exit code if any matches are found for your query.\n\nYou can use this in CI to guard against things like public functions being added without documentation:\n\n```bash\nsymbex --function --public --undocumented --check\n```\nThis will fail silently but set a `1` exit code if there are 
any undocumented functions.\n\nUsing this as a step in a CI tool such as GitHub Actions should result in a test failure.\n\nRun this to see the exit code from the last command:\n```bash\necho $?\n```\n\n`--check` will not output anything by default. Add `--count` to output a count of matching symbols, or `-s/--signatures` to output the signatures of the matching symbols, for example:\n```bash\nsymbex --function --public --undocumented --check --count\n```\n\n## Similar tools\n\n- [pyastgrep](https://github.com/spookylukey/pyastgrep) by Luke Plant offers advanced capabilities for viewing and searching through Python ASTs using XPath.\n- [cq](https://github.com/fullstackio/cq) is a tool that lets you \"extract code snippets using CSS-like selectors\", built using [Tree-sitter](https://tree-sitter.github.io/tree-sitter/) and primarily targeting JavaScript and TypeScript.\n\n## symbex --help\n\n\u003c!-- [[[cog\nresult2 = runner.invoke(cli, [\"--help\"])\nhelp = result2.output.replace(\"Usage: cli\", \"Usage: symbex\")\ncog.out(\n    \"```\\n{}\\n```\".format(help)\n)\n]]] --\u003e\n```\nUsage: symbex [OPTIONS] [SYMBOLS]...\n\n  Find symbols in Python code and print the code for them.\n\n  Example usage:\n\n      # Search current directory and subdirectories\n      symbex my_function MyClass\n\n      # Search using a wildcard\n      symbex 'test_*'\n\n      # Find a specific class method\n      symbex 'MyClass.my_method'\n\n      # Find class methods using wildcards\n      symbex '*View.handle_*'\n\n      # Search a specific file\n      symbex MyClass -f my_file.py\n\n      # Search within a specific directory and its subdirectories\n      symbex Database -d ~/projects/datasette\n\n      # View signatures for all symbols in current directory and subdirectories\n      symbex -s\n\n      # View signatures for all test functions\n      symbex 'test_*' -s\n\n      # View signatures for all async functions with type definitions\n      symbex --async --typed -s\n\n      # 
Count the number of --async functions in the project\n      symbex --async --count\n\n      # Replace my_function with a new implementation:\n      echo \"def my_function(a, b):\n          # This is a replacement implementation\n          return a + b + 3\n      \" | symbex my_function --replace\n\n      # Replace my_function with the output of a command:\n      symbex first_function --rexec \"sed 's/^/# /'\"\n      # This uses sed to comment out the function body\n\nOptions:\n  --version                  Show the version and exit.\n  -f, --file FILE            Files to search\n  -d, --directory DIRECTORY  Directories to search\n  --stdlib                   Search the Python standard library\n  -x, --exclude DIRECTORY    Directories to exclude\n  -s, --signatures           Show just function and class signatures\n  -n, --no-file              Don't include the # File: comments in the output\n  -i, --imports              Show 'from x import y' lines for imported symbols\n  -m, --module TEXT          Modules to search within\n  --sys-path TEXT            Calculate imports relative to these on sys.path\n  --docs, --docstrings       Show function and class signatures plus docstrings\n  --count                    Show count of matching symbols\n  --silent                   Silently ignore Python files with parse errors\n  --function                 Filter functions\n  --async                    Filter async functions\n  --unasync                  Filter non-async functions\n  --class                    Filter classes\n  --documented               Filter functions with docstrings\n  --undocumented             Filter functions without docstrings\n  --public                   Filter for symbols without a _ prefix\n  --private                  Filter for symbols with a _ prefix\n  --dunder                   Filter for symbols matching __*__\n  --typed                    Filter functions with type annotations\n  --untyped                  Filter functions without type 
annotations\n  --partially-typed          Filter functions with partial type annotations\n  --fully-typed              Filter functions with full type annotations\n  --no-init                  Filter to exclude any __init__ methods\n  --check                    Exit with non-zero code if any matches found\n  --replace                  Replace matching symbol with text from stdin\n  --rexec TEXT               Replace with the result of piping to this tool\n  --csv                      Output as CSV\n  --tsv                      Output as TSV\n  --json                     Output as JSON\n  --nl                       Output as newline-delimited JSON\n  --id-prefix TEXT           Prefix to use for symbol IDs\n  --help                     Show this message and exit.\n\n```\n\u003c!-- [[[end]]] --\u003e\n\n## Development\n\nTo contribute to this tool, first checkout the code. Then create a new virtual environment:\n```bash\ncd symbex\npython -m venv venv\nsource venv/bin/activate\n```\nNow install the dependencies and test dependencies:\n```bash\npip install -e '.[test]'\n```\nTo run the tests:\n```bash\npytest\n```\n### just\n\nYou can also install [just](https://github.com/casey/just) and use it to run the tests and linters like this:\n\n```bash\njust\n```\nOr to list commands:\n```bash\njust -l\n```\n```\nAvailable recipes:\n    black         # Apply Black\n    cog           # Rebuild docs with cog\n    default       # Run tests and linters\n    lint          # Run linters\n    test *options # Run pytest with supplied options\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimonw%2Fsymbex","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsimonw%2Fsymbex","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimonw%2Fsymbex/lists"}