https://github.com/binpash/annotations
Repo containing annotations for the PaSh project
- Host: GitHub
- URL: https://github.com/binpash/annotations
- Owner: binpash
- License: mit
- Created: 2022-02-15T22:30:26.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2025-05-27T21:51:56.000Z (about 1 year ago)
- Last Synced: 2025-06-01T06:30:40.232Z (about 1 year ago)
- Language: Python
- Homepage:
- Size: 464 KB
- Stars: 2
- Watchers: 6
- Forks: 7
- Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE
# Annotations
This repository contains a framework for generating annotations for command invocations.
It comprises a parser which turns a string into a command invocation data structure.
For the time being, there are two sets of annotation generators:
- input-output information which specifies how a command invocation interacts with the files, pipes, stdin, stdout, etc.
- parallelizability information which describes how a command invocation can be parallelized - containing information about how to split inputs, mappers and aggregators, etc.
## Command-line tool
`main.py` contains a command-line tool which, given a command invocation, returns:
- the parsed command invocation data structure
- the input-output information generated
- the parallelizability information generated
## Adding an annotation
- Add the command in the dictionary in (https://github.com/binpash/annotations/blob/main/pash_annotations/annotation_generation/AnnotationGeneration.py#L13)
- Add a JSON file with the command flags in (https://github.com/binpash/annotations/tree/main/pash_annotations/parser/command_flag_option_info/data). This script can be used to generate a first version of it: (https://github.com/binpash/annotations/blob/main/pash_annotations/parser/command_flag_option_info/manpage-to-json.sh).
- Add an `InputOutputInfoGeneratorXXX.py` in (https://github.com/binpash/annotations/tree/main/pash_annotations/annotation_generation/annotation_generators)
- (Optionally) add a `ParallelizabilityInfoGeneratorXXX.py` in (https://github.com/binpash/annotations/tree/main/pash_annotations/annotation_generation/annotation_generators)
## Parser
The parser uses the `command_flag_option_info` JSON files to parse XBD-type terminal commands.
It splits the invocation on spaces (`" "`) and equals signs (`"="`).
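As an illustration of this splitting rule, here is a minimal sketch. It is not the repository's parser: the function name `split_invocation` is hypothetical, quoting and escaping are ignored, and restricting the `"="` split to tokens that start with `-` is an assumption made for this example.

```python
def split_invocation(invocation: str) -> list[str]:
    """Split an invocation on whitespace, then split `--name=value` tokens at the first "="."""
    tokens: list[str] = []
    for word in invocation.split():
        # Assumption: only dash-prefixed tokens are split at "=", so an
        # operand such as `a=b.txt` stays in one piece.
        if word.startswith("-") and "=" in word:
            name, value = word.split("=", 1)
            tokens.extend([name, value])
        else:
            tokens.append(word)
    return tokens


print(split_invocation("head --lines=10 input.txt"))
# ['head', '--lines', '10', 'input.txt']
```

With this rule, `--lines=10` and `--lines 10` produce the same token sequence, so later stages only need to handle one form.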
## Flag and Option Information
The folder `command_flag_option_info` contains `[command_name].json` files with the list of flags and options for each command.
For arguments that have two forms (e.g. `-a` and `--all`), store them as a pair in the format [short version, long version].
Where applicable, these files also record how an argument is accessed, e.g., whether it is a file.
We also have a regex-based script that can be used to generate initial JSON files with parsed arguments.
Since there is no standard for man-pages, the quality of results varies but it usually provides a good skeleton and saves quite some time.
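To make the pair format concrete, the sketch below loads a hypothetical JSON document and builds a lookup from every spelling of an argument to its long form. The key names (`command`, `flag`, `option`) and the helper `build_lookup` are assumptions for illustration only; the actual schema of the files in `command_flag_option_info` may differ, and the access information is not modelled here.

```python
import json

# Hypothetical file contents; the real schema may differ.
EXAMPLE_JSON = """
{
  "command": "head",
  "flag": [["-q", "--quiet"], ["-v", "--verbose"]],
  "option": [["-n", "--lines"], ["-c", "--bytes"]]
}
"""


def build_lookup(pairs: list[list[str]]) -> dict[str, str]:
    """Map both the short and the long spelling of each pair to the long spelling."""
    lookup: dict[str, str] = {}
    for short, long_name in pairs:
        lookup[short] = long_name
        lookup[long_name] = long_name
    return lookup


info = json.loads(EXAMPLE_JSON)
flags = build_lookup(info["flag"])
options = build_lookup(info["option"])

print(flags["-q"])         # --quiet
print(options["--lines"])  # --lines
```

Storing each argument as a [short version, long version] pair lets a parser treat `-n` and `--lines` as the same option after a single dictionary lookup.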
## Annotation Generation
Currently, annotation generators for input-output information and parallelizability information have been implemented.
Each annotation generator implements a specific generator interface (e.g., `InputOutputInfoGenerator_Interface.py`) which specializes a more general generator interface (`Generator_Interface.py`).
The general generator interface contains functions that help to check conditions on the command invocation while
the more specific generator interface provides functionality to change the respective information (object) generated.
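The layering described above can be sketched roughly as follows. Only the two interface names are taken from this README. Every class body, method name, and the `CommandInvocation` and `InputOutputInfo` stand-ins are hypothetical and simplified, so treat this as an illustration of the design rather than the repository's actual API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class CommandInvocation:
    """Hypothetical stand-in for the parsed command invocation."""
    name: str
    flags: set[str] = field(default_factory=set)
    operands: list[str] = field(default_factory=list)


@dataclass
class InputOutputInfo:
    """Hypothetical stand-in for the generated input-output information."""
    inputs: list[str] = field(default_factory=list)
    reads_stdin: bool = False


class GeneratorInterface(ABC):
    """General interface: helpers that check conditions on the invocation."""

    def __init__(self, invocation: CommandInvocation) -> None:
        self.invocation = invocation

    def has_flag(self, flag: str) -> bool:
        return flag in self.invocation.flags

    def has_operands(self) -> bool:
        return len(self.invocation.operands) > 0

    @abstractmethod
    def generate_info(self) -> None:
        ...


class InputOutputInfoGeneratorInterface(GeneratorInterface):
    """Specific interface: helpers that change the information object."""

    def __init__(self, invocation: CommandInvocation) -> None:
        super().__init__(invocation)
        self.info = InputOutputInfo()

    def set_reads_stdin(self) -> None:
        self.info.reads_stdin = True

    def add_inputs(self, paths: list[str]) -> None:
        self.info.inputs.extend(paths)


class InputOutputInfoGeneratorCat(InputOutputInfoGeneratorInterface):
    """Hypothetical generator for a simplified `cat`."""

    def generate_info(self) -> None:
        # Read the operands as input files, or fall back to stdin.
        if self.has_operands():
            self.add_inputs(self.invocation.operands)
        else:
            self.set_reads_stdin()


gen = InputOutputInfoGeneratorCat(CommandInvocation("cat", operands=["a.txt", "b.txt"]))
gen.generate_info()
print(gen.info)
```

A per-command generator then only combines condition checks from the general interface with mutators from the specific one, which keeps each command's annotation logic short.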
## Terms
- flag = takes no arguments, e.g. `--verbose`
- option = takes arguments, e.g. `-n 10`
- operand = argument with no flag, e.g. `input.txt`
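The three terms can be illustrated with a small classifier over an already-split token list. The `FLAGS` and `OPTIONS` sets and the `classify` function are hypothetical; in the repository, the equivalent knowledge would come from the per-command JSON files.

```python
from typing import Optional

# Hypothetical per-command knowledge for this example.
FLAGS = {"--verbose"}  # take no arguments
OPTIONS = {"-n"}       # take exactly one argument

Classified = tuple[str, str, Optional[str]]


def classify(tokens: list[str]) -> list[Classified]:
    """Label each token as a flag, an option (with its argument), or an operand."""
    result: list[Classified] = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in OPTIONS:
            if i + 1 >= len(tokens):
                raise ValueError(f"option {token} is missing its argument")
            # An option consumes the following token as its argument.
            result.append(("option", token, tokens[i + 1]))
            i += 2
        elif token in FLAGS:
            result.append(("flag", token, None))
            i += 1
        else:
            result.append(("operand", token, None))
            i += 1
    return result


print(classify(["--verbose", "-n", "10", "input.txt"]))
# [('flag', '--verbose', None), ('option', '-n', '10'), ('operand', 'input.txt', None)]
```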
## Coding
### typing
We strive to use types and typecheck with `pyright` (v1.1.232).
This not only helps to catch bugs but also helps future developers understand the code more easily.
### tests
Use `pytest` to run tests.
It will run all tests found (recursively) in the current directory.
### imports
For clean imports, we add empty `__init__.py` modules in all non-root directories.
Thus, `pytest` adds the root directory to `sys.path`, and
we can import modules by prefixing the path from there.
For instance, to import `Parallelizer.py`, we use
```
from annotation_generation.parallelizers.Parallelizer import Parallelizer
```
# How to Add a New Command to PaSh's Annotation System
Here is a detailed walkthrough on [how to add a new command to PaSh's Annotation System](docs/adding-annotations.md).