{"id":13709934,"url":"https://github.com/TensorLab/tensorfx","last_synced_at":"2025-05-06T18:33:27.169Z","repository":{"id":57474169,"uuid":"75460543","full_name":"tensorlab/tensorfx","owner":"tensorlab","description":"TensorFlow framework for training and serving machine learning models","archived":false,"fork":false,"pushed_at":"2017-03-30T07:14:58.000Z","size":1129,"stargazers_count":196,"open_issues_count":13,"forks_count":41,"subscribers_count":18,"default_branch":"master","last_synced_at":"2025-04-22T21:50:50.702Z","etag":null,"topics":["machine-learning","ml","python","tensorflow","tensorfx"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tensorlab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-12-03T07:59:09.000Z","updated_at":"2024-12-19T01:59:57.000Z","dependencies_parsed_at":"2022-09-12T21:01:05.591Z","dependency_job_id":null,"html_url":"https://github.com/tensorlab/tensorfx","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorlab%2Ftensorfx","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorlab%2Ftensorfx/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorlab%2Ftensorfx/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorlab%2Ftensorfx/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tensorlab","download_url":"https://codeload.github.com/tensorlab/tensorfx/tar.gz/refs/heads/mast
er","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252744838,"owners_count":21797689,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","ml","python","tensorflow","tensorfx"],"created_at":"2024-08-02T23:00:48.718Z","updated_at":"2025-05-06T18:33:26.671Z","avatar_url":"https://github.com/tensorlab.png","language":"Python","readme":"# Introduction to TensorFX\n\nTensorFX is an end-to-end application framework that simplifies machine learning with\n[TensorFlow](http://tensorflow.org) - both training models and using them for prediction. It is\ndesigned from the ground up to make the mainline scenarios simple with higher-level building blocks,\nwhile ensuring custom or complex scenarios remain possible by preserving the flexibility of\nTensorFlow APIs.\n\nThere are some important principles that shape the design of the framework:\n\n1. **Simple, consistent set of usage patterns** \n   Local or cloud, single node or distributed execution, in-memory data or big data sharded across\n   files, you should only have to write code once, in a single way, regardless of how the code executes.\n\n2. **A Toolbox with Useful Abstractions**\n   The right entrypoint for the task at hand, starting with off-the-shelf algorithms that let you\n   focus on feature engineering and hyperparameter tuning. If you need to solve something unique, you\n   can focus on building TensorFlow graphs, rather than infrastructure code (distributed cluster\n   setup, checkpointing, logging, exporting models etc.).\n\n3. 
**Declarative**\n   Using YAML, JSON, and simplified Python interfaces to minimize the amount of boilerplate code.\n\nOK, enough context... here is some information to get you started.\n\n\n## Getting Started\nOnce you have a Python environment (recommendation: use Miniconda), installation is straightforward:\n\n    pip install tensorflow\n    pip install tensorfx\n\nNote that TensorFX depends on TensorFlow 1.0, and supporting libraries such as NumPy and pandas.\n\n\n## Documentation\nDocumentation is at https://tensorlab.github.io/tensorfx/. This includes API reference topics, as\nwell as conceptual and how-to topics. They are a work-in-progress, but check them out! The\nrepository also includes a few samples that demonstrate how to get started; more will be\nadded over time.\n\n\n## Contributions and Development\nWe welcome contributions in the form of ideas, issues, samples as well as code. Since the project is at\na super-early stage, and evolving rapidly, it's best to start a discussion by filing an issue for\nany contribution.\n\n### Building and Testing\nIf you want to develop within the repository, clone it, and run the following commands:\n\n    # Install requirements and set up environment\n    source init.sh install\n\n    # Build and Test\n    ./build.sh test\n\n### Related Links\n\n* Development workflow [TODO: Add wiki entry]\n\n\n## Hello World - Iris Classification Model\nThis sample is a quick 5-minute introduction to using TensorFX. 
Here is the code for building\na feed-forward neural network classification model for the\n[iris dataset](https://archive.ics.uci.edu/ml/datasets/Iris).\n\n    import tensorfx as tfx\n    import tensorfx.models.nn as nn\n\n    # Hyperparameters, training parameters, and data\n    args, job = nn.FeedForwardClassificationArguments.parse(parse_job=True)\n    dataset = tfx.data.CsvDataSet(args.data_schema,\n                                  train=args.data_train,\n                                  eval=args.data_eval,\n                                  metadata=args.data_metadata,\n                                  features=args.data_features)\n\n    # Instantiating the model builder\n    classification = nn.FeedForwardClassification(args, dataset)\n\n    # Training\n    trainer = tfx.training.ModelTrainer()\n    model = trainer.train(classification, job)\n\n    # Prediction\n    instances = [\n      '6.3,3.3,6,2.5',   # virginica\n      '4.4,3,1.3,0.2',   # setosa\n      '6.1,2.8,4.7,1.2'  # versicolor\n    ]\n    predictions = model.predict(instances)\n\nHere's an outline of the steps for basic usage of TensorFX:\n\n1. Parse (or build) an Arguments object, usually from the command line, to define hyperparameters.\n   This object corresponds to the kind of model you are training, so\n   `FeedForwardClassificationArguments` in this case.\n2. Create a DataSet to reference training and evaluation data, along with supporting configuration,\n   namely schema, metadata, and features (more on these below).\n3. Initialize the model builder - in this case `FeedForwardClassification`.\n4. Initialize the model trainer, and invoke `train()`, which runs the training process and returns a\n   model.\n5. Load some instances you want to run through the model and call `predict()`.\n\n#### Schema - schema.yaml\nThe schema describes the structure of your data. 
This can be defined programmatically, but is\nconveniently expressible in declarative YAML form, and placed alongside training data.\n\n    fields:\n    - name: species\n      type: discrete\n    - name: petal_length\n      type: numeric\n    - name: petal_width\n      type: numeric\n    - name: sepal_length\n      type: numeric\n    - name: sepal_width\n      type: numeric\n\n#### Metadata - metadata.json\nMetadata is the result of analyzing training data, based on type information in the schema.\nIris is a tiny dataset, so metadata is readily producible using simple Python code looping over\nthe data. For real-world and large datasets, you'll find Spark and BigQuery (on Google Cloud\nPlatform) to be essential data processing runtimes. Stay tuned - TensorFX will provide support for\nthese capabilities out of the box.\n\n    {\n      \"species\": { \"entries\": [\"setosa\", \"virginica\", \"versicolor\"] },\n      \"petal_length\": { \"min\": 4.3, \"max\": 7.9 },\n      \"petal_width\": { \"min\": 2.0, \"max\": 4.4 },\n      \"sepal_length\": { \"min\": 1.1, \"max\": 6.9 },\n      \"sepal_width\": { \"min\": 0.1, \"max\": 2.5 }\n    }\n\n#### Features - features.yaml\nLike the schema, features can also be defined programmatically, or expressed in YAML. Features describe\nthe set of inputs that your models operate over, and how they are produced by applying\ntransformations to the fields in your data. 
These transformations are turned into TensorFlow graph\nconstructs and applied consistently to both training and prediction data.\n\nIn this particular example, the FeedForwardClassification model requires two features: X, defining\nthe values the model uses for producing inferences, and Y, the target label that the model is\nexpected to predict. They are defined as follows:\n\n    features:\n    - name: X\n      type: concat\n      features:\n      - name: petal_width\n        type: scale\n      - name: petal_length\n        type: scale\n      - name: sepal_width\n        type: log\n      - name: sepal_length\n        type: log\n    - name: Y\n      type: target\n      fields: species\n\n#### Running the Model\nThe Python code in the sample can be run directly, or using a `train` tool, as shown:\n\n    cd samples\n    tfx train \\\n      --module iris.trainer.main \\\n      --output /tmp/tensorfx/iris/csv \\\n      --data-train iris/data/train.csv \\\n      --data-eval iris/data/eval.csv \\\n      --data-schema iris/data/schema.yaml \\\n      --data-metadata iris/data/metadata.json \\\n      --data-features iris/features.yaml \\\n      --log-level-tensorflow ERROR \\\n      --log-level INFO \\\n      --batch-size 5 \\\n      --max-steps 2000 \\\n      --checkpoint-interval-secs 1 \\\n      --hidden-layers:1 20 \\\n      --hidden-layers:2 10\n\nOnce the training is complete, you can list the contents of the output directory. 
You should\nsee the model (the prediction graph, and learnt variables) in the `model` subdirectory, alongside\ncheckpoints, and summaries.\n\n    ls -R /tmp/tensorfx/iris/csv\n    checkpoints\tjob.yaml\tmodel\t\tsummaries\n\n    /tmp/tensorfx/iris/csv/checkpoints:\n    checkpoint                             model.ckpt-2000.index\n    model.ckpt-1.data-00000-of-00001       model.ckpt-2000.meta\n    model.ckpt-1.index                     model.ckpt-2001.data-00000-of-00001\n    model.ckpt-1.meta                      model.ckpt-2001.index\n    model.ckpt-1562.data-00000-of-00001    model.ckpt-2001.meta\n    model.ckpt-1562.index                  model.ckpt-778.data-00000-of-00001\n    model.ckpt-1562.meta                   model.ckpt-778.index\n    model.ckpt-2000.data-00000-of-00001    model.ckpt-778.meta\n\n    /tmp/tensorfx/iris/csv/model:\n    saved_model.pb\tvariables\n\n    /tmp/tensorfx/iris/csv/model/variables:\n    variables.data-00000-of-00001\tvariables.index\n\n    /tmp/tensorfx/iris/csv/summaries:\n    eval\t\tprediction\ttrain\n\n    /tmp/tensorfx/iris/csv/summaries/eval:\n    events.out.tfevents.1488351760\n    events.out.tfevents.1488352853\n\n    /tmp/tensorfx/iris/csv/summaries/prediction:\n    events.out.tfevents.1488351765\n\n    /tmp/tensorfx/iris/csv/summaries/train:\n    events.out.tfevents.1488351760\n    events.out.tfevents.1488352852\n\nSummaries are TensorFlow events logged during training. They can be observed while the training\njob is running (which is essential when running a long or real training job) to understand how your\ntraining is progressing, or how the model is converging (or not!).\n\n    tensorboard --logdir /tmp/tensorfx/iris/csv\n\nThis should bring up TensorBoard. 
It's useful to see the graph structure, metrics and other tensors\nthat are automatically published.\n\n**Training Graph**\n\n![Graphs in TensorBoard](https://tensorlab.github.io/tensorfx/_static/images/intro-graph.jpg)\n\n**Training Metrics -- Accuracy, Loss and Throughput**\n\n![Metrics in TensorBoard](https://tensorlab.github.io/tensorfx/_static/images/intro-metrics.jpg)\n\n**Model Variables -- Weights, Gradients, etc.**\n\n![Watching Learnt Variables](https://tensorlab.github.io/tensorfx/_static/images/intro-watch.jpg)\n\n\nAs you can see, the out-of-the-box model takes care of a number of details. The same code can be run on\na single machine, or in a cluster (of course, iris is too simple a problem to need that).\n","funding_links":[],"categories":["Deep Learning Framework"],"sub_categories":["High-Level DL APIs"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FTensorLab%2Ftensorfx","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FTensorLab%2Ftensorfx","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FTensorLab%2Ftensorfx/lists"}