{"id":14959127,"url":"https://github.com/ubccr/terf","last_synced_at":"2025-07-27T00:02:40.913Z","repository":{"id":144200821,"uuid":"129948091","full_name":"ubccr/terf","owner":"ubccr","description":"Go library for reading/writing TensorFlow TFRecords file format","archived":false,"fork":false,"pushed_at":"2018-04-20T15:19:03.000Z","size":86,"stargazers_count":19,"open_issues_count":0,"forks_count":3,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-01-31T03:12:17.982Z","etag":null,"topics":["golang","machine-learning","tensorflow-examples"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ubccr.png","metadata":{"files":{"readme":"README.rst","changelog":"ChangeLog.rst","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-04-17T18:26:42.000Z","updated_at":"2021-12-06T14:31:14.000Z","dependencies_parsed_at":"2023-06-18T07:45:15.621Z","dependency_job_id":null,"html_url":"https://github.com/ubccr/terf","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ubccr%2Fterf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ubccr%2Fterf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ubccr%2Fterf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ubccr%2Fterf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ubccr","download_url":"https://codeload.github.com/ubccr/terf/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repo
sitories_count":237999613,"owners_count":19399914,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["golang","machine-learning","tensorflow-examples"],"created_at":"2024-09-24T13:18:53.226Z","updated_at":"2025-02-09T18:31:15.762Z","avatar_url":"https://github.com/ubccr.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"===============================================================================\nterf - TensorFlow TFRecords file format Reader/Writer\n===============================================================================\n\n|godoc|\n\nterf is a Go library for reading/writing TensorFlow `TFRecords files\n\u003chttps://www.tensorflow.org/versions/r1.1/api_guides/python/python_io#tfrecords_format_details\u003e`_.\nThe goals of this project are twofold:\n\n1. Read/Write TensorFlow TFRecords files in Go\n2. Provide an easy way to generate example image datasets for use in TensorFlow\n\nWith terf you can easily build, inspect, and extract image datasets from the\ncommand line without having to install TensorFlow. terf was developed for use\nwith `MARCO \u003chttps://marco.ccr.buffalo.edu\u003e`_ but should work with most image\ndatasets. 
The TFRecords file format is based on the ImageNet dataset from the\nInception research model in TensorFlow.\n\n-------------------------------------------------------------------------------\nInstall\n-------------------------------------------------------------------------------\n\nBinaries for your platform can be found `here \u003chttps://github.com/ubccr/terf/releases\u003e`_.\n\nUsage::\n\n    $ ./terf --help\n\n-------------------------------------------------------------------------------\nExamples\n-------------------------------------------------------------------------------\n\n~~~~~~~~~~~~~~~~~~~~~~~~~\nCreate an image dataset\n~~~~~~~~~~~~~~~~~~~~~~~~~\n\nYou have a directory of images that have been labeled and you want to build an\nimage dataset that can be used in TensorFlow. The first step is to generate a\nCSV file in the following format::\n\n\timage_path,image_id,label_id,label_text,label_raw,source\n\nHere image_path is the path to the raw image file, image_id is the unique\nidentifier for an image, label_id is the integer identifier of the normalized\nlabel, label_text is the normalized label, label_raw is the integer identifier\nof the raw label, and source is the source (organization, creator, etc.) that\nproduced the image. For example::\n\n\timage_path,image_id,label_id,label_text,label_raw,source\n\t/data/03c3_G6_ImagerDefaults_6.jpg,123,1,Crystals,12,101\n\t/data/X0000056450155200509052032.png,124,0,Clear,15,104\n\n\nTo build the image dataset, run the following command::\n\n\t$ ./terf -d build --input images.csv --output train_directory/ --size 1024\n\nThis will convert the image data into a sharded dataset of TFRecords files in\nthe train_directory/ output directory::\n\n\ttrain_directory/train-00000-of-00024\n\ttrain_directory/train-00001-of-00024\n\t...\n\ttrain_directory/train-00023-of-00024\n\nEach TFRecord file will contain ~1024 records. Each record within the TFRecord\nfile is a serialized Example proto. 
The Example proto contains the following\nfields::\n\n\timage/height: integer, image height in pixels\n\timage/width: integer, image width in pixels\n\timage/colorspace: string, specifying the colorspace, always 'RGB'\n\timage/channels: integer, specifying the number of channels, always 3\n\timage/class/label: integer, specifying the index in a normalized classification layer\n\timage/class/raw: integer, specifying the index in the raw (original) classification layer\n\timage/class/source: integer, specifying the index of the source (creator of the image)\n\timage/class/text: string, specifying the human-readable version of the normalized label\n\timage/format: string, specifying the format, always 'JPEG'\n\timage/filename: string containing the basename of the image file\n\timage/id: integer, specifying the unique id for the image\n\timage/encoded: string, containing JPEG encoded image in RGB colorspace\n\n~~~~~~~~~~~~~~~~~~~~~~~~~\nInspect an image dataset\n~~~~~~~~~~~~~~~~~~~~~~~~~\n\nGenerate summary statistics on an image dataset::\n\n\t$ ./terf -d summary --input train_directory/\n\tINFO[0000] Processing file  path=train_directory/train-00001-of-00001 zlib=false\n\tTotal: 10\n\tLabel: \n\t\t- Clear: 5\n\t\t- Precipitate: 4\n\t\t- Crystals: 1\n\tSource: \n\t\t- 2: 2\n\t\t- 3: 6\n\t\t- 1: 2\n\tLabel ID: \n\t\t- 1: 1\n\t\t- 0: 5\n\t\t- 3: 4\n\tLabel Raw: \n\t\t- 30: 1\n\t\t- 2: 3\n\t\t- 8: 1\n\t\t- 16: 1\n\t\t- 1: 2\n\t\t- 14: 2\n\n~~~~~~~~~~~~~~~~~~~~~~~~~\nExtract an image dataset\n~~~~~~~~~~~~~~~~~~~~~~~~~\n\nExtract the raw image data from a dataset::\n\n\t$ ./terf -d extract --input train_directory -o dump/\n\tINFO[0000] Processing file    path=train_directory/train-00001-of-00001 zlib=false\n\t$ find 
dump/\n\tdump/\n\tdump/info.csv\n\tdump/Clear\n\tdump/Clear/396612.jpg\n\tdump/Clear/90089.jpg\n\tdump/Clear/192089.jpg\n\tdump/Clear/283709.jpg\n\tdump/Clear/82162.jpg\n\tdump/Precipitate\n\tdump/Precipitate/286612.jpg\n\tdump/Precipitate/421709.jpg\n\tdump/Precipitate/296118.jpg\n\tdump/Precipitate/163507.jpg\n\tdump/Crystals\n\tdump/Crystals/80373.jpg\n\n\n~~~~~~~~~~~~~~~~~~~~~~\nGo\n~~~~~~~~~~~~~~~~~~~~~~\n\nParse TFRecords file in Go:\n\n.. code-block:: go\n\n\t// Open TFRecord file\n\tin, err := os.Open(\"train-000\")\n\tif err != nil {\n\t\tlog.Fatal(err)\n\t}\n\tdefer in.Close()\n\n\tr := terf.NewReader(in)\n\n\tcount := 0\n\tfor {\n\t\t// example will be a TensorFlow Example proto\n\t\texample, err := r.Next()\n\t\tif err == io.EOF {\n\t\t\tbreak\n\t\t} else if err != nil {\n\t\t\tlog.Fatal(err)\n\t\t}\n\n\t\t// Do something with example\n\n\t\tid := terf.ExampleFeatureInt64(example, \"image/id\")\n\t\tlabelID := terf.ExampleFeatureInt64(example, \"image/class/label\")\n\t\tlabelText := string(terf.ExampleFeatureBytes(example, \"image/class/text\"))\n\n\t\tfmt.Printf(\"Image: %d Label: %s (%d)\\n\", id, labelText, labelID)\n\t\tcount++\n\t}\n\n\tfmt.Printf(\"Total records: %d\\n\", count)\n\n-------------------------------------------------------------------------------\nLicense\n-------------------------------------------------------------------------------\n\nterf is released under the GPLv3 License. See the LICENSE file.\n\n.. |godoc| image:: https://godoc.org/github.com/golang/gddo?status.svg\n    :target: https://godoc.org/github.com/ubccr/terf\n    :alt: Godoc\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fubccr%2Fterf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fubccr%2Fterf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fubccr%2Fterf/lists"}