{"id":18605894,"url":"https://github.com/varshneydevansh/tfcheatsheet","last_synced_at":"2026-01-24T13:14:16.619Z","repository":{"id":160355483,"uuid":"117566042","full_name":"varshneydevansh/TFCheatSheet","owner":"varshneydevansh","description":"Tensorflow cheat sheet for reference","archived":false,"fork":false,"pushed_at":"2018-01-15T16:02:42.000Z","size":21,"stargazers_count":10,"open_issues_count":0,"forks_count":4,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-05-16T19:47:47.636Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/varshneydevansh.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-01-15T16:01:55.000Z","updated_at":"2023-12-08T13:50:53.000Z","dependencies_parsed_at":null,"dependency_job_id":"e2508051-4a08-427d-ab73-23811280148f","html_url":"https://github.com/varshneydevansh/TFCheatSheet","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/varshneydevansh/TFCheatSheet","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varshneydevansh%2FTFCheatSheet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varshneydevansh%2FTFCheatSheet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varshneydevansh%2FTFCheatSheet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varshneydevansh%2FTFCheatSheet/manifests","owner_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/owners/varshneydevansh","download_url":"https://codeload.github.com/varshneydevansh/TFCheatSheet/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/varshneydevansh%2FTFCheatSheet/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28728580,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-24T10:24:43.181Z","status":"ssl_error","status_checked_at":"2026-01-24T10:24:36.112Z","response_time":89,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T02:23:28.113Z","updated_at":"2026-01-24T13:14:16.612Z","avatar_url":"https://github.com/varshneydevansh.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# TFCheatSheet\nTensorflow cheat sheet for reference\n# Index\n- [Introduction](#introduction)\n- [Graph and Session notions](#graph-and-session)\n- [Dynamic and Static shape](#dynamic-shape-vs-static-shape)\n- [Tensorflow variable](#variable)\n- [Structure tensorflow code with decorator](#how-to-structure-a-tensorflow-model)\n- [Save and restore model](#checkpoint)\n- [Tensoarboard](#tensoarboard)\n- [Regularization](#regularization)\n- [Preprocessing](#input-data-to-the-graph)\n- [Computer Vision](#computer-vision-application)\n- [Natural Language Processing](#nlp-application)\n- [Higher order 
operations](#higher-order-operators)\n- [Debugging](#debugging-and-tracing)\n- [Miscellaneous](#miscellanous)\n- [Dynamic graph computation](#tensorflow-fold)\n- [Tensorflow Estimator](#tensorflow-estimator)\n- [Variable scope](#sonnet)\n\n# Introduction\n* Tensorflow is a symbolic library (see [on symbolic and imperative libraries](http://mxnet-tqchen.readthedocs.io/en/latest/system/program_model.html))  \n* [Linux installation](https://www.tensorflow.org/install/install_linux)\n* [Windows installation (GPU support)](http://www.netinstructions.com/how-to-install-and-run-gpu-enabled-tensorflow-on-windows/)\n* [Azure installation](https://www.lutzroeder.com/blog/2016-12-27-tensorflow-azure)\n\n# Graph and Session\n### Graph vs Session\n* A graph defines the computation. It doesn’t compute anything and it doesn’t hold any values; it just defines the operations that you specified in your code.\n* A session allows executing graphs or parts of graphs. It allocates resources (on one or more machines) for that and holds the actual values of intermediate results and variables. One can create a session with ```tf.Session```; be sure to use a context manager or call ```tf.Session.close()```, because the session keeps its resources until it is closed. To run some graph element, use the function ```.run(graph_element, feed_dict)```; it returns a value, or a list of values if a list of graph elements was passed.\n\n### Interactive session\nAn interactive session is useful when several different sessions need to be run in the same script.\n```\n# start the session\nsess = tf.InteractiveSession()\n\n# stop the session\nsess.close()\ntf.reset_default_graph()\n```\n\n### Collecting variables in the graph\nTo collect and retrieve values associated with a Graph, it is possible to get them with GraphKeys. Variables are automatically placed in collections. 
Here are the main collections used: ```GLOBAL_VARIABLES```, ```MODEL_VARIABLES```, ```TRAINABLE_VARIABLES```, ```QUEUE_RUNNERS```, or even more specifically ```WEIGHTS```, ```BIASES```, or ```ACTIVATIONS```.\n\n* You can get the names of all variables that have not been initialized by passing a list of Variables to the function ```tf.report_uninitialized_variables(list_var)```. It returns the names of the uninitialized variables.\n\n### Split training variables between two neural networks. An example with GAN architecture\nIn GANs, there are two neural networks: the Generator and the Discriminator. Each one has its own loss, and each training step must update only the weights of its own network: the Generator step updates only the Generator weights, and likewise for the Discriminator. For this, we need to use different scopes (or use Sonnet, see bottom): one \"generator\" scope and one \"discriminator\" scope, where all the necessary weights will be instantiated.  \n1. First you need to retrieve all trainable variables with ```train_variables = tf.trainable_variables()```  \n2. Then you split the training variables in two lists:  \n```\nlist_gen = self.generator_variables = [v for v in train_variables if v.name.startswith(\"generator\")]\nlist_dis = self.discriminator_variables = [v for v in train_variables if v.name.startswith(\"discriminator\")]    \n```  \n3. Create two functions for training them:  \n```\ngrads = optimizer.compute_gradients(loss_generator, var_list=list_gen)\ntrain_gen = optimizer.apply_gradients(grads)\n```\n\n### Get an approximation of the size in bytes of the Graph\n```\n# This is the best approximation I've been able to find online. Not sure how good it is, because it doesn't take into\n# account the memory used for automatic differentiation\nnum_params = 0\nfor v in tf.global_variables():\n    num_params += np.prod(v.get_shape().as_list())\n# assuming 4 bytes per parameter (float32)\nsize_in_bytes = num_params * 4\n```\n\n# Dynamic shape vs Static shape\n```dynamic shape = static shape``` when you run a session. 
When Tensorflow can't infer a shape during graph construction, it\nsets the dimension value to None.\n```\nvar = tf.placeholder(shape=(None, 2), dtype=tf.float32)\ndynamic_shape = tf.shape(var)\nprint(var.get_shape().as_list())  # =\u003e [None, 2]\nsess = tf.Session()\nsess.run(tf.global_variables_initializer())\nprint(sess.run(dynamic_shape, feed_dict={var: [[1, 2], [1, 2]]}))  # =\u003e [2 2]\n```\n\n### Sparse vector\nA sparse vector is usually created from sequence features when parsing a protobuf example. By default, all sequence features are transformed into a sparse vector. To convert it back into a dense vector, use ```tf.sparse_tensor_to_dense(sparse_vector)```\n\n# Variable\n## Some parameters\n* Setting ```trainable=False``` keeps the variable out of the ```GraphKeys.TRAINABLE_VARIABLES``` collection in the graph, so it won't be trained when back-propagating. \n* Setting ```collections=[]``` keeps the variable out of the ```GraphKeys.GLOBAL_VARIABLES``` collection used for saving and restoring checkpoints.  \nExample: ```input_data = tf.Variable(data_initializer, trainable=False, collections=[])```\n\n## Tensors\nTensors are similar to Variables, but they don't keep their state between two calls to ```sess.run```.\n\n## Shared variable\nIt is possible to reuse weights by binding a new variable to one defined previously. For that, you must be in the same variable scope, and look for a variable with the same name. 
Here is an example:\n```\ndef build():\n    # Create variable named \"weights\".\n    weights = tf.get_variable(\"weights\", kernel_shape,\n        initializer=tf.random_normal_initializer())\n    ...\n\ndef build_funct():\n    with tf.variable_scope(\"scope1\"):\n        relu1 = build()\n    with tf.variable_scope(\"scope2\"):\n        # relu2 is different from relu1: even if the variables share the same name, they are in different variable scopes\n        relu2 = build()\n\n# However, calling build_funct() twice raises an error\n\n\nresult1 = build_funct()\nresult2 = build_funct()\n# Raises ValueError(... scope1/weights already exists ...) because\n# tf.get_variable_scope().reuse == False \n\n# to avoid the error, you must define it this way:\nwith tf.variable_scope(\"image_filters\") as scope:\n    result1 = build_funct()\n    scope.reuse_variables()\n    result2 = build_funct()\n```\n\n# Checkpoint\n### Save and Restore\n```\nsaver = tf.train.Saver(max_to_keep=5)\n\n# Try to restore an old model\nlast_saved_model = tf.train.latest_checkpoint(\"model/\")\n\ngroup_init_ops = tf.group(tf.global_variables_initializer())\nself._sess.run(group_init_ops)\n\nsummary_writer = tf.summary.FileWriter('logs/', graph=self._sess.graph, flush_secs=20)\n\nif last_saved_model is not None:\n    saver.restore(self._sess, last_saved_model)\nelse:\n    tf.train.global_step(self._sess, self.global_step)\n```\n\n### What are the files saved?\n* The checkpoint file is used by high-level helpers (such as ```tf.train.latest_checkpoint```) to keep track of the checkpoints saved at different times\n* The meta ckpt file holds the compressed Protobuf graph of your model and all the associated metadata\n* The ckpt file contains the data (the values of the variables)\n* The events file stores everything for visualization\n\n### Connect an already trained Graph \nIt is possible to connect multiple graphs. For example, if you want to connect a pretrained VGG to a new graph and only train the new one, here is a simple example:\n```\nvgg_saver = tf.train.import_meta_graph(dir + 
'/vgg/results/vgg-16.meta')\nvgg_graph = tf.get_default_graph()\n\n# retrieve inputs\nself.x_plh = vgg_graph.get_tensor_by_name('input:0')\n\n# choose the node to connect to\noutput_conv = vgg_graph.get_tensor_by_name('conv1_2:0')\n\n# stop the gradient for fine tuning\noutput_conv_sg = tf.stop_gradient(output_conv)\n\n# create your own neural network\noutput_conv_shape = output_conv_sg.get_shape().as_list()\n\"\"\"...\"\"\"\n```\n\n### Different function between forward and backward\nIn the next snippet, g(x) is used for the backward pass (the gradient flows through it), while f(x) is used for forwarding the signal.  \n```\nt = g(x)\ny = t + tf.stop_gradient(f(x) - t)\n```\n\n\n# Tensoarboard\n* Launch Tensorboard with ```tensorboard --logdir=\"\"```.\n\n### Save the graph for Tensorflow visualization\nThe FileWriter class provides a mechanism to create an event file in a given directory and add summaries and events to it. The class updates the file contents asynchronously, which means the training loop can call it without being slowed down.\n```python\nsess = tf.Session()\nsummary_writer = tf.summary.FileWriter('logs', graph=sess.graph)\n```\nThe graph and the session are connected with the line: ```with tf.Session(graph=graph) as sess:```\n\n### Summary about activation and gradient\n```\ndef add_activation_summary(var):\n    tf.summary.histogram(var.op.name + \"/activation\", var)\n    tf.summary.scalar(var.op.name + \"/sparsity\", tf.nn.zero_fraction(var))\n\n\ndef add_gradient_summary(grad, var):\n    if grad is not None:\n        tf.summary.histogram(var.op.name + \"/gradient\", grad)\n```\n\n### Summary about an image\nFloat values are automatically rescaled to values in [0, 255]  \n```\n# images is a tensor of shape [batch_size, height, width, channels]\ntf.summary.image(\"images\", images, max_outputs=num_max_of_images_to_display)\n```\n\n### Summary about a cost function\n```\ncost_function = -tf.reduce_sum(var)\n# no need to store the reference, merge_all is responsible for collecting all summaries\ntf.summary.scalar(\"cost_function\", 
cost_function)\n```\n\n### Summary about a python scalar\nIt is also possible to add a summary for a Python scalar\n```\nsummary_accuracy = tf.Summary(value=[\n            tf.Summary.Value(tag=\"value_name\", simple_value=value_to_save),\n        ])\n```\nYou can call this function as many times as you want; if you call it with the same tag, no duplicate value will be saved. You can even plot a graph of this Python value by writing the ```summary_accuracy``` every epoch.\n\n### Merge all summaries operations\n```python\nmerged_summary_op = tf.summary.merge_all()\n```\nSummaries created after this call won't be part of the collected summary.\n\n### Collect stats during each iteration\n```python\n# First compute the global_step\n# Global step is a variable passed to the optimizer and is incremented each time\n# optimizer.apply_gradients(grads, global_step=self.global_step) is run\ncurrent_iter = self._sess.run(self.global_step)\n\n# Run the summary function\nsummary_str, _ = self._sess.run([merged_summary_op, optimize], {x: batchX, y: batchY})\nsummary_writer.add_summary(summary_str, current_iter)\n```\n\n### Plot embeddings\n\n1. Create an embedding matrix (dim: nb_embeddings, embedding_size):\n    ```python\n    embedding = tf.Variable(tf.random_normal([nb_embeddings, embedding_size]))\n    ```\n\n2. Create a tag for every embedding (the first name in the file corresponds to the name of the first embedding):  \n    ``` \n    LOG_DIR = 'log/'\n    metadata = os.path.join(LOG_DIR, 'metadata.tsv')\n    with open(metadata, 'w') as metadata_file:\n        # Header row with the label names\n        metadata_file.write(\"Label1\\tLabel2\\n\")\n        for data in whatever_object:\n            metadata_file.write('%s\\t%s\\n' % (data.label1, data.label2))\n    ```\n\n3. 
Save embedding\n    ```    \n    # See the Checkpoint section for more details on the Saver object\n    tf.global_variables_initializer().run()\n    saver = tf.train.Saver()\n    saver.save(sess, save_path=os.path.join(log_dir, 'model.ckpt'), global_step=None)\n    ```\n\n4. Create a projector for Tensorboard\n    ```\n    summary_writer = tf.summary.FileWriter(log_dir, graph=tf.get_default_graph())\n\n    metadata_path = os.path.join(log_dir, \"metadata.tsv\") # file for name metadata (every line is an embedding)\n    config = projector.ProjectorConfig()\n\n    embedding = config.embeddings.add()\n    embedding.metadata_path = metadata_path\n\n    embedding.tensor_name = embeddings.name\n\n    # Add sprite metadata\n    embedding.sprite.image_path = filename_spirit_picture\n    embedding.sprite.single_image_dim.extend([thumbnail_width, thumbnail_height])\n    projector.visualize_embeddings(summary_writer, config)\n    ```\n\n\u003cdetails\u003e\n\u003csummary\u003eA code example\u003c/summary\u003e\n\u003cp\u003e\u003ccode\u003e\n\n    # Prerequisite\n    dictionary = ...  # A dictionary where keys are incremental integers\n                      # and each value is a pair of (embedding, image)\n\n    # Size of the thumbnail\n    thumbnail_width = 28  # width of a small thumbnail\n    thumbnail_height = thumbnail_width  # height\n\n    # size of the embeddings (in the dictionary)\n    embeddings_length = 4800\n\n    # 1. 
Make the big sprite picture\n    filename_spirit_picture = \"master.jpg\"\n    filename_temporary_embedding = \"features.p\"\n\n\n    if not os.path.isfile(filename_spirit_picture) or not os.path.isfile(filename_temporary_embedding):\n        print(\"Creating sprite\")\n        Image.MAX_IMAGE_PIXELS = None\n        images = []\n\n        features = np.zeros((len(dictionary), embeddings_length))\n\n        # Make a list of all images and an array of their respective embeddings (same index)\n        for iteration, pair in dictionary.items():\n            # Resize the image to the thumbnail size\n            array = cv2.resize(pair[1], (thumbnail_width, thumbnail_height))\n\n            img = Image.fromarray(array)\n            # Append the image to the list of images\n            images.append(img)\n            # Get the embedding for that picture\n            features[iteration] = pair[0]\n\n        # Build the sprite image\n        print('Number of images %d' % len(images))\n        image_width, image_height = images[0].size\n        # Round up so that every thumbnail fits in the square sprite\n        master_width = image_width * int(np.ceil(np.sqrt(len(images))))\n        master_height = master_width\n        print('Length (in pixel) of the square image %d' % master_width)\n        master = Image.new(\n            mode='RGBA',\n            size=(master_width, master_height),\n            color=(0, 0, 0, 0))\n\n        for count, image in enumerate(images):\n            locationX = (image_width * count) % master_width\n            locationY = image_height * (image_width * count // master_width)\n            master.paste(image, (locationX, locationY))\n        master.save(filename_spirit_picture, transparency=0)\n        pickle.dump(features, open(filename_temporary_embedding, 'wb'))\n    else:\n        print('Sprite already created')\n        features = pickle.load(open(filename_temporary_embedding, 'rb'))\n\n    print('Starting session')\n    sess = tf.InteractiveSession()\n    log_dir = 'logs'\n\n    # Create a variable containing all features\n    embeddings = 
tf.Variable(features, name='embeddings')\n\n    # Initialize variables\n    tf.global_variables_initializer().run()\n    saver = tf.train.Saver()\n    saver.save(sess, save_path=os.path.join(log_dir, 'model.ckpt'), global_step=None)\n\n    # add metadata\n    summary_writer = tf.summary.FileWriter(log_dir, graph=tf.get_default_graph())\n\n    metadata_path = os.path.join(log_dir, \"metadata.tsv\")\n    config = projector.ProjectorConfig()\n\n    embedding = config.embeddings.add()\n    embedding.metadata_path = metadata_path\n\n    print('Add metadata')\n    embedding.tensor_name = embeddings.name\n\n    # add image metadata\n    embedding.sprite.image_path = filename_spirit_picture\n    embedding.sprite.single_image_dim.extend([thumbnail_width, thumbnail_height])\n    projector.visualize_embeddings(summary_writer, config)\n\n    print('Finished, now cleaning up')\n    # Remove the temporary embedding file\n    if not to_saved:\n        os.remove(filename_temporary_embedding)\n\u003c/code\u003e\u003c/p\u003e\n\u003c/details\u003e\n\n# Regularization\n### L2 regularization\n```python\nw = tf.Variable(...)  # your weights\ncost = ...  # define your loss\nregularizer = tf.nn.l2_loss(w)\nloss = cost + regularizer\n```\n### L1 and L2 regularization\n```\ndef l1_l2_regularizer(weight_l1=1.0, weight_l2=1.0, scope=None):\n    \"\"\"\n    L1 and L2 regularizer\n    :param weight_l1: weight of the L1 term\n    :param weight_l2: weight of the L2 term\n    :param scope: optional name scope\n    :return: a function that maps a tensor to its regularization value\n    \"\"\"\n    def regularizer(tensor):\n        with tf.name_scope(scope, 'L1L2Regularizer', [tensor]):\n            weight_l1_t = tf.convert_to_tensor(weight_l1,\n                                               dtype=tensor.dtype.base_dtype,\n                                               name='weight_l1')\n            weight_l2_t = tf.convert_to_tensor(weight_l2,\n                                               dtype=tensor.dtype.base_dtype,\n                                               name='weight_l2')\n            reg_l1 = tf.multiply(weight_l1_t, 
tf.reduce_sum(tf.abs(tensor)),\n                                 name='value_l1')\n            reg_l2 = tf.multiply(weight_l2_t, tf.nn.l2_loss(tensor),\n                                 name='value_l2')\n            return tf.add(reg_l1, reg_l2, name='value')\n\n    return regularizer\n```\n\n### Dropout\n```\nhidden_layer_drop = tf.nn.dropout(some_activation_output, keep_prob)\n```\n\n\n### Batch normalization\nUse ```tf.contrib.layers.batch_norm```\n```\nis_training = tf.placeholder(tf.bool)\nbatch_norm(pre_activation, is_training=is_training, scale=True)\n```  \nBy default, the operations that update the moving mean and the moving variance are not run as part of the normal graph computation; they are placed in the ```tf.GraphKeys.UPDATE_OPS``` collection. Hence, to update the moving mean and the moving variance, you should make sure that these operations run before the training step is computed:\n\n```\nupdate_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)\n# then, for example\nwith tf.control_dependencies(update_ops):\n    grads_dis = self.optimizer.compute_gradients(loss=self.dis_loss, var_list=self.dis_variables)\n    self.train_dis = self.optimizer.apply_gradients(grads_dis)\n```\nAnother way to update the moving statistics, in place in the graph, is to set ```updates_collections=None```.\nThe ```is_training``` boolean can be a placeholder, so that the computation in the batch norm layer depends on the feed dictionary.  \n\n# Input data to the graph\nIt is possible to load data directly from Numpy arrays using ```feed_dict```. However, it is best practice to use Tensorflow protobuf formats such as ```tf.train.Example``` or ```tf.train.SequenceExample```. This decouples the model from the data preprocessing.\nOne drawback of this method is that it is quite verbose.  \n\n1. 
Create a function to transform a batch element to a ```SequenceExample```:  \n    ```python\n    def make_example(inputs, labels):\n        ex = tf.train.SequenceExample()\n        # Add non-sequential feature\n        sequence_length = len(inputs)\n        # could also be a float_list or a bytes_list\n        ex.context.feature[\"length\"].int64_list.value.append(sequence_length)\n\n        # Add sequential features, one entry per time step\n        # All sequential features should be retrieved through the sequence_features\n        # argument of parse_single_sequence_example\n        fl_tokens = ex.feature_lists.feature_list[\"tokens\"]\n        fl_labels = ex.feature_lists.feature_list[\"labels\"]\n        for token, label in zip(inputs, labels):\n            fl_tokens.feature.add().int64_list.value.append(token)\n            fl_labels.feature.add().int64_list.value.append(label)\n        return ex\n    ```\n\n2. Write all examples into TFRecords. You can split TFRecords into multiple files by creating multiple ```TFRecordWriter``` objects.  \n    ```\n    import tempfile\n    with tempfile.NamedTemporaryFile() as fp:\n        writer = tf.python_io.TFRecordWriter(fp.name)\n        for inputs, label_sequence in zip(all_inputs, all_labels):\n            ex = make_example(inputs, label_sequence)\n            writer.write(ex.SerializeToString())\n        writer.close()\n        # check where the file is written with fp.name\n    ```\n3. Create a Reader object: ```TFRecordReader``` for tfrecords files, or ```WholeFileReader``` for raw files such as jpg files. \n4. Read a single example with:\n    ```\n    writer_filename = \"examples/val.tfrecords\"\n    # note that writer_filename can also be a list of tfrecords filenames,\n    # or a list of jpg files (use tensorflow internal functions)\n    filename_queue = tf.train.string_input_producer([writer_filename])\n    reader = tf.TFRecordReader()\n    key, ex = reader.read(filename_queue) # key is not interesting\n    ```\n\n6. 
Define how to parse the data\n    ```python\n    context_features = {\n        \"length\": tf.FixedLenFeature([], dtype=tf.int64)\n    }\n\n    sequence_features = {\n        # If every step of the sequence has the same fixed shape\n        \"tokens\": tf.FixedLenSequenceFeature([], dtype=tf.int64),\n        \"labels\": tf.FixedLenSequenceFeature([], dtype=tf.int64),\n        # else use VarLenFeature, which will create a SparseTensor\n        \"sentences\": tf.VarLenFeature(dtype=tf.float32)\n    }\n\n    context_parsed, sequence_parsed = tf.parse_single_sequence_example(\n        serialized=ex,\n        context_features=context_features,\n        sequence_features=sequence_features\n    )\n    ```\n7. Retrieve the data directly as arrays\n    ```python\n\n    # get back in array format\n    context = tf.contrib.learn.run_n(context_parsed, n=1, feed_dict=None)\n    ```\n7. bis) Or retrieve the examples by their name. Example ```sentences = sequence_parsed[\"sentences\"]```\n\n8. Use queues. There are three main types of queues.\n\n### Queues\n#### Shuffle queues\nIt shuffles elements\n```\nimages = tf.train.shuffle_batch(\n    inputs, # all dimensions must be defined\n    batch_size=batch_size, # number of elements to output\n    capacity=min_queue_examples + 3 * batch_size, # max capacity\n    min_after_dequeue=min_queue_examples) # minimum number of elements left in the queue after a dequeue\n```\n\n#### Batch queues\nSame as shuffle queues without ```min_after_dequeue```. It is also possible to dynamically pad entries in the queues. 
Every ```VarLenFeature``` created, which is now a ```SparseTensor```, will be padded to the maximum length among all elements of the same category in the batch.\n```\ntf.train.batch(tensors=[review, score, film_id],\n                          batch_size=batch_size,\n                          dynamic_pad=True, # dynamically pad sparse tensor\n                          allow_smaller_final_batch=False, # disallow batch smaller than batch size\n                          capacity=capacity) \n```  \nAs of now, dynamic pad is not supported with shuffle, but one may use the output of a shuffle_batch as input tensors of a dynamic pad queue.\n\n#### Bucket queues\n1. What to pass to the bucket queues?    \n    ```python\n    # Set this variable to the maximum length among all tensors of a single example\n    # For example, if an example consists of an encoder sentence and a decoder sentence,\n    # then pick the longest length\n    # Consider that a pair of (encoder sentence, decoder sentence) is returned by a shuffle_batch queue\n    encoder_sentence, decoder_sentence = tf.train.shuffle_batch(..., batch_size=1, ...)\n    ``` \n2. Then set length_table to the max between both lengths, for example. Note that setting the minimum length gives optimal performance because tensors are not appended to a bucket that is too large:  \n    ```\n    length_table = tf.constant([], dtype=tf.int32)\n    ```\n3. 
Call ```bucket_by_sequence_length()```:    \n    ```\n    # the first returned value is the sequence length specified in input_length\n    _, batch_tensors = tf.contrib.training.bucket_by_sequence_length(\n        input_length=length_table,\n        tensors=[encoder_sentence, decoder_sentence],\n        batch_size=batch_size,\n        \n        # divides into buckets [len \u003c 3, 3 \u003c= len \u003c 5, 5 \u003c= len]\n        bucket_boundaries=[3, 5],\n        \n        # this will pad the source_batch and target_batch independently\n        dynamic_pad=True,\n        capacity=2\n    )\n    ```\n\n### Validation and Testing queues\nIt is **not recommended** to use a ```tf.cond(is_training, lambda: training_queue, lambda: test_queue)``` because training becomes very slow: at each iteration both queues output elements, but only one of them is used.  \nThe recommended way is to run a separate script that fetches some checkpoint and computes the accuracy.\n\n## How to use tf.contrib.data.Dataset and why use it?\n```tf.contrib.data.Dataset``` takes care of loading the data into the graph. Compared to the previous implementation, it can be used to feed single inputs or to create queues. A Dataset can be made of text files, TFRecords, or even Numpy arrays.\nHere is an example where we have two files. 
In each file, each line holds a sequence of ids, representing the token ids of a vocabulary.\nThe first file contains the ids of the questions; the second file contains the ids of the answers:\n```\ndef input_fn():\n        \"\"\"Let's define an input_fn of an Estimator\"\"\"\n        # We load data from text files\n        source_dataset = tf.contrib.data.TextLineDataset(context_filename)\n        target_dataset = tf.contrib.data.TextLineDataset(answer_filename)\n        \n        # We define a set of operations that will be applied to each input\n        def map_dataset(dataset):\n            dataset = dataset.map(lambda string: tf.string_split([string]).values)\n            dataset = dataset.map(lambda tokens: tf.string_to_number(tokens, tf.int64))\n            dataset = dataset.map(lambda tokens: (tokens, tf.size(tokens)))\n            dataset = dataset.map(lambda tokens, size: (tokens[:max_sequence_len], tf.minimum(size, max_sequence_len)))\n            return dataset\n\n        # For all elements in both datasets, we apply the same operation\n        source_dataset = map_dataset(source_dataset)\n        target_dataset = map_dataset(target_dataset)\n        \n        # Merge the dataset. 
Note that both text files must contain the same number of lines\n        dataset = tf.contrib.data.Dataset.zip((source_dataset, target_dataset))\n        # How many times the dataset will be repeated\n        dataset = dataset.repeat(num_epochs)\n        # We pad each sequence to a max length\n        dataset = dataset.padded_batch(batch_size,\n                                       padded_shapes=((tf.TensorShape([max_sequence_len]), tf.TensorShape([])),\n                                                      (tf.TensorShape([max_sequence_len]), tf.TensorShape([]))\n                                                      ))\n        # We create an iterator that will pull elements from the dataset\n        iterator = dataset.make_one_shot_iterator()\n        next_element = iterator.get_next()\n        return next_element, None\n```\n\nTo dive deeper, here are some cool features:\n### Create a dataset\n* ```Dataset.from_tensor_slices(tensor, numpy array, tuple of tensor, tuple of tuple ...)```: Everything is loaded into memory\n* ```Dataset.zip((dataset1, dataset2))```: zip multiple datasets\n* ```Dataset.range(100)``` \n* ```TFRecordDataset(list_of_filenames)```: read the records of TFRecords files\n\n### Transform a dataset\n* ```dataset.map(_function_to_apply_to_each_element)```. Can even work with non-Tensorflow operations, using ```py_func```:  \n    ```dataset.map(lambda x, y: tf.py_func(_function, [objects], [types]))```\n* ```dataset.flat_map```: map each element to a dataset, then flatten the result into a single dataset\n* ```dataset.filter```: filter based on a condition\n\n### Get the Shape \u0026 outputs of the dataset elements\n* ```dataset.output_types``` and ```dataset.output_shapes```\n\n### Iterate\n* one shot iterator: ```dataset.make_one_shot_iterator()```, then ```get_next()```. It is not possible to condition the dataset elements on some other graph variables such as placeholders\n* initializable iterator: ```dataset.make_initializable_iterator()```, then ```get_next()```. 
Elements in the dataset can be loaded from placeholders by calling ```sess.run(iterator.initializer, feed_dict={})```.\n* reinitializable iterator: two datasets with the same output types and shapes  \n    ```\n    it = Iterator.from_structure(output_types, output_shapes)\n    next_element = it.get_next()\n    # For example\n    training_init_op = it.make_initializer(training_dataset)\n    validation_init_op = it.make_initializer(validation_dataset)\n    ```\n\n### Other functions\n* unroll the iterator until ```tf.errors.OutOfRangeError``` is raised, or use ```dataset.repeat(num_times)```\n* shuffle the dataset with ```dataset.shuffle(buffer_size)```\n* batch the dataset with ```dataset.batch(batch_size)```, or ```padded_batch``` for sequences. \n\n# Computer vision application\n## Convolution\nReminder on simple convolution:\n* Given an input with a channel size |k| equal to 1, a neuron (one output) of a feature map (a channel, let's say channel 1 of the output) is the result of a filter W1 applied to some location of the single input channel. Then you stride the same filter over the input image and compute another output for the same output channel. Every output channel has its own filter.  \n* Now if the input channel size |k| is greater than 1, for each output channel there are k filters (kernels). Each of these filters is applied to its respective input channel at a given location, and the results are summed (not averaged) to give the value of a neuron. Hence the number of parameters in a convolution is ```|filter_height * filter_width * nb_input_channels * nb_output_channels|```. This is also why it is difficult for the first layer of a convolutional neural network to catch high-level features: the input channel size is usually small, so the value of an output neuron is not computed from many filters. In deeper layers, an output neuron in a given channel is usually computed by summing over many filters, hence each filter can capture a different representation.  
\nImplementation:\n```\n# 5*5 conv, 1 input channel, 32 output channels\nW = tf.Variable(tf.random_normal([5, 5, 1, 32]))\n# dimension of x is [batch_size, 28, 28, 1]\nx = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')\n```\n\n## Transpose convolution\nThere is nothing fancy in the formula of the transpose convolution; the only trick is that the input is transformed (padded) according to the size of the filter and the dimension of the output. Here are nice examples of deconvolutions: [deconvolution without stride](https://i.stack.imgur.com/YyCu2.gif), and [deconvolution with stride](https://i.stack.imgur.com/f2RiP.gif).  \nImplementation:\n```\n# h2 is of shape [self.batch_size, 7, 7, 128]\noutput_shape_h3 = [self.batch_size, 14, 14, 64]\n# filter_height, filter_width, output_channel_size, input_channel_size\nW3 = utils.weight_variable([5, 5, 64, 128], name=\"W3\")\nb3 = utils.bias_variable([64], name=\"b3\")\nh3 = utils.conv2d_transpose(h2, W3, b3, output_shape_h3, strides=[1, 2, 2, 1], padding=\"SAME\")\n```\n\n\n\n# NLP application\n* Look up embeddings in a matrix given ids: ```tf.nn.embedding_lookup(embeddings, mat_ids)```\n\n## RNN, LSTM, and shits\n### Dynamic or static rnn\n* ```tf.nn.dynamic_rnn``` uses a ```tf.while_loop``` to construct the graph dynamically, which allows passing different sentence lengths between batches. Do not use ```static_rnn```.\n\n### Set state for LSTM cell stacked\n1. An LSTM cell state is a tuple containing two tensors (the context, and the hidden state). Let's create a placeholder for both of these tensors:  \n    ```\n\n    # create a (context tensor, hidden tensor) for every layer\n    state_placeholder = tf.placeholder(tf.float32, [num_layers, 2, batch_size, state_size])\n    # unpack them\n    l = tf.unstack(state_placeholder, axis=0)\n    ```\n2. 
Transform them into tuples\n    ```\n    rnn_tuple_state = tuple(\n             [tf.contrib.rnn.LSTMStateTuple(l[idx][0],l[idx][1])\n              for idx in range(num_layers)]\n    )\n    ```\n3. Create the dynamic rnn, and pass it the initial state\n    ```\n    cells = [tf.contrib.rnn.LSTMCell(state_size, state_is_tuple=True) for _ in range(num_layers)]\n    cell = tf.contrib.rnn.MultiRNNCell(cells, state_is_tuple=True)\n\n    outputs, state = tf.nn.dynamic_rnn(cell, series_batch_input, initial_state=rnn_tuple_state)\n    ```\n\n### Stacking recurrent neural network cells\n1. Create the architecture (example for a GRUCell with dropout and residual connections between every stacked cell)\n    ```\n    from tensorflow.contrib.rnn import GRUCell, DropoutWrapper, ResidualWrapper, MultiRNNCell\n\n    num_neurons = 200\n    num_layers = 3\n    # scalar keep probability, fed at run time\n    dropout = tf.placeholder(tf.float32, shape=[])\n    \n    cells = list()\n    for _ in range(num_layers):\n        cell = GRUCell(num_neurons)\n        # you can set input_keep_prob, state_keep_prob or output_keep_prob.\n        # You can also use variational_recurrent, and the same dropout mask will\n        # be applied at each timestep.\n        cell = DropoutWrapper(cell, output_keep_prob=dropout)\n        # You can use ResidualWrapper, which combines the input and the output of the cell\n        cell = ResidualWrapper(cell)\n        cells.append(cell)\n    # Stack all the cells\n    cell = MultiRNNCell(cells)\n    ```\n\n2. Simulate the recurrent network over the time steps of the input with ```dynamic_rnn```:\n    ```\n    output, state = tf.nn.dynamic_rnn(cell, some_variable, dtype=tf.float32)\n    ```\n\n### Variable sequence length input\n* First of all, ```dynamic_rnn()``` returns an output tensor of size ```batch_size x max_length_sentence x hidden_vector```. It contains the hidden state at every timestep. The other output of this function is the final state of the cells. \n\nWhen passing sequences to an RNN, their length may vary. 
Tensorflow wants us to pass into an RNN a tensor of shape ```batch_size x sentence_length x embedding_length```. To support this in our RNN, we first have to create a 3D array where, for each row (every batch element), we pad with zeros after reaching the end of that element's sentence. For example, if the length of the first sentence is 10 and ```sentence_length=20```, then ```tensor[0, 10:, :]``` is zero padded.  \n\n1. It is possible to compute the length of every batch element with this function:  \n    ```\n\n    def length(sequence):\n        # sequence: 3D tensor of shape (batch_size, sequence_length, embedding_size)\n        used = tf.sign(tf.reduce_sum(tf.abs(sequence), reduction_indices=2))\n        length = tf.reduce_sum(used, reduction_indices=1)\n        length = tf.cast(length, tf.int32)\n        return length # vector of size (batch_size) containing sentence lengths\n    ```\n2. Using the length function, we can use  ```dynamic_rnn```  \n    ```\n\n    from tensorflow.contrib.rnn import GRUCell\n\n    max_length = 100\n    embedding_size = 32\n    num_hidden = 120\n\n    sequence = tf.placeholder(tf.float32, [None, max_length, embedding_size])\n    output, state = tf.nn.dynamic_rnn(\n        GRUCell(num_hidden),\n        sequence,\n        dtype=tf.float32,\n        sequence_length=length(sequence),\n    )\n\n    ```\n\nNote: A better solution is to always pass the length of the sequence as an input. This vector can be used to create a mask with ```tf.sequence_mask(sequence_length, maxlen=tf.shape(sequence)[1])```. This mask can be used when computing a loss, to mask values that should not be taken into account.\n\nThere are two main ways to compute a loss for an RNN: either we are interested only in the last output, or in the outputs at every time step. 
Let's define a function for both of them\n\n#### Case 1: Output at each time step\n__Example__: Compute the cross-entropy for batch elements of different lengths (we can't use ```reduce_mean()```)\n\n```\ntargets = tf.placeholder(tf.float32, [batch_size, sequence_length, output_size])\n# targets is padded with zeros in the same way as sequence has been done\ndef cost(output, targets):\n    cross_entropy = targets * tf.log(output)\n    cross_entropy = -tf.reduce_sum(cross_entropy, reduction_indices=2)\n    mask = tf.sign(tf.reduce_max(tf.abs(targets), reduction_indices=2))\n    cross_entropy *= mask\n\n    # Average over the actual length of each sequence\n    cross_entropy = tf.reduce_sum(cross_entropy, reduction_indices=1)\n    cross_entropy /= tf.reduce_sum(mask, reduction_indices=1)\n    return tf.reduce_mean(cross_entropy)\n```\n\n#### Case 2: Output at the last timestep\n__Example__: Get the last output for every batch element:\n```\ndef last_relevant(output, length):\n    batch_size = tf.shape(output)[0]\n    max_length = tf.shape(output)[1]\n    out_size = int(output.get_shape()[2])\n    index = tf.range(0, batch_size) * max_length + (length - 1)\n    flat = tf.reshape(output, [-1, out_size])\n    relevant = tf.gather(flat, index)\n    return relevant\n```\n\n### Bidirectional Recurrent Neural Network\nNot so different from the standard ```dynamic_rnn```: we just need to pass a cell for the forward pass and one for the backward pass, and it will return two outputs and two states, both as tuples.\nExample:\n```\ncell = tf.nn.rnn_cell.LSTMCell(num_units=hidden_size, state_is_tuple=True)\n \noutputs, states  = tf.nn.bidirectional_dynamic_rnn(\n    cell_fw=cell, # same cell for both passes\n    cell_bw=cell,\n    dtype=tf.float64,\n    sequence_length=X_lengths, # not defined in this snippet\n    inputs=X)\noutput_fw, output_bw = outputs\nstates_fw, states_bw = states\n```\n\n\n\n# Higher order operators\n* tf.map_fn() : apply a function to a list of elements. 
This function is quite useful in combination with complex tensorflow operations that operate only on 1D input, such as ```tf.gather()```.\n```\narray = (np.array([1, 2]), np.array([2, 3]))\ntf.map_fn(lambda x: (x[0] + x[1], x[0] * x[1]), array)\n# =\u003e return ((3, 5), (2, 6))\n```\n* tf.foldl(): accumulate and apply a function on a sequence (```tf.foldr``` does the same starting from the last element).\n```\narray = np.array([1, 3, 4, 3, 2, 4])\ntf.foldl(lambda a, x: a + x, array) =\u003e 17\ntf.foldl(lambda a, x: a + x, array, initializer=3) =\u003e 20\ntf.foldr(lambda a, x: a - x,  array) =\u003e -9\n```\n* tf.scan(): like a fold, but returns every intermediate value of the accumulator\n```\ntf.scan(lambda loop_element, range_element: function(), elems=all_elems_to_iterate_over,\n        initializer=initializer)\n# function() should return a tensor of the same shape as initializer\n# loop_element has the same shape as initializer\n# range_element is iterated over and is not always necessary (ex: np.arange(10))\n# scan returns the stacked tensor of all values of loop_element\n```\n\n* tf.while_loop(condition, body, init)\n```\ninit = (i, (j,k))\ncondition = lambda i, _: i \u003c 10\nbody = lambda i, jk: (i + 1, (jk[0] - jk[1], jk[0] + jk[1]))\n(i_final, jk_final) = tf.while_loop(condition, body, init)\n```\n\n# Debugging and Tracing\n## Debugging\nDebugging tensorflow variables is becoming easier with the **working** tensorflow Debugger. I found it useful (in the sense that it is better than nothing), but I always spend hours finding the correct variables in the list of variable names. Here is how to activate the tensorflow debugger:   \n```\nfrom tensorflow.python import debug as tf_debug\n\nsess = tf_debug.LocalCLIDebugWrapperSession(sess)\n```\nYou can create filters. The debugger can then run until a filter catches a value. 
Here is an example to catch nan values:\n```\nsess.add_tensor_filter(\"has_inf_or_nan\", tf_debug.has_inf_or_nan)\n\n# where the filter is defined this way\ndef has_inf_or_nan(datum, tensor):\n  return np.any(np.isnan(tensor)) or np.any(np.isinf(tensor))\n```\n\nIn practice, a command-line interface opens at the first ```sess.run``` call.\nHere is a non-exhaustive list of useful commands:\n* Page down\\up to move in the page (clicking in the terminal is also available)\n* Print the value of a tensor: ```pt hidden/Relu:0```\n* Print a sub-array ```pt hidden/Relu:0[0:50,:]```\n* Print a sub-array and highlight specific elements in a given range ```pt hidden/Relu:0[0:10,:] -r [1,inf]```\n* Navigate to a given index of the tensor being displayed ```@[10, 0]```\n* Search for a regex pattern such as ```/inf``` or ```/nan```\n* Display information about the node attributes ```ni -a hidden/Relu:0```.\n* Display information about the current run ```run_info``` or ```ri```\n* ```help``` command\n* Run a session until a filter catches something ```run -f filter_name``` (Note that the filter name is the name passed to add_tensor_filter).\n* Run a session for a number of steps: ```run -t 10```\n\n## Tracing\nIt is possible to trace one call of ```sess.run``` with minimal code modification.  \n[cupti64_80.dll error](https://github.com/tensorflow/tensorflow/issues/6235)  \n\n```\nrun_metadata = tf.RunMetadata()\nsess.run(op,\n         feed_dict,\n         options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE),\n         run_metadata=run_metadata)\n# run_metadata contains StepStats protobuf grouped by device\n\nfrom tensorflow.python.client import timeline\ntrace = timeline.Timeline(step_stats=run_metadata.step_stats)\n\nwith open('timeline.ctf.json', 'w') as trace_file:\n    trace_file.write(trace.generate_chrome_trace_format())\n\n# open chrome, chrome://tracing\n# load the file :)\n```\n\n## Debugging function\n### tf.Print\n```python\n# examples\nout = fully_connected(out, num_outputs)\n# tf.Print() is an Identity operation. 
\nout = tf.Print(out, \n               list_of_tensor_to_print, \n               message=str_message, \n               first_n=first_nb_times_to_log, \n               summarize=nb_element_to_print)\n\n```\n\n### tf.Assert\nAn Assert operation only does something if it is actually run, so it should always be attached through an explicit dependency. One way to do this is to create a collection of assertions, and group them before passing them as a run operation.\n```python\ntf.add_to_collection('Assertions',\n         tf.Assert(tf.reduce_all(whatever_condition), \n                   [tensor_to_print_if_condition], \n                   name=...))\n\n# Then group assertion\nassert_op = tf.group(*tf.get_collection('Assertions'))\n... = session.run([train_op, assert_op], feed_dict={...})\n```\n\n### Python trick\n* ```from IPython import embed; embed()```: Open an IPython shell in the current context. It stops the execution.\n\n# Miscellaneous\n* Use any Numpy operations in the graph. Note that it does not support model serialization\n    ```\n    def function(tensor):\n        return np.repeat(tensor,2, axis=0)\n    inp = tf.placeholder(tf.float32, [None])\n    op = tf.py_func(function, [inp], tf.float32)\n\n    sess = tf.InteractiveSession()\n    print(np.repeat([4, 3, 4, 5],2, axis=0))\n    print(op.eval({inp: [4, 3, 4, 5]}))\n    ```\n* ```tf.squeeze(dens)```: remove all dimensions of length 1\n* ```tf.sign(var)```: return -1, 0, or 1 depending on the sign of var.\n* ```tf.reduce_max(3D_tensor, reduction_indices=2)```: return a 2D tensor, where only the max element along the third dimension is kept.\n* ```tf.unstack(value, axis=0)```: If given an array of shape (A, B, C, D), and an axis=2, it will return a list of |C| tensors of shape (A, B, D).\n* ```tf.nn.moments(x, axes)```: return the mean and variance of x computed along the given axes\n* ```tf.nn.xw_plus_b(x, w, b)```: explicit\n* tf.global_variables(): return every variable that is shared across machines in a distributed environment. 
Each time a Variable() constructor is called, it adds a new variable to the graph collection\n* ```tf.convert_to_tensor(args, dtype)```, for example ```tf.convert_to_tensor([[1, 2],[2, 3]], dtype=tf.float32)```: convert a numpy array, a python list or a scalar to a Tensor.\n* ```tf.placeholder_with_default(default_output, shape)```: One can see a placeholder as an element in the graph that must be fed a value through the feed dictionary; however, it is possible to define a placeholder that takes a default value.\n* ```tf.variable_scope(name_or_scope, default_name)```: if name_or_scope is None, then scope.name is default_name.\n* ```tf.get_default_graph().get_operations()```: return all operations in the graph as a list of ```tf.Operation```. Operations can then be filtered by scope with the python string method ```startswith```.\n* ```tf.expand_dims([1, 2], axis=1)```: return a tensor with a new dimension inserted at the given axis. Here the shape goes from (2) -\u003e (2, 1)\n* ```tf.pad(image, [[16, 16], [16, 16], [0, 0]])```: pad a tensor. Here the tensor is a 3D tensor of shape (5, 4, 3) for example. Afterwards it will be of size (16 + 5 + 16, 16 + 4 + 16, 0 + 3 + 0), where zeros are added _before_ and _after_ the current values in each dimension.\n* ```tf.group(op_1, op_2, op_3)``` can be passed to sess.run and it will run all the operations (but it will not return any output, it only computes the operations) \n* ```tf.nn.sparse_softmax_cross_entropy_with_logits(labels, logits)``` expects labels to be int32 of size (batchsize), where every element is an integer from 0 to nbclasses - 1. logits should be a float32 tensor of size (batchsize, nbclasses) whose values are not probabilities (logit form, before softmax)\n* ```tensor.get_shape().assert_is_compatible_with(shape=)```: Check that the shapes match\n* ```tf.cond(pred, fn1, fn2)```: Given a condition, the result of fn1 (if true) or fn2 (if false), both callables, is returned. 
Here is an example that returns an rgb image if it isn't already one: \n    ```\n    image = tf.cond(pred=tf.equal(tf.shape(image)[2], 3), fn1=lambda: image, fn2=lambda: tf.image.grayscale_to_rgb(image))\n    ```\n* FLAGS (```tf.app.flags.FLAGS```) is an internal mechanism that provides the same functionality as argparse\n* Create an operator to run in a sess that will clip values\n  ```\n  clip_discriminator_var_op = [var.assign(tf.clip_by_value(var, clip_value_min, clip_value_max)) for\n                                         var in list_tf_variables]\n  ```\n\n### Tensor operations\n* ```tf.slice(tensor, begin_tensor, slice_tensor)```. Extract a slice of a tensor. For example, if tensor has 3 dimensions, begin_tensor[i] represents the offset at which the slice starts in dimension i, and slice_tensor[i] represents the number of values to take in dimension i  \n* ```tf.split(tensor, numb_splits, axis)```. Split a tensor along the axis into numb_splits. Return numb_splits tensors.\n* ```tf.tile(tensor, multiple)```. Repeat a tensor in dimension i by multiple[i]\n* ```tf.dynamic_partition(tensor, partitions, num_partitions)```: Split a tensor into multiple tensors given a partitions vector. If partitions = [1, 0, 0, 1, 1], then the first and the last two elements will form a separate tensor from the others. Return a list of tensors.\n* ```tf.one_hot(tensor, depth, on_value=1, off_value=0, axis=-1)```: replace all indices by a one hot tensor. If tensor is of shape ```batch_size x nb_indices```, a new dim of size ```depth``` is added in the ```axis``` dimension\n\n# Tensorflow fold\nI've used it once, not useful :/ \nAll tensorflow_fold functions to treat sequences:\n* td.Map(f): Takes a sequence as input, applies block f to every element in the sequence, and produces a sequence as output.\n* td.Fold(f, z): Takes a sequence as input, and performs a left-fold, using the output of block z as the first element.\n* td.RNN(c): A recurrent neural network, which is a combination of Map and Fold. 
Takes an initial state and input sequence, uses the rnn-cell c to produce new states and outputs from previous states and inputs, and returns a final state and output sequence.\n* td.Reduce(f): Takes a sequence as input, and reduces it to a single value by applying f to elements pair-wise, essentially executing a binary expression tree with f.\n* td.Zip(): Takes a tuple of sequences as inputs, and produces a sequence of tuples as output.\n* td.Broadcast(a): Takes the output of block a, and turns it into an infinitely repeating sequence. Typically used in conjunction with Zip and Map, to process each element of a sequence with a function that uses a.\n\n# Tensorflow Estimator\n### Create an input function\n```\ndef get_input_fn():\n    def input_fn():\n        # This function must be able to be used as a generator function which will be fed\n        # to the tensorflow queues\n        return features, labels # both are tensors\n    return input_fn\n```\n\n\n### Create a model function\nHere is a simple example of how to create a model function\n```\ndef model_fn(features, targets, mode, params):\n    \"\"\"\n    features: All the feature vectors. If the input_fn returns a dictionary of features, this will be a dictionary\n    targets: All the labels, same as features, but can be None if no labels are needed\n    mode: ModeKeys. This is useful to decide how to build the model given the mode. If training you want to return a training operation, while in evaluation mode, you only need logits for example \n    params: a dictionary of params (Optional)\n    \"\"\"\n\n    # 1. build NN out of the inputs which are contained in features and targets\n    if mode == ModeKeys.TRAIN:\n        predictions = None\n        train_op, loss, _ = CreateNeuralNetwork(....) 
# define it how you want\n    elif mode == ModeKeys.EVAL:\n        train_op = None\n        _, loss, predictions = CreateNeuralNetwork(..., eval_mode=True)\n\n\n    return tensorflow.python.estimator.model_fn.EstimatorSpec(\n        mode=mode,\n        predictions=predictions, # will be returned if you call estimator.predict()\n        loss=loss, \n        train_op=train_op)  # will be used if train_op is not None\n```\n\n### Create an estimator\n```python\nnn = tf.contrib.learn.Estimator(model_fn = model_fn,\n                                params=some_parameters,\n                                model_dir=where_to_log,\n                                config=tf.contrib.learn.RunConfig(save_checkpoints_secs=10)) # In RunConfig, you define when to save the model, where...\n```    \n\n### Add values to monitor to the Estimator\nYou can attach ```hooks``` to an Estimator that will be used when the Estimator is training/predicting/evaluating. For example, when training, you might want to monitor some metrics, such as accuracy and loss.  Here is a link to the most [common hooks](https://github.com/tensorflow/tensorflow/blob/r1.2/tensorflow/python/training/basic_session_run_hooks.py), but you can define your own hooks.\n\n### Train the estimator\n```\nnn.train(input_fn, hooks=[logging_hook], steps=1000)\n```\n\n### Evaluate a model\n```\nnn.evaluate(input_fn, hooks=[logging_hook], steps=1000)\n```\n\n### Using ```tf.contrib.learn.Experiment```\nExperiment is a wrapper around ```Estimator``` that allows you to simultaneously train and evaluate, with minimal extra code. 
\nHere is an example:\n```\ndef train_and_evaluate(self, train_params, validation_params, extra_hooks=None):\n        self.training_params = train_params\n\n        input_fn = get_training_input_fn() # Return an input function for training\n        validation_input_fn = get_input_fn() # Return an input function for validation\n\n        self.experiment = tf.contrib.learn.Experiment(estimator=self.estimator, # an estimator\n                                                      train_input_fn=input_fn, \n                                                      eval_input_fn=validation_input_fn,\n                                                      train_steps=train_params.get(\"steps\", None),\n                                                      eval_steps=1,\n                                                      train_monitors=extra_hooks,\n                                                      train_steps_per_iteration=100) # 100 iterations of training before evaluation is called\n\n        self.experiment.train_and_evaluate()\n``` \n\n### Conclusion\nIn practice, I find ```Estimator``` very useful as it abstracts a lot of boilerplate code (saving, restoring, monitoring). It also forces you to decouple the code that creates the model from the code that creates its input.\n\n# Sonnet\nSonnet is one of the best libraries built for Tensorflow. It allows you to group parts of a Tensorflow graph as modules. You don't have to worry about scope. At the end of the journey, you write better code, less code, and reusable code. 
\nHere is an [example](https://github.com/louishenrifranc/attention/blob/master/attention/modules/encoders/encoder_block.py) of a module.\n\n* Everything should inherit from ```sonnet.AbstractModule```.\n* The main idea is to have a module that can be called multiple times, while its variables are created only once.\n\n### Already defined module\n* ```Linear(output_size, initializers={'w': ..., 'b': ...})```\n* ```SelectInput(idx=[1, 0])(input0, input1) --\u003e (input1, input0)```\n\n* ```AttentiveRead```: See here an example:  \n```\nlogit_mod = some_func # produces the logit corresponding to an attention vector slot compatibility\na = AttentiveRead(attention_logit_mod=logit_mod)\n_build(memory : tf.Tensor([batch_size, num_att_vec, attention_dim]),\n       query: tf.Tensor([batch_size, vector_to_attend_size]),\n       mask : tf.Tensor([batch_size, num_att_vec]))\n--\u003e return [batch_size, attention_dim]: computed weighted sum,\n           [batch_size, num_att_vec]: softmax weights\n           [batch_size, num_att_vec]: unnormalized weights\n```\n* ```LSTM(hidden_size)```. 
It is also possible to apply batch norm on each input\n\n\n### Define your own module\n* Inherit ```snt.AbstractModule```, and call ```super(BaseClass, self).__init__(name=name_module)```\n* Implement ```_build()```, and inside always create variables with ```tf.get_variable()```\n* If you want to create variables in the scope of the module outside of ```_build()```, do it inside ```with self.enter_variable_scope()```\n\n### Define your recurrent module\n* Inherit ```snt.RNNCore```\n* Implement ```_build()``` which computes one timestep\n* Implement ```state_size``` and ```output_size```, which are properties of the cell\n\n### Share variable scope between multiple functions\nExample:\n```\nclass GAN(snt.AbstractModule):\n    ...\n\n    def _build(self, input):\n       fake = self.generator(input)\n       return self.discriminator(fake)\n\n    @snt.experimental.reuse_vars\n    def discriminator(self, sample):\n        ...\n\n\n    @snt.experimental.reuse_vars\n    def generator(self, sample):\n        ...\n\n\ngan = GAN()\nfake_disc_out = gan(noise)\n# shared variable even if not in build and not enter_variable_scope\ntrue_disc_out = gan.discriminator(true)\n```\n\n### Notes\n* Get variables of the module: ```self.get_variables()```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvarshneydevansh%2Ftfcheatsheet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvarshneydevansh%2Ftfcheatsheet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvarshneydevansh%2Ftfcheatsheet/lists"}