{"id":15934633,"url":"https://github.com/datitran/ml-berlin-blog","last_synced_at":"2026-01-12T07:28:17.932Z","repository":{"id":107997776,"uuid":"91670359","full_name":"datitran/ml-berlin-blog","owner":"datitran","description":null,"archived":false,"fork":false,"pushed_at":"2017-05-19T11:30:22.000Z","size":9,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-10-29T08:04:42.733Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/datitran.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-05-18T08:46:33.000Z","updated_at":"2020-03-24T11:54:19.000Z","dependencies_parsed_at":null,"dependency_job_id":"45859f07-e848-4f18-b925-68ac34b884d5","html_url":"https://github.com/datitran/ml-berlin-blog","commit_stats":{"total_commits":3,"total_committers":1,"mean_commits":3.0,"dds":0.0,"last_synced_commit":"ae6ff81bcaea9328c89f6feae78253e456c93e59"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/datitran/ml-berlin-blog","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datitran%2Fml-berlin-blog","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datitran%2Fml-berlin-blog/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datitran%2Fml-berlin-blog/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datitran%2Fml-berlin-blog/manifests","o
wner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/datitran","download_url":"https://codeload.github.com/datitran/ml-berlin-blog/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datitran%2Fml-berlin-blog/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28336514,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-12T06:09:07.588Z","status":"ssl_error","status_checked_at":"2026-01-12T06:05:18.301Z","response_time":98,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-07T03:20:25.750Z","updated_at":"2026-01-12T07:28:17.916Z","avatar_url":"https://github.com/datitran.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Keras or TensorFlow? - An Incomplete Introductory Guide In Both APIs\n\nKeras or TensorFlow? Which one should I use? This blog post describes the differences between the [Keras API](https://keras.io/) against the [TensorFlow (TF) API](https://www.tensorflow.org/) for Python, my language of choice for machine learning. In particular, I will focus on the most important features that are needed to get started into deep learning. 
By the end, you should hopefully be able to decide for yourself which one to use.\n\nMost of the features will be demonstrated with the help of a simple linear regression problem. Please note that I will not be able to cover every topic in detail, and some of the features that you consider important may be missing. This post is deliberately kept _simplistic_ and should serve as an overview. I will give pointers to more advanced/additional topics - that’s why it is incomplete. Moreover, this is not a typical tool versus another tool blog post. I don’t like to compare oranges against apples if you understand what I mean ;)\n\n## General\n\n**Keras:**\n\n* Keras is a lightweight high-level wrapper around numerical computation libraries ([called backend](https://keras.io/backend/)) dedicated to deep learning such as TensorFlow and Theano\n* It is open source and written in Python (supports 2.7-3.5)\n* It was originally created by François Chollet ([@fchollet](https://twitter.com/fchollet))\n* Keras is now integrated into TensorFlow\n\n\n**TensorFlow:**\n\n* TensorFlow is an open source software library for numerical computation that uses data flow graphs (many machine learning models, in particular neural networks, can be visualized via directed graphs)\n* Competitors are, for example, Torch, Theano, and Caffe\n* TF was originally developed by Google’s Brain team\n* It supports Python, C++, Haskell, Java and Go\n* In TensorFlow there are two important concepts:\n    1. Everything is a tensor, which is an n-dimensional array\n        * 0-d tensor: scalar\n        * 1-d tensor: vector\n        * 2-d tensor: matrix\n        * ...\n    2. Lazy evaluation of the computational graph (a series of TF operations arranged into a graph of nodes), i.e. defining the computation is separated from executing it\n* A TF core program consists of two sections:\n    1. Building the computational graph\n    2. 
Running the computational graph\n\n## Installation\n\n**Keras:**\n\n```sh\npip install keras\n```\n\n**TensorFlow:**\n\nCPU:\n```sh\npip install tensorflow\n```\n\nGPU:\n```sh\npip install tensorflow-gpu\n```\n\n\n* For GPU support [CUDA](https://developer.nvidia.com/cuda-toolkit) and [cuDNN](https://developer.nvidia.com/cudnn) are needed ([check out the official TF guide](https://www.tensorflow.org/install/) for more information)\n\n## Import\n\n**Keras:**\n\n```python\nimport keras\n```\n\n**TensorFlow:**\n\n```python\nimport tensorflow as tf\n```\n\n## Building the Model\n\n**Keras:**\n\nWe can easily create anything from linear to complex models through the [`Sequential`](https://keras.io/models/sequential/) API, which stacks up layers in a linear fashion. For linear regression, we can write this as follows:\n\n```python\nfrom keras.models import Sequential\nfrom keras.layers.core import Dense\n\nmodel = Sequential()\nmodel.add(Dense(units=1, activation=\"linear\", input_dim=1))\n```\n\n* Keras includes many commonly used layers:\n    - Regular `dense` layers for feed-forward neural networks\n    - `Dropout`, `normalization` and `noise` layers to prevent overfitting and improve learning\n    - Common `convolutional` and `pooling` layers (max and average) for CNNs\n    - `Flatten` layers to add fully connected layers after convolutional nets\n    - `Embedding` layers for Word2Vec problems\n    - `Recurrent` layers (simpleRNN, GRU and LSTM) for RNNs\n    - `Merge` layers to combine more than one neural net\n    - Many more… you can also [write your own Keras layers](https://keras.io/layers/writing-your-own-keras-layers/)\n* Many common activation functions are available like `relu`, `tanh` and `sigmoid`, depending on the problem that you would like to solve (read [“Yes you should understand backprop” by Andrej Karpathy](https://medium.com/@karpathy/yes-you-should-understand-backprop-e2f06eab496b) if you want to understand the effect of backpropagation on some activation functions)\n* 
The activation function can also be passed through an `Activation` layer instead of the `activation` argument\n\nAlternatively, if you prefer one-liners, we could also have written:\n\n```python\nmodel = Sequential([Dense(units=1, activation=\"linear\", input_dim=1)])\n```\n\nOr we could have used [Keras’s functional API](https://keras.io/getting-started/functional-api-guide/):\n\n```python\nfrom keras.models import Model\nfrom keras.layers import Dense, Input\n\nX = Input(shape=(1,))\nY = Dense(units=1, activation=\"linear\")(X)\n\nmodel = Model(inputs=X, outputs=Y)\n```\n\nThen we need to configure the learning settings, which is done via the `compile` step. For linear regression it makes sense to use [mean squared error](https://en.wikipedia.org/wiki/Mean_squared_error) to evaluate the quality of our estimated model, `loss=\"mean_squared_error\"` (other [loss functions](https://keras.io/losses/) like `cross-entropy` are also available out-of-the-box).\n\nWe will also use the standard settings for [stochastic gradient descent (SGD)](https://en.wikipedia.org/wiki/Stochastic_gradient_descent), `optimizer=\"sgd\"`:\n\n```python\nmodel.compile(loss=\"mean_squared_error\", optimizer=\"sgd\")\n```\n\nIf you want different optimizer settings, pass an optimizer instance instead:\n\n```python\nfrom keras.optimizers import SGD\nsgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)\nmodel.compile(loss=\"mean_squared_error\", optimizer=sgd)\n```\n\nKeras supports [most common optimizers](https://keras.io/optimizers/) like RMSprop, Adagrad and many others. An excellent overview of different optimization algorithms and what effect they have while training a neural network can be found in this [blog article by Sebastian Ruder](http://sebastianruder.com/optimizing-gradient-descent/index.html#gradientdescentoptimizationalgorithms).\n\n**TensorFlow:**\n\nIn TF, we first need to build up the computational graph. 
We define the inputs and the trainable parameters first:\n\n```python\nX = tf.placeholder(tf.float32, name=\"X\")\nY = tf.placeholder(tf.float32, name=\"Y\")\n\nW = tf.Variable(tf.random_normal(shape=[]), name=\"weight\")\nb = tf.Variable(tf.random_normal(shape=[]), name=\"bias\")\n```\n\n* A [placeholder](https://www.tensorflow.org/api_docs/python/tf/placeholder) is a tensor where values are provided later\n* [Variables](https://www.tensorflow.org/api_docs/python/tf/Variable) are tensors which allow us to add trainable parameters to a graph\n* There are also [constants, sequences and random tensors](https://www.tensorflow.org/api_guides/python/constant_op)\n\nThen we need to define the linear model, the loss function and the optimizer. We will use the same loss (`mse`) and optimizer (`sgd`) as in the Keras case.\n\n```python\nlearning_rate = 0.01  # step size for SGD; must exist before the optimizer is built\n\nY_predicted = tf.add(tf.multiply(X, W), b)\ncost = tf.losses.mean_squared_error(labels=Y, predictions=Y_predicted)\noptimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)\n```\n\n* The way we define the type/number of layers along with their activations is very different in TF than in Keras - it is much more explicit\n* It is good practice to wrap all the placeholders, variables, constants, model definitions etc. in a `Graph` object, especially if you want to use multiple graphs in the same process\n```python\ng = tf.Graph()\nwith g.as_default():\n    # Define the computational graph\n    ...\n```\n* Like Keras, TF also ships many commonly used layers out of the box, see [`tf.layers`](https://www.tensorflow.org/api_docs/python/tf/layers) and [`tf.contrib.layers`](https://www.tensorflow.org/api_docs/python/tf/contrib/layers)\n* Like Keras, TF also supports a wide range of [loss functions](https://www.tensorflow.org/api_docs/python/tf/losses) and [optimizers](https://www.tensorflow.org/api_guides/python/train#Optimizers)\n\n## Training the Model\n\n**Keras:**\n\n```python\nmodel.fit(x, y)\n```\n\nIn the standard output we get a progress bar which shows 
the training progress (full gradient update), the loss at each epoch and the epoch itself:\n\n```sh\nEpoch 1/10\n100/100 [==============================] - 0s - loss: 1.1310\nEpoch 2/10\n100/100 [==============================] - 0s - loss: 0.5461\nEpoch 3/10\n100/100 [==============================] - 0s - loss: 0.2656\n...\n```\n\nWe can also control the size of the batch per gradient update (common sizes are 32, 64, 128, 256...) through `batch_size`.\n\n```python\nmodel.fit(x, y, batch_size=1)\n```\n\nWe can also easily evaluate our model with `validation_split`, which holds back a given fraction (between 0 and 1) of the data for validation.\n\n```python\nmodel.fit(x, y, batch_size=1, validation_split=0.2)\n\nTrain on 80 samples, validate on 20 samples\nEpoch 1/10\n80/80 [==============================] - 0s - loss: 0.9673 - val_loss: 0.6618\nEpoch 2/10\n80/80 [==============================] - 0s - loss: 0.5153 - val_loss: 0.3498\nEpoch 3/10\n80/80 [==============================] - 0s - loss: 0.2726 - val_loss: 0.1853\n...\n```\n\nFinally, we can also use [`callbacks`](https://keras.io/callbacks/), which are a set of functions that can be applied during training. Important ones are:\n\n* `keras.callbacks.History()` - Recording the loss history, `loss` and `val_loss` if `validation_split` is set\n* `keras.callbacks.ModelCheckpoint()` - Saving the model after each epoch\n* `keras.callbacks.EarlyStopping()` - Stopping training early depending on the monitored metric such as `val_loss`\n* `keras.callbacks.TensorBoard()` - Storing the log for TensorBoard\n\n**TensorFlow:**\n\nFor training the model we need to run the computational graph via a [`Session`](https://www.tensorflow.org/api_docs/python/tf/Session) object. 
Below is the simplest way to do this, which prints the training loss at the end of each epoch:\n\n```python\n# Parameters\nepochs = 10\nlearning_rate = 0.01\n\n# Start the session and initialize the variables\nsess = tf.Session()\nsess.run(tf.global_variables_initializer()) # Variables need to be initialized\n\n# Train the model for n epochs\nfor epoch in range(epochs):\n    print(\"Epoch: {}/{}\".format(epoch+1, epochs))\n    for i, j in zip(x, y): # batch_size=1\n        _, loss = sess.run([optimizer, cost], feed_dict={X: i, Y: j})\n    print(\"loss : {}\".format(loss)) # loss of the last sample in this epoch\n\nprint(sess.run([W, b]))\nsess.close()\n```\n\n* Functionalities like early stopping, a Keras-like progress bar for training or even a validation split have to be implemented by hand; some features like early stopping are available via `tf.contrib`, but that is experimental code\n* It is also good practice to open and close the `Session` with a `with` statement\n```python\nwith tf.Session(graph=g) as sess:\n    # Run the computational graph\n    ...\n```\n\n### Reproducibility\n\n* To get reproducible results during the training process in TensorFlow alone, it makes sense to fix the [graph-level random seed](https://www.tensorflow.org/api_docs/python/tf/set_random_seed) by setting `tf.set_random_seed(...)` for all operations, or to set the seed for a specific operation (please note that the random seed affects the active graph only)\n* For Keras while using TF as backend, we need to fix both numpy’s and TF’s seed:\n\n```python\nimport numpy as np\nnp.random.seed(...)\nimport tensorflow as tf\ntf.set_random_seed(...)\n```\n\n## Saving the Model\n\n**Keras:**\n\nSave the model with `model.save()` and reload the model with `load_model()`:\n\n```python\nfrom keras.models import load_model\n\n# save model\nmodel.save(\"./model.h5\")\n\n# load model\nmodel_loaded = load_model(\"./model.h5\")\n```\n\n* Run `pip install h5py` before saving the model, as Keras uses [HDF5](https://support.hdfgroup.org/HDF5/) to store its models 
and this doesn’t come with installing Keras\n* You can also [save the model architecture (JSON/YAML) and model weights separately](https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model)\n\n**TensorFlow:**\n\nTensorFlow models can be saved via the `tf.train.Saver()` object. Usually what you do is:\n\n```python\ng = tf.Graph()\nwith g.as_default():\n    # Define the computational graph\n    ...\n\n    # Initialize the variables\n    init = tf.global_variables_initializer()\n\n    # Add saver to store the model\n    saver = tf.train.Saver()\n\nwith tf.Session(graph=g) as sess:\n    sess.run(init)\n    # Run the computational graph\n    ...\n\n    # Save the session in a file\n    save_path = saver.save(sess, \"./model.ckpt\")\n```\n\nRestoring the model:\n\n```python\ng = tf.Graph()\nwith g.as_default():\n    # Recreate the computational graph\n    ...\n\n    # Add saver to load the model\n    saver = tf.train.Saver()\n\nwith tf.Session(graph=g) as sess:\n    # Restore the model\n    saver.restore(sess, \"./model.ckpt\")\n```\n\n* TF’s `Saver` actually saves an [intermediate state (called checkpoint)](https://www.tensorflow.org/programmers_guide/meta_graph) of the trained model (weights, the graph and its metadata for different timesteps)\n* Have a look at [this StackOverFlow post](http://stackoverflow.com/questions/33759623/tensorflow-how-to-save-restore-a-model) if you want to restore your model without defining the graph again\n* TF uses [Protocol Buffers](https://developers.google.com/protocol-buffers/?hl=en) (`protobuf`) to [save all its files to disk](https://www.tensorflow.org/extend/tool_developers/)\n* [`freeze_graph`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py) can be used to save the model into a single file - this is important when serving a model in production as we don’t need any special metadata files along with the model or if you want to port the models to other languages such as Java, C++ 
etc… (read [Morgan Giraud’s post](https://blog.metaflow.fr/tensorflow-how-to-freeze-a-model-and-serve-it-with-a-python-api-d4f3596b3adc) for an excellent tutorial on how to freeze a TF model and then serve it as python API with flask)\n\n## Model Information\n\n**Keras:**\n\nA useful feature is `model.summary()` which shows the number of used layers, output shape and number of trainable parameters (alternatively `model.get_config()` can be used to get information of the model as well).\n\n```python\nmodel.summary()\n_________________________________________________________________\nLayer (type)                 Output Shape              Param #\n=================================================================\ndense_9 (Dense)              (None, 1)                 2\n=================================================================\nTotal params: 2\nTrainable params: 2\nNon-trainable params: 0\n_________________________________________________________________\n```\n\n* After the model is trained it is also useful to see the model weights `model.get_weights()`\n* Keras also provides tools to visualize your models through [`keras.utils.vis_utils`](https://keras.io/visualization/) (Note: [`pydot`](https://github.com/erocarrera/pydot) is needed `pip install pydot`)\n* Through `callback`, we can also use Tensorboard\n\n**TensorFlow:**\n\nTF includes [Tensorboard](https://www.tensorflow.org/get_started/summaries_and_tensorboard) - a visualization tool that can be used to better understand, debug and optimize TF programs. 
Here is a simple example just to record the model graph:\n\n```python\ng = tf.Graph()\nwith g.as_default():\n    # Define the computational graph\n    ...\n\nwith tf.Session(graph=g) as sess:\n    writer = tf.summary.FileWriter(\"./log_folder/\", sess.graph)\n    # Run the computational graph\n    ...\n\n    writer.close()\n```\n\n* It is useful to use the [`summary`](https://www.tensorflow.org/api_guides/python/summary) operations to get more explicit information about the model:\n    - Use the [`tf.summary.scalar`](https://www.tensorflow.org/api_docs/python/tf/summary/scalar) operations to record for example the learning rate and loss\n    - To visualize the distribution of weights or bias, you could use [`tf.summary.histogram`](https://www.tensorflow.org/api_docs/python/tf/summary/histogram)\n* If the neural network is large, i.e. has many nodes, it would make sense to organize logically related operations into groups using [`tf.name_scope`](https://www.tensorflow.org/api_docs/python/tf/name_scope)\n* A more advanced example can be [found on the TF page](https://www.tensorflow.org/get_started/summaries_and_tensorboard)\n\nTo run TensorBoard, use this command:\n\n```sh\npython -m tensorflow.tensorboard --logdir=\"./log_folder/\"\n```\n\nOnce it is running, go to your web browser (`localhost:6006`) to view TensorBoard.\n\n## Prediction\n\n**Keras:**\n\n```python\nmodel.predict(x_test)\n```\n\nWe can use `model.predict(x_test)` to generate predictions (real values in terms of regression and probabilities for classification) or we could use `model.predict_classes(x_test)` to get the class in a classification problem directly.\n\n**TensorFlow:**\n\nSimilar to saving the model, prediction happens in the `session`:\n\n```python\ng = tf.Graph()\nwith g.as_default():\n    # Define the computational graph\n    ...\n\n    # Placeholders \u0026 variables\n    ...\n\n    # Linear model\n    Y_predicted = tf.add(tf.multiply(X, W), b)\n\n    # Cost function and optimizer\n    
...\n\nwith tf.Session(graph=g) as sess:\n    # Run the computational graph\n    ...\n\n    # Train the model\n    ...\n\n    # Make prediction\n    prediction = sess.run(Y_predicted, feed_dict={X: x_test})\n    print(prediction)\n```\n\nFor classification problems in TF, we need to take the `arg_max` of the `tf.nn.softmax(Y_predicted)` function to get the predicted classes.\n\n## Some Other Model Examples\n\n### Logistic Regression\n\n**Keras:**\n```python\nmodel = Sequential()\nmodel.add(Dense(num_classes, activation=\"softmax\", input_shape=(num_features,)))\nmodel.compile(loss=\"categorical_crossentropy\", optimizer=\"sgd\", metrics=[\"accuracy\"])\n```\n\n**TensorFlow:**\n```python\ng = tf.Graph()\nwith g.as_default():\n    X = tf.placeholder(tf.float32, shape=[None, num_features])\n    Y = tf.placeholder(tf.float32, shape=[None, num_classes])\n    W = tf.Variable(tf.zeros([num_features, num_classes]))\n    b = tf.Variable(tf.zeros([num_classes]))\n\n    Y_predicted = tf.add(tf.matmul(X, W),  b)\n\n    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=Y_predicted))\n    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)\n\n    correct_prediction = tf.equal(tf.argmax(Y, 1), tf.argmax(Y_predicted, 1))\n    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))\n```\n\n### Multilayer Perceptron (MLP)\n\n**Keras:**\n```python\nmodel = Sequential()\nmodel.add(Dense(64, activation=\"relu\", input_shape=(num_features,)))\nmodel.add(Dropout(0.5))\nmodel.add(Dense(32, activation=\"relu\"))\nmodel.add(Dropout(0.5))\nmodel.add(Dense(num_classes, activation=\"softmax\"))\nmodel.compile(loss=\"categorical_crossentropy\", optimizer=\"sgd\", metrics=[\"accuracy\"])\n```\n\n**TensorFlow:**\n```python\ng = tf.Graph()\nwith g.as_default():\n    def init_weights(shape):\n        weights = tf.random_normal(shape, stddev=0.1)\n        return tf.Variable(weights)\n\n    X = tf.placeholder(tf.float32, shape=[None, 
num_features])\n    Y = tf.placeholder(tf.float32, shape=[None, num_classes])\n    w_1 = init_weights((num_features, 64))\n    w_2 = init_weights((64, 32))\n    w_3 = init_weights((32, num_classes))\n    keep_prob = tf.constant(0.5, tf.float32)\n\n    def mlp(X, w_1, w_2, w_3):\n        layer_1 = tf.nn.relu(tf.matmul(X, w_1))\n        layer_1_drop = tf.nn.dropout(layer_1, keep_prob)\n        layer_2 = tf.nn.relu(tf.matmul(layer_1_drop, w_2))\n        layer_2_drop = tf.nn.dropout(layer_2, keep_prob)\n        out_layer = tf.matmul(layer_2_drop, w_3)\n        return out_layer\n\n    Y_predicted = mlp(X, w_1, w_2, w_3)\n\n    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=Y_predicted))\n    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)\n\n    correct_prediction = tf.equal(tf.argmax(Y, 1), tf.argmax(Y_predicted, 1))\n    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))\n```\n\n### Convolutional Neural Network (CNN)\n\n**Keras:**\n```python\nmodel = Sequential()\nmodel.add(Conv2D(32, (5, 5), strides=(1, 1), activation=\"relu\", input_shape=(width, height, img_dim), padding=\"same\"))\nmodel.add(MaxPooling2D(pool_size=(2, 2), padding=\"same\"))\nmodel.add(Conv2D(64, (5, 5), strides=(1, 1), activation=\"relu\", padding=\"same\"))\nmodel.add(MaxPooling2D(pool_size=(2, 2), padding=\"same\"))\nmodel.add(Flatten())\nmodel.add(Dense(1024, activation=\"relu\"))\nmodel.add(Dropout(0.5))\nmodel.add(Dense(num_classes, activation=\"softmax\"))\nmodel.compile(loss=\"categorical_crossentropy\", optimizer=\"sgd\", metrics=[\"accuracy\"])\n```\n\n**TensorFlow:**\n```python\ng = tf.Graph()\nwith g.as_default():\n    X = tf.placeholder(tf.float32, shape=[None, width, height, img_dim])\n    Y = tf.placeholder(tf.float32, shape=[None, num_classes])\n    keep_prob = tf.constant(0.5, tf.float32)\n\n    def weight_variable(shape):\n        initial = tf.truncated_normal(shape, stddev=0.1)\n        return 
tf.Variable(initial)\n\n    def bias_variable(shape):\n        initial = tf.constant(0.1, shape=shape)\n        return tf.Variable(initial)\n\n    def conv2d(x, W):\n        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding=\"SAME\")\n\n    def max_pool_2x2(x):\n        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding=\"SAME\")\n\n    # The input channel count must match the last dimension of X\n    W_conv1 = weight_variable([5, 5, img_dim, 32])\n    b_conv1 = bias_variable([32])\n\n    h_conv1 = tf.nn.relu(conv2d(X, W_conv1) + b_conv1)\n    h_pool1 = max_pool_2x2(h_conv1)\n\n    W_conv2 = weight_variable([5, 5, 32, 64])\n    b_conv2 = bias_variable([64])\n\n    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)\n    h_pool2 = max_pool_2x2(h_conv2)\n\n    # 7 * 7 assumes 28x28 inputs (halved twice by the 2x2 poolings)\n    W_fc1 = weight_variable([7 * 7 * 64, 1024])\n    b_fc1 = bias_variable([1024])\n\n    h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])\n    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)\n    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)\n\n    W_fc2 = weight_variable([1024, num_classes])\n    b_fc2 = bias_variable([num_classes])\n\n    # Keep raw logits here: softmax_cross_entropy_with_logits applies softmax itself\n    Y_predicted = tf.matmul(h_fc1_drop, W_fc2) + b_fc2\n\n    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=Y_predicted))\n    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)\n\n    correct_prediction = tf.equal(tf.argmax(Y_predicted, 1), tf.argmax(Y, 1))\n    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))\n```\n\n## Useful Helpers\n\n### Datasets\n\nIt is often convenient to have some built-in datasets so that we can jump-start some examples...\n\n**Keras:**\n\n* Classification: CIFAR-10, CIFAR-100, MNIST, IMDB movie reviews, Reuters newswire\n* Regression: Boston housing price\n\nMore information can be found on [the Keras page](https://keras.io/datasets/#datasets).\n\n**TensorFlow:**\n\nTF provides a couple of datasets, but they are scattered across several modules:\n\n* Some of the datasets can be found in the 
[`learn` contribution module](https://www.tensorflow.org/get_started/tflearn), e.g. the Iris and Boston housing price datasets, MNIST etc.\n* Through [TF-Slim](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim) (also part of the contribution module), which is another lightweight high-level API for TF but developed by TensorFlow’s team, we can also get [additional datasets](https://github.com/tensorflow/models/tree/master/slim) such as Flowers, CIFAR-10, MNIST and ImageNet\n\n### Pre-Trained Models\n\nFor many image classification problems, we normally don’t train the models from scratch (because it is computationally expensive or we don’t have much data); instead we start from pre-trained models and fine-tune them...\n\n**Keras:**\n\n* Keras has [several pre-trained models](https://keras.io/applications/#available-models), e.g. Xception, VGG16, VGG19 etc., that are trained on ImageNet\n\n**TensorFlow:**\n\n* For TF, pre-trained models exist but are not built in, e.g. some of the pre-trained models can [be accessed via TF-Slim](https://github.com/tensorflow/models/tree/master/slim#Pretrained)\n\n### Others\n\n**Keras:**\n\n* Preprocessing tools for [sequences](https://keras.io/preprocessing/sequence/), [text](https://keras.io/preprocessing/text/) and [image](https://keras.io/preprocessing/image/) data (the image preprocessing is awesome as you can easily augment image data with a number of random transformations e.g. rotation, zoom etc... 
- [see this post from the Keras author](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html) for a detailed use case)\n* Other useful features:\n    - One-hot encoding of target variables: `from keras.utils.np_utils import to_categorical`\n    - Normalize vector: `from keras.utils.np_utils import normalize`\n\n**TensorFlow:**\n\n* TensorFlow has [TensorFlow Serving](https://tensorflow.github.io/serving/) which is used to operationalize TF models in production (an interesting [use case from Zendesk](https://medium.com/zendesk-engineering/how-zendesk-serves-tensorflow-models-in-production-751ee22f0f4b) using TensorFlow Serving)\n* It provides a specialized debugger called [`tfdbg`](https://www.tensorflow.org/programmers_guide/debugger) to debug your TF program (also check out the [slides from Jongwook Choi](https://wookayin.github.io/tensorflow-talk-debugging/#1) (must read) to get more information on how/where to use it)\n* [Threading and queues](https://www.tensorflow.org/programmers_guide/threading_and_queues) are supported for asynchronous computation (Morgan Giraud provides a [very cool example](https://blog.metaflow.fr/tensorflow-how-to-optimise-your-input-pipeline-with-queues-and-multi-threading-e7c3874157e0) on how to optimize the data reading step)\n\n## Conclusion\n\nWe only scratched the surface of both APIs. There are many more features that we haven’t talked about. There are pros and cons for both APIs. Keras is made particularly for fast prototyping and it definitely serves its purpose. TensorFlow, on the other hand, is much more verbose but you get a higher degree of flexibility and control. There are other wrappers around TensorFlow. For example, we highlighted some features from TF-Slim because it is built in. Another interesting one is [TF Learn](http://tflearn.org/). Its syntax is quite nice and it has very good documentation. As you can see, there are many options. 
At the end of the day you have to choose yourself what suits your problem the best.\n\n#### So what is your favorite language of choice for deep learning? Tell me why?\n\n## Some Other Useful Links\n\n* A [Complete Guide](https://blog.keras.io/keras-as-a-simplified-interface-to-tensorflow-tutorial.html) To Using Keras as Part of a TensorFlow Workflow - François Chollet\n* [Learning Deep Learning with Keras](http://p.migdal.pl/2017/04/30/teaching-deep-learning.html) (Excellent overview for someone who wants to get started into Deep Learning in general) - Piotr Migdal\n* TensorFlow in a Nutshell: Part [One](https://medium.com/@camrongodbout/tensorflow-in-a-nutshell-part-one-basics-3f4403709c9d), [Two](https://chatbotnewsdaily.com/tensorflow-in-a-nutshell-part-two-hybrid-learning-98c121d35392), [Three](https://hackernoon.com/tensorflow-in-a-nutshell-part-three-all-the-models-be1465993930) - Camron Godbout\n* [Some (clean) TensorFlow Examples](https://github.com/aymericdamien/TensorFlow-Examples) by the Author of TF Learn - Aymeric Damien\n* [TensorFlow for Deep Learning Research](http://web.stanford.edu/class/cs20si/) (check out the slides and Github repo) - Chip Huyen\n* [MetaFlow AI](https://blog.metaflow.fr/) (check out their TensorFlow best practice series - it's amazing) - Morgan Giraud \u0026 Thomas Olivier\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatitran%2Fml-berlin-blog","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatitran%2Fml-berlin-blog","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatitran%2Fml-berlin-blog/lists"}