{"id":13574346,"url":"https://github.com/oneapi-src/document-automation","last_synced_at":"2025-04-04T14:32:34.094Z","repository":{"id":66145923,"uuid":"536270342","full_name":"oneapi-src/document-automation","owner":"oneapi-src","description":"AI Starter Kit for Named Entity Recognition using Intel® Optimized Tensorflow (version 2.9.0 with oneDNN)","archived":true,"fork":false,"pushed_at":"2024-02-01T23:56:22.000Z","size":163,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-08-02T15:10:33.341Z","etag":null,"topics":["deep-learning","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oneapi-src.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2022-09-13T19:00:15.000Z","updated_at":"2024-04-08T18:24:10.000Z","dependencies_parsed_at":"2023-11-27T21:45:06.043Z","dependency_job_id":null,"html_url":"https://github.com/oneapi-src/document-automation","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oneapi-src%2Fdocument-automation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oneapi-src%2Fdocument-automation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oneapi-src%2Fdocument-automation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oneapi-src%2Fdocument-automation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/oneapi-src","download_url":"https://codeload.github.com/oneapi-src/document-automation/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223147598,"owners_count":17095545,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","tensorflow"],"created_at":"2024-08-01T15:00:50.690Z","updated_at":"2024-11-05T09:32:09.014Z","avatar_url":"https://github.com/oneapi-src.png","language":"Python","readme":"PROJECT NOT UNDER ACTIVE MANAGEMENT\n\nThis project will no longer be maintained by Intel.\n\nIntel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project.  \n\nIntel no longer accepts patches to this project.\n\nIf you have an ongoing need to use this project, are interested in independently developing it, or would like to maintain patches for the open source software community, please create your own fork of this project.  \n\nContact: webadmin@linux.intel.com\n# **Tensorflow Named Entity Recognition**\r\n\r\n# **Introduction**\r\nNamed Entity recognition‚ (NER) (also known as entity identification, entity chunking and entity extraction) is a sub-task of \r\ninformation extraction that seeks to locate and classify named entities in text into pre-defined categories such as the names \r\nof persons, organizations, locations, expressions of times, quantities, monetary values, percentages, etc.\r\n\r\nFor example, Healthcare Payers employ hundreds of clinicians who review millions of claim-related documentation pages searching \r\nfor relevant data points to then draw a conclusion manually. Similarly, other applications which require to review documentation\r\npages searching for relevant data points to draw a conclusion includes Optimizing Search Engine Algorithms, Question-Answer Systems, \r\nGene Identification, DNA Identification, Identification of Drug Names and Disease Names and Simplifying Customer Support.\r\n\r\nAs information is exponentially increasing, often in non-standard formats, making it difficult for organisations to manage; \r\nManual reviews waste time and tends to inaccurate results contributing to dissatisfaction. All those challenges have forced \r\norganisations to look for solutions to automate these applications.\r\n\r\nThe experiment is aimed to expedite the classification of key information from a text, illustrating the selection of relevant \r\ndata points, for example that a clinician will perform from a claim or a claim related document. The goal is therefore to locate \r\nand classify named entities in text into pre-defined categories using Tensorflow BERT Transfer Learning NER Model\r\n\r\n## **Table of Contents**\r\n - [Purpose](#purpose)\r\n - [Reference Solution](#proposed-solution)\r\n - [Reference Implementation](#reference-implementation)\r\n - [Intel® Implementation](#optimizing-the-E2E-solution-with-Intel®-oneAPI)\r\n - [Performance Observations](#performance-observations)\r\n\r\n## **Purpose**\r\nAs indicated above, the main objective is to locate and classify named entities using Tensorflow BERT Transfer Learning NER Model. \r\nThis can be broken down into two sub-tasks: identifying the boundaries of the named entities, and identifying its type.\r\n\r\nWith NER, organizations look to automate to address the following challenges:\r\n- Manual documentation reviews are time-consuming, complex, and monotonous\r\n- The volume of reviews are substantially increasing\r\n- Humans are prone to error, can easily be distracted, and miss critical components\r\n- Non-standard data limits feedback for analytics and operational teams\r\n- Growing shortage of highly skilled staff to review documents\r\n\r\n\u003cp\u003eThe models built for NER are predominately used as intermediate models in complex AI model architecture designed for various data \r\nscience applications. \r\n\r\n## **Reference Solution**\r\nIn this reference kit, we build a deep learning model to predict the named entity tags for the given sentence. We also focus on below critical factors\r\n- Faster model development and \r\n- Performance efficient model inference and deployment mechanism.\r\n\r\n\u003cbr\u003eNamed entity recognition is a task that is well-suited to the type of classifier-based approach. In particular, a \r\ntagger can be built that labels each word in a sentence using the IOB format, where chunks are labelled by their appropriate type. \u003c/br\u003e\r\n\r\nThe IOB Tagging system contains tags of the form:\r\n\r\n```\r\nB - {CHUNK_TYPE} for the word in the Beginning chunk \r\nI - {CHUNK_TYPE} for words Inside the chunk \r\nO - Outside any chunk \r\n```\r\n\r\nThe IOB tags are further classified into the following classes\r\n\r\n```\r\ngeo = Geographical Entity \r\norg = Organization \r\nper = Person \r\ngpe = Geopolitical Entity \r\ntim = Time indicator \r\nart = Artifact \r\neve = Event \r\nnat = Natural Phenomenon.\r\n```\r\nAs documents are constantly changing, clients using these solutions must re-train their models to deal with ever-increased and \r\nchangeable data sets. Therefore, training and inference prediction of named entity tags is done in batches with different dataset \r\nsizes to illustrate this scenario. Inference prediction was also done in real-time (batchsize 1) to illustrate the selection of \r\nrelevant data points similar to a scenario wherein a clinician will get from a claim or a claim related when a document is loaded.\r\n\r\nSince GPUs are the natural choice for deep learning and AI processing to achieve a higher FPS rate but they are also very expensive \r\nand memory consuming, the experiment applies model quantization using Intel's technology which compresses the models using quanitization \r\ntechniques and use the CPU for processing while maintaining the accuracy and speeding up the inference time of named entity tagging.\r\n\r\n### **Key Implementation Details**\r\n\r\nThe reference kit implementation is a reference solution to the Named Entity Recognition use case that includes \r\n\r\n  1. A reference E2E architecture to arrive at an AI solution with BERT Transfer Learning Model using Tensorflow 2.8.0\r\n  2. An Optimized reference E2E architecture enabled with Intel® Optimizations for Tensorflow 2.9.0 \r\n\r\n## **Reference Implementation**\r\n\r\n### ***E2E Architecture***\r\n### **Use Case E2E flow**\r\n\r\n![Use_case_flow](assets/e2e_flow.png)\r\n\r\n### ***Hyper-parameter Analysis***\r\n\r\nIn realistic scenarios, an analyst will run the BERT Transfer Learning Model multiple times on the same dataset, scanning across different hyper-parameters.  To capture this, we measure the total amount of time it takes to generate results across different hyper-parameters for a fixed algorithm, which we define as hyper-parameter analysis.  In practice, the results of each hyper-parameter analysis provides the analyst with many different models that they can take and further analyze.\r\n\r\nThe below table provide details about the hyperparameters \u0026 values used for hyperparameter tuning in our benchmarking experiments:\r\n| **Algorithm**                     | **Hyperparameters**\r\n| :---                              | :---\r\n| BERT Transfer Learning            | Batch size - 1, 32, 64 and 128\r\n\r\nIn the benchmarking results given in later sections, inference time for batch size of 1  which can also be read as real time inference \r\nfor one test sample.\r\n\r\n### **Dataset**\r\n\u003c!-- Dataset Details --\u003e\r\nDataset used in this reference kit is taken from [Kaggle](https://www.kaggle.com/datasets/abhinavwalia95/entity-annotated-corpus)\r\n\u003e *Please see this data set's applicable license for terms and conditions. Intel Corporation does not own the rights to this data set and does not confer any rights to it.*\r\n\r\nEach row in the data set represents sentence and its corresponding named entity tags and contains below features for the training\r\n- ***SentenceID*** - Sentence Identification number\r\n- ***Word*** - The words of the sentence\r\n- ***POS*** - Parts of speech of each word\r\n- ***Tag*** - Named entity tags for each word\r\n\r\nBased on these features \"SentenceID\", \"Word\", \"POS\", the model is trained to predict named entity tag (\"Tag\").\r\n\r\nFollow the below mentioned steps to download the dataset\r\nThe NER dataset is downloaded from kaggle and extracted in a data folder before running the training python module.\r\n\r\n1) Create a data folder using command\r\n```sh\r\nmkdir data\r\n```\r\n2) Create a kaggle folder using \r\n```sh\r\nmkdir kaggle\r\n```\r\n3) Navigate inside the kaggle folder using the command \r\n```sh\r\ncd kaggle\r\n```\r\n4) Install kaggle if not done using the below command:\r\n```sh\r\npip install kaggle\r\n```\r\n6) Go to https://www.kaggle.com/ . Login to your Kaggle account. Go to 'Account Tab' \u0026 select 'Create a new API token'. \r\nThis will trigger the download of kaggle.json file. This file contains your API credentials.\r\n7) Move the downloaded 'kaggle.json' file to folder 'kaggle'\r\n8) Execute the following command\r\n```sh\r\nchmod 600 kaggle.json\r\n```\r\n10) Export the kaggle username \u0026 token to the enviroment\r\n```sh\r\nexport KAGGLE_USERNAME=\"user name\"\r\nexport KAGGLE_KEY=\"key value\"\r\n```\r\n14) The \"user name\" and \"key\" can be found in the kaggle.json file.\r\n15) Run the following command to download the dataset\r\n```sh\r\nkaggle datasets download -d abhinavwalia95/entity-annotated-corpus\r\n```\r\n16) The file \"entity-annotated-corpus.zip\" will be downloaded in the current directory\r\n17) Move the entity-annotated-corpus.zip file into data folder by executing this command.\r\n```sh\r\nmv entity-annotated-corpus.zip ./data/\r\n```\r\n18) Unzip the downloaded dataset by executing below\r\n```sh\r\ncd ./data/\r\nsudo apt install unzip\r\nunzip entity-annotated-corpus.zip\r\n```\r\n19) This will create two files ner_dataset.csv and ner_csv. The ner_dataset.csv will be used to \r\ngenerate the training and testing datasets.\r\n20) Run the script gen_dataset.py in the root directory to generate the training and testing datasets.\r\n```sh\r\ncd ../\r\n\r\npython src/gen_dataset.py --dataset_file ./data/ner_dataset.csv\r\n```\r\n21) The files ner_dataset.csv, ner_test_dataset.csv and ner_test_quan_dataset.csv files will be generated in the current \r\ndirectory.\r\n22) Move these files into \"data\" folder.\r\n```sh\r\nmv ner_dataset.csv ./data/\r\nmv ner_test_dataset.csv ./data/\r\nmv ner_test_quan_dataset.csv ./data/\r\n```\r\n\r\n\u003eNote:\r\nThe dataset consisted of 48K sentences with corresponding parts of speech and labeled tags.\r\nFor data preparation, it has split into train and evaluation sets. For training, the dataset has been reduced \r\nto 24K sentences by removing 50% of records from the original dataset file. For inference, dataset has been \r\nreduced to 2500 records.\r\n\r\n### **Input**\r\nThe input dataset consists of sentences for which named entity tagging need to be performed. Each of the sentences are identified by below features.\r\n1. Unique ID\r\n2. Words of sentence\r\n3. Parts of speech for each word\r\n\r\n\u003cb\u003eExample:\u003c/b\u003e\r\n```\r\nThey marched from the Houses of Parliament to a rally in Hyde Park\r\n```\r\n\r\n|**Unique ID**    |**Word**       |**Parts of speech**\r\n| :---        | :---      | :---\r\n|1            |They\t\t    | PRP\r\n|             |marched\t  |VBD\r\n|             |from\t\t    |IN\r\n|             |the\t\t    |DT\r\n|             |Houses\t    |NNS\r\n|             |of\t\t      |IN\r\n|             |Parliament\t|NN\r\n|             |to\t\t      |TO\r\n|             |a\t\t      |DT\r\n|             |rally\t\t  |NN\r\n|             |in\t\t      |IN\r\n|             |Hyde\t\t    |NNP\r\n|             |Park\t\t    |NNP\r\n|             |.\t\t      |.\r\n\r\n### **Expected Output**\r\nFor the given input sentence and its features, the expected output is the named entity tagging for each of the words in the sentence. \r\n\r\n\u003cb\u003eExample:\u003c/b\u003e\r\n```\r\nThey marched from the Houses of Parliament to a rally in Hyde Park\r\n```\r\n|**Word**       |**Named Entity Tag**\r\n| :---          | :---\r\n|They\t\t        |O\r\n|marched\t      |O\r\n|from\t\t        |O\r\n|the\t\t        |O\r\n|Houses\t        |O\r\n|of\t\t          |O\r\n|Parliament\t    |O\r\n|to\t\t          |O\r\n|a\t\t          |O\r\n|rally\t\t      |O\r\n|in\t\t          |O\r\n|Hyde\t\t        |B-geo\r\n|Park\t\t        |I-geo\r\n|.\t\t          |O\r\n\r\n### ***Software Requirements***\r\n\r\n1. Python - 3.9.x\r\n2. Tensorflow - 2.8.0\r\n\r\nFirst clone the respository executing the below command.\r\n```\r\ngit clone https://github.com/oneapi-src/document-automation.git\r\n```\r\nNote that this reference kit implementation already provides the necessary scripts to setup the software requirements. \r\nTo utilize these environment scripts, first install Anaconda/Miniconda by following the instructions at the following link\r\n\r\n[Anaconda installation](https://docs.anaconda.com/anaconda/install/linux/)\r\nor\r\nhttps://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html\r\n\r\n### ***Solution setup***\r\nFollow the below conda installation commands to setup the Stock enviroment along with the necessary packages for this model training and prediction.\r\n\u003eNote: It is assumed that the present working directory is the root directory of this code repository\r\n\r\n```\r\nconda env create --file env/stock/ner_stock.yml\r\n```\r\n\r\n\u003eNote:\r\nIf while creating the environment if the error \"command 'gcc' failed: No such file or directory\" occurs then,\r\ninstall gcc using the command below.\r\nsudo apt-get install gcc\r\n\r\nThis command utilizes the dependencies found in the `env/stock/ner_stock.yml` file to create an environment as follows:\r\n\r\n**YAML file**                       | **Environment Name**         |  **Configuration** |\r\n| :---: | :---: | :---: |\r\n| `env/stock/ner_stock.yml`             | `ner_stock` | Python=3.9.x with Tensorflow 2.8.0\r\n\r\nUse the following command to activate the environment that was created:\r\n```sh\r\nconda activate ner_stock\r\n```\r\n\r\nAfter activating the environment for stock Tensorflow framework, make sure the oneDNN flag is \r\ndisabled by running the below instruction at the command line.\r\n```\r\nexport TF_ENABLE_ONEDNN_OPTS=0\r\n```\r\n\r\nVerify if the flag is enabled by using the below command.\r\n\u003cbr\u003e\r\n```\r\necho $TF_ENABLE_ONEDNN_OPTS\r\n```\r\n\r\n### **Reference Sources**\r\n*Case Study* https://www.kaggle.com/code/ravikumarmn/ner-using-bert-tensorflow-99-35/notebook\u003cbr\u003e\r\n\r\n### ***Solution implementation***\r\n\r\n#### **Model building process**\r\nThe Python script given below need to be executed to start training using the \r\nactive environment enabled by using the above steps to setup the environment. \r\n\r\nThe script will run the benchmarks for the passed parameters and displays the corresponding \r\ntraining time in seconds. The details of the script and parameters are given below.\r\n\r\nExecute the Python script as given below to start training for specific batch size and given dataset file.\r\n```shell\r\npython src/run_modeltraining.py --batchsize \u003cbatchsize value\u003e --dataset_file \u003cdataset filename\u003e -i \u003cintel/stock\u003e --save_model_path \u003csave file path\u003e\r\n\r\n```\r\n\r\n  Arguments:\u003cbr\u003e\r\n```\r\n  --help                   show this help message and exit\r\n  --batchsize              Give the required batch sizes\r\n  --dataset_file           Give the name of dataset file\r\n  --i                      0 for stock, 1 for intel environment\r\n  --save_model_path        Give the directory path to save the model after the training\r\n```\r\n\u003cb\u003eNote:\u003c/b\u003e \r\n1) The dataset file and save_model_path parameters are mandatory to be given, remaining parameters if not given will take the default values\r\n2) --help option will give the details of the arguments\r\n\r\n\u003cb\u003eExample\u003c/b\u003e: \r\n```shell\r\npython src/run_modeltraining.py --batch_size 128 --dataset_file \"./data/ner_dataset.csv\" --intel 0 --save_model_path \"./models/trainedmodels/\"\r\n```\r\nThis command runs the model with batch size of 128, using dataset file \u003ci\u003ener_dataset.csv\u003c/i\u003e, for stock environment\r\nand saves the trained model in \u003ci\u003e\"./models/trainedmodels/stock/model_b128/\" \u003c/i\u003e folder and \r\nsubsequently, outputs the training time of the model. The user can collect the logs by \r\nredirecting the output to a file as illustrated below.\r\n\r\n```shell\r\npython src/run_modeltraining.py --batch_size 128 --dataset_file \"./data/ner_dataset.csv\" --intel 0 --save_model_path \"./models/trainedmodels/\" | tee \u003clog_file_name\u003e\r\n```\r\n\r\n\r\nThe output of the python script \u003ci\u003erun_modeltraining.py\u003c/i\u003e will be collected in the file \u003clog_file_name\u003e\r\n\r\n**Expected Output**\r\n\r\nOutput would be generated by the Python script \u003ci\u003erun_modeltraining.py\u003c/i\u003e which will capture the overall training time in seconds.\r\nThe output can be redirected to a file as per the command above.\r\n\r\nThe lines below are from the sample output of the python script \u003ci\u003erun_modeltraining.py\u003c/i\u003e and it gives details of training the model\r\n\r\n    ----------------------------------------\r\n    # Model Training \r\n    # Time (in seconds): 6879.922860383987\r\n    # Batch size: 32\r\n    # Model saved path: ./models/trainedmodels/\r\n    ----------------------------------------\r\n\r\n### Model Inference or Predictions\r\nThe Python script given below need to be executed to do inference using the \r\nactive environment enabled by using the above steps to setup the environment. \r\n\r\nThe script will run the benchmarks for the passed parameters and \r\ndisplays the corresponding inference time in seconds. The details of the script\r\nand parameters are given below.\r\n\r\nExecute the Python script as given below to perform prediction for specific batch size and dataset file.\r\n```shell\r\npython src/run_inference.py --batchsize \u003cbatchsize value\u003e --dataset_file \u003cdataset filename\u003e -i \u003cintel/stock\u003e --model_path \u003cmodel file path\u003e\r\n\r\n```\r\n\r\n  Arguments:\u003cbr\u003e\r\n```\r\n  --help                   show this help message and exit\r\n  --batchsize              Give the required batch sizes\r\n  --dataset_file           Give the name of test dataset file\r\n  --i                      0 for stock, 1 for intel environment\r\n  --model_path             Give the directory path the trained model\r\n```\r\n\r\n\u003cb\u003eNote:\u003c/b\u003e\r\n1) All the options above are optional expect for test dataset file and model_path, if not given will take the default values\r\n2) --help option will give the details of the arguments\u003cbr\u003e\r\n\r\n\u003cb\u003eExample\u003c/b\u003e: \r\n```shell\r\npython src/run_inference.py --batch_size 128 --dataset_file \"./data/ner_test_dataset.csv\" --intel 0 --model_path \"./models/trainedmodels/stock/model_b128/model_checkpoint\"\r\n```\r\nThis command runs the model with batch size of 128, using test dataset file \u003ci\u003ener_test_dataset.csv\u003c/i\u003e, for stock environment\r\nfor the trained model (trained in intel environment) in \u003ci\u003e\"./models/trainedmodels/stock/model_b128/model_checkpoint\"\u003c/i\u003e folder and \r\nsubsequently, outputs the inference time of the model. The user can collect the logs by \r\nredirecting the output to a file as illustrated below.\r\n\r\n```shell\r\npython src/run_inference.py --batch_size 128 --dataset_file \"./data/ner_test_dataset.csv\" --intel 0 --model_path \"./models/trainedmodels/stock/model_b128/model_checkpoint\" | tee \u003clog_file_name\u003e\r\n```\r\nThe output of the script \u003ci\u003erun_inference.py\u003c/i\u003e will be collected in the log_file_name.\r\n\r\n**Expected Output**\r\n\r\nOutput would be generated by the Python script \u003ci\u003erun_inference.py\u003c/i\u003e which will capture the overall inference time in seconds.\r\nThe output can be redirected to a file as per the command above.\r\n\r\nThe lines below from the sample output of the python script run_inference.py and it gives details of inference of the model\r\n\r\n    ----------------------------------------\r\n    # Model Inference details:\r\n    # Real time inference (in seconds): 0.11928558349609375\r\n    # Average batch inference:\r\n    #   Time (in seconds): 5.85315097249349\r\n    #   Batch size: 128\r\n    ----------------------------------------\r\n\r\n\r\n## **Optimizing the E2E solution with Intel Optimizations for Tensorflow**\r\nAlthough AI delivers a solution to address named entity recognition, on a production scale implementation with millions \r\nor billions of records demands for more compute power without leaving any performance on the table. Under this scenario, \r\na named entity recognition models are essential for identifying and extracting entities which will enable analyst to take\r\nappropriate decisions. For example in healthcare it can be used to identifying and extracting entities like diseases, tests,\r\ntreatments and test results.  In order to derive the most insightful and beneficial actions to take, they will need to study \r\nand analyze the data generated though various feature sets and algorithms, thus requiring frequent re-runs of the algorithms \r\nunder many different parameter sets. To utilize all the hardware resources efficiently, Software optimizations cannot be ignored.   \r\n \r\nThis reference kit solution extends to demonstrate the advantages of using the Intel AI Analytics Toolkit on the task of building a model for classifiying named entity taggings.  The savings gained from using the Intel optimizations for Tensorflow can lead an analyst to more efficiently explore and understand data, leading to better and more precise targeted solutions.\r\n\r\n#### Use Case E2E flow\r\n![image](assets/e2e_flow_optimized.png)\r\n\r\n### **Optimized software components**\r\nIntel Optimized Tensorflow (version 2.9.0 with oneDNN) TensorFlow framework has been optimized using oneAPI Deep Neural Network Library (oneDNN) primitives, a popular performance library for deep learning applications. It provides accelerated implementations of numerous popular DL algorithms that optimize performance on Intel® hardware with only requiring a few simple lines of modifications to existing code. In addition the fourth generation Intel® Xeon® Scalable processors (Sapphire Rapids) provide suppport for AMX bf16 instructions to provide additional boost by utilizing Automatic Mixed Precision.\r\n\r\nIntel® OneAPI Neural Compressor (version 1.10.1) INC is an open-source Python* library. ML developers can incorporate this library for quantizing the deep learning models. It supports two types of quantization as mentioned below​\r\n\r\nQuantization Aware Training​\r\n1) Accuracy Aware Quantization​\r\n2) Default Quantization​\r\n\r\nPost Training Quantization​\r\n1) Accuracy Aware Quantization​\r\n2) Default Quantization\r\n\r\n### ***Software Requirements***\r\n| **Package**                | **Intel® Python**\r\n| :---                       | :---\r\n| Python                     | 3.9.x\r\n| Tensorflow                 | 2.9.0\r\n| Neural Compressor          | 1.12\r\n\r\n### ***Optimized Solution setup***\r\nFollow the below conda installation commands to setup the Stock enviroment along with the necessary packages for this model training and prediction.\r\n\u003eNote: It is assumed that the present working directory is the root directory of this code repository\r\n\r\n```shell\r\nconda env create --file env/intel/ner_intel.yml\r\n```\r\n\u003eNote:\r\nIf while creating the environment if the error \"command 'gcc' failed: No such file or directory\" occurs then,\r\ninstall gcc using the command below.\r\nsudo apt-get install gcc\r\n\r\nThis command utilizes the dependencies found in the `env/intel/ner_intel.yml` file to create an environment as follows:\r\n\r\n**YAML file**                                 | **Environment Name** |  **Configuration** |\r\n| :---: | :---: | :---: |\r\n| `env/intel/ner_intel.yml`             | `ner_intel` | Python=3.9.x with Intel Optimized Tensorflow 2.9.0\r\n\r\nFor the workload implementation to arrive at first level solution we will be using the intel environment\r\n\r\nUse the following command to activate the environment that was created:\r\n```shell\r\nconda activate ner_intel\r\n```\r\n\r\n### ***Optimized Solution implementation***\r\nAfter activating the environment for Intel Tensorflow framework, make sure the oneDNN flag is \r\nenabled by running the below instruction at the command line.\r\n```shell\r\nexport TF_ENABLE_ONEDNN_OPTS=1\r\n```\r\n\r\nVerify if the flag is enabled by using the below command.\r\n\u003cbr\u003e\r\n```shell\r\necho $TF_ENABLE_ONEDNN_OPTS\r\n```\r\n\r\n#### **Model building process with Intel® optimizations**\r\nThe python script run_modeltraining.py used for training on stock version is used for training the model on Intel environment also\r\nas per example below.\r\n\r\n\u003cb\u003eExample\u003c/b\u003e: \r\n```shell\r\npython src/run_modeltraining.py --batch_size 128 --dataset_file \"./data/ner_dataset.csv\" --intel 1 --save_model_path \"./models/trainedmodels/\"\r\n```\r\nThis command runs the model with batch size of 128, using dataset file \u003ci\u003ener_dataset.csv\u003c/i\u003e, for intel environment\r\nand saves the trained model in \u003ci\u003e\"./models/trainedmodels/stock/model_b128/\" \u003c/i\u003e folder and \r\nsubsequently, outputs the training time of the model. The user can collect the logs by \r\nredirecting the output to a file as illustrated below.\r\n\r\n\u003e**Note** : You can enable bf16 training by setting the bf16 flag to 1. Please note that this flag MUST be enabled only on Intel® Fourth Gen Xeon® Scalable processor codenamed Sapphire Rapids that has bf16 training support and optimizations to utilize AMX,the latest ISA introduced in this family of processors.  \r\n\r\n```shell\r\npython src/run_modeltraining.py --batch_size 128 --dataset_file \"./data/ner_dataset.csv\" --intel 1 --bf16 1 --save_model_path \"./models/trainedmodels/\" | tee \u003clog_file_name\u003e\r\n```\r\n\r\nThe output of the python script \u003ci\u003erun_modeltraining.py\u003c/i\u003e will be collected in the file \u003clog_file_name\u003e\r\n\r\n**Expected Output**\r\nOutput would be generated by the Python script \u003ci\u003erun_modeltraining.py\u003c/i\u003e which will capture the overall training time in seconds.\r\nThe output can be redirected to a file as per the command above.\r\n\r\nThe lines below are from the sample output of the python script \u003ci\u003erun_modeltraining.py\u003c/i\u003e and it gives details of training the model\r\n\r\n    ----------------------------------------\r\n    # Model Training \r\n    # Time (in seconds): 6879.922860383987\r\n    # Batch size: 32\r\n    # Model saved path: ./models/trainedmodels/\r\n    ----------------------------------------\r\n\r\n#### **Model Inference process with Intel® optimizations**\r\nThe python script run_inference.py used to obtain inference benchmarks for stock version is used for getting inference benchmarks for \r\nIntel environment also as per example below.\r\n\r\n\u003cb\u003eExample\u003c/b\u003e: \r\n```shell\r\npython src/run_inference.py --batch_size 128 --dataset_file \"./data/ner_test_dataset.csv\" --intel 1 --model_path \"./models/trainedmodels/intel/model_b128/model_checkpoint\"\r\n```\r\nThis command runs the model with batch size of 128, using test dataset file \u003ci\u003ener_test_dataset.csv\u003c/i\u003e, for Intel environment\r\nfor the trained model (trained in intel environment) in \u003ci\u003e\"./models/trainedmodels/intel/model_b128/model_checkpoint\"\u003c/i\u003e folder and \r\nsubsequently, outputs the inference time of the model. The user can collect the logs by \r\nredirecting the output to a file as illustrated below.\r\n\r\n\u003e**Note** : You can enable bf16 inference by setting the bf16 flag to 1. Please note that this flag MUST be enabled only on Intel® Fourth Gen Xeon® Scalable processor codenamed Sapphire Rapids that has bf16 support and optimizations to utilize AMX,the latest ISA introduced in this family of processors.  \r\n\r\n\r\n```shell\r\npython src/run_inference.py --batch_size 128 --dataset_file \"./data/ner_test_dataset.csv\" --intel 1 --bf16 1 --model_path \"./models/trainedmodels/intel/model_b128/model_checkpoint\" | tee \u003clog_file_name\u003e\r\n```\r\n\r\n\r\nThe output of the script \u003ci\u003erun_inference.py\u003c/i\u003e will be collected in the log_file_name.\r\n\r\n**Expected Output**\r\nOutput would be generated by the Python script \u003ci\u003erun_inference.py\u003c/i\u003e which will capture the overall inference time in seconds.\r\nThe output can be redirected to a file as per the command above.\r\n\r\nThe lines below from the sample output of the python script run_inference.py and it gives details of inference of the model\r\n\r\n    ----------------------------------------\r\n    # Model Inference details:\r\n    # Real time inference (in seconds): 0.11928558349609375\r\n    # Average batch inference:\r\n    #   Time (in seconds): 5.85315097249349\r\n    #   Batch size: 128\r\n    ----------------------------------------\r\n\r\n#### **Model Conversion process with Intel Neural Compressor**\r\nIntel® Neural Compressor is used to quantize the FP32 Model to the INT8 Model. \r\nOptimzied model is used here for evaluating and timing Analysis. \r\nIntel® Neural Compressor supports many optimization methods. \r\nIn this case, we have used post training quantization with default quantization method to quantize the FP32 model.\r\n\r\nBefore performing the quantization of the trained model, the model is converted to frozen graph format using the \u003ci\u003erun_create_frozen_graph.py\u003c/i\u003e\r\npython script. The usage of this script to generate the frozen graph is given below.\r\n\r\n```\r\npython src/run_create_frozen_graph.py --model_path \u003ctrained model file path\u003e --save_model_path \u003cpath to save the model frozen graph\u003e\r\n```\r\nwhere,\u003cbr\u003e\r\n\u003cb\u003emodel_path\u003c/b\u003e - The path of the FP32 trained model \u003cbr\u003e\r\n\u003cb\u003esave_model_path\u003c/b\u003e - The path to save the frozen graph format of the model given in model_path\u003cbr\u003e\r\n\r\n\u003cb\u003eExample:\u003c/b\u003e\r\n```\r\npython src/run_create_frozen_graph.py --model_path \"./models/trainedmodels/intel/model_b32/model_checkpoint\" --save_model_path \"./models/frozen_models/intel/model_b32/\"\r\n```\r\nThe model at \u003ci\u003e\"./models/trainedmodels/intel/model_b32/model_checkpoint\"\u003c/i\u003e will be converted to frozen graph format and \r\nwill be saved at \u003ci\u003e\"./models/frozen_models/intel/model_b32/\"\u003c/i\u003e\r\n\r\nOnce the frozen model format of the model is created, the \u003ci\u003erun_neural_compressor_conversion.py\u003c/i\u003e python script is used for \r\nquanitization of the FP32 trained model. The syntax for using the script\r\ngiven below.\r\n\r\n```\r\npython src/run_neural_compressor_conversion.py --dataset_file \u003ctest dataset file name\u003e --model_path \u003cpath of the frozen graph\u003e --config_file \u003cconfiguration file\u003e --save_model_path \u003cpath to save the model\u003e\r\n```\r\nwhere,\r\n```\r\n--dataset_file        The path of the test dataset file\r\n--model_path          The path of the model file in the frozen graph format\r\n--config_file         The path of the configuration file which contains the settings for the quanitization\r\n--save_model_path     The path to save the quantized model\r\n```\r\n\u003cb\u003eExample:\u003c/b\u003e\r\n```\r\npython src/INC/run_neural_compressor_conversion.py --dataset_file \"./data/ner_test_quan_dataset.csv\" --model_path \"./models/frozen_models/intel/model_b32/frozen_graph.pb\" --config_file \"./env/deploy.yaml\" --save_model_path \"./models/quantized_models/intel/model_b32_d100/inc_model_b32_d100/\"\r\n```\r\n\r\nwhere \u003ci\u003e\"./data/ner_test_quan_dataset.csv\"\u003c/i\u003e is the name of the file which contains the samples to be used during evaluation, \u003ci\u003e\"./models/frozen_models/intel/model_b32/frozen_graph.pb\"\u003c/i\u003e is the model file in frozen graph format, \u003ci\u003e\"./env/deploy.yaml\"\u003c/i\u003e is the configuration\r\nfile with settings for the INC quantization module and \u003ci\u003e\"./models/quantized_models/intel/model_b32_d100/inc_model_b32_d100/\"\u003c/i\u003e\r\nis the path to save the quantized model.\r\n\r\n\u003eNote:\r\nIf while running the above script if the error \"Unable to run due to ImportError: libGL.so.1: \r\ncannot open shared object file: No such file or directory\" occurs then install libgl using the command,\r\nsudo apt-get install libgl1\r\n\r\nIf the quanitized model need to be tuned to evaluate a specific accuracy relative to the FP32 trained model then the respective configuration\r\nparameters need to set in the config file. The python script \u003ci\u003erun_neural_compressor_tune_conversion.py\u003c/i\u003e need to used for the same. The syntax\r\nof the script and usage is given below.\r\n\r\n```\r\npython src/INC/run_neural_compressor_tune_conversion.py --dataset_file \u003ctest dataset file name\u003e --model_path \u003cpath of the frozen graph\u003e --config_file \u003cconfiguration file\u003e --save_model_path \u003cpath to save the model\u003e\r\n```\r\nwhere,\r\n```\r\n--dataset_file        The path of the test dataset file\r\n--model_path          The path of the model file in the frozen graph format\r\n--config_file         The path of the configuration file which contains the settings for the quanitization\r\n--save_model_path     The path to save the quantized model\r\n```\r\nThe usage of the script is similar to the \u003ci\u003erun_neural_compressor_conversion.py\u003c/i\u003e except for the configuration file.\r\n\r\n\u003cb\u003eExample:\u003c/b\u003e\r\n```\r\npython src/INC/run_neural_compressor_tune_conversion.py --dataset_file \"./data/ner_test_quan_dataset.csv\" --model_path \"./models/frozen_models/intel/model_b32/frozen_graph.pb\" --config_file \"./env/deploy_accuracy.yaml\" --save_model_path \"./models/acc_quantized_models/intel/model_b32_d100/inc_model_b32_d100/\"\r\n```\r\n\r\nwhere \u003ci\u003e\"./data/ner_test_quan_dataset.csv\"\u003c/i\u003e is the name of the file which contains the samples to be used during evaluation, \u003ci\u003e\"./models/frozen_models/intel/model_b32/frozen_graph.pb\"\u003c/i\u003e is the model file in frozen graph format, \u003ci\u003e\"./env/deploy_accuracy.yaml\"\u003c/i\u003e is the configuration\r\nfile with settings for the INC quantization module to fine tune the quanitized model and \u003ci\u003e\"./models/acc_quantized_models/intel/model_b32_d100/inc_model_b32_d100/\"\u003c/i\u003e\r\nis the path to save the quantized model.\r\n\r\n\u003eNote:\r\nIf while running the above script if the error \"Unable to run due to ImportError: libGL.so.1: \r\ncannot open shared object file: No such file or directory\" occurs then install libgl using the command,\r\nsudo apt-get install libgl1\r\n\r\n#### **Model Inference process with Intel® Quanitizations**\r\nNow that the quantized model is created using INC it can be used for inferencing on the test data and perform benchmarking.\r\nThe inferencing is done on the FP32 model and INC quantized model and results for real time inference and batch inference are used\r\nfor benchmarking.\r\n\r\nThe python script \u003ci\u003erun_neural_compressor_inference.py\u003c/i\u003e is used for perform predictions on the test data. The syntax to use the script \r\nis given below.\r\n\r\n```\r\npython src/INC/run_neural_compressor_inference.py --batch_size 32 --dataset_file \u003ctest dataset file\u003e --model_path \u003cFP32 or INC frozen graph file\u003e\r\n```\r\nwhere,\r\n```\r\n--batch_size      Give the required batch size for inference\r\n--dataset_file    The path of the test data set file name\r\n--model_path      The path of the FP32 or quantized frozen graph model\r\n```\r\n\u003cb\u003eExample:\u003c/b\u003e\r\n```\r\npython src/INC/run_neural_compressor_inference.py --batch_size 128 --dataset_file \"./data/ner_test_quan_dataset.csv\" --model_path \"./models/quantized_models/intel/model_b128_d100/inc_model_b128_d100.pb\" | tee \u003clog_file_name\u003e\r\n```\r\n\r\nThe inference is done for batch size of 128, with test dataset file \u003ci\u003e\"./data/ner_test_quan_dataset.csv\"\u003c/i\u003e, using the \r\nmodel \u003ci\u003e\"./models/frozen_models/frozen_graph.pb\"\u003c/i\u003e in frozen graph format. The model can also be a \r\nquantized model in frozen graph format.\r\n\r\n## **Performance Observations**\r\nThis section covers the inference time comparison between Stock Tensorflow version and Intel Tensorflow distribution for this model training and prediction. The results are captured for varying batch sizes which includes training time and inference time. The results are used to calculate the performance gain achieved by using Intel One API packages over stock version of similar packages.\r\n\r\n\r\n![image](assets/FP32AndINT8IDefaultQuantizationnferenceBenchmarksGraph.png)\r\n\r\n\u003cbr\u003e**Key Takeaways**\u003cbr\u003e\r\n- TensorFlow 2.9.0 with Intel OneDNN offers real time inference speed-up of upto 1.08x and batch inference time speed-up ranging between 1.16x and 1.24x compared to stock TensorFlow 2.8.0 version with different batch sizes.\r\n- Intel® Neural Compressor distribution with default quantization offers real time inference speed-up of upto 1.94x and batch inference time speed-up ranging between 1.86x and 1.93x compared to Stock TensorFlow 2.8.0 version with different batch sizes.\r\n\r\n#### **Conclusion**\r\nTo build a named entity recognition solution using BERT transfer learning approach, at scale, Data Scientist will need to train models for substantial datasets and run inference more frequently. The ability to accelerate training will allow them to train more frequently and achieve better accuracy. Besides training, faster speed in inference will allow them to run prediction in real-time scenarios as well as more frequently. A Data Scientist will also look at data classification to tag and categorize data so that it can be better understood and analyzed. This task requires a lot of training and retraining, making the job tedious. The ability to get it faster speed will accelerate the ML pipeline. This reference kit implementation provides performance-optimized guide around named entity tag prediction use cases that can be easily scaled across similar use cases.\r\n\r\n\r\n## **Notices \u0026 Disclaimers**\r\nTo the extent that any public or non-Intel datasets or models are referenced by or accessed using tools or code on this site those datasets or models are provided by the third party indicated as the content source. Intel does not create the content and does not warrant its accuracy or quality. By accessing the public content, or using materials trained on or with such content, you agree to the terms associated with that content and that your use complies with the applicable license.\r\n \r\nIntel expressly disclaims the accuracy, adequacy, or completeness of any such public content, and is not liable for any errors, omissions, or defects in the content, or for any reliance on the content. Intel is not liable for any liability or damages relating to your use of public content.\r\n\r\n\r\n## Appendix\r\n\r\n### **Experiment setup**\r\n\r\n| Platform                          | Microsoft Azure: Standard_D8_v5 (Ice Lake)\u003cbr\u003eUbuntu 20.04\r\n| :---                              | :---\r\n| Hardware                          | Azure Standard_D8_V5\r\n| CPU cores                         | 8\r\n| Memory                            | 32GB\r\n| Software                          | Intel® oneAPI AI Analytics Toolkit, Tensorflow\r\n| What you will learn               | Advantage of using Intel OneAPI TensorFlow (2.9.0 with oneDNN enabled) over the stock Tensorflow (Tensorflow 2.8.0) for the BERT transfer learning model architecture training and inference. Advantage of Intel® Neural Compressor over Intel OneAPI TensorFlow (2.9.0 with oneDNN enabled).\r\n\r\n","funding_links":[],"categories":["Table of Contents"],"sub_categories":["AI - Frameworks and Toolkits"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foneapi-src%2Fdocument-automation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foneapi-src%2Fdocument-automation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foneapi-src%2Fdocument-automation/lists"}