{"id":14181785,"url":"https://github.com/ServiceNow/picard","last_synced_at":"2025-08-07T14:31:14.146Z","repository":{"id":37727252,"uuid":"401779782","full_name":"ServiceNow/picard","owner":"ServiceNow","description":"PICARD - Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models. PICARD is a ServiceNow Research project that was started at Element AI.","archived":false,"fork":false,"pushed_at":"2023-10-18T02:48:43.000Z","size":778,"stargazers_count":333,"open_issues_count":36,"forks_count":123,"subscribers_count":11,"default_branch":"main","last_synced_at":"2024-08-18T11:13:38.903Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2109.05093","language":"Haskell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ServiceNow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-08-31T16:57:40.000Z","updated_at":"2024-08-09T01:58:17.000Z","dependencies_parsed_at":"2024-08-18T11:22:17.509Z","dependency_job_id":null,"html_url":"https://github.com/ServiceNow/picard","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2Fpicard","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2Fpicard/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2Fpicard/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ServiceNow%2Fpicard/manifests","owner_u
rl":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ServiceNow","download_url":"https://codeload.github.com/ServiceNow/picard/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":229045144,"owners_count":18011450,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-18T11:04:13.295Z","updated_at":"2024-12-10T10:31:13.914Z","avatar_url":"https://github.com/ServiceNow.png","language":"Haskell","readme":"*ServiceNow completed its acquisition of Element AI on January 8, 2021. All references to Element AI in the materials that are part of this project should refer to ServiceNow.*\n\n\u003cp align=\"center\"\u003e\n    \u003cbr\u003e\n    \u003cimg alt=\"make it parse\" src=\"https://repository-images.githubusercontent.com/401779782/c2f46be5-b74b-4620-ad64-57487be3b1ab\" width=\"600\"/\u003e\n    \u003cbr\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://github.com/ElementAI/picard/actions/workflows/build.yml\"\u003e\n        \u003cimg alt=\"build\" src=\"https://github.com/ElementAI/picard/actions/workflows/build.yml/badge.svg?branch=main\u0026event=push\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/ElementAI/picard/blob/main/LICENSE\"\u003e\n        \u003cimg alt=\"license\" src=\"https://img.shields.io/github/license/ElementAI/picard.svg?color=blue\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://paperswithcode.com/paper/picard-parsing-incrementally-for-constrained\"\u003e\n        \u003cimg 
src=\"https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/picard-parsing-incrementally-for-constrained/text-to-sql-on-spider\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://paperswithcode.com/paper/picard-parsing-incrementally-for-constrained\"\u003e\n        \u003cimg src=\"https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/picard-parsing-incrementally-for-constrained/dialogue-state-tracking-on-cosql\"\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\nThis is the official implementation of the following paper:\n\n[Torsten Scholak](https://twitter.com/tscholak), Nathan Schucher, Dzmitry Bahdanau. [PICARD - Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models](https://arxiv.org/abs/2109.05093). *Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP).*\n\nIf you use this code, please cite:\n\n```bibtex\n@inproceedings{Scholak2021:PICARD,\n  author = {Torsten Scholak and Nathan Schucher and Dzmitry Bahdanau},\n  title = \"{PICARD}: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models\",\n  booktitle = \"Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing\",\n  month = nov,\n  year = \"2021\",\n  publisher = \"Association for Computational Linguistics\",\n  url = \"https://aclanthology.org/2021.emnlp-main.779\",\n  pages = \"9895--9901\",\n}\n```\n\n## Watch The Video\n\n[![Watch the video](https://img.youtube.com/vi/kTpixsr-37w/maxresdefault.jpg)](https://youtu.be/kTpixsr-37w)\n\n## Overview\n\nThis code implements:\n\n* The PICARD algorithm for constrained decoding from language models.\n* A text-to-SQL semantic parser based on pre-trained sequence-to-sequence models and PICARD achieving state-of-the-art performance on both the [Spider](https://yale-lily.github.io/spider) and the [CoSQL](https://yale-lily.github.io/cosql) datasets. 
\n\n## About PICARD\n\n\u003e **TL;DR:** We introduce PICARD -- a new method for simple and effective constrained decoding from large pre-trained language models.\n\u003e On the challenging Spider and CoSQL text-to-SQL datasets, PICARD significantly improves the performance of fine-tuned but otherwise unmodified T5 models.\n\u003e Using PICARD, our T5-3B models achieved state-of-the-art performance on both Spider and CoSQL.\n\nIn text-to-SQL translation, the goal is to translate a natural language question into a SQL query.\nThere are two main challenges to this task:\n\n1. The generated SQL needs to be semantically correct, that is, correctly reflect the meaning of the question.\n2. The SQL also needs to be valid, that is, it must not result in an execution error.\n\nSo far, there has been a trade-off between these two goals:\nThe second problem can be solved by using a special decoder architecture that -- by construction -- always produces valid SQL.\nThis is the approach taken by most prior work.\nThose decoders are called \"constrained decoders\", and they need to be trained from scratch on the text-to-SQL dataset.\nHowever, this limits the generality of the decoders, which is a problem for the first goal.\n\nA better approach would be to use a pre-trained encoder-decoder model and to constrain its decoder to produce valid SQL after fine-tuning the model on the text-to-SQL task.\nThis is the approach taken by the PICARD algorithm.\n\n### How is PICARD different from existing constrained decoders?\n\n* It’s an incremental parsing algorithm that integrates with ordinary beam search.\n* It doesn’t require any training.\n* It doesn’t require modifying the model.\n* It works with any model that generates a sequence of tokens (including language models).\n* It doesn’t require a special vocabulary.\n* It works with character-, sub-word-, and word-level language models.\n\n### How does PICARD work?\n\nThe following picture shows how PICARD is integrated with beam 
search.\n\n\u003cp align=\"center\"\u003e\n    \u003cbr\u003e\n    \u003cimg src=\"beam_search_with_picard.svg\" width=\"400\"/\u003e\n    \u003cbr\u003e\n\u003c/p\u003e\n\nDecoding starts from the left and proceeds to the right.\nThe algorithm begins with a single token (usually `\u003cs\u003e`),\nand then keeps expanding the beam with hypotheses generated token-by-token by the decoder.\nAt each decoding step and for each hypothesis,\nPICARD checks whether the next top-`k` tokens are valid.\nIn the image above, only 3 token predictions are shown, and `k` is set to 2.\nValid tokens (☑) are added to the beam. Invalid ones (☒) are discarded. The `k+1`-th, `k+2`-th, ... tokens are discarded, too.\nLike in normal beam search, the beam is pruned to contain only the top-`n` hypotheses.\n`n` is the beam size, and in the image above it is set to 2 as well.\nHypotheses that are terminated with the end-of-sentence token (usually `\u003c/s\u003e`) are not expanded further.\nThe algorithm stops when all hypotheses are terminated\nor when the maximum number of tokens has been reached.\n\n### How does PICARD know whether a token is valid?\n\nIn PICARD, checking, accepting, and rejecting of tokens and token sequences is achieved through *parsing*.\nParsing means that we attempt to assemble a data structure from the tokens\nthat are currently in the beam or are about to be added to it.\nThis data structure (and the parsing rules that are used to build it) encode the constraints we want to enforce.\n\nIn the case of SQL, the data structure we parse to is the abstract syntax tree (AST) of the SQL query.\nThe parsing rules are defined in a computer program called a parser.\nDatabase engines, such as PostgreSQL, MySQL, and SQLite, have their own built-in parser that they use internally to process SQL queries.\nFor Spider and CoSQL,\nwe have implemented a parser that supports a subset of the SQLite syntax and that checks additional constraints on the AST.\nIn our 
implementation,\nthe parsing rules are made up of simpler rules and primitives that are provided by a third-party parsing library.\n\nPICARD uses a parsing library called [attoparsec](https://hackage.haskell.org/package/attoparsec) that supports incremental input.\nThis is a special capability that is not available in many other parsing libraries.\nYou can feed attoparsec a string that represents only part of the expected input to parse.\nWhen parsing reaches the end of an input fragment,\nattoparsec will return a [continuation function](https://hackage.haskell.org/package/attoparsec-0.14.1/docs/Data-Attoparsec-Text.html#t:IResult)\nthat can be used to continue parsing.\nThink of the continuation function as a suspended computation that can be resumed later.\nInput fragments can be parsed one after the other when they become available until the input is complete.\n\nHerein lies the key to PICARD:\nIncremental parsing of input fragments is exactly what we need to check tokens one by one during decoding.\n\nIn PICARD,\nparsing is initialized with an empty string, and attoparsec will return the first continuation function.\nWe then call that continuation function with all the token predictions we want to check in the first decoding step.\nFor those tokens that are valid, the continuation function will return a new continuation function\nthat we can use to continue parsing in the next decoding step.\nFor those tokens that are invalid, the continuation function will return a failure value which cannot be used to continue parsing.\nSuch failures are discarded and never end up in the beam.\nWe repeat the process until the end of the input is reached.\nThe input is complete once the model predicts the end-of-sentence token.\nWhen that happens, we finalize the parsing by calling the continuation function with an empty string.\nIf the parsing is successful, it will return the final AST.\nIf not, it will return a failure value.\n\nThe parsing rules are described at a high 
level in the [PICARD paper](https://arxiv.org/abs/2109.05093).\nFor details, see the PICARD code, specifically the [Language.SQL.SpiderSQL.Parse module](https://github.com/ElementAI/picard/blob/main/picard/src/Language/SQL/SpiderSQL/Parse.hs).\n\n### How well does PICARD work?\n\nLet's look at the numbers:\n\n#### On [Spider](https://yale-lily.github.io/spider)\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth rowspan=2 valign=bottom\u003eURL\u003c/th\u003e\n    \u003cth rowspan=2 valign=bottom\u003eBased on\u003c/th\u003e\n    \u003cth colspan=2\u003eExact-set Match Accuracy\u003c/th\u003e\n    \u003cth colspan=2\u003eExecution Accuracy\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth\u003eDev\u003c/th\u003e\n    \u003cth\u003eTest\u003c/th\u003e\n    \u003cth\u003eDev\u003c/th\u003e\n    \u003cth\u003eTest\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003cb\u003e\u003ca href=\"https://huggingface.co/tscholak/cxmefzzi\"\u003etscholak/cxmefzzi\u003c/a\u003e w PICARD\u003c/b\u003e\u003c/td\u003e\n    \u003ctd\u003eT5-3B\u003c/td\u003e\n    \u003ctd\u003e\u003cb\u003e75.5 %\u003c/b\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003cb\u003e71.9 %\u003c/b\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003cb\u003e79.3 %\u003c/b\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003cb\u003e75.1 %\u003c/b\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://huggingface.co/tscholak/cxmefzzi\"\u003etscholak/cxmefzzi\u003c/a\u003e w/o PICARD\u003c/td\u003e\n    \u003ctd\u003eT5-3B\u003c/td\u003e\n    \u003ctd\u003e71.5 %\u003c/td\u003e\n    \u003ctd\u003e68.0 %\u003c/td\u003e\n    \u003ctd\u003e74.4 %\u003c/td\u003e\n    \u003ctd\u003e70.1 %\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://huggingface.co/tscholak/3vnuv1vf\"\u003etscholak/3vnuv1vf\u003c/a\u003e w PICARD\u003c/td\u003e\n    \u003ctd\u003e\u003ca 
href=\"https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k\"\u003et5.1.1.lm100k.large\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e74.8 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n    \u003ctd\u003e79.2 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://huggingface.co/tscholak/3vnuv1vf\"\u003etscholak/3vnuv1vf\u003c/a\u003e w/o PICARD\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k\"\u003et5.1.1.lm100k.large\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e71.2 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n    \u003ctd\u003e74.4 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://huggingface.co/tscholak/1wnr382e\"\u003etscholak/1wnr382e\u003c/a\u003e w PICARD\u003c/td\u003e\n    \u003ctd\u003eT5-Large\u003c/td\u003e\n    \u003ctd\u003e69.1 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n    \u003ctd\u003e72.9 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://huggingface.co/tscholak/1wnr382e\"\u003etscholak/1wnr382e\u003c/a\u003e w/o PICARD\u003c/td\u003e\n    \u003ctd\u003eT5-Large\u003c/td\u003e\n    \u003ctd\u003e65.3 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n    \u003ctd\u003e67.2 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://huggingface.co/tscholak/1zha5ono\"\u003etscholak/1zha5ono\u003c/a\u003e w PICARD\u003c/td\u003e\n    \u003ctd\u003e\u003ca 
href=\"https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k\"\u003et5.1.1.lm100k.base\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e66.6 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n    \u003ctd\u003e68.4 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://huggingface.co/tscholak/1zha5ono\"\u003etscholak/1zha5ono\u003c/a\u003e w/o PICARD\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k\"\u003et5.1.1.lm100k.base\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e59.4 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n    \u003ctd\u003e60.0 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\nClick on the links to download the models.\n\u003ca href=\"https://huggingface.co/tscholak/cxmefzzi\"\u003etscholak/cxmefzzi\u003c/a\u003e and \u003ca href=\"https://huggingface.co/tscholak/1wnr382e\"\u003etscholak/1wnr382e\u003c/a\u003e\nare the versions of the model that we used in our experiments for the paper, reported as T5-3B and T5-Large, respectively.\n\u003ca href=\"https://huggingface.co/tscholak/cxmefzzi\"\u003etscholak/cxmefzzi\u003c/a\u003e, \u003ca href=\"https://huggingface.co/tscholak/3vnuv1vf\"\u003etscholak/3vnuv1vf\u003c/a\u003e, and \u003ca href=\"https://huggingface.co/tscholak/1zha5ono\"\u003etscholak/1zha5ono\u003c/a\u003e were trained to use database content, whereas \u003ca href=\"https://huggingface.co/tscholak/1wnr382e\"\u003etscholak/1wnr382e\u003c/a\u003e was not.\n\nNote that, without PICARD, 12% of the SQL queries generated by \u003ca href=\"https://huggingface.co/tscholak/cxmefzzi\"\u003etscholak/cxmefzzi\u003c/a\u003e on Spider’s development set resulted in an execution error. 
With PICARD, this number decreased to 2%.\n\n#### On [CoSQL](https://yale-lily.github.io/cosql) Dialogue State Tracking\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth rowspan=2 valign=bottom\u003eURL\u003c/th\u003e\n    \u003cth rowspan=2 valign=bottom\u003eBased on\u003c/th\u003e\n    \u003cth colspan=2\u003eQuestion Match Accuracy\u003c/th\u003e\n    \u003cth colspan=2\u003eInteraction Match Accuracy\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003cth\u003eDev\u003c/th\u003e\n    \u003cth\u003eTest\u003c/th\u003e\n    \u003cth\u003eDev\u003c/th\u003e\n    \u003cth\u003eTest\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003cb\u003e\u003ca href=\"https://huggingface.co/tscholak/2e826ioa\"\u003etscholak/2e826ioa\u003c/a\u003e w PICARD\u003c/b\u003e\u003c/td\u003e\n    \u003ctd\u003eT5-3B\u003c/td\u003e\n    \u003ctd\u003e\u003cb\u003e56.9 %\u003c/b\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003cb\u003e54.6 %\u003c/b\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003cb\u003e24.2 %\u003c/b\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003cb\u003e23.7 %\u003c/b\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://huggingface.co/tscholak/2e826ioa\"\u003etscholak/2e826ioa\u003c/a\u003e w/o PICARD\u003c/td\u003e\n    \u003ctd\u003eT5-3B\u003c/td\u003e\n    \u003ctd\u003e53.8 %\u003c/td\u003e\n    \u003ctd\u003e51.4 %\u003c/td\u003e\n    \u003ctd\u003e21.8 %\u003c/td\u003e\n    \u003ctd\u003e21.7 %\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://huggingface.co/tscholak/2jrayxos\"\u003etscholak/2jrayxos\u003c/a\u003e w PICARD\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k\"\u003et5.1.1.lm100k.large\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e54.2 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n    
\u003ctd\u003e—\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://huggingface.co/tscholak/2jrayxos\"\u003etscholak/2jrayxos\u003c/a\u003e w/o PICARD\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#lm-adapted-t511lm100k\"\u003et5.1.1.lm100k.large\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e52.5 %\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n    \u003ctd\u003e—\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\nClick on the links to download the models. \u003ca href=\"https://huggingface.co/tscholak/2e826ioa\"\u003etscholak/2e826ioa\u003c/a\u003e is the version of the model that we used in our experiments for the paper, reported as T5-3B.\n\n## Quick Start\n\n### Prerequisites\n\nThis repository uses git submodules. Clone it like this:\n```sh\n$ git clone git@github.com:ElementAI/picard.git\n$ cd picard\n$ git submodule update --init --recursive\n```\n\n### Training\n\nThe training script is located in `seq2seq/run_seq2seq.py`.\nYou can run it with:\n```\n$ make train\n```\nThe model will be trained on the Spider dataset by default.\nYou can also train on CoSQL by running `make train-cosql`.\n\nThe training script will create the directory `train` in the current directory.\nTraining artifacts like checkpoints will be stored in this directory.\n\nThe default configuration is stored in `configs/train.json`.\nThe settings are optimized for a GPU with 40GB of memory.\n\nThese training settings should result in a model\nwith at least 71% exact-set-match accuracy on the Spider development set.\nWith PICARD, the accuracy should go up to at least 75%.\n\nWe have uploaded a model trained on the Spider dataset to the huggingface model hub,\n\u003ca href=\"https://huggingface.co/tscholak/cxmefzzi\"\u003etscholak/cxmefzzi\u003c/a\u003e.\nA model 
trained on the CoSQL dialog state tracking dataset is available, too,\n\u003ca href=\"https://huggingface.co/tscholak/2e826ioa\"\u003etscholak/2e826ioa\u003c/a\u003e.\n\n### Evaluation\n\nThe evaluation script is located in `seq2seq/run_seq2seq.py`.\nYou can run it with:\n```\n$ make eval\n```\nBy default, the evaluation will be run on the Spider evaluation set.\nEvaluation on the CoSQL evaluation set can be run with `make eval-cosql`.\n\nThe evaluation script will create the directory `eval` in the current directory.\nThe evaluation results will be stored there.\n\nThe default configuration is stored in `configs/eval.json`.\n\n### Serving\n\nA trained model can be served using the `seq2seq/serve_seq2seq.py` script.\nThe configuration file can be found in `configs/serve.json`.\nYou can start serving with:\n```\n$ make serve\n```\nBy default, the 800-million-parameter \u003ca href=\"https://huggingface.co/tscholak/3vnuv1vf\"\u003etscholak/3vnuv1vf\u003c/a\u003e model will be loaded. You can also load a different model by specifying the model name in the configuration file. The device to use can be specified as well. The default is to use the first available GPU. 
CPU can be used by specifying `-1`.\n\nWhen the script is called, it uses the folder specified by the `db_path` option to look for SQL database files.\nThe default folder is `database`, which will be created in the current directory.\nInitially, this folder will be empty, and you can add your own SQL files to it.\nThe structure of the folder should be like this:\n```\ndatabase/\n  my_1st_database/\n    my_1st_database.sqlite\n  my_2nd_database/\n    my_2nd_database.sqlite\n```\nwhere `my_1st_database` and `my_2nd_database` are the `db_id`s of the databases.\n\nOnce the server is up and running, use the Swagger UI to test inference with the `/ask` endpoint.\nThe server will be listening at `http://localhost:8000/`,\nand the Swagger UI will be available at `http://localhost:8000/docs#/default/ask_ask__db_id___question__get`.\n\n### Docker\n\nThere are three docker images that can be used to run the code:\n\n* **[tscholak/text-to-sql-dev](https://hub.docker.com/repository/docker/tscholak/text-to-sql-dev):** Base image with development dependencies. Use this for development. Pull it with `make pull-dev-image` from the docker hub. Rebuild the image with `make build-dev-image`. \n* **[tscholak/text-to-sql-train](https://hub.docker.com/repository/docker/tscholak/text-to-sql-train):** Training image with development dependencies but without Picard dependencies. Use this for fine-tuning a model. Pull it with `make pull-train-image` from the docker hub. Rebuild the image with `make build-train-image`.\n* **[tscholak/text-to-sql-eval](https://hub.docker.com/repository/docker/tscholak/text-to-sql-eval):** Training/evaluation image with all dependencies. Use this for evaluating a fine-tuned model with Picard. This image can also be used for training if you want to run evaluation during training with Picard. Pull it with `make pull-eval-image` from the docker hub. Rebuild the image with `make build-eval-image`.\n\nAll images are tagged with the current commit hash. 
The images are built with the buildx tool which is available in the latest docker-ce. Use `make init-buildkit` to initialize the buildx tool on your machine. You can then use `make build-dev-image`, `make build-train-image`, etc. to rebuild the images. Local changes to the code will not be reflected in the docker images unless they are committed to git.\n","funding_links":[],"categories":["💬 Classic Model","Haskell"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FServiceNow%2Fpicard","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FServiceNow%2Fpicard","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FServiceNow%2Fpicard/lists"}