{"id":22344716,"url":"https://github.com/languagemachines/tadpole","last_synced_at":"2026-01-28T17:36:18.211Z","repository":{"id":146794205,"uuid":"73700759","full_name":"LanguageMachines/tadpole","owner":"LanguageMachines","description":"The good old predecessor of Frog","archived":false,"fork":false,"pushed_at":"2016-11-14T13:11:19.000Z","size":28905,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-06-01T07:56:47.911Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Lex","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LanguageMachines.png","metadata":{"files":{"readme":"README","changelog":"ChangeLog","contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-11-14T12:03:31.000Z","updated_at":"2016-11-14T13:07:48.000Z","dependencies_parsed_at":null,"dependency_job_id":"d08575a4-b169-43d5-862b-99fe5a7beffa","html_url":"https://github.com/LanguageMachines/tadpole","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/LanguageMachines/tadpole","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LanguageMachines%2Ftadpole","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LanguageMachines%2Ftadpole/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LanguageMachines%2Ftadpole/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LanguageMachines%2Ftadpole/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/owners/LanguageMachines","download_url":"https://codeload.github.com/LanguageMachines/tadpole/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LanguageMachines%2Ftadpole/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28847902,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-28T15:15:36.453Z","status":"ssl_error","status_checked_at":"2026-01-28T15:15:13.020Z","response_time":57,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-04T09:14:53.976Z","updated_at":"2026-01-28T17:36:18.193Z","avatar_url":"https://github.com/LanguageMachines.png","language":"Lex","funding_links":[],"categories":[],"sub_categories":[],"readme":"Tadpole 0.6\n\n  A Tagger-Lemmatizer-Morphological-Analyzer-Dependency-Parser for Dutch\n  Version  0.6\n  http://ilk.uvt.nl/tadpole\n \n  Copyright 2006-2010 Bertjan Busser, Antal van den Bosch, and Ko\n  van der Sloot\n  ILK Research Group, Faculty of Humanities, Tilburg University\n  http://ilk.uvt.nl\n\n  Tadpole is free software; you can redistribute it and/or modify\n  it under the terms of the GNU General Public License as published by\n  the Free Software Foundation; either version 3 of the License, or\n  (at your option) any later version.\n\n  Tadpole is distributed in the hope that it will be useful,\n  but WITHOUT ANY WARRANTY; without 
even the implied warranty of\n  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\n  GNU General Public License for more details.\n\n  You should have received a copy of the GNU General Public License\n  along with this program.  If not, see \u003chttp://www.gnu.org/licenses/\u003e.\n\n  For more information and updates, see:\n      http://ilk.uvt.nl/tadpole\n\n---------------------------------------------------------------------\nInstallation and Quick Start\n\nTadpole relies on Timbl version 6.3, TimblServer version 1.0, and Mbt\nversion 3.2. TimblServer relies on Timbl; Mbt relies on Timbl and\nTimblServer. The logical order of installation is therefore (1) Timbl,\n(2) TimblServer, (3) Mbt, and (4) Tadpole. Tadpole will NOT work with\nprevious versions of Timbl and Mbt. The three software packages can be\ndownloaded from\n\n  http://ilk.uvt.nl/timbl (Timbl and TimblServer)\n  http://ilk.uvt.nl/mbt   (Mbt)\n\nPlease consult the installation instructions with these packages.\n\nTadpole also relies on Python 2.5 or higher, libboost 1.33 or higher,\nand ICU 3.6 or higher. Please consult your system maintainer if you\ncannot install these packages yourself.\n\nWhen you have downloaded the Tadpole tarball from\nhttp://ilk.uvt.nl/downloads/pub/software/tadpole , you can untar the\npackage, and go to the Tadpole directory. 
If you installed Timbl,\nTimblServer and Mbt in the same install directory (i.e., you specified\nthe same install directory with \"--prefix=\u003cinstalldir\u003e\" in all three\npackage installations), it is sufficient to do the same with Tadpole.\n\n%prompt$\u003e tar zxvf tadpole-0.6.tar.gz\n%prompt$\u003e cd tadpole-0.6\n%prompt$\u003e ./configure --prefix=\u003cinstalldir\u003e\n%prompt$\u003e make \u0026\u0026 make install\n\nInvoking the Tadpole binary without arguments prints a basic usage message:\n\n%prompt$\u003e ./Tadpole\nTadpole v.0.6\nOptions:\n\t-d \u003cdirName\u003e path to config dir (default ./config)\n\t-T \u003ctaggerconfigfile\u003e (uses Mbt-style settings file)\n\t-M \u003cMBMAconfigfile\u003e (morphological analysis)\n\t   accepts:\n\t\tt \u003ctreefile\u003e\n\t\tm \u003cmode\u003e\n\t-L \u003cMBlemconfigfile\u003e (lemmatizer)\n\t   accepts:\n\t\tp \u003cprefix\u003e (for filenames)\n\t\tO \u003ctimbl options\u003e\n\t-U \u003cmwuconfigfile\u003e (multiwordchunker)\n\t   accepts:\n\t\tt \u003cmwu unit file\u003e\n\t\tc \u003cconnect_char\u003e (char between tokens in a mwu)\n\t-P \u003cparserconfigfile\u003e \n\t   accepts:\n\t   to do...\n\t-t \u003ctestfile\u003e\n\t--testdir=\u003cdirectory\u003e (all files in this dir will be processed\n\t-o \u003coutputfile\u003e (default stdout)\n\t--outputdir=\u003coutputfile\u003e (default stdout)\n\t-s \u003coutput field separator\u003e (default tab)\n\t-S \u003cport\u003e (run as server instead of reading from testfile)\n\t-K : keep intermediate files, (last sentence only) (default false)\n\t-d \u003cdebug level\u003e (for more verbosity)\n\t--skip=\u003ccomponents\u003e Allows to skip certain Tadpole components.\n\t  Especially the dependency parser is resource intensive and may\n\t  want to be skipped when not required. 
Components are indicated by\n\t  one character, multiple may be combined:\n\t  t - tokeniser, p - parser, m - morphological analyser\n\nThe following command line is an example run of Tadpole on the provided\nsample text file test.txt:\n\n%prompt$\u003e ./Tadpole -t test.txt\n\nThis should produce output (to stdout) like this:\n\n1    De\t    de\t    [de]   LID(bep,stan,rest)\t2\tdet\n2    oprichter\t    oprichter\t\t\t[op][richt][er]\tN(soort,ev,basis,zijd,stan)\t8\tsu\n3    van\t    van\t\t\t\t[van]\t\tVZ(init)\t\t\t2\tmod\n4    Wikipedia\t    Wikipedia\t\t\t[Wikipedia]\tSPEC(deeleigen)\t\t\t3\tobj1\n5    ,\t\t    ,\t\t\t\t[,]\t\tLET()\t\t\t\t4\tpunct\n6    Jimmy_Wales    Jimmy_Wales\t\t\t[Jimmy]_[Wales]\tSPEC(deeleigen)\t\t\t2\tapp\n7    ,\t\t    ,\t\t\t\t[,]\t\tLET()\t\t\t\t6\tpunct\n8    wil\t    willen\t\t\t[wil]\t\tWW(pv,tgw,ev)\t\t\t0\tROOT\n9    een\t    een\t\t\t\t[een]\t\tLID(onbep,stan,agr)\t\t11\tdet\n10   nieuwe\t    nieuw\t\t\t[nieuw][e]\tADJ(prenom,basis,met-e,stan)\t11\tmod\n11   zoekmachine    zoekmachine\t\t\t[zoek][machine]\tN(soort,ev,basis,zijd,stan)\t8\tsu\n12   lanceren\t    lanceren\t\t\t[lanceren]\tWW(inf,vrij,zonder)\t\t8\tvc\n13   .\t\t    .\t\t\t\t[.]\t\tLET()\t\t\t\t12\tpunct\n\nThe first column is a token counter; the second column is the token\nitself; the third is its lemma; and the fourth is its morphological\nanalysis. The fifth column is the CGN POS tag. The sixth column gives\nthe token counter of the token's head in the dependency graph (0 for\nthe root); the seventh column contains the type of dependency relation\nbetween the two tokens.\n\n\n---------------------------------------------------------------------\nCredits\n\nMany thanks go out to the people who made the development of the\nTadpole components possible: Walter Daelemans, Jakub Zavrel, Ko van\nder Sloot, Sabine Buchholz, Sander Canisius, Gert Durieux, and Peter\nBerck. 
\n\nThanks to Erik Tjong Kim Sang and Lieve Macken for stress-testing the\nfirst versions of Tadpole, and to Rogier Kraf, Guy De Pauw, Joost\nHengstmengel, Frederik Vaassen, Wouter van Atteveldt, Joseph Turian,\nBarbara Plank, Jan-Pieter Kunst, Robert Hensing, Theo van den Heuvel,\nand Martha van den Hoven for valuable bug reports, comments, and\nsuggestions for improvements.\n\n\n---------------------------------------------------------------------\nReferences\n\nTadpole is described in the following paper:\n\nVan den Bosch, A., Busser, G.J., Daelemans, W., and Canisius, S. (to\n appear). An efficient memory-based morphosyntactic tagger and parser for\n Dutch, To appear in Selected Papers of the 17th Computational Linguistics in\n the Netherlands Meeting, Leuven, Belgium.\n\nWe kindly ask you to refer to this paper if you make use of Tadpole in\nyour own work.\n\nYou can find more information on components of Tadpole in these papers,\nwhich can be downloaded from http://ilk.uvt.nl/publications :\n\nDaelemans, W., Zavrel, J, Berck, P, and Gillis, S. (1996). MBT: A\n Memory-Based Part of Speech Tagger-Generator. In: E. Ejerhed and I. Dagan\n (eds.) Proceedings of the Fourth Workshop on Very Large Corpora, Copenhagen,\n Denmark, pp. 14-27.\n\nVan den Bosch, A., Daelemans, W., and Weijters, A. (1996). Morphological\n analysis as classification: An inductive-learning approach. In Proceedings\n of NeMLaP-2, Bilkent University, Turkey, 79-89.\n\nVan den Bosch, A., and Daelemans, W. (1999). Memory-based morphological\n analysis. In Proceedings of the 37th Annual Meeting of the Association for\n Computational Linguistics, ACL'99, University of Maryland, USA, June 20-26,\n 1999, pp. 285-292.\n\nZavrel, J., and Daelemans W. (1999).  Recent Advances in Memory-Based\n Part-of-Speech Tagging. In: Actas del VI Simposio Internacional de\n Comunicacion Social, Santiago de Cuba, pp. 
590-597.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flanguagemachines%2Ftadpole","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flanguagemachines%2Ftadpole","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flanguagemachines%2Ftadpole/lists"}