{"id":13482511,"url":"https://github.com/knowitall/reverb","last_synced_at":"2025-03-27T13:32:07.177Z","repository":{"id":2076925,"uuid":"3016145","full_name":"knowitall/reverb","owner":"knowitall","description":"Web-Scale Open Information Extraction","archived":false,"fork":false,"pushed_at":"2019-03-06T15:57:35.000Z","size":11891,"stargazers_count":537,"open_issues_count":6,"forks_count":138,"subscribers_count":67,"default_branch":"master","last_synced_at":"2024-04-18T02:39:13.148Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://reverb.cs.washington.edu/","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"xiezhenye/mysql-plugin-disable-myisam","license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/knowitall.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2011-12-20T00:21:08.000Z","updated_at":"2024-04-12T19:19:47.000Z","dependencies_parsed_at":"2022-09-07T12:02:20.804Z","dependency_job_id":null,"html_url":"https://github.com/knowitall/reverb","commit_stats":null,"previous_names":["knowitall/reverb-core"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/knowitall%2Freverb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/knowitall%2Freverb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/knowitall%2Freverb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/knowitall%2Freverb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/knowitall","download_url":"https://codeload.github.com/knowitall/reverb/tar.gz/refs/heads/master","host":{"name":"GitH
ub","url":"https://github.com","kind":"github","repositories_count":245854499,"owners_count":20683366,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T17:01:02.770Z","updated_at":"2025-03-27T13:32:03.563Z","avatar_url":"https://github.com/knowitall.png","language":"Java","funding_links":[],"categories":["函式庫","人工智能"],"sub_categories":["書籍"],"readme":"# ReVerb\n\nReVerb is a program that automatically identifies and extracts binary \nrelationships from English sentences. ReVerb is designed for Web-scale \ninformation extraction, where the target relations cannot be specified in \nadvance and speed is important. \n\nReVerb takes raw text as input, and outputs (argument1, relation phrase, \nargument2) triples. For example, given the sentence \"Bananas are an excellent \nsource of potassium,\" ReVerb will extract the triple (bananas, be source of, \npotassium). \n\nMore information is available at the ReVerb homepage: \n\u003chttp://reverb.cs.washington.edu\u003e\n\n## Quick Start\nIf you want to run ReVerb on a small amount of text without modifying its \nsource code, we provide an executable jar file that can be run from the command \nline. Follow these steps to get started:\n\n1.  Download the latest ReVerb jar from \n\u003chttp://reverb.cs.washington.edu/reverb-latest.jar\u003e\n\n2.  Run `java -Xmx512m -jar reverb-latest.jar yourfile.txt`.\n\n3.  Run `java -Xmx512m -jar reverb-latest.jar -h` for more options.\n\n## Building\nBuilding ReVerb from source requires Apache Maven (\u003chttp://maven.apache.org\u003e). 
\nRun this command to download the required dependencies, compile, and create a \nsingle executable jar file.\n\n    mvn clean compile assembly:single\n\nThe compiled class files will be put in the `target/classes` directory. The \nsingle executable jar file will be written to \n`target/reverb-core-*-jar-with-dependencies.jar` where `*` is replaced with\nthe version number.  \n\n## Command Line Interface\nOnce you have built ReVerb, you can run it from the command line.\n\nThe command line interface to ReVerb takes plain text or HTML as input, and \noutputs a tab-separated table. Each row in the output represents a \nsingle extracted (argument1, relation phrase, argument2) triple, plus metadata. \nThe output has the following columns:\n\n1. The filename (or `stdin` if the source is standard input)\n2. The sentence number this extraction came from. \n3. Argument1 words, space-separated\n4. Relation phrase words, space-separated\n5. Argument2 words, space-separated\n6. The start index of argument1 in the sentence. For example, if the value is \n`i`, then the first word of argument1 is the `i+1`th word in the sentence.\n7. The end index of argument1 in the sentence. For example, if the value is \n`j`, then the last word of argument1 is the `j`th word in the sentence.\n8. The start index of relation phrase.\n9. The end index of relation phrase.\n10. The start index of argument2.\n11. The end index of argument2.\n12. The confidence that this extraction is correct. The higher the number, the \nmore trustworthy this extraction is.\n13. The words of the sentence this extraction came from, space-separated.\n14. The part-of-speech tags for the sentence words, space-separated. \n15. The chunk tags for the sentence words, space-separated. These represent a \nshallow parse of the sentence. \n16. A normalized version of arg1. See the `BinaryExtractionNormalizer` javadoc \nfor details about how the normalization is done.\n17. A normalized version of rel.\n18. 
A normalized version of arg2.\n\nFor example:\n\n    $ echo \"Bananas are an excellent source of potassium.\" | \n        ./reverb -q | tr '\\t' '\\n' | cat -n\n     1  stdin\n     2  1\n     3  Bananas\n     4  are an excellent source of\n     5  potassium\n     6  0\n     7  1\n     8  1\n     9  6\n    10  6\n    11  7\n    12  0.9999999997341693\n    13  Bananas are an excellent source of potassium .\n    14  NNS VBP DT JJ NN IN NN .\n    15  B-NP B-VP B-NP I-NP I-NP I-NP I-NP O\n    16  bananas\n    17  be source of\n    18  potassium\n\nFor a list of options to the command line interface to ReVerb, run `reverb -h`. \n\n### Examples\n\n#### Running ReVerb on a small set of files\n    ./reverb file1 file2 file3 ...\n\n#### Running ReVerb on standard input\n    ./reverb \u003c input\n\n#### Running ReVerb on HTML files\nThe `--strip-html` flag (short version: `-s`) removes tags from the input \nbefore running ReVerb. \n\n    ./reverb --strip-html myfile.html\n\n#### Running ReVerb on a list of files\nYou may have an entire directory structure that you want to run ReVerb on. \nReVerb takes approximately 10 seconds to initialize, so it is not efficient to \nstart a new process for each file. To pass ReVerb a list of paths, use the `-f` \nswitch:\n\n    # Run ReVerb on all files under mydir/\n    find mydir/ -type f | ./reverb -f\n\n## Java Interface\nTo include ReVerb as a library in your own project, please take a look at the \nexample class `ReVerbExample` in the \n`src/main/java/edu/washington/cs/knowitall/examples` directory. \n\nWhen running code that calls ReVerb, make sure to increase the Java Virtual \nMachine heap size by passing the argument `-Xmx512m` to java. 
ReVerb loads \nmultiple models into memory, and will be significantly slower if the heap size \nis not large enough.\n\n## Using Eclipse\nTo modify the ReVerb source code in Eclipse, use Apache Maven to create the \nappropriate project files:\n\n    mvn eclipse:eclipse\n\nThen, start Eclipse and navigate to File \u003e Import. Then, under General, select \n\"Existing Projects into Workspace\". Then point Eclipse to the main ReVerb \ndirectory.\n\n## Including ReVerb as a Dependency\nIf you want to start a new project that depends on ReVerb, first create a new\nskeleton project using Maven. The following command will ask you to fill in\nthe details of your project name, etc.:\n\n    mvn archetype:generate\n\nNext, add ReVerb as a dependency. To make sure you are using the latest version\nof ReVerb, [consult Maven Central](http://search.maven.org/#search%7Cga%7C1%7Creverb).  Do this by adding the following XML under the `\u003cproject\u003e` element:\n\n    \u003cdependencies\u003e\n      \u003cdependency\u003e\n        \u003cgroupId\u003eedu.washington.cs.knowitall\u003c/groupId\u003e\n        \u003cartifactId\u003ereverb-core\u003c/artifactId\u003e\n        \u003cversion\u003e1.4.1\u003c/version\u003e\n      \u003c/dependency\u003e\n    \u003c/dependencies\u003e\n\nYour final `pom.xml` file should look something like this:\n\n    \u003cproject xmlns=\"http://maven.apache.org/POM/4.0.0\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"\n      xsi:schemaLocation=\"http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd\"\u003e\n      \u003cmodelVersion\u003e4.0.0\u003c/modelVersion\u003e\n    \n      \u003cgroupId\u003emygroup\u003c/groupId\u003e\n      \u003cartifactId\u003emyartifact\u003c/artifactId\u003e\n      \u003cversion\u003e1.0-SNAPSHOT\u003c/version\u003e\n      \u003cpackaging\u003ejar\u003c/packaging\u003e\n\n      \u003cname\u003emyartifact\u003c/name\u003e\n      \u003curl\u003ehttp://maven.apache.org\u003c/url\u003e\n    
\n      \u003cproperties\u003e\n        \u003cproject.build.sourceEncoding\u003eUTF-8\u003c/project.build.sourceEncoding\u003e\n      \u003c/properties\u003e\n    \n      \u003cdependencies\u003e\n        \u003cdependency\u003e\n          \u003cgroupId\u003ejunit\u003c/groupId\u003e\n          \u003cartifactId\u003ejunit\u003c/artifactId\u003e\n          \u003cversion\u003e3.8.1\u003c/version\u003e\n          \u003cscope\u003etest\u003c/scope\u003e\n        \u003c/dependency\u003e\n        \u003cdependency\u003e\n          \u003cgroupId\u003eedu.washington.cs.knowitall\u003c/groupId\u003e\n          \u003cartifactId\u003ereverb-core\u003c/artifactId\u003e\n          \u003cversion\u003e1.4.1\u003c/version\u003e\n        \u003c/dependency\u003e\n      \u003c/dependencies\u003e\n    \u003c/project\u003e\n\nYou should be able to include ReVerb in your code now. You can try this out by\nincluding `import edu.washington.cs.knowitall.extractor.ReVerbExtractor` in \nyour program.\n\n## Retraining the Confidence Function\nReVerb includes a class for training new confidence functions, given a list of \nlabeled examples, called `ReVerbClassifierTrainer`. Example code for training a \nnew confidence function `confFunction` is shown below - the non-trivial part is \nlikely to be converting your labeled data to an \n`Iterable\u003cLabeledBinaryExtraction\u003e`.\n\nExample Pseudocode:\n\n    // Provide your labeled data here\n    Iterable\u003cLabeledBinaryExtraction\u003e myLabeledData = ??? 
\n    ReVerbClassifierTrainer trainer = \n        new ReVerbClassifierTrainer(myLabeledData);\n    Logistic classifier = trainer.getClassifier();\n    ReVerbConfFunction confFunction = new ReVerbConfFunction(classifier);\n     // confFunction is ready to use here.\n    double conf = confFunction.getConf(extraction);\n\nIf you already have a list of binary labeled ReVerb extractions, it should be \neasy to convert them to `ChunkedBinaryExtraction` objects, and then to \n`LabeledBinaryExtraction` objects (see the constructors for these classes). \nAlso note that ReVerb includes a `LabeledBinaryExtractionReader` and `Writer` \nclass. You may wish to (re-)serialize your data using \n`LabeledBinaryExtractionWriter` - this will put it in the same format as all \nprevious data used to train ReVerb confidence functions, and it will be easy to \nread in the future with `LabeledBinaryExtractionReader`. \n\n\n## Help and Contact\nFor more information, please visit the ReVerb homepage at the University of \nWashington: \u003chttp://reverb.cs.washington.edu\u003e.\n\n## FAQ\n\n1.  How fast is ReVerb?\n\n    You should really benchmark ReVerb yourself, but on my computer (a new computer in 2011) ReVerb processed 5000 high-quality web sentences in 21 s, or 238 sentences per second, in a single thread.  
ReVerb is easily parallelizable by processing different sentences concurrently.\n\n## Contributors\n* Anthony Fader \u003chttp://www.cs.washington.edu/homes/afader\u003e\n* Michael Schmitz \u003chttp://www.schmitztech.com/\u003e\n* Robert Bart (rbart at cs.washington.edu)\n* Janara Christensen \u003chttp://www.cs.washington.edu/homes/janara\u003e\n* Niranjan Balasubramanian \u003chttp://www.cs.washington.edu/homes/niranjan\u003e\n* Jonathan Berant \u003chttp://www.cs.tau.ac.il/~jonatha6\u003e\n\n## Citing ReVerb\nIf you use ReVerb in your academic work, please cite ReVerb with the following \nBibTeX citation:\n\n    @inproceedings{ReVerb2011,\n      author =   {Anthony Fader and Stephen Soderland and Oren Etzioni},\n      title =    {Identifying Relations for Open Information Extraction},\n      booktitle =    {Proceedings of the Conference of Empirical Methods\n                      in Natural Language Processing ({EMNLP} '11)},\n      year =     {2011},\n      month =    {July 27-31},\n      address =  {Edinburgh, Scotland, UK}\n    }\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fknowitall%2Freverb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fknowitall%2Freverb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fknowitall%2Freverb/lists"}