{"id":15286161,"url":"https://github.com/ribugent/perl-apache-tika","last_synced_at":"2025-03-23T21:22:01.659Z","repository":{"id":56836728,"uuid":"43396860","full_name":"ribugent/perl-apache-tika","owner":"ribugent","description":"A Perl interface to the Apache Tika api","archived":false,"fork":false,"pushed_at":"2017-05-29T21:24:41.000Z","size":23,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-29T04:42:47.366Z","etag":null,"topics":["perl","perl-module","tika","tika-api"],"latest_commit_sha":null,"homepage":"","language":"Perl","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ribugent.png","metadata":{"files":{"readme":"Readme.md","changelog":"Changes","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-09-29T21:57:33.000Z","updated_at":"2017-05-29T21:23:54.000Z","dependencies_parsed_at":"2022-09-07T14:20:53.856Z","dependency_job_id":null,"html_url":"https://github.com/ribugent/perl-apache-tika","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ribugent%2Fperl-apache-tika","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ribugent%2Fperl-apache-tika/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ribugent%2Fperl-apache-tika/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ribugent%2Fperl-apache-tika/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ribugent","download_url":"https://codeload.github.com/ribugent/perl-apache-tika/tar.gz/refs/heads/master","hos
t":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245169889,"owners_count":20571973,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["perl","perl-module","tika","tika-api"],"created_at":"2024-09-30T15:10:48.801Z","updated_at":"2025-03-23T21:22:01.633Z","avatar_url":"https://github.com/ribugent.png","language":"Perl","funding_links":[],"categories":[],"sub_categories":[],"readme":"# NAME\n\nApache::Tika - A Perl interface to the Apache Tika API\n\n# SYNOPSIS\n\n    use Apache::Tika;\n    use LWP::UserAgent;\n\n    my $tika = Apache::Tika-\u003enew();\n\n    # Extract metadata and text from a PDF file\n    open my $fh, '\u003c:raw', '/local/file.pdf' or die $!;\n    my $pdf = do { local $/; \u003c$fh\u003e };\n    close $fh;\n\n    my $meta = $tika-\u003emeta($pdf);\n    my $text = $tika-\u003etika($pdf);\n\n    # Extract text from a website\n    my $response = LWP::UserAgent-\u003enew-\u003eget('http://some.web.site');\n    $text = $tika-\u003etika(\n     $response-\u003edecoded_content('charset' =\u003e 'none'),\n     $response-\u003eheaders-\u003eheader('content-type')\n    );\n\n# DESCRIPTION\n\nThis module provides support for the Apache Tika API.\n\n# CONSTRUCTOR\n\n- Apache::Tika-\u003enew(%options)\n\n    Constructs an `Apache::Tika` object. 
You can specify the following options:\n\n    - url\n\n        Apache Tika server URL (defaults to http://localhost:9998)\n\n    - ua\n\n        Custom user agent\n\n# METHODS\n\nThe following API methods are available. For more information about method responses, see [http://wiki.apache.org/tika/TikaJAXRS](http://wiki.apache.org/tika/TikaJAXRS)\n\n- $tika-\u003emeta($bytes, $contentType)\n- $tika-\u003ermeta($bytes, $contentType, $format)\n- $tika-\u003etika($bytes, $contentType)\n- $tika-\u003edetect\\_stream($bytes)\n- $tika-\u003elanguage\\_stream($bytes)\n\nThe $bytes parameter is always required and must contain the data to send to the server.\nThe $contentType parameter is optional, but if you know the content type of $bytes (e.g. \"text/html; charset=iso-8\"), you can pass it to improve the Tika response.\n\n# SEE ALSO\n\n[Apache Tika](http://wiki.apache.org/tika/TikaJAXRS)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fribugent%2Fperl-apache-tika","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fribugent%2Fperl-apache-tika","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fribugent%2Fperl-apache-tika/lists"}