Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/utgwkk/twitter-text
Perl implementation of the twitter-text parsing library
https://github.com/utgwkk/twitter-text
perl twitter twitter-text
Last synced: about 2 months ago
JSON representation
Perl implementation of the twitter-text parsing library
- Host: GitHub
- URL: https://github.com/utgwkk/twitter-text
- Owner: utgwkk
- License: other
- Created: 2020-10-21T11:27:16.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2022-06-01T12:07:59.000Z (over 2 years ago)
- Last Synced: 2024-10-21T18:54:03.572Z (3 months ago)
- Topics: perl, twitter, twitter-text
- Language: Perl
- Homepage: https://metacpan.org/pod/Twitter::Text
- Size: 216 KB
- Stars: 1
- Watchers: 4
- Forks: 0
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- Changelog: Changes
- License: LICENSE
Awesome Lists containing this project
README
[![Actions Status](https://github.com/utgwkk/Twitter-Text/workflows/CI/badge.svg)](https://github.com/utgwkk/Twitter-Text/actions)
# NAMETwitter::Text - Perl implementation of the twitter-text parsing library
# SYNOPSIS
use Twitter::Text;
$result = parse_tweet('Hello world こんにちは世界');
print $result->{valid} ? 'valid tweet' : 'invalid tweet';# DESCRIPTION
Twitter::Text is a Perl implementation of the twitter-text parsing library.
## WARNING
This library does not implement auto-linking and hit highlighting.
Please refer [Implementation status](https://github.com/utgwkk/Twitter-Text/issues/5) for latest status.
# FUNCTIONS
All functions below are exported by default.
## Extraction
### extract\_hashtags
$hashtags = extract_hashtags($text);
Returns an array reference of extracted hashtag string from `$text`.
### extract\_hashtags\_with\_indices
$hashtags_with_indices = extract_hashtags_with_indices($text, [\%options]);
Returns an array reference of hash reference of extracted hashtag from `$text`.
Each hash reference consists of `hashtag` (hashtag string) and `indices` (range of hashtag).
### extract\_mentioned\_screen\_names
$screen_names = extract_mentioned_screen_names($text);
Returns an array reference of exctacted screen name string from `$text`.
### extract\_mentioned\_screen\_names\_with\_indices
$screen_names_with_indices = extract_mentioned_screen_names_with_indices($text);
Returns an array reference of hash reference of extracted screen name or list from `$text`.
Each hash reference consists of `screen_name` (screen name string) and `indices` (range of screen name).
### extract\_mentions\_or\_lists\_with\_indices
$mentions_or_lists_with_indices = extract_mentions_or_lists_with_indices($text);
Returns an array reference of hash reference of extracted screen name from `$text`.
Each hash reference consists of `screen_name` (screen name string) and `indices` (range of screen name or list). If it is a list, the hash reference also contains `list_slug` item.
### extract\_urls
$urls = extract_urls($text);
Returns an array reference of extracted URL string from `$text`.
### extract\_urls\_with\_indices
$urls = extract_urls_with_indices($text, [\%options]);
Returns an array reference of hash reference of extracted URL from `$text`.
Each hash reference consists of `url` (URL string) and `indices` (range of screen name).
## Validation
### parse\_tweet
$parse_result = parse_tweet($text, [\%options]);
The `parse_tweet` function takes a `$text` string and optional `\%options` parameter and returns a hash reference with following values:
- `weighted_length`
The overall length of the tweet with code points weighted per the ranges defined in the configuration file.
- `permillage`
Indicates the proportion (per thousand) of the weighted length in comparison to the max weighted length. A value > 1000 indicates input text that is longer than the allowable maximum.
- `valid`
Indicates if input text length corresponds to a valid result.
- `display_range_start`, `display_range_end`
An array of two unicode code point indices identifying the inclusive start and exclusive end of the displayable content of the Tweet.
- `valid_range_start`, `valid_range_end`
An array of two unicode code point indices identifying the inclusive start and exclusive end of the valid content of the Tweet.
#### EXAMPLES
use Data::Dumper;
use Twitter::Text;$result = parse_tweet('Hello world こんにちは世界');
print Dumper($result);
# $VAR1 = {
# 'weighted_length' => 33
# 'permillage' => 117,
# 'valid' => 1,
# 'display_range_start' => 0,
# 'display_range_end' => 32,
# 'valid_range_start' => 0,
# 'valid_range_end' => 32,
# };### is\_valid\_hashtag
$valid = is_valid_hashtag($hashtag);
Validate `$hashtag` is a valid hashtag and returns a boolean value that indicates if given argument is valid.
### is\_valid\_list
$valid = is_valid_list($username_list);
Validate `$username_list` is a valid @username/list and returns a boolean value that indicates if given argument corresponds to a valid result.
### is\_valid\_url
$valid = is_valid_url($url, [unicode_domains => 1, require_protocol => 1]);
Validate `$url` is a valid URL and returns a boolean value that indicates if given argument is valid.
If `unicode_domains` argument is a truthy value, validate `$url` is a valid URL with Unicode characters. (default: true)
If `require_protocol` argument is a truthy value, validation requires a protocol of URL. (default: true)
### is\_valid\_username
$valid = is_valid_username($username);
Validate `$username` is a valid username for Twitter and returns a boolean value that indicates if given argument is valid.
# SEE ALSO
[twitter-text](https://github.com/twitter/twitter-text). Implementation of Twitter::Text (this library) is heavily based on [Ruby implementation of twitter-text](https://github.com/twitter/twitter-text/tree/master/rb).
[https://developer.twitter.com/en/docs/counting-characters](https://developer.twitter.com/en/docs/counting-characters)
# COPYRIGHT & LICENSE
Copyright (C) Twitter, Inc and other contributors
Copyright (C) utgwkk.
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.# AUTHOR
utgwkk