https://github.com/zverok/linkhum
URL auto-linker with reasonable and humane behavior
- Host: GitHub
- URL: https://github.com/zverok/linkhum
- Owner: zverok
- License: mit
- Created: 2015-06-26T12:31:51.000Z (about 11 years ago)
- Default Branch: master
- Last Pushed: 2020-12-24T17:57:49.000Z (over 5 years ago)
- Last Synced: 2025-03-29T05:51:18.124Z (over 1 year ago)
- Language: Ruby
- Size: 31.3 KB
- Stars: 25
- Watchers: 4
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.txt
# LinkHum
**LinkHum** (aka "Links Humana") is a URL auto-linker for user-entered texts.
It tries hard to do the most reasonable thing even in complex cases.
It will be useful for sites with plain-text user input.
Features:
* auto-links URLs;
* very accurate detection of punctuation inside and outside of URLs;
* extensive test set for complex (yet real-life) texts with URLs;
* customizable behavior.
**NB**: the original algorithm was written by [@squadette](https://github.com/squadette),
and the test cases were provided by users of [Mokum](https://mokum.place).
This gem just packages that work (on behalf of the original author).
## Install
```
[sudo] gem install linkhum
```
Or in your Gemfile:
```ruby
gem 'linkhum'
```
And then:
```
bundle install
```
## Usage
As simple as:
```ruby
LinkHum.urlify("Please look at http://github.com/zverok/linkhum, it's awesome!")
# => "Please look at <a href='http://github.com/zverok/linkhum'>http://github.com/zverok/linkhum</a>, it's awesome!"
```
## Showcase
```ruby
# Doesn't touch punctuation outside the URL:
LinkHum.urlify('http://slashdot.org, or http://lwn.net? They say, "just http://google.com"')
# => "<a href='http://slashdot.org'>http://slashdot.org</a>, or <a href='http://lwn.net'>http://lwn.net</a>? They say, \"just <a href='http://google.com'>http://google.com</a>\""
# But processes it inside:
LinkHum.urlify('Watch this: https://www.youtube.com/watch?v=Q9Dv4Hmf_O8')
# => "Watch this: <a href='https://www.youtube.com/watch?v=Q9Dv4Hmf_O8'>https://www.youtube.com/watch?v=Q9Dv4Hmf_O8</a>"
# Understands parentheses:
LinkHum.urlify("It's a movie: https://en.wikipedia.org/wiki/Hours_(2013_film) It's just parens: (https://www.youtube.com/watch?v=Q9Dv4Hmf_O8)")
# => "It's a movie: <a href='https://en.wikipedia.org/wiki/Hours_(2013_film)'>https://en.wikipedia.org/wiki/Hours_(2013_film)</a> It's just parens: (<a href='https://www.youtube.com/watch?v=Q9Dv4Hmf_O8'>https://www.youtube.com/watch?v=Q9Dv4Hmf_O8</a>)"
# URL shortening:
LinkHum.urlify("It's too long: http://www.booking.com/searchresults.ru.html?sid=28c7356c8d0fb6d81de3a45eff97e0fe;dcid=4;bb_asr=2&class_interval=1&csflt=%7B%7D&dest_id=-2167973&dest_type=city&group_adults=2&group_children=0&idf=1&label_click=undef&no_rooms=1&offset=0&review_score_group=empty&score_min=0&si=ai%2Cco%2Cci%2Cre%2Cdi&src=index&ss=Lisbon%2C%20Lisbon%20Region%2C%20Portugal&ss_raw=Lisbon&ssb=empty")
# => "It's too long: http://www.booking.com/searchresults.ru.html?sid=28c7356c8d0f..."
# It's customizable:
LinkHum.urlify(
"It's too long: http://www.booking.com/searchresults.ru.html?sid=28c7356c8d0fb6d81de3a45eff97e0fe;dcid=4;bb_asr=2&class_interval=1&csflt=%7B%7D&dest_id=-2167973&dest_type=city&group_adults=2&group_children=0&idf=1&label_click=undef&no_rooms=1&offset=0&review_score_group=empty&score_min=0&si=ai%2Cco%2Cci%2Cre%2Cdi&src=index&ss=Lisbon%2C%20Lisbon%20Region%2C%20Portugal&ss_raw=Lisbon&ssb=empty",
max_length: 20)
# =>
# International domains and Non-ASCII paths:
LinkHum.urlify("Domain: http://www.詹姆斯.com/, and path: https://ru.wikipedia.org/wiki/Эффект_Даннинга_—_Крюгера")
# => "Domain: http://www.詹姆斯.com/, and path: https://ru.wikipedia.org/wiki/Эффект_Даннинга_—_Крюгера"
# Look, ma, no XSS!
LinkHum.urlify('http://example.com/foo?">here.window.alert("wow");')
# => "http://example.com/foo?\">here.window.alert(\"wow\")...</a>"
```
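To make the punctuation and parentheses cases above more concrete, here is a minimal sketch of one way such trimming could work. It is not LinkHum's actual algorithm: the method name `split_trailing_punctuation` and the exact punctuation set are assumptions made for illustration only.

```ruby
# Hypothetical helper, NOT part of LinkHum: splits a URL candidate into
# [url, trailing], where `trailing` is punctuation that most likely belongs
# to the surrounding sentence rather than to the URL itself.
def split_trailing_punctuation(candidate)
  url = candidate.dup
  trailing = +''
  loop do
    last = url[-1]
    break if last.nil?

    if '.,;:!?"\''.include?(last)
      # Sentence punctuation at the very end is treated as outside the URL.
      trailing.prepend(url.slice!(-1))
    elsif last == ')' && url.count(')') > url.count('(')
      # An unbalanced closing paren was probably opened before the URL.
      trailing.prepend(url.slice!(-1))
    else
      break
    end
  end
  [url, trailing]
end

p split_trailing_punctuation('http://slashdot.org,')
p split_trailing_punctuation('https://en.wikipedia.org/wiki/Hours_(2013_film)')
p split_trailing_punctuation('https://www.youtube.com/watch?v=Q9Dv4Hmf_O8)')
```

The balanced-parentheses check is what keeps `Hours_(2013_film)` intact while still dropping a closing paren that was opened in the surrounding text.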
## Customization
### On the fly
Custom URL params:
```ruby
LinkHum.urlify("http://oursite.com/posts/12345 has been mentioned at http://cnn.com") do |uri|
  uri.host == 'oursite.com' ? {} : {target: '_blank'}
end
# => "<a href='http://oursite.com/posts/12345'>http://oursite.com/posts/12345</a> has been mentioned at <a href='http://cnn.com' target='_blank'>http://cnn.com</a>"
```
The block receives an instance of `Addressable::URI` and should
return a hash of additional link attributes. You can use it to open
foreign links in a new tab, to style them differently (Wikipedia-style),
or to add special icons for links to YouTube, Wikipedia and Google.
It's up to you.
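As a rough illustration of the logic such a block might hold, here is a sketch of a helper that picks attributes by host. The name `attrs_for`, the `rel` and `class` values, and the `own_host` parameter are all assumptions. The sketch uses the standard library's `URI` instead of `Addressable::URI` only so that it runs without extra gems; it relies on nothing but `host`.

```ruby
require 'uri'

# Hypothetical helper: returns extra link attributes based on the host.
# Links to our own site get no extras; everything else opens in a new tab.
def attrs_for(uri, own_host: 'oursite.com')
  return {} if uri.host == own_host

  attrs = {target: '_blank', rel: 'noopener'}
  # Assumed styling hook for Wikipedia links.
  attrs[:class] = 'wikipedia' if uri.host.to_s.end_with?('wikipedia.org')
  attrs
end

p attrs_for(URI.parse('http://oursite.com/posts/12345'))
p attrs_for(URI.parse('http://cnn.com'))
p attrs_for(URI.parse('https://en.wikipedia.org/wiki/Ruby'))

# With LinkHum loaded, it could presumably be wired in as:
#   LinkHum.urlify(text) { |uri| attrs_for(uri) }
```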
### Define your own LinkHum
```ruby
class MyLinks < LinkHum
  def link_attrs(uri)
    {target: '_blank'} unless uri.host == 'oursite.com'
  end
end

MyLinks.urlify("http://oursite.com/posts/12345 has been mentioned at http://cnn.com")
# => "<a href='http://oursite.com/posts/12345'>http://oursite.com/posts/12345</a> has been mentioned at <a href='http://cnn.com' target='_blank'>http://cnn.com</a>"
```
You can also define special strings that should also become URLs on your
site:
```ruby
class MyLinks < LinkHum
  special /@(\S+)\b/ do |username|
    "http://oursite/users/#{username}"
  end
end

MyLinks.urlify("Hey, @jude!")
# => "Hey, <a href='http://oursite/users/jude'>@jude</a>!"

# nil or false means no replacements:
class MyLinksConditional < LinkHum
  special /@(\S+)\b/ do |username|
    "http://oursite/users/#{username}" if User.where(name: username).exists?
  end
end

MyLinksConditional.urlify("So, our @dude and @unknownguy walk into a bar...")
# => "So, our <a href='http://oursite/users/dude'>@dude</a> and @unknownguy walk into a bar..."
```
Some `special` gotchas:
* as of version 0.0.2, you can define any number of `special`s, but it's
entirely up to you to keep their patterns non-conflicting and clearly distinguishable;
* values are passed to the block by the same logic as `String#scan`: the whole
match when the pattern has no groups, otherwise the captured groups:
```ruby
class AllSymbols < LinkHum
  special /@\S+\b/ do |username|
    p username
    nil
  end
end

AllSymbols.urlify('@dude')
# Receives "@dude"

class SelectedPart < LinkHum
  special /@(\S+)\b/ do |username|
    p username
    nil
  end
end

SelectedPart.urlify('@dude')
# Receives "dude"

class SeveralArgs < LinkHum
  special(/@(\S+)_(\S+)\b/) do |first, second|
    p first, second
    nil
  end
end

SeveralArgs.urlify('@cool_dude')
# Receives "cool", "dude"
```
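The `String#scan` behavior referred to above can be checked in plain Ruby, without LinkHum. The helper `block_args` below is hypothetical; it only normalizes what `scan` returns into the list of arguments a block would get for each match.

```ruby
# Hypothetical helper: for each match, returns the arguments a block would
# receive if values are passed the way String#scan reports them.
def block_args(text, pattern)
  # Without groups, scan yields the whole match as a String;
  # with groups, it yields an Array of the captured groups.
  text.scan(pattern).map { |match| Array(match) }
end

p block_args('@dude', /@\S+\b/)                 # => [["@dude"]]
p block_args('@dude', /@(\S+)\b/)               # => [["dude"]]
p block_args('@cool_dude', /@(\S+)_(\S+)\b/)    # => [["cool", "dude"]]
```

Wrapping part of a pattern in parentheses therefore changes what the block receives, which is easy to overlook when a `special` pattern is edited later.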
### "Parse only" mode
If your requirements for constructing the resulting string are more complicated
than the default LinkHum behavior allows, you can use its `parse` method to split
the string into tokens and process them yourself. All the URL-detection
goodness and `special`s will still be available:
```ruby
class MyParser < LinkHum
  # You don't need rendering blocks for your specials.
  # The second argument is the special's name; it is optional.
  special /@(\S+)\b/, :username
  special /\#(\S+)\b/, :tag
end

MyParser.parse("Here is @dude. He is #cute. Is he on http://facebook.com?")
# => [
# {type: :text , content: 'Here is '},
# {type: :username, content: '@dude', captures: ['dude']},
# {type: :text , content: '. He is '},
# {type: :tag , content: '#cute', captures: ['cute']},
# {type: :text , content: '. Is he on '},
# {type: :url , content: 'http://facebook.com'},
# {type: :text , content: '?'}
# ]
```
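Once you have tokens in this shape, rendering them yourself is a matter of mapping each token type to markup. The following is a minimal sketch that works on a literal copy of the token array shown above, so it runs without LinkHum. The names `render_token` and `render_tokens`, the `/users/` and `/tags/` paths, and the CSS classes are all assumptions.

```ruby
require 'erb'
require 'uri'

# Literal copy of the token list from the example above.
TOKENS = [
  {type: :text,     content: 'Here is '},
  {type: :username, content: '@dude', captures: ['dude']},
  {type: :text,     content: '. He is '},
  {type: :tag,      content: '#cute', captures: ['cute']},
  {type: :text,     content: '. Is he on '},
  {type: :url,      content: 'http://facebook.com'},
  {type: :text,     content: '?'}
].freeze

# Hypothetical renderer for a single token; all user text is HTML-escaped.
def render_token(token)
  text = ERB::Util.html_escape(token[:content])
  case token[:type]
  when :username
    slug = URI.encode_www_form_component(token[:captures].first)
    "<a class='user' href='/users/#{slug}'>#{text}</a>"
  when :tag
    slug = URI.encode_www_form_component(token[:captures].first)
    "<a class='tag' href='/tags/#{slug}'>#{text}</a>"
  when :url
    "<a href='#{text}' rel='nofollow'>#{text}</a>"
  else
    text
  end
end

def render_tokens(tokens)
  tokens.map { |token| render_token(token) }.join
end

puts render_tokens(TOKENS)
```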
## Credits
* [@squadette](https://github.com/squadette) -- author of the original code;
* users of [Mokum](https://mokum.place) -- testing and advice (you can
now see LinkHum at work on Mokum);
* [@zverok](https://github.com/zverok) -- gemifying, documenting and
writing specs.
## Contributing
Just the usual fork, change, pull request process.
### Development
* Don't forget to run `rspec` after making any changes (and add specs for
them, of course!)
* It's preferable to use `bundle exec dokaz` to check that the README is
written correctly, and `bundle exec dokaz -fshow` to see exactly what the
code from the README outputs.
## License
MIT