https://github.com/goodbytes/linkdetector
A Java utility that lets you detect where in a text links begin and end.
- Host: GitHub
- URL: https://github.com/goodbytes/linkdetector
- Owner: goodbytes
- License: apache-2.0
- Created: 2025-04-02T09:38:35.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-02T17:49:04.000Z (about 1 year ago)
- Last Synced: 2025-04-20T10:13:39.277Z (about 1 year ago)
- Language: Java
- Size: 25.4 KB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 4
Metadata Files:
- Readme: readme.md
- License: LICENSE
# LinkDetector
A Java utility that lets you detect where in plain text a hyperlink begins and ends.
Correctly identifying where a link starts and ends in text can be tricky, especially when the link
_includes_ or is _surrounded by_ parentheses, is followed by a comma, and so on. Naive detection then yields results such as:
- `https://example.org/foo,` instead of `https://example.org/foo` (from a text like `... on https://example.org/foo, where ...`).
- `https://example.org/foo)` instead of `https://example.org/foo` (from a text like `... the webpage (https://example.org/foo) where ...` ).
These often lead to browsers being opened to invalid URLs, causing end-users to see 404 pages or other errors.
This project simplifies parsing the correct start and end of links in text, which helps avoid such issues.
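To make the problem concrete, here is a minimal sketch of the kind of naive matching that produces the faulty results listed above. The class name `NaiveLinkMatcher`, the method `findNaive` and the pattern itself are illustrative assumptions; they are not part of this project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class, not part of LinkDetector.
public class NaiveLinkMatcher
{
    // Deliberately naive: a scheme followed by every non-whitespace character.
    private static final Pattern NAIVE = Pattern.compile("https?://\\S+");

    /** Returns every substring that the naive pattern considers to be a link. */
    public static List<String> findNaive(final String text)
    {
        final List<String> matches = new ArrayList<>();
        final Matcher matcher = NAIVE.matcher(text);
        while (matcher.find())
        {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(final String[] args)
    {
        // The trailing comma is captured as part of the link.
        System.out.println(findNaive("... on https://example.org/foo, where ..."));
        // prints [https://example.org/foo,]

        // The closing parenthesis is captured as part of the link.
        System.out.println(findNaive("... the webpage (https://example.org/foo) where ..."));
        // prints [https://example.org/foo)]
    }
}
```

Neither result is the link the author of the text intended, which is the situation this utility is meant to avoid.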
## Usage
The distributable is available from the Maven Central repository. To use it, declare this project as a dependency of your own project, like so:
```xml
<dependency>
    <groupId>nl.goodbytes.util</groupId>
    <artifactId>linkdetector</artifactId>
    <version>1.0.0</version>
</dependency>
```
To use the utility in your code, invoke the `parse` method of the `LinkDetector` class, as shown below. This splits
the text into fragments, which are returned in a list. Each fragment provides a start and end index, and indicates
whether or not it represents a link.
```java
final String input = "Please find more information in the corresponding page on "
+ "Wikipedia (https://en.wikipedia.org/wiki/Ambiguity_(disambiguation)). Let me "
+ "know if you have questions!";
final List<Fragment> fragments = LinkDetector.parse(input);
for (final Fragment fragment : fragments)
{
    System.out.println("Fragment starting at index " + fragment.startIndex()
        + ", ending at index " + fragment.endIndex() + " (exclusive) "
        + (fragment.isLink() ? "is" : "is not") + " a link:");
    System.out.println("\t" + fragment);
    System.out.println();
}
```
The example code above generates the following output:
```
Fragment starting at index 0, ending at index 69 (exclusive) is not a link:
	Please find more information in the corresponding page on Wikipedia (

Fragment starting at index 69, ending at index 125 (exclusive) is a link:
	https://en.wikipedia.org/wiki/Ambiguity_(disambiguation)

Fragment starting at index 125, ending at index 162 (exclusive) is not a link:
	). Let me know if you have questions!
```
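The output above labels each end index as exclusive, which matches the half-open convention of `String.substring`. As a small sketch of what that means in practice, the reported index pairs can be used directly to cut the original text. The class `FragmentIndexDemo` and its `slice` helper are hypothetical names used only for illustration, and the index values are copied from the output above rather than computed by the library.

```java
// Hypothetical example class, not part of LinkDetector.
public class FragmentIndexDemo
{
    // The same input text as in the usage example.
    public static final String INPUT = "Please find more information in the corresponding page on "
        + "Wikipedia (https://en.wikipedia.org/wiki/Ambiguity_(disambiguation)). Let me "
        + "know if you have questions!";

    /** Extracts the text between a start index (inclusive) and an end index (exclusive). */
    public static String slice(final String input, final int startIndex, final int endIndex)
    {
        return input.substring(startIndex, endIndex);
    }

    public static void main(final String[] args)
    {
        // Index values copied from the example output.
        System.out.println(slice(INPUT, 69, 125));
        // prints https://en.wikipedia.org/wiki/Ambiguity_(disambiguation)
    }
}
```

Because each fragment starts where the previous one ends, concatenating the slices in order reproduces the original text.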
## Build / Compilation
This project should be compatible with any version of Java that is not _ancient_. It _should_ be compatible with
Java 1.4, but to circumvent some issues with modern build tooling, its project descriptor defines 1.8.
The project can be built using standard [Maven](https://maven.apache.org/) invocations, like this:
```bash
mvn clean package
```
The project does not use any external dependencies (although for testing, the [JUnit](https://junit.org/) library is
added to the test scope of the build process).
## Attribution
This is but a simple Java wrapper around a regular expression that was provided by [Wiktor Kwapisiewicz](https://metacode.biz/@wiktor).
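The regular expression itself is not reproduced here. Purely to illustrate the kind of decision such a pattern has to make, the following is a hand-written heuristic that trims trailing punctuation and unbalanced closing parentheses from a candidate link. It is an assumption made for illustration, not the expression this project wraps, and the names `TrailingTrimSketch` and `trim` are hypothetical.

```java
// Hypothetical illustration; this is NOT the regular expression used by LinkDetector.
public class TrailingTrimSketch
{
    /** Drops trailing sentence punctuation and unbalanced ')' characters from a candidate link. */
    public static String trim(final String candidate)
    {
        int end = candidate.length();
        while (end > 0)
        {
            final char last = candidate.charAt(end - 1);
            if (".,;:!?".indexOf(last) >= 0)
            {
                end--; // Sentence punctuation directly after a link is rarely part of it.
            }
            else if (last == ')' && count(candidate, end, ')') > count(candidate, end, '('))
            {
                end--; // More ')' than '(': this one most likely belongs to the surrounding text.
            }
            else
            {
                break;
            }
        }
        return candidate.substring(0, end);
    }

    /** Counts how often a character occurs in the first 'end' characters of the text. */
    private static int count(final String text, final int end, final char wanted)
    {
        int total = 0;
        for (int i = 0; i < end; i++)
        {
            if (text.charAt(i) == wanted)
            {
                total++;
            }
        }
        return total;
    }

    public static void main(final String[] args)
    {
        System.out.println(trim("https://en.wikipedia.org/wiki/Ambiguity_(disambiguation))."));
        // prints https://en.wikipedia.org/wiki/Ambiguity_(disambiguation)
    }
}
```

A heuristic like this only handles the end of a candidate and will misjudge some inputs, which is one reason to rely on a tested utility instead.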