https://github.com/kafene/netscape-bookmark-parser
a php script (function) to parse netscape format bookmark files
- Host: GitHub
- URL: https://github.com/kafene/netscape-bookmark-parser
- Owner: kafene
- License: mit
- Created: 2013-01-15T22:37:22.000Z (about 12 years ago)
- Default Branch: master
- Last Pushed: 2017-08-02T07:59:32.000Z (over 7 years ago)
- Last Synced: 2024-11-25T19:40:02.507Z (2 months ago)
- Language: PHP
- Homepage: https://packagist.org/packages/kafene/netscape-bookmark-parser
- Size: 80.1 KB
- Stars: 43
- Watchers: 5
- Forks: 23
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- project-awesome - kafene/netscape-bookmark-parser - a php script (function) to parse netscape format bookmark files (PHP)
README
# netscape-bookmark-parser
[![license](https://img.shields.io/github/license/kafene/netscape-bookmark-parser.svg?style=flat-square)](https://opensource.org/licenses/MIT)[![](https://img.shields.io/packagist/v/kafene/netscape-bookmark-parser.svg?style=plastic)](https://packagist.org/packages/kafene/netscape-bookmark-parser)
## About
This library provides a generic `NetscapeBookmarkParser` class that is capable
of parsing Netscape bookmark export files.

The motivations behind developing this parser are the following:
- the [Netscape format](https://msdn.microsoft.com/en-us/library/aa753582%28v=vs.85%29.aspx)
has a very loose specification:
no [DTD](https://en.wikipedia.org/wiki/Document_type_definition)
nor [XSL stylesheet](https://en.wikipedia.org/wiki/XSL)
to constrain how data is formatted
- software and web services export bookmarks using a wild variety of attribute
names and values
- using standard SAX or DOM parsers is thus not straightforward.

How it works:
- the input bookmark file is trimmed and sanitized to improve parsing results
- the resulting data is then parsed using [PCRE](http://www.pcre.org/) patterns
  to match attributes and values corresponding to the most likely:
  - attribute names: `description` vs. `note`, `tags` vs. `labels`, `date` vs. `time`, etc.
  - data formats: `comma,separated,tags` vs. `space separated labels`,
    UNIX epochs vs. human-readable dates, newlines & carriage returns, etc.
- an associative array containing all successfully parsed links with their
  attributes is returned

## Example
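Before the library example, here is a rough illustration of the matching steps listed under "How it works". It is a minimal sketch, not the library's actual implementation: the function name `sketchParseAnchor`, the alias lists, and the `url` alias are all assumptions made for illustration. It shows PCRE patterns picking attributes out of one `<A ...>` line, resolving alias names such as `tags` vs. `labels`, and normalizing tag separators and date formats.

```php
<?php
// Hypothetical helper (not part of the library): extracts attributes from one
// <A ...> tag and maps alias attribute names onto canonical keys.
function sketchParseAnchor(string $line): ?array
{
    // Capture the attribute string and the link text of an anchor tag.
    if (!preg_match('/<a\s+([^>]*)>(.*?)<\/a>/i', $line, $m)) {
        return null;
    }

    // Collect name="value" pairs, lowercasing the attribute names.
    preg_match_all('/([a-z_]+)\s*=\s*"([^"]*)"/i', $m[1], $pairs, PREG_SET_ORDER);
    $attrs = [];
    foreach ($pairs as $pair) {
        $attrs[strtolower($pair[1])] = $pair[2];
    }

    // Assumed alias resolution: the first alias that is present wins.
    $first = function (array $names) use ($attrs): string {
        foreach ($names as $name) {
            if (isset($attrs[$name])) {
                return $attrs[$name];
            }
        }
        return '';
    };

    // Tags may be comma- or space-separated; normalize to single spaces.
    $tags = preg_split('/[\s,]+/', $first(['tags', 'labels']), -1, PREG_SPLIT_NO_EMPTY);

    // Dates may be UNIX epochs or human-readable strings; unparsable values become 0.
    $rawDate = $first(['add_date', 'date', 'time']);
    $time = ctype_digit($rawDate) ? (int) $rawDate : (int) strtotime($rawDate);

    return [
        'uri'   => $first(['href', 'url']),
        'title' => html_entity_decode(trim($m[2])),
        'tags'  => implode(' ', $tags),
        'time'  => $time,
    ];
}

$line = '<DT><A HREF="http://public.tld" ADD_DATE="1456433748" LABELS="public,hello,world">Public stuff</A>';
print_r(sketchParseAnchor($line));
```

Resolving aliases through an ordered list keeps the pattern itself simple: supporting another exporter's attribute name only means adding one entry to the relevant list.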
Script:
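```php
// Assumes the NetscapeBookmarkParser class is already loaded (e.g. via Composer's autoloader).
$parser = new NetscapeBookmarkParser();
$bookmarks = $parser->parseFile('./tests/input/netscape_basic.htm');
var_dump($bookmarks);
```

Output: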
```
array(2) {
  [0] =>
  array(6) {
    'tags' =>
    string(14) "private secret"
    'uri' =>
    string(19) "https://private.tld"
    'title' =>
    string(12) "Secret stuff"
    'note' =>
    string(52) "Super-secret stuff you're not supposed to know about"
    'time' =>
    int(971175336)
    'pub' =>
    int(0)
  }
  [1] =>
  array(6) {
    'tags' =>
    string(18) "public hello world"
    'uri' =>
    string(17) "http://public.tld"
    'title' =>
    string(12) "Public stuff"
    'note' =>
    string(0) ""
    'time' =>
    int(1456433748)
    'pub' =>
    int(1)
  }
}
```
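
The returned array can then be processed with ordinary PHP array functions. The following is a small sketch that works on a hand-copied version of the output above rather than on a live `parseFile()` call. Treating `pub` as a public/private flag (`1` meaning public) is inferred from the sample data, not from documented behaviour.

```php
<?php
// Sample data copied from the output above; in practice this would be the
// array returned by parseFile().
$bookmarks = [
    [
        'tags' => 'private secret', 'uri' => 'https://private.tld',
        'title' => 'Secret stuff',
        'note' => "Super-secret stuff you're not supposed to know about",
        'time' => 971175336, 'pub' => 0,
    ],
    [
        'tags' => 'public hello world', 'uri' => 'http://public.tld',
        'title' => 'Public stuff', 'note' => '',
        'time' => 1456433748, 'pub' => 1,
    ],
];

// Keep only the bookmarks flagged as public (assumption: pub === 1).
$public = array_values(array_filter($bookmarks, function (array $b): bool {
    return $b['pub'] === 1;
}));

foreach ($public as $b) {
    // Tags are a single space-separated string; split them into a list.
    $tags = $b['tags'] === '' ? [] : explode(' ', $b['tags']);
    // 'time' is a UNIX epoch; format it as a UTC date.
    $date = gmdate('Y-m-d', $b['time']);
    printf("%s <%s> [%s] %s\n", $date, $b['uri'], implode(', ', $tags), $b['title']);
}
```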