Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/asg017/sqlite-html
A SQLite extension for querying, manipulating, and creating HTML elements.
https://github.com/asg017/sqlite-html
sqlite sqlite-extension
Last synced: 5 days ago
JSON representation
A SQLite extension for querying, manipulating, and creating HTML elements.
- Host: GitHub
- URL: https://github.com/asg017/sqlite-html
- Owner: asg017
- License: mit
- Created: 2021-10-11T21:15:19.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2023-08-06T01:24:57.000Z (over 1 year ago)
- Last Synced: 2024-10-25T09:31:04.746Z (11 days ago)
- Topics: sqlite, sqlite-extension
- Language: Go
- Homepage:
- Size: 2.51 MB
- Stars: 374
- Watchers: 7
- Forks: 12
- Open Issues: 7
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-sqlite - `sqlite-html` - Querying and generating HTML elements (People / As Main Database)
README
# sqlite-html
A SQLite extension for querying, manipulating, and creating HTML elements.
- Extract HTML or text from HTML with CSS selectors, like `.querySelector()`, `.innerHTML`, and `.innerText`
- Generate a table of matching elements from a CSS selector, like `.querySelectorAll()`
- Safely create HTML elements in a query, like `.createElement()` and `.appendChild()``sqlite-html`'s API is modeled after the official [JSON1](https://www.sqlite.org/json1.html#jmini) SQLite extension.
This extension is written in Go, thanks to [riyaz-ali/sqlite](https://github.com/riyaz-ali/sqlite). While this library aims to be fast and efficient, it is overall slower than what a pure C SQLite extension could be, but in practice you may not notice much of a difference.
## Usage
```sql
.load ./html0
select html_extract('Anakin Skywalker
', 'b');
-- "Skywalker"
````sqlite-html` is similar to other HTML scraping tools like [BeautifulSoup](https://beautiful-soup-4.readthedocs.io/en/latest/) (Python) or [cheerio](https://cheerio.js.org/) (Node.js) or [nokogiri](https://nokogiri.org/) (Ruby). You can use CSS selectors to extract individual elements or groups of elements to query data from HTML sources.
For example, here we find all `href` links in an `index.html` file.
```sql
select
text as name,
html_attribute_get(anchors, 'a', 'href') as href
from html_each(readfile('index.html'), 'a') as anchors
```We can also safely generate HTML with `html_element`, modeled after React's [`React.createElement`](https://reactjs.org/docs/react-api.html#createelement).
```sql
select html_element('p', null,
'Luke, I am your',
html_element('b', null, 'father'),
'!',html_element('img', json_object(
'src', 'https://images.dog.ceo/breeds/groenendael/n02105056_4600.jpg',
'width', 200
))
);-- "
Luke, I am yourfather!
"
```## Documentation
See [`docs.md`](./docs.md) for a full API reference.
## Installing
| Language | Install | |
| -------------- | ------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Python | `pip install sqlite-html` | [![PyPI](https://img.shields.io/pypi/v/sqlite-html.svg?color=blue&logo=python&logoColor=white)](https://pypi.org/project/sqlite-html/) |
| Datasette | `datasette install datasette-sqlite-html` | [![Datasette](https://img.shields.io/pypi/v/datasette-sqlite-html.svg?color=B6B6D9&label=Datasette+plugin&logoColor=white&logo=python)](https://datasette.io/plugins/datasette-sqlite-html) |
| Node.js | `npm install sqlite-html` | [![npm](https://img.shields.io/npm/v/sqlite-html.svg?color=green&logo=nodedotjs&logoColor=white)](https://www.npmjs.com/package/sqlite-html) |
| Deno | [`deno.land/x/sqlite_html`](https://deno.land/x/sqlite_html) | [![deno.land/x release](https://img.shields.io/github/v/release/asg017/sqlite-html?color=fef8d2&include_prereleases&label=deno.land%2Fx&logo=deno)](https://deno.land/x/sqlite_html) |
| Ruby | `gem install sqlite-html` | ![Gem](https://img.shields.io/gem/v/sqlite-html?color=red&logo=rubygems&logoColor=white) |
| Github Release | | ![GitHub tag (latest SemVer pre-release)](https://img.shields.io/github/v/tag/asg017/sqlite-html?color=lightgrey&include_prereleases&label=Github+release&logo=github) |The [Releases page](https://github.com/asg017/sqlite-html/releases) contains pre-built binaries for Linux amd64, MacOS amd64 (no arm), and Windows.
### As a loadable extension
If you want to use `sqlite-html` as a [Runtime-loadable extension](https://www.sqlite.org/loadext.html), Download the `html0.dylib` (for MacOS), `html0.so` (Linux), or `html0.dll` (Windows) file from a release and load it into your SQLite environment.
> **Note:**
> The `0` in the filename (`html0.dylib`/ `html0.so`/`html0.dll`) denotes the major version of `sqlite-html`. Currently `sqlite-html` is pre v1, so expect breaking changes in future versions.For example, if you are using the [SQLite CLI](https://www.sqlite.org/cli.html), you can load the library like so:
```sql
.load ./html0
select html_version();
-- v0.0.1
```Or in Python, using the builtin [sqlite3 module](https://docs.python.org/3/library/sqlite3.html):
```python
import sqlite3con = sqlite3.connect(":memory:")
con.enable_load_extension(True)
con.load_extension("./html0")print(con.execute("select html_version()").fetchone())
# ('v0.0.1',)
```Or in Node.js using [better-sqlite3](https://github.com/WiseLibs/better-sqlite3):
```javascript
const Database = require("better-sqlite3");
const db = new Database(":memory:");db.loadExtension("./html0");
console.log(db.prepare("select html_version()").get());
// { 'html_version()': 'v0.0.1' }
```Or with [Datasette](https://datasette.io/):
```
datasette data.db --load-extension ./html0
```## See also
- [sqlite-http](https://github.com/asg017/sqlite-http), for making HTTP requests in SQLite (pairs great with this tool)
- [htmlq](https://github.com/mgdm/htmlq), for a similar but CLI-based HTML query tool using CSS selectors
- [riyaz-ali/sqlite](https://github.com/riyaz-ali/sqlite), the brilliant Go library that this library depends on
- [nalgeon/sqlean](https://github.com/nalgeon/sqlean), several pre-compiled handy SQLite functions, in C