https://github.com/avencera/fast_rss

Fast Elixir RSS feed parser, a NIF wrapper around the Rust RSS crate

# FastRSS


Parse RSS feeds very quickly





Intro | Compatibility | Installation | Usage | Benchmarks | Deploying | License

---

## Intro

Parse RSS feeds very quickly

- This is a Rust NIF built using [rustler](https://github.com/rusterlium/rustler)
- Uses the [RSS](https://crates.io/crates/rss) Rust crate to do the actual RSS parsing

**Speed**

FastRSS is already much faster than most of the pure Elixir/Erlang packages out there. In the benchmarks it is between **6.12x and 50.9x** faster than the next fastest package tested ([feeder_ex](https://github.com/manukall/feeder_ex)).

Compared to the slowest Elixir options tested ([feed_raptor](https://github.com/merongivian/feedraptor), [elixir_feed_parser](https://github.com/fdietz/elixir-feed-parser)), FastRSS was up to **259.91x** faster and used up to **5,466,379.81x** less memory _(0.00154 MB vs 8424.44 MB)_.

See the full [benchmarks](#benchmark) below.

## Compatibility

FastRSS requires a minimum combination of Elixir 1.6.0 and Erlang/OTP 20.0, and is tested with a maximum combination of Elixir 1.11.1 and Erlang/OTP 22.0.

## Installation

This package is available on [hex](https://hex.pm/packages/fast_rss).

It can be installed by adding `fast_rss` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:fast_rss, "~> 0.5.0"}
  ]
end
```

You also need the Rust compiler installed: https://www.rust-lang.org/tools/install

```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

## Usage

There are only two functions: `parse_rss/1` for parsing RSS feeds and `parse_atom/1` for parsing Atom feeds. Each takes a string and returns `{:ok, map()}`, where the map has string keys.

```elixir
iex(1)> {:ok, map_of_rss} = FastRSS.parse_rss("...rss_feed_string...")
iex(2)> Map.keys(map_of_rss)
["categories", "cloud", "copyright", "description", "docs", "dublin_core_ext",
"extensions", "generator", "image", "items", "itunes_ext", "language",
"last_build_date", "link", "managing_editor", "namespaces", "pub_date",
"rating", "skip_days", "skip_hours", "syndication_ext", "text_input", "title",
"ttl", "webmaster"]
```

The docs can be found at [https://hexdocs.pm/fast_rss](https://hexdocs.pm/fast_rss).
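
Because the result is a plain map with string keys, it can be walked with ordinary `Map` and `Enum` calls. The following is a minimal sketch rather than part of the library: the `FeedTitles` module is hypothetical, and it assumes that each entry under `"items"` carries a `"title"` key and that a failed parse comes back as `{:error, reason}`.

```elixir
defmodule FeedTitles do
  # Hypothetical helper, not part of FastRSS.

  # Parses an RSS string and returns the item titles.
  # Assumption: failures are returned as {:error, reason}.
  def from_rss(xml) when is_binary(xml) do
    case FastRSS.parse_rss(xml) do
      {:ok, feed} -> {:ok, titles(feed)}
      {:error, reason} -> {:error, reason}
    end
  end

  # Works on the parsed map directly. Keys are strings, not atoms.
  # Assumption: each item map has a "title" key, which may be nil.
  def titles(feed) do
    feed
    |> Map.get("items", [])
    |> Enum.map(&Map.get(&1, "title"))
    |> Enum.reject(&is_nil/1)
  end
end
```

Keeping `titles/1` separate from the parse call means it can be exercised with a hand-built map, without a feed string at hand.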

### Supported Feeds

Reading from the following feed formats and extensions is supported:

- RSS 0.90
- RSS 0.91
- RSS 0.92
- RSS 1.0
- RSS 2.0
- iTunes
- Dublin Core
- Atom
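
When the feed type is not known in advance, one option is to try the RSS parser first and fall back to the Atom parser. This is only a sketch under assumptions: the `FeedParser` module is hypothetical, and it assumes that `parse_rss/1` returns an `{:error, reason}` tuple when handed a document it cannot read.

```elixir
defmodule FeedParser do
  # Hypothetical helper, not part of FastRSS.

  # Tries each parser in order and returns the first {:ok, map} result.
  # If every parser fails, the last error is returned.
  # Assumption: parsers return {:ok, map} or {:error, reason}.
  def parse(xml, parsers \\ [&FastRSS.parse_rss/1, &FastRSS.parse_atom/1]) do
    Enum.reduce_while(parsers, {:error, :no_parser_matched}, fn parser, _last ->
      case parser.(xml) do
        {:ok, feed} -> {:halt, {:ok, feed}}
        {:error, reason} -> {:cont, {:error, reason}}
      end
    end)
  end
end
```

Passing the parsers as an argument keeps the fallback order explicit and lets the function be tried out with stand-in parsers.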

## Benchmark

HTML: [https://avencera.github.io/fast_rss/](https://avencera.github.io/fast_rss/)

Benchmark run from 2020-02-22 05:23:47.524699Z UTC

### System

Benchmark suite executing on the following system:


| Property | Value |
| --- | --- |
| Operating System | macOS |
| CPU Information | Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz |
| Number of Available Cores | 16 |
| Available Memory | 32 GB |
| Elixir Version | 1.10.1 |
| Erlang Version | 22.2.6 |

### Configuration

Benchmark suite executing with the following configuration:


| Option | Value |
| --- | --- |
| `:time` | 30 s |
| `:parallel` | 1 |
| `:warmup` | 5 s |
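
The report layout matches what [Benchee](https://github.com/bencheeorg/benchee) produces, so the configuration above could be expressed roughly as follows. This is a sketch, not the project's actual benchmark script: the `FeedBench` module is hypothetical, the `memory_time` value is an assumption (Benchee only reports memory when it is set), and the caller has to supply the feed strings.

```elixir
defmodule FeedBench do
  # Hypothetical module, not the project's benchmark script.

  # Options mirroring the configuration listed above (values in seconds).
  # Assumption: memory_time: 2 is a guess; the source does not state it.
  def options do
    [time: 30, warmup: 5, parallel: 1, memory_time: 2]
  end

  # `inputs` maps a label to a feed string,
  # e.g. %{"anxiety" => File.read!(path_to_feed)}.
  def run(inputs) do
    Benchee.run(
      %{"fast_rss" => fn feed -> FastRSS.parse_rss(feed) end},
      [inputs: inputs] ++ options()
    )
  end
end
```

Other parsers would be added as further entries in the jobs map so that Benchee can compare them against `fast_rss` for each input.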

### Statistics

**Input: anxiety**

Run Time

| Name | IPS | Average | Deviation | Median | 99th % |
| --- | --- | --- | --- | --- | --- |
| fast_rss | 188.57 | 5.30 ms | ±8.26% | 5.45 ms | 6.43 ms |
| feeder_ex | 3.70 | 269.92 ms | ±5.34% | 268.12 ms | 316.12 ms |
| feed_raptor | 2.99 | 334.01 ms | ±2.44% | 331.03 ms | 371.28 ms |
| elixir_feed_parser | 1.94 | 515.72 ms | ±1.94% | 516.10 ms | 536.05 ms |

Comparison

| Name | IPS | Slower |
| --- | --- | --- |
| fast_rss | 188.57 | |
| feeder_ex | 3.70 | 50.9x |
| feed_raptor | 2.99 | 62.99x |
| elixir_feed_parser | 1.94 | 97.25x |

Memory Usage

| Name | Memory | Factor |
| --- | --- | --- |
| fast_rss | 0.00156 MB | |
| feeder_ex | 17.21 MB | 11004.73x |
| feed_raptor | 268.53 MB | 171693.91x |
| elixir_feed_parser | 313.30 MB | 200316.09x |


**Input: ben**

Run Time

| Name | IPS | Average | Deviation | Median | 99th % |
| --- | --- | --- | --- | --- | --- |
| fast_rss | 83.95 | 11.91 ms | ±10.29% | 12.23 ms | 16.17 ms |
| feeder_ex | 13.33 | 75.04 ms | ±4.38% | 74.21 ms | 89.72 ms |
| elixir_feed_parser | 3.52 | 284.18 ms | ±3.89% | 283.83 ms | 324.08 ms |
| feed_raptor | 0.48 | 2078.76 ms | ±0.52% | 2076.27 ms | 2097.44 ms |

Comparison

| Name | IPS | Slower |
| --- | --- | --- |
| fast_rss | 83.95 | |
| feeder_ex | 13.33 | 6.3x |
| elixir_feed_parser | 3.52 | 23.86x |
| feed_raptor | 0.48 | 174.51x |

Memory Usage

| Name | Memory | Factor |
| --- | --- | --- |
| fast_rss | 0.00155 MB | |
| feeder_ex | 27.86 MB | 17990.96x |
| elixir_feed_parser | 163.88 MB | 105811.88x |
| feed_raptor | 1577.41 MB | 1018492.36x |


**Input: daily**

Run Time

| Name | IPS | Average | Deviation | Median | 99th % |
| --- | --- | --- | --- | --- | --- |
| fast_rss | 32.98 | 0.0303 s | ±7.62% | 0.0313 s | 0.0339 s |
| feeder_ex | 4.94 | 0.20 s | ±4.61% | 0.199 s | 0.24 s |
| elixir_feed_parser | 0.64 | 1.57 s | ±1.50% | 1.57 s | 1.63 s |
| feed_raptor | 0.127 | 7.88 s | ±0.23% | 7.88 s | 7.90 s |

Comparison

| Name | IPS | Slower |
| --- | --- | --- |
| fast_rss | 32.98 | |
| feeder_ex | 4.94 | 6.68x |
| elixir_feed_parser | 0.64 | 51.86x |
| feed_raptor | 0.127 | 259.91x |

Memory Usage

| Name | Memory | Factor |
| --- | --- | --- |
| fast_rss | 0.00153 MB | |
| feeder_ex | 109.73 MB | 71555.78x |
| elixir_feed_parser | 880.51 MB | 574178.95x |
| feed_raptor | 6386.12 MB | 4164382.64x |


**Input: dave**

Run Time

| Name | IPS | Average | Deviation | Median | 99th % |
| --- | --- | --- | --- | --- | --- |
| fast_rss | 407.08 | 2.46 ms | ±9.83% | 2.41 ms | 3.16 ms |
| feeder_ex | 56.52 | 17.69 ms | ±6.14% | 17.37 ms | 22.51 ms |
| elixir_feed_parser | 8.90 | 112.31 ms | ±4.12% | 111.93 ms | 127.60 ms |
| feed_raptor | 1.59 | 628.45 ms | ±1.60% | 626.71 ms | 656.74 ms |

Comparison

| Name | IPS | Slower |
| --- | --- | --- |
| fast_rss | 407.08 | |
| feeder_ex | 56.52 | 7.2x |
| elixir_feed_parser | 8.90 | 45.72x |
| feed_raptor | 1.59 | 255.83x |

Memory Usage

| Name | Memory | Factor |
| --- | --- | --- |
| fast_rss | 0.00157 MB | |
| feeder_ex | 9.25 MB | 5886.17x |
| elixir_feed_parser | 80.42 MB | 51170.23x |
| feed_raptor | 571.18 MB | 363425.45x |


**Input: sleepy**

Run Time

| Name | IPS | Average | Deviation | Median | 99th % |
| --- | --- | --- | --- | --- | --- |
| fast_rss | 760.30 | 1.32 ms | ±16.62% | 1.21 ms | 2.03 ms |
| feeder_ex | 124.28 | 8.05 ms | ±6.94% | 8.03 ms | 10.32 ms |
| elixir_feed_parser | 26.26 | 38.09 ms | ±5.08% | 37.81 ms | 44.42 ms |
| feed_raptor | 3.21 | 311.16 ms | ±2.85% | 307.86 ms | 345.09 ms |

Comparison

| Name | IPS | Slower |
| --- | --- | --- |
| fast_rss | 760.30 | |
| feeder_ex | 124.28 | 6.12x |
| elixir_feed_parser | 26.26 | 28.96x |
| feed_raptor | 3.21 | 236.57x |

Memory Usage

| Name | Memory | Factor |
| --- | --- | --- |
| fast_rss | 0.00157 MB | |
| feeder_ex | 4.28 MB | 2726.19x |
| elixir_feed_parser | 35.88 MB | 22829.92x |
| feed_raptor | 274.98 MB | 174963.99x |


**Input: stuff**

Run Time

| Name | IPS | Average | Deviation | Median | 99th % |
| --- | --- | --- | --- | --- | --- |
| fast_rss | 19.19 | 0.0521 s | ±9.19% | 0.0546 s | 0.0635 s |
| feeder_ex | 0.93 | 1.07 s | ±2.49% | 1.07 s | 1.15 s |
| elixir_feed_parser | 0.53 | 1.88 s | ±1.22% | 1.89 s | 1.92 s |
| feed_raptor | 0.0797 | 12.54 s | ±1.61% | 12.44 s | 12.77 s |

Comparison

| Name | IPS | Slower |
| --- | --- | --- |
| fast_rss | 19.19 | |
| feeder_ex | 0.93 | 20.59x |
| elixir_feed_parser | 0.53 | 36.11x |
| feed_raptor | 0.0797 | 240.68x |

Memory Usage

| Name | Memory | Factor |
| --- | --- | --- |
| fast_rss | 0.00154 MB | |
| feeder_ex | 140.58 MB | 91220.55x |
| elixir_feed_parser | 1018.78 MB | 661058.28x |
| feed_raptor | 8424.44 MB | 5466379.81x |


## Deploying

Deploying Rust NIFs can be a little annoying because you have to install the Rust compiler. We try to alleviate this with [`rustler_precompiled`](https://hexdocs.pm/rustler_precompiled/RustlerPrecompiled.html), which creates precompiled assets for a number of targets (see [`release.yml`](./.github/workflows/release.yml) for the full list) but does not cover all environments. If you are having trouble deploying this package, open an issue and I will try to help you out.

I will then add it to the FAQ below.

### Q. How do I deploy using an Alpine Dockerfile?

#### A. I recommend using a [multi-stage Dockerfile](https://docs.docker.com/develop/develop-images/multistage-build/) and doing the following:

1. On the stages where you build your deps and build your release, make sure to install `build-base` and `libgcc`:

```dockerfile
# This step installs all the build tools we'll need
RUN apk update && \
    apk upgrade --no-cache && \
    apk add --no-cache \
      git \
      curl \
      build-base \
      libgcc && \
    mix local.rebar --force && \
    mix local.hex --force
```

2. Install the Rust compiler and allow dynamic linking to the C library by setting `RUSTFLAGS`:

```dockerfile
# install rustup
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
ENV RUSTUP_HOME=/root/.rustup \
    RUSTFLAGS="-C target-feature=-crt-static" \
    CARGO_HOME=/root/.cargo \
    PATH="/root/.cargo/bin:$PATH"
```

3. On the stage where you actually run your Elixir release, install `libgcc`:

```dockerfile
################################################################################
## STEP 4 - FINAL
FROM alpine:3.11

ENV MIX_ENV=prod

RUN apk update && \
    apk add --no-cache \
      bash \
      libgcc \
      openssl-dev

COPY --from=release-builder /opt/built /app
WORKDIR /app
CMD ["/app/my_app/bin/my_app", "start"]
```

## License

FastRSS is released under the Apache License 2.0 - see the [LICENSE](LICENSE) file.