An open API service indexing awesome lists of open source software.

https://github.com/dsrhaslab/lazyfs

LazyFS: A FUSE Filesystem with an internal dedicated page cache, which can be used to simulate data loss on unsynced writes
https://github.com/dsrhaslab/lazyfs

data-loss-prevention filesystem fuse storage

Last synced: 5 months ago
JSON representation

LazyFS: A FUSE Filesystem with an internal dedicated page cache, which can be used to simulate data loss on unsynced writes

Awesome Lists containing this project

README

          

# LazyFS













A FUSE file system with an internal dedicated page cache that only flushes data if explicitly requested by the application. This is useful for simulating power failures and losing unsynced data.

Please cite the [following paper](https://www.vldb.org/pvldb/vol17/p3017-ramos.pdf) if you use LazyFS:

```bibtex
@article{
author = {Ramos, Maria and Azevedo, Jo\~{a}o and Kingsbury, Kyle and Pereira, Jos\'{e} and Esteves, T\^{a}nia and Macedo, Ricardo and Paulo, Jo\~{a}o},
title = {When Amnesia Strikes: Understanding and Reproducing Data Loss Bugs with Fault Injection},
year = {2024},
issue_date = {July 2024},
publisher = {VLDB Endowment},
volume = {17},
number = {11},
issn = {2150-8097},
url = {https://doi.org/10.14778/3681954.3681980},
doi = {10.14778/3681954.3681980},
journal = {Proc. VLDB Endow.},
month = jul,
pages = {3017–3030},
numpages = {14}
}
```

## Installation

LazyFS was tested with ext4 (with the default mount options) as the underlying file system (FUSE backend), in both Debian 11 (bullseye) and Ubuntu 20.04 (focal) environment. It is C++17 compliant and requires the following packages to be installed:

- `CMake` and `g++`:

```bash
sudo apt install g++ cmake

# The following versions are used during development:
# cmake: 3.16.3
# g++: 9.4.0
```

- `FUSE 3`:

```bash
sudo apt install libfuse3-dev libfuse3-3 fuse3
```

FUSE requires the option `allow_other` as a startup argument so that other users can read and write files, besides the user owning the file system. For that, you must uncomment/add the following line on the configuration file `/etc/fuse.conf`:

```bash
user_allow_other
```

Compile and install the caching library `libpcache`, which will be attached to **LazyFS**:

```bash
cd libs/libpcache && ./build.sh && cd -
```

Finally, build `lazyfs`:

```bash
cd lazyfs && ./build.sh && cd -
```

## Running and Injecting faults

**LazyFS** uses a [toml](https://toml.io/en/) configuration file to set up the cache and a named pipe to append fault commands:

```toml
[faults]
fifo_path="/tmp/faults.fifo"
# fifo_path_completed="/tmp/faults_completed.fifo"

[cache]
apply_eviction=false

[cache.simple]
custom_size="0.5GB"
blocks_per_page=1

# [cache.manual]
# io_block_size=4096
# page_size=4096
# no_pages=10

[file system]
log_all_operations=false
logfile="/tmp/lazyfs.log"

[[injection]]
type="torn-seq"
op="write"
file="output.txt"
persist=[1,4]
occurrence=2

[[injection]]
type="torn-op"
file="output1.txt"
occurrence=5
parts=3 #or parts_bytes=[4096,3600,1260]
persist=[1,3]

[[injection]]
type="clear"
from="f1.txt"
timing="before"
op="fsync"
occurrence=6
crash=true
```

I recommend following the `simple` cache configuration (indicating the cache size and using a similar configuration file as `default.toml`), since it's currently the most tested schema in our experiments. Additionally, for the section **[cache]**, you can specify the following:

- **apply_eviction**: Whether the cache should behave like the real page cache, evicting pages when the cache fills to the maximum.

- **[cache.simple]** or **[cache.manual]**: To setup the cache size and internal organization. For now, you could just follow the example using the **custom_size** in (Gb/Mb/Kb) and the **number of blocks in each page** (you can just leave 1 as default). For manual configurations, comment out the simple configuration and uncoment/change the example above to suit your needs.

**Optionally**, users can specify a set of predefined `injection` faults before LazyFS starts running. These faults are introduced as additional features, namely:

- **torn-seq**: This fault type is used when a sequence of system calls, targeting a single file, is executed consecutively without an intervening `fsync`. *In the example*, during the second group of consecutive writes (the group number is defined by the parameter `occurrence`), to the file "output.txt", the first and fourth writes will be persisted to disk (the writes to be persisted are defined by the parameter `persist`). After the fourth write (the last in the `persist` vector), LazyFS will crash itself.
- **torn-op**: This fault type involves dividing a write system call into smaller parts, with some of these parts being persisted while others are not. In the example, the fifth write issued (the number of the write is defined by the parameter `occurrence`) to the file "output1.txt" will be divided into three equal parts if the `parts` parameter is used, or into customizable-sized parts if the `parts_bytes` parameter is defined. In the commented code, there's an example of using `parts_bytes`, where the write will be torn into three parts: the first with 4096 bytes, the second with 3600 bytes, and the last with 1200 bytes. The `persist` vector determines which parts will be persisted. After the persistence of these parts, LazyFS will crash.
- **clear-cache**: Clears unsynced data in a certain point of the execution. In the example above, this fault will be injected after (`timing`) the sixth (`occurrence`) `fsync` (`op`) to the file "f1.txt" (`from`). The `op` parameter must be a system call, and if it involves two paths (such as `rename`), the `to` parameter should also be specified. The `crash` parameter determines whether LazyFS should crash after the fault injection.

In torn-seq and torn-op faults, configure parameter `file` with the absolute path using the root directory of LazyFS.

Other parameters:

- **fifo_path**: The absolute path where the faults FIFO should be created.
- **fifo_path_completed**: If we plan to inject the clear cache fault synchronously, it is necessary to determine the completion of the `lazyfs::clear-cache` command execution. By specifying this parameter, a message will be written to another FIFO (`finished::clear-cache`), so that users can set up a reader process that waits before making any post-fault consistency checks.
- **log_all_operations**: Whether to log all file system operations that LazyFS receives.
- **logfile**: The log file for LazyFS's outputs. Fault acknowledgment is sent to `stdout` or to the `logfile`.

To **run the file system**, one could use the **mount-lazyfs.sh** script, which calls FUSE with the correct parameters:

- `-m` : FUSE mount directory.
- `-r` : FUSE root directory.
- `-f` : Foreground mode.
- `-s` : FUSE single thread mode.

```bash
cd lazyfs/

# Running LazyFS in the foreground (add '-f/--foregound')

./scripts/mount-lazyfs.sh -c config/default.toml -m /tmp/lazyfs.mnt -r /tmp/lazyfs.root -f -s

# Running LazyFS in the background

./scripts/mount-lazyfs.sh -c config/default.toml -m /tmp/lazyfs.mnt -r /tmp/lazyfs.root -s

# Umount with

./scripts/umount-lazyfs.sh -m /tmp/lazyfs.mnt/

# Display help

./scripts/mount-lazyfs.sh --help
./scripts/umount-lazyfs.sh --help
```

Finally, one can control LazyFS by echoing the following commands to the configured FIFO:

- **Clear cache** - clears all unsynced data:

```bash
echo "lazyfs::clear-cache" > /tmp/faults.fifo
```

- **Checkpoint** - checkpoints all unsynced data by calling `write` to the underlying file system (without `fsync`):

```bash
echo "lazyfs::cache-checkpoint" > /tmp/faults.fifo
```

> Note: Any subsequent failure is outside of the test control.

- **Show usage** - displays the current cache usage (percentage of pages allocated):

```bash
echo "lazyfs::display-cache-usage" > /tmp/faults.fifo
```

- **Report unsynced data,** which displays the inodes that have data in cache:

```bash
echo "lazyfs::unsynced-data-report" > /my/path/faults.fifo
```

- **Kill the filesystem,** which is triggered by an operation, a timing and a path regex:

Here **timing** should be one of `before` or `after`, and **op** should be a valid system call name (e.g. `write` or `read`).

- In the case of operations that have a source path only (e.g. `create`, `open`, `read`, `write`, ...)

```bash
echo "lazyfs::crash::timing=...::op=...::from_rgx=..." > /my/path/faults.fifo
```

Here, `from_rgx` is required (do not specify to_rgx).

- For `rename`, `link` and `symlink`, one is able to specify the destination path:

```bash
echo "lazyfs::crash::timing=...::op=...::from_rgx=...::to_rgx=..." > /my/path/faults.fifo
```

Here, only one of `from_rgx` or `to_rgx` is required.

Example 1:

```bash
echo "lazyfs::crash::timing=before::op=write::from_rgx=file1" > /my/path/faults.fifo
```

> Kills LazyFS before executing a write operation to the file pattern 'file1'.

Example 2:

```bash
echo "lazyfs::crash::timing=before::op=link::from_rgx=file1::to_rgx=file2" > /my/path/faults.fifo
```

> Kills LazyFS before executing a rename operation from the file pattern 'file1' to the file pattern 'file2'.

Example 3:

```bash
echo "lazyfs::crash::timing=before::op=rename::to_rgx=fileabd" > /my/path/faults.fifo
```

> Kills LazyFS before executing a link operation to the file pattern 'fileabd'.

- **Kill the filesystem** after injecting `torn-op` or `torn-seq`faults:

The parameters are the same as the ones presented in the above configuration file. Parameters that have multiple values, must be specified without the parenthesis (e.g., `persist=1,2` ).

- ```bash
echo "lazyfs::torn-op::file=...::persist=...::parts=...::occurrence=..." > /my/path/faults.fifo
```

- ```bash
echo "lazyfs::torn-seq::op=...::file=...::persist=...::occurrence=..." > /my/path/faults.fifo
```

After a LazyFS crash, the mount directory may be left in an undefined state. In that case, run:

```bash
fusermount -uz
```

LazyFS expects that every buffer written to the FIFO file terminates with a new line character (**echo** does this by default). Thus, if using `pwrite`, for example, make sure you end the buffer with `\n`.

## Contact

For additional information regarding possible improvements and collaborations please open an issue or contact: [@devzizu](https://github.com/devzizu), [@mj-ramos](https://github.com/mj-ramos) and [@dsrhaslab](https://github.com/dsrhaslab).