Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/stibu81/browserhistory

Read and analyse Firefox' browser history
https://github.com/stibu81/browserhistory

Last synced: about 1 month ago
JSON representation

Read and analyse Firefox' browser history

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```

# browserhistory

[![R-CMD-check](https://github.com/stibu81/browserhistory/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/stibu81/browserhistory/actions/workflows/R-CMD-check.yaml)
[![Codecov test
coverage](https://codecov.io/gh/stibu81/browserhistory/branch/main/graph/badge.svg)](https://app.codecov.io/gh/stibu81/browserhistory?branch=main)

browserhistory offers tools to read and analyse the browser history of
Firefox with R. Support for other browsers might be added in the future.

## Installation

You can install the development version of browserhistory from
[GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("stibu81/browserhistory")
```

## Reading the Browser History

Firefox stores the browser history in a SQLite database called `places.sqlite`.
The function `read_firefox_history()` reads the browser history from this
database and returns it as a tibble. The package contains a small example
database that is read in the following example:

```{r}
library(browserhistory)
root_dir <- get_test_root_dir()
history <- read_browser_history(root_dir = root_dir)
history
```

In order to read the history, you need to figure out where Firefox stores the
relevant files. The folder contains a file called `profiles.ini` and subfolders
for one or more profiles. Use the folder that contains `profiles.ini` as
the root directory.

The following list gives hints on where to find the root directory on a small
number of systems:

* **Ubuntu 20.04/22.04:** If Firefox has been installed from a deb package, the root
directory is `⁠~/.mozilla/firefox`. (This might also be true for other
Debian-based distributions.) If Firefox has been installed as a snap package,
the root directory is `⁠~/snap/firefox/common/.mozilla/firefox`⁠.
* **Windows 10/11:** The root directory is
`⁠C:\Users\\AppData\Roaming\Mozilla\Firefox`, where
`` must be replaced by your user name.

You can use the function `guess_root_dir()` that tries to find a suitable root
directory according to the above rules.

```{r}
guess_root_dir()
```

Most of the functions that take a root directory as input use `guess_root_dir()`
as the default value. So, on some system, it will be enough to simply call
`read_browser_history()` without any inputs to get a result:

```{r echo = FALSE}
# cheat, to make the next chunk work
read_browser_history <- function() {
browserhistory::read_browser_history(root_dir = get_test_root_dir())
}
```
```{r}
read_browser_history()
```
```{r echo = FALSE}
# set back read_browser_history() to the correct function
read_browser_history <- browserhistory::read_browser_history
```

The structure of the root directory depends on the system. In all cases, it
contains the file `profiles.ini` that stores the information about the
available profiles. Use `list_profiles()` too see, which profiles there are:

```{r}
list_profiles(root_dir)
```

The test data contains two profiles and the second one is the default, as is
indicated by the attribute `"default"`:

```{r}
attr(list_profiles(root_dir), "default")
```

The values of the vector indicate the directories in which the profile data is
stored. As one can check, the test data indeed contain a directory for each
profile:

```{r}
list.files(root_dir, recursive = FALSE)
```

The full structure of the root directory on Linux looks as follows:

```
root_dir
├── bad_profile
│ └── somefile
├── profiles.ini
└── test_profile
└── places.sqlite
```

The profile `test_profile` actually contains the history database
`places.sqlite` (next to many other files in an actual root directory).
Sometimes, there are invalid profile like `bad_profile` that do not contain
`places.sqlite`.

On Windows, the root directory contains a directory `Profiles` that has the
profile directories as subdirectories. The equivalent of the test data would
have the following structure:

```
root_dir
├── Profiles
│ ├── bad_profile
│ │ └── somefile
│ └── test_profile
│ └── places.sqlite
└── profiles.ini
```

`read_firefox_history()` will automatically select the
default profile, but when there are several profiles and you want to read
a different one, you have to pass the name of the profile folder to the
`profile` argument as follows:

```{r}
history <- read_browser_history(root_dir = root_dir, profile = "test_profile")
history
```

On Windows, since the profile directories are inside a subdirectory, the
equivalent call would look as follows:

```{r eval = FALSE}
history <- read_browser_history(root_dir = root_dir, profile = "Profiles/test_profile")
```