https://github.com/jdblischak/faviconPlease
Find URL for a website's favicon
- Host: GitHub
- URL: https://github.com/jdblischak/faviconPlease
- Owner: jdblischak
- License: other
- Created: 2020-12-14T20:19:38.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2024-11-04T14:58:13.000Z (about 1 year ago)
- Last Synced: 2024-11-26T14:10:16.147Z (12 months ago)
- Topics: favicon-grabber, favicons, rstats
- Language: R
- Homepage: https://cran.r-project.org/package=faviconPlease
- Size: 92.8 KB
- Stars: 8
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Code of conduct: .github/CODE_OF_CONDUCT.md
---
output: github_document
---
# faviconPlease
[CRAN package page](https://cran.r-project.org/package=faviconPlease)
[GitHub Actions workflows](https://github.com/jdblischak/faviconPlease/actions)
```{r description, results='asis', echo=FALSE}
cat(read.dcf("DESCRIPTION", fields = "Description"))
```
```{r example}
library(faviconPlease)
faviconPlease("https://github.com/")
```
Also check out my [blog post on faviconPlease][blog-post] for more background
and examples.
[blog-post]: https://blog.jdblischak.com/posts/faviconplease/
## Installation
Install latest release from CRAN:
```{r installation-cran, eval=FALSE}
install.packages("faviconPlease")
```
Install development version from GitHub:
```{r installation-github, eval=FALSE}
install.packages("remotes")
remotes::install_github("jdblischak/faviconPlease")
```
## Code of Conduct
Please note that the faviconPlease project is released with a [Contributor Code
of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html).
By contributing to this project, you agree to abide by its terms.
## Default strategy
By default, `faviconPlease()` uses the following strategy to find the URL to the
favicon for a given website. It stops once it finds a URL and returns it.
1. Download the HTML file and search its `<head>` for any `<link>` elements with
   `rel="icon"` or `rel="shortcut icon"`.
1. Download the HTML file at the root of the server (i.e. discard the path) and
   search its `<head>` for any `<link>` elements with `rel="icon"` or
   `rel="shortcut icon"`.
1. Attempt to download a file called `favicon.ico` at the root of the server.
   This is the default location where a browser looks if the HTML file does not
   specify an alternative location in a `<link>` element. If the file
   `favicon.ico` is successfully downloaded, then this URL is returned.
1. If the above steps fail, as a fallback, use the [favicon service][ddg-icon]
   provided by the search engine [DuckDuckGo][ddg]. This provides a nice default
   for websites that don't have a favicon (or whose favicon can't be easily
   found).
[ddg]: https://duckduckgo.com/
[ddg-icon]: https://duckduckgo.com/duckduckgo-help-pages/privacy/favicons/
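The "stop at the first hit, otherwise fall back" logic described above can be sketched in a few lines of R. This is only an illustration, not the package's actual implementation: `firstFavicon()` and the three stand-in functions are hypothetical names, and the stand-ins return fixed strings so that the sketch runs without network access.

```r
# Hypothetical helper (not part of faviconPlease): try each finder in order
# and return the first non-empty result; otherwise use the fallback.
firstFavicon <- function(scheme, server, path, functions, fallback) {
  for (f in functions) {
    favicon <- f(scheme, server, path)
    if (nzchar(favicon)) {
      return(favicon)
    }
  }
  # The fallback may be a function of the server or a fixed URL/path
  if (is.function(fallback)) fallback(server) else fallback
}

# Stand-in finders so the sketch runs offline (assumptions, not real lookups)
noLink <- function(scheme, server, path) ""
icoAtRoot <- function(scheme, server, path) {
  sprintf("%s://%s/favicon.ico", scheme, server)
}
genericIcon <- function(server) sprintf("https://icons.example/%s.ico", server)

# The second finder succeeds, so the fallback is never called
found <- firstFavicon("https", "example.com", "/a/b",
                      list(noLink, icoAtRoot), genericIcon)

# No finder succeeds, so the fallback function is used
fellBack <- firstFavicon("https", "example.com", "/a/b",
                         list(noLink), genericIcon)
```

Because every finder shares the same three arguments and signals failure with `""`, the order of the list fully determines which source wins.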
## Extending faviconPlease
The default strategy above is designed to reliably get you a favicon URL for
most websites. However, you can customize it as needed.
### Change the fallback to use Google's favicon service
The default fallback function is `faviconDuckDuckGo()`. To instead use Google's
favicon service, you can set the argument `fallback = faviconGoogle`.
Note that neither DuckDuckGo nor Google has every favicon you might expect, and
availability can change over time. You can see some examples in my [blog
post][blog-post]. Fortunately they both provide a generic favicon to insert when
they don't have the favicon.
### Use a custom fallback function
You can use your own custom fallback function instead. It must accept one
argument, which is the server, e.g. `"github.com"`. The easiest approach would
be to copy-paste one of the existing fallback functions and modify it to use
your alternative favicon service.
```{r custom-fallback}
args(faviconDuckDuckGo)
body(faviconDuckDuckGo)
```
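As a minimal sketch of such a function, the example below builds a URL for an imaginary icon service. The name `faviconMyService()` and the base URL `https://icons.example.com/` are placeholders chosen for illustration, not a real service or part of the package.

```r
# Hypothetical fallback: accepts the server (e.g. "github.com") and returns
# a URL for an imaginary icon service. Replace the base URL with the service
# you actually use.
faviconMyService <- function(server) {
  sprintf("https://icons.example.com/%s.ico", server)
}

faviconMyService("github.com")
```

It could then be supplied to `faviconPlease()` via `fallback = faviconMyService`.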
### Use a custom fallback favicon
If you have a URL to a generic favicon file that you would like to use as a
fallback, you can pass it directly as a character vector to the `fallback`
argument. It could also be a path to an image file on the server where your app
is running.
### Change the order of the favicon functions
The default strategy first checks the `<head>` for a link to the favicon file
and then checks for the availability of the file `favicon.ico`. You can change
this order, or only perform one of them, by changing the argument `functions`
passed to `faviconPlease()`. It should be a list of functions.
```{r order, eval=FALSE}
# default
functions = list(faviconLink, faviconIco)
# Switch the order
functions = list(faviconIco, faviconLink)
# Only search <head>
functions = list(faviconLink)
# Only check for favicon.ico
functions = list(faviconIco)
# Skip the favicon functions entirely and just use the fallback
functions = NULL
```
### Use a custom favicon function
You can also create your own custom favicon function to pass to
`faviconPlease()`. By default it must accept 3 arguments. It will be passed the
URL's scheme (e.g. `"https"`), server (e.g. `"github.com"`), and path (e.g.
`"/jdblischak/faviconPlease"`). Your function should return the URL to a favicon
or an empty string, `""`, if it can't find one.
```{r faviconLink-signature}
# Favicon functions must accept at least 3 positional arguments
args(faviconLink)
```
As a concrete example, here is a custom function for searching for `favicon.ico`
on Ubuntu 20.04, which has increased security settings (see troubleshooting
section below).
```{r faviconIcoUbuntu20}
faviconIcoUbuntu20 <- function(scheme, server, path) {
  faviconIco(scheme, server, path, method = "wget",
             extra = c("--no-check-certificate",
                       "--ciphers=DEFAULT:@SECLEVEL=1"))
}
```
It calls `faviconIco()` with the specific settings needed by `download.file()`
to work on Ubuntu 20.04. You could then use your custom function instead of the
default `faviconIco()` by calling `faviconPlease()` with `functions =
list(faviconLink, faviconIcoUbuntu20)`.
Note that the example function `faviconIcoUbuntu20()` will likely fail on
Windows, macOS, and Ubuntu versions prior to 20.04.
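A custom favicon function does not have to download anything. As a second, purely illustrative sketch, the function below consults a small lookup table before any network-based search. The names `knownFavicons` and `faviconKnown()` and the table entry are hypothetical.

```r
# Hypothetical lookup table; the entry is a placeholder
knownFavicons <- c(
  "intranet.example.com" = "https://intranet.example.com/static/logo.png"
)

# Accepts the 3 positional arguments and returns "" when there is no match,
# so the next function in the list gets a chance to run
faviconKnown <- function(scheme, server, path) {
  if (server %in% names(knownFavicons)) {
    knownFavicons[[server]]
  } else {
    ""
  }
}

faviconKnown("https", "intranet.example.com", "/wiki")
faviconKnown("https", "github.com", "/jdblischak/faviconPlease")
```

Placing it first, e.g. `functions = list(faviconKnown, faviconLink, faviconIco)`, would let the table take precedence over the default searches.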
## Troubleshooting
Unfortunately it's not easy to make this foolproof for all operating systems
and all websites. Here are some known issues:
1. `download.file()`, used by `faviconIco()`, is known to have cross-platform
issues. Thus the official documentation in `?download.file` recommends:
> Setting the `method` should be left to the end user.
Accordingly, `faviconIco()` exposes the arguments `method`, `extra`, and
`headers`, which are passed directly to `download.file()`. Alternatively you
can set the global options `"download.file.method"` or
`"download.file.extra"`.
1. Ubuntu 20.04 increased its default security settings for downloading files
from the internet ([details][openssl-ticket]). Unfortunately many websites have
not updated their SSL certificates to comply with the increased security
restrictions. `faviconLink()` has a workaround for this situation, but not
`faviconIco()`. As an example, here's how you could detect the availability of
`favicon.ico` for the Ensembl website on Ubuntu 20.04.
```{r ubuntu20, eval=FALSE}
faviconIco("https", "www.ensembl.org", "",
           method = "wget", extra = c("--no-check-certificate",
                                      "--ciphers=DEFAULT:@SECLEVEL=1"))
```
Alternatively, if it's an option for you, you could avoid this workaround by
using the previous Ubuntu LTS release 18.04. Also note that the above
command will fail on Ubuntu 18.04 because the default `wget` installed
doesn't have the argument `--ciphers`.
[openssl-ticket]: https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1864689
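For the global-options route mentioned in the first item, one possible pattern is to set the options temporarily and restore the previous values afterwards. The sketch below uses only base R; the `wget` flags are the ones from the Ubuntu 20.04 example and are an assumption about your system, so adjust or drop them as needed.

```r
# Set the download options, keeping the previous values so they can be restored
old <- options(
  download.file.method = "wget",
  download.file.extra = c("--no-check-certificate",
                          "--ciphers=DEFAULT:@SECLEVEL=1")
)

# The new setting is now in effect for calls that rely on the global options
inEffect <- getOption("download.file.method")

# ... call faviconPlease() or faviconIco() here ...

# Restore whatever was set before
options(old)
```

Restoring the old values keeps the relaxed security settings from leaking into unrelated downloads later in the session.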