https://github.com/rempsyc/flightanalysis
R package to scrape flight data from Google Flights and analyze prices. Can determine the optimal flight by date, place, and price.
- Host: GitHub
- URL: https://github.com/rempsyc/flightanalysis
- Owner: rempsyc
- License: other
- Created: 2025-11-09T20:09:05.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-12-02T05:06:26.000Z (7 months ago)
- Last Synced: 2025-12-04T08:18:59.855Z (6 months ago)
- Language: R
- Homepage: https://rempsyc.github.io/flightanalysis/
- Size: 5.12 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- License: LICENSE
README
---
output:
github_document
---
# Flight Analysis: Find the best flights
An R package for analyzing, forecasting, and collecting flight data and prices from Google Flights.
## Features
- Detailed scraping and querying tools for Google Flights using chromote
- Support for multiple trip types: one-way, round-trip, chain-trip, and perfect-chain
- Flexible date search across multiple airports and date ranges
- Summary tables showing prices by city and date
- Automatic identification of cheapest travel dates
- Visualization functions for price trends and best dates
## Installation
You can install the development version of flightanalysis from GitHub:
```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
```{r install, eval=FALSE}
install.packages(
  "flightanalysis",
  repos = c("https://rempsyc.r-universe.dev", "https://cloud.r-project.org")
)
# Or if you need the version from the last hour, install through `remotes`
# install.packages("remotes")
remotes::install_github("rempsyc/flightanalysis")
```
## Usage
### Loading the Package
```{r library, eval=TRUE}
library(flightanalysis)
```
### Creating Flight Queries and Fetching the Data
`fa_define_query()` is the backbone of most of the package's functionality. It creates a query object that stores the flight information along with the metadata of your query. `fa_fetch_flights()` then fetches the flight data for that query. Origin and destination can be given as any mix of three-letter airport codes, three-letter city codes, or city names.
```{r fa_define_query, eval=TRUE}
# Round-trip
query <- fa_define_query("NYC", "London", "2025-12-20", "2026-01-05")
# Same as:
query <- fa_define_query("New York", "LON", "2025-12-20", "2026-01-05")
query
# Fetch the flight data
flights <- fa_fetch_flights(query)
# View the flight data
head(flights$data[1:11]) |>
  knitr::kable()
```
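To make the idea of mixing codes and city names concrete, here is a minimal sketch of how a location string could be normalized to a three-letter code. It is not the package's implementation: the `city_codes` lookup table and the `normalize_location()` helper are hypothetical and only cover the two cities used in the example above.

```{r sketch_normalize_location, eval=FALSE}
# Hypothetical lookup table (illustration only); the package's real mapping
# is not shown in this README.
city_codes <- c("new york" = "NYC", "london" = "LON")

# Hypothetical helper: return a three-letter code for either a code
# (any case) or a known city name.
normalize_location <- function(x) {
  x <- trimws(x)
  # Anything that already looks like a three-letter code is passed through
  if (grepl("^[A-Za-z]{3}$", x)) {
    return(toupper(x))
  }
  # Otherwise try a case-insensitive lookup by city name
  code <- city_codes[tolower(x)]
  if (is.na(code)) {
    stop("Unknown location: ", x)
  }
  unname(code)
}

normalize_location("New York")
normalize_location("LON")
```

Under this sketch, `"New York"` and `"NYC"` resolve to the same code, which is why the two `fa_define_query()` calls above describe the same trip.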
The package supports multiple trip types:
- **One-way**: `fa_define_query("JFK", "IST", "2025-07-20")`
- **Round-trip**: `fa_define_query("JFK", "IST", "2025-07-20", "2025-08-20")`
- **Chain-trip**: `fa_define_query("JFK", "IST", "2025-08-20", "RDU", "LGA", "2025-12-25")`
- **Perfect-chain**: `fa_define_query("JFK", "2025-09-20", "IST", "2025-09-25", "JFK")`
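
The four call shapes above differ only in where the dates sit among the positional arguments. The sketch below shows one way such a pattern could be recognized. `classify_trip()` is a hypothetical helper written for this illustration; it is not exported by flightanalysis and may not match how the package distinguishes trip types internally.

```{r sketch_classify_trip, eval=FALSE}
# Hypothetical helper (not part of flightanalysis): infer the trip type from
# the positions of YYYY-MM-DD dates among the positional arguments.
classify_trip <- function(...) {
  args <- c(...)
  is_date <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", args)
  n <- length(args)

  if (n == 3 && identical(is_date, c(FALSE, FALSE, TRUE))) {
    # origin, destination, date
    "one-way"
  } else if (n == 4 && identical(is_date, c(FALSE, FALSE, TRUE, TRUE))) {
    # origin, destination, departure date, return date
    "round-trip"
  } else if (n >= 6 && n %% 3 == 0 &&
             identical(is_date, rep(c(FALSE, FALSE, TRUE), n / 3))) {
    # repeated (origin, destination, date) legs
    "chain-trip"
  } else if (n >= 5 && n %% 2 == 1 &&
             identical(is_date, rep(c(FALSE, TRUE), length.out = n))) {
    # alternating place, date, place, ..., ending on a place
    "perfect-chain"
  } else {
    stop("Unrecognized argument pattern")
  }
}

classify_trip("JFK", "IST", "2025-07-20")
classify_trip("JFK", "2025-09-20", "IST", "2025-09-25", "JFK")
```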
## Flexible Date Search
The package supports flexible date search across multiple airports and dates:
```{r fa_define_query_range, eval=TRUE}
# Create query objects for multiple origins and dates
queries <- fa_define_query_range(
  origin = c("BOM", "DEL"),
  dest = "JFK",
  date_min = "2025-12-18",
  date_max = "2025-12-22"
)
# Fetch all flights
flights <- fa_fetch_flights(queries, verbose = FALSE)
# Create summary table (City × Date with prices)
fa_summarize_prices(flights) |>
  knitr::kable()
# Find the cheapest dates
fa_find_best_dates(
  flights,
  n = 5,
  by = "min",
  price_max = 1400,
  max_stops = 1,
  travel_time_max = 26 # 26 hours (numeric = hours, or use "26 hr" format)
) |>
  knitr::kable()
```
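For readers who want to see the logic behind this workflow without scraping anything, the base R sketch below reproduces the three steps on toy data: expand origins and dates into one row per query, pivot prices into a city by date table, then filter and rank the cheapest dates. The prices and stop counts are made up for illustration, and the `offers` data frame and its column names are assumptions, not the structure returned by `fa_fetch_flights()`.

```{r sketch_date_search, eval=FALSE}
# Step 1: one row per (origin, date) combination, like a query range
grid <- expand.grid(
  origin = c("BOM", "DEL"),
  date = seq(as.Date("2025-12-18"), as.Date("2025-12-22"), by = "day"),
  stringsAsFactors = FALSE
)
nrow(grid) # 2 origins x 5 dates

# Made-up prices and stop counts (illustration only, not real fares).
# expand.grid() varies `origin` fastest: BOM, DEL, BOM, DEL, ...
offers <- grid
offers$price <- c(1250, 1180, 1320, 1450, 1500, 1390, 1210, 1290, 1150, 1230)
offers$stops <- c(1, 1, 2, 1, 1, 1, 0, 1, 1, 2)

# Step 2: city x date table holding the minimum price per cell
summary_tbl <- tapply(
  offers$price,
  list(offers$origin, format(offers$date)),
  min
)
summary_tbl

# Step 3: keep offers within the price and stop limits, then take the
# n cheapest
price_max <- 1400
max_stops <- 1
n <- 5
eligible <- offers[offers$price <= price_max & offers$stops <= max_stops, ]
best <- head(eligible[order(eligible$price), ], n)
best
```

Filtering before ranking means an expensive or multi-stop itinerary never crowds out a cheaper eligible one, which is the behavior the `price_max` and `max_stops` arguments above suggest.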
## Visualizing Price Data
The package includes plotting functions to visualize price trends and best dates:
```{r plots, eval=TRUE}
# Plot price trends across dates
fa_plot_prices(
  flights,
  title = "Flight Prices: BOM/DEL to JFK",
  size_by = "travel_time",
  annotate_col = "travel_time"
)
# Plot best travel dates
fa_plot_best_dates(flights)
```
## Original Python Package
**Credits:** This package is an R implementation inspired by the original Python package [google-flight-analysis](https://github.com/celebi-pkg/flight-analysis) by Kaya Celebi.