Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/wanasit/chrono

A natural language date parser in Javascript
https://github.com/wanasit/chrono

Last synced: about 2 months ago
JSON representation

A natural language date parser in Javascript

Awesome Lists containing this project

README

        

# Chrono (v2)

A natural language date parser in Javascript.

![Build Status](https://github.com/github/docs/actions/workflows/test.yml/badge.svg)
[![Coverage Status](https://img.shields.io/coverallsCoverage/github/wanasit/chrono.svg)](https://coveralls.io/r/wanasit/chrono?branch=master)

It is designed to handle most date/time format and extract information from any given text:

* _Today_, _Tomorrow_, _Yesterday_, _Last Friday_, etc
* _17 August 2013 - 19 August 2013_
* _This Friday from 13:00 - 16.00_
* _5 days ago_
* _2 weeks from now_
* _Sat Aug 17 2013 18:40:39 GMT+0900 (JST)_
* _2014-11-30T08:15:30-05:30_

### Installation

With npm:
```bash
$ npm install --save chrono-node
```

```javascript
import * as chrono from 'chrono-node';

chrono.parseDate('An appointment on Sep 12-13');
```
For Node.js:
```javascript
const chrono = require('chrono-node');

// or `import chrono from 'chrono-node'` for ECMAScript
```

### What's changed in the v2
For Users
* Chrono’s default now handles only international English. While in the previous version, it tried to parse with all known languages.
* The current fully supported languages are `en`, `ja`, `fr`, `nl`, `ru` and `uk` (`de`, `pt`, and `zh.hant` are partially supported).

For contributors and advanced users
* The project is rewritten in TypeScript
* New [Parser](#parser) and [Refiner](#refiner) interface
* New source code structure. All parsers, refiners, and configuration should be under a locale directory (See. `locales/en`)

**Note: [v1.x.x](https://github.com/wanasit/chrono/tree/v1.x.x) will still be supported for the time being.**

## Usage

Simply pass a `string` to functions `chrono.parseDate` or `chrono.parse`.

```javascript
import * as chrono from 'chrono-node';

chrono.parseDate('An appointment on Sep 12-13');
// Fri Sep 12 2014 12:00:00 GMT-0500 (CDT)

chrono.parse('An appointment on Sep 12-13');
/* [{
index: 18,
text: 'Sep 12-13',
start: ...
}] */
```

For more advanced usage, here is the typescript definition of the `parse` function:
```typescript
parse(text: string, ref?: ParsingReference, option?: ParsingOption): ParsedResult[] {...}
```

#### Parsing Reference (Date / Timezone)

Today's "Friday" is different from last month's "Friday".
The meaning of the referenced dates depends on when and where they are mentioned.
Chrono lets you define the reference as `Date` or `ParsingReference` object:

```javascript
// (Note: the examples run on JST timezone)

chrono.parseDate('Friday', new Date(2012, 8 - 1, 23));
// Fri Aug 24 2012 12:00:00 GMT+0900 (JST)

chrono.parseDate('Friday', new Date(2012, 8 - 1, 1));
// Fri Aug 03 2012 12:00:00 GMT+0900 (JST)

chrono.parseDate("Friday at 4pm", {
// Wed Jun 09 2021 21:00:00 GMT+0900 (JST)
// = Wed Jun 09 2021 07:00:00 GMT-0500 (CDT)
instant: new Date(1623240000000),
timezone: "CDT",
})
// Sat Jun 12 2021 06:00:00 GMT+0900 (JST)
// = Fri Jun 11 2021 16:00:00 GMT-0500 (CDT)
```

#### ParsingReference
* `instant?: Date` The instant when the input is written or mentioned
* `timezone?: string | number` The timezone where the input is written or mentioned.
Support minute-offset (number) and timezone name (e.g. "GMT", "CDT")

### Parsing Options

`forwardDate` (boolean) to assume the results should happen after the reference date (forward into the future)

```javascript
const referenceDate = new Date(2012, 7, 25);
// Sat Aug 25 2012 00:00:00 GMT+0900 -- The reference date was Saturday

chrono.parseDate('Friday', referenceDate);
// Fri Aug 24 2012 12:00:00 GMT+0900 (JST) -- The day before was Friday

chrono.parseDate('Friday', referenceDate, { forwardDate: true });
// Fri Aug 31 2012 12:00:00 GMT+0900 (JST) -- The following Friday
```

`timezones` Override or add custom mappings between timezone abbreviations and offsets. Use this when you want Chrono to parse certain text into a given timezone offset. Chrono supports both unambiguous (normal) timezone mappings and ambigous mappings where the offset is different during and outside of daylight savings.

```javascript
// Chrono doesn't understand XYZ, so no timezone is parsed
chrono.parse('at 10:00 XYZ', new Date(2023, 3, 20))
// "knownValues": {"hour": 10, "minute": 0}

// Make Chrono parse XYZ as offset GMT-0300 (180 minutes)
chrono.parse('at 10:00 XYZ', new Date(2023, 3, 20), { timezones: { XYZ: -180 } })
// "knownValues": {"hour": 10, "minute": 0, "timezoneOffset": -180}

// Make Chrono parse XYZ as offset GMT-0300 outside of DST, and GMT-0200 during DST. Assume DST is between
import { getLastDayOfMonthTransition } from "timezone";
import { Weekday, Month } from "parsing";

const parseXYZAsAmbiguousTz = {
timezoneOffsetDuringDst: -120,
timezoneOffsetNonDst: -180,
dstStart: (year: number) => getLastWeekdayOfMonth(year, Month.FEBRUARY, Weekday.SUNDAY, 2),
dstEnd: (year: number) => getLastWeekdayOfMonth(year, Month.SEPTEMBER, Weekday.SUNDAY, 3)
};
// Parsing a date which falls within DST
chrono.parse('Jan 1st 2023 at 10:00 XYZ', new Date(2023, 3, 20), { timezones: { XYZ: parseXYZAsAmbiguousTz } })
// "knownValues": {"month": 1, ..., "timezoneOffset": -180}

// Parsing a non-DST date
chrono.parse('Jun 1st 2023 at 10:00 XYZ', new Date(2023, 3, 20), { timezones: { XYZ: parseXYZAsAmbiguousTz } })
// "knownValues": {"month": 6, ..., "timezoneOffset": -120}
```

### Parsed Results and Components

#### ParsedResult
* `refDate: Date` The [reference date](#reference-date) of this result
* `index: number` The location within the input text of this result
* `text: string` The text this result that appears in the input
* `start: ParsedComponents` The parsed date components as a [ParsedComponents](#parsedcomponents) object
* `end?: ParsedComponents` Similar to `start`
* `date: () => Date` Create a javascript Date

#### ParsedComponents
* `get: (c: Component) => number | null` Get known or implied value for the component
* `isCertain: (c: Component) => boolean` Check if the component has a known value
* `date: () => Date` Create a javascript Date

For example:
```js
const results = chrono.parse('I have an appointment tomorrow from 10 to 11 AM');

results[0].index; // 22
results[0].text; // 'tomorrow from 10 to 11 AM'
results[0].refDate; // Sat Dec 13 2014 21:50:14 GMT-0600 (CST)

// `start` is Sat Dec 14 2014 10:00:00
results[0].start.get('day'); // 14 (the 14th, the day after refDate)
results[0].start.get('month'); // 12 (or December)
results[0].start.get('hour'); // 10
results[0].start.date(); // Sun Dec 14 2014 10:00:00 GMT-0600 (CST)

...
results[0].end.date(); // Sun Dec 14 2014 11:00:00 GMT-0600 (CST)
```

### Strict vs Casual configuration

Chrono comes with `strict` mode that parse only formal date patterns.

```js
// 'strict' mode
chrono.strict.parseDate('Today'); // null
chrono.strict.parseDate('Friday'); // null
chrono.strict.parseDate('2016-07-01'); // Fri Jul 01 2016 12:00:00 ...
chrono.strict.parseDate('Jul 01 2016'); // Fri Jul 01 2016 12:00:00 ...

// 'casual' mode (default)
chrono.parseDate('Today'); // Thu Jun 30 2016 12:00:00 ...
chrono.casual.parseDate('Friday'); // Fri Jul 01 2016 12:00:00 ...
chrono.casual.parseDate('2016-07-01'); // Fri Jul 01 2016 12:00:00 ...
chrono.casual.parseDate('Jul 01 2016'); // Fri Jul 01 2016 12:00:00 ...
```

### Locales

By default, Chrono is configured to handle **only international English**.
This differs from the previous version of Chrono that would try all locales by default.

There are several locales supported contributed by multiple developers under `./locales` directory.

```js
// default English (US)
chrono.parseDate('6/10/2018');

chrono.en.parseDate('6/10/2018'); // June 10th, 2018
chrono.en.GB.parseDate('6/10/2018'); // October 6th, 2018

chrono.ja.parseDate('昭和64年1月7日');
```

Current supported locale options are: `en`, `ja`, `fr`, `nl`, `ru` and `uk` (`de`, `pt`, and `zh.hant` are partially supported).

#### Importing specific locales

Chrono exports all locale options by default for simplicity. However, this can cause issues when using Chrono if you're using a Node.js runtime that was built with the Intl module disabled (with the `--without-intl` flag), such as:
```
Invalid regular expression: /* omitted */: Invalid property name in character class
```

This is because the Intl module is required to handle special characters, such as Cyrillic (`ru`).

To avoid this, you can specify only the locale(s) you want to import:
```typescript
// CommonJS (Node.js)
const chrono = require('chrono-node/en')

// ECMAScript
import chrono from 'chrono-node/en'

// TypeScript
// Warning: `moduleResolution` must be set to `node16` or `nodeNext` in tsconfig.json`
import * as chrono from 'chrono-node/en'
```

## Customize Chrono

Chrono’s extraction pipeline configuration consists of `parsers: Parser[]` and `refiners: Refiner[]`.

* First, each parser independently extracts patterns from input text input and create parsing results ([ParsingResult](#parsedresult)).
* Then, the parsing results are combined, sorted, and refined with the refiners. In the refining phase, the results can be filtered-out, merged, or attached with additional information.

### Parser

```typescript
interface Parser {
pattern: (context: ParsingContext) => RegExp,
extract: (context: ParsingContext, match: RegExpMatchArray) =>
(ParsingComponents | ParsingResult | {[c in Component]?: number} | null)
}
```

Parser is a module for low-level pattern-based parsing.
Ideally, each parser should be designed to handle a single specific date format.

User can create a new parser for supporting new date formats or languages
by providing RegExp pattern `pattern()` and extracting result or components from the RegExp match `extract()`.

```javascript
const custom = chrono.casual.clone();
custom.parsers.push({
pattern: () => { return /\bChristmas\b/i },
extract: (context, match) => {
return {
day: 25, month: 12
}
}
});

custom.parseDate("I'll arrive at 2.30AM on Christmas night");
// Wed Dec 25 2013 02:30:00 GMT+0900 (JST)
// 'at 2.30AM on Christmas'
```

### Refiner

```typescript
interface Refiner {
refine: (context: ParsingContext, results: ParsingResult[]) => ParsingResult[]
}
```

Refiner is a higher level module for improving or manipulating the results. User can add a new type of refiner to customize Chrono's results or to add some custom logic to Chrono.

```javascript
const custom = chrono.casual.clone();
custom.refiners.push({
refine: (context, results) => {
// If there is no AM/PM (meridiem) specified,
// let all time between 1:00 - 4:00 be PM (13.00 - 16.00)
results.forEach((result) => {
if (!result.start.isCertain('meridiem') &&
result.start.get('hour') >= 1 && result.start.get('hour') < 4) {

result.start.assign('meridiem', 1);
result.start.assign('hour', result.start.get('hour') + 12);
}
});
return results;
}
});

// This will be parsed as PM.
// > Tue Dec 16 2014 14:30:00 GMT-0600 (CST)
custom.parseDate("This is at 2.30");

// Unless the 'AM' part is specified
// > Tue Dec 16 2014 02:30:00 GMT-0600 (CST)
custom.parseDate("This is at 2.30 AM");
```

In the example, the custom refiner assigns PM to parsing results with ambiguous [meridiem](http://en.wikipedia.org/wiki/12-hour_clock).
The `refine` method of the refiner class will be called with parsing [results](#parsedresult) (from [parsers](#parser) or other previous refiners).
The method must return an array of the new results (which, in this case, we modified those results in place).

### More documentation

Checkout the Typescript Documentation in the project's [Github page](http://wanasit.github.io/chrono/).

## Development Guides

This guide explains how to set up chrono project for prospective contributors.

```bash
# Clone and install library
$ git clone https://github.com/wanasit/chrono.git chrono
$ cd chrono
$ npm install

```

Parsing date from text is complicated. A small change can have effects on unexpected places.
So, Chrono is a heavily tested library.
Commits that break a test shouldn't be allowed in any condition.

Chrono's unit testing is based-on [Jest](https://facebook.github.io/jest/).

```bash
# Run the test
$ npm run test

# Run the test in watch mode
$ npm run watch
```

Chrono's source files is in `src` directory.
The built bundle (`dist/*`) is created by running [Webpack](https://webpack.js.org/) via the following command

```bash
$ npm run build
```