https://github.com/uber-web/type-analyzer
- Host: GitHub
- URL: https://github.com/uber-web/type-analyzer
- Owner: uber-web
- License: mit
- Created: 2017-08-16T22:05:41.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2023-05-23T21:05:48.000Z (about 3 years ago)
- Last Synced: 2025-04-16T08:37:48.585Z (about 1 year ago)
- Language: JavaScript
- Size: 335 KB
- Stars: 37
- Watchers: 10
- Forks: 13
- Open Issues: 14
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# type-analyzer
Infer data types from CSV columns.
## Overview
This package provides a single interface for inferring the data type of each
column in a row-column formatted dataset. We support the following data types:
* **DATE**
* **TIME**
* **DATETIME**
* **NUMBER**
* **INT**
* **FLOAT**
* **CURRENCY**
* **PERCENT**
* **STRING**
* **ARRAY**
* **OBJECT**
* **ZIPCODE**
* **BOOLEAN**
* **GEOMETRY**
* **GEOMETRY_FROM_STRING**
* **PAIR_GEOMETRY_FROM_STRING**
* **NONE**
## Installation
    npm install type-analyzer
## Usage
### `Analyzer.computeColMeta(data, rules, options)` (Function)
**Parameters**
- `data` **Array** _required_ An array of row objects
- `rules` **Array** _optional_ An array of custom rules, each matching columns by `name` or `regex`
- `options` **Object** _optional_ Options object
- `options.ignoredDataTypes` **Array** _optional_ Data types to ignore
```js
var Analyzer = require('type-analyzer').Analyzer;
var data = [
{
"ST_AsText": "MULTIPOLYGON (((30 20, 45 40, 10 40, 30 20)), ((15 5, 40 10, 10 20, 5 10, 15 5)))",
"name": "san_francisco",
"lat": "37.7749295",
"lng": "-122.4194155",
"launch_date": "2010-06-05",
"added_at": "2010-06-05 12:00"
},
{
"ST_AsText": "MULTIPOLYGON (((30 20, 45 40, 10 40, 30 20)), ((15 5, 40 10, 10 20, 5 10, 15 5)))",
"name": "paris",
"lat": "48.856666",
"lng": "2.3509871",
"launch_date": "2011-12-04",
"added_at": "2010-06-05 12:00"
},
]
var colMeta = Analyzer.computeColMeta(data);
```
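If your CSV parser yields a header row plus arrays of cell values rather than objects, the rows need to be reshaped before they are passed as `data`. The helper below is a minimal sketch of that step; `toRowObjects` is a hypothetical name and is not part of type-analyzer.

```js
// Hypothetical helper (not part of type-analyzer): turn a header row and
// arrays of cell values into the array of row objects that `data` expects.
function toRowObjects(header, rows) {
  return rows.map(function(row) {
    var obj = {};
    header.forEach(function(name, i) {
      obj[name] = row[i];
    });
    return obj;
  });
}

var header = ['name', 'lat', 'lng'];
var rows = [
  ['san_francisco', '37.7749295', '-122.4194155'],
  ['paris', '48.856666', '2.3509871']
];
var data = toRowObjects(header, rows);
// data[0] is {name: 'san_francisco', lat: '37.7749295', lng: '-122.4194155'}
```

The resulting `data` array has the same shape as the example above and can be passed to `Analyzer.computeColMeta(data)`.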
- **`rules`**
You can pass in an array of custom rules, for example to ensure that a column of ids represented as numbers is identified as a column of strings. A rule matches columns either by the exact `name` of the column or by a `regex` tested against column names. Note: the Analyzer prefers rules that use `name` over `regex` because they perform better.
```js
var Analyzer = require('type-analyzer').Analyzer;
var colMeta = Analyzer.computeColMeta(data, [{name: 'id', dataType: 'STRING'}]);
// or
var colMeta = Analyzer.computeColMeta(data, [{regex: /id/, dataType: 'STRING'}]);
```
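To picture how a rule might be resolved for a column, here is a rough sketch. It is an illustration only, not the library's implementation: `findRuleType` is a hypothetical helper, and it assumes rules are tried in array order with the cheap `name` comparison checked before the `regex`.

```js
// Illustrative only, not the library's implementation. Assumes rules are
// tried in array order and that each rule carries `name` or `regex`.
function findRuleType(columnName, rules) {
  for (var i = 0; i < rules.length; i++) {
    var rule = rules[i];
    // A string comparison is cheaper than running a regex, so check it first.
    if (rule.name && rule.name === columnName) {
      return rule.dataType;
    }
    if (rule.regex && rule.regex.test(columnName)) {
      return rule.dataType;
    }
  }
  return null; // no rule applies; normal inference would take over
}

var exampleRules = [
  {name: 'id', dataType: 'STRING'},
  {regex: /_at$/, dataType: 'DATETIME'}
];
findRuleType('id', exampleRules);       // 'STRING' (exact name match)
findRuleType('added_at', exampleRules); // 'DATETIME' (regex match)
findRuleType('lat', exampleRules);      // null (no rule applies)
```

Avoid the `g` flag on rule regexes in a sketch like this: a global regex keeps state in `lastIndex` between `test` calls, which can make repeated matches behave inconsistently.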
- **`options.ignoredDataTypes`**
You can also pass in `ignoredDataTypes` to skip certain types. Skipping types reduces the number of checks and improves type-checking performance.
```js
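var Analyzer = require('type-analyzer').Analyzer;
var DATA_TYPES = require('type-analyzer').DATA_TYPES;
// `arr` is an array of row objects; skip the CURRENCY check and read the first column's type
var firstColType = Analyzer.computeColMeta(arr, [], {ignoredDataTypes: [DATA_TYPES.CURRENCY]})[0].type;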
```
The analyzer will then shortcut the usual analysis and give
you back the column typed as you would expect.
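The effect of ignoring a type can be sketched with a toy inference loop. Everything here is an assumption made for illustration: the three validators, their regexes, and the `sketchInferType` helper are hypothetical and far simpler than the library's real checks. Type names are plain strings, as in the rules example.

```js
// Illustrative only, not the library's implementation: a tiny set of
// validators, checked in order, with ignored types filtered out up front.
var SKETCH_VALIDATORS = [
  {type: 'INT', test: function(v) { return /^-?\d+$/.test(v); }},
  {type: 'FLOAT', test: function(v) { return /^-?\d+\.\d+$/.test(v); }},
  {type: 'CURRENCY', test: function(v) { return /^\$\d+(\.\d{2})?$/.test(v); }}
];

function sketchInferType(values, ignoredDataTypes) {
  var ignored = ignoredDataTypes || [];
  // Fewer active validators means fewer checks per column.
  var active = SKETCH_VALIDATORS.filter(function(validator) {
    return ignored.indexOf(validator.type) === -1;
  });
  for (var i = 0; i < active.length; i++) {
    // A type is chosen only if every value in the column passes its test.
    if (values.every(active[i].test)) {
      return active[i].type;
    }
  }
  return 'STRING'; // nothing matched
}

sketchInferType(['$5', '$12.50'], []);           // 'CURRENCY'
sketchInferType(['$5', '$12.50'], ['CURRENCY']); // 'STRING'
sketchInferType(['1', '2', '3'], ['CURRENCY']);  // 'INT'
```

In this sketch, ignoring `CURRENCY` both removes one check per column and changes the result for dollar-prefixed values, which then fall through to `STRING`.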
### `DATA_TYPES`
You can import all available types as a constant.
## Update
Breaking changes with v1.0.0: the regexes have moved into `src`, but can be
accessed more easily through the root `module.exports`. As part of a larger
cleanup, many extraneous util files were removed.