Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jean-philippe-martin/lcdio
Lowest Common Denominator IO. Everything is a list of dictionaries!
csv json parquet python3 sqlite toml tsv yaml
- Host: GitHub
- URL: https://github.com/jean-philippe-martin/lcdio
- Owner: jean-philippe-martin
- License: gpl-2.0
- Created: 2024-09-04T05:31:58.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-09-04T22:04:48.000Z (2 months ago)
- Last Synced: 2024-09-30T23:01:24.511Z (about 2 months ago)
- Topics: csv, json, parquet, python3, sqlite, toml, tsv, yaml
- Language: Python
- Homepage:
- Size: 371 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Lowest Common Denominator I/O
"Everything is a list of dictionaries!"
LCDIO lets you read through the records in the same way for
a variety of file formats:

- csv
- tsv
- json
- jsonl
- parquet
- field-separated text
- SQLite
- toml
- yaml

## Usage
For files with named columns: The file is a list of dictionaries (key is the column name).
```python
import lcdio

file = lcdio.open('testdata/planets.parquet')
for row in file:
    print(f'Planet {row["name"]} is {row["distance"]} light-seconds away from the sun.')
```

For files without named columns: The file is a list of dictionaries (key is the column number).
```python
import lcdio

file = lcdio.open('testdata/planets.csv')
for row in file:
    print(f'Planet {row[0]} is {row[1]} light-seconds away from the sun.')
```

### Additional features
The rows returned by LCDIO are dict-like, but they have a few other features:
- You can read multiple columns at a time (with slicing):
```python
import lcdio

file = lcdio.open('testdata/planets.csv')
for row in file:
    print(f'The first two columns are {row[0:2]}')
```
This includes all the slicing syntax, such as skipping every other item (`row[::2]`), skipping the first one (`row[1:]`), skipping the last one (`row[:-1]`), etc.
- You can use column index (and slices) even when the columns are named:
```python
import lcdio

file = lcdio.open('testdata/planets.parquet')
for row in file:
    print(f'Planet {row[0]}')
```
- If the row contains an array, you can access it by adding an argument:
```python
>>> row['days_in_office']
['monday', 'wednesday']
>>> row['days_in_office', 0]
'monday'
```
- If the row contains a JSON object, you can access members by adding arguments:
```python
>>> row[0]
{'age': 30, 'secrets': {'password': 'foo', 'closet': '2 skeletons'}}
>>> row[0, 'secrets', 'password']
'foo'
```

### Philosophy
This is not about making the most performant or the most featureful reader for these formats.
The goal is to be the library that is easiest to use for reading a variety of formats.
You don't need to remember the names of all the specific libraries to import, or the syntax of each one.
You only need to remember one thing: "everything is a list of dictionaries."

## Other options
The `open` function has a `mode` argument that tells it which format to read the file as (for example, when the format can't be guessed from the file extension), and a `has_header` argument that flags whether a `csv` file has a header row.
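
To make the effect of these two arguments concrete, here is a minimal standard-library sketch. It is not LCDIO's implementation: the function `read_records`, the format names it accepts, and the extension table are hypothetical and chosen only for illustration. Only the argument names `mode` and `has_header` come from the description above.

```python
import csv
import json
from pathlib import Path

# Hypothetical extension-to-format table; LCDIO's own mapping may differ.
_FORMAT_BY_SUFFIX = {".csv": "csv", ".tsv": "tsv", ".jsonl": "jsonl"}


def read_records(path, mode=None, has_header=True):
    """Return the records in `path` as a list of dictionaries.

    mode: format name; guessed from the file extension when None.
    has_header: for csv/tsv, whether the first row holds column names.
    """
    if mode is None:
        mode = _FORMAT_BY_SUFFIX.get(Path(path).suffix.lower())
        if mode is None:
            raise ValueError(f"cannot guess the format of {path!r}; pass mode=...")

    if mode in ("csv", "tsv"):
        delimiter = "," if mode == "csv" else "\t"
        with open(path, newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle, delimiter=delimiter))
        if has_header and rows:
            # Named columns: key each value by its column name.
            header, rows = rows[0], rows[1:]
            return [dict(zip(header, row)) for row in rows]
        # No header: key each value by its column number.
        return [dict(enumerate(row)) for row in rows]

    if mode == "jsonl":
        # One JSON object per non-empty line.
        with open(path, encoding="utf-8") as handle:
            return [json.loads(line) for line in handle if line.strip()]

    raise ValueError(f"unsupported mode: {mode!r}")
```

In this sketch, a comma-separated file saved as `planets.txt` has no recognizable extension, so it would need `mode="csv"`. Adding `has_header=False` keys each row by column number (`0`, `1`, ...) instead of by the names in the first row.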