https://github.com/fiedsch/datamanagement
Data management helpers (PHP-CLI)
- Host: GitHub
- URL: https://github.com/fiedsch/datamanagement
- Owner: fiedsch
- License: mit
- Created: 2016-01-16T12:23:23.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2024-08-25T06:52:10.000Z (4 months ago)
- Last Synced: 2024-09-25T19:54:26.488Z (3 months ago)
- Topics: csv-data, data, datamanagement, helper, php
- Language: PHP
- Size: 137 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: Readme.md
- Changelog: CHANGELOG.md
- License: LICENSE
# Datamanagement Tools
PHP classes and helpers for managing data read from text files
* `Data\FileReader` reads text files
* `Data\CsvFileReader` reads CSV files
* `Data\FixedWidthReader` reads text files that contain data in fixed-width columns
* `Data\Helper` provides helper functions such as `SC()`, which converts a spreadsheet column name to the index of the array
generated by (e.g.) `CsvFileReader->getLine()`
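
To illustrate the column-name-to-index conversion mentioned for `SC()`, here is a minimal sketch in plain PHP. `columnNameToIndex()` is a hypothetical stand-in written for this example, not the library's implementation; it assumes names like `A`, `Z`, `AA` and zero-based array indexes.

```php
<?php
/**
 * Convert a spreadsheet column name ("A", "B", ..., "Z", "AA", ...) to a
 * zero-based array index. Hypothetical helper, not the library's SC().
 */
function columnNameToIndex(string $name): int
{
    $name = strtoupper(trim($name));
    if (!preg_match('/^[A-Z]+$/', $name)) {
        throw new InvalidArgumentException("Invalid column name: '$name'");
    }
    $index = 0;
    foreach (str_split($name) as $char) {
        // Read the name as a base-26 number with digits A=1 ... Z=26.
        $index = $index * 26 + (ord($char) - ord('A') + 1);
    }
    return $index - 1; // shift to a zero-based index
}

$row = ['id-1', 'Alice', 'Berlin'];
echo $row[columnNameToIndex('B')], "\n"; // prints "Alice"
```

Addressing fields by column letter keeps scripts readable when the source file is usually inspected in a spreadsheet program.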
## Examples
### Work on CSV data
```php
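<?php
// Assumes the library's classes are imported or autoloaded (e.g. via Composer).
try {
    $reader = new CsvReader("testdata.csv", ";");
    while (($line = $reader->getLine()) !== null) {
        // ignore empty lines (i.e. lines containing no data)
        if (!$reader->isEmpty($line)) {
            print_r($line);
        }
    }
    // $reader->close(); // not needed: it is called automatically when there are no more lines
} catch (Exception $e) {
    print $e->getMessage() . "\n";
}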
```

#### Features
As of v0.3.2, the typical boilerplate "open file, read every non-empty line, close file"
can be written more concisely: pass the optional `Reader::SKIP_EMPTY_LINES` flag to `getLine()`:
```php
<?php
$reader = new CsvReader("testdata.csv", ";");
while (($line = $reader->getLine(Reader::SKIP_EMPTY_LINES)) !== null) {
    print_r($line);
}
```
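
To make explicit what skipping empty lines amounts to, here is a minimal sketch in plain PHP that does not use the library. `nonEmptyRows()` is a hypothetical helper, and treating a row as empty when every field is blank after trimming is an assumption; the library's own rule may differ.

```php
<?php
/**
 * Yield parsed CSV rows, skipping rows in which every field is blank.
 * Hypothetical helper; the "blank after trim" rule is an assumption.
 */
function nonEmptyRows(iterable $lines, string $delimiter = ';'): Generator
{
    foreach ($lines as $line) {
        // Explicit enclosure and (disabled) escape character, PHP >= 7.4.
        $fields = str_getcsv(rtrim($line, "\r\n"), $delimiter, '"', '');
        $hasData = false;
        foreach ($fields as $field) {
            if (trim((string) $field) !== '') {
                $hasData = true;
                break;
            }
        }
        if ($hasData) {
            yield $fields;
        }
    }
}

$lines = ["a;b;c", "", ";;", "1;2;3"];
foreach (nonEmptyRows($lines) as $row) {
    print_r($row); // only the first and the last line are printed
}
```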
### Data augmentation
```php
<?php
// Assumes the library's classes are imported or autoloaded (e.g. via Composer).
try {
    $augmentor = new Augmentor();
    $augmentor->register(new TokenServiceProvider());
    // Rule "token": add a unique token to every line
    $augmentor->addRule('token', function (Augmentor $augmentor, $data) {
        return [ 'token' => $augmentor['token']->getUniqueToken() ];
    });
    $reader = new CsvReader("testdata.csv", ";");
    $writer = new CsvWriter("testdata.augmented.txt", "\t");
    $header_written = false;
    while (($line = $reader->getLine(Reader::SKIP_EMPTY_LINES)) !== null) {
        $result = $augmentor->augment($line);
        // Write the header once: line number, augmented columns, original columns
        if (!$header_written) {
            $writer->printLine(array_merge(['input_line'], array_keys($result), $reader->getHeader()));
            $header_written = true;
        }
        $writer->printLine(array_merge([$reader->getLineNumber()], $result, $line));
    }
    $writer->close();
} catch (Exception $e) {
    print $e->getMessage() . "\n";
}
```
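
The rule mechanism used above can be pictured as a set of named callables whose returned columns are merged into one result per line. The following sketch shows that idea in plain PHP. `applyRules()` is hypothetical and much simpler than the library's `Augmentor` (for instance, rules here receive only the data row); it is meant as an illustration, not a description of the actual implementation.

```php
<?php
/**
 * Apply named rules to one data row and merge the columns they return.
 * Hypothetical stand-in for illustration, not the library's Augmentor.
 *
 * @param array<string, callable> $rules each rule maps a row to new columns
 */
function applyRules(array $rules, array $row): array
{
    $result = [];
    foreach ($rules as $name => $rule) {
        $columns = $rule($row);
        if (!is_array($columns)) {
            throw new UnexpectedValueException("Rule '$name' must return an array");
        }
        // On key collisions, later rules overwrite earlier ones.
        $result = array_merge($result, $columns);
    }
    return $result;
}

$rules = [
    'name_length' => fn(array $row) => ['name_length' => strlen($row[0])],
    'has_city'    => fn(array $row) => ['has_city' => ($row[1] ?? '') !== ''],
];

print_r(applyRules($rules, ['Alice', 'Berlin']));
```

Returning an associative array from each rule is what makes `array_keys($result)` usable as the header for the augmented columns.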
### Creating Tokens
Method one: let the `TokenCreator` ensure that the tokens are unique:
```php
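<?php
// Assumed setup: constructor arguments and the token count are illustrative.
$creator = new TokenCreator();
$output = new CsvWriter("mytokens.csv", ";");
$remaining = 1000;
while ($remaining-- > 0) {
    $token = $creator->getUniqueToken();
    $output->printLine([$token]);
}
$output->close();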
```

Method two: generate the tokens first and check afterwards whether they are unique. This might be faster and less
resource-consuming for large numbers of tokens:

```php
// same as above, except that
// $token = $creator->getUniqueToken();
// becomes
$token = $creator->createToken();
```
Check that the generated tokens are unique:
```bash
echo "If both commands print the same number, there are no duplicate tokens"
# total number of lines
wc -l < mytokens.csv
# number of distinct lines
sort mytokens.csv | uniq | wc -l
```
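
The "generate first, check afterwards" approach can also be sketched entirely in PHP. The example below is self-contained and does not use the library: `makeToken()` and `findDuplicates()` are hypothetical helpers, and the alphabet, token length and token count are assumptions chosen for illustration.

```php
<?php
/** Build one random token; alphabet and length are illustrative assumptions. */
function makeToken(int $length = 8, string $alphabet = 'ABCDEFGHJKLMNPQRSTUVWXYZ23456789'): string
{
    $token = '';
    $max = strlen($alphabet) - 1;
    for ($i = 0; $i < $length; $i++) {
        $token .= $alphabet[random_int(0, $max)];
    }
    return $token;
}

/** Return the values that occur more than once in the list. */
function findDuplicates(array $tokens): array
{
    $counts = array_count_values($tokens);
    return array_keys(array_filter($counts, fn(int $n) => $n > 1));
}

// Generate without any uniqueness bookkeeping ...
$tokens = [];
for ($i = 0; $i < 1000; $i++) {
    $tokens[] = makeToken();
}

// ... and check for duplicates in a single pass afterwards.
$duplicates = findDuplicates($tokens);
echo count($duplicates) === 0
    ? "no duplicate tokens\n"
    : "duplicates found: " . implode(', ', $duplicates) . "\n";
```

If duplicates turn up, the simplest remedy is to regenerate only the affected tokens and run the check again.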