https://github.com/vc1492a/datecleanse.py

A Python-based tool that reformats multiple date formats into a single format for later use.

# dateCleanse.py

dateCleanse.py converts a variety of date formats to a single, common format: MM/DD/YYYY.

Development pace will track how often the tool is used for date cleansing in
different applications, but this tool is ready for use. Written and tested in Python 3.4.3.

## Currently Supported Date Formats

- MM/YYYY
- YYYY/MM
- YYYY-MM
- *YY-MM
- DDDDD (days since 01/01/1900)
- YY/MM
- YY-MM
- YYMM
- YYYY
- M-YY
- M/YY
- YY-M
- MYY
- YYYY-M-DD HH:MM:SS

Note that two-digit years are assumed to fall after the year 2000, and all dates are assumed to fall after 1930.
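
The conventions above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by dateCleanse.py: the function names are hypothetical, the day is assumed to default to 01 when a format carries no day, and the day count is taken literally so that 0 maps to 01/01/1900.

```python
from datetime import date, datetime, timedelta

OUTPUT_FORMAT = "%m/%d/%Y"  # the common MM/DD/YYYY target format


def from_day_count(days):
    """DDDDD: treat the value as days elapsed since 01/01/1900 (assumption: 0 is 01/01/1900)."""
    return (date(1900, 1, 1) + timedelta(days=days)).strftime(OUTPUT_FORMAT)


def from_yy_mm(value):
    """YY-MM: two-digit years are placed in the 2000s; the day defaults to 01 (assumption)."""
    yy, mm = value.split("-")
    return date(2000 + int(yy), int(mm), 1).strftime(OUTPUT_FORMAT)


def from_timestamp(value):
    """YYYY-M-DD HH:MM:SS: drop the time component and keep the date."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S").strftime(OUTPUT_FORMAT)


print(from_day_count(42003))                  # 01/01/2015
print(from_yy_mm("15-03"))                    # 03/01/2015
print(from_timestamp("2015-3-07 12:30:00"))   # 03/07/2015
```

The remaining formats would follow the same pattern: split or parse the input, apply the two-digit-year rule where needed, and format the result as MM/DD/YYYY.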

## Date Formats for Future Release

We do not have any other date formats currently planned for future release. If you'd like to see a different date
format included for future release, please let us know or submit a pull request.

## How-To

Export a single column from your data frame or your vector of dates into the directory *input_data*. If you
are using R, you can export the files to that directory individually as follows:

```r
# include the dates you wish to reformat in the directory titled input_data
setwd("C:/user/folder/input_data")
# repeat the following write command for all the files you want to reformat
write.csv(dates, "name_of_file.csv", row.names=FALSE, quote=FALSE)
```
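
If your dates live in Python instead of R, a similar file can be written with the standard library. This is a sketch under assumptions: the helper name `export_dates` and the header value `x` are hypothetical, and the layout (one header row, then one unquoted date per line) simply mirrors what the R command above produces.

```python
import csv
from pathlib import Path


def export_dates(dates, directory="input_data", filename="name_of_file.csv", header="x"):
    """Write one date per line, unquoted, below a single header row."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)  # create input_data if it is missing
    out_path = out_dir / filename
    with out_path.open("w", newline="") as handle:
        # QUOTE_NONE mirrors quote=FALSE in R; it raises an error if a value contains a comma
        writer = csv.writer(handle, quoting=csv.QUOTE_NONE)
        writer.writerow([header])
        for value in dates:
            writer.writerow([value])
    return out_path


if __name__ == "__main__":
    print(export_dates(["03/2015", "2015-03", "15-03"]))
```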

Your working directory will be different depending on your system configuration. Ensure that row names and quotes are
not included in the R output. Then, simply fire up your favorite Python IDE and hit run.
Or, from the command line, type:

```shell
python dateCleanse.py
```

The script will ask for the path to the directory that contains the dates to be reformatted. If you placed
your *name_of_file.csv* file in the *input_data* directory, simply type:

```text
input_data
```
and hit enter (this step may seem trivial but allows for flexibility if your files are in a different directory).

The script will begin reformatting the date values and show several progress bars. Once the process is complete,
cleansed dates are stored in a new file called *newDates.csv*, contained in a new directory titled *output_data*.
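
A quick sanity check on the result might look like the following. It is only a sketch: the helper `find_malformed` is hypothetical, and it assumes *newDates.csv* holds one value per line. If the file starts with a header row, that row will be reported as well.

```python
import re
from pathlib import Path

# MM/DD/YYYY with a month of 01-12 and a day of 01-31
DATE_PATTERN = re.compile(r"^(0[1-9]|1[0-2])/(0[1-9]|[12]\d|3[01])/\d{4}$")


def find_malformed(lines):
    """Return the non-empty entries that do not look like MM/DD/YYYY."""
    stripped = (line.strip() for line in lines)
    return [entry for entry in stripped if entry and not DATE_PATTERN.match(entry)]


if __name__ == "__main__":
    output_file = Path("output_data") / "newDates.csv"
    if output_file.exists():
        with output_file.open() as handle:
            print(find_malformed(handle))
```

An empty list means every line matched the target format. The pattern checks the shape only, so it would still accept an impossible date such as 02/31/2015.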

## Non-native Dependencies

- tqdm