https://github.com/kevinmarquesp/csv2sql
Script that reads a CSV file and constructs a SQL query string that inserts every row of this file into an SQL database -- you can also connect to an PostgreSQL database (by default) and actually run that generated insert command
https://github.com/kevinmarquesp/csv2sql
cli cli-tool csv csv-parser csv-parser-python learning-by-doing python sql study-project
Last synced: 11 days ago
JSON representation
Script that reads a CSV file and constructs a SQL query string that inserts every row of this file into an SQL database -- you can also connect to an PostgreSQL database (by default) and actually run that generated insert command
- Host: GitHub
- URL: https://github.com/kevinmarquesp/csv2sql
- Owner: kevinmarquesp
- License: mit
- Created: 2024-01-20T12:23:28.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2026-04-25T17:54:10.000Z (about 2 months ago)
- Last Synced: 2026-04-25T19:24:15.877Z (about 2 months ago)
- Topics: cli, cli-tool, csv, csv-parser, csv-parser-python, learning-by-doing, python, sql, study-project
- Language: Python
- Homepage:
- Size: 38.1 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# CSV To SQL Script
Script that reads a CSV file and constructs a SQL query string that inserts every row of this file into an SQL database -- you can also connect to an PostgreSQL database (by default) and actually run that generated insert command.
I was studying SQL and I needed to pass an example CSV file to my database in order to do some operations and study the behavior of the SQL syntax, so I thought that it would be a really good quick project to rest a bit and let all this SQL stuff for later. Probably, this script isn't the best solution for that problem, so if you want a proper tool to do that, maybe you'll need to do a little research. But, by the way, if this little side project is useful to you, I would be happy.
The thing that I was studying with this project is to find a good and simple Python project structure that is easy to use/install and test. This happens to be a really simple command line tool, so it's better to hold the entire code base in a single file, so the user could just copy that source code and use it as an app -- the challenge was to make a testable command line tool that doesn't require installing any libraries, just using what Python already offers me by default.
https://github.com/kevinmarquesp/csv_to_sql/assets/71402116/58e99629-bd08-4d5c-be08-d62f3f684db0
## Installation Process
As I said, the important file is the `csv_to_sql/csv_to_sql.py` file, so the way to go is just copy and paste the contents of this file in somewhere on your `$PATH`. You can just select the text from your browser, open your text editor and paste its contents, or you can just follow the steps below:
```bash
git clone https://github.com/kevinmarquesp/csv2sql
cp csv2sql/csv_to_sql/csv_to_sql.py ~/.local/bin/csv2sql
rm -rf csv2sql #removes the cloned repository
```
## Uninstall Process
To uninstall you'll need to remove this file from your system, so it depends on where you've installed it in the first place. But you followed the recommended steps to install in this document, you can uninstall with:
```bash
rm ~/.local/bin/csv2sql
```
## Development & Contribution
This code **should not** have any external dependencies that the user should use `python3 -m pip` to install, the objective is to be *plug and play* tool. Also, I think everything becomes a lot easier -- for the user, at least -- if all the code base is condensed into a single file, which could be a challange to mantain, it's important to write good and readable code for that.
I'm using a simple "*pythonic*" **styleguide**, the only constraint is that each line should not be longer than 80 characters and that I'm using double quotes (`""`) as the main characters to denote a string -- that's because english has a bunch of `'` in its grammar, and escaping each of them is just too annoying.
> [!TIP]
> Every time you update the docstrings of the `csv_to_sql.py` file, you can run the `build_docs.sh` script to automatically update the contents of this `README.md` file. But you'll have to install the `pydoc-markdown` library from `pip`; its a command line utility that generates a Markdown documentation based on the docstrings inside a file/module.
About the **tests**, it has a similar philosophy. There is a single test script -- the `test.py` file -- that uses the default `unittest` library. You can run these tests with: `python3 test.py`. Also, it's a good idea to be creative and do something to run this tests constantly, you can use a watch command to run every time the `csv_to_sql/csv_to_sql.py` is modified or something like that, I personally likes to put it in a *while sleep* loop.
### Code's Documentation (autogenerated)
```python
def log(msg: str)
```
Simple function that replaces some special characters to their
respective color codes, like replacing an `[g]`, which means "green", to
`\033[32m` of the string, then printing it on the screen.
+ **msg**: The message that you want to show on the terminal output.
```python
def parse_arguments(args: list[str]) -> Namespace
```
Given an list of arguments -- e.g. `['argument', '-o', '--option',
'optton_value']` -- this function puts everything together and returns a
`Namespace` object with all that arguments parsed to Python's types and
easily accessible with the `parsed_args.option` notation.
+ **args**: List of arguments that the user has specified, maybe you would
like to use it with the `sys.argv` list to access the command
line arguments.
```python
def join_list_format_sql(data: list[str]) -> str
```
Utility function that joins every instance of a list with a `, `
character, and put that string between `()`. Creating a string that is
compatible with SQL syntax, you can use it to select columns in a table or
to list values for each column.
+ **data**: List of strings that will be joined together.
```python
def escape_sql_characters(sql_str: str) -> str
```
Given a string, it will put a `\` character for each character that
could cause some trouble in you SQL string. It's important if an user is
called `Claire O’Connell` for an example, the `'` cound couse some syntax
error -- also, it's important to avoid SQL injection.
+ **sql_str**: The string that you want to escape the characters;
```python
def format_sql_row(row: list[str]) -> list[str]
```
The most complicated function, this function creates a list with
strings that is compatible with SQL syntax. For an example: it replaces a
"Foo" string for "E'Foo'" -- with special characters escaped -- and a "1"
string to just "1". Maybe you'll need to check the test cases or the source
code in order to understand that function well...
> [!NOTE]
> For an example: If you give an `["52", "Rice", "20.7", "TRUE"]` list, it
> will return a list that looks like `["52", "E'Rice'", "20.7", "TRUE"]`
+ **row**: Is the list of strings that you want to format to be compatible
with SQL syntax.
```python
def get_sql_slices(file_path: str, dlmtr: str = ",")
```
Utility function that opens the file and parses the header and column
values string slices that are compatible with the SQL syntax.
+ **file_path**: Path string to access the file contents;
+ **dlmtr**: This is `,` by default, but you can set a custom delimiter if
your file is formatted in a different way.
```python
def get_insert_query(file_path: str, dlmtr: str = ",") -> str
```
Open the specified file to format a SQL statement that inserts every
row values on the `.csv` file into the table that has the same name of the
`.csv` file. Also, it checks for erros when opening the file, if something
goes wrong, it will exit the script with status 1.
+ **file_path**: Path string to access the file contents;
+ **dlmtr**: This is `,` by default, but you can set a custom delimiter if
your file is formatted in a different way.
```python
def connect_and_send(host: str, port: int, user: str, password: str,
db_name: str, insert_query: str) -> None
```
This is the most important function. By default, it uses the database
authentication information provided in the command line arguments to
connect to a PostgreSQL database and send that query. If you want this
script to work with different databases, maybe you would like to edit the
source code for that, and that's the function that you're looking for.
+ **host**: Hostname of your database server;
+ **port**: Port for the connection to your database server;
+ **user**: Username to access the database tables;
+ **password**: Password of your database user;
+ **db_name**: The database this script should access once it's connected;
+ **insert_query**: Query that this code generates by reading a `.csv`
file.
```python
def main(args: list[str]) -> None
```
Main function that parses command line arguments, logs information,
generates SQL insert queries from CSV files, and sends them to a database.
The function works as follows:
1. Parses the command line arguments;
1. Iterates over each CSV file path in the parsed arguments;
1. Generates an SQL insert query for the current CSV file;
1. If the `print` flag is set in the parsed arguments, it prints the insert
query and breaks the loop; and
1. Connects to the database and sends the insert query.
+ **args**: List of command line arguments.