https://github.com/linogaliana/python-datascientist

Repository for the course Python for data scientists (ENSAE, 2nd year)

data-science jupyter jupyter-notebook machine-learning opendata python teaching


# Data science with Python

[![build-doc Actions Status](https://github.com/linogaliana/python-datascientist/actions/workflows/prod.yml/badge.svg)](https://github.com/linogaliana/python-datascientist/actions)



[![DOI](https://zenodo.org/badge/280161677.svg)](https://zenodo.org/badge/latestdoi/280161677)

> [!NOTE]
> This is the English 🇬🇧🇺🇸 version of the `README`. If you want to see the French 🇫🇷 version, you can click on the link below:
>
> [![fr](https://img.shields.io/badge/lang-fr-red.svg)](https://github.com/linogaliana/python-datascientist/blob/main/doc/README-fr.md)

This GitHub repository stores the source files used to build the [course website](https://pythonds.linogaliana.fr/).

It contains the entire course *Python for Data Science*
that I teach in the second year (Master 1) at [ENSAE](https://www.ensae.fr/).

> [!NOTE]
> A guide to assist potential contributors is available by clicking the button below:
>
> [![`CONTRIBUTING.md`](https://img.shields.io/badge/CONTRIBUTING-fr-red.svg)](https://github.com/linogaliana/python-datascientist/blob/main/doc/CONTRIBUTING-fr.md)

## Syllabus

The syllabus is available [on the ENSAE website](https://www.ensae.fr/courses/1425-python-pour-le-data-scientist) and on the [course website](https://pythonds.linogaliana.fr/).

The course covers a broad range of material, suited both to beginners in data science and to those looking for more advanced content:

1. __Data Manipulation__: standard data manipulation (`Pandas`), geographical data (`Geopandas`), data retrieval (web scraping, APIs)...
1. __Data Visualization__: classic visualizations (`Matplotlib`, `Seaborn`), cartography, interactive visualizations (`Plotly`, `Folium`)
1. __Modeling__: machine learning (`scikit-learn`), econometrics
1. __Text Data Processing (NLP)__: introduction to tokenization with `NLTK` and `SpaCy`, modeling...
1. __Introduction to Modern Data Science__: cloud computing, `ElasticSearch`, continuous integration...

The content of this site is based on open data, whether French data (mainly from the central platform [`data.gouv`](https://www.data.gouv.fr) or the website of [Insee](https://www.insee.fr)) or American data.

A good complement to the website's content is the course that Romain Avouac ([@avouacr](https://github.com/avouacr)) and I teach in the final year at ENSAE, which focuses more on putting data science projects into production: [https://ensae-reproductibilite.github.io/website/](https://ensae-reproductibilite.github.io/website/)


## Testing Python examples

You can use a personal installation of `Python` or shared servers. On the website, a series of buttons is available to easily test the examples in `Jupyter` notebooks in the configuration that suits you best.


For example, the Numpy tutorial has buttons to download the notebook, open it on Onyxia, or open it in Colab.
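If you opt for a personal installation of `Python`, it can help to check which of the libraries named in the syllabus are already importable before opening a notebook. The snippet below is a minimal sketch using only the standard library. The package list and the helper name `check_packages` are assumptions made for illustration (they are not part of the course material), and `scikit-learn` is listed under its import name `sklearn`.

```python
from importlib.util import find_spec

# Import names for libraries mentioned in the syllabus.
# This list is an assumption for illustration; adapt it to the chapters you follow.
COURSE_PACKAGES = [
    "pandas",
    "geopandas",
    "matplotlib",
    "seaborn",
    "plotly",
    "folium",
    "sklearn",  # import name of scikit-learn
    "nltk",
    "spacy",
]


def check_packages(names):
    """Return a dict mapping each import name to True if it can be located."""
    # find_spec returns None when a top-level module cannot be found,
    # without actually importing it.
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_packages(COURSE_PACKAGES).items():
        print(f"{name:<12} {'ok' if available else 'missing'}")
```

Any package reported as missing can then be installed with your usual package manager, or you can fall back on one of the shared environments offered by the buttons on the website.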