https://github.com/mgorshkov/pd
C++ pandas-like data analysis library.
cplusplus cpp pandas
- Host: GitHub
- URL: https://github.com/mgorshkov/pd
- Owner: mgorshkov
- License: MIT
- Created: 2022-07-10T04:06:52.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2026-03-26T09:37:58.000Z (11 days ago)
- Last Synced: 2026-03-27T00:50:59.384Z (10 days ago)
- Topics: cplusplus, cpp, pandas
- Language: C++
- Homepage:
- Size: 102 KB
- Stars: 7
- Watchers: 2
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
[Build status](https://ci.appveyor.com/project/mgorshkov/pd/branch/main)
# About
Methods from the pandas library, implemented on top of the NP library (https://github.com/mgorshkov/np).
# Requirements
Any C++20-compatible compiler:
* gcc 8 or higher
* clang 10 or higher
* Visual Studio 2019 or higher
# Repo
```
git clone https://github.com/mgorshkov/pd.git
```
# Build unit tests and sample
## Linux/macOS
```
mkdir build && cd build
cmake ..
cmake --build .
```
## Windows
```
mkdir build && cd build
cmake ..
cmake --build . --config Release
```
# Build docs
```
cmake --build . --target doc
```
Open pd/build/doc/html/index.html in your browser.
# Install
```
cmake .. -DCMAKE_INSTALL_PREFIX:PATH=~/pd_install
cmake --build . --target install
```
# Usage example (samples/read_csv)
```
#include <iostream>
#include <pd/read_csv.hpp> // assumed header path for pd::read_csv

int main(int, char **) {
    using namespace pd;

    // Load the diabetes dataset straight from a URL.
    auto df = read_csv("https://raw.githubusercontent.com/adityakumar529/Coursera_Capstone/master/diabetes.csv");
    std::cout << "df.shape=" << df.shape() << std::endl;

    // In these columns a zero stands for a missing measurement.
    const char *non_zero[] = {"Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"};
    for (const auto &column: non_zero) {
        // Mark zeros as NaN, take the column mean, then fill the NaNs with it.
        df[column] = df[column].replace(0L, np::NaN);
        auto mean = df[column].mean(true);
        df[column] = df[column].replace(np::NaN, mean);
    }

    // Split into features (columns 0..7) and target (column 8).
    auto X = df.iloc(":", "0:8");
    auto y = df.iloc(":", "8");
    std::cout << "X=" << X << std::endl;
    std::cout << "y=" << y << std::endl;
    return 0;
}
```
# How to build the sample
1. Clone the repo
```
git clone https://github.com/mgorshkov/pd.git
```
2. Change to the sample directory
```
cd pd/samples/read_csv
```
3. Make build dir
```
mkdir -p build-release && cd build-release
```
4. Configure cmake
```
cmake -DCMAKE_BUILD_TYPE=Release ..
```
5. Build
## Linux/macOS
```
cmake --build .
```
## Windows
```
cmake --build . --config Release
```
6. Run the app
```
./read_csv
```
# Links
* C++ numpy-like template-based array implementation: https://github.com/mgorshkov/np
* Scientific methods on top of NP library: https://github.com/mgorshkov/scipy
* ML Methods from scikit-learn library: https://github.com/mgorshkov/sklearn