https://github.com/ahmedhosssam/lesser_pandas
Pandas-like Data Analysis library in C++
cpp data-analysis data-science pandas
- Host: GitHub
- URL: https://github.com/ahmedhosssam/lesser_pandas
- Owner: ahmedhosssam
- Created: 2025-03-31T05:09:53.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-31T08:29:05.000Z (about 1 year ago)
- Last Synced: 2025-03-31T08:30:45.826Z (about 1 year ago)
- Topics: cpp, data-analysis, data-science, pandas
- Language: C++
- Homepage:
- Size: 1.36 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Lesser Pandas

## Pandas-like Data Analysis Library in C++
### Examples:
```cpp
#include <iostream>
#include "lesser_pandas.h"

using std::cout;
using std::endl;

int main() {
    // Load a CSV file into a DataFrame
    DataFrame df("data.csv");

    cout << "Full DataFrame:\n";
    cout << df << endl;

    cout << "First 3 rows:\n";
    df.head(3);

    cout << "Last 2 rows:\n";
    df.tail(2);

    // Access a single column
    Column& age_col = df["Age"];
    cout << "Age column:\n" << age_col << endl;

    // Compute mean of Age column (ignoring missing values)
    cout << "Mean age: " << age_col.mean() << endl;

    // Fill missing values in the Age column with the column mean
    age_col.fillna(age_col.mean());
    cout << "Age column after filling missing values with the average:\n" << age_col << endl;

    // Save the DataFrame to a file named "output.csv"
    df.save_to_csv("output.csv", true, "|", true, "N/A", {"Age", "Salary"});

    // Fill every remaining missing value in the DataFrame with 5
    df.fillna(5);
    cout << "DataFrame after filling missing values:\n";
    cout << df << endl;

    // Rename columns
    df.rename({{"Age", "Years"}, {"Salary", "Income"}});
    cout << "DataFrame after renaming columns:\n";
    cout << df << endl;

    // The Salary column is now called Income
    cout << df["Income"];
    cout << "Min income: " << df["Income"].min() << endl;
    cout << "Max income: " << df["Income"].max() << endl;

    // Filtering: keep only the rows where Years > 30
    DataFrame newData = df[df["Years"] > 30];
    cout << newData << endl;

    return 0;
}
```
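The filtering expression `df[df["Years"] > 30]` follows the pandas boolean-mask idiom: the comparison produces one boolean per row, and indexing with that mask keeps the matching rows. The sketch below shows one way this could work for a single numeric column. `MiniColumn` and `filter_rows` are hypothetical names made up for illustration; they are not part of `lesser_pandas.h`, and the library's own implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal column; NOT the library's Column class.
struct MiniColumn {
    std::vector<double> values;

    // Comparing a column with a scalar yields one boolean per row (a mask).
    std::vector<bool> operator>(double threshold) const {
        std::vector<bool> mask;
        mask.reserve(values.size());
        for (double v : values) {
            mask.push_back(v > threshold);
        }
        return mask;
    }
};

// Keep only the rows whose mask entry is true (hypothetical helper).
MiniColumn filter_rows(const MiniColumn& col, const std::vector<bool>& mask) {
    assert(mask.size() == col.values.size());  // mask must cover every row
    MiniColumn out;
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (mask[i]) {
            out.values.push_back(col.values[i]);
        }
    }
    return out;
}
```

With `MiniColumn years{{25.0, 31.0, 40.0, 30.0}}`, calling `filter_rows(years, years > 30)` keeps the two rows holding 31 and 40. A full DataFrame would apply the same mask to every column so that rows stay aligned.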
## TODO
### Contributions are welcome
- [x] Rename a column
- [x] `fillna`: Fill missing values
- [x] `dropna(col_name)`: Drop rows where `col_name` is missing
- [ ] `df.describe()`: Descriptive statistics
- [ ] `df.corr()`: Correlation matrix
- [x] `df[df['Amount'] > 1000]`: Filter rows based on a condition
- [x] `df.sum()`: Returns the sum of all rows
- [ ] `df["col"].sum()`
  - If the column contains non-numeric data (e.g., strings), `sum()` will concatenate them.
  - If the column has missing values (NaN), they will be ignored by default unless you specify `skipna=False`.
- [x] `df.to_csv('cleaned_data.csv')`: Save a modified DataFrame to a new CSV file.
- [ ] Implement a test suite for Lesser Pandas.
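
The behaviour planned for `df["col"].sum()` (concatenate strings, skip missing values unless `skipna` is false) can be sketched in a few lines. This is only an illustration under stated assumptions: `Cells` and `column_sum` are hypothetical names, missing values are modelled with `std::optional` (C++17), and returning an empty optional when `skipna` is false is a stand-in for the NaN result pandas gives in that case. None of this reflects how the library stores its data.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a column: each cell holds a value or is missing.
template <typename T>
using Cells = std::vector<std::optional<T>>;

// Sum of a column (hypothetical helper, not part of lesser_pandas.h).
// skipna == true : missing cells are ignored.
// skipna == false: any missing cell makes the whole result missing.
template <typename T>
std::optional<T> column_sum(const Cells<T>& cells, bool skipna = true) {
    T total{};  // 0 for numeric types, "" for std::string
    for (const auto& cell : cells) {
        if (!cell.has_value()) {
            if (!skipna) {
                return std::nullopt;
            }
            continue;
        }
        total += *cell;  // addition for numbers, concatenation for strings
    }
    return total;
}
```

For example, `column_sum(Cells<double>{30.0, std::nullopt, 12.5})` yields 42.5, while passing `false` as the second argument yields an empty optional. Because `+=` on `std::string` concatenates, the same template covers the non-numeric case.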