https://github.com/sabasyed/synthetic-data-generation
Synthetically generated LaTeX expressions along with their Python code.
- Host: GitHub
- URL: https://github.com/sabasyed/synthetic-data-generation
- Owner: SabaSyed
- Created: 2024-08-30T17:27:46.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-10-16T20:47:43.000Z (over 1 year ago)
- Last Synced: 2025-02-10T06:24:16.866Z (over 1 year ago)
- Topics: lambdify, latex, python, sympy, sympy-expressions, synthetic-data, synthetic-dataset-generation
- Language: Jupyter Notebook
- Homepage:
- Size: 40 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Synthetic Data Generation
This repository contains scripts and resources for generating synthetic datasets, focusing on mathematical expressions. The aim is to create diverse, randomized data that can be used for training models, particularly for tasks involving LaTeX-to-Python code conversion.
## Overview
To run the scripts in this repository, install the **SymPy** library for Python. SymPy is used to generate and manipulate mathematical expressions programmatically.
```bash
pip install sympy
```
### Generated Data
We created 100 random expressions for each of the following categories:
- **Multivariable equations**
- **Trigonometric functions**
- **Geometric expressions**
- **Diophantine equations**
- **Summation equations**
These expressions are generated using Python's SymPy library. SymPy's `lambdify` function converts each symbolic expression into a callable Python function, which can then be evaluated numerically and tested.
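As a rough illustration of this workflow, the sketch below builds one random multivariable expression, renders its LaTeX form with `sympy.latex`, and turns it into a callable with `lambdify`. The helper `random_multivariable_expr`, its parameters, the fixed seed, and the sample point are assumptions made for this example; they are not taken from the repository's notebooks.

```python
import random
import sympy as sp


def random_multivariable_expr(rng, symbols, n_terms=3, max_coeff=9, max_power=3):
    """Hypothetical helper: sum of random monomials such as 4*x**2 + 7*z."""
    expr = sp.Integer(0)
    for _ in range(n_terms):
        coeff = rng.randint(1, max_coeff)   # random integer coefficient
        var = rng.choice(symbols)           # random variable from the pool
        power = rng.randint(1, max_power)   # random positive exponent
        expr += coeff * var**power
    return expr


x, y, z = sp.symbols("x y z")
rng = random.Random(0)  # fixed seed (an assumption) so the sketch is reproducible

expr = random_multivariable_expr(rng, [x, y, z])

# LaTeX side of the pair
latex_src = sp.latex(expr)

# Executable side of the pair; modules="math" avoids requiring NumPy
func = sp.lambdify((x, y, z), expr, modules="math")

# Check the generated function against exact symbolic substitution
point = (1.5, -2.0, 0.5)
numeric = func(*point)
exact = float(expr.subs(dict(zip((x, y, z), point))))

print(latex_src)
print(numeric, exact)
```

Comparing the `lambdify` result with `expr.subs(...)` at a sample point is one simple way to test that the executable form agrees with the symbolic expression it was generated from.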
## Future Improvements
- Expand the range of mathematical functions and expressions.
- Optimize test case generation for more complex scenarios.
Contributions are welcome: share feedback, open an issue, or submit a pull request!