{"id":24703416,"url":"https://github.com/youssef-saaed/knn_assignment_c_version","last_synced_at":"2025-03-22T04:43:01.576Z","repository":{"id":222499648,"uuid":"757461889","full_name":"youssef-saaed/KNN_Assignment_C_version","owner":"youssef-saaed","description":"This is a machine learning assignment that implements the K-Nearest Neighbors (KNN) algorithm on the Iris dataset from scratch using pure C","archived":false,"fork":false,"pushed_at":"2024-02-14T14:49:37.000Z","size":40,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-27T05:55:02.995Z","etag":null,"topics":["c","from-scratch","iris-classification","iris-dataset","knn","pure-c"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/youssef-saaed.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2024-02-14T14:45:35.000Z","updated_at":"2024-06-01T17:20:49.000Z","dependencies_parsed_at":"2024-02-14T16:03:29.020Z","dependency_job_id":null,"html_url":"https://github.com/youssef-saaed/KNN_Assignment_C_version","commit_stats":null,"previous_names":["youssef-saaed/knn_assignment_c_version"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/youssef-saaed%2FKNN_Assignment_C_version","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/youssef-saaed%2FKNN_Assignment_C_version/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/youssef-saaed%2FKNN_Assignment_C_ve
rsion/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/youssef-saaed%2FKNN_Assignment_C_version/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/youssef-saaed","download_url":"https://codeload.github.com/youssef-saaed/KNN_Assignment_C_version/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244907379,"owners_count":20529851,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","from-scratch","iris-classification","iris-dataset","knn","pure-c"],"created_at":"2025-01-27T05:55:14.624Z","updated_at":"2025-03-22T04:43:01.560Z","avatar_url":"https://github.com/youssef-saaed.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# KNN Assignment C version\n\nThis is a machine learning assignment that implements the K-Nearest Neighbors (KNN) algorithm on the Iris dataset. The Iris dataset contains 150 samples of three different species of iris flowers, with four features each: sepal length, sepal width, petal length, and petal width. The goal is to classify each sample into one of the three species based on the features.\n\n## Dataframe.h\n\nThis is a header file that defines the data frame structure and functions. A data frame is a two-dimensional array of cells, where each cell can store different types of data, such as integers, doubles, or strings. The data frame also has an array of column names and a shape attribute that stores the number of rows and columns. 
The data frame functions include:\n\n- `new_dataframe`: Creates a new data frame with the given number of rows and columns and returns a pointer to it.\n- `read_csv`: Reads a CSV file and stores the data in a data frame and returns a pointer to it.\n- `copy_dataframe`: Copies an existing data frame and returns a pointer to the new data frame.\n- `copy_cell`: Copies an existing cell and returns a pointer to the new cell.\n- `print_cell`: Prints the data of a cell to the standard output.\n- `print_dataframe`: Prints the data of a data frame to the standard output.\n- `delete_dataframe`: Deletes a data frame and frees the memory allocated for it.\n- `mean`: Calculates the mean of a column in a data frame and returns it as a double.\n- `stdev`: Calculates the standard deviation of a column in a data frame and returns it as a double.\n- `standardize`: Standardizes a data frame by subtracting the mean and dividing by the standard deviation for the given columns and returns a pointer to the standardized data frame.\n\n## KNN_Helpers.h\n\nThis is a header file that defines the K-Nearest Neighbors helper functions. The KNN algorithm is a supervised learning method that classifies a sample based on the majority vote of its k nearest neighbors in the feature space. The helper functions include:\n\n- `split_samples`: Splits a data frame into training and testing data frames based on a given ratio and a label column and stores the pointers in the given parameters.\n- `ecludian_dist`: Calculates the Euclidean distance between two points of the same length and returns the result as a double.\n- `KNN_training_res`: Performs the KNN algorithm on the training and testing data frames and prints the results.\n\n## main.c\n\nThis is the main file that executes the program. 
It performs the following steps:\n\n- Reads the Iris.csv file and stores the pointer to the data frame in `df`.\n- Prints the data frame to the standard output.\n- Declares and initializes an array of column names to be standardized: `stdcols`.\n- Standardizes the data frame by subtracting the mean and dividing by the standard deviation for the given columns and stores the pointer to the standardized data frame in `standardized_df`.\n- Declares pointers to the training and testing data frames and initializes them to NULL: `training_df` and `testing_df`.\n- Splits the standardized data frame into training and testing data frames based on a 0.2 ratio and the label column (5) and stores the pointers in `training_df` and `testing_df`.\n- Performs the KNN algorithm on the training and testing data frames with k = 3, 5, and 7 and prints the results.\n- Deletes the data frames and frees the memory allocated for them.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyoussef-saaed%2Fknn_assignment_c_version","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyoussef-saaed%2Fknn_assignment_c_version","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyoussef-saaed%2Fknn_assignment_c_version/lists"}