{"id":28629846,"url":"https://github.com/codebox/gradient-descent","last_synced_at":"2025-09-02T15:34:10.064Z","repository":{"id":27676403,"uuid":"31162577","full_name":"codebox/gradient-descent","owner":"codebox","description":"Python implementations of both Linear and Logistic Regression using Gradient Descent","archived":false,"fork":false,"pushed_at":"2015-03-14T17:47:11.000Z","size":184,"stargazers_count":3,"open_issues_count":1,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2023-03-11T21:48:10.096Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://codebox.org.uk/pages/gradient-descent-python","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/codebox.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-02-22T12:38:09.000Z","updated_at":"2018-09-20T16:06:50.000Z","dependencies_parsed_at":"2022-09-03T03:42:32.395Z","dependency_job_id":null,"html_url":"https://github.com/codebox/gradient-descent","commit_stats":null,"previous_names":[],"tags_count":null,"template":null,"template_full_name":null,"purl":"pkg:github/codebox/gradient-descent","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codebox%2Fgradient-descent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codebox%2Fgradient-descent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codebox%2Fgradient-descent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codebox%2Fgradient-descent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/codebox","download_u
rl":"https://codeload.github.com/codebox/gradient-descent/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codebox%2Fgradient-descent/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259462575,"owners_count":22861514,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-12T12:13:42.452Z","updated_at":"2025-06-12T12:13:44.697Z","avatar_url":"https://github.com/codebox.png","language":"Python","readme":"# gradient-descent\n\nThis Python utility provides implementations of both [Linear](http://en.wikipedia.org/wiki/Linear_regression) and \n[Logistic Regression](http://en.wikipedia.org/wiki/Logistic_regression) using \n[Gradient Descent](http://en.wikipedia.org/wiki/Gradient_descent), these algorithms are commonly used in Machine Learning.\n\nThe utility analyses a set of data that you supply, known as the _training set_, which consists of multiple data items or \n_training examples_. Each training example must contain one or more input values, and one output value. The utility attempts \nto derive an equation (called the _hypothesis_) which defines the relationship between the input values and the output value. 
\nThe hypothesis can then be used to predict what the output will be for new inputs that were not part of the original training set.\n\nFor example, if you are interested in predicting house prices you might compile a training set using data from past property sales, \nusing the selling price as the output value, and various attributes of the houses such as number of rooms, \narea, number of floors etc. as the input values.\n\n### Training Data File Format\n\nTo use the utility with a training set, the data must be saved in a correctly formatted text file, with each line in the file \ncontaining the data for a single training example. A line must begin with the output value followed by a ':'; the remainder \nof the line should consist of a comma-separated list of the input values for that training example. The number of input values \nmust be the same for each line in the file - any lines containing more/fewer input values than the first line will be rejected. \nLines beginning with a '#' symbol will be treated as comments and ignored.\n\nAn extract from the House Prices data file might look like this:\n\n\u003cpre\u003e\n# House Price Data\n# line format is: \u0026lt;price\u0026gt;:\u0026lt;room count\u0026gt;,\u0026lt;number of floors\u0026gt;,\u0026lt;area\u0026gt;\n235000:9,2,112\n125500:4,1,90\n400000:12,2,190\n\u003c/pre\u003e\n\n### Helper Configuration\n\nAs well as supplying a training set, you will need to write a few lines of Python code to configure how the utility will run. \nIt is recommended that you use the `Helper` class to do this, which will simplify the use of the utility by handling \nthe wiring and instantiation of the other classes, and by providing reasonable defaults for many of the required configuration parameters.  \n\nThe Helper class has many configuration options, which are documented below. 
A simple invocation might look something like this:\n\n\u003cpre\u003e\nHelper('house_price_data.txt') \\\n    .with_linear_regression() \\\n    .with_alpha(0.1) \\\n    .with_iterations(30000) \\\n    .with_linear_terms() \\\n    .go()\n\u003c/pre\u003e\n\nThe Helper is configured using the following methods:\n\n#### with_iterations\n\nAn integer value, defaulting to 1000. This determines the number of iterations of Gradient Descent that will be performed before the \ncalculated hypothesis is displayed. Higher values will yield more accurate results, but will increase the required running time.\n\n#### with_alpha\n\nA numeric value, defaulting to 1. This method sets the _learning rate_ parameter used by Gradient Descent when updating the hypothesis \nafter each iteration. Up to a point, higher values will cause the algorithm to converge on the optimal solution more quickly; however, if \nthe value is set too high then it will fail to converge at all, yielding successively larger errors on each iteration. Finding a good \nlearning rate value is largely a matter of experimentation - enabling error checking, as detailed below, can assist with this process.\n\n#### with_error_checking\n\nA boolean value, defaulting to False. When set to True the utility will check the hypothesis error after each iteration, and abort if \nthe error has increased. Setting this can be useful when attempting to determine a reasonable learning rate value for a new data set; \nhowever, once this has been done error checking should be disabled in order to increase processing speed.\n\n#### with_term\n\nAdds a single term to the hypothesis. This method requires a string value (the name that will be used to refer to the new term) and a \nfunction object accepting a single parameter, which will be a list containing all the input values for a single training example. 
\nThis method should be used to add custom, non-linear terms to the hypothesis:\n\n\u003cpre\u003e\n.with_term('w^2',    lambda l: l[0] * l[0])         # Square of the first input value\n.with_term('log(n)', lambda l: math.log(l[3], 10))  # Logarithm (base 10) of the 4th input value\n.with_term('a*b*c',  lambda l: l[0] * l[1] * l[2])  # Product of the first 3 input values\n\u003c/pre\u003e\n\n#### with_linear_terms\n\nAdds a series of linear terms to the hypothesis, one for each of the input parameters in the training set. The terms will be named \nautomatically, 'x1' for the first input parameter, 'x2' for the second and so on.\n\n#### with_regularisation_coefficient\n\nAn integer value, defaulting to 0. Setting a non-zero regularisation coefficient will have the effect of producing a smoother, more \ngeneral hypothesis, less prone to overfitting - as a consequence the hypothesis will yield larger errors on the training \nset, but may provide a better fit for new data.\n\n#### with_linear_regression\n\nMakes the utility use Linear Regression to derive the hypothesis.\n\n#### with_logistic_regression\n\nMakes the utility use Logistic Regression to derive the hypothesis. Note that when using Logistic Regression the output values in the \ntraining set must be either '0' or '1'.\n\n#### with_normalisation\n\nA boolean value, defaulting to True. When normalisation is enabled, the utility will perform Feature Scaling and Mean Normalisation \non the input data.\n\n#### with_test_on_completion\n\nMakes the utility run the final hypothesis against the training data after calculation has been completed. The displayed results \nshould give a clear indication of how good the hypothesis is.\n\n### Example: Linear Regression\n\nHere the utility is used to derive an equation for calculating the Apparent Magnitude of a star from its Absolute Magnitude and its Distance. 
This is a slightly atypical application of machine learning because these quantities are already known to be related by a [mathematical formula](http://www.astro.cornell.edu/academics/courses/astro201/mag_absolute.htm); however, it should serve as a useful test to prove that the utility is working correctly.\n\nThe training set contains approximately 1000 examples extracted from the [HYG Database](http://www.astronexus.com/hyg). The input data is contained in a text file called `star_data.txt`; a sample from the file is shown below:  \n\n\u003cpre\u003e\n...\n9.1:219.7802,2.39\n9.27:47.9616,5.866\n6.61:442.4779,-1.619\n...\n\u003c/pre\u003e\n\nThe utility is executed using the command shown below. Note that in the names for the various terms, the letter 'D' has been used to represent the Distance value (the first input value) and 'M' represents the Absolute Magnitude (the second input value). In this example we have speculatively added a number of custom terms using M and D both individually and in combination with each other. 
Each of these terms may or may not be involved in the actual relationship between the inputs and the output - the utility will determine which of them are actually useful, and to what extent, as part of its processing.\n\n\u003cpre\u003e\nHelper('star_data.txt') \\\n    .with_linear_regression() \\\n    .with_alpha(1) \\\n    .with_iterations(30000) \\\n    .with_term('M',      lambda l : l[1]) \\\n    .with_term('M^2',    lambda l : l[1] * l[1]) \\\n    .with_term('D',      lambda l : l[0]) \\\n    .with_term('D^2',    lambda l : l[0] * l[0]) \\\n    .with_term('D*M',    lambda l : l[0] * l[1]) \\\n    .with_term('log(D)', lambda l : math.log(l[0], 10)) \\\n    .go()\n\u003c/pre\u003e\n\nAfter 30,000 iterations the following hypothesis has been calculated:\n\n\u003cpre\u003e\n-------------------------------\nTheta values:\n-------------\n      x0 =      -4.99921928\n       D =       0.00000083\n     D^2 =      -0.00000000\n       M =       1.00003066\n  log(D) =       4.99956287\n     M^2 =      -0.00000644\n     D*M =      -0.00000000\n\nCompleted 30000 iterations\n-------------------------------\n\u003c/pre\u003e\n\nThe numbers shown against each of the terms are their coefficients in the resulting hypothesis equation. Notice that in addition to the 6 terms we added to the Helper, there is also a 7th term called 'x0'. This term is automatically added to the hypothesis by the utility, and is simply a constant term that does not depend on any of the input values.\nThis output can be interpreted to mean that the best hypothesis found by the utility (i.e. 
the best way to find the output from the inputs) is by using the equation:\n\n\u003cpre\u003e\n    output = -4.99921928 + (D * 0.00000083) + (-0.00000000 * D^2) + (1.00003066 * M) + (4.99956287 * log(D)) + (-0.00000644 * M^2) + (-0.00000000 * D * M)\n\u003c/pre\u003e\n\nHowever, four of these coefficients are very close to zero, so it is safe to assume these terms have little influence on the output value, and we can remove them:\n\n\u003cpre\u003e\n    output = -4.99921928 + (1.00003066 * M) + (4.99956287 * log(D))\n\u003c/pre\u003e\n\nEach of the remaining coefficients is close to an integer value, so we can further simplify the equation by rounding them as follows: \n\n\u003cpre\u003e\n    output = -5 + M + 5 * log(D)\n\u003c/pre\u003e\n\nThis equation matches [the one used by astronomers](http://www.astro.cornell.edu/academics/courses/astro201/mag_absolute.htm) to calculate magnitude values.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodebox%2Fgradient-descent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcodebox%2Fgradient-descent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodebox%2Fgradient-descent/lists"}