{"id":49453598,"url":"https://github.com/Jays-code-collection/HMMs_Stock_Market","last_synced_at":"2026-06-02T05:00:38.085Z","repository":{"id":55645180,"uuid":"307095483","full_name":"Jays-code-collection/HMMs_Stock_Market","owner":"Jays-code-collection","description":"Contains all code related to using HMMs to predict stock market prices. ","archived":false,"fork":false,"pushed_at":"2022-08-13T08:17:42.000Z","size":104,"stargazers_count":239,"open_issues_count":2,"forks_count":54,"subscribers_count":9,"default_branch":"main","last_synced_at":"2023-11-07T19:11:56.166Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Jays-code-collection.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-10-25T12:36:55.000Z","updated_at":"2023-09-07T23:55:07.000Z","dependencies_parsed_at":"2022-08-15T05:20:52.480Z","dependency_job_id":null,"html_url":"https://github.com/Jays-code-collection/HMMs_Stock_Market","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"purl":"pkg:github/Jays-code-collection/HMMs_Stock_Market","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jays-code-collection%2FHMMs_Stock_Market","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jays-code-collection%2FHMMs_Stock_Market/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jays-code-collection%2FHMMs_Stock_Market/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jays-code-collection%2FHMMs_Stock_Market/manifests","owner
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Jays-code-collection","download_url":"https://codeload.github.com/Jays-code-collection/HMMs_Stock_Market/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Jays-code-collection%2FHMMs_Stock_Market/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33806987,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-02T02:00:07.132Z","response_time":109,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-30T04:01:03.701Z","updated_at":"2026-06-02T05:00:38.078Z","avatar_url":"https://github.com/Jays-code-collection.png","language":"Python","funding_links":[],"categories":["📦 Legacy \u0026 Inactive Projects"],"sub_categories":[],"readme":"# Stock Market prediction using Hidden Markov Models\r\nThis repo contains all code related to my work using Hidden Markov Models to predict stock market prices. This\r\ninitially started as academic work, for my masters dissertation, but has since been a project that I have continued to work on \r\npost graduation. 
At present, the program must be called from a terminal/command line, but the\r\naim is to extend it to an interactive site in future, potentially via Django.\r\n\r\n## Motivation\r\nHidden Markov Models are an incredibly interesting type of stochastic process that is underutilised in the\r\nMachine Learning world. They are particularly useful for analysing time series. This, combined with their ability to \r\nconvert the observable outputs that are emitted by real-world processes into predictable and efficient models, makes\r\nthem a viable candidate for stock market analysis. The stock market\r\nhas several interesting properties that make modeling non-trivial, namely\r\nvolatility, time dependence and other similar complex dependencies. HMMs\r\nare suited to dealing with these complications as the only information they\r\nrequire to generate a model is a set of observations (in this case historical stock market data).\r\n\r\n## Example\r\nTraining on and predicting stock prices between January 1st 2018 and December 5th 2020 (the date that this example was run on), predicting 5 days into the future. 
Typically the model will need to be trained on longer periods for more accurate results but this is purely to have a simple example.\r\n\r\nInput:\r\n```shell\r\npython stock_analysis.py -n AAPL -s 2018-01-01 -e 2020-12-05 -o C:\\Users\\Jay\\Test -p True -f 5 -m True\r\n```\r\n\r\nOutput:\r\n```shell\r\nUsing continuous Hidden Markov Models to predict stock prices for AAPL\r\nTraining data period is from 2018-01-02 00:00:00 to 2019-12-17 00:00:00\r\n2020-12-06 17:50:11,202 __main__     INFO     \u003e\u003e\u003e Extracting Features\r\n2020-12-06 17:50:11,203 __main__     INFO     Features extraction Completed \u003c\u003c\u003c\r\nPredicting Close prices from 2019-12-18 00:00:00 to 2020-12-04 00:00:00\r\n100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 244/244 [07:54\u003c00:00,  1.94s/it]\r\nAll predictions saved. The Mean Squared Error for the 244 days considered is: 3.7785175769493202\r\nPredicting future Close prices from 2020-12-05 00:00:00 to 2020-12-09 00:00:00\r\n100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:09\u003c00:00,  1.92s/it]\r\nThe predicted stock prices for the next 5 days from 2020-12-05 are:  [122.99846938775511, 123.75152124114953, 124.50918361609536, 125.27148474027554,\r\n126.0384530141956]\r\nThe full set of predictions has been saved, including the High, Low, Open and Close prices for 5 days in the future.\r\n```\r\n\r\nBottom of Excel file (future predictions):\r\n\r\n|          Date         |   High   |    Low   |   Open   |   Close  |\r\n|:---------------------:|:--------:|:--------:|:--------:|:--------:|\r\n| 2020-12-03   00:00:00 | 123.78   | 122.21   | 123.52   | 122.94   |\r\n| 2020-12-04   00:00:00 | 122.86   | 121.52   | 122.6    | 122.25   |\r\n| 2020-12-05   00:00:00 | 123.6083 | 122.25   | 122.25   | 122.9985 |\r\n| 2020-12-06   00:00:00 | 124.3651 | 122.9985 | 122.9985 | 
123.7515 |\r\n| 2020-12-07   00:00:00 | 125.1265 | 123.7515 | 123.7515 | 124.5092 |\r\n| 2020-12-08   00:00:00 | 125.8926 | 124.5092 | 124.5092 | 125.2715 |\r\n| 2020-12-09   00:00:00 | 126.6634 | 125.2715 | 125.2715 | 126.0385 |\r\n\r\nThe full table contains all of the test data, in this case from 2019-12-18 to 2020-12-04, as well as the final 5 days in the future, 2020-12-05 to 2020-12-09.\r\n\r\nPlot:\r\n\r\n![plot](images/AAPLresults_plot.png)\r\n## Dependencies\r\n* Pandas_datareader - Allows one to download data directly from Yahoo Finance\r\n* NumPy - Required for fast manipulation of financial data (e.g. calculating fractional change)\r\n* Matplotlib - Required for visualisation of results\r\n* Hmmlearn - Open source package that allows for creation and fitting of HMMs \r\n* Sklearn - Used to calculate metrics to score the results and split the data; will be removed in future to reduce dependencies\r\n* Tqdm - Used to track progress whilst training\r\n* Argparse - Required for console inputs\r\n\r\n## Method\r\nStock market data is downloaded via pandas_datareader and the data is split into training and testing datasets. The \r\nfractional changes for any given day (from open to close, high to open, open to low) in the training dataset are computed and stored in a NumPy \r\narray. These fractional changes can be seen as the observations for the HMM and are used to train the continuous HMM \r\nwith hmmlearn's fit method. The model then predicts the closing price for each day in the testing dataset, based on the given \r\nday's opening price. This prediction is found by calculating the highest scoring potential outcome out of a pre-determined \r\nset of outcomes (e.g. +0.001%, -0.001% etc). All predictions as well as the actual close prices for the testing period are stored in an \r\nExcel file and the Mean Squared Error between the two is printed out. The MSE is also included in the file name for future \r\nreference. 
Future days are predicted by feeding forward the prediction values. Unfortunately, at present there is no method in place to account for overnight/weekend futures trading, and so for the future predictions the open price on day n+1 is the same as the closing price on day n. \r\n\r\n## Installation\r\n\r\n### Linux\r\n```shell\r\nsudo apt-get install libopenjp2-7 libtiff5\r\npip install -r requirements.txt\r\n```\r\n\r\n## Usage \r\nWithin the src directory:\r\n```shell\r\npython stock_analysis.py [-n XXXX] [-s yyyy-mm-dd] [-e yyyy-mm-dd] [-o dir] [-p T/F] [-f int] [-m T/F]\r\n```\r\nThe -n input represents a given stock name, -s is the start date of the period considered, -e is the end date of the period considered \r\nand -o takes in the output directory for the Excel file produced. It is important that the dates are input in the correct\r\norder. -p is a boolean input that tells the script whether or not to plot the results for the historical dates; it has no effect unless -m is also set to True. \r\n-f stands for future and takes in an integer that determines how many days into the future the user would like to predict. \r\n-m stands for metrics, and determines whether to predict for the historical data as well, or just for the future days. If True, all of the historical data in the\r\ntest set will be predicted and the Mean Squared Error will be returned. The justification for -m being an optional input is that the model can take quite some time to \r\npredict, so it's best if the user has the option to quickly predict just the close prices for x future days, as that is the information that many people will find most \r\nuseful. 
\r\n\r\nTo run tests, use:\r\n```shell\r\npython3 -m pytest tests\r\n```\r\n\r\nTo run tests using docker, use:\r\n```shell\r\ndocker-compose up tests\r\n```\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJays-code-collection%2FHMMs_Stock_Market","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FJays-code-collection%2FHMMs_Stock_Market","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJays-code-collection%2FHMMs_Stock_Market/lists"}