{"id":21289699,"url":"https://github.com/piriyaraj/word-count-and-plot","last_synced_at":"2025-03-15T16:11:11.730Z","repository":{"id":118859515,"uuid":"477772241","full_name":"piriyaraj/Word-count-and-plot","owner":"piriyaraj","description":"It reads multiple text file and plot the chart for all the words in the text. The plot bar count can be change using user inputs. developed by @piriyaraj","archived":false,"fork":false,"pushed_at":"2022-04-04T16:19:38.000Z","size":117,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-22T05:44:09.017Z","etag":null,"topics":["clanguage","countwordformmultiplefile","wordcounter","wordcounterinc","wordcountplot"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/piriyaraj.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-04-04T15:58:14.000Z","updated_at":"2022-04-04T16:31:54.000Z","dependencies_parsed_at":null,"dependency_job_id":"7ec4c5bf-730d-45d5-b054-ae0ef17c16b9","html_url":"https://github.com/piriyaraj/Word-count-and-plot","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/piriyaraj%2FWord-count-and-plot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/piriyaraj%2FWord-count-and-plot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/piriyaraj%2FWord-count-and-plot/releases","manifests_url"
:"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/piriyaraj%2FWord-count-and-plot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/piriyaraj","download_url":"https://codeload.github.com/piriyaraj/Word-count-and-plot/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243754094,"owners_count":20342542,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clanguage","countwordformmultiplefile","wordcounter","wordcounterinc","wordcountplot"],"created_at":"2024-11-21T12:42:42.182Z","updated_at":"2025-03-15T16:11:11.702Z","avatar_url":"https://github.com/piriyaraj.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Department of Computer Engineering\n\n# University of Peradeniya\n\n### CO222: Programming Methodology - Project 2\n\n## 1 Introduction\n\nOne of the most important features of any written language is how often particular characters\nor words occur. For example, in the English language, the 26 letters are not all used with\nthe same frequency. Generally, letters like e, a, and t appear more frequently in text. This kind of\ninformation can be used in different applications such as Machine Learning, OCR, Cryptography,\netc. The same applies to words. Prepositions and articles like the, a, and, in are used more\nfrequently than other words. In Project 2, you are supposed to observe this characteristic of the English\nlanguage using a program. 
A file or multiple files containing English text will be given to the program,\nand the program should output the word or character frequencies as a horizontal bar\nchart printed on the terminal.\u003cbr /\u003e\n![plot graph for the word count](word%20coutn%20plot.png)\u003cbr /\u003e\nFigure 1: The expected output from the program. The most frequently used words are displayed as\na horizontal bar chart\n\nFig 1 shows the expected output from the program for the most frequent words.\nThe program accepts several control and input arguments. Depending on the arguments, the\nprogram should change its behaviour and produce the expected output.\n\n\n## 2 Program output\n\n### 2.1 Control arguments for the program\n\nFile name/ File Names\n\nThe program should be able to accept any number of file names in any order. File names will not\nstart with ‘-’ (for example, -file.txt is not a valid file name).\n\nNumber of rows in the chart\n\nThis argument specifies the number of rows in the bar chart. It should be given as -l 10, where 10 is\nthe limit. The limit can be any positive integer. A number should always follow the -l argument. The pair\ncan appear anywhere in the argument list.\n\nScaled option\n\nWhen the --scaled argument is given, the first row of the graph should fully occupy the maximum print width.\nEvery other row should be scaled by the same factor as the first row.\n\nWord/Character toggle\n\nThe program can analyse frequencies in two modes, characters and words. If the output should be\ngiven for words, the -w argument should be given, whereas the -c argument gives character frequency output.\n\n### 2.2 Default options\n\nThe program must take at least one file name to work. All other arguments are optional. If they are not\ngiven, the program will work as non-scaled, output frequencies for words, and limit the output\nrows to 10.\n\n### 2.3 Pre-processing\n\nAll the non-alphanumeric characters must be removed from the text. For example, the word\nb@dW0rd should be changed into bdW0rd. 
Then, it should be converted to lower case,\nand only the processed words should be used in the calculations.\nWhile printing, if two words share the same frequency, the word that occurred first in the text should\nbe printed first on the chart. Also, all numbers should be printed with exactly two decimal places.\n\n### 2.4 Printing area\n\nThe program should work on an 80-character-wide screen. To understand the printing pattern, please\nrefer to the given binary file and test it with different files. It will give you a clear understanding\nof how the graph is printed on the screen. The output should be printed at exactly the same place and\nscale as in the given program.\nYou should use the -std=c99 flag to compile the source code because there are several Unicode\ncharacters you have to use when printing the graph. They are 2500, 2502, 2514, and 2591. It is up to you\nto find out what exactly these Unicode characters print. To print Unicode you may use printf as follows,\nprintf(\"\\u2502\");\n\n\n\n### 3.1 Basic functionality \n\nIf the program can read multiple files, store words or characters, and then produce the N highest\nfrequencies, then the program will be given 50 marks (even without a graph).\n\n### 3.2 Plotting the chart \n\nIf the graph is plotted correctly and as expected, the program will be given another 20\nmarks.\nIn both of the above cases, you may use the following static pattern of command line arguments to\nrun the program.\n./freqv1 -c --scaled -l 10 file1 file2 file ...\nwhere -c can be changed to -w and 10 can be any positive integer\n\n### 3.3 Input arguments and error handling \n\nAs you can see, the program has many arguments to be processed, and they may appear at any place\nin the argument list. 
If your program handles arguments in the same way as the example binary you\nare given, you may score 30 more marks.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpiriyaraj%2Fword-count-and-plot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpiriyaraj%2Fword-count-and-plot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpiriyaraj%2Fword-count-and-plot/lists"}