{"id":21182678,"url":"https://github.com/guilhermebkel/data-structures-study","last_synced_at":"2026-05-14T13:31:19.791Z","repository":{"id":165593023,"uuid":"481023349","full_name":"guilhermebkel/data-structures-study","owner":"guilhermebkel","description":"🏛️ A deep study about Data Structures with help of C++ language","archived":false,"fork":false,"pushed_at":"2022-06-17T21:26:02.000Z","size":635,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-08-22T19:13:24.332Z","etag":null,"topics":["algorithm-analysis","algorithms-and-data-structures","c","data-structures"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/guilhermebkel.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-04-13T01:02:28.000Z","updated_at":"2022-05-28T00:36:25.000Z","dependencies_parsed_at":null,"dependency_job_id":"d12fe869-4f06-4e4a-b011-d53c03a90f6c","html_url":"https://github.com/guilhermebkel/data-structures-study","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/guilhermebkel/data-structures-study","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guilhermebkel%2Fdata-structures-study","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guilhermebkel%2Fdata-structures-study/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guilhermebkel%2Fdata-structures-study/releases","manifes
ts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guilhermebkel%2Fdata-structures-study/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/guilhermebkel","download_url":"https://codeload.github.com/guilhermebkel/data-structures-study/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guilhermebkel%2Fdata-structures-study/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33026778,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-13T13:14:54.681Z","status":"online","status_checked_at":"2026-05-14T02:00:06.663Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithm-analysis","algorithms-and-data-structures","c","data-structures"],"created_at":"2024-11-20T17:57:38.741Z","updated_at":"2026-05-14T13:31:19.733Z","avatar_url":"https://github.com/guilhermebkel.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data Structures Study\n\nA deep study about Data Structures with help of C language\n\n## Algorithm Complexity Analysis\n\nThe algorithm efficiency can be measured in a lot of ways, the most famous are:\n\n- Temporal complexity.\n- Spatial complexity.\n\n### Specific Algorithm Analysis\n\nWe try to understand how much it costs for an specific algorithm to solve a problem.\n\nThese are some of the characteristics we study about the 
algorithm:\n\n- How many times each part of the algorithm must be executed.\n\n- How much memory the algorithm's data structures need.\n\n### Algorithm Class Analysis\n\nWe try to find the algorithm with the lowest possible cost for a particular problem (for example, sorting and searching).\n\nThe whole family of algorithms that solve the problem is studied in order to find the best one. This lets us set bounds on the computational complexity of the algorithms that belong to that family.\n\n### Cost Measurement\n\nIf we use the same cost measure for different algorithms, we can compare them and choose the best one.\n\nWe can measure the algorithm cost in the following ways:\n\n#### Algorithm Execution\n\nMost of the time, measuring the actual execution is not the best option, since the results depend on the compiler, the hardware and memory utilization.\n\nEven so, it is a good measurement when we need to compare distinct algorithms that solve the same problem and have the same order of magnitude, or when we need to analyze how the algorithm behaves in the context where it will be applied.\n\n#### Mathematical Model\n\nWe use a mathematical model based on an idealized computer.\n\nUsually we use the RAM Model (Random Access Machine):\n\n- A processor that executes one instruction at a time.\n\n- A memory that stores the data.\n\n- Basic operations with constant cost (memory access, conditionals, arithmetic operations, etc.).\n\nWe need to specify the algorithm operations and their execution costs (usually we only consider the cost of the most significant operations).\n\n### Algorithm Cost\n\nWhen we determine the lowest possible cost to solve a problem of a given class, we find the inherent difficulty of that problem.\n\nAn optimal algorithm is one whose cost equals the lowest possible cost for the problem. 
\n\n### Complexity Function\n\nThe execution cost of an algorithm is given by a cost function, or complexity function.\n\n**f(n)** is the cost of executing an algorithm on a problem of size **n**.\n\n- **Temporal Complexity Function:** **f(n)** measures the number of operations needed to execute an algorithm on a problem of size **n**.\n\n- **Spatial Complexity Function:** **f(n)** measures the memory needed to execute an algorithm on a problem of size **n**.\n\nIn general, we use **f(n)** as a temporal complexity function. Keep in mind that it does not represent time directly; it represents the number of times a specific operation considered relevant is executed.\n\n#### Example: Find max number in an array.\n\nThe cost is **F(n) = n - 1**, since the loop runs **n - 1** iterations and makes one comparison in each.\n\n```c\nint maxValue (int *A, int n) {\n\tint i, temp;\n\n\ttemp = A[0];\n\n\tfor (i = 1; i \u003c n; i++) {\n\t\tif (temp \u003c A[i]) {\n\t\t\ttemp = A[i];\n\t\t}\n\t}\n\n\treturn temp;\n} \n```\n\nThe execution cost of an algorithm is a function of the size of the input data.\n\nIn some algorithms, the execution cost depends on the data organization as well. 
In these cases, we will have different complexity functions to represent the best, worst and average case.\n\n- **Best case:** Lowest execution cost over all inputs of size **n**.\n\n- **Worst case:** Highest execution cost over all inputs of size **n**.\n\n- **Average case:** Average execution cost over all inputs of size **n**, weighted by how likely each input is.\n\t- In general this is not (best case + worst case) / 2; it depends on the input distribution.\n\n#### Example: Sequential search (each record has a unique key).\n\nThe **best case** is when the record is the first one read: **F(n) = 1**.\n\nThe **worst case** is when the record is the last one read or it is not in the store at all: **F(n) = n**.\n\nThe **average case**, assuming the record is in the store and every position is equally likely, is **F(n) = (1 + 2 + ... + n) / n = (n + 1) / 2**.\n\n```c\n/* Returns the index of the record with the given key, or n if it is not found. */\nint searchIndex (Store *A, int n, int key) {\n\tint i;\n\n\ti = 0;\n\n\twhile(i \u003c n) {\n\t\tif (A[i].key == key) {\n\t\t\tbreak;\n\t\t}\n\n\t\ti++;\n\t}\n\n\treturn i;\n}\n```\n\n### Algorithms Comparison\n\nWhen we compare algorithms of the same class, we are sometimes able to determine the lower bound of that class.\n\nThe lower bound is the best complexity function that any algorithm of the class can achieve.\n\n## Asymptotic Complexity\n\nIt is important to study the algorithm cost for big values of **n** (n → ∞).\n\nAsymptotic complexity analysis is the analysis of the algorithm as the value of **n** tends to infinity.\n\nIn that case, we do not need to worry about constants and slower-growing terms.\n\n### Asymptotic Dominance\n\nA function **f(n)** asymptotically dominates a function **g(n)** if there are two positive constants **c** and **m** such that, for every **n \u003e= m**, we have **|g(n)| \u003c= c|f(n)|**.\n\n\u003cimg src=\"./resources/asymptotic-dominance.png\"\u003e\u003c/img\u003e\n\n### Asymptotic Notations\n\n#### O (Big-O)\n\nSpecifies an asymptotic upper bound for **g(n)** (it is commonly used to bound the worst case).\n\n**g(n) = O(f(n))** if **f(n)** asymptotically dominates **g(n)**.\n\nWe say that **g(n) is of order f(n)**, or **O of f(n)**.\n\n\u003cimg 
src=\"./resources/big-o.png\"\u003e\u003c/img\u003e\n\n- Useful operations:\n\n\u003cimg src=\"./resources/big-o-operations.png\"\u003e\u003c/img\u003e\n\n- Example:\n\n\u003cimg src=\"./resources/big-o-example-1.png\"\u003e\u003c/img\u003e\n\n#### Ω (Big-Omega)\n\nSpecifies an asymptotic lower bound for **g(n)**.\n\nA function **g(n)** is **Ω(f(n))** if there are two positive constants **c** and **m** such that **g(n) \u003e= cf(n)** for every **n \u003e= m**.\n\n\u003cimg src=\"./resources/big-omega.png\"\u003e\u003c/img\u003e\n\n#### Θ (Big-Theta)\n\nA function **g(n)** is **Θ(f(n))** if there are positive constants **c1, c2 and m** such that **0 \u003c= c1f(n) \u003c= g(n) \u003c= c2f(n)** for every **n \u003e= m**.\n\n\u003cimg src=\"./resources/big-theta.png\"\u003e\u003c/img\u003e\n\nFor every **n \u003e= m**, the function **g(n)** is equal to **f(n)** up to a constant factor, so **f(n)** is both an upper and a lower bound for **g(n)**.\n\n### Asymptotic Class Behavior\n\nIn general, it is useful to group algorithms and problems by asymptotic behavior class, which determines the inherent complexity of the algorithm.\n\nWhen two algorithms have the same asymptotic behavior class, we call them equivalent (in that case, we need a more detailed analysis of the complexity function or of their performance in real systems to choose between them).\n\n#### f(n) = O(1): Constant Complexity\n\nThe algorithm's cost does not depend on **n**.\n\nThe algorithm instructions are executed a fixed number of times.\n\n#### f(n) = O(log n): Logarithmic Complexity\n\nUsually algorithms that turn a problem into smaller problems, discarding a constant fraction of the input at each step (for example, binary search).\n\n#### f(n) = O(n): Linear Complexity\n\nIn general, a small amount of work is done on every element of the input.\n\nEvery time **n** doubles, the execution time doubles as well.\n\n#### f(n) = O(n log n)\n\nUsually algorithms that break a problem into smaller ones, solve each one independently and then combine the solutions.\n\n#### f(n) = O(n²): Quadratic Complexity\n\nUsually happens when the input data is processed in pairs, for example with one loop nested inside another.\n\n#### 
f(n) = O(n³): Cubic Complexity\n\nUseful only for small problems.\n\n#### f(n) = O(2^n): Exponential Complexity\n\nUsually not useful from a practical point of view.\n\nHappens when we try to solve problems by brute force.\n\n#### f(n) = O(n!)\n\nAn algorithm of complexity O(n!) is also said to have exponential complexity, even though it is much worse than **O(2^n)**.\n\nHappens when we try to solve problems by brute force.\n\n### Asymptotic Class Behavior Comparison\n\n\u003cimg src=\"./resources/complexity-comparison.png\"\u003e\u003c/img\u003e\n\n## Analysis Techniques\n\nDetermining the exact complexity function for the execution time of a program can be a hard mathematical problem. Determining only the order of complexity, without worrying about constant values, is usually an easier task.\n\n### Execution Time Analysis\n\nGenerally we consider an assignment, a read or a write to be **O(1)**.\n\n```c\na = 0;\n\nv[0] = 12;\n\nb = a + 1;\n\nreturn b;\n```\n\nThe time of a conditional is the time of the commands inside it plus the time to evaluate the condition, which in general is **O(1)**.\n\nThat way, if the complexity of the commands differs depending on whether the condition is **T** or **F**, we can have a best and a worst case.\n\n```c\nif (A[j] \u003c A[min]) {\n\tmin = j;\n} else {\n\treturn min;\n}\n```\n\nEvery function call must have its time computed separately, starting from the functions that do not call other procedures.\n\n#### Example \n\n\u003cimg src=\"./resources/execution-time-analysis-big-o.png\"\u003e\u003c/img\u003e\n\n\u003cimg src=\"./resources/execution-time-analysis-big-o-2.png\"\u003e\u003c/img\u003e\n\n## Recursive Algorithm Analysis\n\nAn algorithm is recursive when the procedure calls itself.\n\n### Recursive function structure\n\nUsually, recursive functions are divided into two parts:\n\n- Recursive call.\n\n- Stop condition.\n\nThe stop condition is essential to avoid infinite recursion. 
The recursive call can be:\n\n- **Direct:** The function A calls itself.\n\n- **Indirect:** The function A calls B and B calls A again.\n\nThe recursive call can happen more than once inside the function.\n\n#### Example: Fibonacci\n\n- Complexity: Exponential\n\n```c\nint Fib(int n) {\n\tif (n \u003c 3) {\n\t\treturn 1;\n\t} else {\n\t\treturn Fib(n-1) + Fib(n-2);\n\t}\n}\n```\n\n- Complexity: O(n)\n\n```c\nint FibIter(int n) {\n\tint fn1 = 1, fn2 = 1;\n\tint fn, i;\n\n\tif (n \u003c 3) {\n\t\treturn 1;\n\t}\n\n\tfor (i = 3; i \u003c= n; i++) {\n\t\tfn = fn2 + fn1;\n\t\tfn2 = fn1;\n\t\tfn1 = fn;\n\t}\n\n\treturn fn;\n} \n```\n\nMost of the time recursion has a high cost, so consider whether the same thing can be done iteratively.\n\nRecursion is still a good fit for complex algorithms whose iterative implementation is complicated and usually requires explicit use of a stack:\n\n- Tree traversals.\n\n- Divide and Conquer (Ex: Quicksort).\n\nTo analyze a recursive algorithm, we need to use a Recurrence Equation.\n\n### Recurrence Equation\n\nIt is a way to define a function by an expression involving the same function with smaller inputs.\n\nThe recurrence equation is divided into two parts:\n\n- **Base case:** the part in which the equation has a direct solution for a given input value.\n\n- **Recurrence:** the part in which the solution for an input **n** is expressed as a function of the solution for smaller inputs.\n\n#### Example: Generic Function\n\n```\nT(n) = T(n - 1) + n, for n \u003e 1\nT(n) = 1, for n ≤ 1\n```\n\n```\nT(1) = 1\nT(2) = T(1) + 2 = 3\nT(3) = T(2) + 3 = 6\nT(4) = T(3) + 4 = 10\n...\n```\n\nIn general, **T(n) = 1 + 2 + ... + n = n(n + 1) / 2**, which is **O(n²)**.\n\n#### Example: Factorial\n\n- Algorithm:\n\n```c\nint fat (int n) {\n\tif (n \u003c= 0) {\n\t\treturn 1;\n\t} else {\n\t\treturn n * fat(n-1);\n\t}\n}\n```\n\n- Recurrence Equation:\n\n```\nT(n) = 1 + T(n - 1), for n \u003e 0\nT(n) = 0, for n ≤ 0\n```\n\n#### Example: Fibonacci\n\n- Algorithm:\n\n```c\nint Fib(int n) {\n\tif (n \u003c 3) {\n\t\treturn 1;\n\t} 
else {\n\t\treturn Fib(n-1) + Fib(n-2);\n\t}\n}\n```\n\n- Recurrence Equation:\n\n```\nT(n) = T(n - 1) + T(n - 2) + c, for n \u003e 2\nT(n) = d, for n ≤ 2\n\nwhere c and d are constants\n```\n\nWith that in mind, how do we solve recurrence equations to find the complexity?\n\n### Terms Expansion\n\n- In a given recurrence, expand the terms to obtain terms with smaller inputs.\n\n- Repeat the process until you get to the base case.\n\n- Substitute the terms already found for the smaller inputs.\n\n- Sum the costs of all the terms.\n\n- Calculate the closed form of the summation.\n\n#### Factorial\n\n- Algorithm:\n\n```c\nint fat (int n) {\n\tif (n \u003c= 0) {\n\t\treturn 1;\n\t} else {\n\t\treturn n * fat(n-1);\n\t}\n}\n```\n\n- Recurrence Equation:\n\n```\nT(n) = c + T(n - 1), for n \u003e 0\nT(n) = d, for n ≤ 0\n\nwhere c and d are constants\n```\n\n- Equation Solving:\n\n```\nT(n) = c + T(n-1)\nT(n-1) = c + T(n-2)\nT(n-2) = c + T(n-3)\n...\nT(1) = c + T(0)\nT(0) = d\n```\n\n- Complexity Analysis:\n\n```\nT(n) = c + c + c + c + ... 
+ c + d\n\nT(n) = n * c + d\n\nO(n)\n```\n\nWe find that the complexity of the recursive factorial algorithm is the same as that of the iterative version, that is, **O(n)**.\n\nNow let's look at the spatial complexity of the recursive version.\n\nSince no call returns until the last one reaches the stop condition, all the calls are on the execution stack at the same time. The spatial complexity of the recursive factorial algorithm is therefore **O(n)**, because the stack grows to the number of times the function is called.\n\n### Master Theorem\n\nThis is the easiest way to solve recurrences of the type:\n\n```\nT(n) = aT(n/b) + f(n)\n\nwhere a ≥ 1, b \u003e 1 and f(n) is positive\n```\n\nThis kind of recurrence usually comes from algorithms with a \"divide and conquer\" approach.\n\n- The problem is divided into **a** sub-problems.\n- Each sub-problem has a size of **n/b**.\n- Each call does work of cost **f(n)**.\n- The base case, usually omitted, has a constant cost for small values of **n**: **T(n) = c, n \u003c k**.\n\n```\nT(n) = aT(n/b) + f(n)\n```\n\nThe theorem compares the function \u003cimg src=\"https://render.githubusercontent.com/render/math?math=f(n)\" \u003e with the term \u003cimg src=\"https://render.githubusercontent.com/render/math?math=n^{log_{b}a}\" \u003e.\n\n**Obs:** In Case 3, it must also satisfy the Regularity Condition: \u003cimg src=\"https://render.githubusercontent.com/render/math?math=af(n/b)\\leq cf(n), c\u003c1, n\u003en_{0}\" \u003e\n\n#### Case 1\n\nIf \u003cimg src=\"https://render.githubusercontent.com/render/math?math=f(n) = O(n^{log_{b}a-\\varepsilon }) \\rightarrow T(n) = \\Theta (n^{log_{b}a})\" \u003e.\n\nIn this case, \u003cimg src=\"https://render.githubusercontent.com/render/math?math=f(n)\" \u003e is polynomially smaller than \u003cimg src=\"https://render.githubusercontent.com/render/math?math=n^{log_{b}a}\" \u003e.\n\n#### Case 2\n\nIf \u003cimg src=\"https://render.githubusercontent.com/render/math?math=f(n) = \\Theta(n^{log_{b}a}) \\rightarrow 
T(n) = \\Theta (n^{log_{b}a} * log_{.}n)\" \u003e.\n\n#### Case 3\n\nIf \u003cimg src=\"https://render.githubusercontent.com/render/math?math=f(n) = \\Omega (n^{log_{b}a+\\varepsilon }) \\rightarrow T(n) = \\Theta (f(n))\" \u003e.\n\nIn this case, \u003cimg src=\"https://render.githubusercontent.com/render/math?math=f(n)\" \u003e is polynomially bigger than \u003cimg src=\"https://render.githubusercontent.com/render/math?math=n^{log_{b}a}\" \u003e.\n\n#### General Concepts\n\n- **Intuition:** The function \u003cimg src=\"https://render.githubusercontent.com/render/math?math=f(n)\" \u003e is compared with \u003cimg src=\"https://render.githubusercontent.com/render/math?math=n^{log_{b}a}\" \u003e and the larger of the two functions determines the solution of the recurrence. In case the two functions are equivalent, the solution is \u003cimg src=\"https://render.githubusercontent.com/render/math?math=n^{log_{b}a}\" \u003e times a logarithmic factor.\n\n- **Details:** In cases 1 and 3, the function f(n) must be polynomially smaller/bigger than \u003cimg src=\"https://render.githubusercontent.com/render/math?math=n^{log_{b}a}\" \u003e. 
Besides, in case 3 the function must satisfy the regularity condition.\n\n#### Example 1:\n\n```\nT(n) = 9T(n/3) + n\n\na = 9\nb = 3\nf(n) = n\n```\n\nWe will find \u003cimg src=\"https://render.githubusercontent.com/render/math?math=n^{log_{b}a}=n^{log_{3}9}=n^2\" \u003e.\n\nSo, with \u003cimg src=\"https://render.githubusercontent.com/render/math?math=\\varepsilon =1\" \u003e,\n\nWe have \u003cimg src=\"https://render.githubusercontent.com/render/math?math=f(n) = O(n^{log_{b}a-\\varepsilon }) = O(n^{2-1}) = O(n)\" \u003e.\n\nWe can conclude that it falls under **Case 1:** \u003cimg src=\"https://render.githubusercontent.com/render/math?math=T(n) = \\Theta (n^{log_{b}a})=\\Theta (n^{2})\" \u003e.\n\n#### Example 2:\n\n```\nT(n) = 2T(n/2) + n - 1\n\na = 2\nb = 2\nf(n) = n - 1\n```\n\nWe will find \u003cimg src=\"https://render.githubusercontent.com/render/math?math=n^{log_{b}a}=n^{log_{2}2}=n\" \u003e.\n\nSo, we have \u003cimg src=\"https://render.githubusercontent.com/render/math?math=f(n) = \\Theta(n^{log_{b}a}) = \\Theta(n)\" \u003e.\n\nWe can conclude that it falls under **Case 2:** \u003cimg src=\"https://render.githubusercontent.com/render/math?math=T(n) = \\Theta (n^{log_{b}a}*log_{.}n)=\\Theta (nlog_{.}n)\" \u003e.\n\n#### Example 3:\n\n```\nT(n) = 3T(n/4) + n log n\n\na = 3\nb = 4\nf(n) = n log n\n```\n\nWe will find \u003cimg src=\"https://render.githubusercontent.com/render/math?math=n^{log_{b}a}=n^{log_{4}3}=n^{0.793}\" \u003e.\n\nSo, with \u003cimg src=\"https://render.githubusercontent.com/render/math?math=\\varepsilon =0.207\" \u003e,\n\nWe have \u003cimg src=\"https://render.githubusercontent.com/render/math?math=f(n) = \\Omega(n^{log_{b}a+\\varepsilon }) = \\Omega(n^{0.793+0.207}) = \\Omega(n)\" \u003e.\n\nThe regularity condition also holds, since **3(n/4) log(n/4) \u003c= (3/4) n log n**, that is, **c = 3/4**.\n\nWe can conclude that it falls under **Case 3:** \u003cimg src=\"https://render.githubusercontent.com/render/math?math=T(n) = \\Theta (f(n))=\\Theta (nlog_{.}n)\" 
\u003e.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fguilhermebkel%2Fdata-structures-study","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fguilhermebkel%2Fdata-structures-study","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fguilhermebkel%2Fdata-structures-study/lists"}