{"id":15642166,"url":"https://github.com/parrt/msds689","last_synced_at":"2025-07-13T05:36:23.569Z","repository":{"id":53764675,"uuid":"164485064","full_name":"parrt/msds689","owner":"parrt","description":"Course syllabus, notes, projects for USF's MSDS689","archived":false,"fork":false,"pushed_at":"2024-03-01T19:40:35.000Z","size":240975,"stargazers_count":62,"open_issues_count":0,"forks_count":73,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-15T06:54:51.674Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/parrt.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-01-07T19:51:52.000Z","updated_at":"2024-08-04T23:25:00.000Z","dependencies_parsed_at":"2024-10-23T00:30:11.165Z","dependency_job_id":null,"html_url":"https://github.com/parrt/msds689","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/parrt/msds689","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parrt%2Fmsds689","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parrt%2Fmsds689/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parrt%2Fmsds689/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parrt%2Fmsds689/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/parrt","download_url":"https://codeload.github.com/parrt/msds689/tar.gz/refs/heads/master"
,"sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parrt%2Fmsds689/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265096036,"owners_count":23710768,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-03T11:55:06.439Z","updated_at":"2025-07-13T05:36:20.258Z","avatar_url":"https://github.com/parrt.png","language":"Jupyter Notebook","readme":"MSDS689 Data Structures and Algorithms\n=======\n\nThis course is part of the [MS in Data Science program at the University of San Francisco](https://www.usfca.edu/arts-sciences/graduate-programs/data-science).\n\nThe goal is to give students a deeper and more general view of data structures and algorithms. While students have examined a number of data structures, such as binary trees, already, this course provides a much more in-depth study. This depth will benefit them greatly in the advanced machine learning course. This course also tends to address many of the difficult algorithm questions students get during job interviews. The critical data structures covered in this class are: lists, linked lists, trees, graphs, hash tables, and tries. The course also covers a variety of common and useful recursive and non-recursive algorithms, such as searching and sorting.\n\n\u003ctable\u003e\u003ctr\u003e\u003ctd\u003e\n\u003cimg src=\"student-trie.png\" width=\"350\"\u003e\n\u003c/td\u003e\u003c/tr\u003e\u003c/table\u003e\n\nAt the end of the last module, we studied feature importance and briefly touched on clustering.  
There was not enough time to study these topics in more detail through projects, but they are important and students report getting related interview questions extremely frequently. Consequently, while seemingly a bit out of place, the faculty has decided to include free-form reports on these machine-learning-related topics in this data structures and algorithms class. Here are two quick comments I received from the last cohort concerning these projects:\n\n\u003ctable\u003e\u003ctr\u003e\u003ctd\u003e\n\u003cimg src=\"student-featimp.png\" width=\"600\"\u003e\n\u003c/td\u003e\u003c/tr\u003e\u003c/table\u003e\n\n\u003ctable\u003e\u003ctr\u003e\u003ctd\u003e\n\u003cimg src=\"student-kmeans.png\" width=\"550\"\u003e\n\u003c/td\u003e\u003c/tr\u003e\u003c/table\u003e\n\n# Course Learning Objectives (CLOs)\n\nBy the end of this course, students will be able to:\n\n1. Describe object-oriented programming and use objects to construct linked lists, trees, and graph data structures\n2. Identify the essential logic and algorithms in the code of others\n3. Implement critical algorithms, such as sorting, searching, and list manipulations\n4. Create Trie data structures\n5. Design recursive algorithms\n6. Analyze and compare algorithmic complexity using “big-O” notation\n\n\n\n# Class details\n\n**INSTRUCTOR.** [Terence Parr](http://parrt.cs.usfca.edu). I’m an adjunct professor in the [data science program](https://www.usfca.edu/arts-sciences/graduate-programs/data-science) and was founding director of the MS in Data Science program at USF (which became the MS data science program).  
Please call me Terence or Professor (not “Terry”).\n\n**SPATIAL COORDINATES:**\u003cbr\u003e\n\nClass is held at 101 Howard on the main floor, 154-156.\n\n\u003c!--\n* Class is held at 101 Howard in 5th floor classroom 527.\n* My office is room 607 @ 101 Howard up on mezzanine\n--\u003e\n\n**TEMPORAL COORDINATES.** Fri January 28, 2022 - Fri March 11, 2022\n\nLectures are Fridays 10-11:50AM California time\n\n\u003c!--\n* Section 01: Fri 10-11:55AM\n* Section 02: Fri 2:20-4:15PM \n* Exams: Thur Feb 13 3:15-4:15PM, Fri Mar 6 in class 10-11:30am for both classes at once\n--\u003e\n\n**INSTRUCTION FORMAT**. Class runs for 1 hour 50 minutes, one day per week. Instructor-student interaction during lecture is encouraged and we'll mix in mini-exercises / labs during class. All programming will be done in the Python 3 programming language, unless otherwise specified.\n\n**TARDINESS.** Please be on time for class. It is a big distraction if you come in late.\n\n## Student evaluation\n\n| Artifact | Grade Weight | Due date |\n|--------|--------|--------|\n|[OO hashtable](projects/oohtable/oohtable.md)| 8% | Fri, Feb 4 11:59pm |\n|[Clustering](projects/kmeans/kmeans.md)| 17% | Sun, Feb 20 11:59pm |\n|[Feature selection and importance](projects/featimp/featimp.md)| 20% | Wed, Mar 9 11:59pm |\n|Exam 1| 25%| 4:30PM-5:30PM Thur Feb 17 |\n|Exam 2| 30%| 10AM-11:00AM Fri Mar 11 |\n\nNote: In order to pass this course, students are expected to receive at least 50% in each category. For example, scoring less than 29/60 on a quiz might not be a satisfactory grade.\n\nThe oohtable project will be graded with the specific input or tests given in the project description, so you understand precisely what is expected of your program. (There are a few hidden tests though.) Consequently, projects will be graded in binary fashion: The tests either work or they do not. The only exception is when your program does not run on the grader's or my machine because of some cross-platform issue. 
This is typically because a student has hardcoded some file name or directory into their program. In that case, we will take off *a minimum* of 10% instead of giving you a 0, depending on the severity of the mistake. Please go to GitHub and verify that the website has the proper files for your solution. That is what I will download for testing.\n\nEach project has a hard deadline and only those projects working correctly before the deadline get credit.  My grading script pulls from GitHub at the deadline.  *All projects are due at the start of class on the day indicated, unless otherwise specified.*\n\n**Grading standards**. I consider an **A** grade to be above and beyond what most students have achieved. A **B** grade is an average grade for a student or what you could call \"competence\" in a business setting. A **C** grade means that you either did not or could not put forth the effort to achieve competence. Below **C** implies you did very little work or had great difficulty with the class compared to other students.\n\n# Review and Resources\n\n* Review the [OO notebook](https://github.com/parrt/msds501/blob/master/notes/OO.ipynb) and the [Operator overloading notebook](notes/operator-overloading.ipynb)\n\nHere is a great free book on [algorithms by Jeff Erickson](http://jeffe.cs.illinois.edu/teaching/algorithms/).\n\n\u003c!--\nI'll be pulling some material from Kleinberg and Tardos, *Algorithm Design*. US copyright law allows me to make copies of small portions of books for educational use, which I have done. Please see compressed pdf `kleinberg-common-running-times.7z` in Canvas course files area. 
(Do not post that material publicly.)\n--\u003e\n\nA very useful set of [programming-concepts-for-data-science](https://github.com/shik3519/programming-concepts-for-data-science/blob/master/notebooks/03-common%20datastructures%20and%20algorithms.ipynb) and [data science coding questions](https://github.com/shik3519/programming-concepts-for-data-science/blob/master/notebooks/04-coding%20questions%20for%20DS%20interview.ipynb) by former USF MSDS student [Shikhar Gupta](https://github.com/shik3519).\n\n[10 Steps to Solving a Programming Problem](https://codeburst.io/10-steps-to-solving-a-programming-problem-8a32d1e96d74)\n\n# Syllabus\n\n* [Welcome!](notes/welcome.pdf) (Day 1)\n* Hashtable\n  * Review [hash table construction from MSDS692 project](https://github.com/parrt/msds692/blob/master/hw/search.md#creating-an-index-using-your-own-hashtable)\n  * discuss [OO htable project](https://github.com/parrt/msds689/blob/master/projects/oohtable/oohtable.md)\n* [How to read code](notes/reading-code.pdf) (Day 1)\n  * code refactoring on-the-fly for whole MSDS692 search project\n* [Problem-solving](notes/problem-solving.pdf) (Day 1)\n  * [Problem-solving exercise: LeetCode palindromes](https://github.com/parrt/msds689/blob/master/labs/problem-solving-palindromes.ipynb)\n  * [Practice Quiz: OO programming](https://github.com/parrt/msds689/blob/master/labs/quiz-oo.ipynb)\n* [Core data structures](notes/core-data-structures.pdf) (Day 2)\n  * [Problem-solving exercise: LeetCode Merge two sorted linked lists](https://leetcode.com/problems/merge-two-sorted-lists/)\n  * [Review binary trees from MSDS621](https://github.com/parrt/msds621/blob/master/labs/trees/binary-trees.ipynb)\n  * [Practice Quiz: core data structures](https://github.com/parrt/msds689/blob/master/labs/quiz-core-structures.ipynb)\n* [Introduction to algorithm complexity](notes/complexity.pdf) (“Big-O” notation)  (Day 2,3)\n  * [Measuring execution time notebook](notes/execution-time.ipynb)\n  * [Practice Quiz: 
complexity](https://github.com/parrt/msds689/blob/master/labs/quiz-complexity.ipynb)\n* [Getting a grip on recursion](notes/recursion.pdf) (Day 3)\n  * [Recursion notebook](https://github.com/parrt/msds689/blob/master/notes/recursion-notebook.ipynb)\n  * [Problem-solving exercise: LeetCode Reverse linked list recursively](https://leetcode.com/explore/learn/card/recursion-i/251/scenario-i-recurrence-relation/2378/)\n  * [Problem-solving exercise: LeetCode Maximum Depth of Binary Tree Solution](https://leetcode.com/explore/learn/card/recursion-i/256/complexity-analysis/2375/)\n* [Walking data structures](notes/walking-structures.pdf) (Day 4)\n  * [Walking notebook](https://github.com/parrt/msds689/blob/master/notes/walking.ipynb)\n* [Sorting](notes/sorting.pdf) (Day 4)\n  * [Sorting notebook](https://github.com/parrt/msds689/blob/master/notes/sorting.ipynb)\n* [Searching](notes/searching.pdf) (Day 5)\n  * [Searching notebook](https://github.com/parrt/msds689/blob/master/notes/searching.ipynb)\n  * Instead of LeetCode, implement the trie search mentioned in the slides\n* [Graphs](notes/graphs.pdf) (Day 5, 6)\n  * [Graphs notebook](https://github.com/parrt/msds689/blob/master/notes/graphs.ipynb)\n  * [Problem-solving exercise: LeetCode backtracking](https://leetcode.com/explore/interview/card/top-interview-questions-medium/109/backtracking/795/) (Walking graphs often involves backtracking)\n\nThe use of recursive algorithms will be emphasized frequently and wherever appropriate to reinforce this critical mechanism. Discussion of formal algorithm complexity will also be emphasized throughout the lectures.\n\n# Administrivia\n\n**ACADEMIC HONESTY.** You must abide by the copyright laws of the United States and the academic honesty policies of USF. You may not copy code from other current or previous students. All suspicious activity will be investigated and, if warranted, passed to the Dean of Sciences for action.  
Copying answers or code from other students or sources during a quiz or exam, or for a project, is a violation of the university’s honor code and will be treated as such. Plagiarism consists of copying material from any source and passing off that material as your own original work. Plagiarism is plagiarism: it does not matter if the source being copied is on the Internet, from a book or textbook, or from quizzes or problem sets written up by other students. Giving code or showing code to another student is also considered a violation.\n\nThe golden rule: **You must never represent another person’s work as your own.**\n\nIf you ever have questions about what constitutes plagiarism, cheating, or academic dishonesty in my course, please feel free to ask me.\n\n**Note:** Leaving your laptop unattended is a common means for another student to take your work. It is your responsibility to guard your work. Do not leave your printouts lying around or in the trash. *All persons with common code are likely to be considered at fault.*\n\n**USF policies and legal declarations**\n\n*Students with Disabilities*\n\nIf you are a student with a disability or disabling condition, or if you think you may have a disability, please contact \u003ca href=\"/sds\"\u003eUSF Student Disability Services\u003c/a\u003e (SDS) for information about accommodations.\n\n*Behavioral Expectations*\n\nAll students are expected to behave in accordance with the \u003ca href=\"/fogcutter\"\u003eStudent Conduct Code\u003c/a\u003e and other University policies.\n\n*Academic Integrity*\n\nUSF upholds the standards of honesty and integrity from all members of the academic community. 
All students are expected to know and adhere to the University's \u003ca href=\"/academic-integrity/\"\u003eHonor Code\u003c/a\u003e.\n\n*Counseling and Psychological Services (CAPS)*\n\nCAPS provides confidential, free \u003ca href=\"/student-health-safety/caps\"\u003ecounseling\u003c/a\u003e to student members of our community.\n\n*Confidentiality, Mandatory Reporting, and Sexual Assault*\n\nFor information and resources regarding sexual misconduct or assault, visit the \u003ca href=\"/TITLE-IX\"\u003eTitle IX\u003c/a\u003e coordinator or USF's \u003ca href=\"http://usfca.callistocampus.org\" target=\"_blank\"\u003eCallisto website\u003c/a\u003e.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparrt%2Fmsds689","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fparrt%2Fmsds689","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparrt%2Fmsds689/lists"}