https://github.com/dhaitz/data-science-links
A curated list of links to great data science articles, videos, ...
agile ai artificial-intelligence career-advice data-science data-scientists machine-learning
- Host: GitHub
- URL: https://github.com/dhaitz/data-science-links
- Owner: dhaitz
- Created: 2019-04-28T12:51:13.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2024-12-07T14:44:12.000Z (about 2 months ago)
- Last Synced: 2024-12-07T15:27:46.190Z (about 2 months ago)
- Topics: agile, ai, artificial-intelligence, career-advice, data-science, data-scientists, machine-learning
- Homepage: https://data-science-links.netlify.app
- Size: 141 KB
- Stars: 24
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Data Science Links
Links to great data science articles or videos. Feel free to add stuff via pull request [here](https://github.com/dhaitz/data-science-links).
**Contents**
- [General Advice on Data Science](#general-advice-on-data-science)
- [Career Advice for Data Science and IT](#career-advice-for-data-science-and-it)
- [Agile Practices for Data Science](#agile-practices-for-data-science)
- [Lessons Learned](#lessons-learned)
- [Navigating Corporate Politics](#navigating-corporate-politics)
- [Soft Skills](#soft-skills)
- [Presentation Skills](#presentation-skills)
- [Business Acumen](#business-acumen)
- [Technical and Machine Learning Advice](#technical-and-machine-learning-advice)
  - [Specific areas and tools](#specific-areas-and-tools)
    - [git](#git)
    - [Code Reviews](#code-reviews)
    - [GenAI & LLMs](#genai--llms)
- [Software Engineering Lessons Learned](#software-engineering-lessons-learned)
- [Blogs and Info Sources](#blogs-and-info-sources)

## General Advice on Data Science
* [What They Don't Tell You About Data Science 1: You Are a Software Engineer First](http://nadbordrozd.github.io/blog/2017/12/05/what-they-dont-tell-you-about-data-science-1/) - Highlighting the importance of software engineering skills for data scientists.
* [Why Data Science Teams Need Generalists, Not Specialists](https://hbr.org/2019/03/why-data-science-teams-need-generalists-not-specialists) - As data science is mostly about prototyping, iterating and learning on-the-fly, generalists are better suited than specialists.
* [Bridging the Gap: from Data Science to Production](https://florianwilhelm.info/2018/07/bridging_the_gap_from_ds_to_prod/) - Advice on how to best bridge the "deployment gap" in data science.
* [Compilation of Advice for New and Aspiring Data Scientists](https://towardsdatascience.com/compilation-of-advice-for-new-and-aspiring-data-scientists-5ea75a3925c4) - Another great collection of advice on building human capital, choosing a company, and acquiring new skills.
* [How can Data Scientists survive layoffs?](https://peadarcoyle.com/2020/04/26/how-can-data-scientists-survive-layoffs/) – How to make data science achieve a real business impact
* [How do we deliver Data Science in the Enterprise](https://peadarcoyle.com/2017/08/07/how-do-we-deliver-data-science-in-the-enterprise/) – How to overcome the challenges data science teams often face in an established enterprise.

## Career Advice for Data Science and IT
* [Don't Call Yourself A Programmer, And Other Career Advice](https://www.kalzumeus.com/2011/10/28/dont-call-yourself-a-programmer/) - Real-world career advice for engineers and IT professionals. Highlights the importance of communication, negotiating and self-marketing.
* [Miscellaneous unsolicited (and possibly biased) career advice](https://erikbern.com/2019/09/12/misc-unsolicited-career-advice.html) - Another great collection of advice pieces.
* [So Good They Can't Ignore You](https://commoncog.com/blog/so-good-they-cant-ignore-you/) - A summary of Cal Newport's book of the same name. Newport argues that "following your passion" rarely leads to job satisfaction. Instead, mastering a rare and valuable skill builds 'career capital': if you're really good at what you do, you'll feel *competent* and will typically be granted *autonomy*, both key factors for occupational happiness. Therefore, pick a career path where you can deliberately improve your skills, and aim to be the best in the world at what you do.

## Agile Practices for Data Science
* [Evolution from #NoProjects to Continuous Digital](https://www.youtube.com/watch?v=eeK1-9QxLGE) - One of many great talks by Allan Kelly, one of the figureheads of the #beyondprojects movement. He explains why the project model leads to countless problems in today's fast-paced digital economy and suggests a better solution.
* [A manifesto for Agile data science](https://www.oreilly.com/ideas/a-manifesto-for-agile-data-science) - Seven principles to help adopt agile practices in data science.
* [Datenprodukte bauen mit Agile Data Science](https://www.inovex.de/fileadmin/files/Vortraege/2018/datenprodukte-bauen-mit-agile-data-science-tempich-29.06.2018.pdf) - Organizational conditions for building data products based on artificial intelligence (in German).
* [What is Minimum Viable (Data) Product?](https://medium.com/idealo-tech-blog/what-is-minimum-viable-data-product-49269e338d85) - Adapting the agile concept of a Minimum Viable Product (MVP) for data science.
* [Agiles Management: 7 Prinzipien, die man kennen sollte](https://strato.de/blog/agiles-management/) - Seven principles of agile working (in German).
* [8 “Simple” Guidelines For Data Projects](https://builttoadapt.io/8-simple-guidelines-for-data-projects-859a1a738ffc) - Best practices for building impactful data products.

## Lessons Learned
* [Lessons learned in my first year as a data scientist](http://tommyblanchard.com/lessons-learned-in-my-first-year-as-a-data-scientist) - Lessons learned about data science projects in a business environment.
* [The First Year — A Retrospective Into My First Real Job as a Data Scientist](https://medium.com/@benjaminjiang/the-first-year-a-retrospect-into-my-first-real-job-as-a-data-scientist-d74c3179b461) - Some more lessons learned ("The most important thing a data scientist has to do is to get buy-in from management").
* [Advice for New Data Scientists](https://medium.com/airbnb-engineering/new-data-scientists-tips-for-success-5f898b6a33f3) - 4 general tips for new data scientists.
* [Lessons from My First Two Years of AI Research](http://web.mit.edu/tslvr/www/lessons_two_years.html)
* [Top 5 Mistakes of Greenhorn Data Scientists](https://towardsdatascience.com/top-5-mistakes-of-greenhorn-data-scientists-90fa26201d51)

## Navigating Corporate Politics
* [The most difficult thing in data science: politics](https://www.rdisorder.eu/2017/09/13/most-difficult-thing-data-science-politics/) - How to navigate corporate politics as a data scientist.
* [How To Survive Corporate Politics as A Data Scientist](https://towardsdatascience.com/how-to-survive-corporate-politics-as-a-data-scientist-ba914fac2471) - 30 tips.
* [Your Data Science Project Will Fail Unless It Meets These 3 Conditions](https://towardsdatascience.com/3-conditions-for-data-science-project-success-e31d3a798ec2) - Rules for successful data science projects in a corporate environment.
* [How to Increase Your Influence at Work](https://hbr.org/2018/02/how-to-increase-your-influence-at-work)
* [Eiffel's Tower](https://www.youtube.com/watch?v=dLPi4lfk8is) – Using the construction of the Eiffel Tower as an example, Nickolas Means illustrates the importance of politics in engineering. But "Every organization is political" doesn't mean you have to participate in scheming and backstabbing. For healthy politics, reframe *Networking* as *Making Friends*, *Self-Promotion* as *Telling Stories* and *Negotiation* as *Cooperation*. As an engineering manager, rather than assuming the role of *shit umbrella* and protecting your team from all politics, be a *heat shield* that takes the pressure off your team when it becomes overwhelming.
* [Failed "Squad Goals"](https://www.jeremiahlee.com/posts/failed-squad-goals/) - Spotify doesn’t use “the Spotify model” and neither should you.
* [The Hierarchy Is Bullshit (And Bad For Business)](https://charity.wtf/2022/09/23/the-hierarchy-is-bullshit-and-bad-for-business/) - About perceiving the corporate hierarchy as a neutral data structure instead of a dominance ladder.

## Soft Skills
* [To Improve Your Team, First Work on Yourself](https://hbr.org/2019/01/to-improve-your-team-first-work-on-yourself) - How internal self-awareness, external self-awareness, and personal accountability can improve team dynamics.
* [How to Develop the Five Soft Skills That Will Make You a Great Analyst](https://mode.com/blog/how-to-develop-the-five-soft-skills-that-will-make-you-a-great-analyst) - 5 points with advice on self-evaluation and improvement.
* [How to Work With Stakeholders as a Data Scientist](https://towardsdatascience.com/how-to-work-with-stakeholders-as-a-data-scientist-13a1769c8152) - On communication, expectation management and other important topics when working with stakeholders.
* [How I increased my impact as a data scientist with one question](https://towardsdatascience.com/how-i-increased-my-impact-as-a-data-scientist-with-one-question-d0417a1b10cb) - Ask stakeholders: “Imagine we’ve done all the hard work and developed a perfect solution. What will you do with it?”

## Presentation Skills
* [Presentation Tips for Data Professionals](https://data36.com/presentation-tips-for-data-professionals/) - Some general advice on presentation content, design and delivery (not necessarily restricted to data science).
* [On Presentations](https://www.beautiful.ai/player/-LiSV45O9K1sE8uv5oMj/On-Presentations) – Practice your storytelling skills to engage your audience.
* [Fix Your Really Bad PowerPoint](https://www.slideshare.net/slidecomet/fix-your-really-bad-powerpoint-slidecomet-based-on-an-ebook-by-sethgodin) and [7 tips to create visual presentations](https://www.slideshare.net/EmilandDC/7-tips-to-create-visual-presentations/42-TIPSTools) - About slide design
* [What makes a good talk?](http://www.neurotheory.ox.ac.uk/~timv/pub/TalkTips.pdf) – A one-page compilation of presentation advice
* [A beginner’s guide to making beautiful slides for your talks](https://ines.io/blog/beginners-guide-beautiful-slides-talks/) by Ines Montani. There's also [part 2](https://ines.io/blog/beautiful-slides-talks-part-2-aesthetics/) and a [lightning talk](https://speakerdeck.com/inesmontani/lightning-talk-beautiful-slides-for-beginners).

## Business Acumen
* [10 Reads for Data Scientists Getting Started with Business Models](https://www.conordewey.com/posts/2019/5/17/10-reads-for-data-scientists-getting-started-with-business-models) - Great collection of articles on business topics.

## Technical and Machine Learning Advice
* [How becoming not a data scientist made me a better data scientist](https://docs.google.com/presentation/d/1jk-qrVKCb0-P9P4BVzH75gcVhp5Dy5n1CP_gKnHMNY0/edit#slide=id.p) - Coding advice for data scientists by Joel Grus.
* [Machine Learning: The High Interest Credit Card of Technical Debt](https://ai.google/research/pubs/pub43146) - The famous Google paper on technical debt in machine learning systems.
* [Rules of Machine Learning: Best Practices for ML Engineering](http://martin.zinkevich.org/rules_of_ml/rules_of_ml.pdf) - 43 rules for implementing ML products.
* [Software Engineering for Machine Learning: A Case Study](https://www.microsoft.com/en-us/research/publication/software-engineering-for-machine-learning-a-case-study/) - Survey results from Microsoft's AI practitioners, including best practices.
* [Software Engineering for Data Scientists - Small Big Data: large data on a single computer](https://pythonspeed.com/datascience) - Several guides on how to reduce memory usage with e.g. pandas.
* [Coding habits for data scientists](https://www.thoughtworks.com/insights/blog/coding-habits-data-scientists) - From messy notebook prototyping to production-ready code. The author also created [Clean Code concepts adapted for machine learning and data science](https://github.com/davified/clean-code-ml), including a [video series](https://www.youtube.com/watch?v=Edn6XxWmtEs&list=PLO9pkowc_99ZhP2yuPU8WCfFNYEx2IkwR&index=2).
* [How to pick more beautiful colors for your data visualizations](https://blog.datawrapper.de/beautifulcolors/) – Common color mistakes and how to avoid them.
* [Blockchain, the amazing solution for almost nothing](https://thecorrespondent.com/655/blockchain-the-amazing-solution-for-almost-nothing)
* [Cognitive Load is what matters](https://github.com/zakirullin/cognitive-load) – Optimize your code to reduce cognitive load
* [Write code that is easy to delete, not easy to extend](https://programmingisterrible.com/post/139222674273/write-code-that-is-easy-to-delete-not-easy-to)

### Specific areas and tools
#### git
- [Modern Git Commands and Features You Should Be Using](https://martinheinz.dev/blog/109): git switch, restore, sparse checkout, worktree and bisect explained
- [Popular git config options](https://jvns.ca/blog/2024/02/16/popular-git-config-options/)

#### Code Reviews
- [How to Do Code Reviews Like a Human](https://mtlynch.io/human-code-reviews-1/) ([Part 2](https://mtlynch.io/human-code-reviews-2/)) and [Code Review Guidelines for Humans](https://phauer.com/2018/code-review-guidelines/) focus on the human-interaction part of code reviews
- [The Code Review Pyramid](https://www.morling.dev/blog/the-code-review-pyramid/) points out which software aspects we should (or should not) focus on

#### GenAI & LLMs
- [Catching up on the weird world of LLMs](https://simonwillison.net/2023/Aug/3/weird-world-of-llms/) – Simon Willison's overview talk on LLMs (August 2023)
- [The 70% problem: Hard truths about AI-assisted coding](https://addyo.substack.com/p/the-70-problem-hard-truths-about) – TLDR: "AI isn't making our software dramatically better because software quality was (perhaps) never primarily limited by coding speed. The hard parts of software development – understanding requirements, designing maintainable systems, handling edge cases, ensuring security and performance – still require human judgment. [...] Remember: The goal isn't to write more code faster. It's to build better software. Used wisely, AI can help us do that."

## Software Engineering Lessons Learned
* [Individual Contributor principles from 25 years in software development](https://github.com/threddyrex/docs/blob/main/career-principles.md)
* [20 Things I’ve Learned in my 20 Years as a Software Engineer](https://www.simplethread.com/20-things-ive-learned-in-my-20-years-as-a-software-engineer/)
* [What I’ve Learned in 45 Years in the Software Industry](https://www.bti360.com/what-ive-learned-in-45-years-in-the-software-industry/)
* [Software engineering practices](https://simonwillison.net/2022/Oct/1/software-engineering-practices)
* [One Way Smart Developers Make Bad Strategic Decisions](https://earthly.dev/blog/see-state/) – Why top-down standardization efforts and one-size-fits-all solutions often fail
* [Don’t Feed the Thought Leaders](https://earthly.dev/blog/thought-leaders/) – About the value of contingent, context-specific advice
* [Lessons learned in 35 years of making software](https://dev.jimgrey.net/2024/07/03/lessons-learned-in-35-years-of-making-software/) – About simplicity, building relationships and making your work visible

## Blogs and Info Sources
* [inovex Analytics Blog](https://www.inovex.de/blog/category/analytics/)
* [codecentric Data Blog (in German)](https://blog.codecentric.de/category/data/)
* [Reddit Data Science Forum](https://reddit.com/r/datascience/) - Also: Reddit forums for [data visualization](https://reddit.com/r/dataisbeautiful/) and [machine learning](https://reddit.com/r/MachineLearning/).