Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

awesome-data-leadership

A curated list of awesome posts, videos, and articles on leading a data team (small and large)
https://github.com/ronikobrosly/awesome-data-leadership

  • Culture

    • Jacob Kaplan-Moss - interviews-are-a-trap/?utm_source=hackernewsletter&utm_medium=email&utm_term=working) | Rethinking the exit interview: there is very little upside (unlikely things will change) and potentially significant downside (bad blood, retracted references, malicious actions by employer, etc.). | 2022
    • Claire Carroll - education-is-broken/) | The post explores the disconnect between data education and real data practice in industry (e.g. analyzing static flat files in R, Pandas, or SPSS compared with using SQL along with tools like git, dbt, Airflow, VSCode, etc), why this occurs, and the effects it has on the data industry. | 2021
    • Kuba Niechcial - manager.com/2018-04-12/how-to-set-goals-for-engineers) | Provides some examples of good engineer personnel goals and things to keep in mind (e.g. KPIs should not be personal goals). | 2021
    • Christoph Neijenhuis - onboarding-retention/how-stop-shrinkage-engineering-teams) | The journey to stopping shrinkage in engineering teams is long and rarely straightforward, but there are practical things leaders can do to take control of the chaos, from taking steps to get out of survival mode and tackling problems around culture to involving teams in the development of a solid technical strategy. | 2022
    • Caitlin Moorman - endedness/opportunities for creativity and standardized rigor when leading a data function. | 2020
    • Shimin Zhang - business-case-for-fewer-developer-meetings/) | Describes the opportunity cost of having all developers or data engineers attending meetings and describes ways to recoup this. | 2022
    • David Waller - Driven Culture](https://hbr.org/2020/02/10-steps-to-creating-a-data-driven-culture) | Details some steps for working towards a data-driven culture, from taking care in choosing metrics to quantifying uncertainty. | 2020
    • Christine Garcia
    • Emily Thompson - proactive-influential?s=r) | Reactive data teams lead to low impact and attrition, so instead acknowledge if your team is reactive, assess reactivity quantitatively, focus on near-term wins for cultural change, and build longer-term foundational work into the team’s capacity | 2022
  • Organization Structure and Job Titles

    • Morgan Krey - and-system-builders/) | There has been a proliferation of "data X" roles (e.g. data engineer, data scientist, data analyst, etc) but the author argues that there are really just two kinds of data practitioners: system builders (your engineers that build pipelines, schedule jobs, stand up APIs, etc.) and storytellers (looking for actionable insights, visualizing data on dashboards, etc). | 2022
    • Rob Dearborn - and-scaling-an-effective-data-team/?utm_campaign=Data_Elixir&utm_source=Data_Elixir_379) | General guidelines on what a properly structured data team should look like, with descriptions ranging from a 1-person data team to a 32+ person team. | 2022
    • Chuong Do - is-the-most-effective-way-to-structure-a-data-science-team-498041b88dae) | Covers how data scientist roles should be defined (analysis vs building), where data scientists should report (centralized vs decentralized), where the data science function should live (engineering org vs product org vs independent consultancy), and what an organization should do to set up data science for success. | 2017
    • Randy Bean - data-officers-struggle-to-make-a-business-impact/?sh=5f7b6d4af1a4) | There is widespread disparity of opinion on what defines a successful Chief Data Officer, so it makes sense that only CDOs are poised for success according to a recent Gartner report. | 2019
    • Matthew Mayo - scientist-data-engineer-data-careers-explained.html) | Explanations of various titles such as Data Architect, Data Engineer, Analyst, ML Engineer, and Data Scientist | 2022
    • Rifat Majumder - product-manager/) | Describes the emerging role of "Data Product Manager" and the benefits it provides an org: better business impact, a deep understanding of customer problems, and more clarity on priorities. | 2021
    • Ben Darfler - levels-at-honeycomb/) | Describes a nice framework for thinking about job levels, based on scope and level of project complexity. | 2022
    • Jorge Fioranelli
    • Pardis Noorzad - for-integrating-data-science-teams-within-organizations-7c5afa032ebd) | Compares different models for situating DS teams including the "center-of-excellence model", the "Accounting model", the "consultant model", the "embedded model", and more, and considers factors like "Coordination efficiency", "Employee happiness", and others. | 2019
    • Kurt Cagle - you-dont-need-data-scientists-a9654cc9f0e4) | Early in an organization's data maturity stage, you don't need "data scientists" and machine learning people, you instead need to focus on data quality and ontological engineering problems. | 2018
    • Michelangelo D'Agostino - malone-46050854/) | [The Care and Feeding of Data Scientists, Chapter 6](https://oreilly-ds-report.s3.amazonaws.com/Care_and_Feeding_of_Data_Scientists.pdf) | "Chutes and Career Ladders" discusses how to write a great career ladder for your team. | 2019
    • Gergely Orosz - silicon-valley-gets-right-on-software-engineers/) | Silicon Valley companies treat engineers as smart, autonomous adults, because those are the people they hire and the people who can do the work they need done, while traditional companies tend to keep developers in pure execution roles. | 2021
    • Brittany Bennett - powerful-data-teams-on-investing-in-junior-talent) | Provides suggestions on developing junior talent: blocking off time for personal development, celebrating this blocked-off time, hiring tutors, and more. | 2021
    • Eric Colson - stack data science generalist and the perils of division of labor through function"](https://multithreaded.stitchfix.com/blog/2019/03/11/FullStackDS-Generalists/) | Beware specialization in data science, as it carries costs: the goal of data science is not to execute but to learn and develop profound new business capabilities. | 2019
    • Mikkel Dengsøe - team-size-at-100-scaleups) | Author analyzed 100 known startups and notes that data team members comprise 1-5% of the company headcount, and this varies industry to industry (details included) | 2023
    • Benn Stancil - technical-pay-gap?s=r) | Describes the current state of confusion around data titles (using the "analytics engineer" as an example), and describes how the tech industry overvalues technical skills at times. | 2022
    • Benjamin Rogojan - types-of-data-engineering) | Post gives nice overview of the various flavors of data engineering roles in organizations (including software engineers, data platform engineers, etc). | 2022
  • ML and AI Within an Organization

    • Monica Rogati - ai-hierarchy-of-needs-18f111fcc007) | Before you can fully get value out of ML/AI in an organization, it is critical to have foundational data needs met (i.e. good data collection processes, checks, and analytics). | 2017
    • Thomas Redman - data-initiatives-cant-just-be-for-data-scientists?utm_campaign=Data_Elixir&utm_source=Data_Elixir_381) | Describes the role and importance of non-data experts in DS projects: collaborators, customers, and as creators of the data. | 2022
    • Natassha Selvaraj - data-scientists-quitting-jobs.html) | Two primary factors drive a number of new data scientists out of the profession: a mismatch between employer and employee expectations around data science work and the general difficulty of ML to add clear business value. | 2022
    • Pete Warden - should-you-protect-your-machine-learning-models-and-ip/) | Some thoughts on the importance of protecting IP in an ML org. | 2022
    • Jeff Saltz - pm.com/machine-learning-project-management/) | Touches on difficulties of managing ML projects and how the management process differs from standard software development. | 2021
    • Alfred Spector - h-wiggins/), and [Jeannette M. Wing](https://datascience.columbia.edu/people/jeannette-m-wing/) | [Data Science in Context: Foundations, Challenges, Opportunities](https://datascienceincontext.com/manuscript/) | A pre-release of a book that gives a thorough accounting of the history of Data Science, a high-level understanding of its applications, and the ethical and social concerns associated with it. | 2022
    • Brooke Carter - mui/) | [ML Education at Uber: Frameworks Inspired by Engineering Principles](https://eng.uber.com/ml-education-at-uber/) | Provides an overview of the philosophy behind Uber's ML education program. | 2022
    • Eyal Trabelsi - dataverse/how-to-build-trust-in-machine-learning-the-sane-way-39d879f22e69) | Provides suggestions on how teams can improve trust in ML in their org, including defining metrics up front, following some best practices when developing the model, A/B testing the model upon deployment, and more. | 2022
    • Andrew Lukyanenko - learned-after-10-years-in-it-1489ad39280e) | A senior data scientist gives general DS career advice (some of which is worth noting as a leader), including topics around interviewing, productivity, communication, time estimation, and more. | 2022
    • Eugene Yan - for-projects/) | The author describes a few process-based techniques for increasing ML project success (e.g. establishing project pilots and copilots, literature reviews, methods reviews, etc). | 2023
    • Arthur Turrell - wanderer/posts/data-science-maturity/data-science-maturity.html) | Describes conditions and infrastructure needed for data scientists to thrive in an organization, and puts it in the context of data maturity. | 2023
    • Shreya Shankar - term maintenance), and they also discuss interviewees’ pain points and anti-patterns. | 2022
    • Mario Perrakis - uk-technology/the-0-1-done-strategy-for-data-science-3c1737de14b3) | A description for what a DS org should aspire to: 0-day handovers facilitated by great documentation and code, 1-day prototypes enabled by good tooling and good knowledge, and a clear definition of “done”. | 2022
  • Hiring

    • Reddit
    • Hacker News - interviewing | 2022
    • Farhan Thawar - of-Engineering-hiring-cheatsheet-f7824eb3f60748bb997f9b9b14c073a5) | A guide for assessing a candidate for an engineering or data leadership role: provides good and bad responses to questions. | 2022
    • Freaking Rectangle Blog - to-freaking-hire-great-developers/) | When hiring for data engineering, analytics, data science, or ML Engineering roles, it would be better to have candidates try to read code instead of writing it (it can be neutral interview-only code). | 2022
    • Nate Rosidi - python-coding-interview-questions-must-know-data-science.html) | Provides 15 examples for testing basic Python data manipulation skills in interviews. | 2022
    • Jike Chong - places-to-work-for-data-scientists?utm_source=Data_Elixir&utm_medium=social) | Looks at factors that make a data science org attractive to an IC, but this provides some insights for hiring managers trying to get in the heads of talent. | 2022
    • Dip Ranjan Chatterjee
    • Chip Huyen - we-look-for-in-a-candidate.html) | Outlines the resume evaluation process for a small startup looking for data talent and includes topics like looking for examples of persistence, looking for unique perspectives, and looking for metrics around business impact. | 2023
    • Mikhail Popov - data-scientist/) | A retrospective from the Wikimedia Foundation, of Wikipedia fame, sharing what they learned in the hiring process and how they discovered a better approach to interviewing for their data team. | 2017
    • Eli Goldberg - better-data-scientists-a-field-guide-for-hiring-managers-new-to-data-science-388d174a96df) | Make time for hiring and use your shift in priorities to your advantage, don't "wing it", write your process down and engineer it to be data driven, and modify the process not your adherence to it. | 2020
    • Randy Au - talk-a-bit-about-giving-interviews?utm_campaign=Data_Elixir&utm_source=Data_Elixir_387&s=r) | Gives thoughts on planning and carrying out a technical data science interview. | 2022
    • Tristan Handy - teams/hiring-data-engineer/) | Article makes the claim that increasingly data analysts and scientists are working on ETL pipelines themselves (with the help of Stitch, Fivetran, dbt, etc.) but data engineers are still essential for: managing core data infrastructure, building and maintaining custom ingestion pipelines, supporting data team resources with design and performance optimization, and building non-SQL transformation pipelines. | 2022
  • Impact

    • McKinsey - functions/mckinsey-analytics/our-insights/ten-red-flags-signaling-your-analytics-program-will-fail) | A list of red flags, ranging from the executive team lacking a clear vision for its analytics program to nobody knowing the quantitative impact that analytics is providing. | 2018
    • Abinaya Sundarraj - management-stay-top-customer-mind.html) | Describes the virtues and challenges around achieving a customer-centric, data perspective in a business. | 2022
    • Chad Sanderson - existential-threat-of-data-quality?s=r&utm_campaign=post&utm_medium=web) | Despite the rapidly-evolving/growing data stack, poor data quality remains an enormous problem; the article breaks it down into "downstream" and "upstream" categories. | 2022
    • Anna Geller - prefect-blog/should-you-measure-the-value-of-a-data-team-95c447f28d4a) | Wonderful discussion of the challenges of measuring a data team's impact, and provides clear examples of good, so-so, and poor metrics for measuring this performance. | 2023
    • Sarah Krasnik - a-data-catalog?s=r) | Although not technically on management, this tackles the critical topic of documentation, dictionaries, knowledge repos and such, which are critically important for a data org. | 2022
  • Strategy

    • Chris Brown - a-data-strategy-with-okrs-acbdbbf126a7) | Outlines how OKRs (Objectives and Key Results) can help with executing on data strategy and provides some examples. | 2022
    • Raymond See - tool-techniques-startup/) | Provides some tips for early-stage start-ups hoping to develop a data function (e.g. hire a few generalists, bring in the right tools, etc) | 2023
    • Ilan Man
    • Emilie Schario - your-data-team-like-a-product-team/) | Service-oriented data teams aren’t effective; the authors suggest running the data team like a product team instead, taking a more active role in defining your org's success metrics and pushing the business forward. | 2021
    • Jeremy Salfen - a-data-practice/) | Provides a series of suggestions for first data hires at an early stage startup, including the following principles: "don’t worry about making things fancy", "keep an eye on how things will scale, but rein in your impulses to optimize them", and "documentation, transparency, and reproducibility are interrelated and fundamental". | 2021
    • Leo Polovets - value-of-data-part-3-data-business-models) | Final post in this series describes the concept of a "Data Business Model", the reality of how data can be monetized with examples of companies in each scenario. | 2015
    • Yali Sassoon - need-to-deliberately) | Organizations spend an incredible amount of time and resources extracting data from various sources, but rarely consider making their own data to generate inputs for the ML systems. | 2022
    • Prukalpa Sankar - the-Scenes Look at How Postman’s Data Team Works: How Postman’s data team set up better onboarding, infrastructure, and processes while growing 4–5x in one year](https://entrepreneurshandbook.co/a-behind-the-scenes-look-at-how-postmans-data-team-works-fded0b8bfc64) | Describes Postman's data team structure (contains central, embedded, and distributed members), how they handle prioritization, sprints, and the like. | 2021
  • Diversity Equity and Inclusion

    • Sophia (Saeyoon) Baik - a-diverse-engineering-team-in-2022-the-beginners-guide/) | Provides great summary and many links describing the state of DEI in tech engineering, along with why diversity helps boost productivity, and a number of suggestions on how to reduce hiring biases. | 2022
    • Sergio Morales - proof your Analytics Efforts in 2020: Hire Diverse Teams](https://towardsdatascience.com/future-proof-your-analytics-efforts-hire-diverse-teams-9e8f9a471859) | Post describes how data team diversity deters bias and encourages curiosity, skepticism and analytical thinking; attributes any analytics enterprise will highly value. | 2020
    • Swathi Young - to-make-sure-that-diversity-in-ai-works/?sh=37ecf0cd567a) | Post provides guidance on how management teams can build diverse AI teams, including suggestions like restructuring talent acquisition, thinking through pay parity, and more. | 2021
    • Gergely Orosz - a-diverse-engineering-team/) | Stories from six engineering leaders who succeeded in building and growing diverse teams | 2021
  • Project Management

    • Oscar Baruffa - in from, including things like take the path of least resistance, work towards getting stakeholders to think it's their idea, have lots of private conversations beforehand, and more. | 2022
    • Lucas F Costa - metrics.html) | Covers four useful metrics that are easily attainable from JIRA that aren't easily gameable and can help you debug process problems: arrival rate, work in progress, throughput, and cycle time. | 2022
    • Leandro Carvalho - performance data products](https://medium.com/@leandroscarvalho/data-product-canvas-a-practical-framework-for-building-high-performance-data-products-7a1717f79f0) | Outlines the "Data Canvas" framework for building new data products, which is divided into 10 blocks (problem, solution, data, hypotheses, actors, actions, KPIs, values, risks and performance/impact), and separated by 3 domain areas: the product vision, the vision of the strategy, and the business vision. | 2022
    • Erik Bernhardsson - driven project management: when is the optimal time to give up?”](https://erikbern.com/2022/04/05/sigma-driven-project-management-when-is-the-optimal-time-to-give-up.html) | The post describes an abstract measure “sigma” that captures the risk of a project and, based on that risk, presents a statistical model that shows when one ought to give up on a project. | 2022
    • Michael Kaminsky - analytics-p3/) | Adjustments are suggested for agile to work well on a data team: time-bound spikes for research, build in slack time for exploration, acceptance criteria includes “write the next story”, peer-review instead of sprint-review. | 2018
  • Code Review

    • Gunnar Morling - code-review-pyramid/) | There should be a hierarchy of effort in reviewing code, where more effort is spent on core concepts, how performant code is, and documentation, with less effort on test quality (though of course tests are important) and syntax. | 2022
    • Tim Hopper - review-guidelines) | In the context of a data team, describes what a code review should achieve, gives bullet points on carrying out pull requests, and provides some links to additional reading. | 2020
    • Eric Ma - on-data-science/blob/master/docs/workflow/code-review.md) | In the context of data science the essay briefly describes the purpose of code review, what it should not be, and the value of it in data work. | 2021
  • BI and Analytics Within an Organization

    • Lenny Rachitsky - star-metrics/) | Proposes metrics based on your type of business, recommends having a single north star metric, and advises against using revenue as that metric. | 2021
    • Ron Berman - adoption associated with the adoption of descriptive analytics among online retailers. | 2020
    • Joe McFarren - tips-for-managing-a-successful-analytics-project/) | In the context of analytics consulting it is important to: clearly establish project scope, be in constant communication, determine a line of escalation, monitor work with tracking apps, and track finances. | 2022
    • Erik Balodis - framework-for-embedding-decision-intelligence-into-your-organization-f104947651ae) | Provides a high-level overview of how to infuse decision-intelligence into an organization, along with some additional reading sources. | 2022
    • Nelson Auner - to-60-analytics-stack/) | Gives an overview of the modern analytics stack via three buckets: a data-moving tool (ETL), a data warehouse to store the data, and a BI layer to analyze the data. | 2020
    • Mode - teams-guide-for-marketing-metrics/) | Good overview of the landscape of metrics used in data marketing work (as well as information on the technical side of it). | 2022
    • SeattleDataGuy - are-we-still-struggling-to-answer) | Surprisingly, metrics are still hard to calculate and this is at least partly because of turnover of developers, ERP and CRM migrations, producers of data constantly changing what data they provide, and mergers and acquisitions, and other reasons. | 2022
    • Marie Lefevre - all-data-requests-are-urgent-so-start-by-asking-these-5-questions-ad77d1fbe7dd) | Details five questions the author typically asks of those who request analyses: Why? Why again? Who is it for? When is it due? Is it more of a priority than that other request? | 2022
    • Amplitude - star/why-use-the-north-star-framework) | A short book intended for product managers and product designers that describes the value of North Star metrics and how to identify them. | 2018
    • Dan Frank - data-science/experimentation-platform-in-a-day-c60646ef1a2) | A short, technical (but very accessible) guide to setting up a simple experimentation "platform" with elements of logging, measurement, assignment, and analysis. | 2022
    • Ron Kohavi, Diane Tang, and Ya Xu
    • Caveats and Limitations of A/B Testing at Growth Tech Companies
    • Rembrand Koning - chatterji) | [Experimentation and Startup Performance: Evidence from A/B Testing](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3454379) | This academic paper provides the first evidence of how digital experimentation affects the performance of a large sample of high-technology startups using data that tracks their growth, technology use, and product launches (they find increased performance on several critical dimensions, including page views and new product features). | 2022
    • Erin Gustafson - model-duolingo/) | Outlines a thorough growth model that is broadly applicable to most B2C organizations where users subscribe to a service and includes discussion on how various "levers" of this growth model were tested. | 2023
    • Kasia Rachuta - articles/blob/main/Genericized_A_B_template.pdf) and argues that it's helpful for structuring tests and ensuring stakeholders have thought about the right business questions prior to asking for something to be launched. | 2023
    • Roger M. Stein - managing-data-scientists-is-different/) | Two challenges in managing data scientists: (1) managing a data research effort tends to be a dynamic and self-correcting process in which it is difficult to plan either a project’s timing or final outcomes, and (2) analytics is highly sensitive to time, cost, and quality tradeoffs. | 2015
    • Tristan Handy - startup-founders-guide-to-analytics-1d2176f20ac1) | Although written in 2017, this article gives a still relevant high-level overview on creating the analytics competency at your org, at different levels of company size. | 2017
  • Management Skills

    • David Loftesness
    • The Institute of Leadership & Management - framework/authenticity/self-awareness/spotlight-on-leadership-styles.html) | Describes a set of leadership/management styles including pace-setting, democratic, laissez-faire, and more. | 2018
    • Andy Johns - to-know-when-to-stop?s=r) | A framework for thinking through burnout, including: 1) Define your personal range of tolerance, 2) Pick your career progression, 3) Pick your life progression. | 2022
    • Alan Johnson - principles-of-engineering-management/) | A brief, digestible list of management principles for new engineering managers. | 2022
    • Tanya Reilly
    • Sarah Drasner
    • Lindy Greer - research/faculty/robert-i-sutton) | [You Need Two Leadership Gears: Know when to take charge and when to get out of the way](https://hbr.org/2023/03/you-need-two-leadership-gears) | Describes how leaders who know when, where, and how to shift gears between a top-down, take-charge persona (“exercise authority” mode) and a more “flat” mode (in which the leader levels the hierarchy and shares power) tend to be more successful, according to research. | 2023
  • Data Platforms

    • Jordan Volz - who-in-the-modern-data-stack-ecosystem-spring-2022-c45854653dc4) | Article maps out the various pieces of the modern data stack, including event tracking, a data warehouse, data governance, and more. | 2022
    • Krzysztof Szafranek - machine-learning-platform.html) | Provides an overview of Zalando's ML platform (AWS-powered) from the perspective of a machine learning practitioner. | 2022
    • Jean-Georges Perrin - tech/the-next-generation-of-data-platforms-is-the-data-mesh-b7df4b825522) | The post summarizes [Zhamak Dehghani's proposal](https://martinfowler.com/articles/data-monolith-to-mesh.html) for transitioning from current breadth-first data platforms (end-to-end data lifecycle) into vertical/depth-first architectures (one business domain at a time). | 2022
    • Gabrielle Davelaar - Microsoft](https://www.youtube.com/watch?v=HwZlGQuCTj4) | Great talk outlines how DevOps principles can be applied to AI, and then shows in detail how CI/CD, version control, model storage, and more fit into a great MLOps process. | 2018
    • Kevin Hu - four-pillars-of-data-observability) | Provides a definition of data observability and how in the context of a data platform this includes the following facets: metrics, lineage, metadata, and logs. | 2022
    • Stefan Krawczyk - i-learned-building-platforms-at-stitch-fix-fc5e0ec72c86/) | The author describes 5 lessons learned in building a data science platform, including things like don't build them for all possible users, abstract away any underlying APIs to simplify things for end-users. | 2022
    • Lak Lakshmanan - you-dont-need-mlops-5e1ce9fdaa4b) | In counterpoint to all the buzz, the author warns that MLOps is no panacea, and can often automate away important detail or cause a large amount of technical debt that ultimately doesn't save time. | 2022
    • Nishith Agarwal - the-build-vs-buy-guide-for-your-modern-data-stack/) | The author claims that the decision to build vs buy comes down to five main considerations: cost, complexity, expertise, time to value, and competitive advantage. | 2022
    • Dominik Kreuzberger - hirschl-70642270) | [Machine Learning Operations: Overview, Definition, and Architecture](https://arxiv.org/ftp/arxiv/papers/2205/2205.02302.pdf) | The authors conducted a literature review and interviews with experts to create an aggregated overview of the necessary principles, components, and roles, as well as the associated architecture and workflows surrounding "MLOps" | 2022
    • Indika Kumara
    • Charlie Summers - event-streams) | Provides an overview on how to convert events from an event-driven microservice architecture into relational tables in a warehouse like Snowflake, the advantages of this architecture, and how you might want to structure your event messages. | 2022
    • Dmitry Kruglov - of-modern-startup-abaec235c2eb) | Probably more relevant for CTO roles, but with interesting nuggets for Heads of Data, this post gives an overview of the various infrastructure and tools used in the modern startup (languages, infrastructure as code, secrets management, databases, etc). | 2022
    • Sam Lafontaine - to-build-a-modern-data-stack) | A light overview of the several components that constitute the modern data stack: a data source, data ingestion tools, data storage, data transformations and modeling, data analytics, and data activation (what used to be called "reverse ETL"). | 2021
    • Jordan Tigani - data-is-dead/) | Provocative piece that argues that despite the hype of the last 10 years around the coming "big data" wave and the need for big data tooling and infrastructure, only the smallest of fractions of organizations need to concern themselves with this. | 2023
    • Elijah Ben Izzy - - A Machine Learning Platform for Stitch Fix's Data Scientists](https://multithreaded.stitchfix.com/blog/2022/07/14/deployment-for-free/) | The authors describe, at a high-level, the initial design considerations for Stitch Fix's ML platform, present the API data scientists use to interact with it, and detail its capabilities. | 2022
    • Barr Moses - what-is-a-data-platform-and-how-to-build-one/) | While every organization’s data platform approach will vary based on the industry and the size of their company, this quick and dirty guide lays out a blueprint for a modern data platform. | 2022
    • Benjamin Rogojan - you-should-upgrade-your-data) | Gives a high-level summary of the several phases of data infrastructure that organizations mature through (from a tiny start-up looking at manually generated spreadsheets to more mature organizations with complex ETL DAGs). | 2023
  • Data Governance

    • Sanjana Sen - Governance and Compliance](https://www.youtube.com/watch?v=HHD-MLL8hPE) | A conversation among many data practitioners about how their organizations handle data access control, data tagging, anonymization, and other key compliance activities, and what frameworks they have found helpful. | 2020
    • Bryan Petzold - roggendorf-2a2aa3/?originalSubdomain=de), [Kayvaun Rowshankish](https://www.linkedin.com/in/kayvaunrowshankish/), and [Christoph Sporleder](https://www.linkedin.com/in/christoph-sporleder-3b21411/?originalSubdomain=de) | [Designing data governance that delivers value](https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/designing-data-governance-that-delivers-value) | Briefly surveys the problem of poor data governance, describes an ideal data governance model, and provides six ways to drive data-governance excellence. | 2020
    • Crystal Lewis
    • Maggie Hays - governance-but-make-it-a-team-sport-30dc0164fb7c) | Outlines an iterative framework (with examples) to introduce data governance within an organization (includes identify the chief data problem(s) to solve, set clear goals to resolve these problems, start small before you go big, drive incremental action, and then measure progress and iterate). | 2023