https://github.com/lukaszkn/data-software-engineering-interview-questions
Data and Software engineering interview questions
- Host: GitHub
- URL: https://github.com/lukaszkn/data-software-engineering-interview-questions
- Owner: lukaszkn
- Created: 2025-07-06T17:56:04.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-07-07T07:17:36.000Z (12 months ago)
- Last Synced: 2025-07-07T08:31:53.551Z (12 months ago)
- Topics: data, engineering, interview-questions, python
- Homepage:
- Size: 1.03 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# 4121 Data and Software engineering interview questions
| Questions | Description |
| --- | --- |
| [Amazon Neptune](content/amazon_neptune.md) | A fast, fully managed database service powering graph use cases such as identity graphs, knowledge graphs, and fraud detection. |
| [Ansible](content/ansible.md) | An open-source automation tool primarily used for configuration management, application deployment and orchestration |
| [Apache Airflow](content/apache_airflow.md) | Apache Airflow |
| [Apache Flink](content/apache_flink.md) | Apache Flink |
| [Apache Flume](content/apache_flume.md) | Apache Flume |
| [Apache HBase](content/apache_hbase.md) | Apache HBase |
| [Apache Hive](content/apache_hive.md) | Apache Hive |
| [Apache Kafka](content/apache_kafka.md) | Apache Kafka |
| [Apache Spark](content/apache_spark.md) | Apache Spark |
| [Apache Superset](content/apache_superset.md) | Apache Superset |
| [AWS](content/aws.md) | AWS |
| [AWS Glue](content/aws_glue.md) | A serverless data integration service that makes it easy for analytics users to discover, prepare, move, and integrate data from multiple sources |
| [AWS Lambda](content/aws_lambda.md) | AWS Lambda |
| [Azure](content/azure.md) | Azure |
| [Azure Databricks](content/azure_databricks.md) | Azure Databricks |
| [Azure Purview](content/azure_purview.md) | A unified data governance solution that helps organizations discover, manage, and govern their data estate across on-premises, multi-cloud, and SaaS environments |
| [Big Data Engineering](content/bigdata.md) | Big Data engineering concepts and tools. |
| [Data pipelines](content/data_pipelines.md) | Data pipelines basics |
| [Data Warehousing](content/dwha.md) | Data Warehousing Architecture |
| [Databricks Machine Learning](content/databricks_machine_learning.md) | Databricks Machine Learning |
| [dbt](content/dbt.md) | dbt |
| [Delta Lake](content/delta_lake.md) | An open-source storage layer that adds ACID transactions and schema enforcement to data lakes, which typically store massive amounts of raw data in its native format |
| [Elasticsearch](content/elasticsearch.md) | A distributed, multitenant-capable full-text search engine built on the open-source Apache Lucene library, with an HTTP web interface and schema-free JSON documents. |
| [FastAPI](content/fastapi.md) | A high-performance web framework for building HTTP-based service APIs in Python |
| [General](content/general.md) | General programming concepts, design patterns |
| [General Data Engineer interview](content/general_interview.md) | General, behavioral, communication, collaboration, problem solving from data engineering perspective |
| [Google Cloud Platform](content/gcp.md) | Google Cloud Platform |
| [Grafana](content/grafana.md) | A multi-platform open source analytics and interactive visualization web application. |
| [Hadoop](content/hadoop.md) | Hadoop |
| [Jenkins](content/jenkins.md) | An open source automation server. It helps automate the parts of software development related to building, testing, and deploying |
| [Jetpack Compose](content/jetpack_compose.md) | Basics |
| [Kotlin Basics](content/kotlin.md) | Basic syntax, functions, variables, classes, conditional expressions, loops, ranges, collections, nullable values |
| [Machine learning](content/machine_learning.md) | Basic concepts |
| [MongoDB](content/mongodb.md) | MongoDB |
| [Pandas](content/pandas.md) | A software library written for Python for data manipulation and analysis |
| [Polars](content/polars.md) | Polars |
| [Power BI](content/power_bi.md) | A business analytics and data visualization tool |
| [Power BI DAX](content/power_bi_dax.md) | Power BI DAX |
| [PySpark](content/pyspark.md) | PySpark |
| [Python](content/python.md) | The basics, interpreter, numbers, text, lists, sets, dictionaries, control flow, loops, functions |
| [Python Advanced](content/pythonadvanced.md) | Functions, annotations, coding style, reading and writing files, classes, iterators, standard library |
| [Python How-To](content/pythonhowto.md) | How-to's |
| [RxSwift](content/rxswift.md) | Basics of RxSwift |
| [Scala](content/scala_de.md) | Scala for data engineering |
| [Scala Essential](content/scala.md) | Essential Scala programming concepts |
| [Snowflake](content/snowflake.md) | A cloud data platform that at its core features a columnar data warehouse |
| [SQL](content/sql.md) | SQL |
| [SQL How to](content/sqlhowto.md) | SQL tips & tricks |
| [Swift Advanced](content/swiftadvanced.md) | Properties, subscripts, concurrency, type casting, nested types, extensions, protocols, generics, Combine framework |
| [Swift Basics](content/swift.md) | The basics, string and characters, collection types, control flow, functions, closures, enumerations, structures and classes, properties, methods |
| [Swift UI Advanced](content/swiftuiadvanced.md) | Advanced topics and how-to's |
| [Swift UI Basics](content/swiftui.md) | Walk through the building blocks of SwiftUI |
| [Tableau](content/tableau.md) | Tableau |
| [Terraform](content/terraform.md) | An infrastructure as code tool that lets you build, change, and version infrastructure safely and efficiently |
[All questions](content/_all.md)