Parallel-Computing-Guide
Parallel Computing Guide
https://github.com/mikeroyal/Parallel-Computing-Guide
Apache Spark Learning Resources
- Apache Spark™ - a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
- Apache Spark Quick Start
- Introduction to Apache Spark and Analytics | AWS
- Apache Spark 3.0: For Analytics & Machine Learning | NVIDIA
- Apache Spark Basics | MATLAB & Simulink
- MATLAB Hadoop and Spark | MATLAB & Simulink
- Top Apache Spark Courses Online | Coursera
- Top Apache Spark Courses Online | Udemy
- Apache Spark In-Depth (Spark with Scala) | Udemy
- Learn Apache Spark with Online Courses | edX
- Cloudera Developer Training for Apache Spark™ and Hadoop | Cloudera
- Databricks Certified Associate Developer for Apache Spark 3.0 certification | Databricks
- Apache Spark Training Courses | NobleProg
- What is Apache Spark? | IBM
- Apache Spark Essential Training Online Class | LinkedIn Learning
Apache Spark Tools, Libraries, and Frameworks
- Spark SQL
- Spark Streaming - a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. It lets you express a streaming computation the same way you would express a batch computation on static data, reading from sources including [Apache Kafka](https://kafka.apache.org/), [Apache Flume](https://flume.apache.org/), and [Amazon Kinesis](https://aws.amazon.com/kinesis/).
- MLlib - Spark's machine learning library, which provides lower-level optimization primitives and higher-level pipeline APIs.
- GraphX - Spark's API for graphs and graph-parallel computation. At a high level, GraphX extends the [Spark RDD](https://spark.apache.org/docs/latest/rdd-programming-guide.html) by introducing the Resilient Distributed Property Graph: a directed multigraph with properties attached to each vertex and edge.
- PySpark
- MLflow
- Tracking component
- Projects component
- Models component
- Model Registry
- Apache PredictionIO
- BigDL
- Apache Flume
- Apache Arrow - a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs.
- Neo4j - an enterprise-strength graph database that combines native graph storage, advanced security, scalable speed-optimized architecture, and ACID compliance to ensure predictability and integrity of relationship-based queries.
- Apache Spark Connector for SQL Server and Azure SQL - a high-performance connector that enables you to use transactional data in big data analytics and persist results for ad-hoc queries or reporting. The connector allows you to use any SQL database, on-premises or in the cloud, as an input data source or output data sink for Spark jobs.
- Koalas - a project that implements the pandas DataFrame API on top of [Apache Spark](https://spark.apache.org/).
- Cluster Manager for Apache Kafka(CMAK)
- Azure Databricks - an Apache Spark-based big data analytics service designed for data science and data engineering. Azure Databricks lets you set up your Apache Spark environment in minutes, autoscale, and collaborate on shared projects in an interactive workspace. Azure Databricks supports Python, Scala, R, Java, and SQL, as well as data science frameworks and libraries including TensorFlow, PyTorch, and scikit-learn.
- Hadoop Distributed File System (HDFS)
- Logstash
- Kibana
- PlaidML
- OpenCV - a library for real-time computer vision applications. The C++, Python, and Java interfaces support Linux, macOS, Windows, iOS, and Android.
- Caffe
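- Theano - a Python library for defining, optimizing, and evaluating mathematical expressions involving multi-dimensional arrays efficiently, including tight integration with NumPy.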
- AutoGluon - a toolkit for training high-accuracy deep learning models on tabular, image, and text data.
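The GraphX entry above describes a property graph as a directed multigraph with properties attached to each vertex and edge. The following is a minimal, dependency-free sketch of that data structure in plain Python. It does not use or mirror the GraphX API; the class and function names (`PropertyGraph`, `Edge`, `build_example`) and the sample data are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Edge:
    """A directed edge carrying its own property map."""
    src: int
    dst: int
    props: Dict[str, Any] = field(default_factory=dict)


@dataclass
class PropertyGraph:
    """Directed multigraph with properties on every vertex and edge."""
    vertices: Dict[int, Dict[str, Any]] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)

    def add_vertex(self, vid: int, **props: Any) -> None:
        self.vertices[vid] = props

    def add_edge(self, src: int, dst: int, **props: Any) -> None:
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must exist before adding an edge")
        # Edges live in a list, so several edges may share the same
        # (src, dst) pair. That is what makes this a multigraph.
        self.edges.append(Edge(src, dst, props))

    def out_edges(self, vid: int) -> List[Edge]:
        return [e for e in self.edges if e.src == vid]

    def in_degree(self, vid: int) -> int:
        return sum(1 for e in self.edges if e.dst == vid)


def build_example() -> PropertyGraph:
    """Build a tiny graph with two parallel edges (hypothetical data)."""
    g = PropertyGraph()
    g.add_vertex(1, name="alice", role="student")
    g.add_vertex(2, name="bob", role="professor")
    g.add_edge(1, 2, relation="advisee")
    g.add_edge(1, 2, relation="coauthor")  # parallel edge, same endpoints
    return g
```

A distributed implementation would partition the vertex and edge collections across machines; this sketch keeps everything in local memory so the shape of the data model stays visible.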
Bioinformatics Learning Resources
- Bioinformatics
- European Bioinformatics Institute
- National Center for Biotechnology Information
- Online Courses in Bioinformatics | ISCB - International Society for Computational Biology
- Bioinformatics | Coursera
- Top Bioinformatics Courses | Udemy
- Biometrics Courses | Udemy
- Learn Bioinformatics with Online Courses and Lessons | edX
- Bioinformatics Graduate Certificate | Harvard Extension School
- Bioinformatics and Proteomics - Free Online Course Materials | MIT
- Introduction to Biometrics course - Biometrics Institute
- Bioinformatics and Biostatistics | UC San Diego Extension
Bioinformatics Tools, Libraries, and Frameworks
- Bioconductor - an open source project that provides tools for the analysis and comprehension of high-throughput genomic data. Bioconductor uses the [R statistical programming language](https://www.r-project.org/about.html), and is open source and open development. It has two releases each year, and an active user community. Bioconductor is also available as an [AMI (Amazon Machine Image)](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html) and [Docker images](https://docs.docker.com/engine/reference/commandline/images/).
- Bioconda
- UniProt - a high-quality and freely accessible set of protein sequences annotated with functional information.
- Bowtie 2 - an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (mammalian) genomes.
- Biopython
- BioRuby
- BioJava
- BioPHP
- Avogadro - an advanced molecule editor and visualizer designed for cross-platform use in computational chemistry, molecular modeling, bioinformatics, materials science, and related areas. It offers flexible high quality rendering and a powerful plugin architecture.
- Ascalaph Designer
- Anduril - a workflow framework for analyzing high-throughput data in biomedical research. The platform is fully extensible by third parties. Ready-made tools support data visualization, DNA/RNA/ChIP-sequencing, DNA/RNA microarrays, cytometry and image analysis.
- Galaxy - an open source, web-based platform for accessible, reproducible, and transparent computational biomedical research. It allows users without programming experience to easily specify parameters and run individual tools as well as larger workflows. It also captures run information so that any user can repeat and understand a complete computational analysis.
- PathVisio - open-source pathway analysis and drawing software that allows drawing, editing, and analyzing biological pathways. It is developed in Java and can be extended with plugins.
- Orange
- Basic Local Alignment Search Tool
- OSIRIS - public-domain, free, and open source STR analysis software designed for clinical, forensic, and research use. It has been validated for use as an expert system for single-source samples.
- NCBI BioSystems
C/C++ Learning Resources
- C - a general-purpose, high-level language that was originally developed by Dennis M. Ritchie to develop the UNIX operating system at Bell Labs. It supports structured programming, lexical variable scope, and recursion, with a static type system. C also provides constructs that map efficiently to typical machine instructions, which makes it one of the most widely used programming languages today.
- Embedded C - a set of language extensions for the C programming language, defined by the C Standards Committee to address issues that exist between C extensions for different [embedded systems](https://en.wikipedia.org/wiki/Embedded_system). The extensions help enhance microprocessor features such as fixed-point arithmetic, multiple distinct memory banks, and basic I/O operations. This makes Embedded C one of the most widely used embedded software languages.
- C & C++ Developer Tools from JetBrains
- Open source C++ libraries on cppreference.com
- C++ Graphics libraries
- C++ Libraries in MATLAB
- Google C++ Style Guide
- Introduction C++ Education course on Google Developers
- C++ style guide for Fuchsia
- Chromium C++ Style Guide
- C++ Core Guidelines
- C++ Style Guide for ROS
- Learn C++
- Learn C : An Interactive C Tutorial
- C++ Online Training Courses on LinkedIn Learning
- C++ Tutorials on W3Schools
- Learn C Programming Online Courses on edX
- Learn C++ with Online Courses on edX
- Learn C++ on Codecademy
- Coding for Everyone: C and C++ course on Coursera
- C++ For C Programmers on Coursera
- C++ Online Courses on Udemy
- Top C Courses on Udemy
- Basics of Embedded C Programming for Beginners on Udemy
- C++ For Programmers Course on Udacity
- C++ Fundamentals Course on Pluralsight
- C++ - a cross-platform language, developed by Bjarne Stroustrup as an extension to the C language, that can be used to build high-performance applications.
- C++ Tools and Libraries Articles
C/C++ Tools and Frameworks
- Maven
- AWS SDK for C++
- Visual Studio - a feature-rich application that can be used for many aspects of software development. Visual Studio makes it easy to edit, debug, build, and publish your app using Microsoft software development platforms such as Windows API, Windows Forms, Windows Presentation Foundation, and Windows Store.
- ReSharper C++
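- AppCode - an IDE that constantly monitors the quality of your code, warning you of errors and code smells and suggesting quick-fixes to resolve them automatically. AppCode provides lots of code inspections for Objective-C, Swift, C/C++, and a number of code inspections for other supported languages. All code inspections are run on the fly.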
- CLion - a cross-platform IDE for C and C++ developers developed by JetBrains.
- Code::Blocks
- Conan
- High Performance Computing (HPC) SDK
- Boost - a project focused on cutting-edge C++. Boost has been a participant in the annual Google Summer of Code since 2007, in which students develop their skills by working on Boost Library development.
- Automake
- CMake - an open-source, cross-platform family of tools designed to build, test and package software. CMake controls the software compilation process using simple platform- and compiler-independent configuration files, and generates native makefiles and workspaces that can be used in the compiler environment of your choice.
- GDB
- GCC - a compiler collection that includes front ends for C, C++, Objective-C, Fortran, Ada, Go, and D, as well as libraries for these languages.
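- GSL - the GNU Scientific Library, a numerical library for C and C++ programmers that provides a wide range of mathematical routines, including least-squares fitting. There are over 1000 functions in total with an extensive test suite.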
- OpenGL Extension Wrangler Library (GLEW) - a cross-platform open-source C/C++ extension loading library. GLEW provides efficient run-time mechanisms for determining which OpenGL extensions are supported on the target platform.
- Libtool
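- TAU (Tuning And Analysis Utilities) - a profiling and tracing toolkit that gathers performance information through instrumentation of functions, methods, basic blocks, and statements as well as event-based sampling. All C++ language features are supported including templates and namespaces.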
- Clang - a production-quality C, Objective-C, C++ and Objective-C++ compiler when targeting X86-32, X86-64, and ARM (other targets may have caveats, but are usually easy to fix). Clang is used in production to build performance-critical software like Google Chrome or Firefox.
- OpenCV - a computer vision library designed for real-time applications. Cross-platform C++, Python and Java interfaces support Linux, macOS, Windows, iOS, and Android.
- ANTLR (ANother Tool for Language Recognition)
- Oat++ - a light C++ web framework for building highly scalable and resource-efficient web applications. It is zero-dependency and easily portable.
- Cython
- Infer - a static analysis tool for Java, C++, Objective-C, and C. Infer is written in [OCaml](https://ocaml.org/).
- Azure SDK for C++
- Azure SDK for C
- C++ Client Libraries for Google Cloud Services
- Vcpkg
- CppSharp
- JavaCPP
- Spdlog - a very fast, header-only/compiled, C++ logging library.
Cloud Native Learning Resources
- CNCF Cloud Native Interactive Landscape
- Build Cloud-Native applications in Microsoft Azure
- Cloud-Native application development for Google Cloud
- Cloud-Native development for Amazon Web Services
- Cloud Foundry Developer Training and Certification Program
- Cloud-Native Architecture Course on Pluralsight
- AWS Fundamentals: Going Cloud-Native on Coursera
- Developing Cloud-Native Apps w/ Microservices Architectures course on Udemy
- How load balancing works for cloud native applications with Azure Application Gateway on LinkedIn Learning
- Developing Cloud Native Applications course on edX
Computer Vision Learning Resources
- Computer Vision
- OpenCV Courses
- Top Computer Vision Courses Online | Coursera
- Top Computer Vision Courses Online | Udemy
- Learn Computer Vision with Online Courses and Lessons | edX
- Computer Vision and Image Processing Fundamentals | edX
- Computer Vision Nanodegree program | Udacity
- Machine Vision Course | MIT OpenCourseWare
- Computer Vision Training Courses | NobleProg
- Visual Computing Graduate Program | Stanford Online
- Introduction to Computer Vision Courses | Udacity
- Exploring Computer Vision in Microsoft Azure
Computer Vision Tools, Libraries, and Frameworks
- Microsoft AirSim - a simulator for drones, cars, and more, built on Unreal Engine. It is open-source, cross platform, and supports [software-in-the-loop simulation](https://www.mathworks.com/help///ecoder/software-in-the-loop-sil-simulation.html) with popular flight controllers such as PX4 & ArduPilot and [hardware-in-loop](https://www.ni.com/en-us/innovations/white-papers/17/what-is-hardware-in-the-loop-.html) with PX4 for physically and visually realistic simulations. It is developed as an Unreal plugin that can simply be dropped into any Unreal environment. AirSim is being developed as a platform for AI research to experiment with deep learning, computer vision and reinforcement learning algorithms for autonomous vehicles.
- Automated Driving Toolbox™ - a MATLAB toolbox that provides algorithms and tools for designing, simulating, and testing ADAS and autonomous driving systems. Its visualization tools include a bird's-eye-view plot and scope for sensor coverage, detections and tracks, and displays for video, lidar, and maps. The toolbox lets you import and work with HERE HD Live Map data and OpenDRIVE® road networks. It also provides reference application examples for common ADAS and automated driving features, including FCW, AEB, ACC, LKA, and parking valet. The toolbox supports C/C++ code generation for rapid prototyping and HIL testing, with support for sensor fusion, tracking, path planning, and vehicle controller algorithms.
- Data Acquisition Toolbox™
- LRSLibrary - Low-Rank and Sparse Tools for Background Modeling and Subtraction in Videos. The library was designed for moving object detection in videos, but it can also be used for other computer vision and machine learning problems.
Containers
- Kubernetes - an open-source container-orchestration system for automating application deployment, scaling, and management. It was originally designed by Google, and is now maintained by the Cloud Native Computing Foundation.
- Docker - a set of platform-as-a-service (PaaS) products that use OS-level virtualization to deliver software in packages called containers. Containers are isolated from one another and bundle their own software, libraries and configuration files; they can communicate with each other through well-defined channels. All containers are run by a single operating-system kernel and are thus more lightweight than virtual machines.
- Rook - an open source cloud-native storage orchestrator for Kubernetes that turns distributed storage systems into self-managing, self-scaling, self-healing storage services. It automates the tasks of a storage administrator: deployment, bootstrapping, configuration, provisioning, scaling, upgrading, migration, disaster recovery, monitoring, and resource management.
- Open Container Initiative
- Buildah
- Podman
- Rancher
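- Containerd - a daemon that manages the complete container lifecycle of its host system, from image transfer and storage to container execution and supervision to low-level storage to network attachments and beyond. It is available for Linux and Windows.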
Continuous Integration/Continuous Delivery
- Bamboo
- Drone - a continuous delivery platform that uses a YAML configuration file, a superset of docker-compose, to define and execute Pipelines inside Docker containers.
- CircleCI
- TeamCity
- Shippable
- Spinnaker - an open source, multi-cloud continuous delivery platform for releasing software changes with high velocity and confidence.
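- Prow - a Kubernetes-based CI/CD system that provides GitHub automation in the form of policy enforcement, chat-ops via /foo style commands, and automatic PR merging. Prow has a microservice architecture implemented as a collection of container images that run as Kubernetes deployments.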
- Travis CI
CUDA Learning Resources
- CUDA - a parallel computing platform and programming model developed by NVIDIA for general computing on GPUs. In GPU-accelerated applications, the sequential part of the workload runs on the CPU, which is optimized for single-threaded performance, while the compute-intensive portion of the application runs on thousands of GPU cores in parallel. When using CUDA, developers can program in popular languages such as C, C++, Fortran, Python and MATLAB.
- CUDA Toolkit Documentation
- CUDA Quick Start Guide
- CUDA on WSL
- NVIDIA Deep Learning cuDNN Documentation
- CUDA GPU support for TensorFlow
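The CUDA entry above describes a workload split in which the sequential part runs on the CPU and the compute-intensive, data-parallel part is spread across many cores. The sketch below shows only the shape of that split, using a thread pool from the Python standard library as a stand-in for the parallel device. It is not CUDA code and calls no CUDA API; the function names (`load_input`, `square`, `run`) are hypothetical, and Python threads will not deliver GPU-style speedups for this kind of work.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def load_input(n: int) -> List[int]:
    # Sequential part: runs once on a single thread (the "host" side).
    return list(range(n))


def square(x: int) -> int:
    # Per-element work. Each call is independent of the others, which is
    # what makes the step data-parallel (the role a GPU kernel would play).
    return x * x


def run(n: int, workers: int = 4) -> int:
    data = load_input(n)
    # Parallel part: map the same function over every element.
    # pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        squared = list(pool.map(square, data))
    # Sequential part again: combine the results on the host.
    return sum(squared)
```

In a real CUDA program the `square` step would be a kernel launched over a grid of threads, with explicit transfers between host and device memory around it.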
CUDA Tools, Libraries, and Frameworks
- Chainer - a Python-based deep learning framework aiming at flexibility. It provides automatic differentiation APIs based on the define-by-run approach (dynamic computational graphs) as well as object-oriented high-level APIs to build and train neural networks. It also supports CUDA/cuDNN using [CuPy](https://github.com/cupy/cupy) for high performance training and inference.
- CUDA Toolkit - a development environment for creating high-performance GPU-accelerated applications. The CUDA Toolkit lets you develop, optimize, and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library to build and deploy your application on major architectures including x86, Arm and POWER.
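- CUDA-X HPC - a collection of libraries, tools, compilers, and APIs for building GPU-accelerated HPC applications. CUDA-X HPC includes highly tuned kernels essential for high-performance computing (HPC).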
- CuPy - an implementation of a NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it. It supports a subset of the numpy.ndarray interface.
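- cuDF - a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data. It provides a pandas-like API that will be familiar to data engineers & data scientists, so they can use it to easily accelerate their workflows without going into the details of CUDA programming.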
- ArrayFire - a general-purpose library that simplifies the process of developing software that targets parallel and massively-parallel architectures including CPUs, GPUs, and other hardware acceleration devices.
- AresDB - a GPU-powered real-time analytics storage and query engine. It features low query latency, high data freshness and highly efficient in-memory and on-disk storage management.
- NVIDIA Container Toolkit - a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs.
- CUTLASS - a collection of CUDA C++ template abstractions for implementing high-performance matrix-multiplication (GEMM) at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS.
- CUB
- Thrust - a C++ parallel programming library whose high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs.
- Arraymancer - a tensor (N-dimensional array) project in Nim. The main focus is providing a fast and ergonomic CPU, CUDA and OpenCL ndarray library on which to build a scientific computing ecosystem.
- Kintinuous - a real-time dense visual SLAM system capable of producing high quality globally consistent point and mesh reconstructions over hundreds of metres in real-time with only a low-cost commodity RGB-D sensor.