Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Apache-Spark-Guide
Apache Spark Guide
https://github.com/mikeroyal/Apache-Spark-Guide
-
Computer Vision Learning Resources
- Computer Vision
- OpenCV Courses
- Exploring Computer Vision in Microsoft Azure
- Top Computer Vision Courses Online | Coursera
- Top Computer Vision Courses Online | Udemy
- Learn Computer Vision with Online Courses and Lessons | edX
- Computer Vision and Image Processing Fundamentals | edX
- Introduction to Computer Vision Courses | Udacity
- Computer Vision Nanodegree program | Udacity
- Computer Vision Training Courses | NobleProg
- Visual Computing Graduate Program | Stanford Online
- Machine Vision Course | MIT OpenCourseWare
-
Java Learning Resources
- Java
- The Eclipse Foundation
- Getting Started with Java
- Oracle Java certifications from Oracle University
- Google Developers Training
- Java Tutorial by W3Schools
- Getting Started with Java in Visual Studio Code
- Google Java Style Guide
- AOSP Java Code Style for Contributors
- Chromium Java style guide
- Get Started with OR-Tools for Java
- Getting started with Java Tool Installer task for Azure Pipelines
- Gradle User Manual
-
Learning Resources for ML
- Machine Learning
- Machine Learning by Stanford University from Coursera
- AWS Training and Certification for Machine Learning (ML) Courses
- Machine Learning Scholarship Program for Microsoft Azure from Udacity
- Microsoft Certified: Azure Data Scientist Associate
- Microsoft Certified: Azure AI Engineer Associate
- Azure Machine Learning training and deployment
- Learning Machine learning and artificial intelligence from Google Cloud Training
- JupyterLab
- Scheduling Jupyter notebooks on Amazon SageMaker ephemeral instances
- How to run Jupyter Notebooks in your Azure Machine Learning workspace
- Machine Learning Courses Online from Udemy
- Machine Learning Courses Online from Coursera
- Learn Machine Learning with Online Courses and Classes from edX
- Machine Learning Crash Course for Google Cloud
-
ML Frameworks, Libraries, and Tools
- Amazon SageMaker
- Apple CoreML - tune models, all on the user's device. A model is the result of applying a machine learning algorithm to a set of training data. You use a model to make predictions based on new input data.
- Fuzzy logic - tree processing and better integration with rules-based programming.
- ResearchGate
- Support Vector Machine (SVM) - group classification problems.
- OpenClipArt
- IBM
- Convolutional Neural Networks (R-CNN)
- CS231n
- Recurrent neural networks (RNNs)
- Slideteam
- wikimedia
- Random forest - a machine learning algorithm that combines the output of multiple decision trees to reach a single result. The individual decision trees in a forest are typically grown without pruning. Its ease of use and flexibility have fueled its adoption, as it handles both classification and regression problems.
- Decision trees - structured models for classification and regression.
- CMU
- Naive Bayes - a probabilistic classifier based on applying Bayes' theorem with strong independence assumptions between the features.
- mathisfun
- nGraph - of-use to AI developers.
- Tensorman
- cuML - learn.
- DeepAI
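The random forest entry in this list describes combining the output of multiple decision trees into a single result. A minimal sketch of just that aggregation step follows, assuming three hand-written rules stand in for trained trees. The names `tree_a`, `tree_b`, `tree_c`, `forest_predict`, and the feature keys are hypothetical and come from no library; a real forest would learn its trees from bootstrapped training data.

```python
from collections import Counter

# Hypothetical stand-ins for trained decision trees: each one is a single
# hand-written rule that maps a sample (a dict of features) to a class label.
def tree_a(sample):
    return "adult" if sample["height_cm"] > 150 else "child"

def tree_b(sample):
    return "adult" if sample["weight_kg"] > 50 else "child"

def tree_c(sample):
    return "adult" if sample["shoe_size"] > 38 else "child"

TREES = [tree_a, tree_b, tree_c]

def forest_predict(trees, sample):
    """Return the label predicted by the largest number of trees."""
    votes = Counter(tree(sample) for tree in trees)
    label, _count = votes.most_common(1)[0]
    return label

sample = {"height_cm": 160, "weight_kg": 45, "shoe_size": 40}
print(forest_predict(TREES, sample))  # two of the three rules vote "adult"
```

An odd number of trees with two classes avoids ties in this sketch. For regression, the same idea would average the trees' numeric outputs instead of counting votes.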
-
Reinforcement Learning Tools, Libraries, and Frameworks
- XGBoost
- OpenAI
- ReinforcementLearning.jl
- Apache MXNet
- AutoGluon - accuracy deep learning models on tabular, image, and text data.
- Cluster Manager for Apache Kafka(CMAK)
- ROS/ROS2 bridge for CARLA(package) - way communication between ROS and CARLA. The information from the CARLA server is translated to ROS topics. In the same way, the messages sent between nodes in ROS get translated to commands to be applied in CARLA.
- CARLA - source simulator for autonomous driving research. CARLA has been developed from the ground up to support development, training, and validation of autonomous driving systems. In addition to open-source code and protocols, CARLA provides open digital assets (urban layouts, buildings, vehicles) that were created for this purpose and can be used freely.
- LIBSVM - SVC, nu-SVC), regression (epsilon-SVR, nu-SVR) and distribution estimation (one-class SVM). It supports multi-class classification.
- AWS RoboMaker - managed, scalable infrastructure for simulation that customers use for multi-robot simulation and CI/CD integration with regression testing in simulation.
- Predictive Maintenance Toolbox™ - based and model-based techniques, including statistical, spectral, and time-series analysis.
- Microsoft Project Bonsai - code AI platform that speeds AI-powered automation development and part of the Autonomous Systems suite from Microsoft. Bonsai is used to build AI components that can provide operator guidance or make independent decisions to optimize process variables, improve production efficiency, and reduce downtime.
- Jupyter Notebook - source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Jupyter is used widely in industries that do data cleaning and transformation, numerical simulation, statistical modeling, data visualization, data science, and machine learning.
- Weka - in tools for standard machine learning tasks, and additionally gives transparent access to well-known toolboxes such as scikit-learn, R, and Deeplearning4j.
- Navigation Toolbox™ - based path planners, as well as metrics for validating and comparing paths. You can create 2D and 3D map representations, generate maps using SLAM algorithms, and interactively visualize and debug map generation with the SLAM map builder app.
-
Reinforcement Learning Learning Resources
- Machine Learning Course by Andrew Ng | Coursera
- Top Artificial Intelligence Courses Online | Coursera
- Professional Certificate in Computer Science for Artificial Intelligence | edX
- Artificial Intelligence (AI) Online Courses | Udacity
- Edge AI for IoT Developers Course | Udacity
- Autonomous Systems Online Courses & Programs | Udacity
- Mobile Autonomous Systems Laboratory | MIT OpenCourseWare
- Top Reinforcement Learning Courses | Coursera
- Top Reinforcement Learning Courses | Udemy
- Top Reinforcement Learning Courses | Udacity
- Reinforcement Learning Courses | Stanford Online
- Reasoning: Goal Trees and Rule-Based Expert Systems | MIT OpenCourseWare
- Reinforcement Learning - [semi-supervised](https://en.wikipedia.org/wiki/Semi-supervised_learning) or [unsupervised](https://en.wikipedia.org/wiki/Unsupervised_learning).
- Top Deep Learning Courses Online | Coursera
- Machine Learning for Everyone Courses | DataCamp
- Machine teaching with the Microsoft Autonomous Systems platform
- Intro to Artificial Intelligence Course | Udacity
- Autonomous Maritime Systems Training | AMC Search
- Deep Learning Online Courses | NVIDIA
- Autonomous Systems MOOC and Free Online Courses | MOOC List
- Robotics and Autonomous Systems Graduate Program | Stanford Online
- Top Deep Learning Courses Online | Udemy
- Expert Systems and Applied Artificial Intelligence
- Autonomous Systems - Microsoft AI
- Top Autonomous Cars Courses Online | Udemy
- Artificial Intelligence Nanodegree program
- Introduction to Microsoft Project Bonsai
- Applied Control Systems 1: autonomous cars: Math + PID + MPC | Udemy
- Learn Deep Learning with Online Courses and Lessons | edX
- Deep Learning Online Course Nanodegree | Udacity
- How to Think About Machine Learning Algorithms | Pluralsight
- Machine Learning Engineering for Production (MLOps) course by Andrew Ng | Coursera
- Deep Learning Courses | Stanford Online
- Deep Learning - UW Professional & Continuing Education
- Deep Learning Online Courses | Harvard University
- Data Science: Deep Learning and Neural Networks in Python | Udemy
- Understanding Machine Learning with Python | Pluralsight
- Learn Autonomous Robotics with Online Courses and Lessons | edX
- Artificial Intelligence Expert Course: Platinum Edition | Udemy
- Learn Artificial Intelligence with Online Courses and Lessons | edX
-
Deep Learning Tools, Libraries, and Frameworks
- NVIDIA DLSS (Deep Learning Super Sampling)
- Intel Xe Super Sampling (XeSS) - cores to run XeSS. The GPUs will have Xe Matrix Extensions (XMX) engines for hardware-accelerated AI processing. XeSS will be able to run on devices without XMX, including integrated graphics, though its performance will be lower on non-Intel graphics cards because it will be powered by the [DP4a instruction](https://www.intel.com/content/dam/www/public/us/en/documents/reference-guides/11th-gen-quick-reference-guide.pdf).
- AMD FidelityFX Super Resolution (FSR) - quality solution for producing high resolution frames from lower resolution inputs. It uses a collection of cutting-edge algorithms with a particular emphasis on creating high-quality edges, giving large performance improvements compared to rendering at native resolution directly. FSR enables “practical performance” for costly render operations, such as hardware ray tracing for the AMD RDNA™ and AMD RDNA™ 2 architectures.
-
NLP Tools, Libraries, and Frameworks
- PyTorch
- Natural Language Toolkit (NLTK) - to-use interfaces to over [50 corpora and lexical resources](https://nltk.org/nltk_data/) such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries.
- spaCy - task learning with pretrained transformers like BERT.
- Keras - a high-level neural networks API, written in Python and developed with a focus on enabling fast experimentation. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit (CNTK), R, Theano, or PlaidML.
- TensorFlow - to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML powered applications.
- Apache Airflow - source workflow management platform created by the community to programmatically author, schedule and monitor workflows. Airflow has a modular architecture and uses a message queue to orchestrate an arbitrary number of workers. Airflow is ready to scale to infinity.
- CoreNLP
- NLPnet - of-speech tagging, semantic role labeling and dependency parsing.
- Flair - of-the-art Natural Language Processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS), special support for biomedical data, sense disambiguation and classification, with support for a rapidly growing number of languages.
- Catalyst - trained models, out-of-the box support for training word and document embeddings, and flexible entity recognition models.
- Numba - aware optimizing compiler for Python sponsored by Anaconda, Inc. It uses the LLVM compiler project to generate machine code from Python syntax. Numba can compile a large subset of numerically-focused Python, including many NumPy functions. Additionally, Numba has support for automatic parallelization of loops, generation of GPU-accelerated code, and creation of ufuncs and C callbacks.
- Tensorflow_macOS - optimized version of TensorFlow and TensorFlow Addons for macOS 11.0+ accelerated using Apple's ML Compute framework.
- Apache PredictionIO
- Scikit-Learn
- Apache Spark Connector for SQL Server and Azure SQL - performance connector that enables you to use transactional data in big data analytics and persists results for ad-hoc queries or reporting. The connector allows you to use any SQL database, on-premises or in the cloud, as an input data source or output data sink for Spark jobs.
- PlaidML
- Caffe
- Theano - dimensional arrays efficiently including tight integration with NumPy.
- Open Neural Network Exchange(ONNX) - in operators and standard data types.
- Eclipse Deeplearning4J (DL4J) - based(Scala, Kotlin, Clojure, and Groovy) deep learning application. This means starting with the raw data, loading and preprocessing it from wherever and whatever format it is in to building and tuning a wide variety of simple and complex deep learning networks.
- NVIDIA cuDNN - accelerated library of primitives for [deep neural networks](https://developer.nvidia.com/deep-learning). cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN accelerates widely used deep learning frameworks, including [Caffe2](https://caffe2.ai/), [Chainer](https://chainer.org/), [Keras](https://keras.io/), [MATLAB](https://www.mathworks.com/solutions/deep-learning.html), [MxNet](https://mxnet.incubator.apache.org/), [PyTorch](https://pytorch.org/), and [TensorFlow](https://www.tensorflow.org/).
- Apache Spark - scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
- BigDL
- Apache OpenNLP - source library for a machine learning based toolkit used in the processing of natural language text. It features an API for use cases like [Named Entity Recognition](https://en.wikipedia.org/wiki/Named-entity_recognition), Sentence Detection, [POS (Part-Of-Speech) tagging](https://en.wikipedia.org/wiki/Part-of-speech_tagging), [Tokenization](https://en.wikipedia.org/wiki/Tokenization_(data_security)), [Feature extraction](https://en.wikipedia.org/wiki/Feature_extraction), [Chunking](https://en.wikipedia.org/wiki/Chunking_(psychology)), [Parsing](https://en.wikipedia.org/wiki/Parsing), and [Coreference resolution](https://en.wikipedia.org/wiki/Coreference).
- Anaconda
- Chainer - based deep learning framework aiming at flexibility. It provides automatic differentiation APIs based on the define-by-run approach (dynamic computational graphs) as well as object-oriented high-level APIs to build and train neural networks. It also supports CUDA/cuDNN using [CuPy](https://github.com/cupy/cupy) for high performance training and inference.
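The Chainer entry mentions automatic differentiation based on the define-by-run approach, where the computational graph is recorded while the forward computation executes. The sketch below illustrates that idea in plain Python. It is not Chainer's API: the `Var` class and its methods are hypothetical, support only addition and multiplication, and use a naive recursive backward pass that suits small expressions only.

```python
class Var:
    """A scalar that records how it was computed (hypothetical, not Chainer)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        # Each parent is a pair: (input Var, local derivative w.r.t. that input).
        self.parents = parents

    def __add__(self, other):
        # d(a + b)/da = 1 and d(a + b)/db = 1
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        # d(a * b)/da = b and d(a * b)/db = a
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self, upstream=1.0):
        # Accumulate the gradient arriving along this path, then push it
        # further back through the graph recorded during the forward pass.
        self.grad += upstream
        for parent, local in self.parents:
            parent.backward(upstream * local)


x = Var(3.0)
y = Var(4.0)
z = x * y + x      # the graph is built here, as the expression runs
z.backward()
print(z.value, x.grad, y.grad)
```

Because the graph is built by ordinary Python execution, control flow such as loops and conditionals can change its shape from one run to the next, which is the flexibility the define-by-run approach aims for.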
-
NLP Learning Resources
- Certified Natural Language Processing Expert Certification | IABAC
- Natural Language Processing Course - Intel
- Natural Language Processing (NLP) - based modeling of human language with statistical, machine learning, and deep learning models.
- Natural Language Processing With Python's NLTK Package
- Cognitive Services—APIs for AI Developers | Microsoft Azure
- Artificial Intelligence Services - Amazon Web Services (AWS)
- Google Cloud Natural Language API
- Top Natural Language Processing Courses Online | Udemy
- Introduction to Natural Language Processing (NLP) | Udemy
- Top Natural Language Processing Courses | Coursera
- Natural Language Processing | Coursera
- Natural Language Processing in TensorFlow | Coursera
- Learn Natural Language Processing with Online Courses and Lessons | edX
- Build a Natural Language Processing Solution with Microsoft Azure | Pluralsight
- Natural Language Processing (NLP) Training Courses | NobleProg
- Natural Language Processing with Deep Learning Course | Stanford Online
- Advanced Natural Language Processing - MIT OpenCourseWare
-
Bioinformatics Learning Resources
- Bioinformatics
- European Bioinformatics Institute
- National Center for Biotechnology Information
- Online Courses in Bioinformatics | ISCB - International Society for Computational Biology
- Bioinformatics | Coursera
- Top Bioinformatics Courses | Udemy
- Biometrics Courses | Udemy
- Learn Bioinformatics with Online Courses and Lessons | edX
- Bioinformatics Graduate Certificate | Harvard Extension School
- Bioinformatics and Biostatistics | UC San Diego Extension
- Bioinformatics and Proteomics - Free Online Course Materials | MIT
- Introduction to Biometrics course - Biometrics Institute
-
SQL/NoSQL Learning Resources
- What is NoSQL?
- SQL
- Transact-SQL(T-SQL) - SQL commands.
- Introduction to Transact-SQL
- SQL Tutorial by W3Schools
- Learn SQL Skills Online from Coursera
- SQL Courses Online from Udemy
- SQL Online Training Courses from LinkedIn Learning
- Learn SQL For Free from Codecademy
- GitLab's SQL Style Guide
- OracleDB SQL Style Guide Basics
- Tableau CRM: BI Software and Tools
- Databases on AWS
- Best Practices and Recommendations for SQL Server Clustering in AWS EC2.
- Connecting from Google Kubernetes Engine to a Cloud SQL instance.
- Educational Microsoft Azure SQL resources
- MySQL Certifications
- SQL vs. NoSQL Databases: What's the Difference?
-
Bioinformatics Tools, Libraries, and Frameworks
- Bioconductor - throughput genomic data. Bioconductor uses the [R statistical programming language](https://www.r-project.org/about.html), and is open source and open development. It has two releases each year, and an active user community. Bioconductor is also available as an [AMI (Amazon Machine Image)](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html) and [Docker images](https://docs.docker.com/engine/reference/commandline/images/).
- Bioconda
- UniProt - quality and freely accessible set of protein sequences annotated with functional information.
- Bowtie 2 - efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (mammalian) genomes.
- Biopython
- BioRuby
- BioJava
- BioPHP
- Avogadro - platform use in computational chemistry, molecular modeling, bioinformatics, materials science, and related areas. It offers flexible high quality rendering and a powerful plugin architecture.
- Ascalaph Designer
- Anduril - throughput data in biomedical research, and the platform is fully extensible by third parties. Ready-made tools support data visualization, DNA/RNA/ChIP-sequencing, DNA/RNA microarrays, cytometry and image analysis.
- Galaxy - based platform for accessible, reproducible, and transparent computational biomedical research. It allows users without programming experience to easily specify parameters and run individual tools as well as larger workflows. It also captures run information so that any user can repeat and understand a complete computational analysis.
- PathVisio - source pathway analysis and drawing software which allows drawing, editing, and analyzing biological pathways. It is developed in Java and can be extended with plugins.
- Orange
- Basic Local Alignment Search Tool
- OSIRIS - domain, free, and open source STR analysis software designed for clinical, forensic, and research use, and has been validated for use as an expert system for single-source samples.
- NCBI BioSystems
-
SQL/NoSQL Tools and Databases
- Azure SQL Database - powered and automated features that optimize performance and durability for you. Serverless compute and Hyperscale storage options automatically scale resources on demand, so you can focus on building new applications without worrying about storage size or resource management.
- Azure SQL Managed Instance - premises applications to the cloud with very few application and database changes. Managed instance has split compute and storage components.
- Azure Synapse Analytics
- MSSQL for Visual Studio Code
- SQL Server Data Tools (SSDT)
- Bulk Copy Program - a command-line tool that comes with Microsoft SQL Server. BCP allows you to import and export large amounts of data in and out of SQL Server databases quickly and efficiently.
- SQL Server Migration Assistant
- SQL Server Integration Services - level data integration and data transformations solutions. Use Integration Services to solve complex business problems by copying or downloading files, loading data warehouses, cleansing and mining data, and managing SQL Server objects and data.
- SQL Server Business Intelligence(BI)
- Tableau - Salesforce completed its acquisition of Tableau in 2019.
- DataGrip - sensitive code completion, helping you to write SQL code faster. Completion is aware of the tables structure, foreign keys, and even database objects created in code you're editing.
- MySQL - native applications using the world's most popular open source database.
- PostgreSQL - relational database system with over 30 years of active development that has earned it a strong reputation for reliability, feature robustness, and performance.
- Amazon DynamoDB - value and document database that delivers single-digit millisecond performance at any scale. It is a fully managed, multiregion, multimaster, durable database with built-in security, backup and restore, and in-memory caching for internet-scale applications.
- Apache Cassandra™ - tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data.
- Hadoop Distributed File System (HDFS)
- Apache Mesos
- ElasticSearch - capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java.
- Logstash
- Kibana
- Trino - ETL, allowing them all to use standard SQL statements and to work with numerous data sources and targets in the same system.
- Extract, transform, and load (ETL)
- Redis(REmote DIctionary Server) - memory data structure store, used as a database, cache, and message broker. It provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams.
- FoundationDB - value store and employs ACID transactions for all operations. It is especially well-suited for read/write workloads but also has excellent performance for write-intensive workloads. FoundationDB was acquired by [Apple in 2015](https://techcrunch.com/2015/03/24/apple-acquires-durable-database-company-foundationdb/).
- IBM DB2 - empowered capabilities designed to help you manage both structured and unstructured data on premises as well as in private and public cloud environments. Db2 is built on an intelligent common SQL engine designed for scalability and flexibility.
- MongoDB - like documents.
- OracleDB - critical data with the highest availability, reliability, and security.
- MariaDB - critical applications.
- SQLite - a C-language library that implements a small, fast, self-contained, high-reliability, full-featured SQL database engine. SQLite is the most used database engine in the world. It is built into all mobile phones and most computers and comes bundled inside countless other applications that people use every day.
- SQLite Database Browser
- InfluxDB - ETL or monitoring and alerting purposes, user dashboards, Internet of Things sensor data, and visualizing and exploring the data. It also has support for processing data from [Graphite](http://graphiteapp.org/).
- CouchbaseDB - a [multi-model NoSQL document-oriented database](https://en.wikipedia.org/wiki/Multi-model_database). It creates a key-value store with managed cache for sub-millisecond data operations, with purpose-built indexers for efficient queries and a powerful query engine for executing SQL queries.
- dbWatch - premise, hybrid/cloud database environments.
- Adminer
- DbVisualizer
- AppDynamics Database - Volume Production Environment.
- Toad - in expertise. This SQL management tool resolves issues, manages change, and promotes the highest levels of code quality for both relational and non-relational databases.
- Lepide SQL Server - to-use, graphical user interface.
- Sequel Pro
- Netdata - fidelity infrastructure monitoring and troubleshooting, real-time monitoring Agent collects thousands of metrics from systems, hardware, containers, and applications with zero configuration. It runs permanently on all your physical/virtual servers, containers, cloud deployments, and edge/IoT devices, and is perfectly safe to install on your systems mid-incident without any preparation.
- Azure Data Studio
- Apache HBase™ - source, NoSQL, distributed big data store. It enables random, strictly consistent, real-time access to petabytes of data. HBase is very effective for handling large, sparse datasets. HBase serves as a direct input and output to the Apache MapReduce framework for Hadoop, and works with Apache Phoenix to enable SQL-like queries over HBase tables.
- Cosmos DB Profiler - time visual debugger allowing a development team to gain valuable insight and perspective into their usage of Cosmos DB database. It identifies over a dozen suspicious behaviors from your application’s interaction with Cosmos DB.
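The SQLite entry in this list describes a self-contained SQL database engine that ships inside other applications. As a small illustration, Python's standard library bundles it as the `sqlite3` module, so the sketch below runs without installing or starting a server. The `tools` table, its rows, and the `names_by_kind` helper are hypothetical and exist only for this example.

```python
import sqlite3

# ":memory:" keeps the whole database in RAM; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tools (name TEXT PRIMARY KEY, kind TEXT)")

# Illustrative rows only; the 'kind' labels are assumptions for the demo.
conn.executemany(
    "INSERT INTO tools (name, kind) VALUES (?, ?)",
    [
        ("PostgreSQL", "relational"),
        ("MongoDB", "document"),
        ("Redis", "key-value"),
    ],
)
conn.commit()

def names_by_kind(connection, kind):
    """Return tool names of one kind, using a parameterized query."""
    rows = connection.execute(
        "SELECT name FROM tools WHERE kind = ? ORDER BY name", (kind,)
    )
    return [row[0] for row in rows]

print(names_by_kind(conn, "relational"))
```

The `?` placeholders let the driver bind values separately from the SQL text, which avoids building queries by string concatenation.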
-
Scala Learning Resources
- Scala Style Guide
- Scala - oriented and functional programming in one concise, high-level language. Scala's static types help avoid bugs in complex applications, and its JVM and JavaScript runtimes let you build high-performance systems with easy access to huge ecosystems of libraries.
- Creating a Scala Maven application for Apache Spark in HDInsight using IntelliJ
- Using Scala to Program AWS Glue ETL Scripts
- Using Flink Scala shell with Amazon EMR clusters
- AWS EMR and Spark 2 using Scala from Udemy
- Using the Google Cloud Storage connector with Apache Spark
- Write and run Spark Scala jobs on Cloud Dataproc for Google Cloud
- Scala Courses and Certifications from edX
- Scala Courses from Coursera
- Top Scala Courses from Udemy
- Intro to Spark DataFrames using Scala with Azure Databricks
-
CUDA Tools, Libraries, and Frameworks
- CUDA-X HPC - X HPC includes highly tuned kernels essential for high-performance computing (HPC).
- CUDA Toolkit - accelerated applications. The CUDA Toolkit lets you develop, optimize, and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library to build and deploy your application on major architectures including x86, Arm and POWER.
- Minkowski Engine - differentiation library for sparse tensors. It supports all standard neural network layers such as convolution, pooling, unpooling, and broadcasting operations for sparse tensors.
- CuPy - compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it. It supports a subset of numpy.ndarray interface.
- cuDF - like API that will be familiar to data engineers & data scientists, so they can use it to easily accelerate their workflows without going into the details of CUDA programming.
- ArrayFire - purpose library that simplifies the process of developing software that targets parallel and massively-parallel architectures including CPUs, GPUs, and other hardware acceleration devices.
- AresDB - powered real-time analytics storage and query engine. It features low query latency, high data freshness and highly efficient in-memory and on disk storage management.
- GraphVite - speed and large-scale embedding learning in various applications.
-
CUDA Learning Resources
- CUDA Toolkit Documentation
- CUDA Quick Start Guide
- CUDA on WSL
- CUDA GPU support for TensorFlow
- NVIDIA Deep Learning cuDNN Documentation
- NVIDIA GPU Cloud Documentation
- NVIDIA NGC - optimized software for deep learning, machine learning, and high-performance computing (HPC) workloads.
- NVIDIA NGC Containers - accelerated software for AI, machine learning and HPC. These containers take full advantage of NVIDIA GPUs on-premises and in the cloud.
- CUDA - in GPU-accelerated applications, the sequential part of the workload runs on the CPU, which is optimized for single-threaded performance, while the compute-intensive portion of the application runs on thousands of GPU cores in parallel. When using CUDA, developers can program in popular languages such as C, C++, Fortran, Python and MATLAB.
-
MATLAB Learning Resources
- MATLAB
- MATLAB Documentation
- MATLAB and Simulink Training from MATLAB Academy
- MathWorks Certification Program
- Apache Spark Basics | MATLAB & Simulink
- MATLAB Hadoop and Spark | MATLAB & Simulink
- MATLAB Online Courses from Udemy
- MATLAB Online Courses from Coursera
- MATLAB Online Courses from edX
- Building a MATLAB GUI
- MATLAB Style Guidelines 2.0
- Setting Up Git Source Control with MATLAB & Simulink
- Pull, Push and Fetch Files with Git with MATLAB & Simulink
- Create New Repository with MATLAB & Simulink
- PRMLT
- Getting Started with MATLAB
-
MATLAB Tools, Libraries, Frameworks
- MATLAB and Simulink Services & Applications List
- MATLAB in the Cloud - public clouds, including [AWS](https://aws.amazon.com/) and [Azure](https://azure.microsoft.com/).
- MATLAB Online™
- Simulink - Based Design. It supports simulation, automatic code generation, and continuous testing of embedded systems.
- Simulink Online™
- hctsa - series analysis using Matlab.
- Plotly
- YALMIP
- GNU Octave - level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and for performing other numerical experiments. It also provides extensive graphics capabilities for data visualization and manipulation.
- MATLAB Drive™
- SoC Blockset™
- Wireless HDL Toolbox™ - verified, hardware-ready Simulink® blocks and subsystems for developing 5G, LTE, and custom OFDM-based wireless communication applications. It includes reference applications, IP blocks, and gateways between frame and sample-based processing.
- ThingSpeak™ - of-concept IoT systems that require analytics.
- Image Processing Toolbox™ - standard algorithms and workflow apps for image processing, analysis, visualization, and algorithm development. You can perform image segmentation, image enhancement, noise reduction, geometric transformations, image registration, and 3D image processing.
-
Java Tools, Libraries, and Frameworks
- Java SE
- JDK Development Tools
- IntelliJ IDEA
- NetBeans
- Elasticsearch
- RxJava - A library for composing asynchronous and event-based programs by using observable sequences. It extends the [observer pattern](http://en.wikipedia.org/wiki/Observer_pattern) to support sequences of data/events and adds operators that allow you to compose sequences together declaratively while abstracting away concerns about things like low-level threading, synchronization, thread-safety and concurrent data structures.
- Guava
- Retrofit - A type-safe HTTP client for Android and Java developed by Square.
- Apache Flink - An open source stream processing framework with powerful stream- and batch-processing capabilities and elegant, fluent APIs in Java and Scala.
- Fastjson
- libGDX - A cross-platform Java game development framework based on OpenGL (ES) that works on Windows, Linux, Mac OS X, Android, your WebGL enabled browser and iOS.
- Jenkins - An open-source automation server. Built with Java, it provides over 1700 [plugins](https://plugins.jenkins.io/) to support automating virtually anything, so that humans can actually spend their time doing things machines cannot.
- Redisson - A Redis Java client with features of an In-Memory Data Grid. Over 50 Redis based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Publish / Subscribe, Bloom filter, Spring Cache, Tomcat, Scheduler, JCache API, Hibernate, MyBatis, RPC, and local cache.
- GraalVM - A virtual machine for running applications written in JVM-based languages like Java, Scala, Clojure, Kotlin, and LLVM-based languages such as C and C++.
- Gradle - A build tool with a focus on build automation and support for multi-language software development. From mobile apps to microservices, from small startups to big enterprises, Gradle helps teams build, automate and deliver better software, faster. Write in Java, C++, Python or your language of choice.
- Apache Groovy - A powerful, optionally typed and dynamic language, with static-typing and static compilation capabilities, for the Java platform aimed at improving developer productivity thanks to a concise, familiar and easy to learn syntax. It integrates smoothly with any Java program, and immediately delivers to your application powerful features, including scripting capabilities, Domain-Specific Language authoring, runtime and compile-time meta-programming and functional programming.
- JaCoCo
- Apache JMeter
- JUnit
- Mockito
- SpotBugs
- SpringBoot - Helps you create Spring-powered, production-grade applications and services with absolute minimum fuss. It takes an opinionated view of the Spring platform so that new and existing users can quickly get to the bits they need.
- YourKit
- DBeaver - A multi-platform database tool for developers, SQL programmers, database administrators and analysts. Supports any database that has a JDBC driver (which basically means ANY database). The EE version also supports non-JDBC datasources (MongoDB, Cassandra, Redis, DynamoDB, etc).
-
Python Learning Resources
- Python - A high-level programming language. Python is used heavily in the fields of Data Science and Machine Learning.
- Python Developer’s Guide
- Azure Functions Python developer guide
- CheckiO
- Python Institute
- MTA: Introduction to Programming Using Python Certification
- Getting Started with Python in Visual Studio Code
- Google's Python Style Guide
- Google's Python Education Class
- Real Python
- Intro to Python for Data Science
- Intro to Python by W3schools
- Codecademy's Python 3 course
- Learn Python with Online Courses and Classes from edX
- Python Courses Online from Coursera
- PCPP – Certified Professional in Python Programming 2
- PCEP – Certified Entry-Level Python Programmer certification
- PCAP – Certified Associate in Python Programming certification
-
Python Frameworks and Tools
- CherryPy - An object-oriented HTTP web framework.
- Python Package Index (PyPI)
- PyCharm
- Django - A high-level Python Web framework that encourages rapid development and clean, pragmatic design.
- Flask
- Web2py - An open-source web application framework written in Python that allows web developers to program dynamic web content. One web2py instance can run multiple web sites using different databases.
- Tornado - A Python web framework and asynchronous networking library that uses non-blocking network I/O, which lets it scale to tens of thousands of open connections.
- HTTPie
- Scrapy - A high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
- Sentry
- Sanic
- Pyramid - A Python web framework that makes real-world web application development and deployment more fun and more productive.
- TurboGears
- Falcon - A high-performance Python web framework for building large-scale app backends and microservices with support for MongoDB, Pluggable Applications and autogenerated Admin.
- NumPy
- Pillow
- IPython
- GraphLab Create - A Python library for building large-scale, high-performance machine learning models.
- Pandas
- Matplotlib - A Python plotting library that produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.
- Python Tools for Visual Studio (PTVS)
-
R Tools, Libraries, and Frameworks
- Dash
- Visual Studio Code
- Code Server
- VSCode-R - A VS Code extension for the R language, including features such as extended syntax highlighting, R language service based on code analysis, interacting with R terminals, viewing data, plots, workspace variables, help pages, managing packages, and working with [R Markdown](https://rmarkdown.rstudio.com/) documents.
- R Debugger
- Shiny
- Rmarkdown
- Plotly
- Metaflow - A library that helps scientists and engineers build and manage real-life data science projects. Metaflow was originally developed at Netflix to boost productivity of data scientists who work on a wide variety of projects from classical statistics to state-of-the-art deep learning.
- Prophet - A procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data.
- LightGBM
- MLR
- Plumber
- Drake - An R-focused pipeline toolkit for reproducibility and high-performance computing.
- DiagrammeR
- Knitr - A general-purpose literate programming engine in R, with lightweight APIs designed to give users full control of the output without heavy coding work.
- Broom
- RStudio - An integrated development environment for R. It includes a console, a syntax-highlighting editor that supports direct code execution, and tools for plotting, history, debugging and workspace management.
- CatBoost
-
Scala Tools and Libraries
- Dotty
- Scala.js
- Polynote
- Scala Native - An ahead-of-time compiler and lightweight managed runtime designed specifically for Scala.
- Gitbucket
- Finagle - A protocol-agnostic RPC system.
- Gatling - A load test tool that supports HTTP, WebSocket, Server-Sent-Events and JMS.
- Scalatra - A high-performance, async web framework, inspired by [Sinatra](https://www.sinatrarb.com/).
- Azure Databricks - An Apache Spark-based big data analytics service designed for data science and data engineering. Azure Databricks sets up your Apache Spark environment in minutes, autoscales, and lets you collaborate on shared projects in an interactive workspace. Azure Databricks supports Python, Scala, R, Java, and SQL, as well as data science frameworks and libraries including TensorFlow, PyTorch, and scikit-learn.
-
R Learning Resources
-
Computer Vision Tools, Libraries, and Frameworks
- OpenCV - A library focused on real-time computer vision applications. The C++, Python, and Java interfaces support Linux, macOS, Windows, iOS, and Android.
- LRSLibrary - Low-Rank and Sparse Tools for Background Modeling and Subtraction in Videos. The library was designed for moving object detection in videos, but it can also be used for other computer vision and machine learning problems.
- Automated Driving Toolbox™ - Provides algorithms and tools for designing, simulating, and testing ADAS and autonomous driving systems. Visualization tools include a bird's-eye-view plot and scope for sensor coverage, detections and tracks, and displays for video, lidar, and maps. The toolbox lets you import and work with HERE HD Live Map data and OpenDRIVE® road networks. It also provides reference application examples for common ADAS and automated driving features, including FCW, AEB, ACC, LKA, and parking valet. The toolbox supports C/C++ code generation for rapid prototyping and HIL testing, with support for sensor fusion, tracking, path planning, and vehicle controller algorithms.
- Statistics and Machine Learning Toolbox™
- Data Acquisition Toolbox™
- Computer Vision Toolbox™
- ROS Toolbox
- Mapping Toolbox™
- Image Processing Toolbox™ - Provides reference-standard algorithms and workflow apps for image processing, analysis, visualization, and algorithm development. You can perform image segmentation, image enhancement, noise reduction, geometric transformations, image registration, and 3D image processing.
- Model Predictive Control Toolbox™ - By running closed-loop simulations, you can evaluate controller performance.
- Robotics Toolbox™ - Includes a Simulink model of a non-holonomic vehicle. The Toolbox also includes a detailed Simulink model for a quadrotor flying robot.
- UAV Toolbox
- Deep Learning Toolbox™ - Lets you use convolutional neural networks and long short-term memory (LSTM) networks to perform classification and regression on image, time-series, and text data. You can build network architectures such as generative adversarial networks (GANs) and Siamese networks using automatic differentiation, custom training loops, and shared weights. With the Deep Network Designer app, you can design, analyze, and train networks graphically. It can exchange models with TensorFlow™ and PyTorch through the ONNX format and import models from TensorFlow-Keras and Caffe. The toolbox supports transfer learning with DarkNet-53, ResNet-50, NASNet, SqueezeNet and many other pretrained models.
- Parallel Computing Toolbox™ - Lets you solve computationally and data-intensive problems using multicore processors, GPUs, and computer clusters. High-level constructs such as parallel for-loops, special array types, and parallelized numerical algorithms enable you to parallelize MATLAB® applications without CUDA or MPI programming. The toolbox lets you use parallel-enabled functions in MATLAB and other toolboxes. You can use the toolbox with Simulink® to run multiple simulations of a model in parallel. Programs and models can run in both interactive and batch modes.
- Deep Learning HDL Toolbox™ - Provides pre-built bitstreams for running a variety of deep learning networks on supported Xilinx® and Intel® FPGA and SoC devices. Profiling and estimation tools let you customize a deep learning network by exploring design, performance, and resource utilization tradeoffs.
- Reinforcement Learning Toolbox™ - Supports implementing controllers and decision-making algorithms for complex applications such as resource allocation, robotics, and autonomous systems.
- Lidar Toolbox™ - Provides lidar-camera cross calibration for workflows that combine computer vision and lidar processing.
- Vision HDL Toolbox™ - Provides pixel-streaming algorithms for the design and implementation of vision systems on FPGAs and ASICs. It provides a design framework that supports a diverse set of interface types, frame sizes, and frame rates. The image processing, video, and computer vision algorithms in the toolbox use an architecture appropriate for HDL implementations.
- Microsoft AirSim - A simulator for drones, cars and more, built on Unreal Engine. It is open-source, cross platform, and supports [software-in-the-loop simulation](https://www.mathworks.com/help///ecoder/software-in-the-loop-sil-simulation.html) with popular flight controllers such as PX4 & ArduPilot and [hardware-in-loop](https://www.ni.com/en-us/innovations/white-papers/17/what-is-hardware-in-the-loop-.html) with PX4 for physically and visually realistic simulations. It is developed as an Unreal plugin that can simply be dropped into any Unreal environment. AirSim is being developed as a platform for AI research to experiment with deep learning, computer vision and reinforcement learning algorithms for autonomous vehicles.
- Partial Differential Equation Toolbox™