NLP-Guide
Natural Language Processing (NLP). Covering topics such as Tokenization, Part Of Speech tagging (POS), Machine translation, Named Entity Recognition (NER), Classification, and Sentiment analysis.
https://github.com/mikeroyal/NLP-Guide
-
Java Tools and Frameworks
- Retrofit - A type-safe HTTP client for Android and Java developed by Square.
- Guava
- RxJava - A library for composing asynchronous and event-based programs by using observable sequences. It extends the [observer pattern](http://en.wikipedia.org/wiki/Observer_pattern) to support sequences of data/events and adds operators that allow you to compose sequences together declaratively while abstracting away concerns about things like low-level threading, synchronization, thread-safety and concurrent data structures.
- IntelliJ IDEA
- Apache Groovy - A powerful, optionally typed and dynamic language, with static-typing and static compilation capabilities, for the Java platform, aimed at improving developer productivity thanks to a concise, familiar and easy to learn syntax. It integrates smoothly with any Java program and brings scripting capabilities, Domain-Specific Language authoring, runtime and compile-time meta-programming, and functional programming to your application.
- YourKit
- Jenkins - An open-source automation server. Built with Java, it provides over 1700 [plugins](https://plugins.jenkins.io/) to support automating virtually anything, so that humans can actually spend their time doing things machines cannot.
- GraalVM - A universal virtual machine for running applications written in JavaScript, Python, Ruby, R, JVM-based languages like Java, Scala, Clojure, Kotlin, and LLVM-based languages such as C and C++.
- Gradle - A build automation tool for multi-language software development. From mobile apps to microservices, from small startups to big enterprises, Gradle helps teams build, automate and deliver better software, faster. Write in Java, C++, Python or your language of choice.
- Apache Flink - An open source stream processing framework with powerful stream- and batch-processing capabilities and elegant, fluent APIs in Java and Scala.
- DBeaver - A free multi-platform database tool for developers, SQL programmers, database administrators and analysts. It supports any database that has a JDBC driver, which covers nearly every database. The EE version also supports non-JDBC data sources (MongoDB, Cassandra, Redis, DynamoDB, etc.).
- Java SE
- JDK Development Tools
- Fastjson
- libGDX - A cross-platform Java game development framework based on OpenGL (ES) that works on Windows, Linux, Mac OS X, Android, your WebGL enabled browser and iOS.
- Redisson - A Redis Java client with features of an In-Memory Data Grid. It offers over 50 Redis-based Java objects and services: Set, Multimap, SortedSet, Map, List, Queue, Deque, Semaphore, Lock, AtomicLong, Map Reduce, Publish / Subscribe, Bloom filter, Spring Cache, Tomcat, Scheduler, JCache API, Hibernate, MyBatis, RPC, and local cache.
- JaCoCo
- JUnit
- Mockito
- SpotBugs
- Java Design Patterns
- OkHttp
- LeakCanary
- Elasticsearch
-
C/C++ Tools and Frameworks
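- AppCode - An IDE that constantly monitors the quality of your code. It warns you of errors and code smells and suggests quick-fixes to resolve them automatically. AppCode provides many code inspections for Objective-C, Swift, and C/C++, plus a number of inspections for other supported languages. All code inspections are run on the fly.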
- Visual Studio Code
- ANTLR (ANother Tool for Language Recognition)
- Visual Studio - An integrated development environment (IDE) from Microsoft; a feature-rich application that can be used for many aspects of software development. Visual Studio makes it easy to edit, debug, build, and publish your app using Microsoft software development platforms such as Windows API, Windows Forms, Windows Presentation Foundation, and Windows Store.
- OpenCV - A highly optimized library with a focus on real-time applications. Its cross-platform C++, Python and Java interfaces support Linux, MacOS, Windows, iOS, and Android.
- Cython
- CMake - An open-source, cross-platform family of tools designed to build, test and package software. CMake is used to control the software compilation process using simple platform and compiler independent configuration files, and generate native makefiles and workspaces that can be used in the compiler environment of your choice.
- Libtool
- GCC - A compiler collection that includes front ends for C, C++, Objective-C, Fortran, Ada, Go, and D, as well as libraries for these languages.
- GDB
- Conan
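- GSL - A numerical library for C and C++ programmers. It provides a wide range of mathematical routines such as random number generators, special functions, and least-squares fitting. There are over 1000 functions in total with an extensive test suite.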
- ReSharper C++
- CLion - A cross-platform IDE for C and C++ developers developed by JetBrains.
- Code::Blocks
- High Performance Computing (HPC) SDK
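- Boost - A collection of peer-reviewed, portable C++ libraries and an educational opportunity focused on cutting-edge C++. Boost has been a participant in the annual Google Summer of Code since 2007, in which students develop their skills by working on Boost Library development.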
- Automake
- OpenGL Extension Wrangler Library (GLEW) - A cross-platform open-source C/C++ extension loading library. GLEW provides efficient run-time mechanisms for determining which OpenGL extensions are supported on the target platform.
- Maven
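- TAU (Tuning And Analysis Utilities) - A performance analysis tool capable of gathering performance information through instrumentation of functions, methods, basic blocks, and statements, as well as event-based sampling. All C++ language features are supported, including templates and namespaces.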
- Clang - A production-quality C, Objective-C, C++ and Objective-C++ compiler when targeting X86-32, X86-64, and ARM (other targets may have caveats, but are usually easy to fix). Clang is used in production to build performance-critical software like Google Chrome or Firefox.
- Oat++ - A light and powerful C++ web framework for highly scalable and resource-efficient web applications. It is zero-dependency and easily portable.
- Infer - A static analysis tool for Java, C++, Objective-C, and C. Infer is written in [OCaml](https://ocaml.org/).
- AWS SDK for C++
- Vcpkg
- spdlog - A very fast, header-only/compiled, C++ logging library.
- CppSharp
- JavaCPP
- Azure SDK for C++
- Azure SDK for C
- C++ Client Libraries for Google Cloud Services
-
Python Frameworks, Libraries, and Tools
- PyCharm
- Matplotlib - A plotting library that produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.
- Python Package Index (PyPI)
- Django - A high-level Python Web framework that encourages rapid development and clean, pragmatic design.
- Web2py - An open-source web application framework written in Python that lets web developers program dynamic web content. One web2py instance can run multiple websites using different databases.
- Falcon - A reliable, high-performance Python web framework for building large-scale app backends and microservices.
- Pillow
- IPython
- Pandas
- Python Tools for Visual Studio (PTVS)
- Python Fire
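- Luigi - A Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, and visualization, and comes with Hadoop support built in.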
- Locust
- Pipenv
- spaCy
- AWS Chalice
- Neural Network Intelligence (NNI)
- Bottle - A fast, simple and lightweight WSGI micro web-framework for Python. It is distributed as a single file module and has no dependencies other than the [Python Standard Library](https://docs.python.org/library/).
- PuLP
- HTTPie
- Sanic
-
ML Frameworks, Libraries, and Tools
- Jupyter Notebook - An open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Jupyter is widely used for data cleaning and transformation, numerical simulation, statistical modeling, data visualization, data science, and machine learning.
- Amazon SageMaker
- Anaconda
- Apache Spark - A unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
- Scikit-Learn
- PyTorch
- Azure Databricks - A fast and collaborative Apache Spark-based big data analytics service designed for data science and data engineering. Azure Databricks sets up your Apache Spark environment in minutes, autoscales, and lets you collaborate on shared projects in an interactive workspace. It supports Python, Scala, R, Java, and SQL, as well as data science frameworks and libraries including TensorFlow, PyTorch, and scikit-learn.
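- Apple CoreML - A framework that helps integrate machine learning models into your app. Your app uses Core ML APIs and user data to make predictions, and to train or fine-tune models, all on the user's device. A model is the result of applying a machine learning algorithm to a set of training data. You use a model to make predictions based on new input data.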
- Apache OpenNLP - An open-source library for a machine learning based toolkit used in the processing of natural language text. It features an API for use cases like [Named Entity Recognition](https://en.wikipedia.org/wiki/Named-entity_recognition), Sentence Detection, [POS (Part-Of-Speech) tagging](https://en.wikipedia.org/wiki/Part-of-speech_tagging), [Tokenization](https://en.wikipedia.org/wiki/Tokenization_(data_security)), [Feature extraction](https://en.wikipedia.org/wiki/Feature_extraction), [Chunking](https://en.wikipedia.org/wiki/Chunking_(psychology)), [Parsing](https://en.wikipedia.org/wiki/Parsing), and [Coreference resolution](https://en.wikipedia.org/wiki/Coreference).
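- Open Neural Network Exchange (ONNX) - An open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides an open source format for AI models. It defines an extensible computation graph model, as well as definitions of built-in operators and standard data types.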
- Apache MXNet
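- AutoGluon - A toolkit that automates machine learning tasks. With just a few lines of code, you can train and deploy high-accuracy deep learning models on tabular, image, and text data.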
- Apache PredictionIO
- BigDL
- Eclipse Deeplearning4J (DL4J) - A set of projects intended to support all the needs of a JVM-based (Scala, Kotlin, Clojure, and Groovy) deep learning application. This means starting with the raw data, loading and preprocessing it from wherever and whatever format it is in, and then building and tuning a wide variety of simple and complex deep learning networks.
- XGBoost
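- nGraph - An open source C++ library, compiler and runtime for deep learning. The nGraph Compiler aims to accelerate developing AI workloads using any deep learning framework and deploying to a variety of hardware targets. It provides freedom, performance, and ease-of-use to AI developers.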
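- Theano - A Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently, including tight integration with NumPy.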
- Caffe
- PlaidML
- Tensorflow_macOS - A Mac-optimized version of TensorFlow and TensorFlow Addons for macOS 11.0+ accelerated using Apple's ML Compute framework.
- Cluster Manager for Apache Kafka (CMAK)
- Apache Spark Connector for SQL Server and Azure SQL - A high-performance connector that enables you to use transactional data in big data analytics and persist results for ad-hoc queries or reporting. The connector allows you to use any SQL database, on-premises or in the cloud, as an input data source or output data sink for Spark jobs.
- OpenCV - A highly optimized library with a focus on real-time computer vision applications. The C++, Python, and Java interfaces support Linux, MacOS, Windows, iOS, and Android.
-
C/C++ Learning Resources
- Google C++ Style Guide
- C - A general-purpose, high-level language that was originally developed by Dennis M. Ritchie to develop the UNIX operating system at Bell Labs. It supports structured programming, lexical variable scope, and recursion, with a static type system. C also provides constructs that map efficiently to typical machine instructions, which makes it one of the most widely used programming languages today.
- Embedded C - A set of language extensions for the C programming language by the C Standards Committee to address issues that exist between C extensions for different [embedded systems](https://en.wikipedia.org/wiki/Embedded_system). The extensions help enhance microprocessor features such as fixed-point arithmetic, multiple distinct memory banks, and basic I/O operations. Embedded C is widely used for embedded software.
- C & C++ Developer Tools from JetBrains
- Open source C++ libraries on cppreference.com
- C++ Graphics libraries
- C++ Libraries in MATLAB
- Introduction C++ Education course on Google Developers
- C++ style guide for Fuchsia
- Chromium C++ Style Guide
- C++ Core Guidelines
- C++ Style Guide for ROS
- Learn C++
- Learn C : An Interactive C Tutorial
- C++ Online Training Courses on LinkedIn Learning
- C++ Tutorials on W3Schools
- Learn C Programming Online Courses on edX
- Learn C++ with Online Courses on edX
- Learn C++ on Codecademy
- Coding for Everyone: C and C++ course on Coursera
- C++ For C Programmers on Coursera
- C++ Online Courses on Udemy
- Top C Courses on Udemy
- Basics of Embedded C Programming for Beginners on Udemy
- C++ For Programmers Course on Udacity
- C++ Fundamentals Course on Pluralsight
- C++ - A cross-platform language for building high-performance applications, developed by Bjarne Stroustrup as an extension to the C language.
- C++ Tools and Libraries Articles
-
R Tools, Libraries, and Frameworks
- Rmarkdown
- Plotly
- Metaflow - A human-friendly Python/R library that helps scientists and engineers build and manage real-life data science projects. Metaflow was originally developed at Netflix to boost productivity of data scientists who work on a wide variety of projects from classical statistics to state-of-the-art deep learning.
- LightGBM
- MLR
- Plumber
- Drake - An R-focused pipeline toolkit for reproducibility and high-performance computing.
- DiagrammeR
- Knitr - A general-purpose literate programming engine in R, with lightweight APIs designed to give users full control of the output without heavy coding work.
- Broom
- ML workspace - An all-in-one web-based IDE specialized for machine learning and data science. It is simple to deploy and gets you started within minutes to productively build ML solutions on your own machines. It comes preloaded with a variety of popular data science libraries (TensorFlow, PyTorch, Keras, and MXNet) and dev tools (Jupyter, VS Code, and TensorBoard), all configured, optimized, and integrated.
- Rplugin
-
R Learning Resources
- An Introduction to R
- R
- Google's R Style Guide
- R developer's guide to Azure
- Running R on AWS
- RStudio Server Pro for AWS
- Learn R by Codecademy
- Learn R Programming with Online Courses and Lessons by edX
- R Language Courses by Coursera
- Learn R For Data Science by Udacity
- Running R at Scale on Google Compute Engine
-
Java Learning Resources
-
CUDA Learning Resources
- CUDA - A parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). In GPU-accelerated applications, the sequential part of the workload runs on the CPU, which is optimized for single-threaded performance. The compute-intensive portion of the application runs on thousands of GPU cores in parallel. When using CUDA, developers can program in popular languages such as C, C++, Fortran, Python and MATLAB.
- CUDA Toolkit Documentation
- CUDA Quick Start Guide
- CUDA on WSL
- NVIDIA Deep Learning cuDNN Documentation
-
CUDA Tools, Libraries, and Frameworks
- CUDA Toolkit - A collection of tools and libraries that provide a development environment for creating high-performance GPU-accelerated applications. The CUDA Toolkit lets you develop, optimize, and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library to build and deploy your application on major architectures including x86, Arm and POWER.
- NVIDIA cuDNN - A GPU-accelerated library of primitives for [deep neural networks](https://developer.nvidia.com/deep-learning). cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN accelerates widely used deep learning frameworks, including [Caffe2](https://caffe2.ai/), [Chainer](https://chainer.org/), [Keras](https://keras.io/), [MATLAB](https://www.mathworks.com/solutions/deep-learning.html), [MxNet](https://mxnet.incubator.apache.org/), [PyTorch](https://pytorch.org/), and [TensorFlow](https://www.tensorflow.org/).
- CUDA-X HPC - A collection of libraries, tools, compilers and APIs for high-performance computing. CUDA-X HPC includes highly tuned kernels essential for high-performance computing (HPC).
- Chainer - A Python-based deep learning framework aiming at flexibility. It provides automatic differentiation APIs based on the define-by-run approach (dynamic computational graphs) as well as object-oriented high-level APIs to build and train neural networks. It also supports CUDA/cuDNN using [CuPy](https://github.com/cupy/cupy) for high performance training and inference.
- CuPy - An implementation of a NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it. It supports a subset of the numpy.ndarray interface.
- CatBoost
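- cuDF - A GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data. cuDF provides a pandas-like API that will be familiar to data engineers and data scientists, so they can use it to easily accelerate their workflows without going into the details of CUDA programming.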
- ArrayFire - A general-purpose library that simplifies the process of developing software that targets parallel and massively-parallel architectures including CPUs, GPUs, and other hardware acceleration devices.
- AresDB - A GPU-powered real-time analytics storage and query engine. It features low query latency, high data freshness and highly efficient in-memory and on-disk storage management.
- CUTLASS - A collection of CUDA C++ template abstractions for implementing high-performance matrix-multiplication (GEMM) at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS.
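- NVIDIA Container Toolkit - A collection of tools and libraries for building and running GPU-accelerated Docker containers. The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs.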
- Thrust - A C++ parallel programming library which resembles the C++ Standard Library. Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs.
- Numba - An open source, NumPy-aware optimizing compiler for Python sponsored by Anaconda, Inc. It uses the LLVM compiler project to generate machine code from Python syntax. Numba can compile a large subset of numerically-focused Python, including many NumPy functions. Additionally, Numba has support for automatic parallelization of loops, generation of GPU-accelerated code, and creation of ufuncs and C callbacks.
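- cuML - A suite of libraries that implement machine learning algorithms and mathematical primitives on GPUs. In most cases, cuML's Python API matches the API from scikit-learn.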
- CUB
- Tensorman
- Kintinuous - A real-time dense visual SLAM system capable of producing high quality globally consistent point and mesh reconstructions over hundreds of metres in real-time with only a low-cost commodity RGB-D sensor.
- Minkowski Engine - An auto-differentiation library for sparse tensors. It supports all standard neural network layers such as convolution, pooling, unpooling, and broadcasting operations for sparse tensors.
- Arraymancer - A tensor (N-dimensional array) project in Nim. The main focus is providing a fast and ergonomic CPU, CUDA and OpenCL ndarray library on which to build a scientific computing ecosystem.
-
MATLAB Learning Resources
- MATLAB
- MATLAB Documentation
- Getting Started with MATLAB
- MATLAB Online Courses from Udemy
- MATLAB Online Courses from Coursera
- MATLAB Online Courses from edX
- Building a MATLAB GUI
- MATLAB Style Guidelines 2.0
- Setting Up Git Source Control with MATLAB & Simulink
- Pull, Push and Fetch Files with Git with MATLAB & Simulink
- Create New Repository with MATLAB & Simulink
- PRMLT
- MathWorks Certification Program
-
MATLAB Tools, Libraries, Frameworks
- MATLAB and Simulink Services & Applications List
- MATLAB in the Cloud - Lets you run MATLAB in cloud environments, from MathWorks Cloud to public clouds including [AWS](https://aws.amazon.com/) and [Azure](https://azure.microsoft.com/).
- Simulink - A block diagram environment for Model-Based Design. It supports simulation, automatic code generation, and continuous testing of embedded systems.
- Simulink Online™
- MATLAB Drive™
- Parallel Computing Toolbox™ - A tool that lets you solve computationally and data-intensive problems using multicore processors, GPUs, and computer clusters. High-level constructs such as parallel for-loops, special array types, and parallelized numerical algorithms enable you to parallelize MATLAB® applications without CUDA or MPI programming. The toolbox lets you use parallel-enabled functions in MATLAB and other toolboxes. You can use the toolbox with Simulink® to run multiple simulations of a model in parallel. Programs and models can run in both interactive and batch modes.
- Image Processing Toolbox™ - A tool that provides a comprehensive set of reference-standard algorithms and workflow apps for image processing, analysis, visualization, and algorithm development. You can perform image segmentation, image enhancement, noise reduction, geometric transformations, image registration, and 3D image processing.
- Computer Vision Toolbox™
- Statistics and Machine Learning Toolbox™
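- Lidar Toolbox™ - A tool that provides algorithms, functions, and apps for designing, analyzing, and testing lidar processing systems. It supports lidar-camera cross calibration for workflows that combine computer vision and lidar processing.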
- Mapping Toolbox™
- UAV Toolbox
- Partial Differential Equation Toolbox™
- ROS Toolbox
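- Robotics Toolbox™ - A toolbox that brings robotics-specific functionality to MATLAB, including a Simulink model of a non-holonomic vehicle. The Toolbox also includes a detailed Simulink model for a quadrotor flying robot.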
- Deep Learning Toolbox™ - A tool that provides a framework for designing and implementing deep neural networks with algorithms, pretrained models, and apps. You can use convolutional neural networks (ConvNets, CNNs) and long short-term memory (LSTM) networks to perform classification and regression on image, time-series, and text data. You can build network architectures such as generative adversarial networks (GANs) and Siamese networks using automatic differentiation, custom training loops, and shared weights. With the Deep Network Designer app, you can design, analyze, and train networks graphically. It can exchange models with TensorFlow™ and PyTorch through the ONNX format and import models from TensorFlow-Keras and Caffe. The toolbox supports transfer learning with DarkNet-53, ResNet-50, NASNet, SqueezeNet and many other pretrained models.
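- Reinforcement Learning Toolbox™ - A tool that provides an app, functions, and a Simulink® block for training policies using reinforcement learning algorithms. You can use these policies to implement controllers and decision-making algorithms for complex applications such as resource allocation, robotics, and autonomous systems.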
- Deep Learning HDL Toolbox™ - A tool that provides functions and tools to prototype and implement deep learning networks on FPGAs and SoCs. It provides pre-built bitstreams for running a variety of deep learning networks on supported Xilinx® and Intel® FPGA and SoC devices. Profiling and estimation tools let you customize a deep learning network by exploring design, performance, and resource utilization tradeoffs.
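- Model Predictive Control Toolbox™ - A tool that provides functions, an app, and Simulink® blocks for designing and simulating controllers using linear and nonlinear model predictive control (MPC). By running closed-loop simulations, you can evaluate controller performance.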
- Vision HDL Toolbox™ - A tool that provides pixel-streaming algorithms for the design and implementation of vision systems on FPGAs and ASICs. It provides a design framework that supports a diverse set of interface types, frame sizes, and frame rates. The image processing, video, and computer vision algorithms in the toolbox use an architecture appropriate for HDL implementations.
- SoC Blockset™
- Wireless HDL Toolbox™ - A tool that provides pre-verified, hardware-ready Simulink® blocks and subsystems for developing 5G, LTE, and custom OFDM-based wireless communication applications. It includes reference applications, IP blocks, and gateways between frame and sample-based processing.
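- ThingSpeak™ - An IoT analytics service that allows you to aggregate, visualize, and analyze live data streams in the cloud. ThingSpeak is often used for prototyping and proof-of-concept IoT systems that require analytics.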
- hctsa - A software package for running highly comparative time-series analysis using MATLAB.
- YALMIP
- MATLAB Schemer
- LRSLibrary - A library of Low-Rank and Sparse Tools for Background Modeling and Subtraction in Videos. The library was designed for moving object detection in videos, but it can also be used for other computer vision and machine learning problems.
- SEA-MAT
- Gramm - A data visualization toolbox for MATLAB that provides an easy-to-use, high-level interface to produce publication-quality plots of complex data with varied statistical visualizations. Gramm is inspired by R's ggplot2 library.
- GNU Octave - A high-level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and for performing other numerical experiments. It also provides extensive graphics capabilities for data visualization and manipulation.
-
Python Learning Resources
- CheckiO
- PCPP – Certified Professional in Python Programming 2
- Getting Started with Python in Visual Studio Code
- Google's Python Style Guide
- Google's Python Education Class
- Intro to Python for Data Science
- Intro to Python by W3schools
- Codecademy's Python 3 course
- Learn Python with Online Courses and Classes from edX
- Python Courses Online from Coursera
- The Python Open Source Computer Science Degree by Forrest Knight
- Real Python
-
Learning Resources for ML
- Machine Learning by Stanford University from Coursera
- Machine Learning Scholarship Program for Microsoft Azure from Udacity
- Microsoft Certified: Azure Data Scientist Associate
- Microsoft Certified: Azure AI Engineer Associate
- Azure Machine Learning training and deployment
- Learning Machine learning and artificial intelligence from Google Cloud Training
- JupyterLab
- Scheduling Jupyter notebooks on Amazon SageMaker ephemeral instances
- How to run Jupyter Notebooks in your Azure Machine Learning workspace
- Machine Learning Courses Online from Udemy
- Machine Learning Courses Online from Coursera
- Learn Machine Learning with Online Courses and Classes from edX
- Machine Learning
- Machine Learning Crash Course for Google Cloud
- AWS Training and Certification for Machine Learning (ML) Courses
-
Julia Learning Resources
-
Julia Tools, Libraries and Frameworks
- JuliaPro
- Juno
- Profile (Stdlib)
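- JuliaGPU - A GitHub organization created to unify the many packages for programming GPUs in Julia. With its high-level syntax and flexible compiler, Julia is well positioned to productively program hardware accelerators like GPUs without sacrificing performance.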
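- CUDA.jl - A package providing the main programming interface for working with NVIDIA CUDA GPUs using Julia. It features a user-friendly array abstraction, a compiler for writing CUDA kernels in Julia, and wrappers for various CUDA libraries.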
- Julia for VSCode
- JuMP.jl - A domain-specific modeling language for [mathematical optimization](https://en.wikipedia.org/wiki/Mathematical_optimization) embedded in Julia.
- Knet
- DataFrames.jl
- Flux.jl - A machine learning library that is a 100% pure-Julia stack and provides lightweight abstractions on top of Julia's native GPU and AD support.
- PyCall.jl
- IJulia.jl
- Optim.jl
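- Revise.jl - A package that lets you modify code and use the changes without restarting Julia. This can save you the overhead of restarting Julia, loading packages, and waiting for code to JIT-compile.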
- Debugger.jl
- AWS.jl
- Nanosoldier.jl
- RCall.jl
- MXNet.jl - The Apache MXNet Julia package, which brings flexible and efficient GPU computing and state-of-the-art deep learning to Julia.
- Distributions.jl
- IRTools.jl