{"id":18594146,"url":"https://github.com/rubenszimbres/repo-2016","last_synced_at":"2025-04-10T16:30:55.340Z","repository":{"id":107087259,"uuid":"62509868","full_name":"RubensZimbres/Repo-2016","owner":"RubensZimbres","description":"R, Python and Mathematica Codes in Machine Learning, Deep Learning, Artificial Intelligence, NLP and Geolocation","archived":false,"fork":false,"pushed_at":"2018-12-10T16:36:35.000Z","size":5240,"stargazers_count":107,"open_issues_count":0,"forks_count":114,"subscribers_count":16,"default_branch":"master","last_synced_at":"2025-03-25T00:41:49.477Z","etag":null,"topics":["autoencoder","deep-learning","face-recognition","keras","lasagne","lstm","lstm-neural-networks","mathematica","natural-language-processing","nlp","nlp-machine-learning","python","python-3","python3","rstats","theano","theano-models","time-series-analysis","timeseries","word2vec"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RubensZimbres.png","metadata":{"files":{"readme":"Readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-07-03T18:31:24.000Z","updated_at":"2025-01-11T19:52:33.000Z","dependencies_parsed_at":null,"dependency_job_id":"3ddb34c5-ba75-4804-a1d3-2e7f28e02f6c","html_url":"https://github.com/RubensZimbres/Repo-2016","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RubensZimbres%2FRepo-2016","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RubensZimbres%2FRepo-
2016/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RubensZimbres%2FRepo-2016/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RubensZimbres%2FRepo-2016/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RubensZimbres","download_url":"https://codeload.github.com/RubensZimbres/Repo-2016/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248252690,"owners_count":21072700,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["autoencoder","deep-learning","face-recognition","keras","lasagne","lstm","lstm-neural-networks","mathematica","natural-language-processing","nlp","nlp-machine-learning","python","python-3","python3","rstats","theano","theano-models","time-series-analysis","timeseries","word2vec"],"created_at":"2024-11-07T01:14:37.520Z","updated_at":"2025-04-10T16:30:55.326Z","avatar_url":"https://github.com/RubensZimbres.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# R, Python and Mathematica Codes in Data Science\n\n\u003cb\u003e Welcome to my GitHub repo. \u003c/b\u003e\n\nI am a Data Scientist and I code in R, Python and Wolfram Mathematica. 
Here you will find some Machine Learning, Deep Learning, Natural Language Processing and Artificial Intelligence models I developed.\n\n\u003cb\u003e Outputs of the models can be seen at my portfolio: \u003c/b\u003e https://drive.google.com/file/d/0B0RLknmL54khdjRQWVBKeTVxSHM/view?usp=sharing\n\n------------------\n# Mathematica Codes\n\n\u003cb\u003e MNIST_HOT.5.FULL:  \u003c/b\u003e\tis a solution for the MNIST dataset in Mathematica, with 96.51% accuracy, based on pixel differences.\n\n\u003cb\u003e Mathematica - Artificial Intelligence Simulating Interactions in Social Networks:  \u003c/b\u003e\tis a model that simulates human interactions in a social network using cellular automata and agent-based modeling. Each agent has 3 possible choices for interaction and a memory. The code has 14 pages with a big loop included in one line of code.\n\n\u003cb\u003e Mathematica - Facial Recognition in Movement: \u003c/b\u003e This code performs facial recognition on a downloaded YouTube video. The output is also a video with the result of face recognition (a YouTube link to the output is included in the code page).\n\n\u003cb\u003e Mathematica - Monte Carlo Simulation: \u003c/b\u003e is an animated model of a Markov Chain Monte Carlo Simulation for autonomous driving. 
A video of the dynamic output was also generated, and a link to the YouTube video is included in the code page.\n  \n\u003cb\u003e Mathematica - Social Network Surveillance:  \u003c/b\u003e\tis a model that tracks individuals in a social network, along with their connections and future interactions.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=https://github.com/RubensZimbres/Repo-2016/blob/master/pictures/MNIST_FINAL.jpg?raw=true\u003e\n\u003c/p\u003e\n\n------------------\n# Python Codes\n\nKeras version used in models: keras==1.1.0 | LSTM 0.2\n\n\u003cb\u003e Python - Autoencoder MNIST:  \u003c/b\u003e\tis an autoencoder model for classification of images developed with Keras, for the MNIST dataset, with model Checkpoint as a callback to save weights.\n\n\u003cb\u003e Python - Autoencoder for Text Classification:  \u003c/b\u003e\tis an autoencoder model for classification of text made with Keras, also with model Checkpoint.\n\n\u003cb\u003e Python - Deep Learning with Lasagne:  \u003c/b\u003e\tis a deep neural network developed with Lasagne, where you can see the values of the weights in each layer, including the bias.\n\n\u003cb\u003e Python - Face Recognition: \u003c/b\u003e\tis a model using OpenCV to detect faces.\n\n\u003cb\u003e Python - Image Extraction from Twitter: \u003c/b\u003e\tis a model that extracts pictures and their links from Twitter webpages, plotting with matplotlib.\n\n\u003cb\u003e Python - Keras Convolutional Neural Network: \u003c/b\u003e is a CNN developed to classify the MNIST dataset with an accuracy greater than 99%.\n\n\u003cb\u003e Python - Keras Deep Regressor: \u003c/b\u003e\tis a deep Neural Network made with Keras for prediction of a continuous output, with a learning rate scheduler based on the derivative of the error, random initial weights and loss history.\n\n\u003cb\u003e Python - Keras LSTM Network: \u003c/b\u003e\tis a Recurrent Neural Network (LSTM) to predict and generate text.\n\n\u003cb\u003e Python - Keras Multi Layer Perceptron: 
\u003c/b\u003e\tis an MLP model made with Keras for prediction and classification, with loss history and a learning rate scheduled according to the derivative of the error.\n\n\u003cb\u003e Python - Machine Learning: \u003c/b\u003e is a Principal Components Analysis followed by a Linear Regression.\n\n\u003cb\u003e Python - NLP Doc2Vec: \u003c/b\u003e\tis a Natural Language Processing model where I asked a Wikipedia webpage a question and 4 possible answers were semantically chosen from the tokenized and vectorized webpage, using KNN and cosine distance.\n\n\u003cb\u003e Python - NLP Semantic Analysis: \u003c/b\u003e\tis a Natural Language Processing model that classifies a given sentence according to semantic similarity to other sentences, using cosine distance.\n\n\u003cb\u003e Python - NLP Word2Vec: \u003c/b\u003e\tis a model developed from scratch to measure cosine similarity among words.\n\n\u003cb\u003e Python - Reinforcement Learning: \u003c/b\u003e\tis a model based on simple rules and Game Theory where agents' attitudes change according to the payoff achieved. It can be adapted for tit-for-tat, always cooperate, always defect and other strategies. 
Rewards were placed in the payoff matrix.\n\n\u003cb\u003e Python - Social Networks: \u003c/b\u003e\tis a model that draws social network configurations and connections.\n\n\u003cb\u003e Python - Support Vector Machines: \u003c/b\u003e\tis a Machine Learning model that classifies the Iris dataset with SVM and plots it.\n\n\u003cb\u003e Python - Theano Deep Learning: \u003c/b\u003e\tis a Neural Network with two hidden layers using Theano.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=https://github.com/RubensZimbres/Repo-2016/blob/master/pictures/GENSIM_Word2Vec.jpg?raw=true\u003e\n\u003c/p\u003e\n\n------------------\n# R Codes\n\n\u003cb\u003e R - Churn of Customers: \u003c/b\u003e is a model that uses a logistic regression associated with a threshold to predict which customers present the greatest risk of being lost.\n\n\u003cb\u003e R - Data Cleaning + Multinomial Regression: \u003c/b\u003e\tis a model that presents data cleaning and a multinomial regression using package nnet to classify customers according to their level of loyalty.\n\n\u003cb\u003e R - Face Recognition: \u003c/b\u003e\tis a code to detect faces and objects in R.\n\n\u003cb\u003e R - Geolocation Brazil: \u003c/b\u003e\tis a file for geo-spatial localization, Brazilian map.\n\n\u003cb\u003e R - Geolocation USA: \u003c/b\u003e\tis also a file for geo-spatial localization, USA map.\n\n\u003cb\u003e R - Geolocation World: \u003c/b\u003e\tis a file for geo-spatial localization, world map, zoom available, customizable icons.\n\n\u003cb\u003e R - Gradient Descent Logistic: \u003c/b\u003e\tis a model that performs a gradient descent to define a threshold for the sigmoid function in a Logistic Regression. 
Boosting was implemented and ROC curves compared.\n\n\u003cb\u003e R - H2O Deep Learning: \u003c/b\u003e\tis a Neural Network model developed to predict recommendations and word-of-mouth advertising.\n\n\u003cb\u003e R - Imbalanced classes \u003c/b\u003e is a model for employee churn, where features have no correlation with target variable and also there are imbalanced classes in the proportion 1/20. A logistic regression from scratch is applied, a hill climbing gradient is used to define the best threshold for the logistic function and after that, boosting was compared regarding AUC in a ROC plot.\n\n\u003cb\u003e Logistic Regression + Gradient Descent + Boosting \u003c/b\u003e is a model where features have no correlation with target variable. Logistic Regression with Gradient Descent was applied, and then Boosting.\n\n\u003cb\u003e R - MNIST: \u003c/b\u003e\tis a solution for the MNIST dataset, developed from scratch.\n\n\u003cb\u003e R - Markov Chains: \u003c/b\u003e\tis a simple visualization of Markov Chains and probabilities associated.\n\n\u003cb\u003e R - NeuralNet: \u003c/b\u003e is a Neural Network model developed to predict and classify word-of-mouth advertising.\n\n\u003cb\u003e R - Ridge Regression: \u003c/b\u003e is a model with Ridge Regularization made from scratch to prevent overfitting.\n\n\u003cb\u003e R - Deep Learning: \u003c/b\u003e is a Neural Network model with 2 hidden layers for prediction of a continuous variable.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=https://github.com/RubensZimbres/Repo-2016/blob/master/pictures/StockMarket.DOW.JONES.png?raw=true\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n\u003cimg 
src=https://github.com/RubensZimbres/Repo-2016/blob/master/pictures/Geolocation.png?raw=true\u003e\n\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frubenszimbres%2Frepo-2016","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frubenszimbres%2Frepo-2016","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frubenszimbres%2Frepo-2016/lists"}