{"id":20065908,"url":"https://github.com/reshiadavan/replica","last_synced_at":"2026-04-10T01:50:38.217Z","repository":{"id":143551889,"uuid":"605365707","full_name":"ReshiAdavan/Replica","owner":"ReshiAdavan","description":"A Data Generation AI Tool that generates new data based on data provided as input.","archived":false,"fork":false,"pushed_at":"2023-11-22T14:46:06.000Z","size":8808,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-12T23:09:48.536Z","etag":null,"topics":["bigrams","cnn","gpt","gru","juypter-notebook","lstm","python","pytorch","rnn","transformers"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ReshiAdavan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-23T02:05:37.000Z","updated_at":"2024-01-08T00:25:13.000Z","dependencies_parsed_at":"2023-10-11T21:38:55.003Z","dependency_job_id":"6bd04056-1b60-4881-9742-914597d813b5","html_url":"https://github.com/ReshiAdavan/Replica","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ReshiAdavan%2FReplica","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ReshiAdavan%2FReplica/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ReshiAdavan%2FReplica/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ReshiAdavan%2FRepli
ca/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ReshiAdavan","download_url":"https://codeload.github.com/ReshiAdavan/Replica/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241494139,"owners_count":19971870,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bigrams","cnn","gpt","gru","juypter-notebook","lstm","python","pytorch","rnn","transformers"],"created_at":"2024-11-13T13:53:16.612Z","updated_at":"2025-12-31T01:04:35.865Z","avatar_url":"https://github.com/ReshiAdavan.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Replica\n\nReplica is a data generation tool that leverages several machine learning architectures, including a ground-up implementation of the GPT-2 model, to generate data similar to the data passed to the application as input.\n\n### Inspiration\n\nI always thought ML applications suffered most during their training procedures. The time it takes to train, the data required to train, and the environment are all hindrances to ML models. So I thought to create a model that can infinitely generate data similar to the data you give it. 
This way, you remove the obstacle of always having to find data to train your model with.\n\nFYI: It does not accept every type of data (e.g., mp3 files, images, pixels, etc.)\n\n### Topics\n\n- Languages: Python\n- Frameworks/Libraries: PyTorch\n- Architectures: Transformers, RNNs/CNNs/GRUs, Bigrams, BoWs, GPT\n\n### Use It Yourself\n\nIt is as simple as cloning the repository, installing the required Python dependencies as prompted, and running the Python file in any IDE with the right interpreter.\n\nThe Jupyter Notebooks work the same way: just run the entire file or any cell, given the right IDE and environment.\n\n### Architectures (In Detail)\n\nThis section is for those who are curious about the underlying architecture of each of the models used in Replica.\n\n#### Bigram\n\nA bigram language model is a statistical language model used in natural language processing (NLP) and computational linguistics to predict the likelihood of a word based on the preceding word. It is a type of n-gram model where \"n\" is set to 2, meaning it considers pairs of consecutive words in a text.\n\nIn a bigram language model, the probability of a word is calculated based on the occurrence and frequency of word pairs in a training corpus. Specifically, it estimates the probability of a word \"w\" occurring after a preceding word \"v.\"\n\nIn Replica, the Bigram Language Model, essentially a 'neural net' in structure, is simply a lookup table of logits for the next character given a previous character, and the implementation follows that design.\n\nBigram Diagram:\n\n\u003cimg src=\"https://github.com/ReshiAdavan/Replica/blob/master/imgs/Bigram.PNG\" /\u003e\n\n#### Bag of Words (BoW)\n\nThe Bag of Words (BoW) model is a simple and fundamental language representation technique used in natural language processing (NLP) and text analysis. It's not a language model in the sense of a neural network-based language model but rather a basic method for representing and analyzing text data. 
BoW is used for various NLP tasks, such as document classification, sentiment analysis, and information retrieval.\n\nHere's the general procedure:\n\n- Tokenization\n- Vocabulary Creation\n- Vectorization\n- Sparse Representation\n- Vector Space Model (VSM)\n\nAn example:\n\n\u003cimg src=\"https://github.com/ReshiAdavan/Replica/blob/master/imgs/BoW.PNG\" /\u003e\n\nBoW is straightforward and easy to implement, making it a good starting point for text analysis. However, it has several limitations:\n\n- Loss of Word Order\n- Lack of Semantics\n- High Dimensionality\n- Sparse Data\n\nTo address some of these limitations, more advanced text representation methods like TF-IDF (Term Frequency-Inverse Document Frequency) and word embeddings (e.g., Word2Vec, GloVe) have been developed. These methods aim to capture more of the semantics and context of words in text data.\n\n#### RNNs/CNNs/GRUs/LSTMs\n\nRecurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), Gated Recurrent Units (GRUs), and Long Short-Term Memory (LSTM) networks are all types of neural network architectures used in deep learning for various tasks, including natural language processing, image analysis, and sequential data modeling. Here's an overview of each architecture:\n\nRecurrent Neural Networks (RNNs):\nArchitecture: RNNs are designed for processing sequential data. They have a simple architecture with one or more recurrent layers where each neuron in the layer maintains a hidden state that is updated at each time step. RNNs can be unidirectional (information flows from the past to the future) or bidirectional (information flows in both directions).\nHow they work: RNNs process sequences one element at a time, and the hidden state at each time step depends on the input at that time step and the previous hidden state. This enables them to capture dependencies within sequential data. 
However, standard RNNs have problems with vanishing and exploding gradients.\n\nRNN Diagram:\n\n\u003cimg src=\"https://github.com/ReshiAdavan/Replica/blob/master/imgs/RNN.PNG\" /\u003e\n\nConvolutional Neural Networks (CNNs):\nArchitecture: CNNs are primarily used for image analysis and have a layered architecture consisting of convolutional layers, pooling layers, and fully connected layers. Convolutional layers use filters (kernels) to capture local patterns in the input data.\nHow they work: CNNs apply convolution operations to the input data to extract local features and use pooling layers to reduce spatial dimensions while preserving important information. They are known for their ability to capture spatial hierarchies and invariances in data, making them highly effective in image classification and other visual tasks.\n\nGated Recurrent Units (GRUs):\nArchitecture: GRUs are a type of RNN variant designed to address some of the issues of standard RNNs. They consist of two gates, an update gate and a reset gate, in addition to the hidden state. These gates control the information flow and enable GRUs to capture long-term dependencies more effectively.\nHow they work: The update gate controls how much of the previous hidden state should be combined with the current input, and the reset gate controls how much of the previous hidden state should be forgotten. GRUs can maintain information over longer sequences without suffering from vanishing gradients to the same extent as traditional RNNs.\n\nGRU Diagrams:\n\n\u003cimg src=\"https://github.com/ReshiAdavan/Replica/blob/master/imgs/GRUs_1.PNG\" /\u003e\n\u003cimg src=\"https://github.com/ReshiAdavan/Replica/blob/master/imgs/GRUs_2.PNG\" /\u003e\n\nLong Short-Term Memory (LSTM) networks:\nArchitecture: LSTMs are another RNN variant with a more complex architecture compared to traditional RNNs. 
They have three gates: input gate, forget gate, and output gate, which control the flow of information through the network. LSTMs also have a cell state in addition to the hidden state.\nHow they work: LSTMs are capable of capturing long-term dependencies in sequences and mitigating the vanishing gradient problem by regulating the flow of information. The input gate controls what information to store in the cell state, the forget gate controls what to forget, and the output gate controls what information to output from the cell state.\n\nLSTM Diagrams:\n\n\u003cimg src=\"https://github.com/ReshiAdavan/Replica/blob/master/imgs/LSTM_1.PNG\" /\u003e\n\u003cimg src=\"https://github.com/ReshiAdavan/Replica/blob/master/imgs/LSTM_2.PNG\" /\u003e\n\nSimilarities:\n\n- RNNs, GRUs, and LSTMs are all designed for sequential data and can capture dependencies over time.\n- GRUs and LSTMs address the vanishing gradient problem better than traditional RNNs.\n- CNNs and RNN variants (e.g., CNN-LSTM) are often used together in tasks involving both spatial and temporal data.\n\nThe fundamental structure of Replica progresses from a simple NLP solution to a structure that leverages Deep Neural Networks (DNNs) like the ones mentioned above. We eventually move from this architecture to an advanced CNN, aka our WaveNet.\n\nReplica Implementation:\nI implemented the RNN and GRU cells as well as their respective modules. There is no LSTM implementation, but an LSTM functions and is structured in almost the same way.\n\n#### Transformers\n\nA Transformer is a type of deep learning model architecture that has had a significant impact on natural language processing (NLP) and various other machine learning tasks. It was introduced in the paper \"Attention Is All You Need\" (which I used as a reference) by Vaswani et al. 
in 2017 and has since become a fundamental building block for many state-of-the-art NLP models, such as BERT, GPT, and more.\n\nThe transformer architecture in Replica is nearly equivalent in implementation strategy to OpenAI's GPT models (the family behind ChatGPT), particularly GPT-2. The activation function used is a Gaussian Error Linear Unit (GELU), which is commonly used in deep neural networks. It was introduced as an alternative to traditional activation functions like ReLU (Rectified Linear Unit) and it is known for its smoothness and performance in certain applications.\n\nI also create a vanilla multi-head masked self-attention layer with a projection at the end as part of the structure, along with the transformer block and the language model, aka our GPT-2.\n\nTransformer Diagrams:\n\n\u003cimg src=\"https://github.com/ReshiAdavan/Replica/blob/master/imgs/Transformer_1.PNG\" /\u003e\n\u003cimg src=\"https://github.com/ReshiAdavan/Replica/blob/master/imgs/Transformer_2.PNG\" /\u003e\n\n#### WaveNet\n\nWaveNet is a deep generative model for generating audio waveforms, developed by researchers at DeepMind, a subsidiary of Alphabet Inc. It was introduced in a paper titled \"WaveNet: A Generative Model for Raw Audio\" (which I used as a reference) in 2016. WaveNet is known for its ability to generate highly realistic and natural-sounding speech and music.\n\nWaveNet's architecture is based on deep neural networks, specifically deep convolutional neural networks (CNNs) and autoregressive models.\n\nWe use the hierarchical model to predict the next letter given the previous set of letters.\n\nFor example:\n\n```\n........ --\u003e y\n.......y --\u003e u\n......yu --\u003e h\n.....yuh --\u003e e\n....yuhe --\u003e n\n...yuhen --\u003e g\n..yuheng --\u003e .\n........ --\u003e d\n.......d --\u003e i\n......di --\u003e o\n.....dio --\u003e n\n....dion --\u003e d\n...diond --\u003e r\n..diondr --\u003e e\n.diondre --\u003e .\n........ 
--\u003e x\n.......x --\u003e a\n......xa --\u003e v\n.....xav --\u003e i\n....xavi --\u003e e\n```\n\nIt is like building with Lego blocks: the layers connect so that data can be fed from one to the next, and the hierarchical model produces the output.\n\nWaveNet Diagram:\n\n\u003cimg src=\"https://github.com/ReshiAdavan/Replica/blob/master/imgs/WaveNet.PNG\" /\u003e\n\nIf you made it this far, congrats! That concludes Replica's README.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freshiadavan%2Freplica","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Freshiadavan%2Freplica","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freshiadavan%2Freplica/lists"}