
```python
from IPython.display import Image
```

```python
Image(filename = 'KNET.jpeg')
```


![jpeg](README_files/README_1_0.jpg)

# KNET

#### KNET is an educational Retrieval-Augmented Generation (RAG) system built from scratch, designed to empower STEM learners by combining transformers and intelligent retrieval to simulate human reasoning and deliver precise, context-aware answers to foster curiosity and understanding!

#### Author: [Kevin Thomas](mailto:[email protected])

## Imports

```python
import math
import torch
import torch.nn as nn
```

## 1. Define the Positional Encoding Module

### Transformers rely on positional encoding to retain the order of the sequence data.

```python
class PositionalEncoding(nn.Module):
    """
    Implements positional encoding for transformer models.

    This module injects information about the relative or absolute position
    of tokens in the sequence. The positional encodings have the same dimension
    as the embeddings so that the two can be summed.

    Attributes:
        dropout (nn.Dropout): dropout layer.
        pe (torch.Tensor): positional encoding matrix.
    """

    def __init__(self, d_model, dropout=0.1, max_len=5000):
        """
        Initializes the PositionalEncoding module.

        Args:
            d_model (int): the dimensionality of embeddings.
            dropout (float): dropout rate. default is 0.1.
            max_len (int): maximum length of input sequences. default is 5000.
        """
        super(PositionalEncoding, self).__init__()
        self.dropout = nn.Dropout(p=dropout)

        # create positional encoding matrix
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len).unsqueeze(1).float()
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)  # even indices
        pe[:, 1::2] = torch.cos(position * div_term)  # odd indices
        self.pe = pe.unsqueeze(1)  # shape: (max_len, 1, d_model)

    def forward(self, x):
        """
        Adds positional encoding to the input tensor.

        Args:
            x (torch.Tensor): input tensor of shape (seq_len, batch_size, d_model).

        Returns:
            torch.Tensor: tensor with positional encoding added.
        """
        x = x + self.pe[: x.size(0)]
        return self.dropout(x)
```
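A quick shape check (not part of the original notebook) confirms that the module leaves the tensor shape untouched; the `(seq_len, batch_size, d_model)` layout below matches what `nn.Transformer` expects with its default `batch_first=False`. The `dummy` tensor and its sizes are purely illustrative.

```python
# illustrative sanity check: positional encoding preserves the input shape
pos_enc = PositionalEncoding(d_model=128, dropout=0.1)
dummy = torch.zeros(10, 2, 128)  # (seq_len, batch_size, d_model)
print(pos_enc(dummy).shape)      # torch.Size([10, 2, 128])
```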

## 2. Define the Transformer Model

### This model includes an embedding layer, positional encoding, and the transformer itself.

```python
class TransformerModel(nn.Module):
    """
    Defines a transformer-based sequence-to-sequence model.

    This model uses an encoder-decoder architecture with positional encoding
    and transformer layers to process sequences.

    Attributes:
        embedding (nn.Embedding): embedding layer for tokens.
        pos_encoder (PositionalEncoding): positional encoding module.
        transformer (nn.Transformer): transformer module.
        fc_out (nn.Linear): linear layer for output generation.
        d_model (int): dimensionality of the model's embeddings.
    """

    def __init__(self, vocab_size, d_model=128, nhead=4, num_layers=2, dropout=0.1):
        """
        Initializes the TransformerModel.

        Args:
            vocab_size (int): size of the vocabulary.
            d_model (int): dimensionality of embeddings. default is 128.
            nhead (int): number of attention heads. default is 4.
            num_layers (int): number of encoder and decoder layers. default is 2.
            dropout (float): dropout rate. default is 0.1.
        """
        super(TransformerModel, self).__init__()
        self.embedding = nn.Embedding(vocab_size, d_model)
        self.pos_encoder = PositionalEncoding(d_model, dropout)
        self.transformer = nn.Transformer(
            d_model=d_model,
            nhead=nhead,
            num_encoder_layers=num_layers,
            num_decoder_layers=num_layers,
            dropout=dropout,
        )
        self.fc_out = nn.Linear(d_model, vocab_size)
        self.d_model = d_model

    def forward(self, src, tgt, src_padding_mask=None, tgt_padding_mask=None, tgt_mask=None):
        """
        Performs a forward pass through the transformer model.

        Args:
            src (torch.Tensor): source tensor of shape (seq_len_src, batch_size).
            tgt (torch.Tensor): target tensor of shape (seq_len_tgt, batch_size).
            src_padding_mask (torch.Tensor): source padding mask.
            tgt_padding_mask (torch.Tensor): target padding mask.
            tgt_mask (torch.Tensor): target mask to prevent attention to future tokens.

        Returns:
            torch.Tensor: output tensor of shape (seq_len_tgt, batch_size, vocab_size).
        """
        src = self.embedding(src) * math.sqrt(self.d_model)
        src = self.pos_encoder(src)
        tgt = self.embedding(tgt) * math.sqrt(self.d_model)
        tgt = self.pos_encoder(tgt)

        output = self.transformer(
            src,
            tgt,
            tgt_mask=tgt_mask,
            src_key_padding_mask=src_padding_mask,
            tgt_key_padding_mask=tgt_padding_mask,
        )
        output = self.fc_out(output)
        return output
```
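As a rough sketch (not in the original notebook), the model can be exercised with random token indices to verify the output shape; `toy_model`, the vocabulary size of 100, and the sequence lengths are arbitrary choices for illustration.

```python
# illustrative shape check with random token indices and a toy vocabulary
toy_model = TransformerModel(vocab_size=100, d_model=128, nhead=4, num_layers=2)
src = torch.randint(0, 100, (12, 2))  # (seq_len_src, batch_size)
tgt = torch.randint(0, 100, (7, 2))   # (seq_len_tgt, batch_size)
print(toy_model(src, tgt).shape)      # torch.Size([7, 2, 100])
```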

## 3. Prepare the Corpus and Vocabulary

### We create a sample corpus and build a vocabulary from it.

```python
def tokenize(text):
    """
    Tokenizes a given text into individual words.

    This function converts the input text to lowercase
    and splits it into words based on spaces.

    Args:
        text (str): the input string to be tokenized.

    Returns:
        list: a list of words (tokens) from the input text.
    """
    return text.lower().split()

# define the corpus with query and document labels
corpus_with_labels = [
    ("What is reverse engineering?", "Reverse engineering is the process of analyzing a system to identify its components and their interrelationships."),
    ("What is disassembly?", "Disassembly is the process of converting machine code into human-readable assembly code."),
    ("What is static analysis?", "Static analysis is the examination of software without executing it to understand its structure and behavior."),
    ("What is dynamic analysis?", "Dynamic analysis is the process of analyzing software while it is running to study its behavior."),
    ("What is a debugger?", "A debugger is a tool used to test and debug programs by allowing inspection of memory, registers, and instructions during execution."),
    ("What is IDA Pro?", "IDA Pro is a popular reverse engineering tool that provides a disassembler and decompiler for analyzing binary files."),
    ("What is Ghidra?", "Ghidra is an open-source reverse engineering tool developed by the NSA, featuring a powerful decompiler."),
    ("What is a control flow graph?", "A control flow graph is a representation of all possible paths that can be taken through a program during its execution."),
    ("What is malware analysis?", "Malware analysis is the process of studying malicious software to understand its behavior and mitigate threats."),
    ("What is obfuscation?", "Obfuscation is a technique used to make code harder to understand, often to protect intellectual property or evade detection."),
]

def build_vocab(corpus_with_labels):
    """
    Builds a vocabulary from the corpus by tokenizing all queries and documents.

    Args:
        corpus_with_labels (list of tuples): list of query-document pairs.

    Returns:
        dict: word-to-index mapping.
        dict: index-to-word mapping.
    """
    vocab = set()
    for query, doc in corpus_with_labels:
        vocab.update(tokenize(query))
        vocab.update(tokenize(doc))

    vocab = sorted(vocab)
    word2idx = {word: idx + 1 for idx, word in enumerate(vocab)}
    word2idx['<unk>'] = 0  # unknown token
    word2idx['<pad>'] = len(word2idx)  # padding token
    word2idx['<sos>'] = len(word2idx)  # start of sequence
    word2idx['<eos>'] = len(word2idx)  # end of sequence
    idx2word = {idx: word for word, idx in word2idx.items()}
    return word2idx, idx2word

# build vocabulary
word2idx, idx2word = build_vocab(corpus_with_labels)
vocab_size = len(word2idx)
```
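A quick inspection (not part of the original notebook) shows the resulting vocabulary size and confirms that each special token received its own index.

```python
# illustrative check of the vocabulary and special-token indices
print(vocab_size)  # number of unique corpus tokens plus the four special tokens
print(word2idx['<unk>'], word2idx['<pad>'], word2idx['<sos>'], word2idx['<eos>'])
print(idx2word[word2idx['reverse']])  # 'reverse'
```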

## 4. Vectorize the Documents

### Convert each document into a vector using the vocabulary.

```python
def vectorize(text, word2idx, vocab_size):
    """
    Converts a text document into a vector representation.

    This function tokenizes the input text and creates a vector
    where each index corresponds to a word in the vocabulary,
    and the value at that index is the frequency of the word in the text.

    Args:
        text (str): the input document to be vectorized.
        word2idx (dict): mapping of words to indices in the vocabulary.
        vocab_size (int): size of the vocabulary.

    Returns:
        torch.Tensor: a tensor representing the frequency of words in the document.
    """
    vec = torch.zeros(vocab_size)
    tokens = tokenize(text)
    for token in tokens:
        idx = word2idx.get(token, word2idx['<unk>'])
        vec[idx] += 1
    return vec

# create document vectors for the retriever
doc_vectors = [vectorize(doc, word2idx, vocab_size) for _, doc in corpus_with_labels]
```
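For illustration (not in the original notebook), vectorizing a query shows the bag-of-words counts it produces; `example_vec` is a hypothetical name used only here.

```python
# illustrative example: inspect which vocabulary entries a query activates
example_vec = vectorize("What is reverse engineering?", word2idx, vocab_size)
nonzero = example_vec.nonzero().squeeze(1).tolist()
print({idx2word[i]: int(example_vec[i]) for i in nonzero})
# e.g. {'engineering?': 1, 'is': 1, 'reverse': 1, 'what': 1}
```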

## 5. Implement the Retriever

### Retrieve the top k documents most similar to the query.

```python
def retrieve(query, doc_vectors, corpus, word2idx, vocab_size, top_k=2):
    """
    Retrieves the top-k most similar documents to the query based on cosine similarity.

    Args:
        query (str): the input query text.
        doc_vectors (list of torch.Tensor): a list of document vectors representing the corpus.
        corpus (list of tuples): the corpus of query-document pairs.
        word2idx (dict): mapping of words to indices in the vocabulary.
        vocab_size (int): size of the vocabulary.
        top_k (int): the number of most similar documents to retrieve. default is 2.

    Returns:
        list of str: the top-k most similar documents from the corpus.
    """
    query_vec = vectorize(query, word2idx, vocab_size)
    similarities = []
    for idx, doc_vec in enumerate(doc_vectors):
        sim = torch.dot(query_vec, doc_vec) / (
            torch.norm(query_vec) * torch.norm(doc_vec) + 1e-8
        )
        similarities.append((sim.item(), idx))
    similarities.sort(reverse=True)
    top_docs = [corpus[idx][1] for (_, idx) in similarities[:top_k]]
    return top_docs
```
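A minimal usage sketch (not part of the original notebook): retrieving for a different query from the same toy corpus.

```python
# illustrative retrieval call against the toy corpus
for doc in retrieve("What is a debugger?", doc_vectors, corpus_with_labels, word2idx, vocab_size, top_k=2):
    print(doc)
```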

## 6. Simulate Chain of Thought

### Create a function to simulate the reasoning process.

```python
def generate_answer_with_chain_of_thought(query, retrieved_docs):
    """
    Simulates a reasoning process to generate an answer using a chain of thought.

    This function prints the logical steps involved in reasoning through
    the query and the retrieved documents, then formulates and outputs a simulated answer.

    Args:
        query (str): the input query text.
        retrieved_docs (list of str): a list of documents retrieved for the query.

    Returns:
        None
    """
    print('Chain of Thought:')
    reasoning_steps = [
        f"1. the query is: '{query}'",
        '2. retrieved the following documents:',
    ]
    for idx, doc in enumerate(retrieved_docs):
        reasoning_steps.append(f'   {idx + 1}. {doc}')
    reasoning_steps.append('3. analyzing the retrieved documents to answer the query.')
    reasoning_steps.append('4. formulating the answer based on the information.')
    for step in reasoning_steps:
        print(step)
```
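For illustration (not in the original notebook), the function can also be called directly with a query and a hand-picked document from the corpus.

```python
# illustrative call: print the simulated reasoning steps for one retrieved document
generate_answer_with_chain_of_thought(
    "What is a debugger?",
    ["A debugger is a tool used to test and debug programs by allowing inspection of memory, registers, and instructions during execution."],
)
```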

## 7. Initialize the Transformer Model

### Instantiate the transformer model with the vocabulary size.

```python
# initialize the transformer model with updated parameters
transformer_model = TransformerModel(vocab_size=vocab_size, d_model=128, nhead=4, num_layers=2, dropout=0.1)
```

/opt/anaconda3/envs/prod/lib/python3.12/site-packages/torch/nn/modules/transformer.py:379: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
warnings.warn(

## 8. Create the RAG System

### Combine the retriever and the generator (transformer model).

```python
def rag_system(query, transformer_model, retriever, doc_vectors, corpus, word2idx, idx2word, vocab_size):
    """
    Implements a basic Retrieval-Augmented Generation (RAG) system.

    This function retrieves relevant documents for a query using the retriever,
    prepares the input for the transformer model by combining the query and retrieved documents,
    and generates an output using the transformer model.

    Args:
        query (str): the input query text.
        transformer_model (nn.Module): the transformer model for generating output.
        retriever (function): the document retrieval function.
        doc_vectors (list of torch.Tensor): a list of document vectors representing the corpus.
        corpus (list of tuples): the corpus of query-document pairs.
        word2idx (dict): mapping of words to indices in the vocabulary.
        idx2word (dict): mapping of indices to words in the vocabulary.
        vocab_size (int): size of the vocabulary.

    Returns:
        None
    """
    # retrieve relevant documents
    retrieved_docs = retriever(query, doc_vectors, corpus, word2idx, vocab_size)

    # prepare input by concatenating the query and retrieved documents
    input_text = query + ' ' + ' '.join(retrieved_docs)
    print('Input to the transformer model:')
    print(input_text)

    # tokenize and convert input to indices
    tokens = tokenize(input_text)
    input_indices = [word2idx.get(token, word2idx['<unk>']) for token in tokens]
    input_tensor = torch.tensor(input_indices).unsqueeze(1)  # add batch dimension

    # initialize decoder input with <sos>
    tgt_input = torch.tensor([word2idx['<sos>']]).unsqueeze(1)

    # generate output tokens iteratively
    transformer_model.eval()
    with torch.no_grad():
        for _ in range(50):  # maximum answer length
            tgt_mask = nn.Transformer.generate_square_subsequent_mask(tgt_input.size(0)).to(input_tensor.device)
            src_padding_mask = (input_tensor.squeeze(1) == word2idx['<pad>']).unsqueeze(0)
            tgt_padding_mask = (tgt_input.squeeze(1) == word2idx['<pad>']).unsqueeze(0)

            output = transformer_model(
                input_tensor,
                tgt_input,
                src_padding_mask=src_padding_mask,
                tgt_padding_mask=tgt_padding_mask,
                tgt_mask=tgt_mask,
            )
            next_token = output.argmax(dim=-1)[-1, :].item()
            tgt_input = torch.cat([tgt_input, torch.tensor([[next_token]])], dim=0)
            if next_token == word2idx['<eos>']:
                break

    # decode the output indices to words
    output_indices = tgt_input.squeeze().tolist()[1:]  # exclude the first <sos> token
    answer = ' '.join([idx2word.get(idx, '<unk>') for idx in output_indices])

    # simulate reasoning using the chain of thought process
    generate_answer_with_chain_of_thought(query, retrieved_docs)

    print('\nAnswer:')
    print(answer)
```

## 9. Test the RAG System

### Run the system with a sample query.

```python
# test the RAG system with an example query
query = "What is reverse engineering?"
rag_system(query, transformer_model, retrieve, doc_vectors, corpus_with_labels, word2idx, idx2word, vocab_size)
```

Input to the transformer model:
What is reverse engineering? Ghidra is an open-source reverse engineering tool developed by the NSA, featuring a powerful decompiler. Reverse engineering is the process of analyzing a system to identify its components and their interrelationships.
Chain of Thought:
1. the query is: 'What is reverse engineering?'
2. retrieved the following documents:
1. Ghidra is an open-source reverse engineering tool developed by the NSA, featuring a powerful decompiler.
2. Reverse engineering is the process of analyzing a system to identify its components and their interrelationships.
3. analyzing the retrieved documents to answer the query.
4. formulating the answer based on the information.

Answer:
provides tool behavior execution. study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study study

/opt/anaconda3/envs/prod/lib/python3.12/site-packages/torch/nn/functional.py:5849: UserWarning: Support for mismatched key_padding_mask and attn_mask is deprecated. Use same type for both instead.
warnings.warn(

## 10. Train and Test the RAG System

### Train the model, then run the system again with the same sample query.

```python
import torch.optim as optim
```

```python
from sklearn.model_selection import train_test_split
```

```python
# split the data into training and validation sets
train_pairs, val_pairs = train_test_split(corpus_with_labels, test_size=0.2, random_state=42)

def prepare_data(data, word2idx):
    """
    Prepares tokenized input and output tensors for training or validation.

    This function converts query-document pairs into indexed tensors
    using the vocabulary mapping.

    Args:
        data (list of tuples): list of query-document pairs.
        word2idx (dict): mapping of words to indices in the vocabulary.

    Returns:
        list of tuples: tokenized input-output tensor pairs.
    """
    processed_data = []
    for query, doc in data:
        input_indices = [word2idx.get(token, word2idx['<unk>']) for token in tokenize(query)]
        # add <sos> and <eos> tokens to the target
        output_indices = (
            [word2idx['<sos>']]
            + [word2idx.get(token, word2idx['<unk>']) for token in tokenize(doc)]
            + [word2idx['<eos>']]
        )
        input_tensor = torch.tensor(input_indices)
        output_tensor = torch.tensor(output_indices)
        processed_data.append((input_tensor, output_tensor))
    return processed_data

# prepare the training and validation data
train_data = prepare_data(train_pairs, word2idx)
val_data = prepare_data(val_pairs, word2idx)

# implement batch training
from torch.utils.data import DataLoader

def collate_fn(batch):
    """
    Collate function to prepare batches with padding.

    Args:
        batch (list of tuples): list of (src, tgt) tuples.

    Returns:
        src_batch (torch.Tensor): padded source sequences.
        tgt_batch (torch.Tensor): padded target sequences.
    """
    src_batch, tgt_batch = zip(*batch)
    src_padded = nn.utils.rnn.pad_sequence(src_batch, padding_value=word2idx['<pad>'])
    tgt_padded = nn.utils.rnn.pad_sequence(tgt_batch, padding_value=word2idx['<pad>'])
    return src_padded, tgt_padded

# set batch size
batch_size = 2

# create DataLoaders
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True, collate_fn=collate_fn)
val_loader = DataLoader(val_data, batch_size=batch_size, shuffle=False, collate_fn=collate_fn)

# define loss and optimizer with reduced learning rate
criterion = nn.CrossEntropyLoss(ignore_index=word2idx['<pad>'])
optimizer = optim.Adam(transformer_model.parameters(), lr=0.001)

# increase dropout rate in the model (assuming you can modify model initialization)
transformer_model.dropout = 0.3

# adjust learning rate scheduler and early stopping patience
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=3, verbose=True)

def train_model(model, train_loader, val_loader, vocab_size, epochs=1000):
    """
    Trains the transformer model on the training set and validates it.

    This function performs multiple epochs of training and calculates
    validation loss after each epoch.

    Args:
        model (nn.Module): the transformer model to be trained.
        train_loader (DataLoader): DataLoader for training data.
        val_loader (DataLoader): DataLoader for validation data.
        vocab_size (int): size of the vocabulary.
        epochs (int): number of epochs for training.

    Returns:
        None
    """
    best_val_loss = float('inf')
    patience = 100  # increase patience to allow more epochs before early stopping
    epochs_no_improve = 0
    for epoch in range(epochs):
        model.train()
        total_loss = 0
        for src_batch, tgt_batch in train_loader:
            optimizer.zero_grad()

            tgt_input = tgt_batch[:-1, :]  # exclude last token for input
            tgt_output = tgt_batch[1:, :]  # exclude first token for output

            # generate masks
            tgt_mask = nn.Transformer.generate_square_subsequent_mask(tgt_input.size(0)).to(src_batch.device)
            src_padding_mask = (src_batch == word2idx['<pad>']).transpose(0, 1)
            tgt_padding_mask = (tgt_input == word2idx['<pad>']).transpose(0, 1)

            output = model(
                src_batch,
                tgt_input,
                src_padding_mask=src_padding_mask,
                tgt_padding_mask=tgt_padding_mask,
                tgt_mask=tgt_mask,
            )
            loss = criterion(output.view(-1, vocab_size), tgt_output.reshape(-1))
            loss.backward()
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping
            optimizer.step()
            total_loss += loss.item()
        avg_loss = total_loss / len(train_loader)

        # validation step
        model.eval()
        val_loss = 0
        with torch.no_grad():
            for src_batch, tgt_batch in val_loader:
                tgt_input = tgt_batch[:-1, :]
                tgt_output = tgt_batch[1:, :]

                tgt_mask = nn.Transformer.generate_square_subsequent_mask(tgt_input.size(0)).to(src_batch.device)
                src_padding_mask = (src_batch == word2idx['<pad>']).transpose(0, 1)
                tgt_padding_mask = (tgt_input == word2idx['<pad>']).transpose(0, 1)

                output = model(
                    src_batch,
                    tgt_input,
                    src_padding_mask=src_padding_mask,
                    tgt_padding_mask=tgt_padding_mask,
                    tgt_mask=tgt_mask,
                )
                loss = criterion(output.view(-1, vocab_size), tgt_output.reshape(-1))
                val_loss += loss.item()
        avg_val_loss = val_loss / len(val_loader)
        print(f'Epoch {epoch + 1}/{epochs}, Training Loss: {avg_loss:.4f}, Validation Loss: {avg_val_loss:.4f}')

        # adjust learning rate based on validation loss
        scheduler.step(avg_val_loss)

        # check for early stopping
        if avg_val_loss < best_val_loss:
            best_val_loss = avg_val_loss
            epochs_no_improve = 0
            torch.save(model.state_dict(), 'best_model.pt')  # save the best model
        else:
            epochs_no_improve += 1
            if epochs_no_improve >= patience:
                print(f'Early stopping after {epoch + 1} epochs')
                model.load_state_dict(torch.load('best_model.pt', weights_only=True))  # load the best model
                break

# train the model
train_model(transformer_model, train_loader, val_loader, vocab_size, epochs=1000)

# test the RAG system again
print("\nRe-testing the RAG system with trained model:")
query = "What is reverse engineering?"
rag_system(query, transformer_model, retrieve, doc_vectors, corpus_with_labels, word2idx, idx2word, vocab_size)
```

Epoch 1/1000, Training Loss: 1.1782, Validation Loss: 4.0728
Epoch 2/1000, Training Loss: 0.9192, Validation Loss: 3.9542
Epoch 3/1000, Training Loss: 0.8057, Validation Loss: 4.0870
Epoch 4/1000, Training Loss: 0.6739, Validation Loss: 4.1276
Epoch 5/1000, Training Loss: 0.5242, Validation Loss: 4.1934
Epoch 6/1000, Training Loss: 0.4641, Validation Loss: 4.2709
Epoch 7/1000, Training Loss: 0.4112, Validation Loss: 4.4382
Epoch 8/1000, Training Loss: 0.3263, Validation Loss: 4.3522
Epoch 9/1000, Training Loss: 0.2877, Validation Loss: 4.2833
Epoch 10/1000, Training Loss: 0.2524, Validation Loss: 4.3287
Epoch 11/1000, Training Loss: 0.2341, Validation Loss: 4.4280
Epoch 12/1000, Training Loss: 0.2051, Validation Loss: 4.5004
Epoch 13/1000, Training Loss: 0.1790, Validation Loss: 4.5016
Epoch 14/1000, Training Loss: 0.1630, Validation Loss: 4.4548
Epoch 15/1000, Training Loss: 0.1694, Validation Loss: 4.4290
Epoch 16/1000, Training Loss: 0.1529, Validation Loss: 4.4136
Epoch 17/1000, Training Loss: 0.1539, Validation Loss: 4.4288
Epoch 18/1000, Training Loss: 0.1380, Validation Loss: 4.4644
Epoch 19/1000, Training Loss: 0.1416, Validation Loss: 4.4763
Epoch 20/1000, Training Loss: 0.1313, Validation Loss: 4.4871
Epoch 21/1000, Training Loss: 0.1449, Validation Loss: 4.5029
Epoch 22/1000, Training Loss: 0.1263, Validation Loss: 4.5159
Epoch 23/1000, Training Loss: 0.1318, Validation Loss: 4.5258
Epoch 24/1000, Training Loss: 0.1360, Validation Loss: 4.5295
Epoch 25/1000, Training Loss: 0.1249, Validation Loss: 4.5281
Epoch 26/1000, Training Loss: 0.1133, Validation Loss: 4.5269
Epoch 27/1000, Training Loss: 0.1089, Validation Loss: 4.5249
Epoch 28/1000, Training Loss: 0.1204, Validation Loss: 4.5245
Epoch 29/1000, Training Loss: 0.1097, Validation Loss: 4.5249
Epoch 30/1000, Training Loss: 0.1081, Validation Loss: 4.5284
Epoch 31/1000, Training Loss: 0.1164, Validation Loss: 4.5314
Epoch 32/1000, Training Loss: 0.1179, Validation Loss: 4.5347
Epoch 33/1000, Training Loss: 0.1102, Validation Loss: 4.5373
Epoch 34/1000, Training Loss: 0.1226, Validation Loss: 4.5389
Epoch 35/1000, Training Loss: 0.1185, Validation Loss: 4.5399
Epoch 36/1000, Training Loss: 0.1061, Validation Loss: 4.5410
Epoch 37/1000, Training Loss: 0.1043, Validation Loss: 4.5426
Epoch 38/1000, Training Loss: 0.1174, Validation Loss: 4.5438
Epoch 39/1000, Training Loss: 0.1062, Validation Loss: 4.5447
Epoch 40/1000, Training Loss: 0.1018, Validation Loss: 4.5454
Epoch 41/1000, Training Loss: 0.1060, Validation Loss: 4.5461
Epoch 42/1000, Training Loss: 0.1026, Validation Loss: 4.5464
Epoch 43/1000, Training Loss: 0.1085, Validation Loss: 4.5464
Epoch 44/1000, Training Loss: 0.1022, Validation Loss: 4.5464
Epoch 45/1000, Training Loss: 0.1081, Validation Loss: 4.5464
Epoch 46/1000, Training Loss: 0.1130, Validation Loss: 4.5464
Epoch 47/1000, Training Loss: 0.1049, Validation Loss: 4.5465
Epoch 48/1000, Training Loss: 0.1106, Validation Loss: 4.5465
Epoch 49/1000, Training Loss: 0.1089, Validation Loss: 4.5465
Epoch 50/1000, Training Loss: 0.1093, Validation Loss: 4.5466
Epoch 51/1000, Training Loss: 0.1023, Validation Loss: 4.5466
Epoch 52/1000, Training Loss: 0.1131, Validation Loss: 4.5465
Epoch 53/1000, Training Loss: 0.1230, Validation Loss: 4.5465
Epoch 54/1000, Training Loss: 0.1135, Validation Loss: 4.5464
Epoch 55/1000, Training Loss: 0.1072, Validation Loss: 4.5464
Epoch 56/1000, Training Loss: 0.1164, Validation Loss: 4.5464
Epoch 57/1000, Training Loss: 0.1175, Validation Loss: 4.5465
Epoch 58/1000, Training Loss: 0.1066, Validation Loss: 4.5466
Epoch 59/1000, Training Loss: 0.1086, Validation Loss: 4.5466
Epoch 60/1000, Training Loss: 0.1089, Validation Loss: 4.5466
Epoch 61/1000, Training Loss: 0.1026, Validation Loss: 4.5466
Epoch 62/1000, Training Loss: 0.1066, Validation Loss: 4.5467
Epoch 63/1000, Training Loss: 0.1072, Validation Loss: 4.5467
Epoch 64/1000, Training Loss: 0.1030, Validation Loss: 4.5467
Epoch 65/1000, Training Loss: 0.1197, Validation Loss: 4.5467
Epoch 66/1000, Training Loss: 0.0998, Validation Loss: 4.5467
Epoch 67/1000, Training Loss: 0.1141, Validation Loss: 4.5467
Epoch 68/1000, Training Loss: 0.1155, Validation Loss: 4.5467
Epoch 69/1000, Training Loss: 0.1175, Validation Loss: 4.5467
Epoch 70/1000, Training Loss: 0.1138, Validation Loss: 4.5467
Epoch 71/1000, Training Loss: 0.1127, Validation Loss: 4.5467
Epoch 72/1000, Training Loss: 0.1137, Validation Loss: 4.5467
Epoch 73/1000, Training Loss: 0.1152, Validation Loss: 4.5467
Epoch 74/1000, Training Loss: 0.1058, Validation Loss: 4.5467
Epoch 75/1000, Training Loss: 0.1071, Validation Loss: 4.5467
Epoch 76/1000, Training Loss: 0.1063, Validation Loss: 4.5467
Epoch 77/1000, Training Loss: 0.1093, Validation Loss: 4.5467
Epoch 78/1000, Training Loss: 0.1082, Validation Loss: 4.5467
Epoch 79/1000, Training Loss: 0.1061, Validation Loss: 4.5467
Epoch 80/1000, Training Loss: 0.1159, Validation Loss: 4.5467
Epoch 81/1000, Training Loss: 0.1031, Validation Loss: 4.5467
Epoch 82/1000, Training Loss: 0.1186, Validation Loss: 4.5467
Epoch 83/1000, Training Loss: 0.1028, Validation Loss: 4.5467
Epoch 84/1000, Training Loss: 0.1189, Validation Loss: 4.5467
Epoch 85/1000, Training Loss: 0.1181, Validation Loss: 4.5467
Epoch 86/1000, Training Loss: 0.1109, Validation Loss: 4.5467
Epoch 87/1000, Training Loss: 0.1093, Validation Loss: 4.5467
Epoch 88/1000, Training Loss: 0.1102, Validation Loss: 4.5467
Epoch 89/1000, Training Loss: 0.1140, Validation Loss: 4.5467
Epoch 90/1000, Training Loss: 0.1169, Validation Loss: 4.5467
Epoch 91/1000, Training Loss: 0.1058, Validation Loss: 4.5467
Epoch 92/1000, Training Loss: 0.1096, Validation Loss: 4.5467
Epoch 93/1000, Training Loss: 0.1181, Validation Loss: 4.5467
Epoch 94/1000, Training Loss: 0.1099, Validation Loss: 4.5467
Epoch 95/1000, Training Loss: 0.1133, Validation Loss: 4.5467
Epoch 96/1000, Training Loss: 0.1054, Validation Loss: 4.5467
Epoch 97/1000, Training Loss: 0.1197, Validation Loss: 4.5468
Epoch 98/1000, Training Loss: 0.1232, Validation Loss: 4.5468
Epoch 99/1000, Training Loss: 0.1164, Validation Loss: 4.5468
Epoch 100/1000, Training Loss: 0.1107, Validation Loss: 4.5468
Epoch 101/1000, Training Loss: 0.1075, Validation Loss: 4.5468
Epoch 102/1000, Training Loss: 0.1197, Validation Loss: 4.5468
Early stopping after 102 epochs

Re-testing the RAG system with trained model:
Input to the transformer model:
What is reverse engineering? Ghidra is an open-source reverse engineering tool developed by the NSA, featuring a powerful decompiler. Reverse engineering is the process of analyzing a system to identify its components and their interrelationships.
Chain of Thought:
1. the query is: 'What is reverse engineering?'
2. retrieved the following documents:
1. Ghidra is an open-source reverse engineering tool developed by the NSA, featuring a powerful decompiler.
2. Reverse engineering is the process of analyzing a system to identify its components and their interrelationships.
3. analyzing the retrieved documents to answer the query.
4. formulating the answer based on the information.

Answer:
reverse engineering is the process of analyzing a system to identify its components and their interrelationships.