https://github.com/jaysmito101/lexpp
# lexpp
Small Extremely Powerful Header Only C++ Lexical Analyzer/String Parser Library

Lexpp is designed with simplicity and size in mind: the entire library is about 500 lines!

Lexpp is very powerful and can handle almost all string-parsing needs!

See the `examples/` directory for more elaborate usage.

# Docs
### https://github.com/Jaysmito101/lexpp/wiki
### For any help, join our Discord server: https://discord.gg/muq5nDxF9t

# How to Use

Just place the `lexpp.h` file in your project's include directory.

In one C++ file, define `LEXPP_IMPLEMENTATION` before including lexpp, like this:

```cpp
#define LEXPP_IMPLEMENTATION
#include "lexpp.h"
```

That's it! You are now ready to use lexpp.

# Basic Examples

## String parsing

```cpp
std::string data = "some text to parse! ";
std::vector<std::string> tokens = lexpp::lex(data, {"<=", "<<", "\n", "::", ",", "}", "{", ";", " "}, false);

for(std::string& token : tokens){
    std::cout << token << std::endl;
}
```
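For intuition, the separator-based splitting that `lexpp::lex` performs can be approximated in plain C++. This is only a sketch, not lexpp's actual implementation, and the `keep_separators` flag is an assumption about what the final `bool` argument controls:

```cpp
#include <string>
#include <vector>

// Sketch of separator-based splitting in the spirit of lexpp::lex.
// NOT lexpp's real implementation; keep_separators is an assumption
// about what the final bool parameter in the examples controls.
std::vector<std::string> split_on(const std::string& data,
                                  const std::vector<std::string>& separators,
                                  bool keep_separators) {
    std::vector<std::string> tokens;
    std::string current;
    for (size_t i = 0; i < data.size();) {
        bool matched = false;
        for (const std::string& sep : separators) {
            // Does a separator string start at position i?
            if (data.compare(i, sep.size(), sep) == 0) {
                if (!current.empty()) tokens.push_back(current);
                current.clear();
                if (keep_separators) tokens.push_back(sep);
                i += sep.size();
                matched = true;
                break;
            }
        }
        if (!matched) current += data[i++];
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}
```

For example, `split_on("some text to parse!", {" "}, false)` yields the tokens `some`, `text`, `to`, `parse!`.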

## Using a Custom Token Classifier

Some structs we will need:

```cpp
enum MyTokens{
    Keyword = 0,
    Number,
    String,
    Other
};

static std::string TokenToString(int tok){
    switch(tok){
        case Keyword: return "Keyword";
        case Number:  return "Number";
        case String:  return "String";
        case Other:   return "Other";
        default:      return "Unknown";
    }
}
```

Now the lexing:

```cpp
std::vector<std::string> keywords = {"for", "void", "return", "if", "int"};
std::vector<lexpp::Token> tokens = lexpp::lex(data, {"<=", "<<", "\n", "::", ",", "}", "{", "(", ")", ";", " "}, [keywords](std::string& token, bool* discard, bool is_separator) -> int {
    if(std::find(keywords.begin(), keywords.end(), token) != keywords.end()){
        return MyTokens::Keyword;
    }
    if(is_number(token))
        return MyTokens::Number;
    else
        return MyTokens::String;
}, false);

for(lexpp::Token& token : tokens){
    std::cout << TokenToString(token.type) << " -> " << token.value << std::endl;
}
```
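The classifier examples call an `is_number` helper that is not defined in this README; a minimal sketch (digits only, no sign or decimal point) might look like:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper assumed by the classifier examples above:
// true if the token consists entirely of decimal digits.
static bool is_number(const std::string& token) {
    if (token.empty()) return false;
    for (char c : token)
        if (!std::isdigit(static_cast<unsigned char>(c))) return false;
    return true;
}
```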

## Using the `TokenParser` class

We need to extend the `TokenParser` class to make our custom token parser.

```cpp
class MyTokenParser : public lexpp::TokenParser
{
public:
    MyTokenParser(std::string data, std::string separators)
        :TokenParser(data, separators, false){}

    virtual int process_token(std::string& token, bool* discard, bool isSeparator) override
    {
        if(std::find(keywords.begin(), keywords.end(), token) != keywords.end())
            return MyTokens::Keyword;
        else if(is_number(token))
            return MyTokens::Number;
        else if(isSeparator)
            return MyTokens::Other;
        else
            return MyTokens::String;
    }

    std::vector<std::string> keywords = {"for", "void", "return", "if", "int"};
};
```


Now using the class with the lexer:

```cpp
std::vector<lexpp::Token> tokens = lexpp::lex(std::make_shared<MyTokenParser>(data, "\n :,[]{}().\t"));
for(lexpp::Token& token : tokens){
    std::cout << TokenToString(token.type) << " -> " << token.value << std::endl;
}
```


## Making an email parser with lexpp

First, a struct to store our data:

```cpp
struct Email{
    std::string name;
    std::string domainFront;
    std::string domainEnd;
    std::string domain;
};
```


Now we need to make our custom token parser for email parsing:

```cpp
class EmailTokenParser : public lexpp::TokenParser
{
public:
    EmailTokenParser(std::string data, std::string separators = "\n@.")
        :TokenParser(data, separators, true){}

    virtual int process_token(std::string& token, bool* discard, bool isSeparator) override
    {
        if(isSeparator){
            if(ci == 2){
                currMail.domain = currMail.domainFront + "." + currMail.domainEnd;
                emailIds.push_back(currMail);
                ci = 0;
                *discard = true;
                return 0;
            }
            if(token.size() <= 0){
                *discard = true;
                return 0;
            }
            if(token == "\n"){
                ci = 0;
                *discard = true;
                return 0;
            }
            else if(token == "@"){
                ci = 1;
                *discard = true;
                return 0;
            }
            else if(token == "."){
                ci = 2;
                *discard = true;
                return 0;
            }
        }

        if(ci == 0)
            currMail.name = token;
        else if(ci == 1)
            currMail.domainFront = token;
        else if(ci == 2)
            currMail.domainEnd = token;
        return 0;
    }

    int ci = 0;
    Email currMail;
    std::vector<Email> emailIds;
};
```


Now, finally, calling lex:

```cpp
std::shared_ptr<EmailTokenParser> tok_parser = std::make_shared<EmailTokenParser>(data+"\n", "\n@.");
lexpp::lex(tok_parser);
for(Email& email : tok_parser->emailIds){
    std::cout << "Email : \nNAME: " << email.name << "\nDOMAIN : " << email.domain << std::endl;
}
```
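For comparison, the same name/domain split can be sketched standalone, without lexpp, to show what the `ci` state machine above accumulates. This is an illustrative rewrite, not part of lexpp; the `Email` struct mirrors the one defined earlier:

```cpp
#include <string>
#include <vector>

// Mirrors the Email struct defined above (repeated so this sketch
// is self-contained).
struct Email {
    std::string name, domainFront, domainEnd, domain;
};

// Standalone equivalent of EmailTokenParser: split each line into
// name @ domainFront . domainEnd, using the last '.' as the domain dot.
std::vector<Email> parse_emails(const std::string& data) {
    std::vector<Email> emails;
    std::string line;
    auto flush = [&](const std::string& l) {
        size_t at  = l.find('@');
        size_t dot = l.rfind('.');
        if (at == std::string::npos || dot == std::string::npos || dot < at)
            return; // not a name@front.end line; skip it
        Email e;
        e.name        = l.substr(0, at);
        e.domainFront = l.substr(at + 1, dot - at - 1);
        e.domainEnd   = l.substr(dot + 1);
        e.domain      = e.domainFront + "." + e.domainEnd;
        emails.push_back(e);
    };
    for (char c : data) {
        if (c == '\n') { flush(line); line.clear(); }
        else line += c;
    }
    flush(line); // handle a final line with no trailing newline
    return emails;
}
```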

# Support

I am just a high school student, so my code may not be perfect, but I am trying my best to write good code!

Any support would be highly appreciated!

For example, you could add a feature and contribute via pull requests, or you could report any issues with the program!

The best thing you can do to support this project is to spread the word, so that more people who might be interested can use it!

Please consider tweeting about this!
