Projects in Awesome Lists by naidezhujimo

https://github.com/naidezhujimo/sparse-moe-language-model-v1

This repository contains an implementation of a Sparse Mixture of Experts (MoE) Language Model using PyTorch. The model is designed to handle large-scale text generation tasks efficiently by leveraging multiple expert networks and a routing mechanism to dynamically select the most relevant experts for each input.

moe nlp pytorch transformer

Last synced: 05 Apr 2025
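
The routing mechanism described for sparse-moe-language-model-v1 can be illustrated with a minimal sketch. This is not the repository's code: the function name `top_k_route`, the toy scalar experts, and the choice of k=2 are assumptions made purely for illustration.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(router_logits, experts, x, k=2):
    """Send input x to the k highest-scoring experts and mix their outputs."""
    # Indices of the k largest router logits (ties keep their original order).
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Renormalise the gate weights over the selected experts only.
    gates = softmax([router_logits[i] for i in top])
    # Only the selected experts are evaluated; the others are skipped entirely.
    output = sum(g * experts[i](x) for g, i in zip(gates, top))
    return output, top

# Hypothetical toy experts: each one is just a scalar function.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
y, chosen = top_k_route([0.1, 2.0, -1.0, 2.0], experts, 3.0)
```

Here experts 1 and 3 share the highest logit, so each receives a gate weight of 0.5 and the other two experts are never called.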

https://github.com/naidezhujimo/three-methods-of-quantification

Last synced: 15 May 2025

https://github.com/naidezhujimo/cot-multi-path-generation-and-self-consistency

Last synced: 15 May 2025

https://github.com/naidezhujimo/coat-mcts-

cot llm mcts testtime

Last synced: 15 May 2025

https://github.com/naidezhujimo/tree-shaped-chain-of-thought

cot llm testtime

Last synced: 15 May 2025

https://github.com/naidezhujimo/glowflow-p1-framework-for-training-and-evaluating-language-models-incomplete-

A personal project that is still incomplete and has many known problems.

fine-tuning gpt2 ppo pytorch

Last synced: 28 Mar 2025

https://github.com/naidezhujimo/yinghub-v2-a-sparse-moe-language-model

YingHub-v2 is an advanced language model built upon the Sparse Mixture of Experts (MoE) architecture. It leverages dynamic routing mechanisms and expert load balancing, incorporating state-of-the-art training and optimization strategies.

llm moe nlp pytorch

Last synced: 28 Mar 2025

https://github.com/naidezhujimo/-dqn-pong-dqn-based-pong-game

PyTorch, Gymnasium, DQN, Double Q-Learning, Dueling Network, Multi-step learning, Noisy Networks

deep-learning gymnasium pytorch rl-learning

Last synced: 04 Mar 2025

https://github.com/naidezhujimo/parallel-sampling-with-sequential-revision-and-joint-optimization

cot llm testtime

Last synced: 15 May 2025

https://github.com/naidezhujimo/mindset-punishment-tip-time-preference-optimization-tpo

cot llm testtime

Last synced: 15 May 2025

https://github.com/naidezhujimo/exploring-the-limit-of-outcome-reward-for-learning-mathematical-reasoning

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

llm rl testtime

Last synced: 15 May 2025

https://github.com/naidezhujimo/proximal-policy-optimization-ppo-for-bipedalwalker-v3

This repository contains an implementation of the Proximal Policy Optimization (PPO) algorithm to solve the BipedalWalker-v3 environment from the Gymnasium library. This project uses a combination of policy and value networks to learn a policy for controlling a bipedal walker.

deep-learning gae gymnasium ppo-pytorch pytorch rl

Last synced: 11 Mar 2025
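
As a rough illustration of how the value network's estimates feed the policy update in PPO (the `gae` tag suggests generalized advantage estimation is involved), here is a minimal sketch. The function names, default hyperparameters, and the single-episode assumption (no terminal flags inside the segment) are illustrative choices, not taken from the repository.

```python
def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one trajectory segment.

    Assumes no episode boundary inside the segment; `last_value` is the
    value estimate for the state that follows the final step.
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error from the value network's predictions.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages

def clipped_surrogate(ratio, advantage, eps=0.2):
    """PPO clipped objective for one sample (to be maximised)."""
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)
```

The clipping removes any incentive to push the probability ratio outside `[1 - eps, 1 + eps]`, which keeps each policy update close to the policy that collected the data.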

https://github.com/naidezhujimo/deep-q-network-dqn-for-atari-games

This repository implements a Deep Q-Network (DQN) framework for training agents to play Atari games using the OpenAI Gym environment. The implementation includes various enhancements such as Double DQN, Dueling DQN, and Noisy Networks. The project is designed to be modular and easy to extend.

deep-learning gymnasium pytorch rl-learning

Last synced: 11 Mar 2025
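
The Double DQN enhancement mentioned above changes only how the bootstrap target is computed. The sketch below shows the idea for a single transition; the function name and the plain-list Q-values are hypothetical and do not come from the repository.

```python
def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Bootstrap target for one transition under Double DQN.

    The online network chooses the next action, and the target network
    evaluates it, which reduces the overestimation of plain DQN.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]

# The online net prefers action 1, so the target net's value for action 1 is used.
target = double_dqn_target(1.0, False, [1.0, 3.0], [5.0, 2.0], gamma=0.5)
```

Plain DQN would instead take the maximum of the target network's values (5.0 here), which is where the optimistic bias comes from.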

https://github.com/naidezhujimo/lora-dora-qlora-pytorch-efficient-fine-tuning-with-low-rank-adaptation

This repository contains a PyTorch implementation of a Convolutional Neural Network (CNN) for classifying the MNIST dataset. The project explores different fine-tuning techniques, including LoRA (Low-Rank Adaptation), DoRA (Weight-Decomposed Low-Rank Adaptation), and QLoRA (Quantized Low-Rank Adaptation), to improve model performance and efficiency.

cv dora fine-tune lora pytorch qlora

Last synced: 11 Mar 2025
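
For readers unfamiliar with LoRA, the core idea is that a frozen weight matrix W is adapted by adding a scaled low-rank product. The sketch below uses plain Python lists to stay dependency-free; the helper names are hypothetical, and the repository itself presumably works with PyTorch tensors.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * B @ A.

    W is d_out x d_in and stays frozen; only B (d_out x r) and
    A (r x d_in) would be trained, with r much smaller than d_in and d_out.
    """
    delta = matmul(b, a)
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

w = [[1.0, 0.0], [0.0, 1.0]]
a = [[3.0, 4.0]]      # r x d_in with r = 1
b = [[1.0], [2.0]]    # d_out x r
adapted = lora_effective_weight(w, a, b, alpha=2.0, r=1)
```

With B initialised to zeros, as is common for LoRA, the adapted weight equals W at the start of fine-tuning, so training begins from the pretrained model's behaviour.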

https://github.com/naidezhujimo/reinforcement-learning-with-evolution-strategies-es-in-lunarlander-v3

This repository contains an implementation of a reinforcement learning algorithm using Evolution Strategies (ES) to solve the LunarLander-v3 environment from the Gymnasium library. The algorithm uses a neural network to parameterize the policy and optimizes it using noise perturbations and rank-based fitness shaping.

gymnasium pytorch rl

Last synced: 11 Mar 2025
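
The combination of noise perturbations and rank-based fitness shaping can be sketched as follows. This is an illustrative version that optimises a plain parameter list rather than a neural network; the function names, population size, and step sizes are assumptions, not the repository's settings.

```python
import random

def centered_ranks(scores):
    """Replace raw scores by evenly spaced ranks in [-0.5, 0.5] (needs >= 2 scores)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    for rank, i in enumerate(order):
        ranks[i] = rank / (n - 1) - 0.5
    return ranks

def es_step(theta, fitness, rng, pop_size=50, sigma=0.1, lr=0.05):
    """One Evolution Strategies update using Gaussian parameter noise."""
    noises = [[rng.gauss(0.0, 1.0) for _ in theta] for _ in range(pop_size)]
    # Evaluate each perturbed copy of the parameters.
    scores = [fitness([t + sigma * e for t, e in zip(theta, eps)]) for eps in noises]
    # Rank shaping makes the update insensitive to the scale of the rewards.
    shaped = centered_ranks(scores)
    grad = [sum(s * eps[j] for s, eps in zip(shaped, noises)) / (pop_size * sigma)
            for j in range(len(theta))]
    return [t + lr * g for t, g in zip(theta, grad)]

# Toy stand-in for an episode return: higher is better, optimum at the origin.
def toy_fitness(params):
    return -sum(p * p for p in params)

rng = random.Random(0)
theta = [3.0, -2.0]
for _ in range(200):
    theta = es_step(theta, toy_fitness, rng)
```

In a reinforcement learning setting, `toy_fitness` would be replaced by the total reward of an episode run with the perturbed policy parameters.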

https://github.com/naidezhujimo/yingret

Last synced: 09 Apr 2025

https://github.com/naidezhujimo/triton-flashattention

This repository contains multiple implementations of Flash Attention optimized with Triton kernels, showcasing progressive performance improvements through hardware-aware optimizations. The implementations range from basic block-wise processing to advanced techniques like FP8 quantization and prefetching.

attention flashattention triton

Last synced: 09 Apr 2025
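
The block-wise processing mentioned for triton-flashattention rests on an online softmax: attention for one query can be accumulated block by block without ever materialising the full row of weights. The sketch below shows this for a single query with scalar values, in plain Python rather than Triton; the function names and block size are illustrative assumptions.

```python
import math

def naive_attention_row(scores, values):
    """Reference: softmax over all scores, then a weighted sum of values."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def blockwise_attention_row(scores, values, block_size=2):
    """Same result, computed one block at a time with running statistics."""
    running_max = float("-inf")
    denom = 0.0   # running softmax denominator
    acc = 0.0     # running weighted sum of values
    for start in range(0, len(scores), block_size):
        s_blk = scores[start:start + block_size]
        v_blk = values[start:start + block_size]
        new_max = max(running_max, max(s_blk))
        # Rescale everything accumulated under the previous maximum.
        correction = math.exp(running_max - new_max)
        denom *= correction
        acc *= correction
        for s, v in zip(s_blk, v_blk):
            w = math.exp(s - new_max)
            denom += w
            acc += w * v
        running_max = new_max
    return acc / denom
```

Because only the running maximum, denominator, and accumulator are kept, a GPU kernel can hold each block in fast on-chip memory instead of writing the full attention matrix to global memory.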

https://github.com/naidezhujimo/yinggem

YingGem is a lightweight Transformer-based language model designed for efficient text generation. It incorporates sliding window attention and rotary positional embeddings to maintain generation quality while significantly reducing computational complexity. Ideal for poetry generation, dialogue systems, and other NLP tasks.

llm pytorch transformer

Last synced: 09 Apr 2025
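
To see why sliding window attention reduces cost, it helps to look at the attention mask. In the sketch below, each position may attend only to itself and a fixed number of preceding positions; the function name and the causal convention are assumptions for illustration, not YingGem's actual code.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j.

    Causal variant: position i sees itself and the previous window - 1 tokens.
    """
    return [[0 <= i - j < window for j in range(seq_len)] for i in range(seq_len)]

mask = sliding_window_mask(seq_len=4, window=2)
allowed = sum(sum(row) for row in mask)
```

Each row has at most `window` allowed entries, so the number of attended pairs grows as `seq_len * window` rather than `seq_len ** 2`.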

https://github.com/naidezhujimo/cuda-rewrite-fast-matrix-multiplication

This repository contains an optimized implementation of matrix multiplication using CUDA. The goal of this project is to provide a high-performance solution for matrix multiplication operations on NVIDIA GPUs.

cuda

Last synced: 26 Mar 2025

https://github.com/naidezhujimo/firefly-ec

new ec

Last synced: 05 Apr 2025

https://github.com/naidezhujimo/gpt-tokenizer

A technique used in natural language processing to efficiently encode text data.

bpe nlp tokenizer

Last synced: 05 Apr 2025
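
Going by the `bpe` tag, the technique in gpt-tokenizer is byte pair encoding. A minimal byte-level training loop might look like the sketch below; the function names and the convention of numbering new tokens from 256 are illustrative assumptions rather than the repository's API.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Return the most common adjacent pair, or None if there are no pairs."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get) if pairs else None

def merge_pair(tokens, pair, new_token):
    """Replace every non-overlapping occurrence of `pair` with `new_token`."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges over the UTF-8 bytes of `text`."""
    tokens = list(text.encode("utf-8"))
    merges = {}
    for step in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        new_token = 256 + step  # ids 0..255 are reserved for raw bytes
        tokens = merge_pair(tokens, pair, new_token)
        merges[pair] = new_token
    return tokens, merges

tokens, merges = train_bpe("aaabdaaabac", 3)
# tokens == [258, 100, 258, 97, 99]: 11 bytes compressed to 5 tokens
```

Encoding new text would replay the learned merges in the same order, and decoding would expand each merged id back into its byte pair.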

https://github.com/naidezhujimo/yingmab

Mamba is a modern state space model (SSM) featuring input-dependent state transitions and hardware-aware parallel scans using Triton. This implementation demonstrates high-performance sequence modeling through a combination of causal convolutions, selective parameterization, and GPU-optimized recurrent computations.

mamba pytorch

Last synced: 09 Apr 2025
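
The reason a recurrent model of this kind can use a parallel scan is that the linear recurrence h_t = a_t * h_{t-1} + b_t composes associatively. The sketch below checks that property with scalars in plain Python. It is a simplification under stated assumptions: in a selective state space model, a_t and b_t would be computed from the input at each step, and the scan would run as a Triton kernel over vectors rather than as Python recursion.

```python
def sequential_scan(a, b):
    """Reference recurrence: h_t = a_t * h_{t-1} + b_t, starting from h = 0."""
    h, out = 0.0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out

def combine(left, right):
    """Compose two recurrence steps into one equivalent step (associative)."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)

def prefix_scan(pairs):
    """Inclusive scan by recursive halving; the two halves are independent."""
    if len(pairs) == 1:
        return pairs
    mid = len(pairs) // 2
    left, right = prefix_scan(pairs[:mid]), prefix_scan(pairs[mid:])
    carry = left[-1]
    return left + [combine(carry, r) for r in right]

def scan_states(a, b):
    """Hidden states via the associative scan (assumes a non-empty sequence)."""
    return [b_acc for _, b_acc in prefix_scan(list(zip(a, b)))]

a = [0.5, 2.0, 1.0, 0.5]
b = [1.0, 1.0, 1.0, 1.0]
states = scan_states(a, b)   # [1.0, 3.0, 4.0, 3.0]
```

Since the two halves in `prefix_scan` do not depend on each other, parallel hardware can process them at the same time, which gives logarithmic depth instead of a strictly sequential loop.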

Last synced: 26 Mar 2025
