Projects in Awesome Lists by naidezhujimo
A curated list of projects in awesome lists by naidezhujimo.
https://github.com/naidezhujimo/sparse-moe-language-model-v1
This repository contains an implementation of a Sparse Mixture of Experts (MoE) Language Model using PyTorch. The model is designed to handle large-scale text generation tasks efficiently by leveraging multiple expert networks and a routing mechanism to dynamically select the most relevant experts for each input.
Last synced: 05 Apr 2025
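The routing idea in the description above can be illustrated with a small top-k gating sketch. This is not taken from the repository: the function names (`top_k_route`, `moe_forward`), the toy scalar experts, and the choice of softmax over only the selected logits are assumptions made for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Assumption: weights are a softmax over the selected logits only.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, router_logits, experts, k=2):
    """Weighted sum of the outputs of the k selected experts; the rest are skipped."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))

# Hypothetical toy experts acting on a scalar input.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 2.0]
routes = top_k_route(logits, k=2)
output = moe_forward(3.0, logits, experts, k=2)
```

Only the selected experts are evaluated, which is where the efficiency of a sparse model comes from: compute per input grows with k rather than with the total number of experts.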
https://github.com/naidezhujimo/cot-multi-path-generation-and-self-consistency
Last synced: 15 May 2025
https://github.com/naidezhujimo/glowflow-p1-framework-for-training-and-evaluating-language-models-incomplete-
A personal project that is still incomplete and has many known problems.
Last synced: 28 Mar 2025
https://github.com/naidezhujimo/yinghub-v2-a-sparse-moe-language-model
YingHub-v2 is an advanced language model built upon the Sparse Mixture of Experts (MoE) architecture. It leverages dynamic routing mechanisms and expert load balancing, and incorporates state-of-the-art training and optimization strategies.
Last synced: 28 Mar 2025
https://github.com/naidezhujimo/-dqn-pong-dqn-based-pong-game
PyTorch, Gymnasium, DQN, Double Q-Learning, Dueling Network, Multi-step learning, Noisy Networks
deep-learning gymnasium pytorch rl-learning
Last synced: 04 Mar 2025
https://github.com/naidezhujimo/parallel-sampling-with-sequential-revision-and-joint-optimization
Last synced: 15 May 2025
https://github.com/naidezhujimo/mindset-punishment-tip-time-preference-optimization-tpo
Last synced: 15 May 2025
https://github.com/naidezhujimo/exploring-the-limit-of-outcome-reward-for-learning-mathematical-reasoning
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
Last synced: 15 May 2025
https://github.com/naidezhujimo/proximal-policy-optimization-ppo-for-bipedalwalker-v3
This repository contains an implementation of the Proximal Policy Optimization (PPO) algorithm to solve the BipedalWalker-v3 environment from the Gymnasium library. This project uses a combination of policy and value networks to learn a policy for controlling a bipedal walker.
deep-learning gae gymnasium ppo-pytorch pytorch rl
Last synced: 11 Mar 2025
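As a rough illustration of how the policy and value networks interact in PPO with GAE (the `gae` tag above), the sketch below computes generalized advantage estimates from value predictions and applies the clipped surrogate objective to a single sample. The function names, default hyperparameters, and the single-segment setup without episode-termination flags are assumptions, not details from the repository.

```python
def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one trajectory segment.

    `values` must hold one more entry than `rewards`: the last entry is the
    bootstrap value of the state after the final step. Episode termination
    flags are omitted to keep the sketch short.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error from the value network's predictions.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages

def clipped_surrogate(ratio, advantage, eps=0.2):
    """PPO clipped objective for one sample (to be maximized).

    `ratio` is new_policy_prob / old_policy_prob for the taken action.
    """
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)

advantages = gae([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, lam=1.0)
objective = clipped_surrogate(1.5, advantages[0])
```

The clip removes the incentive to move the probability ratio outside `[1 - eps, 1 + eps]` when doing so would increase the objective, which keeps each policy update close to the policy that collected the data.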
https://github.com/naidezhujimo/deep-q-network-dqn-for-atari-games
This repository implements a Deep Q-Network (DQN) framework for training agents to play Atari games using the OpenAI Gym environment. The implementation includes various enhancements such as Double DQN, Dueling DQN, and Noisy Networks. The project is designed to be modular and easy to extend.
deep-learning gymnasium pytorch rl-learning
Last synced: 11 Mar 2025
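Of the enhancements listed above, Double DQN is the easiest to show in a few lines. The sketch below computes the bootstrap target for a single transition; the function name, argument names, and plain-list Q-values are assumptions for illustration and do not reflect the repository's code.

```python
def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double DQN target for one transition.

    The online network chooses the next action and the target network
    evaluates it, which reduces the overestimation bias of taking a max
    over a single network's estimates.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]

# The online network prefers action 1, so the target network's value for
# action 1 is used even though its own maximum is at action 0.
target = double_dqn_target(1.0, False, [0.2, 0.9], [5.0, 3.0], gamma=0.5)
```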
https://github.com/naidezhujimo/lora-dora-qlora-pytorch-efficient-fine-tuning-with-low-rank-adaptation
This repository contains a PyTorch implementation of a Convolutional Neural Network (CNN) for classifying the MNIST dataset. The project explores different fine-tuning techniques, including LoRA (Low-Rank Adaptation), DoRA (Weight-Decomposed Low-Rank Adaptation), and QLoRA (Quantized Low-Rank Adaptation), to improve model performance and efficiency.
cv dora fine-tune lora pytorch qlora
Last synced: 11 Mar 2025
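The core of LoRA is a frozen weight matrix plus a trainable low-rank update. The following is a minimal sketch for a single linear layer using plain lists; the names (`lora_forward`, `alpha`) and the shapes are assumptions for illustration, and the repository's CNN layers may apply the idea differently.

```python
def matvec(matrix, vector):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha=1.0):
    """Compute W x + (alpha / r) * B (A x).

    W: frozen weights, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r)
    """
    r = len(A)  # rank of the update
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]            # rank 1
B_zero = [[0.0], [0.0]]     # a zero B leaves the layer unchanged
B = [[1.0], [2.0]]
unchanged = lora_forward(W, A, B_zero, [1.0, 2.0])
adapted = lora_forward(W, A, B, [1.0, 2.0], alpha=2.0)
```

Only `A` and `B` would be trained, which is `r * (d_in + d_out)` parameters instead of `d_in * d_out` for the full matrix.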
https://github.com/naidezhujimo/reinforcement-learning-with-evolution-strategies-es-in-lunarlander-v3
This repository contains an implementation of a reinforcement learning algorithm using Evolution Strategies (ES) to solve the LunarLander-v3 environment from the Gymnasium library. The algorithm uses a neural network to parameterize the policy and optimizes it using noise perturbations and rank-based fitness shaping.
Last synced: 11 Mar 2025
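The combination of noise perturbations and rank-based fitness shaping described above can be sketched on a toy objective. Everything here is an assumption for illustration: the function names, the hyperparameters, and the one-dimensional quadratic that stands in for the LunarLander-v3 episode return.

```python
import random

def centered_ranks(fitness):
    """Map fitness values to evenly spaced utilities in [-0.5, 0.5] by rank."""
    n = len(fitness)
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: fitness[i])
    utilities = [0.0] * n
    for rank, i in enumerate(order):
        utilities[i] = rank / (n - 1) - 0.5
    return utilities

def es_step(theta, objective, rng, pop_size=50, sigma=0.1, lr=0.05):
    """One Evolution Strategies update using Gaussian parameter noise."""
    noises = [[rng.gauss(0.0, 1.0) for _ in theta] for _ in range(pop_size)]
    fitness = [objective([t + sigma * e for t, e in zip(theta, eps)])
               for eps in noises]
    utilities = centered_ranks(fitness)
    # Gradient estimate: utility-weighted average of the noise directions.
    grad = [sum(u * eps[j] for u, eps in zip(utilities, noises)) / (pop_size * sigma)
            for j in range(len(theta))]
    return [t + lr * g for t, g in zip(theta, grad)]

def toy_objective(params):
    # Hypothetical stand-in for an episode return; maximized at params[0] == 3.
    return -(params[0] - 3.0) ** 2

rng = random.Random(0)
theta = [0.0]
for _ in range(200):
    theta = es_step(theta, toy_objective, rng)
```

Replacing raw returns with ranks makes the update invariant to the scale of the rewards and robust to outlier episodes.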
https://github.com/naidezhujimo/triton-flashattention
This repository contains multiple implementations of Flash Attention optimized with Triton kernels, showcasing progressive performance improvements through hardware-aware optimizations. The implementations range from basic block-wise processing to advanced techniques like FP8 quantization and prefetching.
attention flashattention triton
Last synced: 09 Apr 2025
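The block-wise processing mentioned above rests on the online softmax trick: keys and values are consumed one block at a time while a running maximum and running normalizer are kept, so the full score row is never materialized. The pure-Python sketch below shows this for a single query and checks nothing about performance. It is an illustration under assumed names (`attention_blockwise`, `block_size`), not the repository's Triton kernels.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_reference(q, keys, values):
    """Standard softmax attention for one query, computed in one pass."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [dot(q, k) * scale for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(len(values[0]))]

def attention_blockwise(q, keys, values, block_size=2):
    """Same result, processing keys/values block by block (online softmax)."""
    scale = 1.0 / math.sqrt(len(q))
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * len(values[0])
    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        scores = [dot(q, k) * scale for k in k_blk]
        new_max = max(running_max, max(scores))
        # Rescale earlier partial results to the new running maximum.
        rescale = math.exp(running_max - new_max)
        probs = [math.exp(s - new_max) for s in scores]
        acc = [a * rescale + sum(p * v[d] for p, v in zip(probs, v_blk))
               for d, a in enumerate(acc)]
        running_sum = running_sum * rescale + sum(probs)
        running_max = new_max
    return [a / running_sum for a in acc]

q = [1.0, 0.5]
keys = [[0.1, 0.2], [1.0, -1.0], [0.3, 0.9], [-0.5, 0.4], [2.0, 0.1]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [2.0, -1.0], [-1.0, 3.0]]
full = attention_reference(q, keys, values)
blocked = attention_blockwise(q, keys, values, block_size=2)
```

On a GPU each block would be loaded into fast on-chip memory; here the point is only that the two functions agree up to floating-point error.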
https://github.com/naidezhujimo/yinggem
YingGem is a lightweight Transformer-based language model designed for efficient text generation. It incorporates sliding window attention and rotary positional embeddings to maintain generation quality while significantly reducing computational complexity. Ideal for poetry generation, dialogue systems, and other NLP tasks.
Last synced: 09 Apr 2025
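Both mechanisms named in the description above are small enough to sketch directly. The code below builds a causal sliding-window mask and applies a rotary embedding to one vector. The function names, the interleaved pairing of dimensions, and the base of 10000 are assumptions chosen for illustration and may differ from YingGem's implementation.

```python
import math

def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when query i may attend to key j.

    Causal, and limited to the `window` most recent positions (including i).
    """
    return [[(i - window < j <= i) for j in range(seq_len)]
            for i in range(seq_len)]

def rope(vec, pos, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles.

    Assumes an even-length vector and interleaved pairs (0,1), (2,3), ...
    """
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        angle = pos * base ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        out += [vec[i] * c - vec[i + 1] * s, vec[i] * s + vec[i + 1] * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

mask = sliding_window_mask(seq_len=4, window=2)
q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.25]
# Scores depend only on the relative offset between positions (2 in both cases).
score_a = dot(rope(q, 5), rope(k, 3))
score_b = dot(rope(q, 2), rope(k, 0))
```

With a fixed window, each query attends to at most `window` keys, so attention cost grows linearly with sequence length instead of quadratically.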
https://github.com/naidezhujimo/cuda-rewrite-fast-matrix-multiplication
This repository contains an optimized implementation of matrix multiplication using CUDA. The goal of this project is to provide a high-performance solution for matrix multiplication operations on NVIDIA GPUs.
Last synced: 26 Mar 2025
https://github.com/naidezhujimo/gpt-tokenizer
A technique used in natural language processing to efficiently encode text data.
Last synced: 05 Apr 2025
https://github.com/naidezhujimo/yingmab
Mamba is a modern state space model (SSM) featuring input-dependent state transitions and hardware-aware parallel scans using Triton. This implementation demonstrates high-performance sequence modeling through a combination of causal convolutions, selective parameterization, and GPU-optimized recurrent computations.
Last synced: 09 Apr 2025
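To make "input-dependent state transitions" concrete, here is a heavily simplified scalar sketch of a selective state space recurrence, run sequentially rather than with a parallel scan. The parameter names (`w_delta`, `w_b`, `w_c`), the scalar state, and the discretization `a_bar = exp(delta * A)`, `b_bar = delta * B` are assumptions for illustration; the repository's Triton implementation and the causal convolution are not reproduced here.

```python
import math

def softplus(x):
    return math.log1p(math.exp(x))

def selective_scan(xs, A=-1.0, w_delta=1.0, w_b=1.0, w_c=1.0):
    """Sequential reference for a one-dimensional selective SSM.

    The step size, input gate, and output gate are all functions of the
    current input, so the state transition changes from token to token.
    """
    h = 0.0
    ys = []
    for x in xs:
        delta = softplus(w_delta * x)   # input-dependent step size
        a_bar = math.exp(delta * A)     # decay in (0, 1) because A < 0
        b_bar = delta * (w_b * x)       # input-dependent input projection
        h = a_bar * h + b_bar * x       # state update
        ys.append((w_c * x) * h)        # input-dependent readout
    return ys

outputs = selective_scan([1.0, 2.0, 3.0])
```

Because each update has the form `h = a * h + b`, the sequence of updates is associative and can be evaluated with a parallel prefix scan, which is what makes a GPU implementation practical.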
https://github.com/naidezhujimo/cuda-learning-just-record-the-learning-process-
A record of my CUDA learning process. Notes are included, and everyone is welcome to learn from them.
Last synced: 26 Mar 2025