Switch Transformers
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
Precursor of Switch Transformers
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Mixtral SOTA
Mixtral of Experts
Adaptive mixtures of local experts
Hierarchical Mixtures of Experts and the EM Algorithm
Blackboard design pattern
Two complementary patterns for building multi-expert systems (both sketched below)
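To make the first pattern concrete, here is a minimal sketch of learned top-1 ("switch") gating in the spirit of the Switch Transformers paper: a router picks one expert per token and scales its output by the gate probability. The names (`SwitchFFN`, `n_experts`) are illustrative assumptions, not the authors' code, and this omits the paper's load-balancing loss and capacity limits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchFFN(nn.Module):
    """Top-1 routed feed-forward layer: each token is sent to one expert."""
    def __init__(self, d_model: int, d_ff: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # learned gate
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model); routing distribution over experts
        probs = F.softmax(self.router(x), dim=-1)
        gate, idx = probs.max(dim=-1)                 # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e                           # tokens routed to expert e
            if mask.any():
                # scaling by the gate keeps the router differentiable
                out[mask] = gate[mask].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(8, 16)                           # 8 tokens, d_model=16
layer = SwitchFFN(d_model=16, d_ff=32, n_experts=4)
print(layer(tokens).shape)                            # torch.Size([8, 16])
```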
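And a minimal sketch of the second pattern, the blackboard: independent knowledge sources cooperate by reading and extending a shared state under a simple control loop, with no learned gate deciding who acts. All names here (`run_blackboard`, `KnowledgeSource`) are hypothetical, chosen only to illustrate the pattern.

```python
from typing import Callable

# A knowledge source is a (condition, action) pair: it fires only when the
# blackboard holds its inputs and has not yet produced its output.
KnowledgeSource = tuple[Callable[[dict], bool], Callable[[dict], None]]

def run_blackboard(state: dict, sources: list[KnowledgeSource]) -> dict:
    """Let any ready knowledge source post to the shared state,
    repeating until a full pass makes no progress."""
    progress = True
    while progress:
        progress = False
        for ready, contribute in sources:
            if ready(state):
                contribute(state)
                progress = True
    return state

sources: list[KnowledgeSource] = [
    # Each guard checks "output not present yet", so the loop terminates.
    (lambda s: "text" in s and "tokens" not in s,
     lambda s: s.update(tokens=s["text"].split())),
    (lambda s: "tokens" in s and "count" not in s,
     lambda s: s.update(count=len(s["tokens"]))),
]
print(run_blackboard({"text": "mixture of experts"}, sources))
# {'text': 'mixture of experts', 'tokens': ['mixture', 'of', 'experts'], 'count': 3}
```

The contrast is what makes the patterns complementary: the gated mixture learns which expert should answer, while the blackboard lets experts self-select opportunistically based on the current shared state.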