Lite: 200 req/day

Core: 5,000 req/day

Plus: 10,000 req/day

Pro: 80,000 req/day

Max: Unlimited

Lite: 10 req/day

Core: 10 req/day

Plus: 2,000 req/day

Pro: 30,000 req/day

Max: Unlimited

Small Models
Model NameVersionContextSpeed*Prompt FormatRecommendation
Meta-Llama-3.1-8B-Instruct[NEW]3.1131,072

SR: 50t/s

PR: 400t/s

Llama 3 InstructMeta's Newest Llama model. Great for general use.
Meta-Llama-3-8B-Instruct3.0 Abliterated8,192

SR: 50t/s

PR: 400t/s

Llama 3 InstructGreat for general use, verbose, abliterated for less refusal
Awanllm-Llama-3-8B-Dolfin1.08,192

SR: 50t/s

PR: 400t/s

Llama 3 InstructExact instructions following, less refusals, no warnings
Awanllm-Llama-3-8B-Cumulus1.08,192

SR: 50t/s

PR: 400t/s

Llama 3 InstructGreat for storywriting or RP, zero refusal, follows characters
Large Models
Model NameVersionContextSpeed*Prompt FormatRecommendation
Meta-Llama-3.1-70B-Instruct[NEW]3.1131,072

SR: 20t/s

PR: 100t/s

Llama 3 InstructGreat for general use, SOTA dense LLM model
Meta-Llama-3-70B-Instruct1.08,192

SR: 20t/s

PR: 100t/s

Llama 3 InstructGreat for general use, SOTA dense LLM model
*SR = Tokens/Second for Single Requests. PR = Total Tokens/Second for Parallel Requests. Speeds are an estimate.