Awan LLM

Lite: 200 req/day

Core: 5,000 req/day

Plus: 10,000 req/day

Pro: 80,000 req/day

Max: Unlimited

Lite: 10 req/day

Core: 10 req/day

Plus: 2,000 req/day

Pro: 30,000 req/day

Max: Unlimited

Model Name	Version	Context	Speed*	Prompt Format	Recommendation
Small Models
Meta-Llama-3.1-8B-Instruct[NEW]	3.1	131,072	SR: 50t/s PR: 400t/s	Llama 3 Instruct	Meta's Newest Llama model. Great for general use.
Meta-Llama-3-8B-Instruct	3.0 Abliterated	8,192	SR: 50t/s PR: 400t/s	Llama 3 Instruct	Great for general use, verbose, abliterated for less refusal
Awanllm-Llama-3-8B-Dolfin	1.0	8,192	SR: 50t/s PR: 400t/s	Llama 3 Instruct	Exact instructions following, less refusals, no warnings
Awanllm-Llama-3-8B-Cumulus	1.0	8,192	SR: 50t/s PR: 400t/s	Llama 3 Instruct	Great for storywriting or RP, zero refusal, follows characters

Large Models
Model Name	Version	Context	Speed*	Prompt Format	Recommendation
Meta-Llama-3.1-70B-Instruct[NEW]	3.1	131,072	SR: 20t/s PR: 100t/s	Llama 3 Instruct	Great for general use, SOTA dense LLM model
Meta-Llama-3-70B-Instruct	1.0	8,192	SR: 20t/s PR: 100t/s	Llama 3 Instruct	Great for general use, SOTA dense LLM model

*SR = Tokens/Second for Single Requests. PR = Total Tokens/Second for Parallel Requests. Speeds are an estimate.