| Model | Params / Size | RAM (runtime) | Compute | Max Context (tokens) | Browser Feasibility | Typical Use Cases | Source |
|---|---|---|---|---|---|---|---|
| BERT (base) | 110M / 420MB | 400–800 MB | High for long sequences, otherwise moderate | 512 | Prototyping, not optimal | Text classification, NER, feature extraction | Hugging Face GitHub Demo |
| BERT (large) | 340M / 1.6GB | 1.2–2.5 GB | Very high | 512 | Not practical for browser | Advanced NLU, classification, QA | Restackio |
| DistilBERT | 66M / 250MB | 200–400 MB | Lower than BERT-base | 512 | Excellent, browser-optimized | Fast text classification, Q&A, sentiment analysis | Hugging Face |
| MobileBERT | 25M / 100MB | 100–200 MB | Low, efficient | 512 | Best for browser/mobile | Lightweight Q&A, classification | Hugging Face Demo |
| ALBERT (base) | 12M / 45–60MB | 80–160 MB | Very efficient | 512 | Very good, small footprint | Efficient text classification, intent detection | Hugging Face |
| MiniLM | 33M / 120MB | 150–250 MB | Low | 512 | Excellent for search/ranking | Semantic search, ranking, recommendations | Hugging Face |
| TinyBERT | 14M / 55MB | 60–120 MB | Very low | 512 | Excellent | Fast, efficient, competitive on small tasks | Hugging Face |
| ModernBERT | 110M / varies | 400–700 MB | Moderate (browser-optimized) | 8,192 | State of the art, large context | Retrieval, classification, code/doc search | Hugging Face Blog |
| Longformer | 149M / 575MB | 600–1000 MB | High for long documents | 4,096 | Research, long docs | Long doc Q&A, summarization | GitHub |
| BioBERT | 110M / 420MB | 400–800 MB | Same as BERT-base | 512 | Research, domain-specific | Biomedical QA, extraction | Hugging Face |
| Entity-BERT | 110M / 420MB | 400–800 MB | Same as BERT-base | 512 | Research, specialized | Entity extraction in domain tasks | Frontiers Paper |
| NeoBERT | 80–120M / TBD | Expected lower than BERT-base | Expected lower than BERT-base | 512+ | Research, next-gen efficient | Classification, retrieval, efficient inference | arXiv |
| Qwen2 (1B) | 1B / 4GB | 2–4 GB | Moderate | 2,048+ | Good (quantized) | Multitask, edge, fast | Hugging Face |
| Phi-3 (mini) | 3.8B / 7.6GB | 4–8 GB | Moderate | 128K | Good (quantized) | Code, reasoning, dialogue | Hugging Face |
| TinyLlama | 1.1B / 4GB | 2–4 GB | Moderate | 2,048+ | Good (quantized) | General SLM, tiny, fast | Hugging Face |
| Llama 2 (7B) | 7B / 28GB | 8–16 GB | High; demanding on edge devices | 4,096 | Feasible (quantized, edge) | General, strong SLM | Meta |
| Mistral (7B) | 7B / 28GB | 8–16 GB | High; demanding on edge devices | 8,192 | Feasible (quantized, edge) | Fast, efficient, browser/edge | Hugging Face |
| Gemma 2 (2B) | 2B / 8GB | 4–8 GB | Moderate | 8,192 | Good (quantized) | Efficient, edge, general SLM | Google |
| Gemma 3 1B | 1B / 529MB | 1–2 GB | Low, very efficient | 32K | Excellent | Text gen, QA, summarization, chat | Google / Hugging Face |
| Gemma 3 4B | 4B / 2GB | 3–5 GB | Moderate (edge/browser when quantized) | 128K | Excellent (quantized) | Text/image gen, multilingual, summarization | Google / Hugging Face |
| StableLM 2 1.6B | 1.6B / 6GB | 2–4 GB | Low, very efficient | 8,192 | Excellent | NLU, chat, multilingual tasks | Hugging Face |
| MiniCPM | 1.2B / 5GB | 2–4 GB | Low, very efficient | 2,048 | Excellent | General SLM, chat, summarization | Hugging Face |
| BLOOMZ 560M | 560M / 2GB | 1–2 GB | Low, very efficient | 2,048 | Excellent | Translation, chat, summarization, instruct | Hugging Face |
| Unreleased: Qwen3 SLM | 1–4B (expected) | ≤4 GB (quantized, est.) | Low (design goal) | 32K–128K (rumored) | Expected excellent | Multilingual, multimodal, chat, QA | GitHub |
| Unreleased: Llama 4 Mini | 1–4B (expected) | ≤4 GB (quantized, est.) | Low (design goal) | 32K–128K (rumored) | Expected excellent | NLU, chat, summarization | Meta |
| Unreleased: DeepSeek SLM v2 | 1–4B (expected) | ≤4 GB (est.) | Low (est.) | 32K–128K (rumored) | Expected excellent | General SLM, chat | GitHub |
| Unreleased: Falcon Lite 1B | 1B (expected) | ≤2 GB (est.) | Low (est.) | 8K+ | Expected excellent | Multilingual NLU, chat | Falcon Hugging Face |
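As a back-of-the-envelope check on the size columns above, a model's on-disk footprint is roughly parameter count × bytes per parameter (4 for FP32, 2 for FP16/BF16, 1 for int8, ~0.5 for int4), with runtime RAM typically 1.5–2× that once activations and runtime overhead are included. The helper below is an illustrative sketch of that arithmetic only; it is not tied to any particular library:

```python
def model_size_mb(params: float, bytes_per_param: float = 4.0) -> float:
    """Approximate model footprint in MB: params x bytes per parameter.

    bytes_per_param: 4.0 = FP32, 2.0 = FP16/BF16, 1.0 = int8, 0.5 = int4.
    Actual checkpoints run slightly smaller or larger depending on
    embeddings, metadata, and file format.
    """
    return params * bytes_per_param / 1e6

# Sanity checks against rows in the table:
print(model_size_mb(110e6))     # BERT-base, FP32  -> 440.0 MB (table: ~420MB)
print(model_size_mb(66e6))      # DistilBERT, FP32 -> 264.0 MB (table: ~250MB)
print(model_size_mb(7e9, 0.5))  # 7B model, int4   -> 3500.0 MB
```

This is why the 7B rows become browser-feasible only when quantized: at FP32 they weigh ~28 GB, but int4 brings them under 4 GB.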