Llama-3.1-Minitron 4B: Nvidia's Compact Language Model Revolution

Key Features

🔬 Only 4 billion parameters

🚀 Efficient to train and deploy

💡 Performance comparable to larger models

🔓 Available on Hugging Face for commercial use (see the loading sketch below)
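
Because the checkpoints are published for commercial use, the model drops straight into a standard transformers workflow. The sketch below assumes the width-pruned base variant under the repo id nvidia/Llama-3.1-Minitron-4B-Width-Base; check Nvidia's Hugging Face organization page for the exact checkpoint names and license terms.

```python
# Minimal loading sketch. The repo id below is assumed to be the
# width-pruned base variant; verify the exact checkpoint names and
# license on Nvidia's Hugging Face organization page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Llama-3.1-Minitron-4B-Width-Base"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 4B parameters fit on one modern GPU in bf16
    device_map="auto",
)

inputs = tokenizer("Structured pruning is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```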

Model Development Process

1. Fine-tuning the 8B model on a 94-billion-token dataset

2. Applying depth and width pruning techniques (a simplified pruning sketch follows this list)

3. Fine-tuning the pruned model with NeMo-Aligner

4. Evaluating the model's capabilities
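
Step 2 is the heart of the Minitron recipe: the importance of neurons, attention heads, and layers is estimated from activations on a small calibration set, and the lowest-scoring structures are removed. Below is a deliberately simplified sketch of activation-based width pruning for a single feed-forward layer in PyTorch; Nvidia's actual pipeline scores several axes jointly and follows pruning with distillation-based retraining.

```python
# Simplified sketch of activation-based width pruning for one MLP layer.
# Illustrative only: the real pipeline scores heads, neurons, and embedding
# channels together and retrains the pruned model afterwards.
import torch
import torch.nn as nn

def prune_mlp_width(fc_in: nn.Linear, fc_out: nn.Linear,
                    calib_batch: torch.Tensor, keep_ratio: float = 0.5):
    """Keep the hidden neurons with the largest mean activation magnitude."""
    with torch.no_grad():
        # Importance of each hidden neuron, estimated on calibration data.
        acts = torch.relu(fc_in(calib_batch))          # (batch, hidden)
        importance = acts.abs().mean(dim=0)            # (hidden,)
        n_keep = int(importance.numel() * keep_ratio)
        keep = importance.topk(n_keep).indices.sort().values

        # Build smaller layers that reuse the surviving weights.
        new_in = nn.Linear(fc_in.in_features, n_keep)
        new_in.weight.copy_(fc_in.weight[keep])
        new_in.bias.copy_(fc_in.bias[keep])
        new_out = nn.Linear(n_keep, fc_out.out_features)
        new_out.weight.copy_(fc_out.weight[:, keep])
        new_out.bias.copy_(fc_out.bias)
    return new_in, new_out

# Toy usage: prune a 64-wide hidden layer down to 32 neurons.
fc1, fc2 = nn.Linear(16, 64), nn.Linear(64, 16)
calib = torch.randn(128, 16)
fc1_small, fc2_small = prune_mlp_width(fc1, fc2, calib, keep_ratio=0.5)
print(fc1_small.weight.shape, fc2_small.weight.shape)  # (32, 16), (16, 32)
```

The key design point is that the pruning is structured: whole neurons are removed, so the resulting layers are simply smaller dense matrices that run efficiently on standard hardware.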

Performance Highlights

Training data reduction: 40x fewer training tokens than building a comparable model from scratch
MMLU benchmark improvement: 16% increase over a same-size model trained from scratch (the distillation sketch below illustrates how such token savings are possible)
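
These savings hinge on knowledge distillation: rather than learning from raw text alone, the pruned student is retrained to match the teacher's next-token distribution, which transfers capability with far fewer tokens. Here is a minimal sketch of a logit-distillation loss in PyTorch, a simplification of the general technique rather than Nvidia's exact training code:

```python
# Sketch of a knowledge-distillation loss for retraining a pruned student
# against its larger teacher (simplified to plain logit distillation).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL divergence between teacher and student next-token distributions."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    # batchmean KL, scaled by t^2 to keep gradient magnitudes comparable
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * (t * t)

# Toy usage: vocabulary of 10, batch of 4 token positions.
student = torch.randn(4, 10, requires_grad=True)
teacher = torch.randn(4, 10)
loss = distillation_loss(student, teacher, temperature=2.0)
loss.backward()
print(float(loss))
```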

Model Capabilities

📝 Instruction Following

🎭 Role Playing

🔍 Retrieval-Augmented Generation (RAG) (see the prompt sketch below)

⚙️ Function Calling
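
RAG and function calling are both prompt-level patterns: the application retrieves or declares external information, and the model conditions its answer on it. The sketch below shows the minimal RAG loop with a toy word-overlap retriever; the documents, question, and prompt template are all illustrative, and a real deployment would use a vector store and the model's chat template.

```python
# Minimal RAG sketch: retrieve the most relevant snippet with a toy keyword
# scorer, then place it in the prompt so the model grounds its answer in it.
# Documents and prompt template are illustrative, not from Nvidia's docs.
docs = [
    "Minitron models are produced by pruning and distilling larger Llama models.",
    "NeMo-Aligner is Nvidia's toolkit for aligning models with human feedback.",
    "RAG grounds a model's answer in retrieved documents to reduce hallucination.",
]

def retrieve(query: str, corpus: list[str]) -> str:
    """Score each document by word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    return max(corpus, key=lambda d: len(q & set(d.lower().split())))

question = "How are Minitron models created?"
context = retrieve(question, docs)

prompt = (
    "Answer using only the context below.\n"
    f"Context: {context}\n"
    f"Question: {question}\nAnswer:"
)
print(prompt)
# Pass `prompt` to the model exactly as in the loading example above, e.g.:
#   inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
#   outputs = model.generate(**inputs, max_new_tokens=100)
```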


@AIbase

Tags: Llama-3.1-Minitron 4B, Nvidia, language model, compact language model, efficient training, Hugging Face models, instruction following AI, role playing AI, RAG, model capabilities