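Learn the main differences between Llama 2 and GPT-4, two leading giants of natural language processing. Discover their strengths, their weaknesses, and how they are shaping the future of language technology.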
Benchmark | Shots | GPT-3.5 | GPT-4 | PaLM | PaLM-2-L | Llama 2 |
MMLU (5-shot) | 70 | 78.3 | 86.1 | – | – | 86.4 |
TriviaQA (1-shot) | 69.3 | 33 | 37.5 | – | – | 81.4 |
Natural Questions (1-shot) | 68.9 | 37.5 | 52.3 | – | – | 85 |
GSM8K (8-shot) | 85 | 56.5 | 56.8 | – | – | 87 |
HumanEval (0-shot) | 48.1 | 92 | 56.7 | – | – | 51.2 |
BIG-Bench Hard (3-shot) | 29.3 | 56.8 | 26.2 | – | – | 29.9 |