DeepSeek-R1 and Ollama3.2

Compute Benchmark GPU LLM

The rapid advancement of Large Language Models (LLMs) has led to significant improvements in natural language processing, enabling applications such as conversational AI, text generation, and question-answering systems. Recently, DeepSeek-R1 was introduced: a state-of-the-art LLM designed for a wide range of NLP tasks.

This report evaluates the DeepSeek-R1 model in terms of computational efficiency and reasoning capability. Specifically, we measure the model's throughput in tokens per second (TPS), a key metric for the speed and scalability of language models. We also assess its reasoning capacity on a set of challenging tasks that require logical inference, decision-making, and problem-solving.
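Since we serve the models through Ollama, TPS can be derived directly from the timing fields in Ollama's generate response (`eval_count` is the number of tokens produced, `eval_duration` is the generation time in nanoseconds). The sketch below shows the computation; the example numbers are illustrative, not measured results.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation throughput from Ollama's response fields.

    Ollama's /api/generate response reports `eval_count` (tokens
    generated) and `eval_duration` (time spent generating, in
    nanoseconds); their ratio gives tokens per second.
    """
    return eval_count / (eval_duration_ns / 1e9)

# Illustrative example: 512 tokens generated in 16 s -> 32.0 tokens/s
print(tokens_per_second(512, 16_000_000_000))
```

The same fields are available whether the model is queried via the REST API or the Python client, so the metric is comparable across DeepSeek-R1, phi4, and the LLaMA3 variants.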

Through this study, we aim to provide an objective assessment of DeepSeek-R1's strengths and limitations against phi4 and different versions of LLaMA3, offering insights into its potential applications and areas for improvement. The findings are intended to help users of open-source LLMs understand the available choices.


Following the philosophy of running advanced software on "old" hardware, we selected an NVIDIA V100 from Open Telekom Cloud: a PCI-Express GPU with 32 GB of VRAM.