
The Quadro RTX 8000 with 48 GB of RAM is ideal for training networks that require large batch sizes, which would otherwise be limited on lower-end GPUs. Using FP16 showed impressive gains in images/sec across most models when using 4 GPUs. ResNet-50 and ResNet-152 showed massive scaling when going from 1 to 2 to 4 GPUs: a mind-blowing 4193.48 images/sec for ResNet-.96 images/sec for ResNet-152 at FP16 & XLA! AlexNet and VGG16 performed better with a smaller batch size on a single GPU, but a larger batch size performed better on these models when scaling up to 4 GPUs.


This is especially true when scaling to the 4 GPU configuration.
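To make the 1-2-4 GPU scaling claims easier to check against your own runs, here is a minimal Python sketch that turns images/sec figures into speedup and scaling efficiency. The function name `scaling_report` is hypothetical, and the throughput values in the example are placeholders for illustration, not results from our benchmarks.

```python
def scaling_report(throughput):
    """Map each GPU count to (speedup, efficiency) relative to the smallest count.

    throughput: dict of {num_gpus: images_per_sec}.
    """
    base_gpus = min(throughput)
    base_rate = throughput[base_gpus]
    report = {}
    for gpus in sorted(throughput):
        # Speedup: how many times faster than the baseline configuration.
        speedup = throughput[gpus] / base_rate
        # Efficiency: speedup divided by the ideal (linear) speedup.
        efficiency = speedup / (gpus / base_gpus)
        report[gpus] = (speedup, efficiency)
    return report


# Placeholder throughputs in images/sec (illustrative only, NOT measured).
example = {1: 100.0, 2: 190.0, 4: 360.0}
for gpus, (speedup, eff) in scaling_report(example).items():
    print(f"{gpus} GPU(s): {speedup:.2f}x speedup, {eff:.0%} efficiency")
```

An efficiency close to 100% means throughput grows almost linearly with GPU count. Lower values usually point to communication or input-pipeline overhead.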

For this post, we conducted deep learning performance benchmarks for TensorFlow using the new NVIDIA Quadro RTX 8000 GPUs. The results have been updated with XLA FP32 and XLA FP16 metrics. Our Exxact Valence Workstation was equipped with 4x Quadro RTX 8000s, giving us an awesome 192 GB of GPU memory for our system. To demonstrate, we ran the standard tf_cnn_benchmarks.py benchmark script (found in the official TensorFlow GitHub).
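For readers who want to try a similar sweep, the sketch below assembles tf_cnn_benchmarks.py command lines for 1, 2, and 4 GPUs. It is an illustration under assumptions, not the exact invocation used for this post: the helper `build_benchmark_cmd` is hypothetical, the batch size of 64 is a placeholder, and the flag names (`--model`, `--num_gpus`, `--batch_size`, `--use_fp16`, `--xla`) should be checked against the script's `--help` output for your checkout.

```python
def build_benchmark_cmd(model, num_gpus, batch_size,
                        use_fp16=False, use_xla=False,
                        script="tf_cnn_benchmarks.py"):
    """Build an argument list for one benchmark run (hypothetical helper)."""
    cmd = [
        "python", script,
        f"--model={model}",
        f"--num_gpus={num_gpus}",
        f"--batch_size={batch_size}",  # per-GPU batch size
    ]
    if use_fp16:
        cmd.append("--use_fp16=True")  # assumed flag for FP16 training
    if use_xla:
        cmd.append("--xla=True")       # assumed flag for XLA compilation
    return cmd


# Print the commands for a 1-2-4 GPU sweep; batch size 64 is a placeholder.
for gpus in (1, 2, 4):
    cmd = build_benchmark_cmd("resnet50", gpus, batch_size=64,
                              use_fp16=True, use_xla=True)
    print(" ".join(cmd))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run` from the directory that contains the script.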
