2018 MACBOOK PRO + EGPU = FASTER TRANSCODING?

In 2016 the MacBook Pro got updated with a new design and beefier specs, but the specs were never really pro. In 2018 Apple finally issued an update that ticked the boxes for a worthy upgrade.
 
6-Core CPU, check.
32GB of RAM, check.
Decent Graphics Card, check.
 
I jumped at the chance to refresh my Data Management Kit with an up-to-date system. But stock standard is never enough for me, and with macOS now supporting eGPUs it got me thinking that there might be an even better way to do things. I had an Nvidia 980Ti sitting around and decided to fit it to an Akitio Node Pro eGPU Enclosure. Time to put the new rig through some tests!

THE QUESTION

What I am gunning for with these tests is to see if it's viable to use the Internal GPU (an AMD Radeon Pro 560X) and the eGPU (an Nvidia 980Ti) simultaneously in Resolve Studio to get faster transcode speeds from the 2018 MacBook Pro.

If possible, it would allow for a really portable transcoding solution on-set. You'd have a laptop which would produce modest transcode results on its own, then you could easily plug in your eGPU to take those speeds to the next level. The plan would be to use the Internal GPU for transcoding when you don't have Mains Power. When you do have Mains Power, you'd be able to use both the Internal GPU and eGPU to power through the transcodes at record speed.

TEST PROCEDURE

For this test I used a top of the line 2018 MacBook Pro with the following specs:

CPU: 2.9GHz Intel i9
GPU: AMD Radeon Pro 560X / Intel UHD Graphics 630
RAM: 32GB
SSD: 1TB PCIe

The eGPU was an Akitio Node Pro fitted with a Gigabyte GeForce GTX 980Ti connected via Thunderbolt 3 to the MacBook Pro. To make the 980Ti compatible with macOS I used the macOS-eGPU.sh script.

Transcodes were conducted with DaVinci Resolve 15 Beta 6 for Single GPU Tests and DaVinci Resolve Studio 15 Beta 7 for Dual GPU Tests.

Transcodes were working with these input/output formats:
Source Format: 4K UHD 3840x2160 ARRIRAW from an Alexa LF
Output Format: HD 1920x1080 ProRes 422 HQ with a LogC to REC709 LUT Applied

Footage was courtesy of ARRI from their sample footage collection. The exact clips used from the collection were B002C006 and B003C001. These clips were duplicated 3 times in the Resolve Timeline to create a total source length of 2:14:09 @ 24fps, i.e. 2 Mins 14 Secs 9 Frames. ARRIRAW was chosen for these tests as it frequently requires transcoding and its processing speed is heavily influenced by the performance of the Graphics Card.
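For anyone who wants to turn that timeline length into a frame count, here is a minimal sketch in Python. The function name is my own, and it assumes a simple non-drop-frame M:SS:FF timecode like the one quoted above.

```python
def timecode_to_frames(timecode: str, fps: int = 24) -> int:
    """Convert an M:SS:FF timecode (non-drop-frame) into a total frame count."""
    minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return (minutes * 60 + seconds) * fps + frames


# Length of the test timeline: 2 Mins 14 Secs 9 Frames at 24fps
total_frames = timecode_to_frames("2:14:09", fps=24)
print(total_frames)  # 3225
```

So each test run pushes 3,225 frames through the transcode pipeline.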

Both source footage and exported footage were read from and written to the MacBook Pro's Internal PCIe SSD, which delivers speeds of over 2000MB/s, thus ruling out disk speed as a limiting factor in the transcode pipeline.

All tests were conducted in identical conditions. The only settings changed were GPU Processing Mode and which GPU was being used. These settings can be tweaked in Resolve's Preferences and will come into effect once you restart the software.

TEST RESULTS

This is the raw data from my tests:

This is a comparison of the Transcode FPS between different GPU Modes and GPUs:

This is a comparison of the Transcode Time in minutes between different GPU Modes and GPUs:

When you transcode ARRIRAW through DaVinci Resolve using the Metal GPU Mode with an Nvidia GPU you will get corrupt looking images, as explained in a previous blog post. These corrupt looking colour shifts not only display while transcoding in Resolve but also come through in the exported file. While I did test Metal for speed with the 980Ti, the product of the transcode is unusable and thus the Metal results for that card should be ignored. Transcodes that are affected by GPU Corruption look like this when running in Resolve:

These results were much more interesting than anticipated. ARRIRAW is a GPU dependent video codec so it's no surprise that a beefier GPU performed better. What is interesting is that the CUDA API is significantly faster at processing ARRIRAW than the OpenCL API. Even in the instance where I was using Dual GPUs to process the footage in OpenCL, the Single Nvidia GPU using CUDA was still faster: 36fps for Dual GPU OpenCL vs. 54fps for Single GPU CUDA. It's a big difference, especially when processing massive amounts of footage. If you could complete the exact same task in 1:02 Mins or 1:32 Mins, which would you choose?
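To put that gap in perspective, here is a rough back-of-the-envelope sketch in Python. It takes the 36fps and 54fps figures quoted above and estimates ideal render times for the 2:14:09 test timeline and for a hypothetical 2 hours of 24fps footage. The 2-hour figure is purely an illustrative assumption, not something I tested. The sketch also assumes the transcode rate stays constant and ignores any startup or file-finalising overhead, so real-world times will run a little longer.

```python
FPS_SOURCE = 24  # frame rate of the source footage


def transcode_seconds(total_frames: int, transcode_fps: float) -> float:
    """Ideal render time, assuming a constant transcode rate and no overhead."""
    return total_frames / transcode_fps


def as_min_sec(seconds: float) -> str:
    """Format a duration in seconds as M:SS, rounded to the nearest second."""
    whole = round(seconds)
    return f"{whole // 60}:{whole % 60:02d}"


# Test timeline: 2 Mins 14 Secs 9 Frames at 24fps
test_frames = (2 * 60 + 14) * FPS_SOURCE + 9

# Hypothetical shoot day: 2 hours of 24fps footage (illustrative assumption)
day_frames = 2 * 60 * 60 * FPS_SOURCE

for label, fps in [("Dual GPU OpenCL", 36), ("Single GPU CUDA", 54)]:
    test_time = as_min_sec(transcode_seconds(test_frames, fps))
    day_time = as_min_sec(transcode_seconds(day_frames, fps))
    print(f"{label}: test timeline {test_time}, 2 hours of footage {day_time}")
```

Under those assumptions the hypothetical 2 hours of footage comes out at roughly 80 minutes at 36fps versus about 53 minutes at 54fps.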

Worth noting is that each and every time the transcode speed was capped out by the GPU Processor. Not the GPU Memory and not the CPU. I used iStat Menus to verify this, which you can see below highlighted in red:

The lesson to be learnt here is that many people just look at the memory size when referencing the GPU. '6GB GPU, wow, that must be fast'. Well yes, it is fast, but that number isn't the be-all and end-all for producing faster results, particularly while transcoding.

CONCLUSION

The aim was to see if the addition of an eGPU could drastically speed up transcode times while keeping a small footprint on set. I'd say the answer is yes, but the results didn't appear in the way that was expected.

Without an eGPU you can transcode ARRIRAW on the MacBook Pro at 19-20fps depending on whether you use OpenCL or Metal. Once you plug in an eGPU fitted with a 980Ti you can transcode using CUDA at 54fps, which is 2.7x faster. A worthy improvement in my book. Going forward I'll have the eGPU in my kit as a portable option for boosting transcode performance.
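For anyone who wants to check the maths, the speedup is just the ratio of the two frame rates. A quick sketch in Python, using only the figures quoted above (the function and constant names are my own):

```python
def speedup(new_fps: float, old_fps: float) -> float:
    """How many times faster new_fps is compared with old_fps."""
    return new_fps / old_fps


EGPU_CUDA_FPS = 54       # 980Ti in the eGPU using CUDA
INTERNAL_FPS = (19, 20)  # Internal GPU, depending on the API used

for fps in INTERNAL_FPS:
    print(f"{EGPU_CUDA_FPS}fps vs {fps}fps: {speedup(EGPU_CUDA_FPS, fps):.2f}x")
```

The 2.7x figure is the comparison against the faster 20fps internal result. Against 19fps the gain is a little over 2.8x.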

I'll also be sure to use the CUDA API where available for transcoding ARRIRAW, as it has proven superior for crunching the 1s and 0s when working with ARRI's flagship recording codec.

The eGPU is a flexible tool. You can fit almost any Graphics Card in an eGPU. I just happened to have a spare Nvidia 980Ti around and thus it became my weapon of choice. If I were selecting a Graphics Card specifically for transcoding with an Apple Computer, there are some other attractive options that work natively with macOS and are worth a look. You could use the Blackmagic eGPU, which comes pre-fitted with a Radeon Pro 580. Alternatively you could use the Akitio Node Pro eGPU Enclosure that I am using but fit it with one of the GPUs used in the iMac Pro, the Vega 56 or Vega 64. These Graphics Cards have official support from Apple in macOS and will likely produce different results to those documented above, particularly with Dual Internal GPU and eGPU Setups using the OpenCL and Metal APIs. If you've got an eGPU running any of these GPU Setups I'd love to hear from you in the comments below.

I hope you've found this useful. Questions, comments, want help? Touch base in the comments section below!