Well the first verdict is in. My love/hate affair with Metro 2033 continues! I feel like I spend more time configuring and benchmarking this game than I do playing it! Anyhow, once again I took a run at getting decent scores in Metro2033Benchmark in 3D surround. Here are the benchmark settings:
The setup is stereoscopic 3D on, 5040×1050, settings at Very High, DOF off (no need in 3D), PhysX on just out of general principle, AF at 16x, and AAA since honestly I don't see a huge visual difference between AAA and MSAA.
So how did it go? Bittersweet, honestly. On the one hand, it now actually does go. On the GTX 580s this setup resulted in tragic framerates not worth reporting (although I wish I had recorded them so I could report them! LOL). On the other hand, it really still isn't playable. Oh well. Maybe by the time we have a GTX990 in quad SLI, Metro 2033 will finally be playable in 3D surround! At a certain point, in my opinion, it's time to blame engine inefficiencies. Sorry 4A!
First pic is with the i980x at 3.83GHz. GPUs are at 111% power target:
The term “bottleneck” is one of those terms that has become so misused, and so misunderstood, that it no longer has meaning. It's a shame, really, how badly so many enthusiasts (and even pro reviewers, honestly) fail to understand how a 3D pipeline works in particular, and how subsystems in a modern multi-threaded OS work in general. That said, here are some observations on scaling, tri-SLI, and how the CPU and GPUs interact with Metro…
As a reminder, this is the testbed setup:
- Motherboard: Gigabyte EX58-UD5
- Memory: 12GB Patriot Viper DDR3 @ 1600MHz (9-9-9… shitty timings, but what can ya do… 12GB is tough to run at 1600!)
- CPU: Intel Core i7 980x @ baseclock 133 with a 28 multiplier (3.83GHz in turbo mode), 3.2GHz QPI, TurboBoost on, stock voltages
- Cooler: Corsair H50 (CPU temps running 33C idle, 58C under load)
- GPU: 3 x Gigabyte N680OC, factory overclocked GTX 680 (111% target), Gigabyte tri-fan cooler. GPUs running 54C idle, 76C (under FurMark Xtreme burn, multi-GPU, 5040)
- Disk: OS and app volume – 2 x 128GB OCZ Vertex 2 SSD, RAID 0 (Intel embedded array controller), 3Gb/s SATA
- Disk: additional app/data volume – 2 x 60GB OCZ Vertex SSD, RAID 0 (Intel embedded array controller), 3Gb/s eSATA (yep… external SSD RAID 0… running raw baby!)
- Audio: Creative Labs X-Fi f@7@l1tY (might as well 1337 it up even more!)
- PSU: Thermaltake ToughPower 1200
- Display: 3 x Viewsonic VX2268WM
- 3D: Nvidia 3D Vision Surround version 1
Some notes on performance during the Metro 2033 3D surround bench runs:
- VRAM hit 1.6GB so the GTX680 2GB is not running out of VRAM at these settings, but the GTX580 was
- GPU utilization ran to 100% across all three GPUs with both the 680s and the 580s (more frequently on the 580s)
- CPU utilization hovered around 40% across all six cores
- Going from 580 to 680 tri-SLI provided a massive boost to performance regardless of CPU overclock level due to the VRAM choke. Even with the 580s running within VRAM limits, however (when usage was showing <1.5GB), the framerates were a good 20-30% lower
- Bumping the CPU from stock clocks (3.33GHz/3.46GHz in TurboBoost) to a mild overclock (3.7GHz/3.83GHz in TurboBoost) resulted in a gain of 1.5FPS: a ~5% bump in framerate from a ~10% bump in clock
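Those last percentages are worth sanity-checking. A quick back-of-the-envelope sketch (the baseline framerate isn't stated above; the ~30 FPS figure below is inferred from "1.5 FPS = 5%", not measured):

```python
# Back-of-the-envelope check on the CPU scaling numbers above.
# Clock speeds are the TurboBoost figures from the post; the baseline
# framerate is inferred, not measured.

stock_clock = 3.46   # GHz, stock TurboBoost
oc_clock = 3.83      # GHz, mild overclock TurboBoost
fps_gain = 1.5       # FPS gained from the overclock
fps_gain_pct = 0.05  # the ~5% framerate bump reported

clock_gain_pct = (oc_clock - stock_clock) / stock_clock
baseline_fps = fps_gain / fps_gain_pct  # implied baseline framerate

print(f"clock bump: {clock_gain_pct:.1%}")            # ~10.7%
print(f"implied baseline: {baseline_fps:.0f} FPS")    # ~30 FPS
print(f"scaling efficiency: {fps_gain_pct / clock_gain_pct:.0%}")
```

Roughly half the clock gain shows up as framerate, which is consistent with the GPUs, not the CPU, being the limiting stage here.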
Conclusions? Let’s imagine some typical forum hysteria… “Where is the bottleneck?!!??” “What is bottlenecking what???” “Can the Core i7 Nehalem ‘feed’ 680s???” “Are they HELD BACK???”
As always, all of these questions are specious, and answering them directly only serves to perpetuate this kind of thinking. A true “bottleneck” would require a situation where the CPU is only able to stage X frames per second to prime the 3D pipeline, while any of the GPUs under consideration, in a given game at a given resolution and a given level of detail, would be capable of delivering an X+X*Y% higher framerate. This almost never happens, and it certainly isn’t happening at surround resolutions with a game like Metro. You’d have to pretty much engineer a scenario to replicate those conditions – similar to what the better review sites do to test CPUs, where they use a GTX580 or ATI 6970 and run the game at 1024×768 to demonstrate CPU scaling. It requires that extreme a setup to truly show a bottleneck.
Essentially, the conclusion is that Metro 2033 in max settings with 3D Surround (even at a modest 5040×1050), simply needs more CPU and GPU than can be realistically thrown at it today. Which means it is very likely that the 4A Engine simply isn’t scaling well because this is a ridiculous amount of hardware and other engines that look just as good (ahem… Frostbite) scale better.
We can see that the Core i7 Nehalem benefited hugely from the move to 680s. After all, this is a near-4GHz CPU with 6 cores, and all it needs to do in this case is run game logic, run the OS, and stage frames for the 3D pipeline (which, in these days of DX11 and fully programmable general-purpose shaders, is hugely GPU based). On the other hand, a faster CPU did result in higher framerates from the 680s (even though the Core i7 never ran near 100% utilization at either clock speed). Why is this? Who knows. In an age of quickie reviews from advertising-supported “experts” and forum battles between armchair experts, this kind of answer is neither fun nor popular. The reality is, however, it would take an experienced DX11 developer review team, the 4A Engine folks, a conference room with pizza and donuts, a whiteboard, a pile of runtime analysis, and 2 days of intense meeting time to figure out exactly how pipeline efficiency plays out in Metro.
In the meantime, I’ll keep throwing hardware at it and avoiding forums! We’ll revisit this test when the inevitable GTX780 finally debuts (and yes, it will still be paired with the i980x, as I am just far too lazy to tear out this motherboard and I still don’t like what I am seeing from Sandy and Ivy).
Next up, more test results!