I reduced the multiplier from 9 to 8, to focus on FSB rather than CPU speed alone. The first try, a 1600 MHz FSB setting (3.2 GHz CPU clock with the x8 multiplier), didn't boot at all. Contrary to common sense, I then upped the setting to 1667 MHz (3.33 GHz CPU) - and it booted! All voltages were set to Auto, except that I left the DRAM voltage at 1.95 volts for DDR2-833 CL 4-4-4-8 using four 1 GB sticks (two Corsair, two G.Skill).
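For readers who want to check the clock figures, here is a minimal sketch of the arithmetic. It assumes the usual quad-pumped Intel FSB, where the rated FSB speed is four times the base clock that the multiplier applies to; the function name is just an illustration, not part of any tool used in this test.

```python
def cpu_clock_mhz(fsb_rated_mhz: float, multiplier: float) -> float:
    """Core clock from the rated FSB speed and the CPU multiplier."""
    # Assumption: quad-pumped FSB, so the base clock is a quarter of the rated speed.
    base_clock_mhz = fsb_rated_mhz / 4
    return base_clock_mhz * multiplier


# The two FSB settings tried with the x8 multiplier.
for fsb in (1600, 1667):
    print(f"FSB {fsb} x8 -> {cpu_clock_mhz(fsb, 8) / 1000:.2f} GHz")
```

The same function can be fed any other FSB and multiplier pair to see what core clock a given setting would imply.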

The brand new Sparkle Calibre P880+, a factory-overclocked and Peltier-cooled GeForce 8800GTX (GPU 630 MHz, memory 1960-DDR, all without water cooling!), ran the graphics portion, while the trusty old MGE 500W power supply fed the system.

Now, we've got something here - this Kentsfield stepping, renamed by Intel as the "Clovertown UP" Xeon, handles a three-load 1667 MHz FSB, with all benchmarks (3DMark06 CPU, Sandra, Rightmark, AutoCAD city model renderings) passing with flying colours. And all that without any manual voltage tinkering - except REDUCING the memory voltage. In the Nvidia monitor, the 1.425 V voltage setting (see photo) was lower than the 1.45 V displayed for the 'standard' 3.33 GHz overclocked Kentsfield, despite the latter's slower FSB 1333. The FSB and North Bridge ran at the very same 1.4 V - so nothing unusually stressful for the system.
Happy with this unexpected achievement, I decided to go further and get the best out of that FSB - four hungry cores would surely like to have the fastest possible shared path to the outside world, since they can't have something like HyperTransport *yet*. I used the Corsair XMS6400CL3 memory sticks - standard old fare, no Dominators here yet, but darn good latency - and ran them at DDR2-833 (PC6600) speed at CL 3-3-3-7 and 2.15 volts. 2.1 volts also booted, but crashed 3DMark06 CPU, so one voltage notch up was needed to reach full stability.
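To put the tighter timings in perspective, here is a rough sketch that converts CAS latency from clock cycles to nanoseconds. It assumes the DDR2 I/O clock is half the data rate (two transfers per clock) and looks only at the first timing number, so it is not a full memory latency figure; the function name is made up for illustration.

```python
def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """CAS latency in nanoseconds for a DDR module at a given data rate."""
    # Assumption: double data rate, so the I/O clock is half the transfer rate.
    io_clock_mhz = data_rate_mts / 2
    return cas_cycles * 1000 / io_clock_mhz


# DDR2-833 at the looser CL4 setting versus the CL3 setting.
for cl in (4, 3):
    print(f"DDR2-833 CL{cl}: {cas_latency_ns(833, cl):.1f} ns")
```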
Both the WinXP Pro and WinXP 64 benchmark portions ran flawlessly at this level and, I guess, these are some possible records for a plain air-cooled Intel quad-core in a 30+ degrees C hot-air environment (and with auto settings to boot). The 1600x1200 UXGA 3DMark06 score hit 11179, with the CPU score at 4983, but it's really the Sandra scores on these three screenshots that speak for themselves:

The 8.35 GB/s memory score (which could go a bit further after another round of memory parameter fine-tuning) is particularly important, as it comes very close to many AMD AM2 memory benchmark results - memory bandwidth being one known sore point for current Intel Core 2 entries. Since we've got four fast 3.33 GHz cores here competing for the same outside path, this extra FSB speed kick, coupled with in-sync low-latency memory set at exactly matching bandwidth, would be useful in many memory-bound apps.
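The "matching bandwidth" point can be sketched with some back-of-the-envelope numbers. The snippet below assumes a 64-bit (8-byte) FSB data path, 64-bit memory channels running in dual-channel mode, and decimal gigabytes; the channel count is my assumption for this board rather than something read off the screenshots, and the helper names are hypothetical.

```python
BUS_WIDTH_BYTES = 8  # assumption: 64-bit FSB and 64-bit memory channels


def fsb_peak_gbs(fsb_rated_mts: float) -> float:
    """Theoretical peak FSB bandwidth in GB/s (decimal)."""
    return fsb_rated_mts * BUS_WIDTH_BYTES / 1000


def dram_peak_gbs(data_rate_mts: float, channels: int = 2) -> float:
    """Theoretical peak DRAM bandwidth in GB/s; dual channel is an assumption."""
    return data_rate_mts * BUS_WIDTH_BYTES * channels / 1000


fsb = fsb_peak_gbs(1667)
dram = dram_peak_gbs(833)
measured = 8.35  # Sandra memory score quoted in the text

print(f"FSB peak:  {fsb:.1f} GB/s")
print(f"DRAM peak: {dram:.1f} GB/s")
print(f"Measured share of FSB peak: {measured / fsb:.0%}")
```

Under these assumptions the two theoretical peaks line up almost exactly, which is the whole idea of running the memory in sync with the FSB.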
The setup has now run for half a day, in and out of repeated benchmark runs, and seems rock solid. Of course, more can be done - with better cooling, I feel 3.5 GHz CPU / 1750 FSB is not a far-off goal for stable operation, and even more once the X3230 with its 10x multiplier is out (chill it to sub-zero and, well, aim to reach 4 GHz at FSB 1600!). But even this result, achieved in such a short time, feels like a miracle to me, tired as I am of all the talk about FSB-starved Kentsfields.
In summary, the Intel quad-core FSB bottleneck is (partially) broken - at least on this particular combo. You get much more balanced CPU and memory performance (and overall system response across many apps) without squeezing every MHz out of the CPU core alone and, with many more low-latency DDR2-800++ memory kits around, it will be easy to build even a 4 GB system with such performance. Remember, the brand new Vista malware is more sensitive to memory performance...