A new machine
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
|
My old '07 vintage Core 2 Quad is probably dead. Somehow I got it running stably with 2 GB of RAM, but that's far from ideal. Assuming I can't get it fully operational again, I'm looking at a new machine for daily use, development, and crunching (in that order of priority.)
Normally what I do when I build a machine is to find a good, brand name, pre-built machine on sale that has good upgrade opportunities. For many years that's been cost effective (I'm rarely interested in having the fastest, bleeding edge technology.) However, I'm finding now that it's difficult to find a desktop that's made with standard parts, and I need to be able to install a full size GPU and replace the power supply.
Accordingly, for the first time in a number of years I'm looking at building the system rather than buying.
I'm not looking to build "The Mother of All Crunchers". Crunching is only part of what it will be used for, and I only crunch during the cooler part of the year. Cost definitely IS a concern.
The old computer is donating the following parts:
600W power supply (brand new).
GTX 460 GPU
2x 1TB SATA 7200 RPM drives.
DVD burner.
I'll be installing Windows 7 Professional 64 bit on this system.
New parts:
Case: Cooler Master HAF 912
http://www.amazon.com/Cooler-Master-HAF-912-Computer/dp/B00BCXF6O4/
http://www.newegg.com/Product/Product.aspx?Item=N82E16811119233
I like CM's HAF series cases because their focus is on cooling. This isn't a huge case, but it's large enough. It comes with two 120 mm fans, but has room for up to 6 120 mm fans or 2 120 mm fans and 2 200 mm fans. I envision having up to four fans in the case (200 mm in front and on top, 120 mm in back, and possibly 120 mm on the side blowing on the GPU), and the motherboard coincidentally has headers for 4 case fans, so that works well.
One minor drawback is that the front panel ports are USB 2.0 rather than 3.0, so if I want USB 3.0 on the front I'd need to add them later.
I'm guessing this is one of their newer cases, since despite its low price it has a nice selection of features such as 2.5" SSD drive bays, top mounts for either 2x120 mm fans (or a 240 mm radiator) or a 200 mm fan, a rotatable/removable 3.5" drive cage, and removable intake filters for the front and bottom fan openings.
The price is certainly right: $59.99.
Motherboard: ASUS Z87-Plus
http://www.amazon.com/Asus-Z87-PLUS-DDR3-1600-Motherboard/dp/B00CRJSX5Q/
http://www.newegg.com/Product/Product.aspx?Item=N82E16813131980
I've always had good luck with Asus, and this seems like a decent middle of the road motherboard. Lots of SATA ports if I need it, tons of USB ports, etc. It seems to have decent overclocking abilities, although that's not a priority right now.
$159.
CPU: Core i5 4670K
http://www.amazon.com/Intel-i5-4670K-Quad-Core-Desktop-Processor/dp/B00CO8TBOW/
http://www.newegg.com/Product/Product.aspx?Item=N82E16819116899
This was a tough decision, as I was considering everything from a Haswell Core i3 through an Ivy Bridge-E Core i7 4820K.
I wanted four real cores, so the Core i3 was out. The choice between two Core i7s -- the Ivy Bridge-E 4820K and the Haswell 4770K -- was a tough one. They're about the same price, but the 4820K uses quad channel memory while the 4770K is dual channel. LLR is usually bottlenecked by memory bandwidth so the 4820 is very tempting. However, motherboards for the 4820 are more expensive, so even though the CPUs are about the same price, the entire system will be more expensive. In the end, some benchmarks posted here with an Ivy Bridge-E running in both dual and quad channel mode indicate that while the quad channel memory is important when running six cores, it's not that big of a factor when running 4 cores.
Once the decision was made to go with Haswell and dual channel memory, I decided a Core i5 was a better choice. It's about $100 cheaper than the Core i7, and I won't be using hyperthreading for crunching anyway. The only real disadvantage is that the L3 cache is 6 MB vs 8 MB.
I was going to get the Core i5 4670, but the unlocked 4670K is only a few dollars more, so it doesn't pay not to get the K version. The only disadvantage is that the "K" chips lack one of the virtualization features (VT-d, which covers directed I/O for device passthrough; VT-x is still present), so ordinary virtual machines, including 64 bit guests, are unaffected. Since I'll be running 64 bit Windows, that shouldn't be a problem.
$225
(Note: At least at the beginning, I'll be using the Intel stock cooler. Depending on temps, I can add a better air or water cooler later on.)
Memory: Corsair Vengeance 8 GB ( 2 x 4 GB ) DDR3 1600 MHz (PC3 12800) 240-Pin DDR3 Memory Kit
http://www.amazon.com/Corsair-Vengeance-240-Pin-Platforms-CMZ8GX3M2A1600C9/dp/B004CRSM4I/
http://www.newegg.com/Product/Product.aspx?Item=N82E16820233144
This ram is on the motherboard's QVL (qualified vendor list). 8 GB is enough for now, and I can always fill the other pair of slots with another 8GB if I need. I like having finned heatsinks on the ram for better cooling.
$87
Comments?
____________
My lucky number is 75898^524288+1 | |
|
Tyler Project administrator Volunteer tester
Joined: 4 Dec 12 Posts: 1078 ID: 183129 Credit: 1,376,122,338 RAC: 6,351
|
Looks like a nice machine ;) I've got the Cooler Master HAF X 922, and it's really nice. It's huge, currently has 3x 200mm fans, and can fit a dual 120mm fan radiator. It's got plenty of room for a CPU heatsink; I've got a Cooler Master Hyper 212+. It is definitely upgrade friendly, with a bottom mounted PSU and plenty of room for a gfx card. I like the look, too.
I can't comment much about the i5-4670(k), but I have an i5-2500k and it's nice and surprisingly cool. I've heard the newer chips (haswell, ivy, etc) don't o/c as well as sandy.
As for the RAM, it seems they are most stable at speeds one step under what they are rated for. I'd probably get 1866 RAM; some 1866 Corsair RAM is actually less expensive than the 1600 you listed. http://www.amazon.com/dp/B008HK4ZAG
Looks really nice, good luck.
____________
275*2^3585539+1 is prime!!! (1079358 digits)
Proud member of Aggie the Pew
| |
|
|
Nice setup. Just prepare for an aftermarket cooler if you want to use the K on the CPU and/or crunch AVX WUs on all four cores.
____________
676754^262144+1 is prime | |
|
Michael Goetz Volunteer moderator Project administrator
|
As for the RAM, it seems they are most stable at speeds one step under what they are rated for. I'd probably get 1866 RAM; some 1866 Corsair RAM is actually less expensive than the 1600 you listed. http://www.amazon.com/dp/B008HK4ZAG
Point taken.
After reading reviews (I assumed ram was ram; apparently Corsair isn't as reliable as it used to be) Corsair is out and G.skill is in. New ram:
http://www.amazon.com/G-skill-RipjawsX-DDR3-1866-Memory-F3-14900CL9D-8GBXL/dp/B004JM1ZG8/
$88
| |
|
Michael Goetz Volunteer moderator Project administrator
|
Nice setup. Just prepare for an aftermarket cooler if you want to use the K on the CPU and/or crunch AVX WUs on all four cores.
An aftermarket cooler was always in the plan, but since the CPU comes with a cooler, I'll run with that at first and see what kind of temperatures I get.
I'm not necessarily planning on overclocking a lot (and maybe not at all). The K part isn't much more than the non-K part, so it makes sense to get it and leave my options open.
You're absolutely right about AVX. I had forgotten about that. I'll have to see what the temps look like.
| |
|
Tyler Project administrator Volunteer tester
|
Point taken.
After reading reviews (I assumed ram was ram; apparently Corsair isn't as reliable as it used to be) Corsair is out and G.skill is in. New ram:
http://www.amazon.com/G-skill-RipjawsX-DDR3-1866-Memory-F3-14900CL9D-8GBXL/dp/B004JM1ZG8/
$88
Yeah, G.Skill is nice. I have Kingston in my PC.
| |
|
|
Mike, since you are getting an ASUS board: their AI software has a great auto overclocking setup nowadays. It will automatically set the highest STABLE clocks and voltages. Depending on the quality of the CPU, that would range from 3.9 GHz to 4.2 GHz for a Core i5 or i7 (SB, IB, or Haswell). Temps will of course be higher, but nowhere near what you'd see when chasing the best manual overclock settings. I'd use it before choosing an aftermarket cooler. I personally would go with a water cooling setup because they have less CPU temperature variance vs. ambient temps.
Edit: The Auto Overclocking setup does take temps into account so the result will be higher with better cooling.
____________
Largest Primes to Date:
As Double Checker: SR5 109208*5^1816285+1 Dgts-1,269,534
As Initial Finder: SR5 243944*5^1258576-1 Dgts-879,713
| |
|
|
I have a HAF-XM case for my 3770K. I like it a lot and the airflow is good... 2x200mm fans (one front, one on top), 2x120mm on the side blowing on the GPUs, and 2x120 on the rear (set up push-pull) for the CPU (liquid) cooler radiator. One minor gripe is that sometimes the side panel doesn't want to latch tightly... it tends to "pop out" 1/4 inch or so on one lip and I have to fiddle with it to get it to stay put. No biggee. It's marketed as a "mid-tower", but it is at the large end of that range.
+1 on the faster memory.
I seem to remember that back when we were shaking out the first versions of LLRavx, temps were running 3-5C warmer than with the prior LLR, all other things being constant (not to say that's surprising). That was with SB chips only, at that time.
Good luck and let us know how it goes. Post photos! (a.k.a. geek porn)
--Gary | |
|
Michael Goetz Volunteer moderator Project administrator
|
If 1866 ram is better for stability at 1600, then why not buy 2400 speed memory and have LOTS of overhead? The difference in price isn't very large.
With that in mind, I found this at Newegg:
http://www.newegg.com/Product/Product.aspx?Item=N82E16820231574
What's interesting is that it comes with some sort of dual fan gizmo for cooling the ram.
Never saw that before.
(Upon searching further, there's a handful of memory fans out there.)
| |
|
Tyler Project administrator Volunteer tester
|
Wow, that's cool. I didn't know 2400 RAM was that cheap. I thought it was twice the cost of 1600.
| |
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
|
If 1866 ram is better for stability at 1600, then why not buy 2400 speed memory and have LOTS of overhead? The difference in price isn't very large.
With that in mind, I found this at Newegg:
http://www.newegg.com/Product/Product.aspx?Item=N82E16820231574
Standard dimm = 1.5V
Low Power dimm = 1.35V
G.SKILL Ripjaws X = 1.65V
This means your DIMMs need this higher voltage to reach DDR3-2400 speed and CAS latency 9. Are you sure that these DIMMs will operate at standard voltage?
For long-term use I would invest in quad channel memory with socket 2011. Then you at least have a chance to upgrade to a hexa- or octa-core (Xeon line).
____________
Best wishes. Knowledge is power. by jjwhalen
| |
|
ardo
Joined: 12 Dec 10 Posts: 168 ID: 76659 Credit: 1,693,455,577 RAC: 0
|
I'm using this:
http://www.newegg.com/Product/Product.aspx?Item=N82E16820231585
Working great so far!
____________
Badge score: 2*5 + 8*7 + 3*8 + 3*9 + 1*10 + 1*11 + 1*13 = 151
| |
|
Michael Goetz Volunteer moderator Project administrator
|
I'm sure that's good memory, but G.skill does not list those particular parts as being on the QVL for the motherboard I'm probably going to get. (They do list it for the Z87-Pro, but not the Z87-Plus.)
| |
|
ardo
|
Interesting that this memory is qualified for the more expensive MBs in the ASUS Z87 series but not for yours...
| |
|
|
Buy a Xeon 1230 v3. 4 cores / 8 threads, 8 MB cache (this model has no iGPU). In BOINC, and especially in PG (where CPU temps are much higher than in other projects), there is little margin for overclocking, so the "K" version isn't as worthwhile as it is elsewhere.
You can easily run it at 3.7 GHz (with all 8 threads busy) on a Z87-based motherboard. Add fast memory and you're good to go. :)
I was in the same place you are about 3-4 months ago.
My config:
Xeon 1230 v3 @ 3.7
Msi Z87M-G43
2x4 G Skill 2400 MHz 10/12/12/31
Some PSP ( HT off ):
http://scr.hu/0rlv/vmry9 | |
|
Michael Goetz Volunteer moderator Project administrator
|
(I wasn't getting the K version for overclocking so much as because it was only about $10 more than the non-K version, so I figured it didn't make sense not to get it. OC isn't on my agenda.)
Interesting. That chip seems to be essentially an i7 4770 (no K) without the built in GPU.
One thing that concerns me is the warning on ASUS's site:
"*Intel Xeon Processor Family is designed for servers. Some features may not support when installed on 8 series chipsets. For more details, refer to ASUS support site at http://support.asus.com."
Nowhere could I find information about what they meant by that. Obviously, there's no onboard video, so that might be what they're talking about. What concerns me, however, is that some of the Intel literature shows the Xeon with PCIe v2 connectivity rather than v3. That's a feature I wouldn't want to lose.
I can't, however, find any solid information.
BTW, I'm either going to get a non-HT CPU or disable it in the BIOS. Besides being useless for LLR, it adds a lot of variability to the benchmarking I do to set PrimeGrid's credit ratios. Therefore, the advantage of this chip for me is the larger cache.
| |
|
Michael Goetz Volunteer moderator Project administrator
|
If 1866 ram is better for stability at 1600, then why not buy 2400 speed memory and have LOTS of overhead? The difference in price isn't very large.
With that in mind, I found this at Newegg:
http://www.newegg.com/Product/Product.aspx?Item=N82E16820231574
Interesting development.
Intel doesn't recommend using memory voltages above 1.5v on SB, IB, or Haswell, and doing so can void the warranty and supposedly can shorten the life of the cpu.
With that in mind, I'm probably going to purchase 1866 speed ram (to run at 1600).
| |
|
Michael Goetz Volunteer moderator Project administrator
|
Pulled the trigger and ordered some parts:
PCPartPicker part list / Price breakdown by merchant / Benchmarks
CPU: Intel Core i5-4670K 3.4GHz Quad-Core Processor ($219.99 @ Amazon)
Motherboard: Asus Z87-PLUS ATX LGA1150 Motherboard ($159.99 @ Newegg)
Memory: G.Skill Ares Series 8GB (2 x 4GB) DDR3-2400 Memory ($76.99 @ Newegg)
Case: Cooler Master HAF 912 ATX Mid Tower Case ($59.99 @ TigerDirect)
Case Fan: Cooler Master Megaflow 110.0 CFM 200mm Fan ($8.99 @ Newegg)
Other: SilverStone 140mm Fan Filter with Magnet for Case Fan/Power Supply Fan and Panel Air Vent FF141B ($8.99)
Total: $568.93
(Prices include shipping, taxes, and discounts when available.)
(Generated by PCPartPicker 2014-02-14 23:26 EST-0500)
A couple of thoughts on the part selection that went into some of the final part decisions:
I had decided to stick with 1866 memory so the memory could run at 1.5 volts. That's all the CPU officially supports. I ended up buying 2400 speed memory. Why? I have the option of running faster if I want to, and if not I can always run the memory at 1866 and 1.5 volts rather than the 1.65 volts required for 2400. Finally, 2400 speed memory was $1 less expensive than the otherwise identical 1866 ram. Go figure.
I spent a while trying to decide between the Core i5 and the Xeon. In the end, two things caused me to get the Core i5: it's about $30 cheaper, and I might want to run the video from the i5's onboard video and leave the real video card to crunching. Running in that configuration I should be able to run Genefer on the GPU without any screen lag. (If I ever want to play games, I can reconfigure the system easily enough.)
For right now, I'm going with the stock cooler until I see what the temperatures will be. On the case, I'll have 200 and 120 mm fans (front and side) blowing in and a 120 mm fan (rear) blowing out. The top will be a passive exhaust. (The power supply also draws in air from the bottom and exhausts from the rear.) I've got lots of options to improve cooling if needed.
| |
|
|
I have now built a system with an X79 board and an i7-4930K @ 4 GHz (6-core machine).
SR5 run times (5 WUs in parallel + GPU) compared with an i5-2500K @ 4 GHz:
i7-4930K: 7000-7600 s | i5-2500K: 7900-8200 s (+1 for quad channel)
I need to tweak the cooling a bit (around 76-80°C with a Noctua at 700 RPM).
With 6 WUs it runs at 80-86°C ^^ | |
|
|
I can't speak to the details because I don't own any G.Skill memory.
Overclocking memory may have only a few profiles in XMP.
I use 2400 memory from Team Group.
Its XMP offers only 1333, 2133, and 2400.
If your memory doesn't have an 1866 profile, you may not be able to run it at 1866.
| |
|
|
You can easily set it manually. You don't have to stick to the XMP profiles. | |
|
|
For right now, I'm going with the stock cooler until I see what the temperatures will be.
I just put this cooler master cooler on a quad core xeon and I'm very pleased with it, especially for the money.
I've got an H100i on my overclocked i7 and it does a good job. It would do even better if I would get around to scraping off the crap thermal paste it came with and putting the Arctic Silver on there, since I finally found my tube of it...
Sounds like a good system. Just beware that not all BIOS will let you use integrated graphics if you have a graphics card installed. It's beyond annoying... | |
|
|
I have a question. My new i7-4930K @ 4 GHz had 4x4 GB of 1600 MHz RAM in quad channel mode (no HT, 5 WUs at once + 1 core for the GPU). One module died yesterday, so for now I'm running in triple channel. My run time for an SR5 WU went up from 1h50m-2h15m to 2.5-3 h. Is it possible that the performance difference between quad and triple channel is that large? The temps look OK now, so I don't think it's throttling.
How much extra performance can I get going from 1333 MHz to 1600 MHz in quad channel mode? | |
|
rroonnaalldd Volunteer developer Volunteer tester
|
Technology: DDR3 (1.5V), DDR3L (1.35V)
MHz Speed (Data Rate)    Module Classification    Module Peak Bandwidth
800 (DDR3-800)           PC3-6400                 6400 MB/sec. or 6.4 GB/sec.
1066 (DDR3-1066)         PC3-8500/PC3L-8500       8500 MB/sec. or 8.5 GB/sec.
1333 (DDR3-1333)         PC3-10600/PC3L-10600     10600 MB/sec. or 10.6 GB/sec.
1600 (DDR3-1600)         PC3-12800                12800 MB/sec. or 12.8 GB/sec.
1866 (DDR3-1866)         PC3-14900                14900 MB/sec. or 14.9 GB/sec.
A change to DDR3-1600 would be about a 20% improvement in memory bandwidth over DDR3-1333 (12.8 vs. 10.6 GB/sec.)...
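The bandwidth figures follow from the 64-bit (8-byte) data bus of a DDR3 module: peak bandwidth is the data rate in MT/s times 8 bytes. Here is a minimal Python sketch of that arithmetic. The function names are made up for illustration, and because nominal rates such as 1333 and 1866 are rounded, the computed values differ slightly from the PC3 labels.

```python
BYTES_PER_TRANSFER = 8  # a DDR3 DIMM has a 64-bit data bus

def module_peak_mb_s(data_rate_mt_s):
    """Peak bandwidth of one module in MB/s (data rate x 8 bytes)."""
    return data_rate_mt_s * BYTES_PER_TRANSFER

def percent_gain(old, new):
    """Relative improvement of new over old, in percent."""
    return (new - old) / old * 100.0

for rate in (800, 1066, 1333, 1600, 1866):
    print(f"DDR3-{rate}: {module_peak_mb_s(rate)} MB/s")

gain = percent_gain(module_peak_mb_s(1333), module_peak_mb_s(1600))
print(f"DDR3-1333 -> DDR3-1600: {gain:.1f}% more peak bandwidth")
```

Measured against the faster module instead, the same step is a difference of about 16.7%, so it helps to say which baseline a percentage refers to.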
| |
|
|
And going from triple channel to quad channel, how much extra performance? | |
|
rroonnaalldd Volunteer developer Volunteer tester
|
From my point of view: after changing from triple to quad channel you should see about a 33% improvement in peak bandwidth (put the other way, triple channel has 25% less peak bandwidth than quad). But remember this is only the peak performance.
1333:
3x 10.6GB/s peak bandwidth = 31.8GB/s
4x 10.6GB/s peak bandwidth = 42.4GB/s
1600:
3x 12.8GB/s peak bandwidth = 38.4GB/s
4x 12.8GB/s peak bandwidth = 51.2GB/s
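The channel totals above are simply the per-module peak multiplied by the number of populated channels. A small sketch of that calculation, using the per-module figures quoted in this thread; the helper name is hypothetical, and these are theoretical peaks rather than measured throughput.

```python
PER_MODULE_GB_S = {1333: 10.6, 1600: 12.8}  # PC3-10600, PC3-12800

def peak_gb_s(data_rate, channels):
    """Theoretical peak bandwidth with one module per channel."""
    return PER_MODULE_GB_S[data_rate] * channels

for rate in (1333, 1600):
    triple = peak_gb_s(rate, 3)
    quad = peak_gb_s(rate, 4)
    gain = (quad - triple) / triple * 100   # going from triple to quad
    loss = (quad - triple) / quad * 100     # going from quad to triple
    print(f"DDR3-{rate}: triple {triple:.1f} GB/s, quad {quad:.1f} GB/s, "
          f"+{gain:.0f}% going up, -{loss:.0f}% going down")
```

The same gap reads as roughly 33% when going up from three channels and 25% when dropping down from four, which is worth keeping in mind when comparing it with observed run times.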
| |
|
|
OK, then I can say the missing quad channel module also accounts for the missing hour, i.e. the ~30% lost in performance. I need to wait for the new RAM and see whether I get my old good run times back, thx. | |
|
|
I had decided to stick with 1866 memory so the memory could run at 1.5 volts. That's all the CPU officially supports. I ended up buying 2400 speed memory. Why? I have the option of running faster if I want to, and if not I can always run the memory at 1866 and 1.5 volts rather than the 1.65 volts required for 2400. Finally, 2400 speed memory was $1 less expensive than the otherwise identical 1866 ram. Go figure.
From what I've read, the Intel SB and IB i5 & i7 memory controllers can handle up to 1.65 V. Not sure about Z87 MBs and Haswell, but many have done so, like some over at the Mersenne forums http://www.mersenneforum.org/showthread.php?t=17982
| |
|
|
Haswell can run 1.65 V RAM, but it's important to run it at 1.5 V to avoid chip failure over time. That's what Intel specifies. | |
|
|
The new 1600 MHz RAM arrived. The problem with the longer run times was Collatz (it needs more power than I thought). For normal SR5 run times, 1600 MHz in quad channel is around 10-15% faster than 1333 MHz in triple channel. | |
|
mikey
Joined: 17 Mar 09 Posts: 1654 ID: 37043 Credit: 733,584,432 RAC: 85,769
|
That's very good to hear! | |
|
Michael Goetz Volunteer moderator Project administrator
|
The new computer is up and running, at least the hardware is. I still have a laundry list of software that needs to be installed before things are back to normal.
Some observations:
Cooling:
Those HAF (high airflow) cases really make a difference. I'm using the same GPU as before, and it's running a good 20 degrees C cooler than in the old computer. I didn't blow the dust out of it while the computer was apart, either. Wow. Just wow. The case is a Cooler Master HAF 912 with the front 120 mm fan moved to the side and a CM 200 mm fan in the front. The rear 120 mm exhaust fan is unchanged and the top vents are left open for passive convective cooling.
Overclocking:
The Asus motherboard comes with some interesting utilities. You can do a lot of the BIOS configuration stuff from within Windows, and you can restart directly into the BIOS without having to sit there pounding on the Del key like a bored chimpanzee in a lab experiment. That's a nice touch.
It has an "auto overclock tuning" button in Windows software. It works, sort of.
It happily raised the memory from 1600 MHz to 2400 MHz, and the CPU from 3.4 GHz to 4.6 GHz, while testing it "under load" as it gradually increased the frequencies and voltages.
Even with the stock cooler (something better is on order), the cpu temp remained ok.
Not surprisingly, their definition of "under load" does not mean "continuously running AVX instructions on all cores". Starting up LLR rapidly brought the CPU temps into the 80s and it hit 90 within a few seconds.
Back to stock speeds for now. :)
As expected, LLR runtimes (stock clocks) are about half of what they were with the Core2Quad Q6600.
Memory:
We all know that LLR and GFN are usually constrained by memory bandwidth. The Asus utilities show the CPU power consumed. Running PPSE (which uses cache more and memory less) consumes 5 to 10 watts more power than SR5 does, and the CPU runs a few degrees hotter as well.
Once I get the new cpu cooler, I'll try overclocking the memory first to see what effect that has.
Gripes:
Although I'm very happy with the purchases, there's a couple of things I don't like:
HAF 912: You seem to have to remove BOTH the left and right side panels in order to release the plastic clips holding the front bezel. Therefore, to clean the front air filter, you need to remove both sides as well as the front.
Asus Z87-Plus: Not sure if this is specific to this motherboard or if it affects all new "UEFI" BIOSs. You can't get into Windows Safe Mode by hitting F8 anymore. F8 just brings you to a menu that lets you select which disk to boot from. To get to safe mode, you have to already be in Windows and run the msconfig program to configure Windows to boot into safe mode on the next boot.
I think the BIOS itself is putting the computer to sleep, above and beyond Windows' own power settings. I need to dig deeper into this. There are all sorts of power saving options, including powering down unused ports, but I think it's overdoing it.
Nice touches:
The motherboard has a power switch on the board itself. It's a good idea to hook everything up on the bench before mounting it all in the case, so to actually turn it on without the case's power switch you usually need to short two pins on the motherboard to simulate the switch. Having an actual button makes it simpler.
The CPU wattage display is nice.
The system tells you how effective your cooling system is by displaying the degree/watt ratio, i.e., how much hotter your cpu gets for each extra watt used. The lower the number, the better the cooling. It's 0.59 degrees C per watt with the stock cooler.
Lots of control of the fans, including turning them off under low loads.
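The degree/watt figure can be turned into a rough temperature estimate. This is a back-of-the-envelope sketch that assumes the relationship is linear and that the 0.59 value holds across the whole load range (both are simplifications); the function name is illustrative only.

```python
DEG_C_PER_WATT = 0.59  # stock cooler, as reported by the Asus utility

def extra_temp_c(extra_watts, deg_per_watt=DEG_C_PER_WATT):
    """Estimated CPU temperature rise for additional package power."""
    return extra_watts * deg_per_watt

# PPSE was observed to draw 5 to 10 W more than SR5.
for watts in (5, 10):
    print(f"+{watts} W -> about +{extra_temp_c(watts):.1f} C")
```

That works out to roughly 3 to 6 degrees, in line with PPSE running "a few degrees hotter" than SR5.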
Acknowledgements:
First a very special thank you to a few people who, unasked, donated hardware to help me rebuild. That was incredibly generous. You know who you are!
I'd also like to thank everyone here for their help and comments. I spent a lot of time reading website reviews, but I also listened to the advice here and the advice here was very, very good.
| |
|
|
Congratulations! New hardware is always exciting.
I hadn't considered the UEFI BIOS being why you can no longer hit F8 to get the Windows boot menu. Interesting - the few times I have needed to get there I just reset the machine when Windows first starts, that forces it into recovery mode. It's not elegant but luckily I've only needed to do it a few times. You've just reminded me there has to be a better way - time to do some searching.
Edit - cool, you can restore the old F8 menu. Off to do so now! http://lifehacker.com/bring-back-the-old-f8-safe-mode-shortcut-in-windows-8-577175460 | |
|
mikey
|
Sorta makes you wonder how many more legacy/standard options there are.
Congratulations on the new pc and I hope it gives you a very long life of crunching and whatever else you have in mind for it! It sounds like you made lots of very good choices for it!!! | |
|
Michael Goetz Volunteer moderator Project administrator
|
I just installed the software for the Cyberpower UPS this computer is plugged into. It shows how much power the UPS is supplying on the battery backed up plugs.
This computer can crunch LLR over twice as fast as the Core 2 Quad it replaced, yet the max power consumption dropped from about 390 watts to 265 watts. That's with the same GPU, and includes two monitors and a network switch.
That's a big reduction in power!
____________
My lucky number is 75898524288+1 | |
|
mikey Send message
Joined: 17 Mar 09 Posts: 1654 ID: 37043 Credit: 733,584,432 RAC: 85,769
                     
|
I just installed the software for the Cyberpower UPS this computer is plugged into. It shows how much power the UPS is supplying on the battery backed up plugs.
This computer can crunch LLR over twice as fast as the Core 2 Quad it replaced, yet the max power consumption dropped from about 390 watts to 265 watts. That's with the same GPU, and includes two monitors and a network switch.
That's a big reduction in power!
Comparatively those old chips weren't very 'green' were they! I too am just amazed at how many 6 core AMD pc's I can bring online, they are cheap, as opposed to some old dual core pc's, all on the same circuit! I can take 2 old dual core pc's off line and put 3 6 core machines on and the circuit breaker doesn't blow. 3 dual cores would pop the breaker the instant the 3rd one started up, this is a standard 15amp lighting circuit. | |
|
axnVolunteer developer Send message
Joined: 29 Dec 07 Posts: 285 ID: 16874 Credit: 28,027,106 RAC: 0
            
|
After some timing experiments with my i5-3330, I have come to the conclusion that dual channel (DDR3-1333 CL9) is not sufficient for a quad core :-(
These are the rough timings from running SR5 workload:
1 instance: 2.5 ms/iter
2 instance: 2.6 ms/iter
3 instance: 3.1 ms/iter
4 instance: 3.7 ms/iter
As a result of this timing exercise, I have gone from running BOINC on all four cores to just 3 cores, with less than about a 10% drop in aggregate throughput.
The result of "no difference" from the other thread must've been due to the humongous L3 cache (15MB), which would be enough to hold two FFTs in their entirety plus change.
Mike, do you have scaling timings similar to mine? Especially against different memory speeds?
PS:- I've ordered two sticks of DDR3-1600. Hopefully, that'll bring some improvement in performance. | |
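A quick way to sanity-check the throughput claim above is to convert the per-instance ms/iter figures into total iterations per second. This is only a sketch using the rough timings quoted in the post; the function and variable names are made up for illustration.

```python
# Rough ms/iter figures quoted above for 1 to 4 simultaneous SR5 instances.
ms_per_iter = {1: 2.5, 2: 2.6, 3: 3.1, 4: 3.7}


def aggregate_throughput(instances, ms):
    """Total iterations per second summed over all running instances."""
    return instances * 1000.0 / ms


throughput = {n: aggregate_throughput(n, ms) for n, ms in ms_per_iter.items()}
for n, ips in throughput.items():
    print(f"{n} instance(s): {ips:7.1f} iter/s total")

# Fraction of total throughput given up by running 3 instances instead of 4.
drop = 1.0 - throughput[3] / throughput[4]
print(f"Drop going from 4 to 3 instances: {drop:.1%}")
```

With these rounded timings the drop works out to roughly 10%, which is in line with the estimate in the post.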
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
Mike, do you have scaling timings similar to mine? Especially against different memory speeds?
Nope, not yet. Next step for me is replacing the stock cooler, and then I'll try overclocking the ram from 1600 to 2400 while leaving the cpu at stock.
I agree with what you're saying -- it fits perfectly with everything we know (or think we know). Going with LGA2011 just wasn't in the budget -- and as much fun as it is, I'm not trying to create the ultimate crunching machine. Competitive spirit aside, it really doesn't matter how fast it is.
Now I need to decide whether I want the new cpu cooler blowing towards the rear or the top. Towards the top will probably exhaust the air better (huge 200 mm grill on top vs a 120 mm fan on the rear) but it will be pulling in air from the vicinity of the GPU rather than the front of the case. I also need to think about the voltage regulator heat sinks, which are located both above and to the rear of the cpu.
____________
My lucky number is 75898524288+1 | |
|
darkclown Volunteer tester Send message
Joined: 3 Oct 06 Posts: 328 ID: 3605 Credit: 1,422,865,129 RAC: 337,605
                         
|
Just allocate a mineral oil bath for the system, cooling problems solved!
:)
Looks like a nice box!
Can you post a summary of the final parts list, or is the original post up to date?
____________
My lucky #: 60133106^131072+1 (GFN 17-mega) | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
Just allocate a mineral oil bath for the system, cooling problems solved!
:)
Looks like a nice box!
Can you post a summary of the final parts list, or is the original post up to date?
Core i5 4670K
Asus Z87-Plus
G.Skill 2x4GB 2400 MHz DDR3 RAM (Ares series)
CoolerMaster HAF 912
CM 200 mm fan for the front of the case, moving the 120 mm fan to the side
extra 140 mm filter for the side intake
Noctua NH-U14S cooler + second NF-A15 PWM 140/150 mm fan (interesting fact: the fan the NH-U14S comes with is the same part number, but spins 300 RPM faster than the fan you can buy stand alone. That's intentional.)
Hand me down parts: Corsair CX600 PS, PNY GTX 460 GPU, and a DVD burner. The GTX 460 will shortly be replaced by a GTX 580 (also a hand me down).
____________
My lucky number is 75898524288+1 | |
|
|
Just allocate a mineral oil bath for the system, cooling problems solved!
:)
Looks like a nice box!
Can you post a summary of the final parts list, or is the original post up to date?
Core i5 4670K
Asus Z87-Plus
G.Skill 2x4GB 2400 MHz DDR3 RAM (Ares series)
CoolerMaster HAF 912
CM 200 mm fan for the front of the case, moving the 120 mm fan to the side
extra 140 mm filter for the side intake
Noctua NH-U14S cooler + second NF-A15 PWM 140/150 mm fan (interesting fact: the fan the NH-U14S comes with is the same part number, but spins 300 RPM faster than the fan you can buy stand alone. That's intentional.)
Hand me down parts: Corsair CX600 PS, PNY GTX 460 GPU, and a DVD burner. The GTX 460 will shortly be replaced by a GTX 580 (also a hand me down).
I have built this one:
Core i7 4930k (HT off)
Gigabyte X79-UP4
Corsair 4x4GB 1600Mhz DDR3 RAM
my old Noctua NH-U12P cooler with mounting kit @1000U/min, socket 2011
Asus HD7950
Bequiet 480W E9 modular Straight Power
All of them are fairly quiet parts, which I need, but they still have enough power. | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
This afternoon I replaced the Intel stock cooler with the Noctua NH-U14S (with a second fan in push-pull configuration). It's, um, slightly larger.
To say there's a big difference in temperatures is a significant understatement. With all 4 cores running LLR, it's about 20 degrees cooler AND the fans are barely spinning. Forget the temperatures; the computer is MUCH quieter than before. In fact, I had a scare when I powered it up because the BIOS screamed at me that there was a CPU fan failure. The fans were spinning way below the default warning threshold.
I decided to mount it vertically, with the exhaust pointing straight up through the vents in the top of the case. When I looked at the actual sizes, it just made no sense to have those 150 mm fans pointing at the rear 120 mm exhaust fan. Trying to connect a fire hose to a drinking straw is the analogy that comes to mind.
Did I mention it's quiet? If I didn't hear the hard drives moving every now and then, the only way I'd know it's turned on is that I can see one of the CPU fans turning through the top vent. And it's crunching full bore at the moment.
I can thoroughly recommend this CPU cooler. Attaching it was easy (and the instructions are well written), it comes with a 6 year warranty, and the fans have a 150,000 hour MTBF. The company has a history of providing free adapter kits to existing owners when new CPU sockets come out.
In other news, I swapped the GTX 460 with a superclocked GTX 580. Time to give that a test drive and see if it can run Genefer reliably with the factory overclocking.
Real numbers:
4 cores running PSP-LLR, GPU idle:
Asus AI Suite 3 is reporting a CPU temp of 52 degrees (core temp reports 64 degrees).
CPU fan (150 mm 1500 RPM Noctua) is spinning at 646 RPM
2nd CPU fan (150 mm 1200 RPM Noctua) is spinning at 531 RPM
Rear fan (120 mm Cooler Master) is spinning at 885 RPM
Side fan (120 mm Cooler Master) is spinning at 831 RPM
Front fan (200 mm Cooler Master) is spinning at 464 RPM
____________
My lucky number is 75898524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
In other news, I swapped the GTX 460 with a superclocked GTX 580. Time to give that a test drive and see if it can run Genefer reliably with the factory overclocking.
Ok, this was expected, I guess:
GPU=GeForce GTX 580
Global memory=1610612736 Shared memory/block=49152 Registers/block=32768 Warp size=32
Max threads/block=1024
Max thread dim=1024 1024 64
Max grid=65535 65535 65535
CC=2.0
Clock=1594 MHz
# of MP=16
No project preference set; using AUTO-SHIFT=8
Starting initialization...
maxErr during b^N initialization = 0.0000 (14.286 seconds).
Testing 370268^1048576+1...
Estimated total run time for 370268^1048576+1 is 8:59:33
maxErr exceeded for 370268^1048576+1, 0.5000 > 0.4500
MaxErr exceeded may be caused by overclocking, overheated GPUs and other transient errors.
Waiting 10 minutes before attempting to continue from last checkpoint...
Going to have to un-overclock it. :)
____________
My lucky number is 75898524288+1 | |
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
 Send message
Joined: 17 Oct 05 Posts: 2380 ID: 1178 Credit: 17,929,053,988 RAC: 10,800,378
                                                
|
Try the memory first. I bet you can run it with the shader overclock with a lower memory setting.
| |
|
|
Core i5 4670K
Asus Z87-Plus
G.Skill 2x4GB 2400 MHz DDR3 RAM (Ares series)
CoolerMaster HAF 912
CM 200 mm fan for the front of the case, moving the 120 mm fan to the side
extra 140 mm filter for the side intake
Noctua NH-U14S cooler + second NF-A15 PWM 140/150 mm fan (interesting fact: the fan the NH-U14S comes with is the same part number, but spins 300 RPM faster than the fan you can buy stand alone. That's intentional.)
Hand me down parts: Corsair CX600 PS, PNY GTX 460 GPU, and a DVD burner. The GTX 460 will shortly be replaced by a GTX 580 (also a hand me down).
I have built this one:
Core i7 4930k (HT off)
Gigabyte X79-UP4
Corsair 4x4GB 1600Mhz DDR3 RAM
my old Noctua NH-U12P cooler with mounting kit @1000U/min, socket 2011
Asus HD7950
Bequiet 480W E9 modular Straight Power
All of them are fairly quiet parts, which I need, but they still have enough power.
Mike. Looks like you have a nice machine there.
rebirther. Looks like a sweet build also.
I built my first monster build in 8 years about 3 weeks ago:
Core i7-3930K
EVGA X79 DARK
ARCTIC Freezer i30 Extreme cooler
Mushkin Enhanced Redline 4 x 8GB 1866Mhz DDR3
ASUS GeForce GTX TITAN 6GB 384-Bit GDDR5
Thermaltake 1200W power supply
NZXT H630 CA-H630F-M1 Matte Black Ultra Tower Silent Case
I really like this case. I've added three 200mm fans for a total of 4, plus the 140mm that comes on the rear. May add more 140s; haven't decided yet.
The Titan has been running GFN WR with OpenCL app in 81 hours.
I kept one core free for the Titan.
With HT on I'm running 11 SR5 at 5.5hrs each.
CPU sits happily at 60C in a 30-35C room.
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
Try the memory first. I bet you can run it with the shader overclock with a lower memory setting.
First thing I did was put everything back to stock -- or at least I tried to do that. I installed EVGA precision from the disk that came with the GTX 580, and it immediately said "There's a newer version!". So I installed the latest. Then I uninstalled the latest and went back to the old one because the new Precision won't let you lower the clocks below factory settings.
Even at stock, however, it still errored out, so the next thing I did was lower the memory clock about 10 % below stock. Now it seems to be running stable. I'll let it complete this GFN before playing with the clocks again.
Actually, I lied. I left out a step. Somewhere in there I set all the fans to full speed to see if it made a difference. Noise went up, temps went down. Error occurred at the exact same place.
____________
My lucky number is 75898524288+1 | |
|
darkclown Volunteer tester Send message
Joined: 3 Oct 06 Posts: 328 ID: 3605 Credit: 1,422,865,129 RAC: 337,605
                         
|
Friend of mine just found his pair of 580s in his closet. I need to see if I can talk him out of one...
____________
My lucky #: 60133106^131072+1 (GFN 17-mega) | |
|
|
I installed EVGA precision from the disk that came with the GTX 580, and it immediately said "There's a newer version!". So I installed the latest. Then I uninstalled the latest and went back to the old one because the new Precision won't let you lower the clocks below factory settings.
Yes, it's very annoying. I got my 580 used and don't have the disk with the older version - but luckily MSI Afterburner works with EVGA cards too - and afterburner is far less garish than the EVGA utility (it reminded me of some truly horrid winamp skins from the '90s).
And congrats on the CPU cooler - that's a really nice one. I wish I would have gotten that one instead of the watercooling block - I think the Noctua would have done just as well and it was $25 cheaper and takes up less room in the case! And two fans vs. two fans and a pump in the watercooler is simpler too. And those appear to be higher quality fans as well.
Thanks for sharing! | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
Stock Intel cooler:
... next to the Noctua NH-U14S:
____________
My lucky number is 75898524288+1 | |
|
|
Try the memory first. I bet you can run it with the shader overclock with a lower memory setting.
First thing I did was put everything back to stock -- or at least I tried to do that. I installed EVGA precision from the disk that came with the GTX 580, and it immediately said "There's a newer version!". So I installed the latest. Then I uninstalled the latest and went back to the old one because the new Precision won't let you lower the clocks below factory settings.
Even at stock, however, it still errored out, so the next thing I did was lower the memory clock about 10 % below stock. Now it seems to be running stable. I'll let it complete this GFN before playing with the clocks again.
Actually, I lied. I left out a step. Somewhere in there I set all the fans to full speed to see if it made a difference. Noise went up, temps went down. Error occurred at the exact same place.
During the GFN challenge I was able to OC ALL my Fermis by lowering the memory clocks 10-15%. With my GTX 460 (at 1884 Mhz) & 570 (at 1844 Mhz) I was able to OC 25-30% even without my A/C cooling setup I had at my old house, although my 580 OC'ed the least (at 1644 Mhz).
So memory clock IS the key.
____________
Largest Primes to Date:
As Double Checker: SR5 109208*5^1816285+1 Dgts-1,269,534
As Initial Finder: SR5 243944*5^1258576-1 Dgts-879,713
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
During the GFN challenge I was able to OC ALL my Fermis by lowering the memory clocks 10-15%. With my GTX 460 (at 1884 Mhz) & 570 (at 1844 Mhz) I was able to OC 25-30% even without my A/C cooling setup I had at my old house, although my 580 OC'ed the least (at 1644 Mhz).
So memory clock IS the key.
That certainly fits with everything we know (or think we know). When I get a chance I'll put the 580 back at factory clocks except for the memory.
____________
My lucky number is 75898524288+1 | |
|
mikey Send message
Joined: 17 Mar 09 Posts: 1654 ID: 37043 Credit: 733,584,432 RAC: 85,769
                     
|
Stock Intel cooler:
... next to the Noctua NH-U14S:
That's HUGE!!! | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
After some timing experiments with my i5-3330, I have come to the conclusion that dual channel (DDR3-1333 CL9) is not sufficient for a quad core :-(
These are the rough timings from running SR5 workload:
1 instance: 2.5 ms/iter
2 instance: 2.6 ms/iter
3 instance: 3.1 ms/iter
4 instance: 3.7 ms/iter
As a result of this timing exercise, I have gone from running BOINC on all four cores to just 3 cores, with less than about a 10% drop in aggregate throughput.
The result of "no difference" from the other thread must've been due to the humongous L3 cache (15MB), which would be enough to hold two FFTs in their entirety plus change.
Mike, do you have scaling timings similar to mine? Especially against different memory speeds?
PS:- I've ordered two sticks of DDR3-1600. Hopefully, that'll bring some improvement in performance.
I ran some tests this morning.
Test conditions:
Stock clocks: i5-4670K @ 3.4 GHz, memory at 1600 MHz
Fans: all fans set to full speed (to minimize effects of temperature on TurboBoost)
OS: Windows, in normal mode.
Command: llr64 -d -q"222113*2^14948141+1"
The test was run with an actual PSP candidate. All cores were running the same number. (Because different numbers can have different iteration times, you should ALWAYS run benchmarks from the command line on the same test number and never use live BOINC tests for benchmarks.)
Note that this test measures more than just memory contention. With all cores running, you're also seeing a lower CPU clock speed since TurboBoost is raising the clock to 3.8 GHz with one core running but leaves it at 3.4 GHz with 4 cores running. Furthermore, all the Windows background processes (about 7% total CPU time) are running, which definitely affects the 4 core results.
Cores  core 1  core 2  core 3  core 4  Iter/sec     I/S/core     Incremental  Loss
1      8.515   -       -       -       117.4398121  117.4398121  n/a          0.00%
2      8.761   8.763   -       -       228.2583915  114.1291957  110.8185794  2.82%
3      9.495   9.488   9.509   -       315.8784081  105.2928027  87.6200166   10.34%
4      11.073  11.896  12.001  11.056  348.1466467  87.03666167  32.2682386   25.89%
Iter/Sec is the total work done -- so as you can see, it's still beneficial to run all 4 cores. More crunching is happening with 4 cores running than with 3.
Incremental is the amount of additional work you get done by running an additional core.
Loss is the total amount of inefficiency due to memory contention, change in CPU frequency, pre-emption by other Windows tasks, etc.
Performance really drops off with the 4th core (it gets hit especially hard by the other Windows processes sucking up CPU time), but even so, it's still worthwhile to run it. More crunching gets done with 4 cores (348 iterations/sec) than with 3 cores (315 I/s).
Once I start playing with overclocking, I may try repeating this test with the CPU still at stock speeds but the memory at 2400 (if it can do that reliably). This probably won't happen until after the PSP challenge is over, however.
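For anyone who wants to reproduce the derived columns, here is a small sketch of how they appear to follow from the per-core timings. It assumes the per-core figures are milliseconds per iteration as reported by LLR, which is consistent with the numbers in the table; the function and variable names are invented for this example.

```python
def scaling_row(ms_per_core, single_core_ips, previous_total):
    """Derive Iter/sec, I/S/core, Incremental and Loss for one test run.

    ms_per_core     -- ms/iter reported by each running instance
    single_core_ips -- iterations/sec of the 1-core run (the baseline)
    previous_total  -- Iter/sec of the run with one core fewer, or None
    """
    total = sum(1000.0 / ms for ms in ms_per_core)
    per_core = total / len(ms_per_core)
    incremental = None if previous_total is None else total - previous_total
    loss = 1.0 - per_core / single_core_ips
    return total, per_core, incremental, loss


# Per-core ms/iter from the table above (memory at 1600 MHz).
runs = [
    [8.515],
    [8.761, 8.763],
    [9.495, 9.488, 9.509],
    [11.073, 11.896, 12.001, 11.056],
]

baseline = 1000.0 / runs[0][0]
rows = []
previous = None
for times in runs:
    row = scaling_row(times, baseline, previous)
    rows.append(row)
    previous = row[0]

for cores, (total, per_core, incremental, loss) in enumerate(rows, start=1):
    inc = "n/a" if incremental is None else f"{incremental:.4f}"
    print(f"{cores} cores: {total:.4f} iter/s, {per_core:.4f} per core, "
          f"incremental {inc}, loss {loss:.2%}")
```

Swapping in a different set of per-core timings gives the same columns for any other memory or clock setting.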
____________
My lucky number is 75898524288+1 | |
|
|
Looks like I'm having the same issue with my new (refurbished) GTX 580.
http://www.primegrid.com/results.php?userid=32401&offset=0&show_names=0&state=4&appid=16
I just dumped Windows for Linux, hoping that would help. It does not look like it did.
Does anyone know how to lower the memory clock in Linux?
____________
| |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
Cores  core 1  core 2  core 3  core 4  Iter/sec     I/S/core     Incremental  Loss
1      8.515   -       -       -       117.4398121  117.4398121  n/a          0.00%
2      8.761   8.763   -       -       228.2583915  114.1291957  110.8185794  2.82%
3      9.495   9.488   9.509   -       315.8784081  105.2928027  87.6200166   10.34%
4      11.073  11.896  12.001  11.056  348.1466467  87.03666167  32.2682386   25.89%
I increased the memory speed to 2400 MHz. I *think* the BIOS, on its own, increased the CPU base speed from 3.4 GHz to 3.6 GHz. I'm not totally sure why, or if I'm understanding everything correctly. Therefore, some of the raw speed increase might be due to the CPU simply running faster. However, it's clearly visible in the results how much better the memory does at 2400 MHz because the difference between 1 core running and 4 cores running is substantially smaller.
Cores  core 1  core 2  core 3  core 4  Iter/sec     I/S/core     Incremental  Loss
1      7.893   -       -       -       126.6945395  126.6945395  n/a          0.00%
2      8.065   8.066   -       -       247.9697486  123.9848743  121.2752092  2.14%
3      8.336   8.350   8.334   -       359.7124921  119.904164   111.7427435  5.36%
4      9.248   8.882   9.244   9.048   439.4186831  109.8546708  79.70619097  13.29%
Getting faster memory (assuming your MB and CPU support it) is clearly a very cost effective means of getting more LLR performance. (In my particular purchases, with Internet price fluctuations being what they are, the 2400 MHz memory was actually $1 cheaper than the otherwise identical 1600 MHz part.)
Temps are running about 5 degrees higher, but the fans are only running at about 60% speed. We'll see if I get any LLR tasks fail validation.
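To put a number on the improvement, the two sets of timings can be compared directly. The sketch below is illustrative only: it assumes the per-core figures are ms/iter, and, as noted in the post, part of the gain may come from the BIOS raising the CPU base clock rather than from the memory alone. All names are made up for the example.

```python
# Per-core ms/iter keyed by number of cores running, from the two tables.
runs_1600 = {
    1: [8.515],
    2: [8.761, 8.763],
    3: [9.495, 9.488, 9.509],
    4: [11.073, 11.896, 12.001, 11.056],
}
runs_2400 = {
    1: [7.893],
    2: [8.065, 8.066],
    3: [8.336, 8.350, 8.334],
    4: [9.248, 8.882, 9.244, 9.048],
}


def total_ips(times):
    """Total iterations per second across all running instances."""
    return sum(1000.0 / ms for ms in times)


# Relative throughput gain of the 2400 MHz runs over the 1600 MHz runs.
gain = {n: total_ips(runs_2400[n]) / total_ips(runs_1600[n]) - 1.0
        for n in runs_1600}

for n, g in gain.items():
    print(f"{n} core(s): {g:+.1%} total throughput at 2400 MHz")
```

The gain is a little under 8% with one core and about 26% with all four, which fits the observation that the faster memory matters most when every core is busy.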
____________
My lucky number is 75898524288+1 | |
|
|
To maximize the output, you can run 2 short wu and 2 long, i.e. 2 x sgs (pps ex) + 2 psp/sob. | |
|
axnVolunteer developer Send message
Joined: 29 Dec 07 Posts: 285 ID: 16874 Credit: 28,027,106 RAC: 0
            
|
Cores  core 1  core 2  core 3  core 4  Iter/sec     I/S/core     Incremental  Loss
1      7.893   -       -       -       126.6945395  126.6945395  n/a          0.00%
2      8.065   8.066   -       -       247.9697486  123.9848743  121.2752092  2.14%
3      8.336   8.350   8.334   -       359.7124921  119.904164   111.7427435  5.36%
4      9.248   8.882   9.244   9.048   439.4186831  109.8546708  79.70619097  13.29%
w00t! Now that's some good scaling. | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
During the GFN challenge I was able to OC ALL my Fermis by lowering the memory clocks 10-15%. With my GTX 460 (at 1884 Mhz) & 570 (at 1844 Mhz) I was able to OC 25-30% even without my A/C cooling setup I had at my old house, although my 580 OC'ed the least (at 1644 Mhz).
So memory clock IS the key.
That certainly fits with everything we know (or think we know). When I get a chance I'll put the 580 back at factory clocks except for the memory.
As expected, the factory overclock seems to work as long as the memory is underclocked, so I'll leave it like that.
____________
My lucky number is 75898524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
Here are some real-life results showing the difference the 2400 MHz memory makes. This is the task page for the Core i5's PSP tasks. The first 4 at the bottom are with the memory set to 1600 MHz and the newer ones are with the memory running at 2400 MHz.
http://www.primegrid.com/results.php?hostid=424445&offset=0&show_names=0&state=0&appid=8
(If you're reading this sometime in the future, that link might not show anything. Sorry.)
The CPU time dropped from about 160K seconds to 136K seconds AND these newer tasks are also about 5% larger than the earlier tasks that took 160K seconds. (Ignore the elapsed time measurements; I have way too much junk running in the background.)
____________
My lucky number is 75898524288+1 | |
|
|
has anyone experience with the newest i7 4790k Processor ?
____________
| |
|
|
has anyone experience with the newest i7 4790k Processor ?
Hi, T-Armstrong. This should really be a new thread topic and would probably be of interest to anyone interested in the new cpu and Z97 motherboards. | |
|
|
has anyone experience with the newest i7 4790k Processor ?
Hi, T-Armstrong. This should really be a new thread topic and would probably be of interest to anyone interested in the new cpu and Z97 motherboards.
OK I understand
____________
| |
|
|
Cores  core 1  core 2  core 3  core 4  Iter/sec     I/S/core     Incremental  Loss
1      7.893   -       -       -       126.6945395  126.6945395  n/a          0.00%
2      8.065   8.066   -       -       247.9697486  123.9848743  121.2752092  2.14%
3      8.336   8.350   8.334   -       359.7124921  119.904164   111.7427435  5.36%
4      9.248   8.882   9.244   9.048   439.4186831  109.8546708  79.70619097  13.29%
That looks like very interesting data.. how are you getting it?
I recall very early in my time here being advised to check how LLR reacts to my CPU being hyperthreaded. That in most cases, 4 tasks (1 per core) was nearly always the optimal configuration.
After reading this thread though.. if most of you really are memory bandwidth limited (Assuming a steady CPU clock of between 3.6 GHz to 4 GHz) and if my CPU clock ranges from 2.35 to 3.17 GHz, is it possible I could be CPU limited?
I've taken some run time measurements running SR5:
hours daily / avg run time * # of tasks * sample credit
24 / 4.75 * 4 * 830 = 16,774 (4 tasks at once)
24 / 5.75 * 5 * 830 = 17,443 (5 tasks at once)
My explanation for this is that the CPU is underutilized in some way. That could absolutely be caused by a bandwidth bottleneck, but.. in task manager (which is by no means scientifically conclusive)..
It is very "spikey" with 4 tasks (possibly a result of hyperthreading swapping pipelines to the cores.)
Whereas with 5 tasks it's much steadier. hmm.
There's a clear & obvious slowdown on all 5 tasks running them together instead of 4 at a time. But if my numbers hold water, I'm completing 4/5ths a task more per day running 5.
I have briefly tested 6 & 7 at once also, but the increased runtime and lower average clock speed doom it
That said, that variance decreases the lower the clock speed is. Running at 2.06 GHz the variance is nearly negligible, but it's hard to get consistent data running 6 @ 2.06 GHz | |
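The credit-per-day estimate above is easy to put into a small helper so other task counts can be tried. This is just a sketch of that formula; the function name is invented, and the run times and the 830 credit figure are the ones quoted in the post.

```python
def credit_per_day(avg_run_hours, concurrent_tasks, credit_per_task):
    """Estimated daily credit: (24 / run time) * tasks at once * credit each."""
    return 24.0 / avg_run_hours * concurrent_tasks * credit_per_task


four_at_once = credit_per_day(4.75, 4, 830)
five_at_once = credit_per_day(5.75, 5, 830)

print(f"4 tasks at once: {four_at_once:,.0f} credit/day")
print(f"5 tasks at once: {five_at_once:,.0f} credit/day")
print(f"Gain from the 5th task: {five_at_once / four_at_once - 1:.1%}")
```

With the run times exactly as printed, the second line evaluates to about 17,322 rather than 17,443. The posted figure would correspond to an average run time of roughly 5.71 hours, so presumably the 5.75 was rounded. Either way, 5 tasks at once comes out ahead.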
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
Cores  core 1  core 2  core 3  core 4  Iter/sec     I/S/core     Incremental  Loss
1      7.893   -       -       -       126.6945395  126.6945395  n/a          0.00%
2      8.065   8.066   -       -       247.9697486  123.9848743  121.2752092  2.14%
3      8.336   8.350   8.334   -       359.7124921  119.904164   111.7427435  5.36%
4      9.248   8.882   9.244   9.048   439.4186831  109.8546708  79.70619097  13.29%
That looks like very interesting data.. how are you getting it?
I explained the test conditions in the same post where I posted that data. One addendum: if I were repeating the test today, I'd do it in Windows safe mode so there would be fewer background tasks running.
One critically important point from that post: NEVER use live BOINC tasks for benchmarks. They vary a lot and can produce misleading results. Always run tests manually from the command line.
LLR prints the iteration timing numbers periodically (every 10000 iterations). Never use the first report, because it includes initialization processing and is not indicative of the actual iteration time. Always wait until at least iteration 20000, and make sure all of the cores you're testing were already running at the start of that interval. Translation: every instance should run for at least 10000 iterations with all the other instances running, and all of them should be past iteration 20000 before you use the timing numbers.
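The "skip the first report" rule can be applied mechanically once the timings have been written down. The sketch below assumes you have copied the (iteration, ms/iter) pairs from the LLR window by hand; it does not parse LLR's output, and the sample values are invented purely to show the filtering.

```python
from statistics import mean

REPORT_INTERVAL = 10_000  # LLR reports a timing every 10000 iterations


def steady_state_ms(samples, report_interval=REPORT_INTERVAL):
    """Average ms/iter, ignoring reports before iteration 2 * report_interval.

    samples -- list of (iteration, ms_per_iter) pairs noted down by hand.
    The first report is dropped because it includes initialization time.
    """
    usable = [ms for iteration, ms in samples if iteration >= 2 * report_interval]
    if not usable:
        raise ValueError("no samples at or beyond iteration 20000 yet")
    return mean(usable)


# Hypothetical readings for one instance (illustrative values, not real data).
example = [(10_000, 9.9), (20_000, 8.5), (30_000, 8.7)]
print(f"Usable average: {steady_state_ms(example):.3f} ms/iter")
```

This only handles the iteration-count part of the rule. Checking that every instance was already running for the whole interval still has to be done by eye.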
____________
My lucky number is 75898524288+1 | |
|
|
Cores  core 1  core 2  core 3  core 4  Iter/sec     I/S/core     Incremental  Loss
1      7.893   -       -       -       126.6945395  126.6945395  n/a          0.00%
2      8.065   8.066   -       -       247.9697486  123.9848743  121.2752092  2.14%
3      8.336   8.350   8.334   -       359.7124921  119.904164   111.7427435  5.36%
4      9.248   8.882   9.244   9.048   439.4186831  109.8546708  79.70619097  13.29%
That looks like very interesting data.. how are you getting it?
I explained the test conditions in the same post where I posted that data. One addendum: if I were repeating the test today, I'd do it in Windows safe mode so there would be fewer background tasks running.
One critically important point from that post: NEVER use live BOINC tasks for benchmarks. They vary a lot and can produce misleading results. Always run tests manually from the command line.
LLR prints the iteration timing numbers periodically (every 10000 iterations). Never use the first report, because it includes initialization processing and is not indicative of the actual iteration time. Always wait until at least iteration 20000, and make sure all of the cores you're testing were already running at the start of that interval. Translation: every instance should run for at least 10000 iterations with all the other instances running, and all of them should be past iteration 20000 before you use the timing numbers.
I understand the hardware upgrades as it relates to the test conditions (upgrading the RAM alleviated a memory bandwidth bottleneck & improved instructions per CPU clock). But I don't know how to -get- that data. That's what I was asking.
Up until now I've used a stopwatch and a calculator.. I do not know how to read iteration timing numbers of LLR -- you hinted at using a command line?
It sounds like it's also got a consistent benchmark.. which would eliminate the chance of variability while testing configurations. However, that also means actual units, (and each LLR subproject) may behave differently than said benchmark.
I'm looking to build a desktop down the road, but for now, it's just one lonely laptop. I don't foresee ever running BOINC by itself - I use my laptop for everything. I typically run a minimal amount of background programs, but I'd prefer to test with those in place so I know how it acts 99% of the time
(#1 reason I haven't built a desktop is because I can't in good conscience build one weaker than my current laptop)
My apologies. I am very literate with computer hardware but software.. I know of file types, file extensions, am used to wrestling with absent-minded routers & can brute force Fallout 3 to run on a Windows 8 laptop (that took over a week)
Beyond that I know very little more than I've let on. Most of what I know relates to hardware in some fashion.
** I tried command prompt to ensure it wasn't "that easy"
I entered llr64 -d -q"313126*5^1801143-1" and also tried "llr64.exe" and both times it said:
"llr64 is not recognized as an internal or external command, operable program, or batch file."
probably a list of things wrong/missing | |
|
|
** I tried command prompt to ensure it wasn't "that easy"
I entered llr64 -d -q"313126*5^1801143-1" and also tried "llr64.exe" and both times it said:
"llr64 is not recognized as an internal or external command, operable program, or batch file."
probably a list of things wrong/missing
I don't know the proper command line but I've seen somewhere on this forum where one of the command line arguments is '-b'. I'm assuming it signifies benchmarking. I can find posts with command line benchmarks run with the genefer app, but they also contain additional specified variables besides the -b, and I suspect the same must be done with every subproject app run from the command line.
I think someone else with more information will likely post the information you need. Don't give up hope. | |
|
Michael Goetz Volunteer moderator Project administrator
 Send message
Joined: 21 Jan 10 Posts: 13954 ID: 53948 Credit: 392,586,193 RAC: 178,879
                               
|
I'm on my phone, so this will be a bit brief.
Your first command line looks correct. Is the llr64.exe executable in the directory where you're running the command? It may also be called cllr64.exe.
When it runs, you'll see the iteration timing numbers.
-b is for genefer, not llr.
____________
My lucky number is 75898524288+1 | |
|
|
I'm on my phone, so this will be a bit brief.
Your first command line looks correct. Is the llr64.exe executable in the directory where you're running the command? It may also be called cllr64.exe.
When it runs, you'll see the iteration timing numbers.
-b is for genefer, not llr.
Hmm. I think the executable I'm looking for has a completely different name. I don't have the standalone LLR program installed, but I do have "primegrid_cllr64_3.8.13_windows_x86_64.exe" and "primegrid_llr_wrapper_6.24_windows_x86_64.exe" in the BOINC folder. If THAT is the name I have to type at the command prompt, that might be why it failed.
I'm not sure whether those will work, whether they will interfere with BOINC in some way, or how to set parameters like how many cores to test, BUT I can try renaming it and see what happens.
*Moved the executable to my user account folder (from the C:/program data folder) and renamed it "llr64.exe".
Used the "313126*5^1801143-1" value I ran through BOINC yesterday and pressed Enter.
And it looks like it's doing something. Currently single core.
**It says it's about 2% done, updating every 10000 bits. Time per bit: 6.583 ms.
What's odd is that the CPU is running at only 1.37 GHz, again single core/thread. Honestly not sure what to make of it.
***Okay, haha. It appears LLR via BOINC (SR5) does the exact same thing if it's the only task running: a low clock between 1.3 GHz and 1.5 GHz. Very interesting.
****After re-examining everything, I'd like to solve the puzzle =P
I -think- the nifty chart that was posted was made by hand from the same thing I observed, the time per bit, and then redone with multiple instances manually started and remeasured. If so, it seems I have some legwork to do, but hopefully I'll figure it out.
My chief concerns now are two things:
1, command prompt does not accept commands while running LLR. Currently I'm unable to run additional instances of either LLR or cmd.exe simultaneously for testing.
2, is the data I'm getting (time per bit) averaged over the total run time? If it is, that presents a potential for misleading data if the run time is not long enough to average out the startup time you cited.
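As a sanity check on the time-per-bit figure reported above, here is a minimal Python sketch that turns it into a rough run-time estimate. It assumes (this is an assumption, not something stated in the thread) that the number of LLR iterations is roughly the bit length of the candidate, and that the per-bit time stays constant for the whole run. The function names are made up for illustration.

```python
def candidate_bit_length(k: int, b: int, n: int, c: int = -1) -> int:
    """Exact bit length of k*b^n + c, computed with Python's big integers."""
    return (k * b**n + c).bit_length()


def estimated_hours(bits: int, ms_per_bit: float) -> float:
    """Rough run time in hours, ASSUMING one iteration per bit at a constant rate."""
    return bits * ms_per_bit / 1000.0 / 3600.0


# Candidate and timing taken from the post above: 313126*5^1801143-1 at 6.583 ms per bit.
bits = candidate_bit_length(313126, 5, 1801143)
print(f"{bits} bits, about {estimated_hours(bits, 6.583):.1f} hours at 6.583 ms per bit")
```

The estimate only holds for the clock speed at which the timing was taken, so a run measured at 1.37 GHz says little about a run at full speed.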
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester
Joined: 1 Mar 14 Posts: 1022 ID: 301928 Credit: 543,195,386 RAC: 2
|
1, command prompt does not accept commands while running LLR Currently unable to run additional instances of either LLR or cmd.exe simultaneously for testing
You must create 4 (or however many tests you wish to run at once) different temporary folders, copy the executable to each folder, and open 4 command prompt windows. Each test must be run in its own temporary folder. This is the simplest way for a beginner (it's possible to run 4 copies from a single folder, but you have to use more options to set unique .ini files and disable checkpoints).
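For anyone who would rather script the folder setup than do it by hand, a minimal Python sketch of the same idea follows. The folder names (llr_test_1, llr_test_2, ...) and the function names are hypothetical; the only LLR arguments used are the -d and -q ones from the command line quoted earlier in this thread.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def prepare_instances(exe: Path, work_root: Path, count: int) -> list[Path]:
    """Create `count` folders under work_root, each holding its own copy of exe."""
    copies = []
    for i in range(1, count + 1):
        folder = work_root / f"llr_test_{i}"  # hypothetical folder name
        folder.mkdir(parents=True, exist_ok=True)
        # shutil.copy2 returns the path of the newly created file
        copies.append(Path(shutil.copy2(exe, folder)))
    return copies


def launch_all(copies: list[Path], candidate: str) -> list[subprocess.Popen]:
    """Start one process per copy, using the copy's own folder as working directory."""
    return [
        subprocess.Popen([str(exe), "-d", f"-q{candidate}"], cwd=exe.parent)
        for exe in copies
    ]
```

Calling prepare_instances(Path("llr64.exe"), Path("llr_runs"), 4) and then launch_all(copies, "313126*5^1801143-1") would start four tests at once, each writing its checkpoint and .ini files into its own folder. The output of all four ends up in one console, so separate command prompt windows are still easier to read.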
2, the data I'm getting (time per bit) is that averaged over the total run time? If it is, that presents a potential for misleading data, if the run time is not great enough to average out the startup time you cited.
No, the time shown covers only the latest 10000 iterations.
|
|
That makes sense, and I have run some tests. But the results are puzzling. It was puzzling enough having the single core test not fully clock the CPU, but it's actually going faster with 5 workunits than it was with just 1 (5.85 ms now).
Something seems suspicious.. if I run the test more than once, does the system cache any part of the benchmark?
Even though it makes logical sense to assume 1 task at a time is faster than multiple tasks no matter -what- the tests claim, I would never do so in practice. What I really want to know is whether these tests validate the results I showed earlier with BOINC (assuming the % complete can be trusted):
that running 5 units simultaneously yielded more daily units complete (by about 3/4 of a task) than running just 4 (1 per core).
But there are variables to test, such as whether the optimal number of workunits changes with a lower clock speed (which happens when using the GPU).
That said, I'm ready to call it a night. Thank you.
|
Michael Goetz Volunteer moderator Project administrator
|
Except for the clock speed thing you seem to have figured it out. The clock speed problem is a bit bizarre.
|
stream Volunteer moderator Project administrator Volunteer developer Volunteer tester
|
That makes sense. & have run some tests. But the results are puzzling. It was puzzling enough having the single core test not fully clock the CPU, but its actually going faster with 5 workunits than it was with just 1. (5.85 ms now)
It's the bane of "green" technologies. By default, the clock of a modern CPU is set depending on system load. With only one task running, the system thinks the load is still not high enough and the CPU runs at low speed. I know how to disable this on Linux (it's called the "governor" and must be switched from the default "ondemand" to "performance" mode) but not on Windows; maybe somebody else can help.
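For the Linux side, a small Python sketch that reports which governor each core is currently using, read from the standard cpufreq sysfs files. The function names are hypothetical, and the script only reads the setting. Switching it means writing "performance" to the same files as root, which is left out here.

```python
from __future__ import annotations

from pathlib import Path

# Standard sysfs location of per-core cpufreq settings on Linux.
CPU_ROOT = Path("/sys/devices/system/cpu")


def read_governors(cpu_root: Path = CPU_ROOT) -> dict[str, str]:
    """Map each core name (cpu0, cpu1, ...) to its current scaling governor."""
    governors = {}
    for gov_file in sorted(cpu_root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        core = gov_file.parent.parent.name
        governors[core] = gov_file.read_text().strip()
    return governors


def cores_not_in_performance(governors: dict[str, str]) -> list[str]:
    """List the cores whose governor is anything other than 'performance'."""
    return [core for core, gov in governors.items() if gov != "performance"]
```

On a machine without cpufreq support (or on Windows) read_governors() simply returns an empty dictionary.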
if I run the test more than once does the system cache any part of the benchmark?
No. But LLR writes a checkpoint file, named something like "z123456", so on the next run it will continue from the checkpoint. I don't think it could affect benchmarks except for the initialization phase, which you should ignore anyway, but if you want to disable this behavior, delete this file before a new test or add this to the LLR command line: -oNoSaveFile=1
That running 5 units simultaneously yielded more daily units complete (by about 3/4ths a task) than running just 4 (1 per core)
This is a bit unusual (4 tasks work better for most setups), but things can become even more complex when power management and CPU clock adjustments come into play. So just find out which settings give the maximum overall performance (number of iterations) for YOUR system and forget about it.
total_iterations = 1000 / average_time_per_bit * number_of_tasks
(that is, iterations per second across all tasks, with average_time_per_bit in milliseconds)
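The formula is easy to wrap in a few lines of Python for comparing configurations. This is only a sketch; the function name is made up, and the two sample timings are the ones reported earlier in this thread (6.583 ms for a single task, 5.85 ms with 5 workunits).

```python
def total_iterations_per_second(avg_ms_per_bit: float, tasks: int) -> float:
    """Combined iterations per second for `tasks` simultaneous LLR tasks."""
    return 1000.0 / avg_ms_per_bit * tasks


# Timings quoted earlier in the thread (milliseconds per bit, number of tasks).
for ms_per_bit, tasks in [(6.583, 1), (5.85, 5)]:
    rate = total_iterations_per_second(ms_per_bit, tasks)
    print(f"{tasks} task(s) at {ms_per_bit} ms/bit: {rate:.1f} iterations/s in total")
```

Whichever task count gives the largest total is the one to keep, regardless of how slow each individual task looks.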
| |
|
|
It's the bane of "green" technologies. By default, the clock of a modern CPU is set depending on system load. With only one task running, the system thinks the load is still not high enough and the CPU runs at low speed. I know how to disable this on Linux (it's called the "governor" and must be switched from the default "ondemand" to "performance" mode) but not on Windows; maybe somebody else can help.
For Windows: check Power Options. You could be in energy saver mode. Set it to High Performance, turn off any "saver" mode for PCI Express, and set processor power management to 100%.
Another area to check: what is the Intel SpeedStep setting in your BIOS? (If you don't mind flashing the BIOS, locate a custom VBIOS/BIOS to open up the advanced settings. These can range from memory timings to Intel processor Turbo settings.) Looking at the CPU times for your SR5 tasks, it looks like you have dual-channel 1600 MHz RAM. Ivy Bridge's IMC can take RAM clocks up to 2666 MHz, so upgrading to faster RAM modules will increase performance by ~25%.
|
|
what's odd is the CPU is running at only 1.37 GHz, again single core/thread.
What I really want to know is if these tests validate the results I showed earlier with BOINC (assuming the % complete can be trusted)
That running 5 units simultaneously yielded more daily units complete (by about 3/4ths a task) than running just 4 (1 per core)
But there are variables to test, such as whether the optimal number of workunits changes with a lower clock speed (which happens when using the GPU)
It's the bane of "green" technologies. By default, the clock of a modern CPU is set depending on system load. With only one task running, the system thinks the load is still not high enough and the CPU runs at low speed. I know how to disable this on Linux (it's called the "governor" and must be switched from the default "ondemand" to "performance" mode) but not on Windows; maybe somebody else can help.
For Windows: check Power Options. You could be in energy saver mode. Set it to High Performance, turn off any "saver" mode for PCI Express, and set processor power management to 100%.
Another area to check: what is the Intel SpeedStep setting in your BIOS? (If you don't mind flashing the BIOS, locate a custom VBIOS/BIOS to open up the advanced settings. These can range from memory timings to Intel processor Turbo settings.) Looking at the CPU times for your SR5 tasks, it looks like you have dual-channel 1600 MHz RAM. Ivy Bridge's IMC can take RAM clocks up to 2666 MHz, so upgrading to faster RAM modules will increase performance by ~25%.
-Copied the above to reduce hunting for relevant points of discussion
That's very perceptive. Yes, I should have dual-channel 1600 MHz RAM, although because my CPU speed varies so much, that might be hard to discern from my results page.
The reason I believe my system is slightly faster with 5 units instead of 4 is that some part of my system is underutilized where most systems are bottlenecked.
For example, because my CPU runs at between 2.35 GHz and 3.17 GHz (instead of the 3.5 GHz to 4 GHz of desktops), I suspect desktop systems bottleneck on their RAM or cache more than I do.
In my system, the memory may be fast relative to the CPU, so increasing the task count to 5 uses more CPU time without slowing things down -too much-.
I used Task Manager to support this theory. Its "spikeyness" could be attributed to idle time on the CPU, and it's quite pronounced when running only 4 tasks.
-Finally, you're correct about the Windows power management. You can disable Turbo Boost by adjusting the max state from 100% to 99%. While increasing the minimum speed might help single tasks clock all the way up, it's ONLY SR5 that appears to do that. If I convert a video to H.264 or something, it's single-threaded and it goes clear up to 3.4 GHz.
Increasing the minimum speed would prevent it from downclocking to avoid overheating, which is something to be aware of when it's already running at 95 C fully loaded with either Turbo Boost or the GPU in use. Running 4 SR5 tasks, 1 ESP task, and the GPU on PPS SV, temps are 95 C on the CPU and 81 C on the GPU.
As it's an OEM laptop (HP ENVY DV7), my overclocking options are.. highly limited.
But if I could attach a desktop cooler somehow and get my thermals lower than external coolers currently do, I could run at 3.17 GHz more of the time.
Even when the GPU is inactive, it still cycles up and down running LLR.
|
|
I think the "spikeyness", as you call it, can be attributed to those 95 C CPU temps. Any CPU at those temps will throttle, causing CPU speeds to be all over the place. Check the voltage and look for sudden changes when running LLR; it should be stable once temps are below 90 C. If you can get temps down to 80-90 C, see if the downclocking behavior persists. If so, Windows 8.1 could be acting flaky and require an update, or maybe the HP BIOS settings are not right for long-term LLR computing. The T-junction temp for an Ivy Bridge PGA988B (PGA989) CPU is 105 C, and a BGA (soldered) CPU is rated at 100 C. Those temps are absolute maximums; at around 90 C Ivy Bridge will throttle hard. AVX LLR is potent and will heat up any CPU with weak or less than optimal cooling. Check the fan and heat sink for dust buildup, and prop up the back of the laptop for better airflow. (Also check whether the fan control setting can be reset in the BIOS, or whether there is a power management setting that allows fan control; maybe the fan is not running at full speed.)
|
|
I looked at that. My thermals only cause Turbo Boost to cycle the clock speed as it hits ~97 C (between 2.35 GHz (max design speed) and 3.17 GHz (max TB speed @ 4 cores))
If I disable it entirely, it runs at a steady 2.35 GHz (unless the GPU contributes to excess heat over 96 C )
I'm about to run a series of 24 tests using SR5 at 1 to 8 tasks: 8 with 2.35 GHz enforced, 8 with Turbo Boost enabled, and 8 with the GPU active on PPS SV, and see what data I get.
That should answer a lot of questions and nail down what task count is optimal under all 3 scenarios.
(I'm opting out of 8 tests with both Turbo Boost and the GPU active simultaneously, as that quickly pushes it below an average of 2.35 GHz and is therefore less efficient than the other 3 setups.)
|
|
I have the data for all 27 tests I ran. But it looks cluttered so I cut it down to the essentials:
Tasks  Avg time/bit  bits/s/core  Total bits/s  Temp           Clock speed
1      3.709 ms      269.6        269.6         55 C           2.35 GHz
1      2.770 ms      361.0        361.0         65 C           3.24 GHz*
2      3.770 ms      265.2        530.5         61 C           2.35 GHz
2      2.844 ms      351.6        703.2         77 C           3.19 GHz*
3      3.893 ms      256.8        770.6         69 C           2.35 GHz
3      3.115 ms      321.0        963.1         80 C - 90 C    2.35 - 3.17 GHz*
4      4.363 ms      229.2        916.8         76 C           2.35 GHz
4      3.894 ms      256.8        1027.2        81 C - 92 C    2.35 - 3.17 GHz*
5      5.424 ms      184.4        921.8         75 C           2.35 GHz
5~     4.776 ms      209.4        1046.9        82 C - 93 C    2.35 - 3.17 GHz*
6      6.321 ms      158.2        949.2         76 C           2.35 GHz
6      5.843 ms      171.1        1026.9        82 C - 93 C    2.26 - 3.17 GHz*
7~     7.166 ms      139.5        976.8         77 C           2.35 GHz
7      6.764 ms      147.8        1034.8        82 C - 93 C    2.23 - 3.17 GHz*
8      8.234 ms      121.4        971.6         78 C           2.35 - 2.38 GHz
8      7.786 ms      128.4        1027.5        81 C - 93 C    2.23 - 3.19 GHz*
* denotes use of Turbo Boost
~ leading configurations
Although I did not include the data for the GPU tests, the best result there was 933.7 total bits/s with 6 instances of SR5 at 96 C.
It's pretty clear that when it hits 99 C and throttles down to 2.06 GHz at 7 instances, the result would be lower, at 848.1.
The tests at a fixed 2.06 GHz follow the same curve as those at 2.35 GHz, favoring the 6 to 7 instance range.
For the 2.35 GHz tests, there is about a 3% to 4% gain in total task completion rate over 4 instances once you go past 5 tasks.
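To make the comparison against 4 instances explicit, here is a short Python sketch that recomputes the total rate for the fixed 2.35 GHz rows of the table and expresses each one as a percentage gain over the 4-task baseline. The names are hypothetical; the timings are copied from the table above.

```python
from __future__ import annotations

# Average time per bit (ms) for the fixed 2.35 GHz tests, keyed by task count.
FIXED_CLOCK_MS_PER_BIT = {4: 4.363, 5: 5.424, 6: 6.321, 7: 7.166, 8: 8.234}


def throughput(ms_per_bit: float, tasks: int) -> float:
    """Total bits per second across all simultaneous tasks."""
    return 1000.0 / ms_per_bit * tasks


def gain_over_baseline(times: dict[int, float], baseline_tasks: int = 4) -> dict[int, float]:
    """Percentage gain in total throughput relative to the baseline task count."""
    base = throughput(times[baseline_tasks], baseline_tasks)
    return {
        tasks: (throughput(ms, tasks) / base - 1.0) * 100.0
        for tasks, ms in times.items()
        if tasks != baseline_tasks
    }


for tasks, gain in gain_over_baseline(FIXED_CLOCK_MS_PER_BIT).items():
    print(f"{tasks} tasks: {gain:+.1f}% vs 4 tasks")
```

With these figures the gain works out to roughly 3.5% at 6 instances and about 6.5% at 7, so the exact percentage depends on which task count is being compared.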
For the full speed tests, there's a lot of variability, because it's hard to say for sure whether the higher speeds bottleneck the memory system more or whether it's simply harder to sustain a high clock speed with more instances.
Regardless of why, the faster variable speed tests favored a slightly lower number of tasks, with the best score at 5, while the fixed, low speed tests fared better at around 7.
For me, the best tasks per day should be at around 6 instances, but it's very close.
It is my theory that, relative to my system, those with faster processors will be more efficient closer to 4 instances, while those with higher speed RAM -might- lean more towards 6.
The one thing the data shows that is irrefutable is that there is no profound efficiency loss from running more than 4 SR5 LLR tasks.
There's likely more data to be collected from other systems, though.
|