Author |
Message |
John Honorary cruncher
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
|
llrAVX
Gary Craig, a member of Aggie The Pew, was successful in compiling an AVX version of LLR using gwnum v27.2 (ftp://mersenne.org/gimps/source272.zip) and LLR v3.8.6dev. We've been testing it for the past week with success. Depending on the LLR project, speed improvements range from 20% to 50%.
Currently, this can only be run using an app_info file and is available for Linux, MacIntel, and Windows. A CPU-only app_info file is included in the download that will allow you to participate in the following LLR projects:
- 321 Prime Search
- Cullen Prime Search
- Prime Sierpinski Problem
- Proth Prime Search
- Seventeen or Bust
- Sophie Germain Prime Search (UPDATE: Not working because of large k's. Must wait on update to gwnum)
- The Riesel Problem
- Woodall Prime Search
Before using app_info, please be sure to have all previous work completed and returned. Otherwise, all work will be lost. If someone would like to provide step by step instructions, I'll be happy to include them in this post. Additionally, if anyone wishes to create a full project app_info file (including GPU), that can be included in the package as well.
Download llrAVX with CPU ONLY app_info here: llr3.8.6dev_avx.7z
Ronald has app_info files on his site which include all PrimeGrid projects including the GPU projects. Specifically he has the following:
- All Linux32 CPU and CUDA projects; CPU ONLY; CPU and OpenCL
- All Linux64 CPU and CUDA projects; CPU ONLY; CPU and OpenCL
- All Windows CPU and CUDA projects; CPU ONLY; CPU and OpenCL
NOTE: Rename the files for your OS, move them into the appropriate BOINC folder, and then restart BOINC:
app_info_linux.xml --> app_info.xml
app_info_macintel.xml --> app_info.xml
app_info_windows.xml --> app_info.xml
and
llr3.8.6dev_linux_avx --> llravx
llr3.8.6dev_macintel_avx --> llravx
llr3.8.6dev_windows_avx.exe --> llravx.exe
This has been tested on Intel Sandy Bridge but should also work on AMD Bulldozer. [UPDATE] DOES NOT WORK on AMD Bulldozer. Currently, these are the only two CPUs that support AVX.
Attention: AVX is supported only on Windows 7 SP1 and later. WinXP does not have AVX support.
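The rename-and-copy step in the NOTE above could be scripted roughly as follows. This is only a sketch: the function name `install_llravx` and its arguments are made up for illustration, the file names are the ones listed in the NOTE, and the two folders (where you unpacked the archive, and your PrimeGrid project folder) are whatever applies on your machine. Finish and return all work and stop BOINC before running anything like this.

```python
import shutil
from pathlib import Path

# Rename map copied from the NOTE: packaged file name -> name BOINC expects.
RENAMES = {
    "linux": {
        "app_info_linux.xml": "app_info.xml",
        "llr3.8.6dev_linux_avx": "llravx",
    },
    "macintel": {
        "app_info_macintel.xml": "app_info.xml",
        "llr3.8.6dev_macintel_avx": "llravx",
    },
    "windows": {
        "app_info_windows.xml": "app_info.xml",
        "llr3.8.6dev_windows_avx.exe": "llravx.exe",
    },
}


def install_llravx(os_key, unpacked_dir, project_dir):
    """Copy the files for one OS into the project folder under their new names.

    Hypothetical helper: both directories are supplied by the caller.
    """
    copied = []
    for src_name, dst_name in RENAMES[os_key].items():
        dst = Path(project_dir) / dst_name
        shutil.copy2(Path(unpacked_dir) / src_name, dst)
        if dst.suffix != ".xml":
            dst.chmod(0o755)  # the LLR binary has to be executable
        copied.append(dst)
    return copied
```

Call it with "linux", "macintel" or "windows" as the first argument, then restart BOINC.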
NOTE: This can be used in PRPNet as well. 5.0.3 has just been released with the llrAVX builds included. [UPDATE] 5.0.4 has been released.
The speed up is due to George Woltman's update to gwnum v27.2. Thank you!!!
Special thanks to Gary Craig for the build and to Lennart Vogel for the testing! :)
[edit] Credit to Iain Bethune for the MacIntel build.
[edit] Credit to Rebirther for the Windows build.
[edit] Credit to Ronald for app_info files.
____________
|
|
|
John Honorary cruncher
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
|
Added a MacIntel_AVX build provided by Iain Bethune.
If someone could test it and provide feedback, that would be very much appreciated.
____________
|
|
|
|
I can't wait to test it with PPSElow in PRPNet on Windows. Hoping for the Windows app soon :)
Edit:
It could be interesting to use it with sr5 instead of pfgw. A little boost perhaps. |
|
|
Sysadm@Nbg Volunteer moderator Volunteer tester Project scientist
Joined: 5 Feb 08 Posts: 1224 ID: 18646 Credit: 877,929,236 RAC: 321,810
|
tested it with prpnet and PPSElow (on i7-2600K hyperthreading on)
client 1 with stock app
[2012-01-09 17:25:07 CET] Server: PPSElow, Candidate: 4875*2^302003+1 Program: llr Residue: 432AFFCBD94A9016 Time: 103 seconds
client 2 with the AVX build
[2012-01-09 17:55:31 CET] Server: PPSElow, Candidate: 3747*2^302007+1 Program: llr Residue: D11ED507C8F83687 Time: 66 seconds
nice :D
very, very well done :thumbup:
____________
Sysadm@Nbg
my current lucky number: 113856050^65536 + 1
PSA-PRPNet-Stats-URL: http://u-g-f.de/PRPNet/
|
|
|
|
I just tried to test it with Mac...
I copied the 2 files to /Library/Application Support/Boinc Data/projects/www.primegrid.com
When I tried to open Boinc again I get the following error message:
"BOINC ownership or permissions are not set properly, please reinstall BOINC. (Error code -1202)"
Could anybody tell me what I did wrong, please?
EDIT: Figured it out myself, had to run the Boinc installer.app again... |
|
|
|
tested it with prpnet and PPSElow (on i7-2600K hyperthreading on)
client 1 with stok app
[2012-01-09 17:25:07 CET] Server: PPSElow, Candidate: 4875*2^302003+1 Program: llr Residue: 432AFFCBD94A9016 Time: 103 seconds
client 2 with the AVX build
[2012-01-09 17:55:31 CET] Server: PPSElow, Candidate: 3747*2^302007+1 Program: llr Residue: D11ED507C8F83687 Time: 66 seconds
nice :D
very, very well done :thumbup:
That's what I thought ^^ |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Wow.
I know I'm not adding anything particularly useful with this comment, but I just had to say wow.
I have not really regretted having nothing faster than a Core2Quad until now.
Wow.
____________
My lucky number is 75898524288+1 |
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,676,503 RAC: 2,272,243
|
Great news!
I'm really awaiting Win app.
____________
My stats |
|
|
|
I tested in a 64-bit Ubuntu VMware guest and can't see any difference in speed between the old llr and the AVX build. I added the names of both llr binaries to the ini file. What's wrong? I also see 2-3 instances instead of only 1. |
|
|
|
This performance boost will only be seen if you have a Sandy Bridge or Bulldozer CPU... right?
____________
|
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Great news!
I'm really awaiting Win app.
About 15 quads of different kind (Q9550, E5420, E5520, E6530, i5 760, i5 2500, i7 920/HT) to test performance boost.
Unless I'm mistaken, AVX is a new set of SIMD instructions added to Intel's 2nd generation Core i (i.e., Sandy Bridge) and (???) AMD's new Bulldozer CPUs.
The AVX improvements aren't applicable to older CPUs. I think the only CPU you listed that has AVX instructions is the i5 2500. Performance on all the other CPUs should be unchanged, unless there are other improvements in the code.
____________
My lucky number is 75898524288+1 |
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,676,503 RAC: 2,272,243
|
Yes, only Sandy Bridge and Bulldozer.
http://en.wikipedia.org/wiki/Advanced_Vector_Extensions
____________
My stats |
|
|
|
Oh that bites. I don't own one of those.
____________
|
|
|
|
Second test in VMware. The VM doesn't show the AVX improvement. Looks like only native Linux does. |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Second test in vmware. The vm doesnt affect the avx improvement. Looks like only a native linux does it.
Are you running this on a Sandy Bridge CPU?
____________
My lucky number is 75898524288+1 |
|
|
|
Second test in vmware. The vm doesnt affect the avx improvement. Looks like only a native linux does it.
Are you running this on a Sandy Bridge CPU?
Indeed, i5-2500k@4Ghz |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
Second test in vmware. The vm doesnt affect the avx improvement. Looks like only a native linux does it.
Are you running this on a Sandy Bridge CPU?
Indeed, i5-2500k@4Ghz
Although the VM doesn't emulate the CPU, it *might* emulate the CPU identification.
It's possible that if the VM isn't AVX aware, programs running in the VM might, therefore, not be aware that the CPU supports AVX because the VM isn't supplying the correct CPU abilities.
There are two possible fixes I can think of:
1) Maybe gwnum (and llr) have a switch that will force it to use AVX.
2) See if VMWare has a newer release that supports AVX.
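Before trying either fix, it may help to check what the guest OS itself reports. The following is a minimal sketch for a Linux x86 guest, which lists CPU feature flags on the "flags" line of /proc/cpuinfo. The function names are made up for illustration, and this only shows what the OS sees; it does not tell you what gwnum decides to use.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx(cpuinfo_path="/proc/cpuinfo"):
    """True if the OS (or the VM it runs in) advertises the avx flag."""
    with open(cpuinfo_path) as fh:
        # Set membership, so a flag such as "avx2" is not mistaken for "avx".
        return "avx" in cpu_flags(fh.read())


if __name__ == "__main__":
    print("AVX visible to this OS:", has_avx())
```

If this prints False inside the VM but True on a native boot of the same machine, the VM is most likely hiding the feature.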
____________
My lucky number is 75898524288+1 |
|
|
|
Oh interesting about the VMware. I was just thinking of setting up a linux VM tonight to try this out instead of installing linux. (until we can get a native windows build)
Hopefully the VM issues can be resolved!
____________
|
|
|
|
Added a MacIntel_AVX build provided by Iain Bethune.
If someone could test it and provide feedback, that would be very much appreciated.
I am running an intel iMac, and will give it a go...
|
|
|
|
Oh interesting about the VMware. I was just thinking of setting up a linux VM tonight to try this out instead of installing linux. (until we can get a native windows build)
Hopefully the VM issues can be resolved!
Got it! 31-36sec for ppselow ;)
Tomorrow I will test a sr5 with it. |
|
|
|
Oh interesting about the VMware. I was just thinking of setting up a linux VM tonight to try this out instead of installing linux. (until we can get a native windows build)
Hopefully the VM issues can be resolved!
Got it! 31-36sec for ppselow ;)
Tomorrow I will test a sr5 with it.
Out of curiosity, what did you have to do to get it to work in a VM? I have a Windows (64-bit) Sandy Bridge box where running Linux natively isn't really an option (it's used for production work by family members) and I was thinking the exact same thing about trying it in a VM. My VM software of choice is typically VirtualBox, but I'm guessing whatever you did to make it work in VMWare might work in VirtualBox as well. |
|
|
|
What's the word on a Windows 64-bit app? I'd love to try this out on my FX-8150. |
|
|
|
Oh interesting about the VMware. I was just thinking of setting up a linux VM tonight to try this out instead of installing linux. (until we can get a native windows build)
Hopefully the VM issues can be resolved!
Got it! 31-36sec for ppselow ;)
Tomorrow I will test a sr5 with it.
Out of curiosity, what did you have to do to get it to work in a VM? I have a Windows (64-bit) Sandy Bridge box where running Linux natively isn't really an option (it's used for production work by family members) and I was thinking the exact same thing about trying it in a VM. My VM software of choice is typically VirtualBox, but I'm guessing whatever you did to make it work in VMWare might work in VirtualBox as well.
VirtualBox doesn't work. If you have VMware Workstation (it looks like you can convert all versions), you need to convert the current machine.
VM -> Manage -> Change Hardware Compatibility, then choose "compatible with ESX Server". That worked for me. |
|
|
|
I have tested these so far (before / after):
PPSElow: 58 s / 30-36 s
SGS: 9 min / 4 min
TPS: 3.5 min / 1.5 min
This version runs like hell.
Reference: i5-2500k@4Ghz |
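For anyone comparing these timings with the "20% to 50%" figure in the first post, here is a small sketch that turns before/after times into a speed factor and a percentage of time saved. The numbers are the ones reported in this post; the function name is just for illustration. The two ways of expressing the gain differ, which is worth keeping in mind when people quote "X% faster".

```python
def speedup(before_s, after_s):
    """Return (speed factor, percent less time per test)."""
    factor = before_s / after_s
    time_saved_pct = 100.0 * (before_s - after_s) / before_s
    return factor, time_saved_pct


# Timings in seconds, taken from the post above (i5-2500K @ 4 GHz).
timings = {
    "PPSElow (slow end)": (58, 36),
    "PPSElow (fast end)": (58, 30),
    "SGS": (9 * 60, 4 * 60),
    "TPS": (3.5 * 60, 1.5 * 60),
}

for name, (before, after) in timings.items():
    factor, saved = speedup(before, after)
    print(f"{name}: {factor:.2f}x as fast, {saved:.0f}% less time")
```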
|
|
|
One thing I have noticed over the last week-plus with the AVX version is increased heat output... the AVX version runs 2C-3C hotter than the stock app (at least on my box, under full load of 8 LLR threads on a 2600K). If you have been running o/c'd to your thermal limits before, you should definitely check your temperatures now under full load. That being said, my box doesn't have the greatest ventilation, so perhaps it is a localized phenomenon.
--Gary
____________
"I am he as you are he as you are me and we are all together"
87*2^3496188+1 is prime! (1052460 digits)
4 is not prime! (1 digit) |
|
|
|
One thing I have noticed over the last week-plus with the AVX version is increased heat output... the AVX version runs 2C-3C hotter than the stock app (at least on my box, under full load of 8 LLR threads on a 2600K). If you have been running o/c'd to your thermal limits before, you should definitely check your temperatures now under full load. That being said, my box doesn't have the greatest ventilation, so perhaps it is a localized phenomenon.
--Gary
Thank you very much for your work.
I see the 2-3°C temperature increase too (4 LLR threads on a 2500K).
____________
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
|
The increased temperatures with AVX-code are nothing new for me.
According to some CPU tests with SB and SB-E (3960X) in the German computer magazine c't, AVX increases the total system power draw by about 30W.
They measured 214W with an AVX-optimized linpack for a 3960X-system with a small GPU, SSD and 80plus power-supply while the same system needed only 185W without AVX-code.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
One thing I have noticed over the last week-plus with the AVX version is increased heat output... the AVX version runs 2C-3C hotter than the stock app (at least on my box, under full load of 8 LLR threads on a 2600K). If you have been running o/c'd to your thermal limits before, you should definitely check your temperatures now under full load. That being said, my box doesn't have the greatest ventilation, so perhaps it is a localized phenomenon.
--Gary
I was actually going to ask if anyone was seeing increased temps. You *should* see increased temperatures and power consumption. Those AVX instructions let the CPU do more at once, and doing things consumes power.
____________
My lucky number is 75898524288+1 |
|
|
|
One thing I have noticed over the last week-plus with the AVX version is increased heat output... the AVX version runs 2C-3C hotter than the stock app (at least on my box, under full load of 8 LLR threads on a 2600K). If you have been running o/c'd to your thermal limits before, you should definitely check your temperatures now under full load. That being said, my box doesn't have the greatest ventilation, so perhaps it is a localized phenomenon.
--Gary
I was actually going to ask if anyone was seeing increased temps. You *should* see increased temperatures and power consumption. Those AVX instructions let the CPU do more at once, and doing things consumes power.
Yes, checked with Core Temp: around 4°C and 12W more than before (CPU only). |
|
|
|
Newbie question: what should I do with the files on Linux? I mean, where should I put them?
thanks
Ubuntu 11.10
____________
676754^262144+1 is prime |
|
|
|
Newbie question: what should I do with the files on linux? I mean, where shuld I put them?
thanks
Ubuntu 11.10
Copy the new llr into your prpnet1/2.. folders; the easiest way is to rename it to llr. |
|
|
|
Newbie question: what should I do with the files on linux? I mean, where shuld I put them?
thanks
Ubuntu 11.10
Copy the new llr into your prpnet1/2.. folder, the best way is to rename it into llr.
My mistake,
Had done that in prpnet. I meant what should I do in Boinc... |
|
|
|
Newbie question: what should I do with the files on linux? I mean, where shuld I put them?
thanks
Ubuntu 11.10
Copy the new llr into your prpnet1/2.. folder, the best way is to rename it into llr.
My mistake,
Had done that in prpnet. I meant what should I do in Boinc...
Just copy both Linux files into the PrimeGrid folder and restart BOINC. |
|
|
|
Newbie question: what should I do with the files on linux? I mean, where shuld I put them?
thanks
Ubuntu 11.10
Copy the new llr into your prpnet1/2.. folder, the best way is to rename it into llr.
My mistake,
Had done that in prpnet. I meant what should I do in Boinc...
Only copy both linux files into the primegrid folder and restart BOINC.
If you're running in Ubuntu 11.10 AND you've installed the repository version from synaptic, then the folder you're looking for is:
/var/lib/boinc-client/projects/www.primegrid.com
You also need admin access via sudo to copy the necessary files. As soon as you've copied them, you also have to close and restart the client, otherwise you'll get computation errors.
BTW, if you've got a CUDA/OpenCL capable GPU, don't bother: you won't receive any more tasks for it, so you'll run much slower than before.
____________
Choose, and act.
|
|
|
|
I posted this also in news:
Compiled a win app successfully with latest code:
Download-link
Pls test also with BOINC, only run ppselow at the moment. |
|
|
|
I guess this is for sandy bridge systems only? |
|
|
|
I guess this is for sandy bridge systems only?
As John posted here SB + Bulldozer only. |
|
|
|
Pls test also with BOINC, only run ppselow at the moment.
How, please... |
|
|
|
Pls test also with BOINC, only run ppselow at the moment.
How, please...
If you have Windows, copy the app_info.xml from the other thread + this llr.exe into the PrimeGrid folder and restart BOINC. That's all ;) |
|
|
|
Pls test also with BOINC, only run ppselow at the moment.
How, please...
If you have windows, copy the app_info.xml from other thread + this llr.exe into primegrid folder and restart BOINC. Thats all ;)
The app_info.xml files are written for Linux and Mac. The filenames in them do not match Windows, I believe.
Anyway, it runs very well on prpnet. Amazing increase in speed.
Thanks |
|
|
|
Pls test also with BOINC, only run ppselow at the moment.
How, please...
If you have windows, copy the app_info.xml from other thread + this llr.exe into primegrid folder and restart BOINC. Thats all ;)
The app_info.xml are written for linux and mac. Filenames in it do not match windows, I believe.
Anyway, it runs very well on prpnet. Amazing increase in speed.
Thanks
It was updated for win http://primegrid.pytalhost.net/ as well. |
|
|
|
Windows version using 5.0.3 and AVX
I tried it with PRP-PSP on a 2500k and it seems to work, but I am not sure if and how much improvement it gives. |
|
|
|
Windows version using 5.0.3 and AVX
I tried it with PRP-PSP on a 2500k and it seems to work, but I am not sure if and how much improvement it gives.
If you have run it with the older llr, you can check your time values in test_results.log. |
|
|
John Honorary cruncher
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
|
I posted this also in news:
Compiled a win app successfully with latest code:
Download-link
Pls test also with BOINC, only run ppselow at the moment.
Archive in first post has been updated to include this build. A CPU only app_info file for windows is also included.
NOTE: Rename the files for your OS and move them into the appropriate BOINC folder and then restart BOINC:
app_info_linux64.xml --> app_info.xml
app_info_macintel.xml --> app_info.xml
app_info_windows.xml --> app_info.xml
and
llr3.8.6dev_linux64_avx --> llravx
llr3.8.6dev_macintel_avx --> llravx
llr3.8.6dev_windows_avx.exe --> llravx.exe
Attention: AVX is supported only on Windows 7 SP1 and later. WinXP does not have AVX support.
____________
|
|
|
|
Windows 7Pro64 i7-2700K stock speed (3.5GHz)
PRPNet 502 PPSElow2 was taking 67sec per unit and with test app 32-37sec per unit :)
I hope it is OK to continue to use this version.
____________
35 x 2^3587843+1 is prime! |
|
|
|
Hmm, I'm feeling stupid here. I can't seem to get this working with PRPNet. At least, I see zero change/improvement.
I am downloading (I think) the correct llr.exe file, putting it into my individual prpnet subfolder, running the task, and trying to benchmark on PPSElow...
before and after, I'm getting roughly 52 seconds per WU. (0.170ms per bit)
i7-3930k @ 4.2Ghz hyperthreading on.
I haven't tried any experiments in BOINC with app_info.xml yet..
____________
|
|
|
|
hmm I'm feeling stupid here. I can't seem to get this working with PRPnet. -- Atleast, I have zero change/improvement.
I am downloading (i think) the correct llr.exe file, and i put it into my individual prpnet sub folder, and run the task, and trying to benchmark on PPSElow...
before and after, I'm getting roughly 52 seconds per WU. (0.170ms per bit)
i7-3930k @ 4.2Ghz hyperthreading on.
I haven't tried any experiments in BOINC with app_info.xml yet..
Is it this box?
ID: 235836
Microsoft Windows 7
Ultimate x64 Edition, (06.01.7600.00)
Windows 7 SP 1 is required for AVX support.
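The version string on the host page already gives this away: build 7600 is Windows 7 without a service pack, and SP1 is build 7601. A small sketch of that check follows, assuming the string has the "major.minor.build.xx" form shown above; the function names are made up for illustration. It says nothing about whether the CPU itself has AVX.

```python
def parse_windows_version(text):
    """Parse a string such as '06.01.7600.00' into (major, minor, build)."""
    major, minor, build = (int(part) for part in text.split(".")[:3])
    return major, minor, build


def os_new_enough_for_avx(text):
    """True if the version is Windows 7 SP1 (6.1 build 7601) or later."""
    return parse_windows_version(text) >= (6, 1, 7601)


print(os_new_enough_for_avx("06.01.7600.00"))  # the host quoted above
print(os_new_enough_for_avx("06.01.7601.00"))  # same OS after SP1
```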
____________
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,676,503 RAC: 2,272,243
|
I took some smaller prime from http://prpnet.mine.nu:12000/user_primes.html
7745*2^245547+1 Prime PC-Gary 2011-08-18 19:10:28 GMT 73922
Time is fine, but no prime found. Bad luck?
Running i5-2500 on Win 2008R2 SP1
Using LLR 3.8.6
7745*2^245547+1 is not prime. Proth RES64: 69C444D105D03865 Time : 53.775 sec.
using LLRavx
Iter: 51/245559, ERROR: ROUND OFF (2.812417676e+013) > 0.4
Continuing from last save file.
Disregard last error. Result is reproducible and thus not a hardware problem.
For added safety, redoing iteration using a slower, more reliable method.
Continuing from last save file.
Iter: 51/245559, ERROR: ROUND OFF (6.796424524e+013) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
7745*2^245547+1 is not prime. Proth RES64: 69C444D105D03865 Time : 29.542 sec.
____________
My stats |
|
|
|
I took some smaller prime from http://prpnet.mine.nu:12000/user_primes.html
7745*2^245547+1 Prime PC-Gary 2011-08-18 19:10:28 GMT 73922
Time is fine, but no prime found. Bad luck?
Running i5-2500 on Win 2008R2 SP1
Using LLR 3.8.6
7745*2^245547+1 is not prime. Proth RES64: 69C444D105D03865 Time : 53.775 sec.
using LLRavx
Iter: 51/245559, ERROR: ROUND OFF (2.812417676e+013) > 0.4
Continuing from last save file.
Disregard last error. Result is reproducible and thus not a hardware problem.
For added safety, redoing iteration using a slower, more reliable method.
Continuing from last save file.
Iter: 51/245559, ERROR: ROUND OFF (6.796424524e+013) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
7745*2^245547+1 is not prime. Proth RES64: 69C444D105D03865 Time : 29.542 sec.
Unable to replicate that experience. -q option does not work with llr (or I'm unable to use it properly).
|
|
|
|
Using the app_info file with BOINC disables CUDA work (PPS and GCW sieves).
Times in PPS LLR are reduced by 1/3 with an Intel i5-2500K. But I'm still waiting for validation. If Honza's results were just a bad coincidence, llravx is really a great improvement.
[edit] First task validated. Wingman is using "normal" llr:
http://www.primegrid.com/workunit.php?wuid=240115427 |
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,676,503 RAC: 2,272,243
|
Unable to replicate that experience. -q option does not work with llr (or I'm unable to use it properly).
llravx -d -q"7745*2^245547+1"
Checked other PC-Gary primes and those are OK (reported as primes). I guess the one mentioned is a false prime? PFGW64 says so as well (composite).
____________
My stats |
|
|
|
I get units 40% faster in comparison to non-avx units. Only downside, a lot more heat and a lot more power consumption on my 2500k
LLR non-avx: 18min. in boinc
LLR avx: 11min. in boinc |
|
|
Crun-chi Volunteer tester
Joined: 25 Nov 09 Posts: 3233 ID: 50683 Credit: 151,443,349 RAC: 73,965
|
C:\Users\I\Desktop>llr -d -q"7745*2^245547+1"
Starting Proth prime test of 7745*2^245547+1
Using all-complex AMD K10 type-1 FFT length 24K, Pass1=96, Pass2=256, a = 3
7745*2^245547+1 is not prime. Proth RES64: 69C444D105D03865 Time : 76.045 sec.
____________
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
2022202116^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie! |
|
|
Crun-chi Volunteer tester
Joined: 25 Nov 09 Posts: 3233 ID: 50683 Credit: 151,443,349 RAC: 73,965
|
I get units 40% faster in comparison to non-avx units. Only downside, a lot more heat and a lot more power consumption on my 2500k
LLR non-avx: 18min. in boinc
LLR avx: : 11min. in boinc
One must sacrifice energy and heat production for faster results :)
____________
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
2022202116^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie! |
|
|
|
Testing some of my old primes (the llr here is llravx).
\llr>llr -d -q"6507*2^286292+1"
Starting Proth prime test of 6507*2^286292+1
Using all-complex AVX Core2 type-3 FFT length 20K, Pass1=256, Pass2=80, a = 5
6507*2^286292+1 is prime! Time : 28.918 sec.
But also this:
\llr>llr -d -q"1769*2^274741+1"
Starting Proth prime test of 1769*2^274741+1
Using all-complex AVX FFT length 18K, a = 3
Iter: 51/274751, ERROR: ROUND OFF (5.569146551e+014) > 0.4
Continuing from last save file.
Resuming Proth prime test of 1769*2^274741+1 at bit 2 [0.00%]
Using all-complex AVX FFT length 18K, a = 3
Disregard last error. Result is reproducible and thus not a hardware problem.
For added safety, redoing iteration using a slower, more reliable method.
Continuing from last save file.
Resuming Proth prime test of 1769*2^274741+1 at bit 51 [0.01%]
Using all-complex AVX FFT length 18K, a = 3
Iter: 51/274751, ERROR: ROUND OFF (6.922030492e+014) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
Starting Proth prime test of 1769*2^274741+1
Using all-complex AVX Core2 type-3 FFT length 20K, Pass1=256, Pass2=80, a = 3
1769*2^274741+1 is prime! Time : 25.878 sec.
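For readers wondering what the "Proth prime test ... a = 3" lines in these logs refer to: by Proth's theorem, N = k*2^n+1 (k odd, k < 2^n) is prime if a^((N-1)/2) = -1 (mod N) for some base a. The sketch below only shows that arithmetic with Python's exact integers. It is not how LLR works internally: LLR does the squarings with gwnum's floating-point FFT multiplication, which is where the FFT lengths and round-off checks in the logs come from. The function name is made up for illustration.

```python
def proth_test(k, n, a):
    """Return True if a^((N-1)/2) == -1 (mod N) for N = k*2^n + 1.

    A True result proves N prime (assuming k is odd and k < 2^n).
    A False result only shows N is composite when a was chosen so that
    the Jacobi symbol (a/N) is -1, which this sketch does not check.
    """
    N = k * (1 << n) + 1
    return pow(a, (N - 1) // 2, N) == N - 1


print(proth_test(3, 2, 2))  # N = 13
print(proth_test(1, 4, 3))  # N = 17
print(proth_test(1, 3, 2))  # N = 9
```

Candidates of the size tested in this thread should work the same way in principle, but pure Python would be far slower than LLR.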
|
|
|
|
ID: 235836
Microsoft Windows 7
Ultimate x64 Edition, (06.01.7600.00)
Windows 7 SP 1 is required for AVX support.
Thanks, that's most likely it. It's a new build and I haven't gotten it fully updated yet. I'll do that now!
____________
|
|
|
|
Just a thought. If the AVX build is functional for all processors, even if it only gives a benefit to those which support AVX, why not issue the AVX build as a general release?
I'm prompted inter alia to raise the question by a team mate who notes that I've AVX compatible processors but am still running the old LLR. I think my explanation about w*rk, rats, family and life in general getting in the way fell on stony ground.
____________
Oh Bondage? Up Yours.
http://www.youtube.com/watch?v=ogypBUCb7DA
|
|
|
|
ID: 235836
Microsoft Windows 7
Ultimate x64 Edition, (06.01.7600.00)
Windows 7 SP 1 is required for AVX support.
Thanks, thats most likely it. Its a new build and I haven't gotten it fully updated yet. I'll do that now!
That did it! Thank you for pointing this out for me.
For PPSElow, when running only up to 6 cores, I was previously doing roughly 54 seconds per WU; now I am doing 32 seconds.
When running a full 12 WUs, I was previously at around 94 seconds, and now it's around 65 seconds.
____________
|
|
|
|
Just a thought. If the AVX build is functional for all processors even if it does only gives a benefit to those which support AVX why not issue the AVX build as a general release?
I'm prompted [url]inter alia[/url] to raise the question by a team mate who notes that I've AVX compatible processors but am still running the old LLR. I think my explanation about w*rk, rats, family and life in general getting in the way fell on stoney ground.
We need to do some more tests first on different host settings.
Lennart
|
|
|
|
Just a thought. If the AVX build is functional for all processors even if it does only gives a benefit to those which support AVX why not issue the AVX build as a general release?
I'm prompted [url]inter alia[/url] to raise the question by a team mate who notes that I've AVX compatible processors but am still running the old LLR. I think my explanation about w*rk, rats, family and life in general getting in the way fell on stoney ground.
We need todo some more test first on different host settings.
Lennart
When this announcement was first made it didn't specify what AVX was and on which processors it was intended to run. Therefore I jumped right in and tried it on this Q6600 running Linux and got this result and this one and this one.
Now I am NOT very swift when it comes to Linux so I may easily have messed up the installation, but if not then I would have to say "no" this app isn't functional for all processors. |
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
|
Just a thought. If the AVX build is functional for all processors even if it does only gives a benefit to those which support AVX why not issue the AVX build as a general release?
I'm prompted [url]inter alia[/url] to raise the question by a team mate who notes that I've AVX compatible processors but am still running the old LLR. I think my explanation about w*rk, rats, family and life in general getting in the way fell on stoney ground.
We need todo some more test first on different host settings.
Lennart
When this announcement was first made it didn't specify what AVX was and on which processors it was intended to run. Therefore I jumped right in and tried it on this Q6600 running Linux and got this result and this one and this one.
Now I am NOT very swift when it comes to Linux so I may easily have messed up the installation, but if not then I would have to say "no" this app isn't functional for all processors.
Maybe you can try my app. I compiled it inside a virtual machine with Lubuntu11.10_32bit.
Gary tested my app on his Core i7 2600K with Linux64 and the app was ~1% slower than his avx-compilation on the same host.
I tested my app also in linux32/64 on a Core2Duo and Core2Quad without problems.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
Crun-chi Volunteer tester
Joined: 25 Nov 09 Posts: 3233 ID: 50683 Credit: 151,443,349 RAC: 73,965
|
I tried it on my AMD 960T and saw no time difference (that was expected), but all results are valid. I did not try it on Linux, Windows only.
In some post I read that AVX is supported by Windows 7 SP1, so maybe that is the problem? Maybe Linux doesn't have support for AVX?
____________
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
2022202116^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie! |
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
|
I try it on my AMD 960T and have no time difference ( that was expected) but all results are valid. Didnot try on Linux, windows only..
I some post I read that AVX is supported by Windows 7 SP1, so maybe that is problem? Linux maybe dont have support for AVX?
AVX has been supported since kernel version 2.6.30, which was released on 9 June 2009...
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
If the AVX build is functional for all processors why not issue the AVX build as a general release?
We need to do some more test first on different host settings.
Lennart
I may easily have messed up the installation, but if not then I would have to say "no" this app isn't functional for all processors.
Maybe you can try my app.
Sure rroonnaalldd, I'll give it a try but I'm on my way out for at least 6 hours or so. If you want, either post a link to an archive with app(s) and app_info file (or PM me) and I'll give it a shot when I get back.
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
|
If the AVX build is functional for all processors why not issue the AVX build as a general release?
We need to do some more test first on different host settings.
Lennart
I may easily have messed up the installation, but if not then I would have to say "no" this app isn't functional for all processors.
Maybe you can try my app.
Sure rroonnaalldd, I'll give it a try but I'm on my way out for at least 6 hours or so. If you want, either post a link to an archive with app(s) and app_info file (or PM me) and I'll give it a shot when I get back.
Sorry, I forgot to post the link to the thread "App_info file"...
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
Well, that was a lot of work for nothing. 8-((
Boinc started normally, no errors, and found the app_info file. It downloaded a work unit and began to run it. But just like before it finished in 3 or 4 seconds so it's obviously still not working.
Sorry, I tried but . . .
[edit]
As I said before, I'm not the best Linux person in the world so I may have messed something up, but as I said I got no error messages and it tried to run so ?
[/edit] |
|
|
|
I'm also not that skilled in Linux, but that should not affect the application. I did not see any Core i or Bulldozer CPUs among your hosts, so none of your CPUs uses the new AVX instruction set.
Regards Odi
____________
|
|
|
|
I'm also not that skilled in Linux, but that should not affect the application. I did not see any Core i or Bulldozer CPUs among your hosts, so none of your CPUs uses the new AVX instruction set.
Regards Odi
Hi Odi,
This was to help by testing the statement made somewhere above that the app may be compatible with older processors, albeit with no speed improvement. Some have had luck with that. I guess the hope is that someday soon it may be sent to everyone by default, without the app_info file and all the manual work that creates. |
|
|
Crun-chi Volunteer tester
Joined: 25 Nov 09 Posts: 3233 ID: 50683 Credit: 151,443,349 RAC: 73,965
                         
|
tested it with prpnet and PPSElow (on i7-2600K hyperthreading on)
client 1 with stock app
[2012-01-09 17:25:07 CET] Server: PPSElow, Candidate: 4875*2^302003+1 Program: llr Residue: 432AFFCBD94A9016 Time: 103 seconds
client 2 with the AVX build
[2012-01-09 17:55:31 CET] Server: PPSElow, Candidate: 3747*2^302007+1 Program: llr Residue: D11ED507C8F83687 Time: 66 seconds
nice :D
very, very well done :thumbup:
Please tell me at what freq are those results?
OC?
____________
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
2022202116^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie! |
|
|
|
a minor problem...
1/11/2012 7:13:02 PM | PrimeGrid | File referenced in app_info.xml does not exist: primegrid_llr_wrapper_6.10_windows_intelx86.exe
1/11/2012 7:13:02 PM | PrimeGrid | File referenced in app_info.xml does not exist: llr.ini.6.00
an interesting one to say the least. Do I need to download these files as well?
The problem was solved, but now I just crashed 8 Cullen WUs in about 30 seconds each.
That doesn't look like a proper Cullen time, does it? |
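The "File referenced in app_info.xml does not exist" messages above can be caught before restarting BOINC. A minimal Python sketch, assuming the usual anonymous-platform layout in which each file is declared as <file_info><name>...</name></file_info> and app_info.xml sits in the same project directory as the files it names; missing_files is a hypothetical helper, not something BOINC ships:

```python
import os
import xml.etree.ElementTree as ET
from typing import List


def missing_files(app_info_path: str) -> List[str]:
    """Return the <file_info> names that are absent from app_info.xml's directory."""
    project_dir = os.path.dirname(os.path.abspath(app_info_path))
    root = ET.parse(app_info_path).getroot()
    missing = []
    # Only top-level <file_info> entries are inspected (assumed layout).
    for name in root.iterfind("./file_info/name"):
        filename = (name.text or "").strip()
        if filename and not os.path.exists(os.path.join(project_dir, filename)):
            missing.append(filename)
    return missing
```

Calling missing_files with the path to app_info.xml in the PrimeGrid project folder should return an empty list before BOINC is restarted; any name it returns needs to be downloaded or renamed first.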
|
|
|
I can confirm Honza's experience regarding 7745*2^245547+1 being not prime despite being listed on the PRPnet prime list. However, it looks like this has nothing to do with the AVX build as the stock LLR app (3.8.6) also reports the number as composite, with the same residue as the AVX build does. Plus there was another post about a confirmation of "composite" with pfgw64.
Ironic that this is the very first "prime" listed on the PPS Low port prime list page. I ran a few others randomly (both with 3.8.6 and LLRavx, using llr from the command line) and the primality was confirmed each time. Huh.
--Gary
I took some smaller prime from http://prpnet.mine.nu:12000/user_primes.html
7745*2^245547+1 Prime PC-Gary 2011-08-18 19:10:28 GMT 73922
Time is fine, but no prime found. Bad luck?
Running i5-2500 on Win 2008R2 SP1
Using LLR 3.8.6
7745*2^245547+1 is not prime. Proth RES64: 69C444D105D03865 Time : 53.775 sec.
using LLRavx
Iter: 51/245559, ERROR: ROUND OFF (2.812417676e+013) > 0.4
Continuing from last save file.
Disregard last error. Result is reproducible and thus not a hardware problem.
For added safety, redoing iteration using a slower, more reliable method.
Continuing from last save file.
Iter: 51/245559, ERROR: ROUND OFF (6.796424524e+013) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
7745*2^245547+1 is not prime. Proth RES64: 69C444D105D03865 Time : 29.542 sec.
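The two result lines above (stock LLR and LLRavx) can be compared mechanically: same candidate, same RES64, different time. A small Python sketch, assuming the "is not prime" line format exactly as pasted in this post; the names parse_result and compare are illustrative only, and lines reporting a prime are not handled:

```python
import re

# Matches lines like:
# 7745*2^245547+1 is not prime. Proth RES64: 69C444D105D03865 Time : 53.775 sec.
LINE_RE = re.compile(
    r"^(?P<n>\S+) is not prime\.\s+\S+ RES64: (?P<res>[0-9A-F]{16})"
    r"\s+Time : (?P<sec>\d+(?:\.\d+)?) sec\."
)


def parse_result(line):
    """Return (candidate, residue, seconds) from one composite result line."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError("unrecognised result line: %r" % line)
    return m.group("n"), m.group("res"), float(m.group("sec"))


def compare(stock_line, avx_line):
    """Return (residues_match, stock_time / avx_time) for the same candidate."""
    n1, r1, t1 = parse_result(stock_line)
    n2, r2, t2 = parse_result(avx_line)
    if n1 != n2:
        raise ValueError("lines refer to different candidates")
    return r1 == r2, t1 / t2


if __name__ == "__main__":
    stock = "7745*2^245547+1 is not prime. Proth RES64: 69C444D105D03865 Time : 53.775 sec."
    avx = "7745*2^245547+1 is not prime. Proth RES64: 69C444D105D03865 Time : 29.542 sec."
    print(compare(stock, avx))
```

For the lines quoted here the residues match and the time ratio comes out at roughly 1.8, even though the AVX run had to restart with a larger FFT length.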
|
|
|
|
Just a note of caution: llrAVX, as per this thread, does not work on AMD Bulldozer processors. |
|
|
|
If the AVX build is functional for all processors why not issue the AVX build as a general release?
We need to do some more test first on different host settings.
Lennart
I may easily have messed up the installation, but if not then I would have to say "no" this app isn't functional for all processors.
Maybe you can try my app.
Sure rroonnaalldd, I'll give it a try but I'm on my way out for at least 6 hours or so. If you want, either post a link to an archive with app(s) and app_info file (or PM me) and I'll give it a shot when I get back.
Sorry, I forgot to post the link to the thread "App_info file"...
I have been running Ronald's (rroonnaalldd'ss??) 32-bit-native build of llrAVX with app_info in BOINC for several hours now and have been returning only valid units so far, in PPS and SGS LLR.
Assuming this works out, this might be a candidate for promotion to "stock app" for Linux, since Ronald said he had problems running mine on 32-bit. The software developer in me says we need a lot more testing on this before making that determination though. Teeny-tiny downside is that it is a % or two slower than my build when running on 64-bit, but that's a small price to pay for relieving everyone from the app_info requirement.
--Gary |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
I can confirm Honza's experience regarding 7745*2^245547+1 being not prime despite being listed on the PRPnet prime list.
Just for good measure, I checked it with PFGW, which also reports it as composite.
____________
My lucky number is 75898524288+1 |
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,676,503 RAC: 2,272,243
                                      
|
Bad luck with first listed false prime, bad luck again.
I've got several i5 2500 hosts in different environments.
Some are native Windows 7 or 2008R2 running the AVX version very well (let's say 12-15).
One has the stock app in a VMware environment (let's say 20 minutes).
The last one is running the AVX version in a VMware environment - takes 63 minutes, about 3x SLOWER than the stock version.
Can anyone confirm such behaviour?
____________
My stats |
|
|
|
Has anyone tested SGS with the new AVXllr? What is the increase? |
|
|
|
I did 4 units of SGS to see if it would work. 1, 2, 3 and 4. Haven't crunched them in a while, but if I remember correctly with the stock app they were about 10 minutes (600 seconds) whereas now they're between 265 and 310 seconds apparently. Now I'm throwing some Cullen's at it.
____________
PrimeGrid Challenge Overall standings --- Last update: From Pi to Paddy (2016)
|
|
|
|
Has anyone tested SGS with the new AVXllr? What is the increase?
370/375 seconds to 270/275 seconds, but some odd results: "cpu time" longer than "run time".
http://www.primegrid.com/results.php?hostid=210773&offset=0&show_names=0&state=3&appid=2 |
|
|
|
370/375 seconds to 270/275 seconds, but some odd results: "cpu time" longer than "run time".
http://www.primegrid.com/results.php?hostid=210773&offset=0&show_names=0&state=3&appid=2
Thanks a lot! |
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,676,503 RAC: 2,272,243
                                      
|
The last one is running the AVX version in a VMware environment - takes 63 minutes, about 3x SLOWER than the stock version.
Well, I've reverted back to the stock app on this host and... processing time went back to ~20 minutes as expected, instead of 63 minutes using AVX.
THIS might be a show-stopper for using the current AVX version as stock.
Has anybody checked with the latest VMware 5.x? I'm running 4.1 versions.
____________
My stats |
|
|
|
The last one is running the AVX version in a VMware environment - takes 63 minutes, about 3x SLOWER than the stock version.
Well, I've reverted back to the stock app on this host and... processing time went back to ~20 minutes as expected, instead of 63 minutes using AVX.
THIS might be a show-stopper for using the current AVX version as stock.
Has anybody checked with the latest VMware 5.x? I'm running 4.1 versions.
I have run it in VMware Workstation 8 with Ubuntu 64-bit without any slowdown, but that was PPSElow. |
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
The last one is running the AVX version in a VMware environment - takes 63 minutes, about 3x SLOWER than the stock version.
Well, I've reverted back to the stock app on this host and... processing time went back to ~20 minutes as expected, instead of 63 minutes using AVX.
THIS might be a show-stopper for using the current AVX version as stock.
Has anybody checked with the latest VMware 5.x? I'm running 4.1 versions.
Are you talking about vSphere or the desktop products?
In the first case I would suggest checking the EVC setting for your machines.
I tested this in VMserver1 with DotschUX/1.2_64bit on host 115188.
[add]
I have compiled the llr386-source against gwnum27.2. Maybe it solves also the BD-problem...
sllr3.8.6_linux32_avx.7z
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,676,503 RAC: 2,272,243
                                      
|
Do you talk about vSphere or desktop-stuff?
In the first case i would suggest to check the EVC-setting for your machines.
vSphere (VMWare ESXi 4.x Hypervisor, ESX 5.x VMvisor and similar stuff).
Now I found that BOINC 6.13.x is VirtualBox aware.
This host is running ~21 minutes using AVX version (same as stock). I'll investigate if it is a coincidence (it also runs PPR Sieve in GPU).
____________
My stats |
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
Do you talk about vSphere or desktop-stuff?
In the first case i would suggest to check the EVC-setting for your machines.
vSphere (VMWare ESXi 4.x Hypervisor, ESX 5.x VMvisor and similar stuff).
Now I found that BOINC 6.13.x is VirtualBox aware.
This host is running ~21 minutes using AVX version (same as stock). I'll investigate if it is a coincidence (it also runs PPR Sieve in GPU).
Sieving on CUDA puts no stress on the CPU. My CPU usage with tpsieve while computing a unit was zero most of the time, with some spikes to 3%.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
Honza Volunteer moderator Volunteer tester Project scientist
Joined: 15 Aug 05 Posts: 1957 ID: 352 Credit: 6,139,676,503 RAC: 2,272,243
                                      
|
Sieving on CUDA puts no stress on the CPU. My CPU usage with tpsieve while computing a unit was zero most of the time, with some spikes to 3%.
Yes, this is what i thought.
Uninstalled VMBOX, restarted the host, and even without TPSieve there is still no good boost.
Otherwise TPsieve is doing ~9Mp/sec. Perhaps there is something else wrong with this host.
Another host is an i5 660 at 3.33GHz, so pretty much the same as the i5 2500, but it has the Westmere microarchitecture (Clarkdale) without AVX support.
Run times are ~47 with the AVX version - too bad. It has a freshly installed BOINC 7.0.7. I'll go back to the stock app and expect times to actually get better.
____________
My stats |
|
|
|
Just a note of caution: llrAVX, as per this thread, does not work on AMD Bulldozer processors.
You'd think the author of the apps would have investigated this beforehand and made sure it worked. I assume they ran it on Sandy Bridge CPUs and assumed they'd get the same result on an AMD Bulldozer. Thanks for the assumptions guys! |
|
|
|
Just a note of caution: llrAVX, as per this thread, does not work on AMD Bulldozer processors.
You'd think the author of the apps would have investigated this beforehand and made sure it worked. I assume they ran it on Sandy Bridge CPUs and assumed they'd get the same result on an AMD Bulldozer. Thanks for the assumptions guys!
The 27.2 source is a pre-release; if the devs know about the Bulldozer problem, they will fix it in the final code. After that we can build new apps. And yes, it was only tested with SB CPUs :/ |
|
|
John Honorary cruncher
Joined: 21 Feb 06 Posts: 2875 ID: 2449 Credit: 2,681,934 RAC: 0
                 
|
Just a note of caution: llrAVX, as per this thread, does not work on AMD Bulldozer processors.
You'd think the author of the apps would have investigated this beforehand and made sure it worked. I assume they ran it on Sandy Bridge CPUs and assumed they'd get the same result on an AMD Bulldozer. Thanks for the assumptions guys!
Sorry, it was my fault. In the first post, I said, "This has been tested on Intel Sandy Bridge but should also work on AMD Bulldozer. Currently, these are the only two CPU's that support AVX."
Are you saying the llrAVX builds in the first post for both Windows and Linux do not work on AMD Bulldozer?
____________
|
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
Just a note of caution: llrAVX, as per this thread, does not work on AMD Bulldozer processors.
You'd think the author of the apps would have investigated this beforehand and made sure it worked. I assume they ran it on Sandy Bridge CPUs and assumed they'd get the same result on an AMD Bulldozer. Thanks for the assumptions guys!
That's the purpose of beta testing. If you already knew it worked correctly, it wouldn't be necessary to test.
Everyone involved is a volunteer; none of us have a testing lab with dozens of different types of computers to verify code against. I'm sure very few people have both a Bulldozer CPU and a Sandy Bridge CPU. You said "they" should have "investigated this beforehand". Guess what? You're looking at that very investigation. This is how it's done.
We write code, verify that it works to the best of our ability, and release it for testing. If it's a test, that implies the possibility that it might not work as expected. Considering all the different types of CPUs, motherboards, operating systems, and so forth, unexpected problems tend to occur with some regularity, especially with something as new as AVX.
If you're uncomfortable with the risk, then simply wait until the program is thoroughly tested and released into production. Nobody is forcing you to use pre-production software.
____________
My lucky number is 75898524288+1 |
|
|
Crun-chi Volunteer tester
Joined: 25 Nov 09 Posts: 3233 ID: 50683 Credit: 151,443,349 RAC: 73,965
                         
|
To Michael Goetz, John, and ALL the others who made, compiled and built this app:
I am very grateful for your effort and the time you spent on this app and this build.
I just tried it (like everybody with this processor should), but the app doesn't work.
I don't blame you, I'm not yelling at you, nothing "negative".
I just tried it, got a "negative result", and posted it to this forum.
So please continue with development, make it work on AMD processors, and everyone will be happier :)
____________
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
2022202116^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie! |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
                               
|
To Michael Goetz, John, and ALL the others who made, compiled and built this app:
I am very grateful for your effort and the time you spent on this app and this build.
I just tried it (like everybody with this processor should), but the app doesn't work.
I don't blame you, I'm not yelling at you, nothing "negative".
I just tried it, got a "negative result", and posted it to this forum.
So please continue with development, make it work on AMD processors, and everyone will be happier :)
Indeed (although I'm not involved in the AVX code as I don't have any hardware with AVX).
People make code, people break code, people fix code. And everyone wins in the end. :)
BTW, since you have a Bulldozer, I heard that those share a single FPU between each pair of cores (similar to what Intel does with Hyperthreading). If that's true, you should see (on a 6 core bulldozer) a significant drop in LLR performance between having 3 tasks running and 6 tasks running. Is that what you've observed?
____________
My lucky number is 75898524288+1 |
|
|
|
[add]
I have compiled the llr386-source against gwnum27.2. Maybe it solves also the BD-problem...
sllr3.8.6_linux32_avx.7z
One question: the 64-bit app_info should be compatible with this 32-bit build, is that right? If so, I propose modifying the package (i.e. rename the linux64 app_info to linux and include the 32-bit sllrAVX for Linux).
____________
Choose, and act.
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
BTW, since you have a Bulldozer, I heard that those share a single FPU between each pair of cores (similar to what Intel does with Hyperthreading). If that's true, you should see (on a 6 core bulldozer) a significant drop in LLR performance between having 3 tasks running and 6 tasks running. Is that what you've observed?
This sharing is not similar to Intel's Hyperthreading.
Intel's Hyperthreading simulates a full CPU core. AMD's BD works differently. Take the 8150 as an example: it has 4 modules, and each module has 2 integer cores but only 1 FPU. Therefore it makes no difference in performance or throughput whether you compile an app for BD with AVX or SSE3. BD can execute 1 AVX instruction of 256 bits or 2 SSE3 instructions of 128 bits in the same time...
The only way to generate faster AVX code for BD would be to use FMA (fused multiply-add), which Sandy Bridge and its follower Ivy Bridge do not have. Intel has announced FMA for their successor, Haswell, in 2013.
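The throughput argument above can be restated as plain arithmetic. This is only a toy model in Python that takes the post's description at face value (one shared FPU per module that handles 256 bits of work at a time); it ignores latency, memory bandwidth, FMA and everything else that matters in practice, and doubles_per_step is a made-up name:

```python
FPU_WIDTH_BITS = 256   # per module, as described in the post (assumption)
DOUBLE_BITS = 64       # LLR/gwnum FFTs work on double-precision values


def doubles_per_step(instr_width_bits, fpu_width_bits=FPU_WIDTH_BITS):
    """Doubles processed per step when the FPU is filled with instructions
    of the given width (toy model, no real hardware timing)."""
    instrs_per_step = fpu_width_bits // instr_width_bits
    doubles_per_instr = instr_width_bits // DOUBLE_BITS
    return instrs_per_step * doubles_per_instr


if __name__ == "__main__":
    print("one 256-bit AVX instruction :", doubles_per_step(256))
    print("two 128-bit SSE instructions:", doubles_per_step(128))
```

Both cases give the same number of doubles per step, which is the point being made: on this model, wider instructions alone buy nothing on BD.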
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
[add]
I have compiled the llr386-source against gwnum27.2. Maybe it solves also the BD-problem...
sllr3.8.6_linux32_avx.7z
One question: the 64-bit app_info should be compatible with this 32 bit build, that's right? If so I propose to modify the package (ie modify the linux64 name for the app_info to linux and include the 32 bit sllrAVX for linux).
Please rename the new app to the old name and it should work.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
Crun-chi Volunteer tester
Joined: 25 Nov 09 Posts: 3233 ID: 50683 Credit: 151,443,349 RAC: 73,965
                         
|
...
BTW, since you have a Bulldozer, I heard that those share a single FPU between each pair of cores (similar to what Intel does with Hyperthreading). If that's true, you should see (on a 6 core bulldozer) a significant drop in LLR performance between having 3 tasks running and 6 tasks running. Is that what you've observed?
No, there is no significant drop in LLR performance between 4 and 8 tasks (not on my machine; it is my friend who has the 8-core AMD). Yes, there is a drop, but I don't know what percentage we are talking about. I will test it and post the result here.
____________
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
2022202116^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie! |
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
@Crun-chi
Please answer John's question from posting #46810.
John wrote: Are you saying the llrAVX builds in the first post for both Windows and Linux do not work on AMD Bulldozer?
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
Crun-chi Volunteer tester
Joined: 25 Nov 09 Posts: 3233 ID: 50683 Credit: 151,443,349 RAC: 73,965
                         
|
As I said in this post, www.primegrid.com/forum_thread.php?id=3933, the app does not work under Windows. Two other users have confirmed my statement. Today I will test under Linux and post the results here on the forum.
____________
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
2022202116^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie! |
|
|
|
@Crun-chi
Please answer Johns question from posting #46810.
John wrote: Are you saying the llrAVX builds in the first post for both Windows and Linux do not work on AMD Bulldozer?
The Windows 7 answer is a big fat No. It does not work.
|
|
|
|
I did 4 units of SGS to see if it would work. 1, 2, 3 and 4. Haven't crunched them in a while, but if I remember correctly with the stock app they were about 10 minutes (600 seconds) whereas now they're between 265 and 310 seconds apparently. Now I'm throwing some Cullen's at it.
Pyrus, let us know how the Cullen's go. You're the first to go there, as far as I know. In my initial testing I ran a bunch of TRP LLR and a few 321 LLR but that's as far as I went. I forget now but I think those avx units ran almost twice as fast as usual, consistent with PPS and SGS.
--Gary
____________
"I am he as you are he as you are me and we are all together"
87*2^3496188+1 is prime! (1052460 digits)
4 is not prime! (1 digit) |
|
|
ardo
Joined: 12 Dec 10 Posts: 168 ID: 76659 Credit: 1,693,455,577 RAC: 0
                   
|
I've been running Woodall the last couple of days and so far so good. Most of them clocked in at around 26 hours, some of them an hour or so faster. Not bad for 2600k@4.4GHz (no HT).
____________
Badge score: 2*5 + 8*7 + 3*8 + 3*9 + 1*10 + 1*11 + 1*13 = 151
|
|
|
|
I've been running Woodall the last couple of days and so far so good. Most of them clocked in at around 26 hours, some of them an hour or so faster. Not bad for 2600k@4.4GHz (no HT).
That's great news. Maybe not quite as huge a speed-up, %-age wise, as the smaller units, but still quite nice (I seem to remember running WOO during the recent challenge in roughly 32 hours, with HT off, on a 2600K @4.3... but memories fade...)
Anybody run SoB with this yet? :-)
--Gary
____________
"I am he as you are he as you are me and we are all together"
87*2^3496188+1 is prime! (1052460 digits)
4 is not prime! (1 digit) |
|
|
|
I did 4 units of SGS to see if it would work. 1, 2, 3 and 4. Haven't crunched them in a while, but if I remember correctly with the stock app they were about 10 minutes (600 seconds) whereas now they're between 265 and 310 seconds apparently. Now I'm throwing some Cullen's at it.
Pyrus, let us know how the Cullen's go. You're the first to go there, as far as I know. In my initial testing I ran a bunch of TRP LLR and a few 321 LLR but that's as far as I went. I forget now but I think those avx units ran almost twice as fast as usual, consistent with PPS and SGS.
--Gary
So I did 4. Runtimes are around 82000 seconds. Now I obviously have to wait for validation :)
____________
PrimeGrid Challenge Overall standings --- Last update: From Pi to Paddy (2016)
|
|
|
|
I have a i3 2100 and using windows 7 ultimate x64. After I've renamed and moved the two files in the BOINC folder, what should I do? |
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
I have a i3 2100 and using windows 7 ultimate x64. After I've renamed and moved the two files in the BOINC folder, what should I do?
You need SP1 and a restart of BOINC.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
After seeing some invalid SGS results in the stats for my host 115188, I started one from bash and got this message:
./sllr3.8.6-linux32_avx -d -q"30448908048555*2^666666-1"
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-1 FFT length 72K, Pass1=96, Pass2=768
Iter: 6/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Disregard last error. Result is reproducible and thus not a hardware problem.
For added safety, redoing iteration using a slower, more reliable method.
Continuing from last save file.
Iter: 6/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-3 FFT length 80K, Pass1=320, Pass2=256
Iter: 5/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Iter: 3/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Iter: 4/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Disregard last error. Result is reproducible and thus not a hardware problem.
For added safety, redoing iteration using a slower, more reliable method.
Continuing from last save file.
Fatal error at setup : Number sent to gwsetup is too large for the FFTs to handle.
I get the same message with:
./sllr3.8.6dev-linux32_avx -d -q"30448908048555*2^666666-1"
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-1 FFT length 72K, Pass1=96, Pass2=768
Iter: 6/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Disregard last error. Result is reproducible and thus not a hardware problem.
For added safety, redoing iteration using a slower, more reliable method.
Continuing from last save file.
Iter: 6/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-3 FFT length 80K, Pass1=320, Pass2=256
Iter: 6/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-1 FFT length 84K, Pass1=112, Pass2=768
Iter: 6/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-3 FFT length 96K, Pass1=128, Pass2=768
Iter: 6/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-3 FFT length 112K, Pass1=448, Pass2=256
Iter: 5/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Iter: 5/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
Fatal error at setup : Number sent to gwsetup is too large for the FFTs to handle.
...and:
./sllr3.8.6dev-linux64_avx -d -q"30448908048555*2^666666-1"
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-1 FFT length 72K, Pass1=96, Pass2=768
Iter: 6/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Disregard last error. Result is reproducible and thus not a hardware problem.
For added safety, redoing iteration using a slower, more reliable method.
Continuing from last save file.
Iter: 6/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-3 FFT length 80K, Pass1=320, Pass2=256
Iter: 6/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-1 FFT length 84K, Pass1=112, Pass2=768
Iter: 6/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-3 FFT length 96K, Pass1=128, Pass2=768
Iter: 6/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-3 FFT length 112K, Pass1=448, Pass2=256
Iter: 5/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Iter: 5/45, ERROR: ROUND OFF (1) > 0.4
Continuing from last save file.
Unrecoverable error, Restarting with next larger FFT length...
Fatal error at setup : Number sent to gwsetup is too large for the FFTs to handle.
[add]
I retested with the PG app v6.10:
./primegrid_sllr_3.8.6_i686-pc-linux-gnu -d -q"30448908048555*2^666666-1"
Starting Lucas Lehmer Riesel prime test of 30448908048555*2^666666-1
Using zero-padded Core2 type-1 FFT length 72K, Pass1=96, Pass2=768
V1 = 5 ; Computing U0...done.Starting Lucas-Lehmer loop...
30448908048555*2^666666-1, iteration : 10000 / 666666 [1.50%]. Time per iteration : 1.292 ms.
30448908048555*2^666666-1, iteration : 30000 / 666666 [4.50%]. Time per iteration : 1.245 ms.
^C
Caught signal. Terminating.
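With logs as long as the ones above it helps to summarise which FFT lengths were tried and how often the round-off check fired. A minimal Python sketch, assuming the log wording exactly as pasted in this post ("FFT length 72K", "ERROR: ROUND OFF"); the function names are illustrative and nothing here is part of LLR itself:

```python
import re
from typing import List

FFT_RE = re.compile(r"FFT length (\d+)K")


def fft_lengths_tried(log_text: str) -> List[int]:
    """FFT lengths (in K) in the order the log reports them."""
    return [int(k) for k in FFT_RE.findall(log_text)]


def count_roundoff_errors(log_text: str) -> int:
    """Number of log lines that report a round-off error."""
    return sum(1 for line in log_text.splitlines() if "ERROR: ROUND OFF" in line)


if __name__ == "__main__":
    sample = (
        "Using zero-padded Core2 type-1 FFT length 72K, Pass1=96, Pass2=768\n"
        "Iter: 6/45, ERROR: ROUND OFF (1) > 0.4\n"
        "Using zero-padded Core2 type-3 FFT length 80K, Pass1=320, Pass2=256\n"
        "Iter: 6/45, ERROR: ROUND OFF (1) > 0.4\n"
    )
    print("FFT lengths tried:", fft_lengths_tried(sample))
    print("round-off errors :", count_roundoff_errors(sample))
```

A run that climbs through several FFT lengths and still ends in a fatal gwsetup error, as in the AVX logs here, stands out immediately in such a summary.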
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
I have SP1 and I've done the reboot of BOINC, But how can I check the difference? |
|
|
Crun-chi Volunteer tester
Joined: 25 Nov 09 Posts: 3233 ID: 50683 Credit: 151,443,349 RAC: 73,965
                         
|
I have SP1 and I've done the reboot of BOINC, But how can I check the difference?
Compare a few WU times from before and after, and you should notice the difference with the AVX app.
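To put a number on the before/after comparison, average a few run times from each app and compute the ratio. A small Python sketch; the sample values are the SGS times reported earlier in this thread (370/375 seconds stock, 270/275 seconds AVX), and the speedup helper is just an illustration:

```python
from statistics import mean


def speedup(before_seconds, after_seconds):
    """Return (time ratio before/after, percent of run time saved)."""
    before = mean(before_seconds)
    after = mean(after_seconds)
    return before / after, 100.0 * (before - after) / before


if __name__ == "__main__":
    ratio, saved = speedup([370, 375], [270, 275])
    print("ratio: %.2fx, run time saved: %.1f%%" % (ratio, saved))
```

Note that "percent of run time saved" and "percent faster" are not the same figure: for these SGS times the run time drops by roughly 27%, while the throughput ratio is about 1.37. Only compare WUs from the same subproject, since run times differ a lot between them.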
____________
92*10^1585996-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
2022202116^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie! |
|
|
|
ok thank you! |
|
|
|
After seeing some invalid SGS-results in the stats for my host 115188, i started one on the bash and got the message: ./sllr3.8.6-linux32_avx -d -q"30448908048555*2^666666-1"
Does the same thing happen on any of your other Linux boxes? I ran your same test number from the command line on my i7 and it looked like it was running fine. I tried both the llravx that I built originally and the one you built... both did not get the round-off messages, and reached the "not prime" decision with the same residue.
--Gary
____________
"I am he as you are he as you are me and we are all together"
87*2^3496188+1 is prime! (1052460 digits)
4 is not prime! (1 digit) |
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
                 
|
After seeing some invalid SGS-results in the stats for my host 115188, i started one on the bash and got the message: ./sllr3.8.6-linux32_avx -d -q"30448908048555*2^666666-1"
Does the same thing happen on any of your other Linux boxes? I ran your same test number from the command line on my i7 and it looked like it was running fine. I tried both the llravx that I built originally and the one you built... both did not get the round-off messages, and reached the "not prime" decision with the same residue.
--Gary
Yes.
I will recompile both apps from the llr386dev- and llr386-source with their gwnum-lib.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
[add]
I have compiled the llr386 source against gwnum 27.2. Maybe it also solves the BD (Bulldozer) problem...
sllr3.8.6_linux32_avx.7z
One question: the 64-bit app_info should be compatible with this 32-bit build, is that right? If so, I propose modifying the package (i.e. change the linux64 name of the app_info to linux and include the 32-bit sllrAVX for Linux).
Please rename the new app to the old name and it should work.
This wasn't for me; I don't own a linux32 system, only 64. I just pointed out that it was not very clear AND it is not easy to spot at first glance, so someone could miss this build ;)
____________
Choose, and act.
|
|
|
|
About halfway through this topic a request was made for someone to update the .xml files with GPU entries. Has anyone attempted this?
Self interest at play here as all my Sandy Bridge machines have GPUs. |
|
|
rroonnaalldd Volunteer developer Volunteer tester
Joined: 3 Jul 09 Posts: 1213 ID: 42893 Credit: 34,634,263 RAC: 0
|
About halfway through this topic a request was made for someone to update the .xml files with GPU entries. Has anyone attempted this?
Self interest at play here as all my Sandy Bridge machines have GPUs.
Please use the app_info from the thread "App_info file" as the basis for the GPU entries, and rename all LLR apps in it to the new AVX build.
____________
Best wishes. Knowledge is power. by jjwhalen
|
|
|
|
Thanks to everybody who helped get this AVX build working!!
Yesterday I installed it on the Core i5-2500 machine..
and my PPS-LLR WUs went from 1250-1350 seconds to 750-850 seconds.
That is roughly 38% less run time, or about 1.6x the throughput!
Huge boost ;) While this is only for small tasks.. I wonder what this does for the SoB units?
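For anyone who wants to put their own before/after times into numbers, here is a minimal sketch. It simply takes one run time without AVX and one with AVX and reports both the share of run time saved and the throughput factor; the function name is made up for this example, and the sample values are the midpoints of the PPS-LLR ranges quoted in this post.

```python
def runtime_change(before_s: float, after_s: float) -> tuple[float, float]:
    """Return (percent of run time saved, throughput factor) for one task."""
    saved_pct = (before_s - after_s) / before_s * 100.0
    throughput = before_s / after_s
    return saved_pct, throughput


# Midpoints of the quoted ranges: 1250-1350 s without AVX, 750-850 s with AVX.
saved, factor = runtime_change(1300.0, 800.0)
print(f"{saved:.1f}% less run time, {factor:.3f}x throughput")
```

Note that "percent less time" and "percent faster" are different figures for the same pair of times, which is why comparisons in this thread can look inconsistent.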
____________
Member of the Dutch Power Cows
My Stats |
|
|
|
Doing a SoB task: 75% after 54 hours. On the way to a 72-hour SoB. I think the last one, with non-AVX LLR, took over 120 hours on the same host. |
|
|
|
2500K@4.5GHz
SGS:
- 6m 5s without avx;
- from 4m to 5m with avx; the times vary quite a bit;
When a new WU starts, all cores are at 100% usage for a few seconds.
CN:
- 23h without avx;
- 20h with avx;
|
|
|
|
Doing a SoB task: 75% after 54 hours. On the way to a 72-hour SoB. I think the last one, with non-AVX LLR, took over 120 hours on the same host.
Yeah I just wanted to check out SoB too, I got up to 6.3% or so after 5 hrs and 45 minutes.
____________
|
|
|
Sysadm@Nbg Volunteer moderator Volunteer tester Project scientist
Joined: 5 Feb 08 Posts: 1224 ID: 18646 Credit: 877,929,236 RAC: 321,810
|
after a few days with my avx-build under Linux 64bit / i7-2600K with the PRPNet-Client, now I try it under boinc/PPSllr using app_info.xml, too.
first result (fortunately with lennart as wingman ;) ) is validated "Valid"
some are waiting for the slower wingmen ...
runtime without HT ~13min, with HT ~20min (@ 3.40GHz)
____________
Sysadm@Nbg
my current lucky number: 113856050^65536 + 1
PSA-PRPNet-Stats-URL: http://u-g-f.de/PRPNet/
|
|
|
mackerel Volunteer tester
Joined: 2 Oct 08 Posts: 2645 ID: 29980 Credit: 568,565,361 RAC: 198
|
Ok, I finally joined in. i7-2600k running stock. HT is on, but I've limited boinc to 4 cores. Win7-64.
Running boinc normally it was taking 20-23.5 minutes depending on how much I was using the computer at the same time.
First units just went through, and are taking 13-14 minutes so that's taken about 1/3 the time off! |
|
|
|
Doing a SoB task: 75% after 54 hours. On the way to a 72-hour SoB. I think the last one, with non-AVX LLR, took over 120 hours on the same host.
Yeah I just wanted to check out SoB too, I got up to 6.3% or so after 5 hrs and 45 minutes.
I finished my first SoB with llravx in less than 72 hours. Waiting for validation now.
http://www.primegrid.com/workunit.php?wuid=240140113
|
|
|
|
So, has anyone written the dummies installation manual yet? For us 'doze-ers, the tuxers know everything intuitively.
____________
Cheers,
PeterV |
|
|
|
I've got a 228-hour LLR waiting... I'm not sure if I got it to work. :) |
|
|
|
Looks like a no. About a dozen PrimeGrid tasks ended with a computation error roughly two seconds in... |
|
|
mackerel Volunteer tester
Joined: 2 Oct 08 Posts: 2645 ID: 29980 Credit: 568,565,361 RAC: 198
|
So, has anyone written the dummies installation manual yet? For us 'doze-ers, the tuxers know everything intuitively.
Since I just figured it out myself last night... this might be a little vague since I'm not near my boinc machines at the moment.
Download the file in the 1st post, and uncompress it with 7zip.
Rename the two files for windows as described in the 1st post.
Move or copy them to the primegrid folder. On my machine that was something like:
c:\program files(x86)\boinc\primegrid
Restart boinc to pick it up, and there should be a line in the messages to say it found it.
As mentioned before, make sure all work is completed and sent back before you do this. |
|
|
|
Sorry, mackerel, i don't have a ...\boinc\primegrid folder, only a ...\boinc\locale one. however, i do have a <boincdata>\projects\primegrid one so i put the files in there. on a previous occasion a 3rd-party milkyway client also had to go in a <boincdata> subdirectory.
so far, so good. however, on restart primegrid refused to send me any wus, in spite of having thousands to send, and me ticking PPS LLR [CPU] so something isn't working.
are there any other oopsies you haven't told me about, eg 'you must first up/down-grade to boinc 5.01'. i'm currently running 6.12.34 (x64).
thanks anyway.
you may call yourself mackerel but your pic looks more like a mallard. :)
____________
Cheers,
PeterV |
|
|
|
Sorry, mackerel, i don't have a ...\boinc\primegrid folder, only a ...\boinc\locale one. however, i do have a <boincdata>\projects\primegrid one so i put the files in there. on a previous occasion a 3rd-party milkyway client also had to go in a <boincdata> subdirectory.
so far, so good. however, on restart primegrid refused to send me any wus, in spite of having thousands to send, and me ticking PPS LLR [CPU] so something isn't working.
are there any other oopsies you haven't told me about, eg 'you must first up/down-grade to boinc 5.01'. i'm currently running 6.12.34 (x64).
thanks anyway.
you may call yourself mackerel but your pic looks more like a mallard. :)
Did you rename the files as told in the first post?
Proper folder might be
c:/ProgramData/Boinc/Projects/www.primegrid.com
6.12.34 will do fine.
Don't forget to run the CPU benchmarks from the Advanced menu after restarting BOINC.
[edit] I think this will only work if you have run some LLR tasks before. Some files must already be in your primegrid folder for the app_info file included in the download to work properly. So, if you deleted the previous content of the primegrid folder, that could explain the behaviour you described. If so, reinstall BOINC, run at least one LLR task, then retry the llravx installation. Copy the two files into the folder, but do not delete whatever is already there unless you are an advanced user.
The app_info file in the download does not include the CUDA sieves, so pps sieve and gcw sieve will not work (you will not get any CUDA tasks), even if you have them selected.
Please note that there are six files in the download archive, but only two should be copied to the primegrid folder: the two that have your OS (Windows, Linux OR Mac) in their name.
Copy them, then rename them according to the first post in this thread. |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
WHERE THE FILES GO
First of all, you need to locate where BOINC put its data directory.
Most likely it's here (Windows 7):
C:\ProgramData\BOINC\projects\www.primegrid.com
The first part of that path (C:\ProgramData\BOINC, the BOINC data directory) varies according to your operating system AND can be set by you during installation to be anywhere at all. To find the real location that BOINC is using on YOUR system, look at the beginning of the BOINC log. There will be a line that says "Data directory" and lists the correct location. (Can't find the log? With newer BOINC clients, hit CTRL-SHIFT-E to show the log. Older BOINC clients have the log in one of the tabs.)
Once you find the BOINC data directory, the files need to go into the ...\projects\www.primegrid.com subdirectory.
Hopefully that should resolve the problems of those still having difficulty.
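As a rough illustration of the copy-and-rename step, here is a minimal Python sketch. It assumes the archive has already been unpacked and that the file names match the ones listed in the first post; the function name, the `RENAMES` table and the example paths are made up for this sketch, so substitute the data directory shown in your own BOINC log.

```python
import shutil
from pathlib import Path

# Source name in the download -> name BOINC expects (taken from the first post).
RENAMES = {
    "windows": {"app_info_windows.xml": "app_info.xml",
                "llr3.8.6dev_windows_avx.exe": "llravx.exe"},
    "linux": {"app_info_linux.xml": "app_info.xml",
              "llr3.8.6dev_linux_avx": "llravx"},
    "macintel": {"app_info_macintel.xml": "app_info.xml",
                 "llr3.8.6dev_macintel_avx": "llravx"},
}


def install_llravx(unpacked_dir, project_dir, os_name):
    """Copy the two files for os_name into project_dir under their new names."""
    project_dir = Path(project_dir)
    if not project_dir.is_dir():
        raise FileNotFoundError(f"project directory not found: {project_dir}")
    copied = []
    for src_name, dst_name in RENAMES[os_name].items():
        target = project_dir / dst_name
        # Only these two targets are written; other files in the folder are left alone.
        shutil.copy2(Path(unpacked_dir) / src_name, target)
        copied.append(target)
    return copied


# Example call (hypothetical download location, Windows 7 default data directory):
# install_llravx(r"C:\Users\me\Downloads\llr3.8.6dev_avx",
#                r"C:\ProgramData\BOINC\projects\www.primegrid.com",
#                "windows")
```

Finish and report all existing work before doing this, and restart BOINC afterwards so it picks up the app_info.xml.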
____________
My lucky number is 75898524288+1 |
|
|
mackerel Volunteer tester
Joined: 2 Oct 08 Posts: 2645 ID: 29980 Credit: 568,565,361 RAC: 198
|
As others have pointed out by now, I got the location wrong... that's memory for you! But what I wrote previously was all I needed. Don't know if any more is. |
|
|
|
latest messages:
18/01/2012 12:56:26 PM | PrimeGrid | Found app_info.xml; using anonymous platform
18/01/2012 12:56:26 PM | PrimeGrid | [error] No application found for task: windows_intelx86 610 ; discarding
____________
Cheers,
PeterV |
|
|
|
Core i5-2400 at stock clock (3200 MHz with TurboBoost)
[root@rr018 ~]# ./primegrid_sllr_3.8.6_i686-pc-linux-gnu -d pps_llr_110414549
Starting Proth prime test of 355*2^1052170+1
355*2^1052170+1 is not prime. Proth RES64: 33F7C5920E02572C Time : 756.250 sec.
[root@rr018 llr3.8.6dev_avx]# ./sllr3.8.6dev_linux64_avx -d pps_llr_110414549
Starting Proth prime test of 355*2^1052170+1
355*2^1052170+1 is not prime. Proth RES64: 33F7C5920E02572C Time : 513.531 sec.
Attached the client with the app_info.xml and avxllr and crunching some WUs for the time being.
Core temperature is considerably higher now, and power consumption presumably is too:
(6 minutes into the first four WUs)
Cpu speed from cpuinfo 3110.00Mhz
cpuinfo might be wrong if cpufreq is enabled. To guess correctly try estimating via tsc
Linux's inbuilt cpu_khz code emulated now
True Frequency (without accounting Turbo) 3110 MHz
CPU Multiplier 31x || Bus clock frequency (BCLK) 100.32 MHz
Socket [0] - [physical cores=4, logical cores=4, max online cores ever=4]
TURBO ENABLED on 4 Cores, Hyper Threading OFF
True Frequency 3210.32 MHz (100.32 x [32])
Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is 34x/33x/33x/32x
Current Frequency 3210.32 MHz [100.32 x 32.00] (Max of below)
Core [core-id] :Actual Freq (Mult.) C0% Halt(C1)% C3 % C6 % C7 % Temp
Core 1 [0]: 3210.32 (32.00x) 100 0 0 0 0 73
Core 2 [1]: 3210.32 (32.00x) 100 0 0 0 0 76
Core 3 [2]: 3210.32 (32.00x) 100 0 0 0 0 76
Core 4 [3]: 3210.32 (32.00x) 100 0 0 0 0 74
C0 = Processor running without halting
C1 = Processor running with halts (States >C0 are power saver)
C3 = Cores running with PLL turned off and core cache turned off
C6 = Everything in C3 + core state saved to last level cache
Above values in table are in percentage over the last 1 sec
Was around 65°C with the stock app.
I am curious what the FX-8120/8150 or the i7-3930K would achieve. |
|
|
|
I have a i7-2700K Win7 64bit SP1 GTX570.
I have downloaded the app_info and exe files from the first post and they run successfully under BOINC but GPU sieve does not run using that app_info.
I would like to run both the AVXllr and also PPS Sieve on the GPU.
I have tried using the Win32 app info kindly supplied by rroonnaalldd but I cannot get it to work.
Has anyone got GPU sieve and AVXllr working at the same time under BOINC using Win7 64bit? If so, could you please post your app_info so I can try again?
Thanks,
Peter
____________
35 x 2^3587843+1 is prime! |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14011 ID: 53948 Credit: 433,209,141 RAC: 977,735
|
I have a i7-2700K Win7 64bit SP1 GTX570.
I have downloaded the app_info and exe files from the first post and they run successfully under BOINC but GPU sieve does not run using that app_info.
I would like to run both the AVXllr and also PPS Sieve on the GPU.
I have tried using the Win32 app info kindly supplied by rroonnaalldd but I cannot get it to work.
Has anyone got GPU sieve and AVXllr working at the same time under BOINC using Win7 64bit? If so, could you please post your app_info so I can try again?
Thanks,
Peter
Remove (or rename) the app_info file and have BOINC download one of the GPU sieve WUs that you're having trouble with.
Then look in the BOINC data directory for the file called sched_reply_www.primegrid.com.xml.
Compare the appropriate section inside that file to what's inside your app_info. You're looking for the parts that deal with the GPU sieve WU.
That file should show you what's wrong with your app_info.xml.
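If reading the XML by eye is hard going, the comparison can be roughed out in a few lines of Python. This is only a sketch under assumptions: it expects both files to parse as well-formed XML and to describe applications with `<app_version>` elements containing `<app_name>`, `<version_num>` and (optionally) `<plan_class>`, which is how the anonymous-platform app_info.xml is laid out. The function names are invented for this example.

```python
import xml.etree.ElementTree as ET


def app_versions(xml_path):
    """Collect (app_name, version_num, plan_class) from every <app_version> element."""
    found = set()
    for av in ET.parse(xml_path).getroot().iter("app_version"):
        found.add((
            (av.findtext("app_name") or "").strip(),
            (av.findtext("version_num") or "").strip(),
            (av.findtext("plan_class") or "").strip(),
        ))
    return found


def missing_from_app_info(sched_reply_path, app_info_path):
    """Entries the server sent that the local app_info.xml does not declare."""
    return sorted(app_versions(sched_reply_path) - app_versions(app_info_path))
```

Calling `missing_from_app_info("sched_reply_www.primegrid.com.xml", "app_info.xml")` lists the name, version and plan class combinations to add. If the scheduler reply does not parse cleanly, fall back to comparing the two files in a text editor.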
____________
My lucky number is 75898524288+1 |
|
|
|
Remove (or rename) the app_info file and have BOINC download one of the GPU sieve WUs that you're having trouble with...
Sorry Michael I failed.
When I use the app_info from the first post I don't get any GPU WUs at all.
BOINC doesn't ask for them.
I tried looking at the xml file but it may as well be in a foreign language.
I will have to wait for either a win64bit app_info or for the app to be rolled out via BOINC.
Thanks for looking at my post.
Peter
____________
35 x 2^3587843+1 is prime! |
|
|
|
Remove (or rename) the app_info file and have BOINC download one of the GPU sieve WUs that you're having trouble with...
Sorry Michael I failed.
When I use the app_info from the first post I don't get any GPU WUs at all.
BOINC doesn't ask for them.
I tried looking at the xml file but it may as well be in a foreign language.
I will have to wait for either a win64bit app_info or for the app to be rolled out via BOINC.
Thanks for looking at my post.
Peter
Use this one here
You only need to download the missing files from PrimeGrid. It's working with the gcw/pps sieve. |
|
|
|
Use this one here
You only need to download the missing files from PrimeGrid. It's working with the gcw/pps sieve.
I'll give it a try, have to wait a while, I'm an hour or so into 2xTRPllr. I'll have a bash once they're completed.
Thanks,
Peter |
|
|
|
Use this one here
You only need to download the missing files from PrimeGrid. It's working with the gcw/pps sieve.
Yes that app_info worked for me. I must have downloaded the correct files at last. Thank you all for your help.
As noted elsewhere, the PPSSieve GPU WUs are taking a little longer to complete now (GTX570 stock speeds) i.e. 1130secs vs 1030secs, but at least it's all working.
Thanks again,
Pete.
____________
35 x 2^3587843+1 is prime! |
|
|
|
Use this one here
You only need to download the missing files from PrimeGrid. It's working with the gcw/pps sieve.
Yes that app_info worked for me. I must have downloaded the correct files at last. Thank you all for your help.
As noted elsewhere, the PPSSieve GPU WUs are taking a little longer to complete now (GTX570 stock speeds) i.e. 1130secs vs 1030secs, but at least it's all working.
Thanks again,
Pete.
I have that same problem. From 900 seconds to 980 seconds....
So pointwise I lose a lot when running the AVX.
____________
|
|
|
|
I have that same problem. From 900 seconds to 980 seconds....
So pointwise I lose a lot when running the AVX.
I'm happy to suffer that for the duration of the challenge and maybe afterwards as well. Let's hope that AVXllr gets distributed via BOINC sooner rather than later, but I understand that there are issues (with Bulldozer?) that need resolving. I've been using the app under PRPNet so I have the option to go back to being app_info-less under BOINC and switch all CPU cores to PRPNet.
Pete
____________
35 x 2^3587843+1 is prime! |
|
|
|
I tried llravx on linux and get occasional "segmentation fault" when running tests on bases other than 2.
So far I always ran them in background mode. I'll try to get one in foreground (may take some patience) and provide more detailed error messages (if any occur) as well as detailed machine and OS specs.
____________
There are only 10 kinds of people - those who understand binary and those who don't
|
|
|
|
I tried llravx on linux and get occasional "segmentation fault" when running tests on bases other than 2.
So far I always ran them in background mode. I'll try to get one in foreground (may take some patience) and provide more detailed error messages (if any occur) as well as detailed machine and OS specs.
Alright, it took quite a while to realise what happened. It may be due to my way of doing this. Here's what I do:
I run llr manually during the night hours. In the morning I send a kill signal to the jobs, causing them to write a restart file and shut down. In the evening I restart the jobs.
Sometimes when doing this I get something like this:
./llravx llr_sr366_220k-222k.txt
Caught signal. Terminating.
*** glibc detected *** ./llravx: double free or corruption (out): 0x0b763150 ***
======= Backtrace: =========
[0x9d27a2e]
[0x9d2bde9]
[0x809528a]
[0x80853e9]
[0x8086e3e]
[0x8087e37]
[0x808a9d1]
[0x808ad76]
[0x9d1780f]
[0x8048201]
======= Memory map: ========
08048000-09da5000 r-xp 00000000 00:21 1300688712
09da5000-09db9000 rwxp 01d5c000 00:21 1300688712
09db9000-09dc7000 rwxp 09db9000 00:00 0
0b6ae000-0b915000 rwxp 0b6ae000 00:00 0 [heap]
f2900000-f2921000 rwxp f2900000 00:00 0
f2921000-f2a00000 ---p f2921000 00:00 0
f2ac9000-f31a7000 rwxp f2ac9000 00:00 0
f32a8000-f4de6000 rwxp f32a8000 00:00 0
f4de6000-f4de7000 ---p f4de6000 00:00 0
f4de7000-f57e7000 rwxp f4de7000 00:00 0
f57e7000-f57e8000 ---p f57e7000 00:00 0
f57e8000-f61e8000 rwxp f57e8000 00:00 0
f628a000-f75af000 rwxp f628a000 00:00 0
f75e9000-f75ea000 rwxp f75e9000 00:00 0
f75ea000-f75eb000 ---p f75ea000 00:00 0
f75eb000-f7feb000 rwxp f75eb000 00:00 0
ffea5000-ffed6000 rwxp 7ffffffcd000 00:00 0 [stack]
ffffe000-fffff000 r-xp ffffe000 00:00 0
Abort
16780.092u 2.091s 4:40:03.40 99.8%
OS: Linux 2.6.18-194.17.4.el5 #1 SMP Mon Oct 25 15:50:53 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
Machine: model name : Intel(R) Xeon(R) CPU X5670 @ 2.93GHz
Candidate tested: 1747*366^220569-1
When I tried to restart the job, it seemed to work fine. The candidate was processed to the end with no error messages. I did not have the time to compare residuals yet, but I will do this just to make sure.
This is a brand new and high-end dual-CPU machine but I don't see any speedup. So I suppose it does not support avx. I tried to find a list of avx-capable CPUs with no success. Could somebody tell me how to find out whether a machine does support avx?
Thanks
Peter
PS: interrupting llr jobs and restarting them with llravx seems to work just fine.
____________
There are only 10 kinds of people - those who understand binary and those who don't
|
|
|
Sysadm@Nbg Volunteer moderator Volunteer tester Project scientist
Joined: 5 Feb 08 Posts: 1224 ID: 18646 Credit: 877,929,236 RAC: 321,810
|
AVX comes with Intel's Sandy Bridge processors; maybe this list helps
____________
Sysadm@Nbg
my current lucky number: 113856050^65536 + 1
PSA-PRPNet-Stats-URL: http://u-g-f.de/PRPNet/
|
|
|
|
AVX comes with Intel's Sandy Bridge processors; maybe this list helps
Thanks! So no speedup is the expected behavior. Still it should work without strange messages, right?
____________
There are only 10 kinds of people - those who understand binary and those who don't
|
|
|
|
AVX comes with Intel's Sandy Bridge processors; maybe this list helps
Thanks! So no speedup is the expected behavior. Still it should work without strange messages, right?
In addition to the CPU side of things:
Unless AVX support was backported to it, your 2.6.18 kernel should not support AVX. AVX support was introduced with kernel version 2.6.30, but I don't know what extra patches were applied to your kernel version.
One quick and dirty way to check if your CPU (and your kernel) supports AVX is this command:
cat /proc/cpuinfo | sort -u | grep avx
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
Expect new entries like rdrnd for Ivy Bridge and avx2 for Haswell ;)
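The same check can be sketched in Python for anyone who prefers a script. It makes two assumptions worth stating: the CPU flags are read from `/proc/cpuinfo`-style text, and the kernel threshold of 2.6.30 is the figure given in this post (a distribution kernel with backports may behave differently). Both function names are invented for this example. Matching whole words avoids mistaking `avx2` for `avx`.

```python
def cpu_has_flag(cpuinfo_text, flag="avx"):
    """True if any 'flags' line in /proc/cpuinfo-style text lists the flag as a whole word."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and flag in value.split():
            return True
    return False


def kernel_at_least(release, minimum=(2, 6, 30)):
    """Compare the numeric prefix of a release string such as '2.6.18-194.17.4.el5'."""
    numbers = []
    for part in release.split("-")[0].split("."):
        if not part.isdigit():
            break
        numbers.append(int(part))
    return tuple(numbers) >= minimum
```

On a Linux box this would be used as `cpu_has_flag(open("/proc/cpuinfo").read())` together with `kernel_at_least(platform.release())` after `import platform`.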
____________
|
|
|
|
I am thinking about the new Opteron 62xx (AVX onboard), but AFAIK AVX on AMD doesn't work. Any progress on this front? |
|
|
|
I think we need a new Setting in our PrimeGrid preferences:
Use AVX
together with
Use CPU
Use ATI GPU
Use NVIDIA GPU
Is this a matter for PrimeGrid or for the BOINC community?
I guess it is a BOINC community issue, but if...
____________
|
|
|
|
I think we need a new Setting in our PrimeGrid preferences:
Use AVX.
+1
Messing with an app_info file can be a pain. ;-)
____________
|
|
|
|
I think we need a new Setting in our PrimeGrid preferences:
Use AVX.
+1
Messing with an app_info file can be a pain. ;-)
AVX is an attribute/capability of your CPU, not a separate device. If/when an AVX-enabled version of LLR becomes the standard CPU app for primality testing in PG/BOINC, it will determine if your CPU is capable, and if so, it will use AVX instructions automatically, and if not, it will use a different (slower) instruction set. There is no need for the user to make a choice.
The current app is only available via app_info because it is still in "test". It has insufficient mileage on it to release into the wild, and there are still failures (so far as I am aware) with running on AMD/Bulldozer processors. Perhaps the next gwnum library release will resolve this but I don't know.
--Gary
p.s. Yes I agree app_info is a pain :-)
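To illustrate the idea of automatic selection, here is a toy sketch of runtime dispatch. It is not how LLR or gwnum are actually implemented; the table of code paths and the function name are purely hypothetical, and the only point is that the program, not the user, picks the fastest path the CPU reports support for.

```python
# Hypothetical code paths, fastest first; the names are illustrative only.
CODE_PATHS = [("avx", "AVX FFT"), ("sse2", "SSE2 FFT")]


def pick_code_path(cpu_flags):
    """Return the fastest path whose required flag the CPU reports, else a generic fallback."""
    for flag, name in CODE_PATHS:
        if flag in cpu_flags:
            return name
    return "x87 FFT"
```

With this shape, a Sandy Bridge host reporting `{"sse2", "avx"}` gets the AVX path, while an older CPU reporting only `{"sse2"}` quietly falls back, with no preference setting involved.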
____________
"I am he as you are he as you are me and we are all together"
87*2^3496188+1 is prime! (1052460 digits)
4 |