PrimeGrid
drummers-lowrise
11) Message boards : Number crunching : ARM Devices (Message 63592)
Posted 2742 days ago by ebahapo (Project donor)
Would it be possible to build this application for generic ARM Linux platforms such as arm-unknown-linux-gnueabi (without hardware floating-point support, like ARMv5) and arm-unknown-linux-gnueabihf (with hardware floating-point support, like ARMv6)?
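As a minimal sketch of how a build could tell the two targets apart, the function below reports the floating-point calling convention it was compiled for. It assumes GCC or Clang, which predefine `__arm__`, `__ARM_PCS_VFP` and `__SOFTFP__`; the function name `arm_float_abi` is only illustrative.

```c
#include <stdio.h>

/* Report the ARM floating-point calling convention this file was compiled for.
   Relies on macros predefined by GCC and Clang (an assumption of this sketch). */
const char *arm_float_abi(void)
{
#if !defined(__arm__)
    return "not a 32-bit ARM target";
#elif defined(__ARM_PCS_VFP)
    return "hard-float ABI (gnueabihf style)";
#elif defined(__SOFTFP__)
    return "soft-float ABI without FPU instructions (gnueabi style)";
#else
    return "soft-float calling convention with FPU instructions (softfp)";
#endif
}
```

Printing the result with `puts(arm_float_abi())` from a test build is a quick way to confirm which toolchain variant actually produced the binary.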

Though ARM is not a high-performance processor by today's standards, it may be as fast as the typical PC of a few years ago and comparable to a current Intel Atom. Perhaps PPS-LLR and SG-LLR would be good candidates to try first.

Other projects like Enigma, OProject, QCN, Radioactive, Yoyo and WUProp already provide an application for such platforms.

I've helped out other projects getting the applications built and tested, as can be seen here. Please, let me know if I can help.

TIA
12) Message boards : Proth Prime Search : PPS LLR (Message 17496)
Posted 4050 days ago by ebahapo (Project donor)
I see some PPS LLR WUs which are "extended". Can a selection be added to the preferences page to enable only such WUs?

TIA
13) Message boards : AP26 - AP27 Search : Improving the AP26 application (Message 14197)
Posted 4215 days ago by ebahapo (Project donor)
I see. Yet, other PrimeGrid applications take much longer than 41min to run, so why not 32-bit AP26 too?

With AP26, there's a 1:8 processing time ratio between 64-bit and 32-bit. With the sieves, the greatest ratio is 1:2. And with LLR, it's 1:1.

Since the x86 Linux application came out, the 64-bit to 32-bit ratio seems to be more like 1:2. Why not release an x86 Windows version too?

TIA
14) Message boards : Number crunching : New LLR wrapper (Message 13724)
Posted 4229 days ago by ebahapo (Project donor)
Should the wrapper really be statically linked against all libraries, or just against the C++ libraries?

The sr2sieve executables are not statically linked. How many people are unable to run sieve tasks because of this?

The sr2sieve executables do link libstdc++ statically, as the ldd output below shows: libstdc++ does not appear among the dynamic dependencies.

$ ldd ~/primegrid_sr2sieve_*
~/primegrid_sr2sieve_1.07_x86_64-pc-linux-gnu.orig:
        libm.so.6 => /lib64/tls/libm.so.6 (0x00002ab39270a000)
        libc.so.6 => /lib64/tls/libc.so.6 (0x00002ab392863000)
        /lib64/ld-linux-x86-64.so.2 (0x00002ab3925f3000)
~/primegrid_sr2sieve_wrapper_1.07_x86_64-pc-linux-gnu:
        libm.so.6 => /lib64/tls/libm.so.6 (0x00002b7b63600000)
        libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x00002b7b63759000)
        libc.so.6 => /lib64/tls/libc.so.6 (0x00002b7b6386d000)
        /lib64/ld-linux-x86-64.so.2 (0x00002b7b634e9000)

So it's really a matter of selectively linking specific libraries statically.

HTH
15) Message boards : Number crunching : New LLR wrapper (Message 13696)
Posted 4230 days ago by ebahapo (Project donor)
Looks like linking the c++ libs statically (like was done for the prpclient) would be a good thing to aid maximum portability for least pain - especially on older distros.

Indeed. Remember this recipe?

HTH
16) Message boards : Number crunching : Laptop Overheating (Message 13133)
Posted 4250 days ago by ebahapo (Project donor)
I limit the BOINC setting "use at most % of CPU time" in a separate venue to control the temperature on laptops. I found that with it set to 20%, the fans on all my laptops do not spin up and the temperature stays around 60 °C.

Laptops are much more prone to thermal failure because they were not built for heavy duty: the air inlets are small, the high-speed fan has weak bearings, the compact design doesn't allow heat to dissipate quickly, etc. All in all, it's better to contribute to BOINC projects at a reduced rate than to have to replace the laptop after a thermal failure caused by power-hungry BOINC projects.

HTH
17) Message boards : AP26 - AP27 Search : Improving the AP26 application (Message 12934)
Posted 4258 days ago by ebahapo (Project donor)
From what I can work out, part of the reason that the old 32-bit code was so much slower is that GCC (at least version 4.1.2) doesn't seem to be able to make an important optimisation for this sort of function in 32-bit mode:

That's because only x86-64 has 64-bit registers and 64-bit multiplications.
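To illustrate the cost, here is a minimal sketch of what a 64-bit multiplication looks like when only 32-bit registers and 32x32->64-bit multiplies are available. It is not taken from the AP26 source, and the function name is made up for the example.

```c
#include <stdint.h>

/* Low 64 bits of a*b built from 32-bit halves, roughly the work a
   compiler has to generate for a 32-bit x86 target. */
uint64_t mul64_from_32bit_parts(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    uint64_t low   = (uint64_t)a_lo * b_lo;      /* full 32x32->64 product   */
    uint32_t cross = a_lo * b_hi + a_hi * b_lo;  /* wraps mod 2^32 on purpose */
    return low + ((uint64_t)cross << 32);
}
```

That is three multiplications plus additions and shifts where x86-64 needs a single instruction, before counting the extra register pressure.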

HTH
18) Message boards : AP26 - AP27 Search : Improving the AP26 application (Message 12927)
Posted 4258 days ago by ebahapo (Project donor)
Shouldn't it be

if(sito&=OKOK101[n59-101*(int64_t)(n59*(1.0/101))])

instead?

Yes, thanks.

#if 0
sito=OKOK61[r61];
if(sito&=OKOK67[r67])
if(sito&=OKOK71[r71])
if(sito&=OKOK73[r73])
if(sito&=OKOK79[r79])
if(sito&=OKOK83[r83])
if(sito&=OKOK89[r89])
if(sito&=OKOK97[r97])
#else
sito=OKOK61[r61]
   & OKOK67[r67]
   & OKOK71[r71]
   & OKOK73[r73]
   & OKOK79[r79]
   & OKOK83[r83]
   & OKOK89[r89]
   & OKOK97[r97];
#endif
if(sito&=OKOK101[n59-101*(int64_t)(n59*(1.0/101))])
if(sito&=OKOK103[n59-103*(int64_t)(n59*(1.0/103))])
if(sito&=OKOK107[n59-107*(int64_t)(n59*(1.0/107))])
if(sito&=OKOK109[n59-109*(int64_t)(n59*(1.0/109))])
if(sito&=OKOK113[n59-113*(int64_t)(n59*(1.0/113))])
if(sito&=OKOK127[n59-127*(int64_t)(n59*(1.0/127))])
if(sito&=OKOK131[n59-131*(int64_t)(n59*(1.0/131))])
if(sito&=OKOK137[n59-137*(int64_t)(n59*(1.0/137))])
if(sito&=OKOK139[n59-139*(int64_t)(n59*(1.0/139))])
if(sito&=OKOK149[n59-149*(int64_t)(n59*(1.0/149))])
if(sito&=OKOK151[n59-151*(int64_t)(n59*(1.0/151))])
if(sito&=OKOK157[n59-157*(int64_t)(n59*(1.0/157))])
if(sito&=OKOK163[n59-163*(int64_t)(n59*(1.0/163))])
if(sito&=OKOK167[n59-167*(int64_t)(n59*(1.0/167))])
if(sito&=OKOK173[n59-173*(int64_t)(n59*(1.0/173))])
if(sito&=OKOK179[n59-179*(int64_t)(n59*(1.0/179))])
if(sito&=OKOK181[n59-181*(int64_t)(n59*(1.0/181))])
if(sito&=OKOK191[n59-191*(int64_t)(n59*(1.0/191))])
if(sito&=OKOK193[n59-193*(int64_t)(n59*(1.0/193))])
if(sito&=OKOK197[n59-197*(int64_t)(n59*(1.0/197))])
if(sito&=OKOK199[n59-199*(int64_t)(n59*(1.0/199))])
if(sito&=OKOK211[n59-211*(int64_t)(n59*(1.0/211))])
if(sito&=OKOK223[n59-223*(int64_t)(n59*(1.0/223))])
if(sito&=OKOK227[n59-227*(int64_t)(n59*(1.0/227))])
if(sito&=OKOK229[n59-229*(int64_t)(n59*(1.0/229))])
if(sito&=OKOK233[n59-233*(int64_t)(n59*(1.0/233))])
if(sito&=OKOK239[n59-239*(int64_t)(n59*(1.0/239))])
if(sito&=OKOK241[n59-241*(int64_t)(n59*(1.0/241))])
if(sito&=OKOK251[n59-251*(int64_t)(n59*(1.0/251))])
if(sito&=OKOK257[n59-257*(int64_t)(n59*(1.0/257))])
if(sito&=OKOK263[n59-263*(int64_t)(n59*(1.0/263))])
if(sito&=OKOK269[n59-269*(int64_t)(n59*(1.0/269))])
if(sito&=OKOK271[n59-271*(int64_t)(n59*(1.0/271))])
if(sito&=OKOK277[n59-277*(int64_t)(n59*(1.0/277))])
if(sito&=OKOK281[n59-281*(int64_t)(n59*(1.0/281))])
if(sito&=OKOK283[n59-283*(int64_t)(n59*(1.0/283))])
if(sito&=OKOK293[n59-293*(int64_t)(n59*(1.0/293))])
if(sito&=OKOK307[n59-307*(int64_t)(n59*(1.0/307))])
if(sito&=OKOK311[n59-311*(int64_t)(n59*(1.0/311))])
if(sito&=OKOK313[n59-313*(int64_t)(n59*(1.0/313))])
if(sito&=OKOK317[n59-317*(int64_t)(n59*(1.0/317))])
if(sito&=OKOK331[n59-331*(int64_t)(n59*(1.0/331))])
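The index expression in each line is n59 mod p computed with a multiplication by a reciprocal instead of a division. A minimal sketch of that idea as a standalone function follows, assuming 0 <= n < 2^48 and a divisor of a few hundred at most, as in this thread. The function name and the two correction steps are my additions: because 1.0/p and the product are both rounded, the truncated quotient may be off by one, for instance when n is an exact multiple of p.

```c
#include <stdint.h>

/* n mod p via multiplication by a floating-point reciprocal.
   Assumes 0 <= n < 2^48 and a small positive p (a few hundred at most). */
int64_t mod_by_reciprocal(int64_t n, int64_t p)
{
    int64_t q = (int64_t)((double)n * (1.0 / (double)p)); /* approximate quotient */
    int64_t r = n - p * q;                                 /* candidate remainder  */
    if (r < 0)  r += p;   /* quotient was one too large */
    if (r >= p) r -= p;   /* quotient was one too small */
    return r;
}
```

In the real loop each p is a literal, so the compiler folds 1.0/p into a constant and no division is executed at run time; here p is a parameter only to keep the sketch short.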
19) Message boards : AP26 - AP27 Search : Improving the AP26 application (Message 12924)
Posted 4258 days ago by ebahapo (Project donor)
Note that n59 does not exceed 2^48.
Each of the variables "sito" and OKOK***[***] has 64 bits.

The most time consuming part starts at the line containing OKOK101.

Since n59 is smaller than 2^48, long double is not needed, as it fits in the 52 bits of mantissa in double.

How about this code instead?

#if 0
sito=OKOK61[r61];
if(sito&=OKOK67[r67])
if(sito&=OKOK71[r71])
if(sito&=OKOK73[r73])
if(sito&=OKOK79[r79])
if(sito&=OKOK83[r83])
if(sito&=OKOK89[r89])
if(sito&=OKOK97[r97])
#else
sito=OKOK61[r61]
   & OKOK67[r67]
   & OKOK71[r71]
   & OKOK73[r73]
   & OKOK79[r79]
   & OKOK83[r83]
   & OKOK89[r89]
   & OKOK97[r97];
#endif
if(sito&=OKOK101[n59-(int64_t)(n59*(1.0/101))])
if(sito&=OKOK103[n59-(int64_t)(n59*(1.0/103))])
if(sito&=OKOK107[n59-(int64_t)(n59*(1.0/107))])
if(sito&=OKOK109[n59-(int64_t)(n59*(1.0/109))])
if(sito&=OKOK113[n59-(int64_t)(n59*(1.0/113))])
if(sito&=OKOK127[n59-(int64_t)(n59*(1.0/127))])
if(sito&=OKOK131[n59-(int64_t)(n59*(1.0/131))])
if(sito&=OKOK137[n59-(int64_t)(n59*(1.0/137))])
if(sito&=OKOK139[n59-(int64_t)(n59*(1.0/139))])
if(sito&=OKOK149[n59-(int64_t)(n59*(1.0/149))])
if(sito&=OKOK151[n59-(int64_t)(n59*(1.0/151))])
if(sito&=OKOK157[n59-(int64_t)(n59*(1.0/157))])
if(sito&=OKOK163[n59-(int64_t)(n59*(1.0/163))])
if(sito&=OKOK167[n59-(int64_t)(n59*(1.0/167))])
if(sito&=OKOK173[n59-(int64_t)(n59*(1.0/173))])
if(sito&=OKOK179[n59-(int64_t)(n59*(1.0/179))])
if(sito&=OKOK181[n59-(int64_t)(n59*(1.0/181))])
if(sito&=OKOK191[n59-(int64_t)(n59*(1.0/191))])
if(sito&=OKOK193[n59-(int64_t)(n59*(1.0/193))])
if(sito&=OKOK197[n59-(int64_t)(n59*(1.0/197))])
if(sito&=OKOK199[n59-(int64_t)(n59*(1.0/199))])
if(sito&=OKOK211[n59-(int64_t)(n59*(1.0/211))])
if(sito&=OKOK223[n59-(int64_t)(n59*(1.0/223))])
if(sito&=OKOK227[n59-(int64_t)(n59*(1.0/227))])
if(sito&=OKOK229[n59-(int64_t)(n59*(1.0/229))])
if(sito&=OKOK233[n59-(int64_t)(n59*(1.0/233))])
if(sito&=OKOK239[n59-(int64_t)(n59*(1.0/239))])
if(sito&=OKOK241[n59-(int64_t)(n59*(1.0/241))])
if(sito&=OKOK251[n59-(int64_t)(n59*(1.0/251))])
if(sito&=OKOK257[n59-(int64_t)(n59*(1.0/257))])
if(sito&=OKOK263[n59-(int64_t)(n59*(1.0/263))])
if(sito&=OKOK269[n59-(int64_t)(n59*(1.0/269))])
if(sito&=OKOK271[n59-(int64_t)(n59*(1.0/271))])
if(sito&=OKOK277[n59-(int64_t)(n59*(1.0/277))])
if(sito&=OKOK281[n59-(int64_t)(n59*(1.0/281))])
if(sito&=OKOK283[n59-(int64_t)(n59*(1.0/283))])
if(sito&=OKOK293[n59-(int64_t)(n59*(1.0/293))])
if(sito&=OKOK307[n59-(int64_t)(n59*(1.0/307))])
if(sito&=OKOK311[n59-(int64_t)(n59*(1.0/311))])
if(sito&=OKOK313[n59-(int64_t)(n59*(1.0/313))])
if(sito&=OKOK317[n59-(int64_t)(n59*(1.0/317))])
if(sito&=OKOK331[n59-(int64_t)(n59*(1.0/331))])

The code that replaces the remainder operation should be much faster, but with so many chained branches, there ought to be a lot of CPU stalls due to branch misprediction.
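To make the trade-off concrete, here is a small sketch of the two ways of combining the masks. It is not the AP26 code: the masks are passed in as a plain array and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Early-exit form: one data-dependent branch per mask, stops at the first zero. */
uint64_t and_masks_chained(const uint64_t *masks, size_t count)
{
    uint64_t sito = ~(uint64_t)0;
    for (size_t i = 0; i < count; i++) {
        sito &= masks[i];
        if (sito == 0)
            break;
    }
    return sito;
}

/* Branch-free form: always combines every mask, no data-dependent branch. */
uint64_t and_masks_branchless(const uint64_t *masks, size_t count)
{
    uint64_t sito = ~(uint64_t)0;
    for (size_t i = 0; i < count; i++)
        sito &= masks[i];
    return sito;
}
```

Both return the same value. Which one is faster depends on how often an early mask already zeroes the result and on the cost of each table lookup, so it would have to be measured on the real data.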

HTH
20) Message boards : AP26 - AP27 Search : Improving the AP26 application (Message 12922)
Posted 4258 days ago by ebahapo (Project donor)
edit: BTW x87 ops can be vectorised by hand, sr2sieve uses highly vectorised ASM code and x87 ops are used in the x86_64 version instead of SSE2 when the factors exceed 2^51.

x87 cannot be vectorized because each instruction operates on only one datum, unlike SSE2, which operates on two or four data.

But I understand why x87 was used. However, I think that fixed point would be an alternative too.
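As a rough sketch of what I mean by fixed point: scale the reciprocal by 2^64, keep it as an integer, and take the high half of the product as the quotient. This is not sr2sieve's code. The function names are invented, and `unsigned __int128` is a GCC/Clang extension available on 64-bit targets, so this form would not build as-is for 32-bit x86.

```c
#include <assert.h>
#include <stdint.h>

/* Reciprocal of p scaled by 2^64 and rounded up: ceil(2^64 / p), for p >= 2. */
uint64_t fixed_reciprocal(uint64_t p)
{
    assert(p >= 2);
    return UINT64_MAX / p + 1;
}

/* n mod p using the precomputed reciprocal m = fixed_reciprocal(p).
   Exact when n < 2^48 and p < 2^16 (the rounding error in m stays too
   small to change the truncated quotient). */
uint64_t mod_fixed_point(uint64_t n, uint64_t p, uint64_t m)
{
    uint64_t q = (uint64_t)(((unsigned __int128)n * m) >> 64); /* floor(n / p) */
    return n - q * p;
}
```

The reciprocal is computed once per divisor, e.g. `uint64_t m = fixed_reciprocal(101);`, and then every remainder costs one widening multiply, one ordinary multiply and a subtraction, with no floating-point rounding to worry about.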


Copyright © 2005 - 2020 Rytis Slatkevičius (contact) and PrimeGrid community. Server load 1.08, 1.34, 1.75
Generated 19 Sep 2020 | 10:55:14 UTC