PrimeGrid

drummers-lowrise
1) Message boards : News : Be careful with BOINC computers on the Internet (Message 133993)
Posted 341 days ago by Profile JStateson
I did a similar test, and there is indeed something wrong with the message. When I try to connect from an address that is not in remote_hosts.cfg, I do get an error message, but the IP address reported in the error message is wrong.

For example, I just got the following message, when trying to connect from 80.214.154.84.

620: 09-Aug-2019 06:34:19 (low) [] GUI RPC request from non-allowed address 2.0.191.203


Even if the IP address in the message is wrong, Mike's advice should be followed for servers connected to the Internet.


If you (or anyone reading this thread) have a GitHub account, please go over to https://github.com/BOINC/boinc/issues/3246 and add your support to get this issue fixed. The fact that the French IP address is used by default clearly shows a problem with the code.
2) Message boards : Generalized Fermat Prime Search : Need help excluding genefer 21 or maybe all genefer (Message 113179)
Posted 986 days ago by Profile JStateson
I have a pair of GTX 670s, which are (usually) not good enough for genefer 21, and it seems even Titans are not good enough: I am at the bottom, the job cancelled at 109365 seconds


I don't understand what you mean by "not good enough". If this is the task you're referring to, it was estimating a completion time of 28 hours for the task. There's no reason you can't run any of the GFN tasks on a GTX 670. It will, of course, take longer than a newer GPU, but it's still plenty fast enough. This project started years before the GTX 670 came out, and we've run the larger GFN-22 tasks on much slower GPUs than your Maxwell-class 670s.

Also, maybe you should talk to the people at Gridcoin who run the grcpool accounts. It's not at all appropriate to be assigning random computers to many of the tasks at PrimeGrid. Pooled computers should be assigned to short, low-heat tasks. GCW-Sieve on the CPU and PPS-Sieve on the GPU. There's a problem both with assigning really slow computers to really long tasks, but more importantly, with assigning poorly cooled computers of unsuspecting users to tasks that might damage their computers. (Maybe GFN-15 would be a better choice than PPS-Sieve for GPUs, now that I think of it. It probably runs cooler.)



I have spent some time looking at this problem. Yes, my GTX 670 is good enough. No, the Gridcoin team does not assign random computers to your tasks; it assigns ALL hosts equally. I spent time on this because I have been a big supporter of PrimeGrid. Excluding a project because it takes too long is not the proper way to go, because the work unit gets downloaded and then deleted by the exclusion command.

Yes, your project has been around long before the gtx670 was put together, but it seems you have not changed your CPU estimate requirements to keep up with newer CPUs and GPUs.

Looking here, there are 3 tasks that errored, but the other 4 either timed out or were aborted by the user on purpose. Note that the in-progress task is still running weeks later.

The problem is your CPU estimate at

    sched_request_www.primegrid.com
    <avg_ncpus>0.078836</avg_ncpus>
    <max_ncpus>0.078836</max_ncpus>


and


    init_data.xml
    <ncpus>0.078836</ncpus>



is way too low for some of your work units. The value of 0.078837 (or 0.078836 as above, when rewritten after a checkpoint restore) is exceedingly low for a GTX 670 for what appears to be all the genefer work units. I do not know what the minimum requirement is, but setting the CPU value to 1.0 moved the time to complete from about 2 weeks to hours or less.

I discovered this when I observed that my GPU temps were way too cool, only 2-3 °C above the CPU, and the GPU usage stayed at about 2%, shot to 80% momentarily, then dropped back to 2%.

To fix this I had to do 2 things. The file sched_request_www.primegrid.com at \ProgramData\Boinc had to be edited and every occurrence of 0.078837 (or 0.078836) replaced with 1.0
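The manual search-and-replace described above could be scripted roughly as follows. This is only a sketch: the tag names and the two old values come from the snippets in this post, while the function name `set_ncpus`, the backup step, and the assumption that the BOINC client is stopped while the file is edited are mine.

```python
import re
from pathlib import Path

# Tag names and old values are the ones quoted in this post.
NCPUS_TAGS = ("avg_ncpus", "max_ncpus", "ncpus")
OLD_VALUES = ("0.078837", "0.078836")


def set_ncpus(text, old_values=OLD_VALUES, new_value="1.0"):
    """Replace the listed old values inside ncpus-style tags with new_value."""
    tags = "|".join(NCPUS_TAGS)
    olds = "|".join(re.escape(v) for v in old_values)
    pattern = re.compile(r"<(%s)>(?:%s)</\1>" % (tags, olds))
    return pattern.sub(lambda m: "<%s>%s</%s>" % (m.group(1), new_value, m.group(1)), text)


def patch_file(path):
    """Hypothetical helper: keep a .bak copy, then rewrite the file in place."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    path.with_name(path.name + ".bak").write_text(original, encoding="utf-8")
    path.write_text(set_ncpus(original), encoding="utf-8")


sample = "<avg_ncpus>0.078836</avg_ncpus>\n<max_ncpus>0.078836</max_ncpus>\n<ncpus>4.0</ncpus>\n"
print(set_ncpus(sample))
```

Only tags that hold one of the two quoted values are touched, so entries for other applications keep whatever value they had.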

I also created the following app_config.xml file and put it in the PrimeGrid project directory. However, this file by itself was not sufficient and I had to edit the above scheduler file.


    <app_config>

    <app>
    <name>genefer</name>
    <gpu_versions>
    <gpu_usage>1</gpu_usage>
    <cpu_usage>1</cpu_usage>
    </gpu_versions>
    </app>

    <app>
    <name>ap26</name>
    <gpu_versions>
    <gpu_usage>1</gpu_usage>
    <cpu_usage>1</cpu_usage>
    </gpu_versions>
    </app>

    <app>
    <name>pps_sr2sieve</name>
    <gpu_versions>
    <gpu_usage>1</gpu_usage>
    <cpu_usage>1</cpu_usage>
    </gpu_versions>
    </app>

    <app>
    <name>genefer15</name>
    <gpu_versions>
    <gpu_usage>1</gpu_usage>
    <cpu_usage>1</cpu_usage>
    </gpu_versions>
    </app>

    <app>
    <name>genefer16</name>
    <gpu_versions>
    <gpu_usage>1</gpu_usage>
    <cpu_usage>1</cpu_usage>
    </gpu_versions>
    </app>

    <app>
    <name>genefer17low</name>
    <gpu_versions>
    <gpu_usage>1</gpu_usage>
    <cpu_usage>1</cpu_usage>
    </gpu_versions>
    </app>

    <app>
    <name>genefer17mega</name>
    <gpu_versions>
    <gpu_usage>1</gpu_usage>
    <cpu_usage>1</cpu_usage>
    </gpu_versions>
    </app>

    <app>
    <name>genefer18</name>
    <gpu_versions>
    <gpu_usage>1</gpu_usage>
    <cpu_usage>1</cpu_usage>
    </gpu_versions>
    </app>

    <app>
    <name>genefer20</name>
    <gpu_versions>
    <gpu_usage>1</gpu_usage>
    <cpu_usage>1</cpu_usage>
    </gpu_versions>
    </app>

    </app_config>
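
Because every `<app>` entry above has the same shape, the file can also be generated from a list of application names. The following is a minimal sketch using Python's standard library; the app names are copied from the file above, and the helper name `build_app_config` is my own.

```python
import xml.etree.ElementTree as ET

# Application short names exactly as they appear in the app_config.xml above.
APP_NAMES = [
    "genefer", "ap26", "pps_sr2sieve", "genefer15", "genefer16",
    "genefer17low", "genefer17mega", "genefer18", "genefer20",
]


def build_app_config(app_names, gpu_usage=1, cpu_usage=1):
    """Build an <app_config> tree with one <app> entry per name."""
    root = ET.Element("app_config")
    for name in app_names:
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = name
        gpu = ET.SubElement(app, "gpu_versions")
        ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
        ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    return root


root = build_app_config(APP_NAMES)
ET.indent(root)  # pretty-printing helper, available from Python 3.9
print(ET.tostring(root, encoding="unicode"))
```

Adding another application then means adding one name to the list instead of copying a seven-line block.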



This is what it looks like now


The Genefer 19 work unit still shows 0.0788 because the checkpoint restores the original 0.0788 value, and I aborted the task as it would drag on for days at that rate.

3) Message boards : Generalized Fermat Prime Search : Need help excluding genefer 21 or maybe all genefer (Message 112731)
Posted 999 days ago by Profile JStateson
I have a pair of GTX 670s, which are (usually) not good enough for genefer 21, and it seems even Titans are not good enough: I am at the bottom, the job cancelled at 109365 seconds

I do not have ownership of the project so I am limited to cc_config or app_config changes.

Looking at a FAQ here in this subforum I see the following example:

    <exclude_gpu>
    <url>http://www.primegrid.com/</url>
    <device_num>1</device_num>
    <type>nvidia</type>
    <app>genefer</app>
    </exclude_gpu>



Can I just leave off the device_num so that genefer gets excluded on all GPUs?

Can I assume <app>genefer</app> works for all, not just 21?

How can I exclude just 21? Looking at my valid tasks, it seems that numbers smaller than 21 are OK. Most tasks finish in 30 minutes or less. Some, like the ap27 search, appear to take 40-50 days but actually finish in only 1-2 hours (strange, not sure why). Has anyone else seen this?

[EDIT] Got rid of the nvidia line after seeing this message:

    P5E_QUAD

    20 PrimeGrid 12/28/2017 9:52:10 AM cc_config.xml: bad type 'nvidia' in GPU exclusion; valid types: NVIDIA



Now I see that the genefer 21 tasks (two of them) are no longer running, simply waiting, and two genefer tasks (a 17 and a 16) are running. So specifying just "genefer" seems not to have stopped all genefer tasks from running. If the app did not like "genefer", it would have been nice if it had listed what was needed to stop the job. I misspelled an Einstein task (in app_config, not cc_config) and the app listed the correct names of the tasks, which was convenient, but that might not apply here.

[EDIT2] Aborted the genefer21 tasks. I will let 16 and 17 run as they look good, but I have set no new work until I can figure this out.
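
For anyone building these exclusions by script, here is a rough sketch that writes one `<exclude_gpu>` block per application name and leaves out `<device_num>`. As I understand the BOINC client configuration documentation, `<device_num>`, `<type>` and `<app>` are optional inside `<exclude_gpu>`, and omitting `<device_num>` excludes all GPUs of the given type; treat that as an assumption to check against the documentation for your client version. The type is written as `NVIDIA` because of the log message quoted above. The helper name `exclude_gpu` is mine, and the thread does not establish which short name corresponds to genefer 21, so the names in the list are placeholders taken from app names seen elsewhere in these posts.

```python
import xml.etree.ElementTree as ET

PROJECT_URL = "http://www.primegrid.com/"  # URL from the FAQ example above


def exclude_gpu(app=None, device_num=None, gpu_type="NVIDIA", url=PROJECT_URL):
    """Build one <exclude_gpu> element; optional children are skipped when None."""
    el = ET.Element("exclude_gpu")
    ET.SubElement(el, "url").text = url
    if device_num is not None:
        ET.SubElement(el, "device_num").text = str(device_num)
    if gpu_type is not None:
        ET.SubElement(el, "type").text = gpu_type
    if app is not None:
        ET.SubElement(el, "app").text = app
    return el


# Placeholder list: substitute the short names you actually want excluded.
for name in ["genefer", "genefer20"]:
    block = exclude_gpu(app=name)
    ET.indent(block)  # available from Python 3.9
    print(ET.tostring(block, encoding="unicode"))
```

The printed blocks belong inside the `<options>` section of cc_config.xml.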

4) Message boards : Number crunching : Genefer CPU Completion Times (Message 61626)
Posted 2809 days ago by Profile JStateson
I gave up on all my Opteron CPUs. Even the fastest, the Opteron 290, was not even halfway finished with 4 days left. They lack SSSE3 and have half the cache of a Q6700.
My Q6700 will easily finish a day or 2 ahead; the Q9550 may finish on the 13th, but it is close. I lost 2 CPU work units to Microsoft's Tuesday updates, and a couple of CUDA units that night as well.
5) Message boards : Number crunching : Underclocking not dropping temps & precision X not working (Message 61444)
Posted 2813 days ago by Profile JStateson
Hmm - Seems I cannot edit my post to show it was solved.

I have solved the problem: MSI's Afterburner correctly sets the voltage for multiple GPUs. Temps on the 2nd GPU dropped immediately when I set the GPU core voltage to a level corresponding to the downclocked speed.
6) Message boards : Number crunching : Underclocking not dropping temps & precision X not working (Message 61441)
Posted 2813 days ago by Profile JStateson
As suggested in another thread, I solved a lot of problems by downclocking the memory and GPU. However, I noticed that the temperatures did not drop. The voltages on the GTX 460s are set high to make sure the factory overclock works. I tried setting them down. I did this because one of the GPUs was over 85 °C and was being throttled by a safety program I installed that throttles and notifies me.

To make a long story short: Precision X 3.04 changes only the first GPU's voltage and DOES NOT CHANGE THE OTHERS! I posted a complaint at their site here

Are there any other programs like EVGA's Precision X that can adjust voltage correctly? Has anyone else seen this problem? I need to set the voltage back on one of my GPUs that is running too hot. Alternatively, I will have to get my large fans out of storage, as I put them away for winter.
7) Message boards : Number crunching : np=30, np=15, np=7 ??? (Message 61320)
Posted 2815 days ago by Profile JStateson
Thanks for the explanation. I used one computer to read that stderr output and another to post at the forum and between reading and posting MP got changed to NP and "multi" to "number"


PPS (Sieve): Detected 30 multiprocessors
Genefr: MP=30

I loaded the NVIDIA tools onto the system with the GTX 280, and supposedly the clocks are all set at the standard settings.

The wingman for this work unit downloaded 120+ units at 18:01 GMT, so it may be a while before my unit gets confirmed as valid. I have not used this board, the GTX 280, for a while, and a capacitor broke off that needs to be replaced.
8) Message boards : Number crunching : np=30, np=15, np=7 ??? (Message 61310)
Posted 2816 days ago by Profile JStateson
I see those numbers listed in PrimeGrid task units for various apps. Sometimes it is spelled out: Number Processors = 7, etc.

GTX 570 has np=15 and takes 32k seconds to run GFN and runs warm
GTX 460 has np=7 and takes 64k seconds and runs hot
Unaccountably, GTX280 has np=30 and takes 33 hours and seems to be at idle temp (51c)

Fermi has some integer pipelining that the 280 does not have. The 280 is also compute 1.3. Anyway, the NP means Number of Processors but I think there are way more than 7, 15 or 30

What is the significance?

Thanks for looking!
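
As the later follow-up notes, the figure is the multiprocessor count ("Detected 30 multiprocessors"), which is why it is so much smaller than the number of cores. A small sketch of the arithmetic: the multiprocessor counts are the ones quoted in this post, the cores-per-multiprocessor figures (8 for compute capability 1.x, 32 for 2.0, 48 for 2.1) are the published NVIDIA values, and the GTX 570's compute capability of 2.0 is my addition, since the thread only mentions 1.3 for the 280 and 2.1 for the 460.

```python
# CUDA cores per multiprocessor, keyed by compute capability (major, minor).
CORES_PER_SM = {(1, 3): 8, (2, 0): 32, (2, 1): 48}

# (multiprocessor count as reported in the task output, compute capability)
GPUS = {
    "GTX 280": (30, (1, 3)),
    "GTX 570": (15, (2, 0)),
    "GTX 460": (7, (2, 1)),
}


def cuda_cores(multiprocessors, compute_capability):
    """Total CUDA cores = multiprocessors x cores per multiprocessor."""
    return multiprocessors * CORES_PER_SM[compute_capability]


for name, (mp, cc) in GPUS.items():
    print(f"{name}: {mp} multiprocessors x {CORES_PER_SM[cc]} cores = {cuda_cores(mp, cc)} CUDA cores")
```

So the card with the highest multiprocessor count here actually has the fewest cores, because each of its multiprocessors is much narrower.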
9) Message boards : Number crunching : The Year of The Snake Challenge (Message 61283)
Posted 2816 days ago by Profile JStateson
I see terms like 40k task and 60k task. What does this mean and how can I determine what type of task I am running?

Thanks!
10) Message boards : Number crunching : The Year of The Snake Challenge (Message 61267)
Posted 2817 days ago by Profile JStateson
Some statistics at the 24-hour mark, and a surprise (for me). It looks like my old GTX 280 will complete a task at the 32-hour mark. The GTX 570 and 460 are completing at 8.8 h and 17 h respectively.
Since the GTX 280 has about the same flops and memory as the 460, I assume the difference is its compute capability of 1.3 compared to the 460's 2.1.

CPU percent complete (average of all cores):
Q6700: 9.1% done
Q9550: 6% done
Opteron290: 5.3% done
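
These percentages can be turned into a rough completion estimate. The sketch below assumes progress stays linear over the whole task, which is a simplification of mine; the percentages are the 24-hour figures listed above, and the function name is hypothetical.

```python
def estimated_total_hours(elapsed_hours, percent_done):
    """Linear extrapolation: total time = elapsed time / fraction completed."""
    if percent_done <= 0:
        raise ValueError("percent_done must be positive")
    return elapsed_hours * 100.0 / percent_done


# Percent complete after 24 hours, as listed above.
progress_at_24h = {"Q6700": 9.1, "Q9550": 6.0, "Opteron290": 5.3}

for cpu, pct in progress_at_24h.items():
    total = estimated_total_hours(24, pct)
    print(f"{cpu}: about {total:.0f} h in total ({total / 24:.1f} days)")
```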

I would be interested in any values for i7 core %done

Thanks!


Copyright © 2005 - 2020 Rytis Slatkevičius (contact) and PrimeGrid community. Server load 3.43, 2.80, 2.19
Generated 22 Sep 2020 | 22:03:32 UTC