Message boards : Number crunching : Quantum chemistry calculations on GPU
Author | Message |
---|---|
I have received 4 'Quantum chemistry calculations on GPU' WU, 3 of them calculated successfully on 2 linux machines with 4080 and 4070ti. | |
ID: 60938 | Rating: 0 | rate: / Reply Quote | |
I've had 3 failures and 3 successes. Seems to be test tasks. Interesting bit is that they seem to be employing Nvidia Tensor core calculation paths. | |
ID: 60939 | Rating: 0 | rate: / Reply Quote | |
Hello, | |
ID: 60941 | Rating: 0 | rate: / Reply Quote | |
Hello, Thanks Steve. Is it intended that these tasks do not use the GPU right now? Most are reporting that they run a process on the GPU, but with no GPU utilization and no significant power draw over idle. Will they use tensor cores on RTX cards? If so, are they necessary? What about older GTX cards? | |
ID: 60942 | Rating: 0 | rate: / Reply Quote | |
I have now caught the card using VRAM and power, as long as I ignore what nvidia-smi tells me about which GPU the task is on. nvidia-smi gets confused by these QC test tasks but reports properly for the ATMbeta Python tasks. | |
ID: 60943 | Rating: 0 | rate: / Reply Quote | |
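One way to cross-check which card is actually doing the work is to read utilization, power draw, and memory use per GPU from `nvidia-smi`'s CSV query output rather than from its process table. The following is only a sketch: the thresholds and the helper names (`parse_gpu_csv`, `busy_gpus`, `query_gpus`) are made up for illustration, and the sample text stands in for real output.

```python
import subprocess

# Fields accepted by `nvidia-smi --query-gpu=...`
QUERY = "index,utilization.gpu,power.draw,memory.used"


def _num(field):
    """Convert a CSV field to float; values such as "[N/A]" become NaN."""
    try:
        return float(field)
    except ValueError:
        return float("nan")


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        index, util, power, mem = [f.strip() for f in line.split(",")]
        rows.append({
            "index": int(index),
            "util_pct": _num(util),
            "power_w": _num(power),
            "mem_mib": _num(mem),
        })
    return rows


def busy_gpus(rows, min_util_pct=5.0, min_mem_mib=500.0):
    """Return indices of GPUs that look busy (thresholds are arbitrary)."""
    return [r["index"] for r in rows
            if r["util_pct"] >= min_util_pct or r["mem_mib"] >= min_mem_mib]


def query_gpus():
    """Run nvidia-smi and parse its output (needs an NVIDIA driver installed)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_csv(out)


# Illustrative sample, not real measurements from this thread.
sample = "0, 0, 18.52, 12\n1, 97, 210.40, 6144\n"
print(busy_gpus(parse_gpu_csv(sample)))
```

On a real host, `busy_gpus(query_gpus())` would replace the sample text.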
Thanks for the feedback and thanks all for running the tests! | |
ID: 60945 | Rating: 0 | rate: / Reply Quote | |
Saw a big batch of work units yesterday for this project and all of them were successful on our end. For a relatively early beta, that's fantastic. What are others seeing with that big batch? | |
ID: 61024 | Rating: 0 | rate: / Reply Quote | |
I think there are no checkpoints right now, but the tasks at least restart from the beginning without an error. | |
ID: 61027 | Rating: 0 | rate: / Reply Quote | |
Makes sense. Were you running them 1x or 2x? | |
ID: 61030 | Rating: 0 | rate: / Reply Quote | |
I was running them mostly at 3x. On one system I actually ran them at 4x to see if the VRAM was sufficient. It was fastest overall at 4x, with ~2000s runtimes, so the fastest tasks were completing in about 8.5 minutes effective. | |
ID: 61031 | Rating: 0 | rate: / Reply Quote | |
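The "effective" time per task here is the wall-clock runtime divided by the number of tasks sharing the GPU. A quick sketch of that arithmetic, using the rounded figures from the post above (`effective_seconds` is just an illustrative helper):

```python
def effective_seconds(wall_seconds, concurrent_tasks):
    """Average time per completed task when several tasks run side by side."""
    return wall_seconds / concurrent_tasks


secs = effective_seconds(2000, 4)
print(f"{secs:.0f} s = {secs / 60:.1f} min per task")  # 500 s = 8.3 min per task
```

With runtimes a little over 2000 s, this lands close to the 8.5 minutes quoted.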
Wow that is impressive! | |
ID: 61038 | Rating: 0 | rate: / Reply Quote | |
I would really appreciate it if the task requirements and properties (VRAM requirement, no checkpointing, Linux only or Windows, etc.) were highlighted in the preferences section, right where you mark the sub-projects that you would like to support. That would save us volunteers a lot of time, instead of eventually finding out that our GPU isn't capable of handling them, or digging through pages and pages of forum entries. | |
ID: 61460 | Rating: 0 | rate: / Reply Quote | |
You should PM Gianni or Toni and point them at your request. Steve, the developer of the science app discussed here, has nothing to do with the project web pages. | |
ID: 61462 | Rating: 0 | rate: / Reply Quote | |
Ok Keith. Thanks for the info. I will ask nicely. :-) | |
ID: 61463 | Rating: 0 | rate: / Reply Quote | |
For quite a while now, only QC tasks have been available, with sometimes more than 100,000 unsent tasks, as seen on the project status page. | |
ID: 61468 | Rating: 0 | rate: / Reply Quote | |
What computer/os does it take to run the Quantum chemistry calculations on GPU? I've got a powerful Windows 11 machine with a top end I-9 processor and NVIDIA RTX 4090 sitting here idle. | |
ID: 61600 | Rating: 0 | rate: / Reply Quote | |
"What computer/os does it take to run the Quantum chemistry calculations on GPU? I've got a powerful Windows 11 machine with a top end I-9 processor and NVIDIA RTX 4090 sitting here idle." Linux. | |
ID: 61601 | Rating: 0 | rate: / Reply Quote | |
"What computer/os does it take to run the Quantum chemistry calculations on GPU? I've got a powerful Windows 11 machine with a top end I-9 processor and NVIDIA RTX 4090 sitting here idle." An old P100 or V100 is many times faster than a 4090 for these tasks. But yeah, they are only available for Linux anyway. | |
ID: 61603 | Rating: 0 | rate: / Reply Quote | |
Got one task where all eight hosts are failing due to "Nuclear gradients of %s not converged" at step #6 | |
ID: 61613 | Rating: 0 | rate: / Reply Quote | |
Heads up: problem with new Linux BOINC installation script. FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/boinc-client/.cupy' (see errors for host 625407). If you have sudo access to your machine (and I assume you do, if you are installing your own software), you should find /var/lib/boinc and create a symlink that points to it, named /var/lib/boinc-client. | |
ID: 61811 | Rating: 0 | rate: / Reply Quote | |
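The workaround above amounts to creating one symlink. A minimal sketch of the idea follows; `link_alias` is a hypothetical helper, and the demo uses a temporary directory in place of `/var/lib`, since the real paths need root privileges.

```python
import os
import tempfile


def link_alias(target, alias):
    """Create `alias` as a symlink to `target`; do nothing if `alias` exists."""
    if os.path.lexists(alias):
        return False
    os.symlink(target, alias, target_is_directory=True)
    return True


# Demo in a scratch directory standing in for /var/lib.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "boinc")        # stands in for /var/lib/boinc
    alias = os.path.join(root, "boinc-client")  # stands in for /var/lib/boinc-client
    os.mkdir(target)
    created = link_alias(target, alias)
    print(created, os.path.islink(alias))
```

On the affected host, the real call would be `link_alias("/var/lib/boinc", "/var/lib/boinc-client")`, run as root.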
"Heads up: problem with new Linux BOINC installation script." I'm willing to bet that the directory listed is just a relative file path definition, not hard coded: something like assuming you're in the running slot and then going to "../../.cupy", or maybe utilizing an environment variable with $PATH or even BOINC's internal path variables. I don't even install BOINC to that directory, nor do I have anything related to BOINC in my /var/lib directory; I have it in my home folder. If it were hard coded, no one with a standalone install would be doing work, and no one has reported any issues, so... If you migrated an existing system with apps and stuff already downloaded, you might have some lingering configuration from the old setup. Try resetting the project. | |
ID: 61812 | Rating: 0 | rate: / Reply Quote | |
This is a brand-new machine (less than 3 weeks old) - supplied with no OS installed, so I've installed Linux Mint and BOINC from cold - no history. | |
ID: 61813 | Rating: 0 | rate: / Reply Quote | |
"This is a brand-new machine (less than 3 weeks old) - supplied with no OS installed, so I've installed Linux Mint and BOINC from cold - no history." I think running it on a separate SSD is likely the issue, or a problem with Linux Mint, and it won't impact most people. It's clearly not hard coded, since I do not have /var/lib/boinc, /var/lib/boinc-client, /var/lib/boinc-data, or anything similar in my directories at all. In fact, the .cupy directory in use on my system is just put in the user's home directory (/home/ian/.cupy), and this works perfectly fine for QChem. You seem to have something else going on with the system that confuses the environment between $HOME and /var/lib/boinc-client, because it's not hard coded to that on GPUGRID's end. | |
ID: 61814 | Rating: 0 | rate: / Reply Quote | |
Yes that temp dir used by cupy should be located at $HOME/.cupy by default as I mentioned here: https://github.com/BOINC/boinc/discussions/5811#discussioncomment-10670615 | |
ID: 61815 | Rating: 0 | rate: / Reply Quote | |
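CuPy documents a `CUPY_CACHE_DIR` environment variable whose default is `${HOME}/.cupy/kernel_cache`, which would explain why the directory follows whatever `HOME` is for the account running the task. The sketch below only mirrors that documented rule and is not CuPy's own code; `cupy_cache_dir` is a hypothetical helper, and a `HOME` of `/var/lib/boinc-client` for the BOINC account is an assumption.

```python
import posixpath


def cupy_cache_dir(env):
    """Resolve the cache directory: CUPY_CACHE_DIR if set, else $HOME/.cupy/kernel_cache."""
    explicit = env.get("CUPY_CACHE_DIR")
    if explicit:
        return explicit
    return posixpath.join(env.get("HOME", ""), ".cupy", "kernel_cache")


# Standalone install run from a user's home folder.
print(cupy_cache_dir({"HOME": "/home/ian"}))
# Client running under an account whose HOME is assumed to be /var/lib/boinc-client.
print(cupy_cache_dir({"HOME": "/var/lib/boinc-client"}))
# An explicit override wins over HOME.
print(cupy_cache_dir({"HOME": "/home/ian", "CUPY_CACHE_DIR": "/tmp/cupy"}))
```

If that second case matches the failing host, the missing `/var/lib/boinc-client` directory would produce the `FileNotFoundError` reported earlier in the thread.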
"This is a brand-new machine (less than 3 weeks old) - supplied with no OS installed, so I've installed Linux Mint and BOINC from cold - no history." I've got what may be the same problem with a fresh install of Linux Mint 22 (Ubuntu 24.04 Noble). Can't run BOINC on it following these instructions: https://isaac.ssl.berkeley.edu/linux_install.php?os_num=6&build=alpha Please post a link to your "parallel conversation at BOINC" or, better yet, the solution. TIA | |
ID: 61818 | Rating: 0 | rate: / Reply Quote | |
"This is a brand-new machine (less than 3 weeks old) - supplied with no OS installed, so I've installed Linux Mint and BOINC from cold - no history." Sounds like Richard's problem was related to the QChem tasks specifically, not BOINC as a whole. If you're having a problem running BOINC, you have a separate issue. | |
ID: 61819 | Rating: 0 | rate: / Reply Quote | |
"Sounds like Richard's problem was related to the QChem tasks specifically, not BOINC as a whole. If you're having a problem running BOINC, you have a separate issue." Yes, it was specific to Quantum Chemistry tasks. As soon as I saw the results of the first night's run (all very quick errors), I switched to ATMML and they ran flawlessly. Then I investigated the error message about the missing file and the location it was looking in - I knew that didn't match the installation I'd only just completed. So I devised the workaround I posted before, and it worked with no other changes. It's an easy fix, so I suggested BOINC cover it - but after discussion, it was decided to fix it at the project end. Unfortunately, the apps page still shows an installation date of 9 Jul 2024, so the problem probably still exists. | |
ID: 61820 | Rating: 0 | rate: / Reply Quote | |
"I've got what may be the same problem with a fresh install of Linux Mint 22 Ubuntu 24.04 Noble. Can't run BOINC on it following these instructions:" I've had a quick look at your host 624473 - that's the only Mint 22 I can see in the list. You only let two tasks run to the point where they reported a failure. Both were ACEMD 3 tasks. One was 'Particle coordinate is nan'; the other was 'Cannot use a restart file on a different device!'. Both are well-known processing errors here, and not related to the version of BOINC used. "Please post a link to your "parallel conversation at BOINC" or better yet the solution." It's at https://github.com/BOINC/boinc/discussions/5811 | |
ID: 61821 | Rating: 0 | rate: / Reply Quote | |
https://www.gpugrid.net/hosts_user.php?userid=563937 | |
ID: 61822 | Rating: 0 | rate: / Reply Quote | |