Message boards : Number crunching : Server is out of disk space

Author Message
Bedrich Hajek
Joined: 28 Mar 09
Posts: 486
Credit: 11,550,461,298
RAC: 4,768,395
Level
Trp
Message 61548 - Posted: 20 Jun 2024 | 21:56:05 UTC

Thu 20 Jun 2024 05:49:06 PM EDT | GPUGRID | [error] Error reported by file upload server: Server is out of disk space

mrchips
Joined: 9 May 21
Posts: 16
Credit: 1,412,539,259
RAC: 47,170
Level
Met
Message 61550 - Posted: 21 Jun 2024 | 0:23:57 UTC

still out of space
____________

Bedrich Hajek
Joined: 28 Mar 09
Posts: 486
Credit: 11,550,461,298
RAC: 4,768,395
Level
Trp
Message 61554 - Posted: 21 Jun 2024 | 22:41:23 UTC

It's full again.

Fri 21 Jun 2024 06:40:41 PM EDT | GPUGRID | [error] Error reported by file upload server: Server is out of disk space

pututu
Joined: 8 Oct 16
Posts: 26
Credit: 4,153,801,869
RAC: 14,670
Level
Arg
Message 61719 - Posted: 25 Aug 2024 | 17:00:25 UTC
Last modified: 25 Aug 2024 | 17:25:46 UTC

Currently, I'm seeing a disk full error message.

Edit1: seems to be clearing slowly now.

Edit2: uploading seems to be intermittent...

Bedrich Hajek
Joined: 28 Mar 09
Posts: 486
Credit: 11,550,461,298
RAC: 4,768,395
Level
Trp
Message 61720 - Posted: 25 Aug 2024 | 18:43:54 UTC

Confirmed. Server is out of disk space.

pututu
Joined: 8 Oct 16
Posts: 26
Credit: 4,153,801,869
RAC: 14,670
Level
Arg
Message 61722 - Posted: 25 Aug 2024 | 21:07:26 UTC
Last modified: 25 Aug 2024 | 21:09:12 UTC

It seems to be intermittent, meaning you need to retry the network transfers occasionally to upload the completed tasks. Best to do that with a script.

Fingers crossed that the server disk hasn't completely crashed yet as of this posting...
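
A retry script of the kind mentioned above could look roughly like the sketch below. It is only an illustration, not the script the poster actually used: it assumes `boinccmd` is on the PATH and relies on its `--network_available` command, which asks the client to retry deferred network communication. The function names, the attempt count, and the interval are made up for the example.

```python
import subprocess
import time
from typing import Callable, List, Optional, Sequence


def build_retry_command(host: Optional[str] = None,
                        passwd: Optional[str] = None) -> List[str]:
    """Build the boinccmd call that asks the client to retry deferred transfers."""
    cmd = ["boinccmd"]
    if host:
        cmd += ["--host", host]      # optional: talk to a remote client
    if passwd:
        cmd += ["--passwd", passwd]  # optional: GUI RPC password
    cmd.append("--network_available")
    return cmd


def run_boinccmd(cmd: Sequence[str]) -> int:
    """Run the command and return its exit code (assumes boinccmd is installed)."""
    return subprocess.run(list(cmd), check=False).returncode


def retry_uploads(run: Callable[[Sequence[str]], int],
                  attempts: int,
                  interval_s: float,
                  sleep: Callable[[float], None] = time.sleep) -> int:
    """Issue the retry command `attempts` times, pausing between calls.

    Returns how many calls exited with code 0. The runner and sleep function
    are injectable so the loop can be exercised without a BOINC client.
    """
    succeeded = 0
    for i in range(attempts):
        if run(build_retry_command()) == 0:
            succeeded += 1
        if i < attempts - 1:
            sleep(interval_s)
    return succeeded
```

For example, `retry_uploads(run_boinccmd, attempts=12, interval_s=300)` would nudge the client every five minutes for about an hour. Keeping the interval long avoids hammering an upload server that is already struggling.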

Freewill
Joined: 18 Mar 10
Posts: 21
Credit: 35,796,571,419
RAC: 56,463,185
Level
Trp
Message 61728 - Posted: 26 Aug 2024 | 4:55:44 UTC

Still getting this error as of 26 Aug almost 7:00 AM Madrid time.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1629
Credit: 9,672,847,368
RAC: 7,829,637
Level
Tyr
Message 61731 - Posted: 26 Aug 2024 | 8:00:37 UTC

26/08/2024 08:58:40 | GPUGRID | [error] Error reported by file upload server: Maintenance underway: file uploads are temporarily disabled.

Steve
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 21 Dec 23
Posts: 46
Credit: 0
RAC: 0
Level

Message 61734 - Posted: 26 Aug 2024 | 9:13:22 UTC - in response to Message 61731.

Thank you for reporting. It is now back and with more space.

Bedrich Hajek
Joined: 28 Mar 09
Posts: 486
Credit: 11,550,461,298
RAC: 4,768,395
Level
Trp
Message 61779 - Posted: 8 Sep 2024 | 21:07:19 UTC

Server is out of disk space again.


Sun 08 Sep 2024 05:05:46 PM EDT | GPUGRID | Started upload of p38_A31_A28_r0_4-QUICO_ATM_AF_04_Benchmark-16-20-RND3498_1_0
Sun 08 Sep 2024 05:05:47 PM EDT | GPUGRID | [error] Error reported by file upload server: Server is out of disk space
Sun 08 Sep 2024 05:05:47 PM EDT | GPUGRID | Temporarily failed upload of p38_A31_A28_r0_4-QUICO_ATM_AF_04_Benchmark-16-20-RND3498_1_0: transient upload error
Sun 08 Sep 2024 05:05:47 PM EDT | GPUGRID | Backing off 00:24:33 on upload of p38_A31_A28_r0_4-QUICO_ATM_AF_04_Benchmark-16-20-RND3498_1_0
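
Log lines like the ones above can be scanned automatically to see which uploads are stuck and for how long. The following is a minimal sketch that assumes the message wording shown in this thread ("Backing off HH:MM:SS on upload of NAME" and "Server is out of disk space"); the function names are hypothetical.

```python
import re
from datetime import timedelta
from typing import Optional, Tuple

# Matches the back-off message as it appears in the client log quoted above.
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")


def parse_backoff(line: str) -> Optional[Tuple[str, timedelta]]:
    """Return (file name, back-off duration) for a back-off line, else None."""
    match = BACKOFF_RE.search(line)
    if match is None:
        return None
    hours, minutes, seconds = (int(match.group(i)) for i in (1, 2, 3))
    return match.group(4), timedelta(hours=hours, minutes=minutes, seconds=seconds)


def is_disk_full_error(line: str) -> bool:
    """True if the line reports the upload server's disk-full error."""
    return "Server is out of disk space" in line
```

Feeding each line of the event log through these two functions gives a quick count of disk-full errors and the longest pending back-off.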



Stacie
Joined: 29 Mar 20
Posts: 22
Credit: 829,995,819
RAC: 813,869
Level
Glu
Message 61780 - Posted: 9 Sep 2024 | 0:27:54 UTC - in response to Message 61779.

Is this why finished workunits are failing to upload? They are starting to pile up in my queue.
____________

Stacie
Joined: 29 Mar 20
Posts: 22
Credit: 829,995,819
RAC: 813,869
Level
Glu
Message 61781 - Posted: 9 Sep 2024 | 1:51:46 UTC

...do we lose bonus turnaround credit because the upload server is locked up?
____________

Erich56
Joined: 1 Jan 15
Posts: 1146
Credit: 11,516,061,501
RAC: 19,931,062
Level
Trp
Message 61782 - Posted: 9 Sep 2024 | 2:31:47 UTC - in response to Message 61781.

...do we lose bonus turnaround credit because the upload server is locked up?

Yes, at least this has happened in the past. I doubt it has changed now.

Erich56
Joined: 1 Jan 15
Posts: 1146
Credit: 11,516,061,501
RAC: 19,931,062
Level
Trp
Message 61784 - Posted: 9 Sep 2024 | 7:50:29 UTC - in response to Message 61734.

On August 26 Steve wrote:

Thank you for reporting. It is now back and with more space.

Steve, now again no finished tasks can be uploaded since last night :-(

How come this happens so often? Is there no automated reporting that triggers early enough (say, once the disk is 80% full), or some other measure that prevents this problem from happening every few weeks? This shouldn't be that difficult to implement.
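
The kind of early warning suggested above can be sketched in a few lines. This is a generic illustration using Python's standard library, not anything the project is known to run; the 80% threshold comes from the post, while the function names and the path are assumptions.

```python
import shutil


def used_fraction(path: str) -> float:
    """Fraction of the filesystem containing `path` that is in use (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def needs_alert(fraction_used: float, threshold: float = 0.80) -> bool:
    """True once usage reaches the warning threshold (80% by default)."""
    return fraction_used >= threshold


def check_disk(path: str, threshold: float = 0.80) -> str:
    """Return a one-line status message suitable for a cron job or mail alert."""
    fraction = used_fraction(path)
    state = "WARNING" if needs_alert(fraction, threshold) else "ok"
    return f"{state}: {path} is {fraction:.0%} full"
```

Run periodically (for instance from cron) against the upload directory, `check_disk` would flag the problem while there is still room to act, rather than after uploads start failing.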

WPrion
Joined: 30 Apr 13
Posts: 99
Credit: 3,405,464,157
RAC: 5,326,390
Level
Arg
Message 61785 - Posted: 9 Sep 2024 | 10:40:25 UTC - in response to Message 61784.
Last modified: 9 Sep 2024 | 10:41:04 UTC


How come this happens so often? Is there no automated reporting that triggers early enough (say, once the disk is 80% full), or some other measure that prevents this problem from happening every few weeks? This shouldn't be that difficult to implement.


Kinda ironic. They are super anxious to get this project completed so they award 10X the points the tasks deserve, therefore attract an army of crunchers, but then don't keep their hardware up to speed to support the effort.

Steve Dodd
Joined: 26 Dec 08
Posts: 18
Credit: 4,302,371,348
RAC: 1,527,661
Level
Arg
Message 62014 - Posted: 13 Dec 2024 | 21:33:34 UTC
Last modified: 13 Dec 2024 | 21:34:11 UTC

Oopsie. Just got this message:

12/13/2024 1:29:04 PM | GPUGRID | [error] Error reported by file upload server: Server is out of disk space

Bedrich Hajek
Joined: 28 Mar 09
Posts: 486
Credit: 11,550,461,298
RAC: 4,768,395
Level
Trp
Message 62015 - Posted: 13 Dec 2024 | 23:41:52 UTC - in response to Message 62014.

I got the same message:

Fri 13 Dec 2024 06:38:59 PM EST | GPUGRID | [error] Error reported by file upload server: Server is out of disk space




Erich56
Joined: 1 Jan 15
Posts: 1146
Credit: 11,516,061,501
RAC: 19,931,062
Level
Trp
Message 62016 - Posted: 14 Dec 2024 | 5:38:04 UTC - in response to Message 62015.

Totally incomprehensible how this can happen so often.
Do they still have no precautions to prevent something like this? It should be easy enough.

dskagcommunity
Joined: 28 Apr 11
Posts: 460
Credit: 889,073,404
RAC: 1,100,130
Level
Glu
Message 62017 - Posted: 14 Dec 2024 | 11:50:37 UTC

After a long time I'm crunching a bit again, and no uploads have been possible since yesterday -.- Still the same problems as years ago O.o
____________
DSKAG Austria Research Team: http://www.research.dskag.at



TheFiend
Joined: 26 Aug 11
Posts: 100
Credit: 2,604,575,367
RAC: 1,062,119
Level
Phe
Message 62018 - Posted: 14 Dec 2024 | 14:19:38 UTC - in response to Message 62017.

After a long time I'm crunching a bit again, and no uploads have been possible since yesterday -.- Still the same problems as years ago O.o


I'm another one who has just restarted crunching GPUGRID after upgrading GPUs... It's not the only project that encounters this problem...

doug
Joined: 2 Dec 24
Posts: 1
Credit: 17,570,373
RAC: 240,155
Level
Pro
Message 62019 - Posted: 14 Dec 2024 | 14:41:18 UTC

Hi,

I'm also getting the out of disk space message:

12/14/2024 9:35:39 AM | GPUGRID | [error] Error reported by file upload server: Server is out of disk space

12/14/2024 9:35:39 AM | GPUGRID | Temporarily failed upload of

The times are EST.

Doug

dskagcommunity
Joined: 28 Apr 11
Posts: 460
Credit: 889,073,404
RAC: 1,100,130
Level
Glu
Message 62020 - Posted: 14 Dec 2024 | 16:38:38 UTC - in response to Message 62018.

On other projects this is very rare, I must say, and I've been crunching 24/7 since 2011 ^^ But I won't complain too much; I'm just happy this project has WUs again, even for part-time crunching PCs.
____________
DSKAG Austria Research Team: http://www.research.dskag.at



mikey
Joined: 2 Jan 09
Posts: 298
Credit: 6,785,855,288
RAC: 3,400,546
Level
Tyr
Message 62021 - Posted: 14 Dec 2024 | 16:47:48 UTC - in response to Message 62020.

On other projects this is very rare, I must say, and I've been crunching 24/7 since 2011 ^^ But I won't complain too much; I'm just happy this project has WUs again, even for part-time crunching PCs.


The problem is probably that the credits are sooo high and the tasks are sooo available that more and more people who want TONS of credits per day are coming here for them. It also comes down to the fact that, unlike 6 months ago, when several more projects were around to spread the load, GPUGRID is now one of the few GPU projects around and so, like the rest, is being overwhelmed. I would guess that in time they will figure out how to move things through the system more quickly and get bigger drives too, and things will be a lot better.

Greg _BE
Joined: 30 Jun 14
Posts: 137
Credit: 122,677,395
RAC: 21,115
Level
Cys
Message 62023 - Posted: 14 Dec 2024 | 18:28:53 UTC

Probably won't see anything done about it (disk space) until Monday if we are lucky.

KeithBriggs
Joined: 29 Aug 24
Posts: 37
Credit: 1,645,415,047
RAC: 15,776,688
Level
His
Message 62027 - Posted: 14 Dec 2024 | 20:32:18 UTC - in response to Message 62023.

Probably won't see anything done about it (disk space) until Monday if we are lucky.


+1

I started getting errors while computing (EWC) as soon as I started running a script to get them uploaded. There was already a lot of EWC which might contribute to the server overload in the first place. It seems odd that a script would be able to impact a task during what I call the initial cpu-only stage.

KeithBriggs
Joined: 29 Aug 24
Posts: 37
Credit: 1,645,415,047
RAC: 15,776,688
Level
His
Message 62028 - Posted: 14 Dec 2024 | 20:51:44 UTC - in response to Message 62027.

Richard Haselgrove identified the issue on the other thread: Corrupt compression likely due to server space but it's also a compression issue.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1629
Credit: 9,672,847,368
RAC: 7,829,637
Level
Tyr
Message 62029 - Posted: 14 Dec 2024 | 21:08:35 UTC

Some files have started to trickle up, but others are reaching 100% and then sticking again. But it's a sign they may be trying to move some of the results to alternative storage. It'll take time, though.

Robertobit
Joined: 29 Mar 20
Posts: 3
Credit: 695,184,248
RAC: 390,818
Level
Lys
Message 62030 - Posted: 14 Dec 2024 | 22:00:33 UTC

Hi, I have the same issue: [error] Error reported by file upload server: Server is out of disk space

Robertobit
Joined: 29 Mar 20
Posts: 3
Credit: 695,184,248
RAC: 390,818
Level
Lys
Message 62031 - Posted: 14 Dec 2024 | 22:01:20 UTC - in response to Message 61548.

Me too: [error] Error reported by file upload server: Server is out of disk space

Khali
Joined: 13 Jan 14
Posts: 9
Credit: 388,632,885
RAC: 486,788
Level
Asp
Message 62032 - Posted: 14 Dec 2024 | 22:13:37 UTC

Just adding in my two cents here. I have files stuck in project back off as well.

Funny how this always seems to happen on a weekend and nothing can be done until Monday. Oh well, at least it wasn't Christmas week, when everyone is away for the holidays for up to a week.

Erich56
Joined: 1 Jan 15
Posts: 1146
Credit: 11,516,061,501
RAC: 19,931,062
Level
Trp
Message 62034 - Posted: 15 Dec 2024 | 7:40:07 UTC - in response to Message 62032.

Funny how this always seems to happen on a weekend and nothing can be done until Monday.

I wonder, anyway, how interested they really are in keeping their own project working properly, given that evidently everyone leaves on Friday afternoon and no one keeps an eye on what's happening until the following week :-(

Greg _BE
Joined: 30 Jun 14
Posts: 137
Credit: 122,677,395
RAC: 21,115
Level
Cys
Message 62036 - Posted: 15 Dec 2024 | 11:58:23 UTC - in response to Message 62034.
Last modified: 15 Dec 2024 | 12:05:59 UTC

Funny how this always seems to happen on a weekend and nothing can be done until Monday.

I wonder, anyway, how interested they really are in keeping their own project working properly, given that evidently everyone leaves on Friday afternoon and no one keeps an eye on what's happening until the following week :-(


EU project (Barcelona). M-F only. Not paid for weekends. They are a donation-based program with a bit of government funding, so there is no money for weekend work. So if something goes SOL after hours, it waits until the next working day. Actually, I don't think any BOINC project has 24/7 monitoring. Rosetta did in the early days, but none of the projects I am on now do.

Erich56
Joined: 1 Jan 15
Posts: 1146
Credit: 11,516,061,501
RAC: 19,931,062
Level
Trp
Message 62037 - Posted: 15 Dec 2024 | 12:00:44 UTC - in response to Message 62036.

Funny how this always seems to happen on a weekend and nothing can be done until Monday.

I wonder, anyway, how interested they really are in keeping their own project working properly, given that evidently everyone leaves on Friday afternoon and no one keeps an eye on what's happening until the following week :-(


EU project (Barcelona). M-F only. Not paid for weekends. And if they are a donation-based project (from the looks of their one page), then there is no money for weekend work. So if something goes SOL after hours, it waits until the next working day.

:-( :-( :-(

zioriga
Joined: 30 Oct 08
Posts: 47
Credit: 567,016,028
RAC: 92,344
Level
Lys
Message 62038 - Posted: 15 Dec 2024 | 12:41:51 UTC

And less credit is awarded because of this delay.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1629
Credit: 9,672,847,368
RAC: 7,829,637
Level
Tyr
Message 62045 - Posted: 16 Dec 2024 | 10:11:52 UTC

Phew. All mine have uploaded, reported, and had replacement ATMML tasks issued.

roundup
Joined: 11 May 10
Posts: 65
Credit: 10,330,278,875
RAC: 4,674,736
Level
Trp
Message 62070 - Posted: 23 Dec 2024 | 6:41:50 UTC

And again:


Mon 23 Dec 2024 07:26:13 CET | GPUGRID | [error] Error reported by file upload server: Server is out of disk space
Mon 23 Dec 2024 07:26:13 CET | GPUGRID | [error] Error reported by file upload server: can't open file /home/ps3grid/projects/PS3GRID/upload/1e4/1i5xA00_300_3-ANTONIOM_MDCATH300r1si-13-50-RND8245_0_10: No space left on device


Vince
Joined: 27 Aug 22
Posts: 1
Credit: 667,195,495
RAC: 453,029
Level
Lys
Message 62071 - Posted: 23 Dec 2024 | 9:23:33 UTC

Server space is full; no uploads are possible.
A totally unacceptable situation.
This is happening far too often and will have me looking at alternative
projects like moowrap and/or primegrid for my dedicated GPU work.

roundup
Joined: 11 May 10
Posts: 65
Credit: 10,330,278,875
RAC: 4,674,736
Level
Trp
Message 62072 - Posted: 23 Dec 2024 | 12:14:55 UTC - in response to Message 62071.


A totally unacceptable situation.
This is happening far too often and will have me looking at alternative
projects like moowrap and/or primegrid for my dedicated GPU work.


The focus is on the science behind our projects, not on a regular supply of WUs. Since 2023, the supply has been much more regular than in many previous years. That has really improved.
Also, most of us understand that most European research institutes do not have unlimited resources to buy or scale hardware.
Basically, it's always a good idea to have a backup project with resource share = 0 that steps in whenever there's no work for the main project.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1629
Credit: 9,672,847,368
RAC: 7,829,637
Level
Tyr
Message 62073 - Posted: 23 Dec 2024 | 13:32:40 UTC - in response to Message 62072.

Yes, but ...

The science the project is interested in is contained in the result files we return to the project: until they have been accepted, no (observable) science has happened. A regular supply of tasks implies, inevitably, a regular flow of attempted uploads: the project staff should be able to monitor the inflow, and if necessary regulate the work flow to what is manageable and affordable.

Having said that, two machines have uploaded completed ATMML tasks this lunchtime, and downloaded replacements without bothering my backup project.

Freewill
Joined: 18 Mar 10
Posts: 21
Credit: 35,796,571,419
RAC: 56,463,185
Level
Trp
Message 62074 - Posted: 23 Dec 2024 | 15:54:43 UTC - in response to Message 62073.

On the project's Discord server, one of the admins (giadefa) has posted that "by the end of January we will move to a new server with more disk space." So, that change should address this issue. For now, they seem to be watching, as the immediate issue has been resolved. I have been able to upload my completed tasks recently.

Michael H.W. Weber
Joined: 9 Feb 16
Posts: 73
Credit: 656,229,684
RAC: 130,815
Level
Lys
Message 62075 - Posted: 24 Dec 2024 | 14:48:39 UTC

...the usual problem not solved in years.

Michael.
____________
President of Rechenkraft.net - Germany's first and largest distributed computing organization.
