
17-Dec Missing Images Update

Posted: Mon Dec 17, 2007 5:33 pm
by slug
Update 18-Dec-2007 01:50 UTC
Finally. After what seemed like an eternity, the filesystem repair process completed successfully. It was tempting to take a chance and try to put things back online immediately, but fortunately the engineers at NetApp insisted that we carefully check and repair everything first to avoid data loss.
Everything should be back online and normal with two kinds of exceptions.

1. Anyone who was uploading when the problem occurred on Saturday afternoon might have some broken images. If you could see your photos after uploading, then they should be fine. If they seemed immediately broken, then I can't fix them since they were never even written to the disk here. (If there are any instances of this problem, I think there will be very few.)
2. Anyone who used the regenerate thumbnail feature during the last day will have no thumbnails. I should have just disabled the tool during that time. The effect is that you won't have thumbnails right now and you'll see the original where you would expect a thumbnail. I'm running a script now to go through and find all these cases and fix them.

If tomorrow you're still seeing any broken images or gigantic thumbnails that the automated process I'm running now should have fixed, send me an email at slug@pbase.com and I'll take care of them.

I have new equipment here whose configuration I'm finishing up. These machines will allow us to immediately serve out all the photos in the event that the main storage needs repair like this again. Some images during the last two days were in fact coming from these machines, but I didn't have them completely ready yet. Soon.

Really sorry for all the trouble. I know it's painful to have visitors to your site find broken images.

-Chuck Neel
slug@pbase.com


==========
Original post from earlier before the fix.
==========

Here's an update on the progress of fixing the problem of broken images.
I've been working non-stop with the engineers at NetApp to resolve this problem.
Until now, I haven't had a good idea of how long until resolution since the NetApp guys haven't been able to predict it due to the large size of the system. Now I'm hoping the process will be complete within 8 hours, but I can't be sure since I haven't had to go through this before.

On Saturday, we lost 3 disks simultaneously in our main storage system, which runs on NetApp hardware. This caused an 8-terabyte volume to have some inconsistencies which have to be analyzed and repaired before we can put the volume back online. Fortunately, the cause of the problem is something NetApp understands and they've provided updated firmware for the disk shelves to correct the bug responsible.
Right now we're just waiting for the filesystem analysis program to complete. At that time, we'll be able to bring the volume back online and all of your photos will display properly.

While this has been a painful process, and I apologize for the disruption, the NetApp system is designed to recover from such failures. Also, their support team, while not cheap, is amazingly responsive and doesn't hesitate to send parts immediately, or spend hours of time on the phone walking us through all the steps to recover.

Fortunately, with the exception of a couple hours on Saturday, new uploads, direct linking, and the majority of the site have been working as usual.

I wish the recovery process could have gone faster, but after a problem with the filesystem, it's important to analyze it carefully so we can be sure everything is healthy.

Chuck Neel
slug@pbase.com

Posted: Mon Dec 17, 2007 5:37 pm
by niekirk
Thanks Slug - it's really useful to have some idea of how long it might take before it's fixed.

Much appreciated.

Posted: Mon Dec 17, 2007 5:39 pm
by christopher_schlaf
Thanks for the update

Posted: Mon Dec 17, 2007 5:42 pm
by flemmingbo
Thanks very much Slug for the update and the work - all the hard work is very much appreciated!

best regards,

Flemming

Posted: Mon Dec 17, 2007 5:45 pm
by kstuebin
Thanks for keeping us informed. It really is appreciated and I know you've been doing all you can. It just goes to show that "if it can go wrong, it will." :)

Posted: Mon Dec 17, 2007 5:50 pm
by sofo
Much appreciated Slug!

Thanks!

Posted: Mon Dec 17, 2007 5:58 pm
by sparkmeister
Thanks for the update! I can only try to imagine what a nightmare 8 terabytes of data could be!

Posted: Mon Dec 17, 2007 6:01 pm
by alex28
Thanks Slug,

I know that these things can just happen. The biggest companies on earth have computer problems now and then.
Pbase has been a very reliable company in past years and I am sure it will be in the coming years.
There are much worse problems on this planet, don't forget that.

Keep up the good work, you and your team.

Alex from Holland

Posted: Mon Dec 17, 2007 6:03 pm
by decyb
Thanks for this update.

I hope NetApp can provide you a full fix for this kind of problem and not just a workaround.

Good luck fixing this problem, and many thanks for the time you're spending to solve it.

Posted: Mon Dec 17, 2007 6:07 pm
by milv
Thanks for info!
Best!
Milan

Re: 17-Dec Missing Images Update

Posted: Mon Dec 17, 2007 6:10 pm
by jchriste
slug wrote: I wish the recovery process could have gone faster, but after a problem with the filesystem, it's important to analyze it carefully so we can be sure everything is healthy.


Thanks for your update... I was just about to get impatient... :)

And yes, please check things before putting the volumes online again... I have worked with storage previously and know that problems will get worse if you don't make completely sure that you understand what caused them...

Thanks for your effort
KR Jørgen

Posted: Mon Dec 17, 2007 6:13 pm
by jdf
Hello, as a customer, I am happy to hear from you and glad that everything will work out.

Regards,

Diego

Posted: Mon Dec 17, 2007 6:18 pm
by 455rocket
Thanks for the update, it's very much appreciated.
Phil.

Posted: Mon Dec 17, 2007 6:22 pm
by kims
Thanks for the update.

regards Kim

Posted: Mon Dec 17, 2007 6:23 pm
by gabrield
Pbase is still the best photo gallery site on the net and the best value for your money too! I am confident we are in good hands.