24-Sep Downtime

Posted: Sun Oct 04, 2009 6:38 am
by emily
You are probably aware of the recent downtime PBase experienced. After a loss of power on the morning of 24-Sep-2009, the main database server did not come back online properly. We worked nonstop with Oracle specialists to get a new, faster database up and running as quickly as possible. Almost everything is back up now, but we are still completing the recovery.

In consideration of the downtime, we are not charging anyone for the month of September.

Our developers have been working hard over the past year on some exciting software updates, which we hope to put online soon.

Thank you for your patience,
The PBase Team


For those of you who want more details or still have questions, keep reading ...

There was a power failure at our datacenter, SunGard Availability, on the morning of 24-Sep-2009. SunGard is fully equipped with backup power sources, such as generators and battery arrays. An electrical fault occurred, which triggered the battery backup. Unfortunately, the same fault prevented the generators from supplying power, and the electrical problem could not be fixed before the batteries ran down.

When power came back, the main database machine did not come back online properly. We immediately started working with Oracle specialists and SunGard support to get things back up and running. We were able to get the database back up, but it was not stable enough to run PBase from. Since our top priority was to preserve data integrity, we decided (along with the Oracle specialists) to set up a brand new database server rather than trying to make adjustments to the old one. We got the new server ready and worked constantly over the next several days to load it with data imported from the old server. We do keep regular backups of the database, but in the interest of time we used this method of data recovery instead of copying and restoring from the backups.

The new database server is faster than the old one, and we are already seeing an improvement in the time it takes to load PBase pages. Additional hardware is on order to make it even better. We are also working on setting up an identical server in a mirrored pair. This means that if anything happens to the main database, the standby database can take over right away.
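
For readers curious what "the standby database could take over" means in practice, here is a minimal sketch of the selection logic behind a mirrored pair. It is only an illustration: the `DatabaseServer` class, the `healthy` flag, and `pick_active` are hypothetical names, and nothing here reflects how PBase or Oracle actually implements failover.

```python
from dataclasses import dataclass


@dataclass
class DatabaseServer:
    """Hypothetical stand-in for one server in the mirrored pair."""
    name: str
    healthy: bool = True


def pick_active(primary: DatabaseServer, standby: DatabaseServer) -> DatabaseServer:
    """Return the server that should take traffic right now."""
    if primary.healthy:
        return primary
    if standby.healthy:
        return standby
    raise RuntimeError("no healthy database server available")


primary = DatabaseServer("main")
standby = DatabaseServer("standby")
print(pick_active(primary, standby).name)  # main

primary.healthy = False  # simulate the main database failing
print(pick_active(primary, standby).name)  # standby
```

A real setup would also need to keep the standby's data in sync with the main database, which this sketch leaves out.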

We know that some images are not displaying at this time. Do not worry. The images themselves were not affected, because they are not stored in the database. We are still working to recover the image information needed to display these images on PBase.

The current stats collection method is limited, and it is currently disabled for site performance reasons. We are working on an improved method, which will be better, faster, and retroactive. Once completed, statistics for pageviews since the power failure will be available to you.

If you still have any questions after reading this page, please email us at support@pbase.com

Update : 08-Oct-2009 16:27 UTC / 12:27 EDT

Some of you have noticed that the amount of disk space your account or individual images are using is incorrect, showing higher usage than expected. These will all be fixed once the import of the remaining images is complete. Do not worry. We will not let these days of inaccurate numbers affect your average usage for the month of October.
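
One way to keep inaccurate days out of a monthly average is simply to skip them when averaging. The sketch below assumes usage is sampled once per day; the function name `average_usage`, the day numbers, and the figures are all made up for illustration and are not PBase's actual accounting.

```python
def average_usage(daily_mb, inaccurate_days):
    """Average daily usage in MB, ignoring days flagged as inaccurate."""
    valid = [mb for day, mb in daily_mb.items() if day not in inaccurate_days]
    if not valid:
        raise ValueError("no accurate days to average")
    return sum(valid) / len(valid)


# Made-up figures: days 3 and 4 show inflated usage during the import.
daily_mb = {1: 480.0, 2: 480.0, 3: 520.0, 4: 520.0, 5: 480.0}
print(average_usage(daily_mb, inaccurate_days={3, 4}))  # 480.0
```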

If you have images that do not display or have missing sizes, we have scripts running to detect this and automatically fix it. Once you view such a page, that image should be fixed in roughly 2 minutes.
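
The "view the page, then it gets fixed" behaviour described above can be sketched as a repair queue: a page view flags an image whose size information is missing, and a background pass fills it in. Everything in this sketch (the `images` records, `stored_files`, `on_page_view`, `run_repairs`) is a hypothetical illustration, not PBase's actual scripts.

```python
from collections import deque

# Hypothetical image records; None marks size information lost in the recovery.
images = {
    "img_001": {"width": 800, "height": 600},
    "img_002": {"width": None, "height": None},
}

# Stand-in for reading dimensions from the image file itself, which the
# announcement says is stored outside the database.
stored_files = {"img_001": (800, 600), "img_002": (1024, 768)}

repair_queue = deque()


def on_page_view(image_id):
    """Queue an image for repair if its size information is missing."""
    record = images[image_id]
    if record["width"] is None or record["height"] is None:
        if image_id not in repair_queue:
            repair_queue.append(image_id)


def run_repairs():
    """Fill in missing sizes for every queued image; return the repaired ids."""
    repaired = []
    while repair_queue:
        image_id = repair_queue.popleft()
        width, height = stored_files[image_id]
        images[image_id].update(width=width, height=height)
        repaired.append(image_id)
    return repaired


on_page_view("img_001")  # intact, nothing queued
on_page_view("img_002")  # missing sizes, queued
print(run_repairs())     # ['img_002']
```

Running the repair pass on a schedule, rather than during the page view itself, would explain a short delay like the "roughly 2 minutes" mentioned above, though that is only a guess.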

We know that some of you have some missing gallery structure. This is more difficult for us to detect, so please send an email to support@pbase.com if you encounter missing galleries / gallery structure. Once we are aware of a problem with your galleries, we will be able to take care of it right away.

The new hitcounts information is not ready yet. We are still working on it. When it is completed, we will let you know.

Update : 22-Oct-2009

nkirby wrote: The hitcounts code is still being worked on. We are implementing a new method that will allow more information to be available to you. This is our highest priority right now and we should see the first update soon.

Please be aware that no information is currently being lost. When the hitcounts come back online, the data from the past few weeks will be reconstructed and available to view. Thanks again for your patience.

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 8:43 am
by pablof
Thanks for the update. I understand your problems and appreciate the work involved in getting things back up. What I (and others) missed in the past week was information on the status and progress of the recovery. I think in such cases updated information for subscribers is very important.
All the best
PabloF

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 11:51 am
by bracciodiferro
Hi Emily!
You write: "We are also working on setting up an identical server in a mirrored pair."
But please use a different backup power source for it, not the same one: that is SAFEST!
Best,
Paolo

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 12:00 pm
by fishit
Emily, thanks for the explanation. It seems like the perfect storm, and PBase was caught in it. I do not see what else you all could have done.
The new server rebuild does seem to run faster, so there is some consolation.
The choice to rebuild was the best route even though it took longer. Some will *iss and moan about the downtime, so be it.
This is the 10th anniversary of PBase, and I still have not found a nicer place to host my photography. Happy 10th!!

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 1:11 pm
by lynnh
I believe you have had daily updates on the progress and been much more forthcoming early in process. Do appreciate the FREE month.

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 1:13 pm
by don_empey_photography
Yes, it seems that PBase is back up again, but the frustration we have gone through over the past 2 weeks is inexcusable. No communication from the PBase team on progress during this whole exercise is a poor way to run your business. With PBase's history of unreliable service, many of us have been forced to look into alternatives, and there are some really good viable ones out there that make PBase pale by comparison. I have been a PBase member for years now, and although it has served me well for the most part, there is just a lot of mistrust in the systems now.

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 1:14 pm
by amoxtli
Thank you for this update and for the detailed explanation. Communication goes a long way. Continued messages on the software updates you mentioned would be much appreciated.
Best Wishes,
Walter

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 1:14 pm
by lynnh
I meant to say you should have had daily updates for us.

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 1:25 pm
by bracciodiferro
When do you expect PBase to be operating at a full 100% again?
Can you tell me how long we should wait?
Best,
Paolo

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 1:56 pm
by cassiar
We rarely hear from the admins in this forum, so it's good to see that there is still some life behind the scenes :mrgreen:
The only thing your paying members need more is good and reliable communication. For example through the NEWS section of this forum, which is hardly used.
All things can go wrong, but the lack of communication is the worst. Hopefully we can look forward with some optimism for the future. Keep up the good work and thank you for this news !

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 2:07 pm
by ngruev
Simply put, this should not have happened! All the hardware and software to prevent the downtime has been available on the market for many years. I am trying to find an excuse for you, but I can't. This is a major goof-up. I hope you have learned your lesson and will not allow this to happen again. Ever!
The one month free service is not a favor. It is a must.

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 2:37 pm
by trinko
For a web based business PBase could stand to learn a bit about communications. Frankly I think every PBase member should have been emailed the first message in this thread. Free month is reasonable. Further, it sounds like the failure was in the design of the server provider. There's no way that the same failure that took out the main power should prohibit the use of the generators. That sort of single point of failure is intolerable in an assured service design. But you can't really expect PBase, or other users of the same provider, to check out the electrical details of their server provider. Hopefully PBase will learn from this and have better communications in the future if another problem occurs. Also I hope that PBase either gets a new hosting service or verifies that the problems with the backup power have been fixed.

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 3:01 pm
by photographieur
Thank you for this information, Emily... We missed PBase a lot! But I prefer to know that you spent your time trying to fix the breakdown rather than communicating with us. I hope it never happens again!

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 3:28 pm
by carrhighlander
trinko wrote: For a web based business PBase could stand to learn a bit about communications. Frankly I think every PBase member should have been emailed the first message in this thread. Free month is reasonable. Hopefully PBase will learn from this and have better communications in the future if another problem occurs. Also I hope that PBase either gets a new hosting service or verifies that the problems with the backup power have been fixed.


I hope for PBase's sake that it's not too late. I already know that many long-term users of PBase have been looking elsewhere: Smugmug, Zenfolio, Flickr and others. I've now got a trial account on Smugmug, and it looks good.
As Trinko mentioned above, an email to all current paid-up members would have been well received by all. COMMUNICATION has always been very poor on PBase. The forums are full of complaints about it, along with the speed issues that we have all experienced in the past. It's a big pity that it took a major failure like last week's to address these.
Please don't forget we are the paying customer. Keep us informed.

slug wrote: We're very sorry about the recent performance problems. Things are running smoothly now and we can get back to improving the software.
There was a combination of sporadically failing hardware that caused difficulty isolating the problem.
In the process, we've replaced/upgraded network switches and load balancers.
Also, we've made a lot of improvements to our system monitoring software which helps us detect most problems before they happen.
I apologize for my lack of progress reporting which unfortunately spawns many wild theories.

Re: 24-Sep Downtime

Posted: Sun Oct 04, 2009 4:33 pm
by mbaumser
I'm still trying to get an answer as to why my usage was about 487 MB before the crash and 500 MB after PBase came back up.

As others have indicated, communication and customer service from PBASE are horrible.

Marc