Posts Tagged ‘vertical scaling’

Our server was heavily overloaded

Saturday, September 12th, 2009

We are investigating this issue: too many requests are being made per second.

Meanwhile, we have temporarily allocated more memory to the server.

Update 09:43 pm: The server has been stabilized. We are still finding out what happened.
Update 12:12 am Sunday: It began when one of our servers ended up with an overloaded queue because of a bug introduced in a recent update. That server tried to resend messages to another server faster than the second server could keep up. This chain reaction caused our 2-hour service disruption. We are optimizing our load handling right now.
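
For the curious, a common way to stop one server from flooding another with resends is to wait a little longer after each failed attempt. The snippet below is only a minimal sketch of that idea (exponential backoff with a cap), not our actual fix; the `send` callable, the function names, and all the numbers are made up for illustration.

```python
import time


def backoff_delays(base=0.5, factor=2.0, cap=30.0, attempts=6):
    """Yield the wait (in seconds) before each retry, growing up to a cap."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def resend_with_backoff(send, message, sleep=time.sleep, **kwargs):
    """Try send(message); after each failure, wait longer before retrying.

    `send` is a hypothetical callable that returns True when the
    downstream server accepts the message.
    """
    if send(message):
        return True
    for delay in backoff_delays(**kwargs):
        sleep(delay)  # give the slower server time to catch up
        if send(message):
            return True
    return False  # give up after the configured number of retries
```

With `base=1, factor=2, cap=5, attempts=4`, the waits between retries would be 1, 2, 4 and 5 seconds, so a struggling server gets progressively more breathing room instead of a burst of immediate resends.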

gladlyCast server is back up

Tuesday, August 11th, 2009

We are pleased to announce that we have removed the major bottleneck that slowed down gladlyCast’s Twitter status update delivery. The system is now well again, but not without feeding it some vegetables:

Surely you know this is a vegetable

Did we miss any of your SMSes during the upgrade?
Do tweet us about it. By informing us of these faults, you help us become more reliable for you. :)

Scheduled downtime: Major server upgrade at 3 am

Sunday, August 9th, 2009

When you’re sound asleep, we’ll be performing a major upgrade to make our purple gladlyCast server harder, better, faster, stronger:

Scaling vertically at 3 am

We are performing the upgrade on Aug 11 (Tuesday) at 3 am, as that’s the time when the fewest people use our services. The upgrade will take at most 1 hour. If you send an SMS during our downtime, it will be queued safely on another server and delivered to Twitter once our upgrade is completed.

SMSes will be queued at the gateway and redelivered to the gladlyCast server as soon as possible.

No messages are expected to be lost during this transition. We treat your SMSes with care and provide refreshments to make them happy. We will update our gladlyCast Twitter account regularly during that time.
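
If you are wondering how queue-and-redeliver works in principle, here is a rough sketch. It is not our gateway's real code; the `GatewayQueue` class and its method names are hypothetical, and a real gateway would keep the queue on disk and not in memory.

```python
from collections import deque


class GatewayQueue:
    """Hypothetical gateway that holds SMSes while the main server is down."""

    def __init__(self, deliver):
        self.deliver = deliver   # callable that forwards one message
        self.pending = deque()   # messages awaiting redelivery, oldest first
        self.server_up = True

    def receive(self, sms):
        """Forward immediately when the server is up, otherwise queue."""
        if self.server_up:
            self.deliver(sms)
        else:
            self.pending.append(sms)

    def server_restored(self):
        """Mark the server as up and flush queued messages in arrival order."""
        self.server_up = True
        while self.pending:
            self.deliver(self.pending.popleft())
```

Setting `server_up = False` at the start of the downtime and calling `server_restored()` afterwards would deliver everything received in between, in the order it arrived.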

The August 11 plan for the upgrade is to:

  1. 3:00 am — Pull the plug and upgrade.
  2. 3:20 am — Move everything over and wake the server.
  3. 3:25 am — Wake our Twitter service and redeliver all queued messages sent between 3:00 am and 3:25 am.
  4. 3:30 am — Wake our web server with sign ups re-enabled.
  5. Spend the next 30 minutes ensuring all components are functioning.

(During this transition, this blog will go down too. This is something we hadn’t thought about, and I’m feeling rather dumb right now.)

Update 3:30 am: Our web server is up. (:
Update 3:33 am: Sign ups are not re-enabled at the moment.
Update 3:43 am: Components are functional except for phone association. Still working on that one.
Update 4:10 am: Sign up is re-enabled. Upgrade completed.

 

gladlyCode