Our primary production environment is currently hosted on Amazon Web Services (AWS). As we’ve taken on more system load, we have had to scale up these services. On August 16th and 17th, we started hitting Amazon’s IOPS burst quota, which in turn caused our core matching engine to pause. When our matching engine paused, we were forced to bring down the rest of our system to diagnose the problem. These disruptions lasted from approximately 12:18 PM EDT to 9:30 PM EDT on August 16th and from approximately 12:40 PM EDT to 3:30 PM EDT on August 17th.
The matching engine component of the Gemini platform was the only component affected by this infrastructure failure. When we brought down the entire system for maintenance, we temporarily suspended deposit and withdrawal processing. At no time were customers’ funds or accounts at risk, and there was no security impact on our online and offline digital asset storage systems. To remedy this issue, we’ve increased our Amazon IOPS burst quota by over 100x and scaled up the machines that we have running the core matching engine.
This is not the first scaling challenge we’ve encountered, and it won’t be the last, but usually we’re able to resolve these sorts of issues behind the scenes. We already have a system in place to monitor for degraded exchange performance and alert our Site Reliability team so they can remediate the problem before it affects our customers. We’ve now upgraded to a larger and different instance type that supports a higher IOPS burst quota and allows us to monitor that quota directly. And we’re continuing to improve our performance and infrastructure monitoring so we can anticipate potential problems more quickly in the future.
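For readers curious about what monitoring a burst quota can look like, here is a minimal sketch in Python. It is not our production tooling. It assumes the quota behaves like a simple credit bucket that refills at a baseline rate and drains whenever demand exceeds that rate. The names `BurstBucket`, `seconds_until_empty`, and `should_alert`, and every number used with them, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BurstBucket:
    """Generic credit-bucket model of a burstable IOPS quota.

    Assumption: all fields and figures are illustrative, not real AWS limits.
    """
    capacity: float       # maximum number of I/O credits the bucket can hold
    baseline_iops: float  # credits refilled per second
    balance: float        # credits currently available

    def step(self, demand_iops: float, seconds: float = 1.0) -> float:
        """Advance the model by `seconds` and return the IOPS actually served."""
        # Credits usable in this interval: current balance plus the refill.
        available = self.balance + self.baseline_iops * seconds
        served = min(demand_iops * seconds, available)
        # Unused credits are kept, but never beyond the bucket's capacity.
        self.balance = min(self.capacity, available - served)
        return served / seconds

    def percent_remaining(self) -> float:
        return 100.0 * self.balance / self.capacity


def seconds_until_empty(bucket: BurstBucket, demand_iops: float) -> Optional[float]:
    """Time until the bucket drains at constant demand, or None if it never does."""
    drain_rate = demand_iops - bucket.baseline_iops
    if drain_rate <= 0:
        return None
    return bucket.balance / drain_rate


def should_alert(bucket: BurstBucket, threshold_percent: float = 20.0) -> bool:
    """Hypothetical alert rule: fire when the remaining balance is below the threshold."""
    return bucket.percent_remaining() < threshold_percent
```

In this model, once the balance reaches zero the served rate falls back to the baseline no matter how high demand is. That is the kind of sudden slowdown that can stall a latency-sensitive component. Alerting on the remaining balance, rather than on the slowdown itself, leaves time to react before the bucket is empty.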
We realize that our communication during the system outage was not consistent with the quality experience you have come to expect from Gemini, and we will be improving our communication plan going forward, taking into account the feedback we’ve received.
Some customer trading activity during this period was affected, and we will be reaching out to those customers in the coming days with more detailed information. If you have any questions about this outage, please don’t hesitate to contact Gemini customer support with specific details.
Onward and upward!