Tokyo Commodity Exchange Inc. (TOCOM or “the Exchange”) reported that its new trading system, launched on May 7, 2009, experienced a technical problem from around 10:30 on Tuesday, May 12. TOCOM expresses its deepest apologies for the inconvenience the problem may have caused to market participants and other parties concerned.
The following describes when and how the problem occurred, the results of the Exchange’s investigation, and the measures TOCOM will take to prevent further problems.
1. Chronological report on the technical problem
- Shortly after 10:30, the Exchange started to receive notifications from some Members and IT vendors that their systems were experiencing unstable connectivity to the Exchange's system. The Exchange confirmed at that point that its system was operating normally.
- Thereafter, the Exchange received notifications from more than 10 market participants that they could not receive market data or place orders. So that trading can continue smoothly even in the event of a system failure, TOCOM has set up its network communication equipment (i.e., the lines and routers used to connect to the trading system) to be fully redundant and duplicated. Even though the Exchange requested Members to connect to the Exchange's trading system through the backup network (a simplified illustration of such a primary/backup failover appears after this chronology), they were unable to establish connectivity.
- At 11:30, based on its contingency plan, the Exchange decided to suspend the trading session, since a substantial number of Members were unable to place orders or execute other trading actions through their systems, and because the TOCOM trading system itself was possibly experiencing a failure. At 11:35 the Exchange suspended the session for all products.
- Having confirmed that both of the duplicated routers at TOCOM's data center were operating at almost full load capacity (over 99%), the Exchange focused its investigation on identifying the primary cause of the problem among: a) the lines; b) the routers; or c) excessive packet transmissions from Members. The Exchange confirmed with the carriers that the problem was not caused by the communication lines. Based on an analysis of the other possible causes, the Exchange rebooted both routers at the data center by 13:40. As a result, the Exchange confirmed stable connectivity for all Members for about 20 minutes and a normal load level for the routers at the data center. At the same time, the Exchange confirmed that there was no excessive packet transmission from Members.
- At 14:15, the Exchange decided to resume the session in consideration of the above findings. The Exchange started to accept orders at 14:30, resumed the session at 15:00 and closed the day session at 15:30.
Thereafter the settlement price for the day session was determined normally, and the night session started at 17:00 as usual.
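For illustration only, the following minimal sketch (in Python) shows how a Member-side system might fail over from the primary line to the backup network, as referenced in the chronology above. The host names, port, and timeout are hypothetical placeholders; this announcement does not describe TOCOM's actual connection interface.

import socket

# Hypothetical endpoints; TOCOM's actual connection details are not described in this announcement.
PRIMARY = ("primary.line.example", 9000)
BACKUP = ("backup.line.example", 9000)
CONNECT_TIMEOUT = 5.0  # seconds; illustrative value only


def connect_with_failover():
    """Try the primary line first, then fall back to the backup network.

    A simplified picture of the duplicated network paths between a
    Member system and the Exchange's trading system.
    """
    for name, endpoint in (("primary", PRIMARY), ("backup", BACKUP)):
        try:
            sock = socket.create_connection(endpoint, timeout=CONNECT_TIMEOUT)
            print(f"Connected via the {name} line: {endpoint}")
            return sock
        except OSError as exc:
            print(f"Could not connect via the {name} line: {exc}")
    raise ConnectionError("Both the primary and backup lines are unreachable")


if __name__ == "__main__":
    connection = connect_with_failover()
    connection.close()

As noted above, on May 12 the affected Members could not establish connectivity even through the backup network.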
2. Cause of the problem
As reported above, it has become clear that the high load (over 99%) on the routers at the data center was the cause of the connectivity failure between the Member systems and the Exchange system. However, the Exchange has not yet been able to identify what primarily caused the routers to be loaded to such a high level.
Through its analysis, the Exchange has confirmed that the technical problem was limited to one of the two sets of routers at the data center; the two sets are configured in duplicate to distribute the load. The connectivity failure was likewise limited to some Member systems.
The Exchange has also verified that, should the same type of event happen again, connectivity failure can be resolved by rebooting or switching the router at the Exchange’s data center without suspending the trading session.
3. Preventive measures
As described above, it has been confirmed that the same kind of technical problem can be resolved without suspending the trading session. Therefore the Exchange has started today’s (May 13) session with the following measures in place:
- Monitoring of the network facilities and countermeasures
- Strictly monitor the load status of the routers at the Exchange’s data center (a simplified monitoring sketch appears after this list)
- If the same problem occurs, the Exchange will reboot the affected router after notifying Members and other concerned parties.
In such a case, a line switchover may occur depending on the connectivity status. However, Member systems will not detect the switchover as a “technical problem”, because all of the network lines are set up to be redundant and duplicated.
- Replacement of the defective router to prevent further problems
The Exchange will replace the defective router with new equipment in the evening of May 13 in order to prevent further problems. (There are two sets of routers, four routers in total, at the TOCOM data center.)
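As an illustration of the monitoring measure referenced in the list above, the following minimal Python sketch polls router load at a fixed interval and alerts operators when a threshold is crossed. The router names, threshold, polling interval, and the simulated load query are hypothetical; this announcement does not describe how TOCOM actually monitors its routers.

import random
import time

LOAD_THRESHOLD = 0.90   # alert above 90% load; illustrative value only
POLL_INTERVAL = 10      # seconds between polls; illustrative value only
ROUTERS = ["router-set1-a", "router-set1-b", "router-set2-a", "router-set2-b"]


def read_router_load(router: str) -> float:
    """Return the router's current load as a fraction between 0.0 and 1.0.

    Simulated here with a random value; in practice this would query the
    router itself (for example via SNMP), an interface this announcement
    does not describe.
    """
    return random.uniform(0.3, 1.0)


def notify_operators(router: str, load: float) -> None:
    """Placeholder for alerting operations staff about an overloaded router."""
    print(f"ALERT: {router} load at {load:.0%} exceeds {LOAD_THRESHOLD:.0%}")


def monitor() -> None:
    """Poll each router and raise an alert when the load threshold is crossed."""
    while True:
        for router in ROUTERS:
            load = read_router_load(router)
            if load >= LOAD_THRESHOLD:
                notify_operators(router, load)
        time.sleep(POLL_INTERVAL)


if __name__ == "__main__":
    monitor()

In this sketch the alert only notifies operators; any reboot or switchover of the affected router would remain a deliberate step taken after notifying Members and other concerned parties, as described in the measures above.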
The Exchange continues to investigate the primary cause of the problem and will make an announcement as soon as it is identified.
Once again, Tokyo Commodity Exchange apologizes for any inconvenience the technical problem of May 12 may have caused to market participants and other concerned parties.