The dangers of putting market data on steroids: Going for speed, ending up with bulk

Date 14/07/2006

Peter Lankford
President, Technology Business Development Corporation

I often characterise the industry I work in as 'testosterone charged'. No, it's not professional wrestling, drag car racing, or body building (definitely not body building). It's the high-performance world of real-time data management for the capital markets.

In recent years, the drive to squeeze out latency while coping with explosive traffic growth has sent trading firms searching for the best ways to 'put their system on steroids'. It has also led to a macho one-upsmanship among technology vendors that is beneficial for the industry and sometimes entertaining to watch. Product names, company names, and advertisements are overflowing with testosterone-laced images of race cars, bobsleds, ferocious wild animals, and nearly every superlative that connotes speed or hyperactivity. It's as if the brilliant geeks behind these technologies finally have licence to express their inner Arnold.

To be fair, I've built a chunk of my career on this obsession, and our company continues to stoke the competitive fires by helping firms find ways to push through traditional performance boundaries. But we're also shifting more attention to the consequences of the arms race, as are many trading firms who are in the thick of it.

In particular, the proliferation of speed-inducing technologies is starting to complicate everyone's data environments at precisely the time that competition in the data business and regulatory changes promise to compound the complexity even further. The risk is that firms will lock in a new set of content stovepipes that are difficult and expensive to integrate. Heads of desks, risk managers, and others who depend on well-integrated data will find their application developers spending more time coping with the mechanics of content management, and less time implementing new business logic.

This is by no means a new issue, but it seems to be moving up on corporate agendas. Fortunately, there are ways to deal with it, but more work needs to be done. To review the situation, let's start with the latency race.

Steroids for all

Two or three years ago, when the media buzz around latency began to rise above the drone of cost-reduction and business continuity, the main business driver was quite narrow: speeding up distribution of North American equities and options data to automated trading applications.

These applications, which rapidly issue, cancel, and re-issue orders, can act on information extremely quickly. In some of these markets, firms can profit from as little as one millisecond of advantage over competitors. In most of them, however, the game is still being played in the tens of milliseconds.

By now, the obsession with latency has seeped into nearly every geography, asset class, and position in the pre-trade and trade process. As the capital markets CTO of a global bank recently put it to me: 'We'll take a serious look at any product with potential to improve the speed of just about anything'.

A measure of this seriousness is that latency management has been enshrined into policy at many institutions. Not only are IT groups routinely held to latency-related service level agreements, but in some cases, corporate audit departments actually conduct latency audits. In large firms, latency monitoring is coming to be viewed as a basic aspect of operational risk management.

There are several techniques for monitoring latency, but many of them suffer from a sort of Heisenberg Uncertainty Principle: the more accurately you try to measure latency, the more you end up increasing it. My own firm's research indicates that the best approach is passive traffic monitoring. This requires very fast tools, and the most promising approaches we've seen use specialised hardware in dedicated monitoring appliances.
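
To make the passive approach concrete, the sketch below shows its essence: timestamp each update as it arrives from a tap on the feed and compare it with the timestamp the exchange applied, accumulating statistics off the critical path. It is a minimal illustration, assuming the updates carry exchange timestamps and that host and exchange clocks are synchronised; the class and method names are purely hypothetical.

    // Minimal sketch of passive latency measurement: a tap on the feed calls
    // onUpdate() with the exchange's timestamp, and the monitor compares it
    // with the local arrival time. Names are hypothetical, not from any product.
    import java.util.concurrent.atomic.AtomicLong;

    public class LatencyMonitor {
        private final AtomicLong samples = new AtomicLong();
        private final AtomicLong totalMillis = new AtomicLong();
        private final AtomicLong maxMillis = new AtomicLong();

        /** Called off the critical path for each update observed on the tap. */
        public void onUpdate(long exchangeTimestampMillis) {
            // Assumes the local clock is synchronised with the exchange's (e.g. via NTP or GPS).
            long latency = System.currentTimeMillis() - exchangeTimestampMillis;
            samples.incrementAndGet();
            totalMillis.addAndGet(latency);
            maxMillis.accumulateAndGet(latency, Math::max);
        }

        /** Summary suitable for a latency SLA report. */
        public String snapshot() {
            long n = samples.get();
            return n == 0 ? "no samples"
                          : String.format("avg=%dms max=%dms n=%d",
                                          totalMillis.get() / n, maxMillis.get(), n);
        }
    }

The hardware appliances mentioned above apply the same principle at the network level, capturing packets at wire speed so that the act of measuring adds no load to the data path itself.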

Fuelling the concern with latency are the spread of electronic trading, a levelling of the playing field in established areas, and a desire to exploit - or defend against - new latency arbitrage opportunities.

Outside of equities and equity derivatives, foreign exchange desks are quickly losing their latency tolerance. Many clients now source tradable FX data directly from electronic trading venues and trade against it algorithmically. The same is starting to happen in highly liquid fixed income securities such as US treasuries. While most other fixed income instruments can also be traded electronically, a relative lack of liquidity makes latency less critical to the success of trades.

In terms of geography, both Europe and Asia are becoming increasingly active markets for low-latency direct feed technology. Although most sell-side firms are already directly connected to European exchanges, which have been electronic for years, they are starting to demand lower latency from their trading system vendors or to bypass those systems altogether with new technology.

In terms of data flow, the low-hanging fruit of the last few years has been reducing the time it takes for data to get from an exchange to the client site and then to trading applications. Clients have eliminated hundreds of milliseconds by connecting directly to exchanges, and tens of milliseconds by re-configuring their client site market data platform or migrating to a new one. Most large firms and many small ones have plucked this fruit, but there is sometimes still room to accelerate data acquisition even after establishing direct connectivity.

For example, a number of firms that originally adopted Layer 3 connectivity from an extranet provider are migrating to dedicated circuits which they manage themselves. Firewalls are also coming under scrutiny, as banks try to square security policies with the need for speed. In a Mohammed-to-the-mountain move, many firms are even moving the trading applications themselves closer to the data sources. In particular, smaller firms such as hedge funds can benefit by relocating their apps to a prime broker or independent hosting centre that is closer to the markets. Many exchanges are also betting that large firms will increasingly co-locate their apps at the exchange, but the benefits of this approach are limited if an application trades on more than one venue.

The drug suppliers

The field of technology vendors who enable direct connectivity is now starting to show signs of maturity. In 2003, it seemed like new direct-feed integration vendors were sprouting every few months. Today, this proliferation of vendors has stopped. Some of them have exited the race, while the leaders are breaking away from the pack, collecting most of the new customers, and expanding their offerings across geographies and asset classes. These leaders are also bringing new competition to the arena of market data platforms, with offerings that help clients get the most benefit from their direct feeds.

If vendor proliferation is a measure of demand, then trading firms are now paying more attention to the next step in the pre-trade process: analysis and signal generation. As of the time of writing (March 2006), I count no fewer than 15 vendors offering solutions to reduce the latency of analytics. These products take in massive streams of data and apply filters or complex analytics to them in real-time. Each of them claims to have a secret sauce for making real-time analytics superfast, and each supports high-level scripting or graphical tools for constructing those analytics. This will be an interesting space to watch over the next year, when it should become apparent where the demand is strongest and which vendors are best positioned to meet it.
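
For a flavour of what such an analytic looks like, here is a toy example, written directly in Java rather than in any vendor's scripting or graphical tool, of the kind of rolling calculation these engines evaluate against a tick stream. The class is illustrative only and not drawn from any product.

    // Toy streaming analytic: a rolling volume-weighted average price (VWAP)
    // over the last N trades, updated incrementally as each trade arrives.
    import java.util.ArrayDeque;
    import java.util.Deque;

    public class RollingVwap {
        private final int window;                                   // number of trades to include
        private final Deque<double[]> trades = new ArrayDeque<>();  // each entry is {price, size}
        private double priceVolume = 0.0;
        private double volume = 0.0;

        public RollingVwap(int window) { this.window = window; }

        /** Feed each trade as it arrives; returns the current rolling VWAP. */
        public double onTrade(double price, double size) {
            trades.addLast(new double[] { price, size });
            priceVolume += price * size;
            volume += size;
            if (trades.size() > window) {                           // drop the oldest trade
                double[] oldest = trades.removeFirst();
                priceVolume -= oldest[0] * oldest[1];
                volume -= oldest[1];
            }
            return volume == 0.0 ? Double.NaN : priceVolume / volume;
        }
    }

The vendors' 'secret sauce' lies in evaluating very large numbers of expressions like this concurrently, at low latency, against the full firehose of updates.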

Perhaps the most intriguing development is the growing set of vendors that are taking messaging and analytic logic right down to the metal - or rather, the silicon. These vendors, whose roots range from web traffic acceleration and text search to IP routing and messaging, are effectively trying to move hardware 'up the stack', enabling dedicated processors to take on functions that were traditionally reserved for software. The enabler is the declining cost of FPGA and ASIC technology, which reduces the unit volumes necessary to achieve an acceptable return on investment. The resulting appliances, actually a combination of hardware- and software-based logic, can be an order of magnitude faster than software-only solutions (shaving several tenths of a millisecond). However, their main value probably lies in their higher data capacity (millions of updates per second). Only a few firms could extract commercial advantage from the latency delta, while many more could reduce costs through server consolidation.

Such appliances face a number of challenges. Hardware vendors must prove that they can modify and productise their wares as quickly as software vendors. And clients need to adapt to a new paradigm. Collapsing an infrastructure concentrates its points of failure and requires analysis from a business continuity standpoint.

Appliances must also address enterprise IT policies on security, networking, operating platforms, and so on. It feels like we're at the beginning of the appliance curve, with more vendors likely to emerge before the losers are eventually weeded out. I expect that over the next three years, the industry will converge on specific uses for hardware-based logic and market data appliances, and cases where it still prefers a traditional software approach.

'Speed in the market' versus 'speed to market'

The ironic consequence of all these high-speed data, software, and hardware products is that they threaten to reduce that other form of speed that drives profit: time to market. Direct feeds have disrupted both of the product categories that tended to keep things manageable. Not only have they shattered the notion of the one-stop-shop consolidated feed; they are also pulling through sales of new market data platforms into organisations that already have one. Off-the-shelf analytic systems and data appliances promise to stir things up further. If not handled properly, this menagerie of new APIs, infrastructure, symbologies, and data models could create a serious drag on productivity and responsiveness.

Other market dynamics are likely to exacerbate the situation. In the US, just when the NYSE/Arca and Nasdaq/Inet deals made it look like the number of North American venues would decrease, new ECNs sprang up, and now broker-dealers have begun to get in the game. Similarly, in Europe MiFID is expected to multiply the number of transaction venues for liquid securities.

This means that firms will soon have more data sources they need to connect to directly. In addition, consolidated feed vendors will increasingly shed full tick data sources from their networks as volumes drive skyward, forcing customers to take those sources directly. Meanwhile, despite the un-sexiness of consolidated feeds, Bloomberg has now jumped into that business with both feet, challenging Reuters on its home turf. Many clients will soon find themselves with more than one consolidated feed to integrate, and some will want the flexibility to switch easily between these feeds and their corresponding data models.

In the old world of human-dominated trading, we relied on desktop applications to integrate content for decision makers. In today's world of machine-oriented trading, computer programs also need to 'see' the data. That means that application developers must deal with considerable complexity. Each of the data sources in a trading environment has a potentially unique way of referring to and accessing content, a different way of representing and formatting data, and different data vocabularies.

On top of this, APIs sometimes don't support the same programming languages, compilers, and operating system versions. It is hard to find developers who are familiar with these proprietary technologies, and it is expensive to train those who aren't. As the scope of algorithmic content consumption widens - for example, in cross-asset trading - so does the developer's challenge.

Creating, maintaining, and enhancing applications in such a fragmented environment consumes a great deal of time and money. Trying to change a piece of that environment can cost even more. In particular, once APIs and data models are embedded in an organisation, the costs to switch away from them are enormous. This hands data providers and technology vendors a huge lever to use in commercial negotiations for years to come. It also introduces risk, since it is difficult to switch data providers if there's a problem.

By shoe-horning critical applications into this complex, proprietary environment, firms are effectively embedding all of these costs, risks, and inflexibilities into their business. This hard-coded dependency even hurts data and technology vendors. It's difficult for a vendor to sprint ahead while dragging the huge ball and chain of an installed base that is tied to legacy products.

To phrase it in the 'testosterone' terms of a military analogy, a trading infrastructure with high performance but low integration is like an army with powerful war machines but poor command and control. Such an army is rigid and can be undermined by more coordinated assailants.

Staying nimble

Again, I'm certainly not suggesting that these integration problems are new. Only that current trends are bound to make them worse. A growing number of trading institutions share this view, as evidenced by the holistic approach they are taking to data models and infrastructure as they renovate their trading architectures. So what options do these firms have to choose from?

One might be to adopt open industry standards such as MDDL (Market Data Definition Language) or FIX (Financial Information Exchange). MDDL is a standard for modelling market data, put forward by the Financial Information Services Division (FISD) of the Software and Information Industry Association, with XML as the data representation. FIX, developed by FIX Protocol Ltd (FPL), is primarily used as a transaction protocol but also contains a model for market data.

Until recently, both standards faced the show-stopper issue that the latency, processing overhead, and bandwidth consumption of their wire protocols were far too high for streaming price data. However, FISD has created a binary expression for MDDL called xtcMessage, and FPL has created something similar called FAST (FIX Adapted for Streaming). Both of these get latency and resource consumption down to levels previously only associated with proprietary market data protocols.
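
To give a sense of how such binary encodings cut latency, bandwidth, and processing overhead, the toy codec below illustrates one of the underlying ideas: sending small deltas from the previous value rather than repeating full fields. It is emphatically not a conformant FAST or xtcMessage implementation, which additionally define templates, field operators, and compact integer encodings; it simply shows the intuition.

    // Toy illustration of delta encoding, one idea behind binary streaming
    // formats: put only the change since the last value on the wire.
    // This is NOT a conformant FAST or xtcMessage codec.
    public class DeltaCodec {

        /** Sender side: remembers the last price sent, emits only the delta. */
        public static class Encoder {
            private long previousTicks;                  // price in integer ticks
            public long encode(long priceTicks) {
                long delta = priceTicks - previousTicks; // small number for most updates
                previousTicks = priceTicks;
                return delta;                            // a real codec would pack this compactly
            }
        }

        /** Receiver side: rebuilds full prices from the stream of deltas. */
        public static class Decoder {
            private long previousTicks;
            public long decode(long delta) {
                previousTicks += delta;
                return previousTicks;
            }
        }
    }

Because consecutive quotes for the same instrument usually differ by only a tick or two, the deltas are small numbers that encode into very few bytes, which is where the bandwidth and parsing savings come from.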

While MDDL offers the richer data model, FIX has the advantage of market momentum due to its use in transactions. From where I sit, industry support for the FIX approach following the release of FAST has been impressive compared to support for MDDL. This lead might just be temporary, but MDDL is looking a bit like Betamax to FIX's VHS.

How broadly FIX will be adopted for market data is still an open question, however. It is most compelling today for exchanges, particularly those whose existing protocols are encountering latency or bandwidth problems. Yet even some exchanges that participated in the FAST proof-of-concept have no plans to adopt it or are only adopting selected portions of it and modifying the standard to suit their needs. It is also not clear what economics will drive FIX farther down the distribution chain into client site architectures, where proprietary wire protocols are already sufficient and data model requirements are much broader.

Another option is proprietary 'standards'. Most vendors who own a symbology, data model, or integration technology would like to convince clients to use it as their internal standard. In many ways this is attractive, particularly if the client already uses these things throughout their organisation and the vendor is willing to assume ongoing responsibility for interoperability with the client's current and future data sources (offering the proverbial 'single throat to choke'). However, this approach has the obvious risk of tying the client to the vendor's commercial policies and its ability to innovate. This model is becoming popular in the non-real-time space of end-of-day pricing and reference data management. We have yet to see how it will play out in the front office.

A final option, not mutually exclusive with the others, is to use technology to insulate businesses from the intricacies of data access. This is the 'deal with it' approach. It assumes that data complexities will not go away from the bottom up but rather must be dealt with from the top down, starting from the application developer's needs. It uses middleware to juggle multiple symbologies, normalise content to data models chosen by the business, integrate real-time and non-real-time data from across the enterprise, and enable the use of industry standards wherever that makes sense.
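
A minimal sketch of that insulation idea, with entirely hypothetical interface and field names, might look like this: applications code against one normalised model chosen by the business, and a per-source adapter hides each feed's native symbology and field conventions.

    // Sketch of an insulation layer: the application sees one normalised model,
    // and each data source gets an adapter. All names here are hypothetical.
    import java.util.Map;

    /** The business-chosen data model the application programs against. */
    interface NormalisedQuote {
        String instrumentId();   // internal symbology, not the feed's native code
        double bid();
        double ask();
    }

    /** One adapter per data source translates native messages into that model. */
    interface FeedAdapter {
        NormalisedQuote normalise(Map<String, Object> nativeFields);
    }

    /** Example adapter for a hypothetical feed with its own symbols and field names. */
    class HypotheticalFeedAdapter implements FeedAdapter {
        private final Map<String, String> symbolMap;   // feed symbol -> internal instrument id

        HypotheticalFeedAdapter(Map<String, String> symbolMap) { this.symbolMap = symbolMap; }

        public NormalisedQuote normalise(Map<String, Object> fields) {
            final String id = symbolMap.get((String) fields.get("SYMBOL"));  // hypothetical field
            final double bid = ((Number) fields.get("BID")).doubleValue();   // hypothetical field
            final double ask = ((Number) fields.get("ASK")).doubleValue();   // hypothetical field
            return new NormalisedQuote() {
                public String instrumentId() { return id; }
                public double bid()          { return bid; }
                public double ask()          { return ask; }
            };
        }
    }

Swapping or adding a data source then means writing or reconfiguring an adapter rather than touching the applications, which is precisely the insulation the business is after.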

One challenge of this approach is to manage the trade-offs between insulation and performance, since there is no free lunch. Most clients will take a layered approach to deployment. Those applications that require the absolute highest performance (sub-millisecond sensitive) will go to the lowest layer possible, while the majority of applications, which can tolerate an extra millisecond or two, will consume from an insulated layer.

Hardware-based solutions may play an important part in the long-run answer, particularly in wire format conversion, but probably not in the short term. Given the frequent logic updates necessary to accommodate the constant churn of data protocols in the industry, this functionality probably belongs in software for now. Perhaps when a standard wire protocol like FAST is widely adopted, this will change.

The natural place to turn for the required insulation software is the newly expanding field of market data platform (MDP) vendors. These players understand real-time data well and have always taken the integration challenge seriously. Most of them currently offer some sort of data normalisation and symbology mapping components. But these components are constrained by the underlying MDP infrastructure and have limitations ranging from performance, to entitlements management, to integration of non-real-time content.

Importantly, most of these MDP solutions also rely on the vendor's own breed of proprietary APIs, which perpetuates the long-term problems described above. Yet a standard API for streaming data does exist: the Java Message Service (JMS). JMS, which can be accessed from languages other than Java, is heavily used in financial firms and enjoys broad support from enterprise technology vendors. While MDP vendors have traditionally held that JMS is not fit for market data, several of them are now talking publicly or privately about providing market data through a JMS API. This is a positive step.
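
As a rough sketch of what consuming market data through JMS might look like to a developer, the example below subscribes to quotes published as MapMessages on a topic. The topic name, field names, and the way the ConnectionFactory is obtained are all hypothetical and would in practice be vendor-specific.

    // Sketch of a market data subscription through the standard JMS API.
    // Topic and field names are hypothetical; a real deployment would obtain
    // the ConnectionFactory from JNDI or from the MDP vendor's provider.
    import javax.jms.*;

    public class JmsQuoteSubscriber {
        public static void main(String[] args) throws JMSException {
            ConnectionFactory factory = lookupConnectionFactory();
            Connection connection = factory.createConnection();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Topic topic = session.createTopic("marketdata.quotes.VOD.L");   // hypothetical topic name
            MessageConsumer consumer = session.createConsumer(topic);

            consumer.setMessageListener(message -> {
                try {
                    MapMessage quote = (MapMessage) message;
                    System.out.printf("%s bid=%.4f ask=%.4f%n",
                            quote.getString("SYMBOL"),     // hypothetical field names
                            quote.getDouble("BID"),
                            quote.getDouble("ASK"));
                } catch (JMSException e) {
                    e.printStackTrace();
                }
            });
            connection.start();                            // begin message delivery
        }

        private static ConnectionFactory lookupConnectionFactory() {
            // Provider-specific: typically a JNDI lookup or a vendor factory class.
            throw new UnsupportedOperationException("obtain from JNDI or your JMS provider");
        }
    }

The attraction is that nothing in the subscription code is proprietary to the data provider; changing providers becomes largely a matter of configuration rather than a rewrite.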

In general, I think that MDP vendors would be wise to take even more cues from the enterprise technology gang. The buzzwords those vendors aim at other industries are 'adaptive', 'agile', and 'on-demand integration'. It is tempting to think that this is about getting the rest of the world to catch up with the advanced technology of the capital markets front office, where innovations like the messaging bus were first to catch on.

But when it comes to the layers above raw connectivity, I think our industry is actually behind the others. To see what I mean, take a look at products in nearly any middleware category starting with 'E'. Enterprise Application Integration (EAI) vendors, with their focus on transactions, Enterprise Information Integration (EII) vendors, with their web-application orientation, and even Extract/Transform/Load (ETL) vendors, who deal in the world of batch processing, are all offering impressively flexible tools for transforming data, virtualising data sources, and otherwise simplifying access to data through open standard interfaces.

These technologies themselves currently can't operate at the speeds required by the front office, and they are usually limited by their relational-data orientations. However, the concepts readily apply to real-time market data. In fact, if MDP vendors take too long to solve customers' integration problems, enterprise technology vendors might just creep into the MDP market and disrupt it.

Wouldn't it be ironic if our industry was so bulked up on steroids that it was invaded by a bunch of nimble girlie men?

Peter Lankford is President of Technology Business Development Corporation (www.tbdcorp.com), which provides securities firms and their vendors with services ranging from business strategy to software development, in the area of real-time information management. Previously, Peter was Senior Vice-President of Information Management Solutions at Reuters, where he led the market data systems business for three years. During his eight years with Reuters, he also oversaw strategic marketing for real-time datafeeds and TIBCO-based enterprise integration solutions. Prior to Reuters, Peter held management positions at Citibank, First Chicago, and operating-system maker IGC.