MDDL - Market Data for the Real World

Date 07/06/2004

James E. Hartley
Chief Technologist, SIIA/FISD

Overview

The Market Data Definition Language (MDDL) is an effort by the membership of the Financial Information Services Division (FISD) of the Software & Information Industry Association to define a framework for a common vocabulary to describe market data for the world's financial instruments (see www.mddl.org for more information).  The consortium seeks to define standard terms for describing all market data - not to limit the market data that is distributed to some common subset.  In addition to a method of encoding market data, MDDL is defining a common protocol for exchanging market data - to capitalize on a standards-based framework for financial information which will reduce the overall cost of processing market data and supporting market data systems.

Any market data consumer - be it a data vendor, redistributor, bank, brokerage, or investment manager - currently must deal with a dizzying collection of disparate sources, each of which provides (sometimes unique) market data in a unique way.  Each consumer must spend considerable money and resources to adapt software to acquire the market data, perform translations to normalize each source to a common internal format, train development and product staff in the nuances of those translations and the normalized structure, and maintain this infrastructure as each source independently imposes changes on its interface and the data it provides.  In many cases this acquisition can be outsourced or contracted from a consolidator - but large customers may not wish to accept the latencies such services add to the distribution of data, nor the lack of control over which components of the source data are made available to them.

Although the benefit of a normalized data description language and a common communications protocol is rather obvious to a firm that consumes market data, such organizations are unable to effect such change without due adoption by their suppliers. Similarly, many providers are reluctant to make changes without sufficient demand from their customers (often requiring a monetary incentive).  Thus this change, as with many changes, has been slowed by a standoff amongst market participants.  Luckily, a few forward-thinking exchanges (and other data providers) are easing the transition.

The value proposition of standards is great for every participant throughout the financial information lifecycle, especially if the industry-wide overall cost of processing and distributing market data is considered - not only the costs within an individual firm, but also the costs one organization must incur interfacing to another (through training and software development or maintenance).

MDDL seeks to harmonize the description of market data and the way it is exchanged amongst all participants in the financial information industry.  Through modern approaches to representing data, a dynamic communications protocol has been developed that not only promotes clear understanding and normalized interfaces, but provides an immediate way to add new content - without the need to wait for all recipients to retool their systems to handle the new information.  MDDL is a means by which the industry can clearly define market data so it is readily understood while reducing the overall costs of distributing and processing that information.

The Relevance of Standards

Standards help everyone in the information supply chain - providers can use a generally accepted and understood nomenclature to describe their data while consumers can normalize processing around the standard. As appropriate standards become more pervasive, producers need not expend so much effort in supporting datafeed systems or describing the data to consumers, thus increasing their profit margins.  Consumers reduce their incremental spend for acquiring additional data that can be used to improve their decision-making capabilities.

There are examples of standards in every aspect of daily life - and the same principles can be applied to the financial information industry. Consider the telephone system - now that the various phone companies internationally have agreed on standards for telecommunication, just about any phone can be used to connect to any other phone the world over.  It matters neither which vendor made the physical telephones nor which communications providers are used to connect to the international network - the data (in this case, human voices) is transported between the connected parties (assuming, that is, applicable business arrangements have been made to pay for the connection).

The proper application of standards throughout the financial information lifecycle has the capability to reduce the cost of processing transactions and to minimize the delay between the provider and the consumer of market data. Many industry analysts can quote statistics about the number of trade failures (and the costs of those failures!) due to inaccurate reference data amongst trading partners.  Consider the confusion that arises when information is distributed via two different providers but arrives at the recipient as seemingly different data.  Inaccurate data requires humans to intervene to correct the misunderstanding - or to place translation systems into the process which cost real money to create and support, add distribution delays, and are another point of failure in an already complex system.

Ongoing Standards Developments

One organization interested in useful standards is REDAC (the Reference Data Coalition), which, along with its U.K.-based cousin RDUG (Reference Data User Group), has set about finding solutions to industry issues associated with the clear and precise communication of reference data.  Principal among these issues are Unique Instrument Identification (UII), Legal Entity Identifiers (LEI), and Business Entity Identifiers (BEI) - all standards for the identification of relevant reference data and the foundational underpinnings of securities processing automation.

ISO (International Organization for Standardization) Technical Committee 68 (Banking, Securities, and Related Financial Services), Sub Committee 4 (Securities and Related Financial Instruments) has established Working Group 8 (WG8) to define an International Business Entity Identifier (IBEI) for identifying business entities playing a role in the lifecycle of, and events related to, a financial instrument - including "pools of money" and other "holders" of assets.  REDAC and RDUG, through FISD's status as a liaison to ISO TC68/SC4, are actively working with WG8 to define this standard so that the new identifier standard subsumes the requirements for LEI and BEI.

The ISO standard 15022 Data Field Dictionary and Catalog of Messages is being redefined in terms of eXtensible Markup Language (XML) as directed by ISO. TC68/SC4's Working Group 10 has set the guidelines for modeling content in the new ISO 15022 XML Edition while the relatively new Working Group 11 is chartered with expanding the repository for a common "Market Data Model" (a process which may take several years).  MDDL (through FISD's liaison status) is working with WG 11 to incorporate MDDL's terms, definitions, and relationships into the repository while developing a common model for the financial industry that will include setup, pricing, and maintenance data items in a fashion all applications can use.  Indeed, it is hoped that FIX, FpML, TWIST, SWIFT, and other similar organizations will normalize around the new ISO 15022 XML Edition-based "Market Data Model" to provide a common framework for the entire financial information services industry.

The MDDL organization embraces the work of these groups to define common elements that can be used to clarify the communication between market data participants. As such, MDDL conveys each of these standard identifiers as part of the nomenclature used to describe market data.  In its current form, MDDL can encode the reference data and pricing information available in many datafeeds now available.

Securities Processing Automation

The ultimate goal of all financial industry standards efforts is (and should be!) Securities Processing Automation (SPA).  SPA is the effective application of computers to process the ever-increasing volume of transactions with minimal (and ideally no) handling by human hands (beyond the initial decision to deal).  Without standards that can be implemented by computers and understood by all industry participants, SPA can never be achieved, and thus much of the existing capability of our industry will stagnate or be limited by the amount (and cost) of human resources that can be applied to a problem.  As more applications are developed using appropriate standards, the content is more readily understood by a broader audience and processing times (and delays!) between provider and consumer are minimized - but more importantly, accurate data is exchanged, ensuring quality processing.

Any market data system supporting the entire trading, settlement and clearing, reporting, and analysis lifecycle is a complex manipulation of reference databases, historical analytics and pricing, and realtime updates.  To effect SPA, it is imperative that the datasets are kept accurate and that the linkages between the datasets are precise and workable.  The terms used to describe the reference data should be used to define the pricing information as well - and the maintenance and changes (i.e. corporate actions) to this data should be expressed in those same terms.  Thus, a common vocabulary for describing data throughout the financial information lifecycle is required for effective Securities Processing Automation.

Beyond describing the data, it is necessary to exchange data between different participants. As long as a common language is used it does not matter how the information is physically transferred between parties.  However, if the trading lifecycle is broken into its major components - trade execution, market data reporting and analysis, and settlement and clearing - one can appreciate that a common transference mechanism within each component reduces the complexity and delays associated with communicating information.

Consider the effect of new types of assets that are not compatible with existing systems, or new information necessary to process transactions, or additional analytics that can clarify trends or market positioning.  SPA is limited by the ability of infrastructure and interfaces to allow new data elements, or even new types of data, to be exchanged by trading partners.  Modern market data systems (and, indeed, trading systems) must be designed with flexibility in mind to avoid costly overhauls and upgrades to systems exchanging data.  Certainly, the application of business logic must be calculated into the cost of processing new assets or new market data, but the underlying infrastructure should be built to accommodate such change.  Modern standards for market data permit this growth - and proper implementation of them provides a measure of "future-proofing".

Applying Standards to Real Data

Once the decision to incorporate standards into a trading or market data system has been made, the very real work of mapping existing data to the standard begins. This is generally not a difficult process - although retooling systems to interface with the standards may take some time and effort.  Such migrations are best performed when existing systems are proven to require significant modifications, or when a new interface must be adapted into the current model.  With careful planning, the migration to a standards-based infrastructure need not be costly or disruptive - in fact such improvements are most likely to succeed when combined with planned enhancements.

The key to a successful implementation of standards in a financial information system is how well the resultant system adapts to new content or processing requirements. Standards cannot define every aspect of the system - but systems designed with the notion that standards could conceivably dictate how the systems operate are more likely to be adaptable to changes in the industry.  Modern computers are powerful enough that many of the decisions for content and processing need not be "hard-coded" into the software (for implied performance improvements) but can be abstracted into externally provided configuration information.

Likewise, the content that is exchanged between counterparties (in any trading or market data communication) is usually well-defined at an instant in time but is likely to evolve over time as relationships deepen or automation matures. The interfaces between systems (either within an organization or between counterparties) should be expected to change, and the system designed to allow this - without software changes by (potentially) expensive IT departments.  As with business processing rules, datafeed and "on-the-wire" content should be abstracted into configuration parameters such that changes in content do not require changes to software to communicate or distribute information throughout the system.
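
To make this concrete, here is a minimal sketch of configuration-driven content handling - the field names and configuration format are invented for illustration and are not part of any FISD specification. The set of fields carried in each record is read from configuration, so adding a field to the feed becomes a configuration change rather than a software release:

    import json

    # Hypothetical configuration describing the fields carried in each
    # update record; in practice this would be loaded from an external
    # file that can change without touching the software.
    FEED_CONFIG = json.loads('{"fields": ["symbol", "last", "size", "time"]}')

    def parse_record(raw, config=FEED_CONFIG):
        """Split a delimited record into named fields per the configuration."""
        return dict(zip(config["fields"], raw.split("|")))

    # If the provider later adds a "venue" field, only the configuration
    # changes - this parsing code is untouched.
    print(parse_record("IBM|90.25|100|14:30:05"))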

Once information is available throughout an infrastructure in a standard format, applications can be written to process the content - either to provide "value-added" services, to perform automated functionality, or to display the information for a human. These applications should be written with data abstraction concepts in mind as well - to allow for (unplanned) additional data in the source data stream or changes in the data requirements of the receiving processing subsystems.  These concepts are not new - although many existing systems have been developed without considering the inevitable requirements for change (or assuming that such changes could only be implemented by employing considerable human resources to make the modifications).

XML for Market Data

The eXtensible Markup Language (XML) is a simple, very flexible text format which is an effective way of encoding content so it is easily abstracted and adapted to new uses and requirements.  As a product of the World Wide Web Consortium (W3C - see http://www.w3c.org/), the XML specification is in widespread use throughout the Internet and most major data-intensive software applications.  XML is a "natural" syntax to use in describing data exchanged across interfaces as well as information stored within subsystems.  If used correctly, XML-based data and applications provide all of the power and flexibility required in modern market data systems yet are performant enough for the most demanding data requirements.

MDDL is the application of XML to the market data problem. The result is a very powerful descriptive language for market data that has the flexibility to adapt to the ever-changing financial industry while providing a unifying vocabulary on which a standards-based market data infrastructure can be built.  As noted above, however, MDDL (and XML) cannot solve all of the problems of modern systems unless proper attention is given to abstracting data and functionality while implementing systems that allow for changing data requirements.

XML, when used in its literal form, can dramatically increase the size of a market data document. Consider that a "normal" market data trade report from an exchange may require ninety (90) bytes of bandwidth to communicate the trade - an "XMLized" version of this report may require more than 900 bytes!  Now weigh this against the requirement that most data vendors must be able to report this trade in nine (9) bytes and you can understand why XML is not universally accepted as "the answer".  Such assumptions, however, are the product of the current IT environment prevalent throughout the industry.
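
As a rough illustration - the element names below are invented for this sketch and are not actual MDDL syntax - compare the byte counts of a compact delimited trade report and a fully marked-up XML rendering of the same facts:

    # A compact, fixed-format trade report as an exchange might send it.
    compact = b"IBM|90.25|100|14:30:05|NYSE"

    # The same facts wrapped in illustrative XML markup (invented element
    # names, shown for size comparison only).
    xml = (b"<trade>"
           b"<instrument><symbol>IBM</symbol></instrument>"
           b'<last currency="USD">90.25</last>'
           b"<size>100</size>"
           b"<time>14:30:05</time>"
           b"<venue>NYSE</venue>"
           b"</trade>")

    print(len(compact), "bytes compact vs", len(xml), "bytes as XML")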

There is not much contention about the notion that XML (MDDL) should be used to encode reference data and historical pricing - these datasets are generally "end-of-day" or "one-offs" that do not update frequently. Thus, if the pricing or reference data file is large, common off-the-shelf compression techniques can be used to reduce the file to a manageable size.  Similarly, the XML encoding of corporate actions or other changes to reference data or historical pricing is acceptable because the frequency of messages does not create bandwidth concerns.
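
For such file-based datasets the tooling is ordinary. A minimal sketch, assuming an end-of-day MDDL document has already been written to a (hypothetical) file named prices.xml:

    import gzip
    import shutil

    # Compress an end-of-day MDDL pricing file with an off-the-shelf
    # algorithm; XML's repetitive markup typically compresses very well.
    with open("prices.xml", "rb") as src, gzip.open("prices.xml.gz", "wb") as dst:
        shutil.copyfileobj(src, dst)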

Realtime streaming pricing presents a different challenge.  Major market data vendors are looking to process 100,000 (or even millions of) market data trade and quote reports every second over multiple high-capacity communications circuits while using the smallest amount of bandwidth possible to deliver that data downstream.  Existing communications circuits and processing systems between vendors and their customers are already overtaxed by the amount of data that must be conveyed.  Consumers do not wish to increase their expenditure on more expensive communications circuits (despite their desire to acquire data directly from the source).  With all of these concerns, it is understandable why adopting an encoding that is more "verbose" is not desirable and why "proprietary" encodings and protocols are still the normal course in market data systems.

An Industry Standard Market Data Communications Protocol

With bandwidth and throughput considerations in mind, FISD has developed a protocol that describes the data exchanged over a communications circuit using XML in such a way that the data can be compressed every bit as efficiently as existing proprietary methods. Further, the protocol exchanges this configuration information "over the wire" as part of the session establishment, thus yielding the flexibility required for future requirements.  Because the actual content is exchanged as a field (rather than a completely marked-up XML document) additional processing is not required to generate or receive the updates (as compared to current delivery schemes).

Using this protocol, a market data provider creates a fisdMessage-compliant datafeed that is used to distribute its XML-encoded data.  The configuration information defines what data will be sent in each update message (called a "template").  The protocol distributes the template downstream to consumers and only the data VALUES that update are delivered (not the entire XML document!).  The definitions of the data fields are precise - because they map directly to the XML - and the content is not unduly "verbose" because only the raw data is distributed (not the markup).
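
A minimal sketch of the provider side of this idea - the template layout, field names, and delimiter are assumptions for illustration, not the published fisdMessage wire format. The template is sent once at session establishment; thereafter each update carries only the values, in template order:

    # Illustrative template, sent once when the session is established.
    # Each entry names the MDDL-defined field that a value maps to.
    TEMPLATE = ["symbol", "last", "size", "time"]

    def encode_update(report, template=TEMPLATE):
        """Serialize only the values, in template order - no markup."""
        return "|".join(str(report[name]) for name in template)

    trade = {"symbol": "IBM", "last": "90.25", "size": 100, "time": "14:30:05"}
    print(encode_update(trade))   # IBM|90.25|100|14:30:05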

A consumer of market data can (independently) write a fisdMessage-compliant datafeed handler that can connect to any fisdMessage provider.  In this way, a single application can be used to interface to all providers and acquire all content.  The consumer's software support for datafeeds is significantly reduced (by potentially as many datafeeds as they consume) and the function of mapping each datafeed to a common internal reference system is simplified because each source uses the agreed standard terminology (MDDL).  Additionally, infrastructure support for datafeed handlers is reduced because the connectivity and hardware requirements for all datafeeds are now similar - deployment models are simply a function of convenience and available processing power rather than being dictated by providers' proprietary methods.
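
The consumer side is symmetrical. Continuing the illustrative sketch above (again an assumption about the wire layout, not the published protocol), one generic handler can decode updates from any provider once it has received that provider's template:

    def decode_update(raw, template):
        """Rebuild named fields from a values-only update using the template."""
        return dict(zip(template, raw.split("|")))

    # The same handler works against any provider - only the template differs,
    # and it arrives over the wire at session start.
    template = ["symbol", "last", "size", "time"]
    print(decode_update("IBM|90.25|100|14:30:05", template))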

Look for the fisdMessage protocol to be implemented by several major exchanges in 2004 as an evolutionary step forward in datafeed distribution. Although the specific content of each exchange is different, a single datafeed handler can be used to capture the data from all of the exchanges.  Further, an exchange using this protocol can add new data to the datafeed without software changes in the handler.  As MDDL and fisdMessage adoption progresses throughout the financial information lifecycle, consumers of market data should realize reduced operational costs, reduced latency in processing market data, and increased flexibility for handling additional data sources.

James E. Hartley is Chief Technologist of SIIA/FISD, charged with the development of MDDL as an industry tool for communicating market data.  Mr. Hartley lives in Denver, Colorado, U.S.A. He can be contacted via jhartley@siia.net or at +1 303 322 1393.  See www.mddl.org for more information.