There are oceans of data in the municipal bond market. Need data on trades? Got it. Bid-wanteds and requests for quotes? Right here. Data on macroeconomic or demographic trends? Check. Yield curves and prices? Yes. Financial performance? Yes again. Bond offering documents and disclosure? Yessiree. New issuance structure and levels? Uh-huh. Sector-specific operational data, say on utilities or healthcare? Yup.
Whatever data you could want, somewhere it exists. Moreover, new data is being generated every second, minute, hour, and day. For free or fee, data abounds.
Got Data?
Arguably the world’s most comprehensive and publicly accessible municipal market database can be found at the Municipal Securities Rulemaking Board. If it pertains to a publicly offered municipal security, from Official Statements to Annual Comprehensive Financial Reports (ACFRs) to rating changes to bond trades and everything in between, it has to be filed with the MSRB. Disclosure documents associated with municipal bond issues are available at no charge on the MSRB’s Electronic Municipal Market Access platform (EMMA for short), as is municipal trading data. Both are also available by paid subscription for those who need them reported in real time. The truly remarkable EMMA Labs offers a technology sandbox to play in, including a dynamic data dashboard powered by EMMA trade data and a disclosure search tool with keyword search capabilities.
In fact, many of the market’s fee-for-data providers draw from EMMA.
Financial information on the thousands of bond issuers and bond-funded projects abounds. Merritt Research Services, now an Investortools Company, was the first firm to realize the market’s need for financial data. Established in 1985 and commercially releasing its first database in 1986, the firm now has full financial disclosures on some 12,000 borrowers across 15 sectors, from School Districts to Life Care Retirement Centers, as well as information and disclosures on about 50,000 obligors across the entire spectrum of municipal bond issuers. Some data sets, such as Higher Education, go back 35 years. Merritt focuses on complete audited financial information because of its detail, and it back-tests extensively for anomalies to catch errors. With hundreds of unique data fields for every sector, the firm offers benchmark median values for financial statement entries for a variety of sectors.
Merritt is not alone. DPCData, founded in 1992, boasts 30 years of disclosure data for all active municipal issuers and at least five years of financial data on around 25,000 obligors. Municipal bond issuers and obligors can be complex. There are municipal entities that issue bonds directly, like the City of New York, and there are ‘conduit issuers’, like the Dormitory Authority of the State of New York, which issue bonds on behalf of other municipal or non-profit entities. DPC strives to normalize and organize this data. Correct labelling and taxonomy are key to assuring the data’s quality and, by extension, making it more consumable and scalable. Once the framework is established, the cost to add more data is minimal. Correspondingly, DPC offers a variety of data-based solutions including credit, disclosure, geomapping, and climate risk scoring, backed by ever increasing amounts of updated data.
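To make the issuer-versus-obligor distinction concrete, here is a minimal Python sketch of how such a taxonomy might be labelled. It is not DPC's actual schema: the `Bond` class, its field names, the `BOND-00x` identifiers, and the "Example University" and "Example Hospital" obligors are all hypothetical, invented purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Bond:
    bond_id: str   # hypothetical identifier, not a real CUSIP
    issuer: str    # the entity that issues the bond
    obligor: str   # the entity responsible for repaying it


def is_conduit(bond: Bond) -> bool:
    """Treat a bond as conduit-issued when issuer and obligor differ."""
    return bond.issuer != bond.obligor


def group_by_obligor(bonds):
    """Index bond IDs by obligor, so analysis follows who actually repays."""
    grouped = defaultdict(list)
    for bond in bonds:
        grouped[bond.obligor].append(bond.bond_id)
    return dict(grouped)


DASNY = "Dormitory Authority of the State of New York"

# Illustrative records only; the obligors below are made up.
bonds = [
    Bond("BOND-001", "City of New York", "City of New York"),
    Bond("BOND-002", DASNY, "Example University"),
    Bond("BOND-003", DASNY, "Example Hospital"),
]

by_obligor = group_by_obligor(bonds)
conduit_ids = [b.bond_id for b in bonds if is_conduit(b)]
```

Once every record carries both labels, the two conduit bonds stop looking like one issuer's debt and sort cleanly under the separate borrowers behind them.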
There are other financial data providers, such as Bloomberg and Mergent, each offering an array of data solutions drawn from similarly extensive obligor reporting.
But Wait. There’s More. And Even More.
In addition to posted trades and financial information, there is also a vast amount of market information in bid-wanted and request-for-price lists. A bid-wanted list comes about when an investor wants to sell a number of bonds from a portfolio. A list gets put together and posted on a trading platform for other investors to bid on. It’s an online auction, sort of an eBay for bonds. Similarly, sometimes investors just want an indication of what price the market would pay for a bond. That’s where a request-for-price list comes in.
Tens of thousands of bonds are bid on and priced off these lists every day. Provided in almost real time, this invaluable data gives insight not only into price levels and trade sizes but also into overall market sentiment.
Then there is detailed bond underwriting data, such as a bond’s debt structure and pricing. S&P Global Market Intelligence has tracked and accumulated key information on municipal bond underwriting for the past 20 years. Accurately self-described as the “Golden Record” for the market, the searchable data in their Muni Deal Query includes 150 fields covering $10+ trillion of municipal bond issuance across more than 320,000 financings. This all translates to well over 4 million bonds.
When it comes to yield curves, Refinitiv’s TM3 (the ‘M3’ derived from its more formal title, Municipal Market Monitor) is generally acknowledged as the market’s primary source, and its MMD AAA Yield Curve serves as the market’s unofficial benchmark curve. With over 250 scales, MMD yield curves cover US municipal sectors, credits, coupons, tax types, and a variety of other market segments, and participants use them to measure yield movements. However, other vendors such as ICE, IHS Markit, and S&P Municipal Bond Index offer their own versions of the various curves.
Given that the municipal bond market finances projects in communities nationwide, market participants consider data far more expansive than financial statements, pricing, yield curves, and trades. Demographic and economic data on every town, city, and county is in the U.S. Census. The Bureau of Labor Statistics has volumes of economic data. The U.S. Federal Reserve offers copious numbers and statistics on financial assets and liabilities for everyone from households to corporations. The Federal Reserve Bank of St. Louis hosts the Federal Reserve Economic Data (aptly given the moniker FRED), tracking GDP, unemployment, inflation and just about any other economic data point you could want. Climate data is tracked by the National Oceanic and Atmospheric Administration. Data on nearly every aspect of the United States is available somewhere on a U.S. government agency or department website.
It’s not just the federal government with data. Local governments have an enormous amount of public data collected from all the services they provide. More than 130 cities have open data portals where anyone with a computer can access literally hundreds of data points on nearly any topic. With the passage and implementation of the Financial Data Transparency Act of 2022, there promises to be even more financial data reported from local governments.
There’s more, such as data for various sectors within the municipal bond market. Healthcare, utilities, water, sewer: you name it, there are copious amounts of operational data. From outpatient visits to gallons of wastewater treated, there is data.
Aside from government and agency sourced data, there are the headlines and posts in traditional and social media. Sometimes overlooked as a data source since they don’t come organized in downloadable CSV files, all these can be amassed and analyzed through large language models (LLMs) and other AI tools to ascertain market sentiment. Even content in emails and team communication apps, like Slack, can be scraped and analyzed.
It can be a bit overwhelming.
The Future Was Yesterday.
All well and good, but now the question is, what can you do with all this data?
Nearly anything you can imagine. In using AI and data, like all technologies, the question is not can it be done, but how can it be done, and usually how fast can it be done.
In AI, it’s fast. NVIDIA reports that to accelerate performance for multitrillion-parameter and mixture-of-experts AI models, the latest iteration of NVIDIA NVLink® delivers groundbreaking 1.8TB/s bidirectional throughput per GPU, ensuring seamless high-speed communication among up to 576 GPUs for the most complex LLMs. Throughput is the amount of information a computer system can process in a given amount of time, measured in bits, bytes, or data packets per second. Per second.
Let me interpret by example and comparison.
By some estimates, the MSRB’s Official Statement and disclosure information gathered over the last 14 years and 5 years, respectively, totals around 54 million pages. Let’s convert those pages into digits. In digital terms, this sentence has about 52 bytes. Each character, which includes spaces and commas and periods, is roughly one byte. A kilobyte has 1024 bytes, somewhat short of half a page of text, so a full page is around 2.5 kilobytes. A megabyte is 1024 kilobytes and one gigabyte has 1024 megabytes. Ballparking it, there are around 178,000,000 words in 1 gigabyte. Using some back-of-the-spreadsheet calculations, this all totals up to a roughly estimated 244 gigabytes of words, give or take 10%.
There are 1,000 gigabytes in one terabyte. The NVIDIA hardware can move 1.8 terabytes per second. In short, it can pass all of that muni data through in around 0.13 seconds.
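The division behind that figure can be checked in a few lines. The Python sketch below takes the numbers stated above as given (244 gigabytes, plus or minus 10%, at 1.8 terabytes per second, with 1,000 gigabytes to the terabyte). The function and variable names are illustrative, and the result is the time to move the data at the quoted throughput, not a measurement of how long an actual model would take to analyze it.

```python
GB_PER_TB = 1_000  # decimal convention, as used in the text


def transfer_seconds(size_gb: float, throughput_tb_per_s: float) -> float:
    """Seconds needed to move size_gb at the given throughput."""
    return size_gb / (throughput_tb_per_s * GB_PER_TB)


estimate_gb = 244.0    # estimated size of the muni disclosure text
margin = 0.10          # the plus-or-minus 10% stated in the text
nvlink_tb_per_s = 1.8  # NVLink throughput per GPU quoted by NVIDIA

low = transfer_seconds(estimate_gb * (1 - margin), nvlink_tb_per_s)
central = transfer_seconds(estimate_gb, nvlink_tb_per_s)
high = transfer_seconds(estimate_gb * (1 + margin), nvlink_tb_per_s)

print(f"{low:.3f}s to {high:.3f}s, central estimate {central:.3f}s")
```

Across the whole 10% band the answer stays between roughly 0.12 and 0.15 seconds, so the conclusion does not hinge on the exact page count.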
Let’s compare that to humans. Either fact or fact-by-repeated-lore, the “10,000 hour” rule offers that it takes human beings about that amount of time to truly master a complex task. That means working tirelessly eight hours a day, with no weekends or vacations, for around three and a half years. If you’re some slacker who wants time off, it’s closer to five years.
Human being: 5 years. Computer: 0.13 seconds.
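The human side of the comparison is the same kind of arithmetic. In this short Python sketch, the 10,000 hours and the eight-hour day come from the text; the 250 working days per year used for the "time off" case is an assumption, chosen because it is a common rule of thumb for a year with weekends and holidays removed.

```python
HOURS_TO_MASTERY = 10_000
HOURS_PER_DAY = 8


def years_to_mastery(working_days_per_year: int) -> float:
    """Years needed to log 10,000 hours at eight hours per working day."""
    return HOURS_TO_MASTERY / (HOURS_PER_DAY * working_days_per_year)


no_days_off = years_to_mastery(365)    # every day of the year
with_time_off = years_to_mastery(250)  # assumed working days per year

print(f"No days off: {no_days_off:.1f} years")
print(f"With weekends and holidays: {with_time_off:.1f} years")
```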
Siri, Build Me An Optimized Bond Portfolio
With that kind of processing speed, in barely a few seconds an AI driven investment model can analyze vast amounts of data, find investment opportunities, and execute trades, all the while continuously learning from its actions to be increasingly accurate.
And the municipal bond market has oceans of data ripe for the AI picking.
This is the third article in the series, AI and the Municipal Bond Market. The next article will cover how AI is tackling one of the municipal bond market’s most intractable problems—pricing bonds.