xBRL-JSON: making XBRL easier
This week we published the first public working draft of the XBRL Open Information Model (OIM). The OIM is a syntax-independent model of the content of an XBRL report, designed to make it easy to work with XBRL data in a variety of different formats, as well as simplifying some of the more arcane aspects of the XBRL specification. You can read more about the goals of the OIM in the release announcement.
Accompanying this release was a public working draft of xBRL-JSON, a standard JSON representation of an XBRL report based on the OIM.
Making XBRL data easier to consume
With a growing quantity of good-quality XBRL data being made publicly available, we’re increasingly seeing users reap the benefits of structured financial data, but uptake is slow. I believe that at least part of this is due to XBRL’s steep learning curve: to accomplish even simple tasks you need to understand contexts, units, dimensions, DTSs and more.
The OIM, and a JSON representation in particular, give us a great opportunity to make XBRL data more accessible to anyone who benefits from accessing data in a structured form.
We see the primary role of xBRL-JSON as being the publication of XBRL data that has already been validated as part of a regulatory data collection system, and to this end, the design of xBRL-JSON has prioritised simplicity of consumption over efficiency of representation.
For submissions from filers to regulators in an XBRL reporting system, we expect the existing XML-based syntax (now dubbed xBRL-XML) to remain the default choice for some time to come, due to the stronger validation offered by XML Schema compared to current JSON equivalents.
xBRL-JSON: what does it look like?
As an example, consider a single XBRL fact represented in XML with its accompanying context and unit definitions:
<xbrli:context id="c1">
  <xbrli:entity>
    <xbrli:identifier scheme="http://standards.iso.org/iso/17442">12345</xbrli:identifier>
  </xbrli:entity>
  <xbrli:period>
    <xbrli:startDate>2015-01-01</xbrli:startDate>
    <xbrli:endDate>2015-12-31</xbrli:endDate>
  </xbrli:period>
</xbrli:context>
<xbrli:unit id="u1">
  <xbrli:measure>iso4217:USD</xbrli:measure>
</xbrli:unit>
<gaap:Profit contextRef="c1" unitRef="u1" decimals="-6">12000000</gaap:Profit>
The same fact in xBRL-JSON would look like this:
{ "oim:concept": "gaap:Profit", "oim:accuracy": -6, "oim:unitNumerator": [ "iso4217:USD" ], "oim:period": "2015-01-01/2016-01-01", "oim:entity": "lei:12345", "value": "12000000", "numericValue": 12000000, }
It’s unfair to compare the size of the two representations directly, as in the XML case contexts and units may be reused by multiple facts, but the JSON representation is much simpler to understand: it’s seven key-value pairs describing the properties of a fact, and, crucially, you can load it into just about any programming language, get an array of objects and start working with the data.
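For instance, a minimal Python sketch along these lines (the report.json filename and the top-level "facts" array are assumptions for illustration; the actual document structure is defined in the xBRL-JSON draft):

import json

# A minimal sketch of consuming xBRL-JSON. "report.json" and the top-level
# "facts" array are illustrative assumptions; see the PWD for the actual layout.
with open("report.json") as f:
    report = json.load(f)

# Each fact arrives as an ordinary dict: no contexts, units or DTS to resolve.
for fact in report["facts"]:
    print(fact["oim:concept"], fact["oim:period"], fact["numericValue"])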
XML vs JSON
At the time XBRL v2.1 was being defined in 2003, XML was a natural choice for the syntax. Since then, there has been a significant move away from XML for many applications, with JSON (JavaScript Object Notation) being the default replacement.
In explaining this trend, it is often stated that JSON is a “lightweight” alternative to XML, free from the complexities of namespaces, schemas and validation.
Anyone who has witnessed the huge improvements in data quality that can be achieved with electronic filing using rigorous, XML-based validation will be rightly concerned at the suggestion that validation is viewed as an unnecessary burden.
JSON is often easier to work with, not because it is “lightweight”, but because the structures that JSON offers map directly onto the constructs found in pretty much all programming languages. JSON offers arrays, objects (a.k.a. “associative arrays” or “key/value maps”), strings, numbers and booleans. Nearly every programming language has native representations of these types, making it possible to load any JSON document and start doing useful work with it.
By contrast, XML documents are made up of nested lists of elements, which are a cross between an associative array (a map of element names to their values) and an ordered list (order is significant, and elements can be repeated). XML also allows mixed content, meaning that child elements may be interspersed with bits of text content. In order to do useful work with XML data, you invariably need to first expend effort marshalling it into the data structures available in your programming language.
It is this “impedance mismatch” between the XML data model and programming data structures that makes consuming XML data hard work compared to using JSON.
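To make the contrast concrete, here is a rough sketch of the marshalling needed to recover the same fact from the xBRL-XML example above using Python’s standard library. The report.xml filename and the gaap namespace URI are placeholders; the xbrli URI is the standard XBRL 2.1 instance namespace.

import xml.etree.ElementTree as ET

# Namespace URIs: xbrli is the standard XBRL 2.1 instance namespace;
# the gaap URI is a placeholder for whatever taxonomy the report uses.
NS = {
    "xbrli": "http://www.xbrl.org/2003/instance",
    "gaap": "http://example.com/gaap",
}

tree = ET.parse("report.xml")  # hypothetical instance document
root = tree.getroot()

# Index contexts and units by id so that facts can be joined back to them.
contexts = {c.get("id"): c for c in root.findall("xbrli:context", NS)}
units = {u.get("id"): u for u in root.findall("xbrli:unit", NS)}

fact = root.find("gaap:Profit", NS)
context = contexts[fact.get("contextRef")]
unit = units[fact.get("unitRef")]

period = context.find("xbrli:period", NS)
print(
    fact.text,
    unit.findtext("xbrli:measure", namespaces=NS),
    period.findtext("xbrli:startDate", namespaces=NS),
    period.findtext("xbrli:endDate", namespaces=NS),
)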
xBRL-JSON vs xBRL-XML
xBRL-JSON is not intended to render xBRL-XML obsolete; it is about having the right tool for the job. The OIM initiative is about freeing XBRL data from a single syntactic representation and easing transformation to and from whatever representations make sense for a given task.
The PWDs can be found on our specifications sub-site.
A Public Working Draft is the first public step in our standardisation process, so this is the perfect time to get involved. The OIM Working Group is actively seeking public feedback on this draft. We’re also looking for volunteers to assist with the next activities of the group which include:
- constructing test cases that test the equivalence of different syntactic representations,
- refining the model based on feedback received, and
- expanding the model to include information drawn from an XBRL taxonomy.
If you are a member of the consortium or a Jurisdiction, you can submit your name through the WG enrolment form. If not, email join@xbrl.org to learn about your options.