This document describes the motivation behind certain design decisions made in xBRL-JSON.
The specification for numbers in JSON (RFC 8259 section 6) sets the expectation that this type is represented by a double-precision, floating point number.
XBRL implementations typically make heavy use of the xs:decimal datatype, as this is what xbrli:monetaryItemType is derived from. xs:decimal is an arbitrary precision decimal number (although implementations are only required to support a minimum of 18 digits).
Representing XBRL numeric values as JSON numbers would cause values to be approximated by floating point numbers, with a resulting loss of precision. For example, the number 0.1 cannot be represented exactly as a base 2 floating point number. A simple example of the problems that this can cause is shown below. The sample uses Python, but any language that uses floating point numbers for JSON numbers will see similar issues.
>>> import json
>>> x = json.loads('{"values": [0.1, 0.2], "total": 0.3}')
>>> sum(x["values"]) == x["total"]
False
>>> sum(x["values"])
0.30000000000000004
The xBRL-JSON specification therefore represents numbers as strings, each of which must be a valid lexical representation of the value according to the relevant XML Schema datatype (see Section 3).
This is described in more detail in the [OIM design document][oim-design].
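As a minimal sketch of why a string representation helps, the snippet below repeats the earlier example with the numbers carried as strings and parsed using Python's decimal module. The document shape and variable names are illustrative only and are not taken from the specification.

```python
import json
from decimal import Decimal

# Same data as the floating point example, but with numbers carried as strings.
doc = json.loads('{"values": ["0.1", "0.2"], "total": "0.3"}')

# Decimal parses the lexical form exactly, so no binary approximation occurs.
values = [Decimal(v) for v in doc["values"]]
total = Decimal(doc["total"])

print(sum(values) == total)  # True
print(sum(values))           # 0.3
```

A consumer that does not need exact arithmetic remains free to convert the strings to floating point numbers, but that becomes an explicit choice rather than a side effect of JSON parsing.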
In designing xBRL-JSON, the Working Group noted that other JSON-based technologies re-use the XML Schema datatype system (XSD). For example, the JSON-LD specification includes many examples that make use of XSD.
Earlier drafts of xBRL-JSON included facts as an array of objects. More recent drafts use an object with fact IDs as keys. The current approach was deemed preferable because:
The barrier to adopting this approach initially was the fact that ID attributes are optional on facts in XBRL v2.1, and so an ID may need to be generated when converting from XML.
It was deemed that fact IDs are sufficiently useful for traceability of facts through different representations that they should be made mandatory in the model, and generated in a consistent manner if required.
It has been noted that some tools with generic JSON support do not work with "records" that are members of an object rather than an array. It was felt that there will always be limits to what can be achieved in generic JSON tools when analysing xBRL-JSON, and as such, the advantages of using an object outweigh the benefits of improving support for such tools "out of the box".
A fact object can trivially be converted into a fact array. For example, this can be done using the open source jq utility using the following command:
jq '.facts = (.facts | to_entries | map_values(.value + {id: .key}))'
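The same conversion can be sketched in Python for readers without jq to hand. The sample report below is invented for illustration and contains only the members needed to show the transformation.

```python
import json

# A cut-down, hypothetical report with facts keyed by ID.
report = json.loads("""
{"facts": {"f1": {"value": "1234"}, "f2": {"value": "5678"}}}
""")

# Turn the object keyed by fact ID into a list,
# copying each key into an "id" member of the fact.
fact_list = [dict(fact, id=fact_id) for fact_id, fact in report["facts"].items()]

print(fact_list)
```

Each entry in fact_list is a copy of the original fact with an added "id" member, which is the shape that array-oriented tools generally expect.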
The dimensions associated with a fact are contained within a dimensions object. For example:
"f923": {
  "value": "1234",
  "decimals": 0,
  "dimensions": {
    "concept": "tax:NumericConcept",
    "entity": "cid:123456789",
    "period": "2015-01-01T00:00:00/2016-01-01T00:00:00",
    "unit": "iso4217:GBP",
    "my:region": "my:UK"
  }
}
The group also considered a simpler, flatter structure, without an additional container:
"f923": {
  "value": "1234",
  "decimals": 0,
  "concept": "tax:NumericConcept",
  "entity": "cid:123456789",
  "period": "2015-01-01T00:00:00/2016-01-01T00:00:00",
  "unit": "iso4217:GBP",
  "my:region": "my:UK"
}
The use of the additional container was chosen because:
QNames provide a compact representation of Expanded Names (a namespace URI and local name pair), but they add significant complexity:
The Working Group considered and experimented with using an expanded format instead of QNames: either "Clark notation" in the form "{uri}localname", or a single URI such as "uri#localname".
Both of these options make processing significantly simpler, but at the cost of having significantly larger documents, and with a significant loss of readability. Even simple examples become extremely cumbersome if an expanded form is used throughout.
xBRL-JSON therefore retains the use of QNames, but uses a more restrictive format for namespace bindings. This has the benefit that QName equality within a document can be reliably determined using a simple string comparison (i.e. if two QNames have the same prefix, they have the same namespace).
The specification reserves a number of prefixes that, if used, must be bound to a specified URI. These include the xbrli and iso4217 prefixes. This allows consuming applications to identify QNames within these namespaces without doing a namespace lookup. For example, if an application wishes to use the Euro symbol (€) to represent Euro units, this can be done by finding units with "iso4217:EUR" as the measure.
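A minimal sketch of the Euro example follows. It relies only on the reserved iso4217 prefix described above; the fact IDs, concept names and values are invented, and the helper format_value is hypothetical.

```python
# Hypothetical facts, reduced to the members needed for the example.
facts = {
    "f1": {"value": "100", "dimensions": {"concept": "eg:Sales", "unit": "iso4217:EUR"}},
    "f2": {"value": "200", "dimensions": {"concept": "eg:Sales", "unit": "iso4217:GBP"}},
}

def format_value(fact):
    """Prefix Euro-denominated values with the Euro symbol."""
    # The iso4217 prefix is reserved, so a plain string comparison is enough
    # and no namespace lookup is needed.
    if fact["dimensions"].get("unit") == "iso4217:EUR":
        return "€" + fact["value"]
    return fact["value"]

labels = {fact_id: format_value(fact) for fact_id, fact in facts.items()}
print(labels)
```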
One of the downsides of using QNames is that ":" is not a permitted character within identifiers in JavaScript (or most other languages). This means that it is not possible to use the more concise notation to directly access object properties, e.g.:
var geog = fact.dimensions.Geography;
Instead, you must use:
var geog = fact.dimensions['eg:Geography'];
xBRL-JSON uses unprefixed names for built-in dimensions (e.g. "concept"), and prefixed names for taxonomy-defined dimensions (see Section 13). It is felt that this provides a reasonable compromise, as it will generally only be built-in dimensions that users will wish to hard-code, e.g.:
var concept = fact.dimensions.concept;
var period = fact.dimensions.period;
Taxonomy-defined dimensions will need to be handled generically, rather than selected with a literal QName:
for (var dim of Object.keys(fact.dimensions)) {
  var dimVal = fact.dimensions[dim];
}
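The same generic handling can be sketched in Python. The sketch assumes only what is stated above: built-in dimensions have unprefixed names and taxonomy-defined dimensions have prefixed names, so the presence of ":" in the key distinguishes the two. The sample fact reuses the values from the earlier example.

```python
fact = {
    "value": "1234",
    "dimensions": {
        "concept": "tax:NumericConcept",
        "entity": "cid:123456789",
        "period": "2015-01-01T00:00:00/2016-01-01T00:00:00",
        "unit": "iso4217:GBP",
        "my:region": "my:UK",
    },
}

dims = fact["dimensions"]

# Built-in dimensions are unprefixed, so their names contain no colon.
built_in = {name: value for name, value in dims.items() if ":" not in name}

# Taxonomy-defined dimensions are prefixed and are handled generically.
taxonomy_defined = {name: value for name, value in dims.items() if ":" in name}

print(taxonomy_defined)
```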
The period dimension can be either an instant or a duration. In the case of a duration, the Working Group considered modelling the period dimension as a pair of properties (start and end), as a single property with a string value, and as a single property with an object value containing start and end members.
A single property was deemed most appropriate, but raises the question of how to specify the value. ISO8601 defines a format for expressing time intervals, but this is very poorly supported in common languages, and provides significant flexibility in format. For example, a year starting at 2001-01-01 can be expressed as any of:
2001-01-01T00:00:00/2002-01-01T00:00:00
2001-01-01T00:00:00/P1Y
P1Y/2002-01-01T00:00:00
Given the lack of support for this format in common libraries, it was felt that allowing this full flexibility placed an unreasonable burden on implementers of consuming software, and so instead a restricted version permitting only the first format has been adopted. This can be split into a pair of ISO8601 date time values with a simple string split on the / character.
Using a string value was deemed preferable to an object with separate components for consistency with all other dimensions.
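The string split described above can be sketched as follows. The helper parse_period is hypothetical, and the sketch assumes that an instant is written as a single date time with no / separator.

```python
from datetime import datetime

def parse_period(period):
    """Return (start, end) for a duration, or (None, instant) for an instant."""
    if "/" in period:
        # Restricted interval form: two full date times separated by "/".
        start_text, end_text = period.split("/")
        return datetime.fromisoformat(start_text), datetime.fromisoformat(end_text)
    # Assumption: an instant is a single date time value.
    return None, datetime.fromisoformat(period)

start, end = parse_period("2015-01-01T00:00:00/2016-01-01T00:00:00")
print(start, end)
```

Because only the start/end form is permitted, no duration arithmetic (such as resolving P1Y) is needed in the consumer.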
XBRL v2.1 permits the time component of date times to be omitted, but applies a different interpretation depending on the context in which it is used:
A start date with no time component is interpreted as 00:00:00 at the start of that day.
An end date or instant with no time component is interpreted as 24:00:00 on that day (or, equivalently, as 00:00:00 on the following day).
This allows durations to be expressed in the natural, inclusive form. For example, we would typically describe 2001 as "1st Jan 2001 to 31st December 2001". However, it has led to a significant number of off-by-one bugs in software.
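The group considered:
(1) retaining the XBRL v2.1 approach,
(2) permitting the time component to be omitted, but always interpreting such a value as 00:00:00 on that day, and
(3) requiring the time component to be present in all cases.
The group was keen to avoid repeating the confusion caused by (1), but felt that switching to (2) would cause further confusion due to the differences to v2.1. Although somewhat more verbose, (3) is the easiest for developers to work with, as it uses standard ISO8601 date times, and requires no special processing at all.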
The Extensible Enumerations 2.0 specification permits the definition of concepts that take as their value a set of expanded names. Various approaches were considered for the xBRL-JSON representation of such facts:
Approach (2) was chosen over (1) because it means that all fact values share the same JSON type of "String". This makes parsing and simple value comparison easier, particularly for processors that do not have type information from a taxonomy available.
The approach taken is consistent with the representation of duration periods (see Section 7).
Whilst a JSON list representation may be easier to work with once parsed, Extensible Enumerations are not widely used, and using list representation in xBRL-JSON risks simple software throwing type-errors when encountering enumeration values. For operations that do need to work with the values as a list, conversion from a space-separated string to a list is trivial in all common programming languages.
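As an illustration of how small that conversion is, the sketch below splits a space-separated value into its members. The QNames are invented for the example.

```python
# A hypothetical enumeration set value, as it would appear in a fact's "value".
value = "my:UK my:FR my:DE"

# One call converts the string into a list of QNames.
members = value.split(" ")

# The value represents a set, so compare as sets when order should not matter.
same_set = set(members) == set("my:DE my:UK my:FR".split(" "))

print(members, same_set)
```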
Approach (2) was chosen over (3) for consistency with use of QNames elsewhere (see Section 6).
The Working Group discussed the extent to which xBRL-JSON documents should be required to be canonicalised. The OIM uses the XML Schema datatyping system (see Section 3), which provides alternative lexical representations of the same value. For example, 1, 01.0 and +1.0 are all alternative representations of the same value.
Similar flexibility arises from the use of QNames (see Section 6), as the use of a different prefix for a namespace does not alter the semantics of a QName value.
Canonicalisation can make comparison of documents and values much easier, as semantic equivalence can be established by simple string comparison. If it is known that values have been canonicalised, you do not need to refer to their datatypes in order to determine equivalence. Such comparison is particularly important for conformance suite tests.
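The difference between string comparison and datatype-aware comparison can be sketched using the three lexical forms mentioned above. Python's Decimal is used here as a stand-in for xs:decimal; it is not a full implementation of the XML Schema datatype.

```python
from decimal import Decimal

# Three lexical representations of the same xs:decimal value.
forms = ["1", "01.0", "+1.0"]

# Compared as strings, the three forms are all different.
distinct_strings = len(set(forms))

# Compared as decimals, they collapse to a single value.
distinct_values = len({Decimal(form) for form in forms})

print(distinct_strings, distinct_values)  # 3 1
```

Without canonicalisation, a consumer needs to know that the value is a decimal before it can apply the second comparison.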
The group discussed the possibility of canonicalising prefixes by taking a hash of the namespace URI. For example, http://fasb.org/us-gaap/2019-01-31 can be represented using a prefix derived from its SHA-256 hash: ns-10bea9499cc09aad9b97d384615ad9248599453acf8f5c014fefe91ed622cc72. However, the use of such verbose and opaque strings undermines the benefits of using prefixes in the first place (as noted in Section 6). Shorter hash-based strings such as ns-10bea can function in a restricted environment, but risk collisions across the full space of possible URIs, and they remain opaque. Accordingly, we leave document authors free to choose concise, meaningful prefixes that uniquely identify namespaces within the scope of a report.
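For illustration, a hash-derived prefix of the kind discussed could be computed as in the sketch below. The function hashed_prefix and its length parameter are hypothetical, and the choice of UTF-8 encoding and lower-case hexadecimal output are assumptions; the group did not adopt this scheme, so no particular output is claimed.

```python
import hashlib

def hashed_prefix(namespace_uri, length=None):
    """Derive an 'ns-' prefix from the SHA-256 hash of a namespace URI."""
    # Assumption: the URI is hashed as UTF-8 bytes and rendered as lower-case hex.
    digest = hashlib.sha256(namespace_uri.encode("utf-8")).hexdigest()
    # Optionally truncate the digest; shorter prefixes raise the risk of collisions.
    return "ns-" + (digest if length is None else digest[:length])

full = hashed_prefix("http://fasb.org/us-gaap/2019-01-31")
short = hashed_prefix("http://fasb.org/us-gaap/2019-01-31", length=5)
print(len(full), short)
```

The full form is 67 characters long, which illustrates the verbosity concern, while the truncated form trades that verbosity for a chance of two namespaces sharing a prefix.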
The group considered requiring canonical lexical values (as per XML Schema) for facts and dimensions, but concluded that it did not provide sufficient value on its own to warrant the additional burden placed on document creators, as even if required, comparing documents requires awareness of QName values.
Instead, xBRL-JSON provides an optional "canonical values" feature, which prescribes a standard approach for the canonicalisation of values, and allows documents to declare when this has been used.
The OIM supports links between pairs of facts. In xBRL-JSON, such links could be represented as a property of either the source or target fact, or entirely separately, referencing both source and target facts by ID.
The former gives an obvious asymmetry in representation, making it easier to traverse the relationship in one direction than the other. It should be noted that relationships are asymmetric in the model, in that the outgoing relationships from a fact are ordered, but incoming relationships are not. Grouping relationships by source fact allows this ordering to be captured concisely by representing the relationships for a fact as an ordered list.
Links are used to model footnotes, as well as other relationships, and it was felt that making it easy to get from a fact to its associated footnote(s) was a benefit. It was also felt that placing link information on facts gave a more intuitive format than a separate block of JSON containing objects where both keys and values are IDs.
Consumers requiring more advanced, bi-directional traversal of relationships would not be unduly inconvenienced by representing link information on facts.
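A consumer that needs to traverse relationships from target to source can build a reverse index in a single pass, as in the sketch below. The shape of the links property used here (link type, then group, then an ordered list of target fact IDs) is an assumption made for the example, as are the fact IDs and values; consult the specification for the normative structure.

```python
from collections import defaultdict

# Hypothetical facts; the "links" layout is assumed for illustration.
facts = {
    "f1": {"value": "10", "links": {"footnote": {"_": ["n1", "n2"]}}},
    "f2": {"value": "20", "links": {"footnote": {"_": ["n1"]}}},
    "n1": {"value": "Restated."},
    "n2": {"value": "Unaudited."},
}

# Map each target fact ID to the (link type, group, source ID) triples pointing at it.
incoming = defaultdict(list)
for source_id, fact in facts.items():
    for link_type, groups in fact.get("links", {}).items():
        for group, target_ids in groups.items():
            for target_id in target_ids:
                incoming[target_id].append((link_type, group, source_id))

print(dict(incoming))
```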
In a number of places in the syntax, JSON objects may have properties whose values are a collection of zero or more items. For example, the facts property on a report has a value which is an object containing zero or more facts.
In this case, an empty collection could be represented by omitting the property altogether, rather than requiring it to be present with no members. This is more concise, but requires a small amount of additional coding in most languages in order to cope with the possibility of an absent property.
Requiring all such properties to be present, even if the collection is empty, would impose an unnecessary overhead on document size. For example, many XBRL documents contain few, if any, links, but this property can be present on any fact in this document. Requiring it to be present would potentially add significantly to the size of the document.
It was felt that the benefit of having a single, consistent approach of allowing all such properties to be omitted outweighed the cost of a small amount of additional code needed to cope with the possible omission of the property, even in cases where the inclusion would not represent a significant overhead.
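The additional code needed to cope with an omitted property is typically a default value at the point of access, as the sketch below suggests. The helper names are hypothetical, and the use of "links" as the name of the property on a fact is an assumption made for the example.

```python
# Hypothetical report in which one fact has links and the other omits the property.
report = {
    "facts": {
        "f1": {"value": "10", "links": {"footnote": {"_": ["n1"]}}},
        "f2": {"value": "20"},
    }
}
empty_report = {}  # the "facts" property is omitted altogether

def link_types(fact):
    """Return the link types on a fact, treating an absent property as empty."""
    return sorted(fact.get("links", {}))

def fact_count(rep):
    """Count the facts in a report, treating an absent property as empty."""
    return len(rep.get("facts", {}))

print(link_types(report["facts"]["f2"]), fact_count(empty_report))
```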
A goal of the Open Information Model has been to provide greater consistency between XBRL v2.1 "built-in" dimensions, such as period, unit, and concept, and taxonomy-defined dimensions, as defined in Dimensions 1.0. To this end, all dimensions in xBRL-JSON are defined in a single JSON object called dimensions.
xBRL-JSON makes use of QNames, but has tried to remove some of the associated complexity of the XML specifications, such as scoped namespace bindings, default namespaces, and non-unique prefixes (i.e. multiple prefixes for the same namespace).
To this end, earlier drafts used a fixed prefix of xbrl for built-in dimensions, but it was felt that this added unnecessary verbosity to the format. Built-in dimensions are now represented using unprefixed names. This has the benefits that the format is less verbose, and that built-in dimensions can be accessed using the more concise property notation in languages such as JavaScript, as there is no : in the name.
The xBRL-JSON specification states that processors "MAY" use the error code oime:invalidFactValue if they are performing full validation of fact values.
This is intended to provide a standard error code for use when validating constraints in specifications that don't prescribe error codes (e.g. XML Schema, XBRL v2.1 and the Data Types Registry), while permitting processors to use any existing codes they may already have in place for those specifications.
By contrast, the Extensible Enumerations 2.0 specification and the Dimensions 1.0 specification do define error codes, and in the interests of interoperability we require that those codes are used.