Handling Duplicate Facts in XBRL and Inline XBRL 1.0

Working Group Note 14 January 2025

This version
https://www.xbrl.org/WGN/xbrl-duplicates/WGN-2025-01-14/xbrl-duplicates-2025-01-14.html
Editor
Contributors
David Bell, UBPartner <dbell@ubpartner.fr>
Herm Fishcher, ExBee <herm@exbee.dev>
Mark Goodhand, CoreFiling <mrg@corefiling.com>

Table of Contents

1 Abstract

This document identifies a number of technical issues relating to the handling of duplicates in XBRL, and in particular, the possibility of duplicates introduced as a result of the transformation of Inline XBRL documents into XBRL. The document makes a number of recommendations for the handling of duplicates in XBRL data collection systems.

2 Introduction

XBRL reports may include duplicate facts, that is, two or more facts that purport to provide the same piece of information. The reported values may be the same, or different. Where the values are different, this may or may not be indicative of a problem in the data. For example, the same information reported at differing levels of precision would yield different literal values, but is not indicative of an error in the data.

The XBRL v2.1 specification provides a definition of duplicate facts, but aside from specifying their impact on calculation checks, it does not prohibit or even discourage their use. Further, it does not make any distinction between different classes of duplicate facts based on the consistency of their values.

Whilst preparer guidance for the creation of XBRL reports usually discourages or prohibits the use of duplicate facts, best practice for the creation of Inline XBRL reports can legimately lead to duplicates. It is not uncommon for a report to include the same piece of information multiple times, and tagging guidance for Inline XBRL typically requires all such occurrences to be tagged. In some cases, such facts may be de-duplicated when the document is transformed into XBRL, but in other cases duplicate facts will be present in the resulting XBRL.

It is not uncommon for a report to contain the same piece of information multiple times, but stated to differing levels of precision. For example, a figure for revenue may be stated to the nearest million when referenced in text, but reported to the nearest thousand on the income statement. If both are tagged in an Inline XBRL document, it will result in duplicate facts which are not the same, but which should be consistent.

This document describes definitions for different classes of duplicates, clarifies the behaviour required of Inline XBRL processors with respect to duplicates, and provides recommendations for how duplicates should be handled in a filing system.

This document does not address consistency between fractional facts in XBRL (i.e. facts for concepts using the xbrli:fractionItemType data type. Such facts are not in common use, and are not supported by the Open Information Model specification. References to numeric facts in this in document refer to non-fractional numeric facts.

3 Classes of duplicates

The Open Information Model defines different classes of duplicate facts, all of which would be considered duplicates under the XBRL v2.1 definition of duplicate items. The figure below shows the relationship between the different classes of duplicates defined in the OIM.

Figure 1: Classes of duplicates

Earlier versions of this Working Group Note provided their own definitions of these classes of duplicate facts. These definitions have now been superseded by the more formal definitions in the Open Information Model (OIM). This document now provides informal descriptions of these classes, and in the event of any conflict between this Working Group Note and the Open Information Model specification, the latter takes precedence.

3.1.1 Calculations 1.1

The Calculations 1.1 specification provides an update to the calculation functionality provided by the XBRL v2.1 specification, and leverages OIM definitions where possible. Calculations 1.1 additionally supports consistency checking of reports where numeric values have been truncated rather than rounded. In order to support this, Calculations 1.1 provides modified definitions of Consistent Duplicates and Inconsistent Duplicates. These are equivalent to the OIM definitions when rounding is used.

3.2 Complete duplicates

Complete duplicates are facts that share the same value, reported to the same precision (for numeric facts) and in the same language (where applicable) and with the same dimensions. Complete duplicates will have different "id" properties.

Custom attributes (i.e. those not defined by the XBRL specification) are not preserved in the Open Information Model, so facts with differing custom attributes may still be considered duplicates.

The definition also ignores any footnotes attached to the facts.

3.3 Consistent duplicates (numeric facts)

Numeric facts that share the same dimensions, and which have values that are consistent with having been rounded from a single actual value are considered consistent duplicates.

Consistency of numeric values should be established by considering the interval represented by each reported fact, and ensuring that there is overlap between all intervals.

For example, given the reported values 2,500 (stated to -2 decimal places) and 2,000 (stated to -3 decimal places), we have the following intervals:

These two intervals overlap, and so are considered consistent (both facts are consistent with having been rounded from a value in the range [2450,2500]).

Technically, the interval represented by a reported value depends on the rounding method used to obtain it. For example, 2,000 (-3 d.p.) is only a valid rounding of 2,500 under certain rounding methods (e.g. "round half to even" or "bankers rounding"). If the value were rounded using the "round half away from zero" (or "commerical rounding") method, it would be 3,000 (-3 d.p.).

In order to avoid declaring values to be inconsistent in this edge case, the intervals used in establishing consistency should be inclusive (or "closed"), but with the constraint that values reported to the same accuracy must have the same reported value. In other words, 2,000 (-3 d.p.) is inconsistent with 3,000 (-3 d.p.) even though there is overlap between the closed intervals; such values would imply an actual value of 2,500, but rounded using different methods in different places in the report.

The approximation of using closed intervals only affects the outcome of the consistency check in the event that the exact value is also included in the report stated to infinite precision.

Numeric facts that are complete duplicates are also considered to be consistent duplicates.

3.4 Multi-language alternatives (string facts)

Facts which are text facts and which differ in language but which would otherwise be duplicate facts are considered multi-language alternatives.

Multi-language alternatives may provide multi-language translations of a fact, but there is no automatable way to determine if the statements being made in different languages are equivalent. As such, these are treated as a separate class of duplicates which implementations may choose to allow or prohibit depending on the need to support multi-language translations.

Note that the definition of text fact excludes certain string-based datatypes to which the language dimension is considered inapplicable.

Note that in xBRL-XML, the effective language for a particular element may be inherited from an @xml:lang attribute appearing on an ancestor element.

3.5 Inconsistent duplicates

Duplicate facts that are neither complete duplicates nor consistent duplicates are considered inconsistent duplicates.

4 Issues with the XBRL v2.1 definition of duplicates

The XBRL v2.1 definition of duplicates is imperfect, as it relies on string equality when comparing the content of elements in the context definition. This means that QNames values that use different namespace prefixes to refer to the same namespace would not be considered the same. In the general case, a QName-aware comparison may not be possible, as it requires XML type information which may not be available for the contents of segment and scenario elements.

For implementations that are following the recommendations made in the Use of Dimensions Working Group Note where the only content of these elements will be XBRL Dimensions, a more robust QName-aware comparison can and should be used.

The Open Information Model does not support any content from segment or scenario elements other than XBRL Dimensions, and provides more robust, model-based definitions of duplicate facts.

5 Duplicates in Inline XBRL: processor behaviour

The Inline XBRL specification allows, but does not require, processors to de-duplicate facts that are complete duplicates when transforming from Inline XBRL to XBRL. It should be noted that the presence of id attributes on all facts in the Inline XBRL document is sufficient to prevent this de-duplication, as processors are required to create a fact for each distinct value of the id attribute.

Consistent (but not complete) duplicates and inconsistent duplicates will not be de-duplicated by an Inline XBRL processor, and so will always be included in the resulting XBRL.

As such, XBRL that is produced by an Inline XBRL processor may contain duplicates of all categories described above, with the exact handling of complete duplicates being processor-dependent. More details on the behaviour relating to duplicates can be found in the Inline XBRL Primer.

6 Issues with duplicates

The presence of duplicates in an XBRL document can create problems for processing, storage and analysis of the data.

6.1 Calculation consistency checking

The XBRL v2.1 specification provides the "summation-item" arcrole that can be used to define simple calculation relationships between concepts that will be checked against corresponding facts in an XBRL report. Such calculation checks are only performed if neither the total, nor any of the contributing items, have duplicates; the presence of any duplicates effectively disables calculation checks.

The behaviour of calculations in the presence of duplicate facts has been addressed by Calculations 1.1 which provides a newer version of the "summation-item" arcrole with improved semantics.

The summation-item relationship defined in XBRL v2.1 is now considered legacy functionality, and users are encouraged to adopt the new Calculations 1.1 arcrole in taxonomies, and to use Calculations 1.1 semantics when checking calculations using the legacy summation-item arcrole.

6.2 Calculation consistency checks in Inline XBRL

As de-duplication of facts in an Inline XBRL document is, to some extent, implementation dependent, this in turn means that the results of checking of legacy "summation-item" relationships against XBRL produced via an Inline XBRL transform are implementation dependent.

Even where an Inline XBRL processor performs de-duplication to the maximum extent allowed by the specification, the use of Inline XBRL may still yield consistent or inconsistent duplicates in the resulting XBRL. The presence of either would effectively disable any legacy calculation checks performed on these facts.

As noted above, it is recommended that users migrate to Calculations 1.1 relationships and functionality in order to avoid these issues.

6.3 Formula processing

Variables in formula rules will "bind" to each occurrence of a duplicate fact, causing the rules to evaluate for each combination of the duplicate facts involved in the rule. This may have an adverse effect on performance, and may cause repetition of any detected errors.

For example, take a trivial rule that checks the equality of two facts, A and B:

  A == B

If A and B are both reported twice, as duplicates A1 and A2, and B1 and B2, then the rule will be evaluated four times, once for each combination of facts:

  A1 == B1
  A1 == B2
  A2 == B1
  A2 == B2

Clearly, if more duplicate facts are involved, this has the potential to lead to a very large number of unnecessary evaluations, and a large number of repeated results.

6.4 Storage

The presence of duplicates adds some complexity to the storage of facts in systems such as databases as, in general, individual facts can no longer be uniquely identified by the combination of their dimensions.

If Inconsistent Duplicates are rejected, and Complete Duplicates are de-duplicated, then any remaining Consistent Duplicates and Multi-Language Alternatives can be uniquely identified by the combination of their dimensions, and the decimals property (where applicable).

Where facts are derived from an Inline XBRL report, it may be desirable to retain all facts, including Complete Duplicates, in order to retain traceability to all occurrences in the source document.

7 Duplicates in XBRL: Recommended approach

Where Inline XBRL is not used, it is typically possible to avoid the issues relating to duplicates by simply using filing rules to prohibit the inclusion of any duplicates at all:

8 Duplicates in Inline XBRL

Inline XBRL introduces "legitimate" reasons for the inclusion of duplicates. In many reporting scenarios, it is common for the same piece of information to be presented multiple times in a human-readable presentation of the report. For example, the figure for revenue may appear on the main financial statements as well as in a note giving a detailed breakdown.

It is best practice to tag all occurrences of figures in an Inline XBRL document, as this makes it easier to spot untagged figures and ensure consistency between the difference occurrences of a fact, and it makes it possible for consumers to navigate effectively between XBRL facts and the locations in which they were reported in the presentation.

8.1 Complete duplicates in Inline XBRL

As described above, whether multiple tagging of complete duplicates in Inline XBRL results in duplicates in the resulting XBRL is processor-dependent, unless id attributes are present. The id property is a required property in the Open Information Model, and will be generated as part of the model creation process if it is not included in the source document. It is recommended that id attributes are used for all facts in Inline XBRL documents, as it ensures traceability from the source to the resulting model.

8.2 Consistent duplicates in Inline XBRL

It is possible that the same information is presented to a different precision in different places. In this case, even using an Inline XBRL processor that de-duplicates to the maximum extent allowed by the specification, consistent duplicates will be present in the resulting XBRL.

8.3 Inconsistent duplicates in Inline XBRL

It is also possible that the same fact is reported with different values that are not consistent, giving inconsistent duplicates in the resulting XBRL. In the case of numeric facts, this is almost certainly indicative of an error, either in the underlying presentation or in the way that it has been tagged.

In the case of non-numeric facts, it is possible that inconsistent duplicates would result from strings that would be considered equivalent from a business perspective. For example, when reporting the name of a director, the forename might be abbreviated in one place but not another (e.g. "J. Smith" vs "John Smith"). Similarly, differences in capitalisation and spacing would result in duplicates that are technically inconsistent, but which might be considered acceptable from a business perspective.

Clearly, the ideal solution would be to have all such facts presented in exactly the same manner, so that all occurrences can be tagged without introducing inconsistent duplicates. In practice, the document production workflow may make this impractical, and in this case, the benefits of completeness of tagging may be outweighed by the complexities created by having inconsistent duplicates in the document.

9 Recommended approach

The following recommendations are made for dealing with duplicates in Inline XBRL:

If completeness of tagging is considered critical, then an alternative approach would be to only reject numeric, inconsistent duplicates, and accept that inconsistent non-numeric duplicates may be present in the XBRL.

Appendix A Document history

Date Description
9th Sept 2015 Initial public release
7th Sept 2016 Updated definition of complete duplicates to include xml:lang (bug 567). Added diagram to clarify the relationship between definitions (bug 570). Noted the omission of fraction item types, custom attributes and footnotes from the definitions. Clarified wording around multi-language facts (bug 571).
17th April 2018 Changed description of method used in establishing consistency of numeric duplicates
14th Jan 2025 Updated to replace definitions of classes of duplicates with references to Open Information Model definitions. Recommend the use of Calculations 1.1 in preference to XBRL 2.1 summation-item.