The Tau model for Data Redundancy

Sunderrajan Krishnan and André Journel. (2005)
in: 25th gOcad Meeting, ASGA

Abstract

Many decision-making processes in the Earth sciences require the combination of multiple data originating from diverse sources. These data are often indirect and uncertain, and their combination would call for a probabilistic approach. These data are also partially redundant, any one with each other or with all others taken jointly. This overlap in information arises for a variety of reasons: because the data arise from the same physical process (geology), because they originate from the same location or the same measurement device, etc. When combining such redundant data, one must account for their information overlap, lest we run the risk of over-compounding apparently consistent, but actually redundant, data and hence of making a wrong decision. The proposed tau model combines partially redundant data, each taking the form of a prior probability for the event being assessed to occur given that datum taken alone. The parameters of that tau model measure the additional contribution brought by any single datum over that of all previously considered data; they are data sequence-dependent and also data values-dependent. As one should expect, data redundancy depends on the order in which the data are considered and also on the data values themselves. However, averaging the tau model parameters over all possible data values leads to exact analytical expressions and corresponding approximations and inference avenues. A calibration-based technique shows one possible method for inferring these tau weights.
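
The abstract does not state the combination formula itself. As a minimal sketch, assuming the form commonly given in the tau model literature (not taken from this page), each probability p is converted to a distance x = (1 - p) / p, and the combined distance satisfies x / x0 = product over i of (x_i / x0) ** tau_i, where x0 comes from the prior probability of the event. The function names and the example values below are hypothetical, and all probabilities are assumed to lie strictly between 0 and 1.

```python
from math import prod


def odds_distance(p: float) -> float:
    """Distance to the event occurring: (1 - p) / p (assumes 0 < p <= 1)."""
    return (1.0 - p) / p


def tau_combine(prior: float, single_datum_probs, taus) -> float:
    """Combine single-datum probabilities P(A | D_i) with tau weights.

    Assumed form: x / x0 = prod((x_i / x0) ** tau_i), then P = 1 / (1 + x).
    tau_i = 1 for every datum gives the conditional-independence-like case;
    tau_i = 0 discards datum i as fully redundant.
    """
    x0 = odds_distance(prior)
    ratios = [
        (odds_distance(p) / x0) ** tau
        for p, tau in zip(single_datum_probs, taus)
    ]
    x = x0 * prod(ratios)
    return 1.0 / (1.0 + x)


if __name__ == "__main__":
    # Two data that each, taken alone, give P(A | D_i) = 0.8; prior P(A) = 0.5.
    for taus in ([1.0, 1.0], [1.0, 0.5], [1.0, 0.0]):
        print(taus, tau_combine(0.5, [0.8, 0.8], taus))
```

In this sketch, lowering the second tau weight from 1 toward 0 pulls the combined probability back toward the single-datum value of 0.8, which is the over-compounding effect the abstract warns about when redundancy is ignored.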

Download / Links

    BibTeX Reference

    @inproceedings{KrishnanRM2005,
     abstract = { Many decision-making processes in the Earth sciences require the combination of multiple data originating from diverse sources. These data are often indirect and uncertain, and their combination would call for a probabilistic approach. These data are also partially redundant, any one with each other or with all others taken jointly. This overlap in information arises for a variety of reasons: because the data arise from the same physical process (geology), because they originate from the same location or the same measurement device, etc. When combining such redundant data, one must account for their information overlap, lest we run the risk of over-compounding apparently consistent, but actually redundant, data and hence of making a wrong decision. The proposed tau model combines partially redundant data, each taking the form of a prior probability for the event being assessed to occur given that datum taken alone. The parameters of that tau model measure the additional contribution brought by any single datum over that of all previously considered data; they are data sequence-dependent and also data values-dependent. As one should expect, data redundancy depends on the order in which the data are considered and also on the data values themselves. However, averaging the tau model parameters over all possible data values leads to exact analytical expressions and corresponding approximations and inference avenues. A calibration-based technique shows one possible method for inferring these tau weights. },
     author = { Krishnan, Sunderrajan and Journel, André },
     booktitle = { 25th gOcad Meeting },
     month = { June },
     publisher = { ASGA },
     title = { The Tau model for Data Redundancy },
     year = { 2005 }
    }