Academic Paper

Using the Semantic Information G Measure to Explain and Extend Rate-Distortion Functions and Maximum Entropy Distributions
Document Type
Working Paper
Author
Chenguang Lu
Source
Entropy 2021, 23, 1050
Subject
Computer Science - Information Theory
94A29, 94A34, 94A15, 94A17, 62B10, 68T05, 62F15, 68P30
H.1.1
G.1.6
I.1.2
I.2.4
I.2.6
I.5.3
G.3
E.4
Language
English
Abstract
In the rate-distortion function and the Maximum Entropy (ME) method, Minimum Mutual Information (MMI) distributions and ME distributions are expressed by Bayes-like formulas, including Negative Exponential Functions (NEFs) and partition functions. Why do these non-probability functions appear in Bayes-like formulas? On the other hand, the rate-distortion function has three disadvantages: (1) the distortion function is subjectively defined; (2) the definition of the distortion function between instances and labels is often difficult; (3) it cannot be used for data compression according to the labels' semantic meanings. The author previously proposed the semantic information G measure, which uses both statistical probability and logical probability. We can now explain NEFs as truth functions, partition functions as logical probabilities, Bayes-like formulas as semantic Bayes' formulas, MMI as Semantic Mutual Information (SMI), and the ME distribution as the one that extremizes ME minus SMI. To overcome the above disadvantages, this paper sets up the relationship between truth functions and distortion functions, obtains truth functions from samples by machine learning, and constructs constraint conditions with truth functions to extend rate-distortion functions. Two examples are used to help readers understand the MMI iteration and to support the theoretical results. Using truth functions and the semantic information G measure, we can combine machine learning and data compression, including semantic compression. Further studies are needed to explore general data compression and recovery according to semantic meaning.
Comment: 22 pages, 5 figures
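Note: for orientation, the Bayes-like formula referred to in the abstract is the standard MMI minimizer from rate-distortion theory; the notation below (distortion d, parameter s, partition function \lambda_i, truth function T(\theta_j \mid x), logical probability T(\theta_j)) is assumed here for illustration and is not quoted from the paper:

\[
P(y_j \mid x_i) = \frac{P(y_j)\, e^{s\, d(x_i, y_j)}}{\lambda_i},
\qquad
\lambda_i = \sum_j P(y_j)\, e^{s\, d(x_i, y_j)},
\quad s \le 0,
\]

in which the NEF e^{s d(x_i, y_j)} and the partition function \lambda_i are the non-probability functions in question. The semantic Bayes' formula against which the abstract reads this form is

\[
P(x \mid \theta_j) = \frac{P(x)\, T(\theta_j \mid x)}{T(\theta_j)},
\qquad
T(\theta_j) = \sum_i P(x_i)\, T(\theta_j \mid x_i),
\]

with the truth function T(\theta_j \mid x) playing the role of the NEF and the logical probability T(\theta_j) playing the role of the partition function.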