
URL: https://en.wikipedia.org/wiki/Sample_entropy

Sample entropy - Wikipedia


From Wikipedia, the free encyclopedia
Modification of approximate entropy

Sample entropy (SampEn; more appropriately K_2 entropy or Takens–Grassberger–Procaccia correlation entropy) is a modification of approximate entropy (ApEn; more appropriately "Procaccia–Cohen entropy"), used for assessing the complexity of physiological and other time-series signals, for example when diagnosing diseased states.[1] SampEn has two advantages over ApEn: data length independence and a relatively trouble-free implementation. There is also a small computational difference: in ApEn, the comparison between the template vector (see below) and the rest of the vectors also includes a comparison with itself. This guarantees that the probabilities $C_{i}'^{m}(r)$ are never zero, so it is always possible to take their logarithm. Because these self-comparisons lower ApEn values, signals are interpreted as more regular than they actually are. Such self-matches are not included in SampEn. However, since SampEn makes direct use of the correlation integrals, it is not a real measure of information but an approximation. The foundations and differences with ApEn, as well as a step-by-step tutorial for its application, are available in [2].

SampEn is indeed identical to the "correlation entropy" K_2 of Grassberger & Procaccia,[3] except that it is suggested in the latter that certain limits should be taken in order to achieve a result invariant under changes of variables. No such limits and no invariance properties are considered in SampEn.

There is a multiscale version of SampEn as well, suggested by Costa and others.[4] SampEn can be used in biomedical and biomechanical research, for example to evaluate postural control.[5][6]

Definition


Like approximate entropy (ApEn), sample entropy (SampEn) is a measure of complexity,[1] but unlike ApEn it does not count self-matches. For a given embedding dimension $m$, tolerance $r$ and number of data points $N$, SampEn is the negative natural logarithm of the probability that, if two sets of simultaneous data points of length $m$ have distance $< r$, then two sets of simultaneous data points of length $m+1$ also have distance $< r$. It is denoted by $SampEn(m, r, N)$ (or by $SampEn(m, r, \tau, N)$ when the sampling time $\tau$ is included).

Now assume we have a time-series data set of length $N$, $\{x_{1}, x_{2}, x_{3}, \ldots, x_{N}\}$, sampled at a constant time interval $\tau$. We define a template vector of length $m$ as $X_{m}(i) = \{x_{i}, x_{i+1}, x_{i+2}, \ldots, x_{i+m-1}\}$, and take the distance function $d[X_{m}(i), X_{m}(j)]$ ($i \neq j$) to be the Chebyshev distance (although any distance function could be used, including the Euclidean distance). We define the sample entropy to be

$$SampEn = -\ln \frac{A}{B}$$

where

$A$ = number of template vector pairs having $d[X_{m+1}(i), X_{m+1}(j)] < r$

$B$ = number of template vector pairs having $d[X_{m}(i), X_{m}(j)] < r$

It is clear from the definition that $A$ is always smaller than or equal to $B$, because any pair of templates that matches over $m+1$ points also matches over its first $m$ points. Therefore, $SampEn(m, r, \tau)$ is always either zero or positive. A smaller value of $SampEn$ indicates more self-similarity in the data set or less noise.
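The counting in this definition can be traced by hand on a very short series. The following is a minimal sketch, not a reference implementation: the series and the tolerance are made-up values, the helper names `chebyshev` and `count_close_pairs` are hypothetical, and every window of the given length is used as a template, which is one reading of the definition above.

```python
from itertools import combinations
from math import log


def chebyshev(u, v):
    """Chebyshev distance: the largest coordinate-wise absolute difference."""
    return max(abs(a - b) for a, b in zip(u, v))


def count_close_pairs(series, length, r):
    """Count unordered pairs of distinct templates whose distance is below r."""
    templates = [series[i : i + length] for i in range(len(series) - length + 1)]
    return sum(1 for u, v in combinations(templates, 2) if chebyshev(u, v) < r)


# Made-up integer data; with r = 0.5 two templates match only if they are equal.
series = [1, 2, 3, 1, 2, 4, 1, 2, 3]
m, r = 2, 0.5

B = count_close_pairs(series, m, r)      # pairs matching over m points
A = count_close_pairs(series, m + 1, r)  # pairs matching over m + 1 points
sampen = -log(A / B)

print(A, B, sampen)  # A = 1, B = 4, so SampEn = ln 4
```

Here the template (1, 2) occurs three times and (2, 3) twice, giving B = 4, while only (1, 2, 3) repeats at length 3, giving A = 1. Because `combinations` pairs each template only with later ones, no template is ever compared with itself.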

Generally, $m$ is taken to be $2$ and $r$ to be $0.2 \times std$, where std stands for the standard deviation, which should be taken over a very large dataset. For instance, an $r$ value of 6 ms is appropriate for sample entropy calculations of heart rate intervals, since this corresponds to $0.2 \times std$ for a very large population.
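As a rough illustration of this rule of thumb, the tolerance can be derived from a standard deviation as sketched below. The helper name `tolerance` and the interval values are hypothetical, and the sketch takes the population standard deviation of the series passed in, which is an assumption: the text above recommends a standard deviation taken over a very large dataset, so in practice a reference data set would be passed instead of a short recording.

```python
from statistics import pstdev


def tolerance(reference_data, factor=0.2):
    """Return r = factor * std, using the population standard deviation."""
    return factor * pstdev(reference_data)


# Made-up heart rate intervals in milliseconds, for illustration only.
rr_intervals_ms = [812, 790, 805, 831, 798, 820, 809, 795]
r = tolerance(rr_intervals_ms)
print(f"r = {r:.2f} ms")
```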

Multiscale SampEn


The definition above is a special case of multiscale SampEn with $\delta = 1$, where $\delta$ is called the skipping parameter. In multiscale SampEn, template vectors are defined with a certain interval between their elements, specified by the value of $\delta$. The modified template vector is defined as $X_{m,\delta}(i) = \{x_{i}, x_{i+\delta}, x_{i+2\times\delta}, \ldots, x_{i+(m-1)\times\delta}\}$, and SampEn can be written as $SampEn(m, r, \delta) = -\ln \frac{A_{\delta}}{B_{\delta}}$, where $A_{\delta}$ and $B_{\delta}$ are calculated as before.
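The skipped template vectors can be built with a strided slice. This is a minimal sketch under the definition given here; the function names `construct_templates_skip`, `count_matches` and `sample_entropy_skip` are hypothetical, and the Chebyshev distance is assumed as in the definition section.

```python
from itertools import combinations
from math import log


def construct_templates_skip(series, m, delta=1):
    """Build templates x_i, x_{i+delta}, ..., x_{i+(m-1)*delta}."""
    span = (m - 1) * delta  # index offset between first and last element
    return [series[i : i + span + 1 : delta] for i in range(len(series) - span)]


def count_matches(templates, r):
    """Count unordered pairs of distinct templates with Chebyshev distance < r."""
    return sum(
        1
        for u, v in combinations(templates, 2)
        if max(abs(a - b) for a, b in zip(u, v)) < r
    )


def sample_entropy_skip(series, m, r, delta=1):
    """SampEn(m, r, delta) = -ln(A_delta / B_delta)."""
    b = count_matches(construct_templates_skip(series, m, delta), r)
    a = count_matches(construct_templates_skip(series, m + 1, delta), r)
    return -log(a / b)


print(construct_templates_skip([0, 1, 2, 3, 4, 5], m=3, delta=2))
# [[0, 2, 4], [1, 3, 5]]
```

With `delta=1` the slice step is 1 and the templates reduce to ordinary consecutive windows, so the function returns the same value as the single-scale definition.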

Implementation


Sample entropy can be implemented easily in many different programming languages. Below is an example written in Python.

from itertools import combinations
from math import log


def construct_templates(timeseries_data: list, m: int = 2) -> list:
    """Return every window of m consecutive samples."""
    num_windows = len(timeseries_data) - m + 1
    return [timeseries_data[x : x + m] for x in range(0, num_windows)]


def get_matches(templates: list, r: float) -> int:
    """Count unordered pairs of distinct templates that match within r."""
    return len(
        list(filter(lambda x: is_match(x[0], x[1], r), combinations(templates, 2)))
    )


def is_match(template_1: list, template_2: list, r: float) -> bool:
    """Return True if the Chebyshev distance between the templates is below r."""
    return all([abs(x - y) < r for (x, y) in zip(template_1, template_2)])


def sample_entropy(timeseries_data: list, window_size: int, r: float) -> float:
    B = get_matches(construct_templates(timeseries_data, window_size), r)
    A = get_matches(construct_templates(timeseries_data, window_size + 1), r)
    return -log(A / B)

An equivalent example using NumPy:

import numpy


def construct_templates(timeseries_data, m):
    """Return an array whose rows are the windows of m consecutive samples."""
    num_windows = len(timeseries_data) - m + 1
    return numpy.array([timeseries_data[x : x + m] for x in range(0, num_windows)])


def get_matches(templates, r) -> int:
    """Count unordered pairs of distinct templates that match within r."""
    return len(
        list(filter(lambda x: is_match(x[0], x[1], r), combinations(templates)))
    )


def combinations(x):
    """Return all pairs (x[i], x[j]) with i < j, stacked into one array."""
    idx = numpy.stack(numpy.triu_indices(len(x), k=1), axis=-1)
    return x[idx]


def is_match(template_1, template_2, r) -> bool:
    """Return True if the Chebyshev distance between the templates is below r."""
    return numpy.all([abs(x - y) < r for (x, y) in zip(template_1, template_2)])


def sample_entropy(timeseries_data, window_size, r):
    B = get_matches(construct_templates(timeseries_data, window_size), r)
    A = get_matches(construct_templates(timeseries_data, window_size + 1), r)
    return -numpy.log(A / B)
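
Both listings divide the match counts directly, so a short or very irregular series can fail at the last step: if no pair matches at length m, B is zero and the division raises ZeroDivisionError, and if no pair matches at length m + 1, the pure-Python version asks for the logarithm of zero. One possible way to guard the final step is sketched below. The function name `entropy_from_counts` is hypothetical, and returning infinity or NaN in these cases is an assumption made for illustration, not a convention taken from the cited references.

```python
import math


def entropy_from_counts(a: int, b: int) -> float:
    """Compute -ln(a / b) from match counts, guarding the degenerate cases."""
    if b == 0:
        # No matches even at length m: the ratio is undefined (assumed NaN).
        return math.nan
    if a == 0:
        # Matches at length m but none at m + 1: -ln(0) diverges (assumed inf).
        return math.inf
    return -math.log(a / b)


print(entropy_from_counts(1, 4))  # ln 4
print(entropy_from_counts(0, 3))  # inf
print(entropy_from_counts(0, 0))  # nan
```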

An example written in other languages can be found:

See also


References

  1. Richman, JS; Moorman, JR (2000). "Physiological time-series analysis using approximate entropy and sample entropy". American Journal of Physiology. Heart and Circulatory Physiology. 278 (6): H2039–49. doi:10.1152/ajpheart.2000.278.6.H2039. PMID 10843903.
  2. Delgado-Bonal, Alfonso; Marshak, Alexander (June 2019). "Approximate Entropy and Sample Entropy: A Comprehensive Tutorial". Entropy. 21 (6): 541. Bibcode:2019Entrp..21..541D. doi:10.3390/e21060541. PMC 7515030. PMID 33267255.
  3. Grassberger, Peter; Procaccia, Itamar (1983). "Estimation of the Kolmogorov entropy from a chaotic signal". Physical Review A. 28 (4): 2591(R). Bibcode:1983PhRvA..28.2591G. doi:10.1103/PhysRevA.28.2591.
  4. Costa, Madalena; Goldberger, Ary; Peng, C.-K. (2005). "Multiscale entropy analysis of biological signals". Physical Review E. 71 (2) 021906. Bibcode:2005PhRvE..71b1906C. doi:10.1103/PhysRevE.71.021906. PMID 15783351.
  5. Błażkiewicz, Michalina; Kędziorek, Justyna; Hadamus, Anna (March 2021). "The Impact of Visual Input and Support Area Manipulation on Postural Control in Subjects after Osteoporotic Vertebral Fracture". Entropy. 23 (3): 375. Bibcode:2021Entrp..23..375B. doi:10.3390/e23030375. PMC 8004071. PMID 33804770.
  6. Hadamus, Anna; Białoszewski, Dariusz; Błażkiewicz, Michalina; Kowalska, Aleksandra J.; Urbaniak, Edyta; Wydra, Kamil T.; Wiaderna, Karolina; Boratyński, Rafał; Kobza, Agnieszka; Marczyński, Wojciech (February 2021). "Assessment of the Effectiveness of Rehabilitation after Total Knee Replacement Surgery Using Sample Entropy and Classical Measures of Body Balance". Entropy. 23 (2): 164. Bibcode:2021Entrp..23..164H. doi:10.3390/e23020164. PMC 7911395. PMID 33573057.