Cross-Correlation

Cross-correlation is a measure of similarity between two series, typically time series, as a function of the shift of one relative to the other. It is sometimes called the sliding dot product or sliding inner product.
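For reference, one common convention for the discrete sliding dot product of two real-valued sampled signals is written out below. The symbols $r[n]$ (reference samples), $s[n]$ (subject samples), $N_r$, $N_s$ (sample counts), and the lag $k$ are notation assumed here for illustration; samples outside a signal's range are treated as zero, and some references use the opposite sign for the lag.

```latex
(r \star s)[k] = \sum_{n=0}^{N_s - 1} r[n + k] \, s[n],
\qquad k = -(N_s - 1), \dots, N_r - 1
```

With this convention the result has $N_r + N_s - 1$ entries, one per overlap position.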

The cross-correlation implements the following conceptual steps:

  • Given two signals
    • $r(t)$ is the reference signal with bounds $[t_{r,\min}, t_{r,\max}]$.
    • $s(t)$ is the subject signal with bounds $[t_{s,\min}, t_{s,\max}]$.
  • Synchronization:
    • Find the global minimum $t_{\min} = \min(t_{r,\min}, t_{s,\min})$.
    • Find the global maximum $t_{\max} = \max(t_{r,\max}, t_{s,\max})$.
    • Construct a global time interval $[t_{\min}, t_{\max}]$.
    • Choose a global time step, $\Delta t$, to be the minimum of the reference time step and the subject time step.
  • Correlation:
    • Keep the reference signal stationary. Move the subject signal along the $t$ axis until the last data point of the subject signal is multiplied with the first data point of the reference signal.
    • Then, slide the subject signal to the right on the $t$ axis by $\Delta t$ at a time, calculating the inner product of the two signals at each step.
    • Find the largest value of the foregoing inner products, and then for that step, move the subject curve to align with the reference curve. This represents the highest correlation between the reference and the subject.
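The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the figio implementation: the function names `global_interval` and `sliding_dot_product` are hypothetical, both signals are assumed to be uniformly sampled with sorted times, and resampling onto the global time grid is omitted.

```python
from typing import List, Sequence, Tuple


def global_interval(
    t_ref: Sequence[float], t_sub: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (t_min, t_max, dt) covering both signals with the finer time step."""
    t_min = min(t_ref[0], t_sub[0])
    t_max = max(t_ref[-1], t_sub[-1])
    dt_ref = t_ref[1] - t_ref[0]  # assumes uniform sampling
    dt_sub = t_sub[1] - t_sub[0]  # assumes uniform sampling
    return t_min, t_max, min(dt_ref, dt_sub)


def sliding_dot_product(ref: Sequence[float], sub: Sequence[float]) -> List[float]:
    """Inner product of ref and sub at every overlap position.

    Step 0 is the "full left" position: the last sample of sub sits on the
    first sample of ref. Each later step moves sub one sample to the right.
    """
    n_ref, n_sub = len(ref), len(sub)
    products = []
    for step in range(n_ref + n_sub - 1):
        offset = step - (n_sub - 1)  # index in ref where sub[0] currently sits
        total = 0.0
        for j, s in enumerate(sub):
            i = offset + j
            if 0 <= i < n_ref:  # samples outside the overlap contribute zero
                total += ref[i] * s
        products.append(total)
    return products


# Small demonstration: sub is ref moved one sample to the left (end padded with 0).
ref = [0, 1, 2]
sub = [1, 2, 0]
products = sliding_dot_product(ref, sub)
best = max(range(len(products)), key=products.__getitem__)
print(products, best)  # prints [0.0, 0.0, 2.0, 5.0, 2.0] 3
print(global_interval([0.0, 1.0, 2.0], [0.5, 1.0, 1.5]))  # prints (0.0, 2.0, 0.5)
```

In the demonstration the largest inner product occurs at step 3. Because step `len(sub) - 1 = 2` corresponds to the original, unshifted position, the subject aligns best when moved one sample to the right.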

The figio implementation is based on the formulations contained in Terpsma and Hovey [1], Sections E.2 and E.3 (pages 110 to 121).

Example

Consider the sawtooth example shown below, recreated from the Anomaly webpage, section "Normalized Cross-Correlation with Time Shift" [2].

Figure: Reproduction of the sawtooth series on the Anomaly website [2].

Create the input file anomaly_recipe.yml:

::::::::::::::
anomaly_recipe.yml
::::::::::::::
signal_a:
  type: xymodel
  # folder: ~/autotwin/figio/book/cross_correlation
  folder: ./  # the current directory
  file: signal_a.csv
  skip_rows: 1
  ycolumn: 1
  plot_kwargs:
    label: reference signal a
    color: red
    linewidth: 3
    linestyle: "--"
    marker: D
    alpha: 0.9
signal_b:
  type: xymodel
  # folder: ~/autotwin/figio/book/cross_correlation
  folder: ./  # the current directory
  file: signal_b.csv
  skip_rows: 1
  ycolumn: 1
  plot_kwargs:
    label: subject signal b
    color: darkcyan
    linewidth: 1
    linestyle: "-"
    marker: o
    alpha: 0.8
signal_b_correlated:
  type: xymodel
  # folder: ~/autotwin/figio/book/cross_correlation
  folder: ./  # the current directory
  file: signal_b.csv
  skip_rows: 1
  ycolumn: 1
  plot_kwargs:
    label: subject signal b
    color: darkcyan
    linewidth: 1
    linestyle: "-"
    marker: o
    alpha: 0.8
  signal_process:
    process1:
      correlation:
        reference:
          # folder: ~/autotwin/figio/book/cross_correlation
          folder: ./  # the current directory
          file: signal_a.csv
          skip_rows: 1
          ycolumn: 1
        verbose: true
        serialize: false
        # folder: ~/autotwin/figio/book/cross_correlation
        folder: ./  # the current directory
        file: out_signal_b_correlated.csv
figure_1:
  type: xyview
  model_keys: [ signal_a, signal_b ]
  # folder: ~/autotwin/figio/book/cross_correlation
  folder: ./  # the current directory
  file: out_anomaly_pre_corr.svg
  title: Anomaly site example, pre-correlation
  xlabel: time (s)
  ylabel: position (m)
  xlim: [ -1, 22 ]
  ylim: [ -1, 5 ]
  size: [ 8.0, 6.0 ]
  dpi: 100
  display: true
  details: false
  serialize: true
figure_2:
  type: xyview
  model_keys: [ signal_a, signal_b_correlated ]
  # folder: ~/autotwin/figio/book/cross_correlation
  folder: ./  # the current directory
  file: out_anomaly_post_corr.svg
  title: Anomaly site example, post-correlation
  xlabel: time (s)
  ylabel: position (m)
  xlim: [ -1, 22 ]
  ylim: [ -1, 5 ]
  size: [ 8.0, 6.0 ]
  dpi: 100
  display: true
  details: false
  serialize: true

The input file makes use of two data series, signal_a.csv and signal_b.csv:

::::::::::::::
signal_a.csv
::::::::::::::
time (s),signal_a (m)
1,0
2,1
3,2
4,3
5,4
6,0
7,1
8,2
9,3
10,4
11,0
12,1
13,2
14,3
15,4
16,0
17,1
18,2
19,3
20,4
::::::::::::::
signal_b.csv
::::::::::::::
time (s),signal_b (m)
1,1
2,2
3,3
4,3
5,0
6,1
7,2
8,3
9,4
10,0
11,1
12,1
13,4
14,4
15,0
16,1
17,2
18,3
19,4
20,0

Results

Run figio on the input file to produce the figures.

figio anomaly_recipe.yml

Processing file: anomaly_recipe.yml
====================================
Information
For (x, y) data and time series data:
  type: xymodel items associate with type: xyview items.
For histogram data:
  type: hmodel items associate with type: hview items.
====================================

This is xymodel.cross_correlation...
reference: [[ 1.  0.]
 [ 2.  1.]
 [ 3.  2.]
 [ 4.  3.]
 [ 5.  4.]
 [ 6.  0.]
 [ 7.  1.]
 [ 8.  2.]
 [ 9.  3.]
 [10.  4.]
 [11.  0.]
 [12.  1.]
 [13.  2.]
 [14.  3.]
 [15.  4.]
 [16.  0.]
 [17.  1.]
 [18.  2.]
 [19.  3.]
 [20.  4.]]
subject: [[ 1.  1.]
 [ 2.  2.]
 [ 3.  3.]
 [ 4.  3.]
 [ 5.  0.]
 [ 6.  1.]
 [ 7.  2.]
 [ 8.  3.]
 [ 9.  4.]
 [10.  0.]
 [11.  1.]
 [12.  1.]
 [13.  4.]
 [14.  4.]
 [15.  0.]
 [16.  1.]
 [17.  2.]
 [18.  3.]
 [19.  4.]
 [20.  0.]]

Synchronization...
  Reference [t_min, t_max] by dt (s): [1.0, 20.0] by 1.0
  Subject [t_min, t_max] by dt (s): [1.0, 20.0] by 1.0
  Globalized [t_min, t_max] by dt (s): [1.0, 20.0] by 1.0
  Globalized times: [ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15. 16. 17. 18.
 19. 20.]
  Length of globalized times: 20

Correlation...
  Sliding dot product (cross-correlation): [  0.   0.   4.  11.  20.  30.  20.  19.  27.  41.  61.  41.  30.  42.
  61.  91.  61.  44.  55.  78. 117.  81.  51.  48.  58.  87.  61.  36.
  32.  37.  56.  40.  25.  17.  17.  26.  20.  11.   4.]
  Length of the sliding dot product: 39
  Max sliding dot product (cross-correlation): 117.0
  Sliding dot product of normalized signals (cross-correlation): [0.         0.         0.03375798 0.09283444 0.16878989 0.25318484
 0.16878989 0.1603504  0.22786636 0.34601928 0.51480918 0.34601928
 0.25318484 0.35445878 0.51480918 0.76799402 0.51480918 0.37133777
 0.46417221 0.65828059 0.98742088 0.68359907 0.43041423 0.40509575
 0.48949069 0.73423604 0.51480918 0.30382181 0.27006383 0.3122613
 0.4726117  0.33757979 0.21098737 0.14347141 0.14347141 0.21942686
 0.16878989 0.09283444 0.03375798]
  Correlated time_shift (from full left)=20.0
  Correlated index_shift (from full left)=20
  Correlated time step (s): 1.0
  Correlated t_min (s): 1.0
  Correlated t_max (s): 21.0
  Correlated times: [ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15. 16. 17. 18.
 19. 20. 21.]
  Correlated reference f(t): [0. 1. 2. 3. 4. 0. 1. 2. 3. 4. 0. 1. 2. 3. 4. 0. 1. 2. 3. 4. 0.]
  Correlated subject f(t): [0. 1. 2. 3. 3. 0. 1. 2. 3. 4. 0. 1. 1. 4. 4. 0. 1. 2. 3. 4. 0.]
  Correlated error f(t): [ 0.  0.  0.  0. -1.  0.  0.  0.  0.  0.  0.  0. -1.  1.  0.  0.  0.  0.
  0.  0.  0.]
  reference_self_correlation: 120.0
  cross_correlation: 117.0
    >> cross_correlation_relative_error=0.025
    >> L2-norm error rate: 0.08247860988423225
  Signal process "correlation" completed.
Finished XYViewBase constructor.
Finished XYViewBase constructor.
Creating view with guid = "figure_1"
  Adding ['signal_a', 'signal_b'] model(s) to current view.
  Figure dpi set to 100
  Figure size set to [8.0, 6.0] inches.
  Serialized view to: out_anomaly_pre_corr.svg
Creating view with guid = "figure_2"
  Adding ['signal_a', 'signal_b_correlated'] model(s) to current view.
  Figure dpi set to 100
  Figure size set to [8.0, 6.0] inches.
  Serialized view to: out_anomaly_post_corr.svg
====================================
End of figio execution.

Figures: out_anomaly_pre_corr.svg (pre-correlation) and out_anomaly_post_corr.svg (post-correlation).
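The normalized peak and the aligned series reported in the log can be checked by hand. The sketch below is an illustration only: the helper names `normalized_peak` and `align` are hypothetical, equal-length signals on a shared time grid are assumed, and the normalization (dividing by the product of the two signals' Euclidean norms) is an assumption that happens to be consistent with the logged values.

```python
import math
from typing import List, Sequence, Tuple

# Data copied from signal_a.csv and signal_b.csv (y columns only).
ref = [0.0, 1.0, 2.0, 3.0, 4.0] * 4
sub = [1.0, 2.0, 3.0, 3.0, 0.0, 1.0, 2.0, 3.0, 4.0, 0.0,
       1.0, 1.0, 4.0, 4.0, 0.0, 1.0, 2.0, 3.0, 4.0, 0.0]

# Values copied from the log above.
peak = 117.0      # Max sliding dot product
index_shift = 20  # Correlated index_shift (from full left)


def normalized_peak(ref: Sequence[float], sub: Sequence[float], peak: float) -> float:
    """Scale a raw inner product by the Euclidean norms of both signals."""
    norm_ref = math.sqrt(sum(r * r for r in ref))
    norm_sub = math.sqrt(sum(s * s for s in sub))
    return peak / (norm_ref * norm_sub)


def align(
    ref: Sequence[float], sub: Sequence[float], index_shift: int
) -> Tuple[List[float], List[float]]:
    """Zero-pad both signals so that sub is shifted to its best position."""
    # Step len(sub) - 1 from full left is the original, unshifted position.
    lag = index_shift - (len(sub) - 1)
    pad = [0.0] * abs(lag)
    if lag >= 0:  # subject moves right: pad its front, pad the reference's end
        return list(ref) + pad, pad + list(sub)
    return pad + list(ref), list(sub) + pad  # subject moves left


aligned_ref, aligned_sub = align(ref, sub, index_shift)
print(round(normalized_peak(ref, sub, peak), 6))
print(aligned_sub)
```

Here the lag is 20 - 19 = 1, so the subject moves one time step to the right and the shared time axis grows from 20 to 21 points, which matches the "Correlated times" entry in the log.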

Error metrics:

  • cross-correlation relative error: 2.5 percent
  • L2-norm error rate: 8.2 percent
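Both metrics can be recomputed from the correlated series printed in the log. The formulas below are inferred from the logged numbers rather than taken from the figio source, so treat them as assumptions: the relative error compares the cross-correlation with the reference self-correlation, and the L2-norm error rate divides the Euclidean norm of the pointwise error by the number of points.

```python
import math

# "Correlated reference f(t)" and "Correlated subject f(t)" from the log.
reference = [0.0, 1.0, 2.0, 3.0, 4.0] * 4 + [0.0]
subject = [0.0, 1.0, 2.0, 3.0, 3.0, 0.0, 1.0, 2.0, 3.0, 4.0, 0.0,
           1.0, 1.0, 4.0, 4.0, 0.0, 1.0, 2.0, 3.0, 4.0, 0.0]

# Pointwise error, subject minus reference (matches "Correlated error f(t)").
error = [s - r for r, s in zip(reference, subject)]

# Inner product of the reference with itself, and with the aligned subject.
reference_self_correlation = sum(r * r for r in reference)
cross_correlation = sum(r * s for r, s in zip(reference, subject))

# Assumed formula: relative gap between self-correlation and cross-correlation.
relative_error = (
    abs(reference_self_correlation - cross_correlation) / reference_self_correlation
)

# Assumed formula: Euclidean norm of the error divided by the number of points.
l2_error_rate = math.sqrt(sum(e * e for e in error)) / len(error)

print(reference_self_correlation, cross_correlation)  # prints 120.0 117.0
print(relative_error)  # prints 0.025
print(round(l2_error_rate, 4))  # prints 0.0825
```

The error vector has three nonzero entries of magnitude 1, so its norm is the square root of 3, and dividing by the 21 points gives about 0.0825.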

References

[1] Terpsma RJ, Hovey CB. Blunt impact brain injury using cellular injury criterion. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); 2020 Oct 1.

[2] Understanding Cross-Correlation, Auto-Correlation, Normalization and Time Shift, March 8, 2016. Available from: https://anomaly.io/understand-auto-cross-correlation-normalized-shift/