API Reference

tmgtoolkit

plotting

plot_time_series(ax, y, t=None, **kwargs)

Generic function for plotting time series.

Plots the inputted time series signal(s) y on the inputted Matplotlib axis ax. This function modifies ax directly and does not return a new axis.

Parameters:

Name Type Description Default
ax Axes

Matplotlib axis object on which to plot

required
y ndarray

Numpy array holding the time series signal(s) to plot; y can be either one-dimensional (to plot a single time series) or two-dimensional (to plot multiple time series on the same axes). If y is two-dimensional, each time series should be a column in y.

required
t ndarray

1D Numpy array holding the time (or other independent variable) values on which y is defined. If provided, t must be a 1D Numpy array with the same number of points as the time series in y. If not provided, this function will use index values of y as the independent variable, i.e. t = [0, 1, 2, ..., y.shape[0] - 1].

None
Keyword Arguments

xlabel : str
    Label for x axis
ylabel : str
    Label for y axis
title : str
    Axis title
color : str or list
    Color for the lines of plotted time series signal(s). If a list, color should contain exactly one color string for every time series in y.
linewidth : str or list
    Width for the lines of plotted time series signal(s); str/list behavior as for color.
marker : str or list
    Marker for the lines of plotted time series signal(s); str/list behavior as for color.
label : str or list
    Human-readable label for the plotted time series signal(s); str/list behavior as for color.

Source code in src/tmgtoolkit/plotting.py
def plot_time_series(ax, y, t=None, **kwargs):
    """Generic function for plotting time series.

    Plots the inputted time series signal(s) `y` on the inputted Matplotlib
    axis `ax`. This function modifies `ax` directly and does not return a new
    axis.

    Parameters
    ----------
    ax : matplotlib.axes._axes.Axes
        Matplotlib axis object on which to plot
    y : ndarray
        Numpy array holding the time series signal(s) to plot; `y` can be
        either one-dimensional (to plot a single time series) or
        two-dimensional (to plot multiple time series on the same axes). If `y`
        is two-dimensional, each time series should be a column in `y`.
    t : ndarray, optional
        1D Numpy array holding the time (or other independent variable) values
        on which `y` is defined. If provided, `t` must be a 1D Numpy array with
        the same number of points as the time series in `y`. If not provided,
        this function will use index values of `y` as the independent variable,
        i.e. `t = [0, 1, 2, ..., y.shape[0] - 1]`.

    Keyword Arguments
    -----------------
    xlabel : str
        Label for x axis
    ylabel : str
        Label for y axis
    title : str
        Axis title
    color : str or list
        Color for the lines of plotted time series signal(s). If a list,
        `color` should contain exactly one color string for every time series
        in `y`.
    linewidth : str or list
        Width for the lines of plotted time series signal(s); str/list behavior
        as for `color`.
    marker : str or list
        Marker for the lines of plotted time series signal(s); str/list behavior
        as for `color`.
    label : str or list
        Human-readable label for the plotted time series signal(s); str/list
        behavior as for `color`.

    """
    if t is None:
        # Default to sample indices 0, 1, ..., y.shape[0] - 1
        t = np.arange(y.shape[0])

    xlabel=kwargs.get('xlabel', PlottingConstants.TIME_SERIES_DEFAULTS['xlabel'])
    ylabel=kwargs.get('ylabel', PlottingConstants.TIME_SERIES_DEFAULTS['ylabel'])
    title=kwargs.get('title', PlottingConstants.TIME_SERIES_DEFAULTS['title'])
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.set_title(title)

    _remove_spines(ax)
    ax.plot(t, y, color=kwargs.get('color'), marker=kwargs.get('marker', PlottingConstants.TIME_SERIES_DEFAULTS['marker']),
            linewidth=kwargs.get('linewidth'), label=kwargs.get('label'))
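
The default time axis described for `t` can be sketched with NumPy alone. This is a minimal illustration, not the library's code: the sample values are made up, and the point is only that the index grid follows the number of rows of `y`, whether `y` holds one series or several series stored as columns.

```python
import numpy as np

# One time series with four samples (illustrative values)
y_single = np.array([0.0, 1.5, 2.0, 0.5])

# Three time series stored as columns: shape (points_per_series, number_of_series)
y_multi = np.column_stack([y_single, 2 * y_single, 3 * y_single])

# Default independent variable: one index per sample (row), not per series
t_single = np.arange(y_single.shape[0])
t_multi = np.arange(y_multi.shape[0])

print(t_single.tolist())  # [0, 1, 2, 3]
print(y_multi.shape)      # (4, 3)
```

Both grids have four entries, so the same `t` can be used for a single series and for every column of a 2D `y`.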

plot_spm_t_statistic(ax, t_statistic, alpha, threshold, clusters, t=None, **kwargs)

Plots an SPM t-statistic and accompanying inference results.

Plots the inputted SPM t-statistic curve t_statistic and a summary of the accompanying inference results (alpha, threshold, and clusters) on the inputted Matplotlib axes object ax. This function modifies ax directly and does not return a new axis.

Parameters:

Name Type Description Default
ax Axes

Matplotlib axis object on which to plot

required
t_statistic ndarray

1D Numpy array holding the SPM t-test statistic curve to be plotted

required
alpha float

Alpha value used for SPM inference

required
threshold float

Significance threshold value from SPM inference

required
clusters list
A list of SpmCluster dicts summarizing each supra-threshold
cluster, or an empty list if the inference did not produce
supra-threshold clusters, as in the `clusters` key returned by
`get_spm_t_inference`.
required
t ndarray

1D Numpy array holding the time (or other independent variable) values on which the SPM t-statistic curve t_statistic is defined. If provided, t must be a 1D Numpy array with the same number of points as the t-statistic curve. If not provided, this function will use index values of t_statistic as the independent variable.

None
Keyword Arguments

xlabel : str
    Label for x axis
ylabel : str
    Label for y axis
title : str
    Axis title
color : str
    Color of the plotted t-statistic curve.
marker : str
    Marker for the t-statistic curve.
cluster_fillcolor : str
    Background color of suprathreshold significance clusters.
threshold_color : str
    Color of the significance threshold line.
threshold_linestyle : str
    Style of the significance threshold line.

Source code in src/tmgtoolkit/plotting.py
def plot_spm_t_statistic(ax, t_statistic, alpha, threshold, clusters, t=None, **kwargs):
    """Plots an SPM t-statistic and accompanying inference results.

    Plots the inputted SPM t-statistic curve `t_statistic` and a summary of the
    accompanying inference results (`alpha`, `threshold`, and `clusters`) on
    the inputted Matplotlib axes object `ax`. This function modifies `ax`
    directly and does not return a new axis.

    Parameters
    ----------
    ax : matplotlib.axes._axes.Axes
        Matplotlib axis object on which to plot
    t_statistic : ndarray
        1D Numpy array holding the SPM t-test statistic curve to be plotted
    alpha : float
        Alpha value used for SPM inference
    threshold : float
        Significance threshold value from SPM inference
    clusters : list
            A list of SpmCluster dicts summarizing each supra-threshold
            cluster, or an empty list if the inference did not produce
            supra-threshold clusters, as in the `clusters` key returned by
            `get_spm_t_inference`.
    t : ndarray, optional
        1D Numpy array holding the time (or other independent variable) values
        on which the SPM t-statistic curve `t_statistic` is defined. If
        provided, `t` must be a 1D Numpy array with the same number of points
        as the t-statistic curve. If not provided, this function will use index
        values of `t_statistic` as the independent variable.

    Keyword Arguments
    -----------------
    xlabel : str
        Label for x axis
    ylabel : str
        Label for y axis
    title : str
        Axis title
    color : str
        Color of the plotted t-statistic curve.
    marker : str
        Marker for the t-statistic curve.
    cluster_fillcolor : str
        Background color of suprathreshold significance clusters.
    threshold_color : str
        Color of the significance threshold line.
    threshold_linestyle : str
        Style of the significance threshold line.

    """
    if t is None:
        # Default to sample indices 0, 1, ..., len(t_statistic) - 1
        t = np.arange(t_statistic.shape[0])

    xlabel=kwargs.get('xlabel', PlottingConstants.SPM_STATISTIC_DEFAULTS['xlabel'])
    ylabel=kwargs.get('ylabel', PlottingConstants.SPM_STATISTIC_DEFAULTS['ylabel'])
    title=kwargs.get('title', PlottingConstants.SPM_STATISTIC_DEFAULTS['title'])
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.set_title(title)

    _remove_spines(ax)

    # Plot t-statistic
    ax.plot(t, t_statistic, color=kwargs.get('color'),
            marker=kwargs.get('marker', PlottingConstants.SPM_STATISTIC_DEFAULTS['marker']),
            linewidth=kwargs.get('linewidth'), label=kwargs.get('label'))

    # Plot dashed line at y = 0
    ax.axhline(y=0, color=PlottingConstants.SPM_STATISTIC_DEFAULTS['x_axis_color'], linestyle=PlottingConstants.SPM_STATISTIC_DEFAULTS['x_axis_linestyle'])

    # Plot dashed line at SPM significance threshold
    threshold_color = kwargs.get('threshold_color', PlottingConstants.SPM_STATISTIC_DEFAULTS['threshold_color'])
    threshold_linestyle = kwargs.get('threshold_linestyle', PlottingConstants.SPM_STATISTIC_DEFAULTS['threshold_linestyle'])
    ax.axhline(y=threshold, color=threshold_color,
               linestyle=threshold_linestyle)
    ax.axhline(y=-threshold, color=threshold_color,
               linestyle=threshold_linestyle)

    # Shade between curve and threshold
    fill_color = kwargs.get('cluster_fillcolor', PlottingConstants.SPM_STATISTIC_DEFAULTS['cluster_fillcolor'])
    ax.fill_between(t, t_statistic, threshold, where=t_statistic >= threshold,
            interpolate=True, color=fill_color)
    ax.fill_between(t, t_statistic, -threshold, where=t_statistic <= -threshold,
            interpolate=True, color=fill_color)

    # Text box showing SPM cluster parameters
    best_cluster = None if len(clusters) == 0 else clusters[_choose_cluster_to_display(clusters)]
    ax.text(PlottingConstants.SPM_STATISTIC_DEFAULTS['textbox_x'], PlottingConstants.SPM_STATISTIC_DEFAULTS['textbox_y'],
            _get_spm_axis_text(alpha, threshold, best_cluster),
            va='top', ha='left',
            transform=ax.transAxes,
            bbox=dict(facecolor=PlottingConstants.SPM_STATISTIC_DEFAULTS['textbox_facecolor'],
                      edgecolor=PlottingConstants.SPM_STATISTIC_DEFAULTS['textbox_edgecolor'],
                      boxstyle=PlottingConstants.SPM_STATISTIC_DEFAULTS['textbox_style']))
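
The shaded regions come from the boolean `where=` masks passed to `fill_between` in the source above. The following NumPy-only sketch shows which samples those masks select; the curve and threshold values are made up for illustration.

```python
import numpy as np

# Illustrative t-statistic curve and threshold (made-up values)
t_statistic = np.array([0.5, 2.5, 3.0, 1.0, -2.2, -0.3])
threshold = 2.0

# Samples shaded between the curve and the positive threshold line
above = t_statistic >= threshold
# Samples shaded between the curve and the negative threshold line
below = t_statistic <= -threshold

print(above.tolist())  # [False, True, True, False, False, False]
print(below.tolist())  # [False, False, False, False, True, False]
```

Because both signs of the threshold are checked, supra-threshold regions are shaded whether the curve leaves the band upward or downward.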

plot_spm_input_data_lazy(ax, group1, group2, t=None, **kwargs)

Plots the data used to compute an SPM t-statistic.

Wrapper around plot_spm_input_data that takes care of computing mean and standard deviation for the user.

Parameters:

Name Type Description Default
ax Axes

Matplotlib axis object on which to plot

required
group1 ndarray

2D Numpy array holding at least two time series, as documented in spm.get_spm_t_statistic()

required
group2 ndarray

2D Numpy array holding at least two time series, as documented in spm.get_spm_t_statistic()

required
t ndarray

1D Numpy array holding the time (or other independent variable) values on which the time series in group1 and group2 are defined.

None
Keyword Arguments

See plot_spm_input_data.

Source code in src/tmgtoolkit/plotting.py
def plot_spm_input_data_lazy(ax, group1, group2, t=None, **kwargs):
    """Plots the data used to compute an SPM t-statistic.

    Wrapper around `plot_spm_input_data` that takes care of computing mean and standard deviation for the user.

    Parameters
    ----------
    ax : matplotlib.axes._axes.Axes
        Matplotlib axis object on which to plot
    group1 : ndarray
        2D Numpy array holding at least two time series, as documented in
        `spm.get_spm_t_statistic()`
    group2 : ndarray
        2D Numpy array holding at least two time series, as documented in
        `spm.get_spm_t_statistic()`
    t : ndarray, optional
        1D Numpy array holding the time (or other independent variable) values
        on which the time series in `group1` and `group2` are defined.

    Keyword Arguments
    -----------------
    See `plot_spm_input_data`.

    """
    mean1 = np.mean(group1, axis=1)
    mean2 = np.mean(group2, axis=1)
    std1 = np.std(group1, ddof=1, axis=1)
    std2 = np.std(group2, ddof=1, axis=1)
    plot_spm_input_data(ax, mean1, mean2, std1, std2, t=t, **kwargs)
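
To make the reduction concrete, here is a small NumPy sketch of the same `np.mean` and `np.std` calls on a made-up group. With series stored as columns, reducing along `axis=1` yields one value per time point.

```python
import numpy as np

# Two time series (columns) sampled at three points (rows); made-up values
group = np.array([[1.0, 3.0],
                  [2.0, 6.0],
                  [5.0, 5.0]])

mean = np.mean(group, axis=1)        # average across series at each time point
std = np.std(group, ddof=1, axis=1)  # sample standard deviation (n - 1 divisor)

print(mean.tolist())  # [2.0, 4.0, 5.0]
print(std.shape)      # (3,)
```

The resulting 1D arrays have the length of one time series, which is the shape `plot_spm_input_data` expects for its mean and standard deviation arguments.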

plot_spm_input_data(ax, mean1, mean2, std1, std2, t=None, **kwargs)

Plots the data used to compute an SPM t-statistic.

Plots the mean value curve and standard deviation clouds of the data that would be used as input to a function like spm.get_spm_t_statistic() to compute an SPM t-statistic.

This function modifies ax directly and does not return a new axis.

Parameters:

Name Type Description Default
ax Axes

Matplotlib axis object on which to plot

required
mean1 ndarray

1D Numpy array mean value curve of group 1 SPM input data. See spm.get_spm_t_statistic()

required
mean2 ndarray

1D Numpy array mean value curve of group 2 SPM input data. See spm.get_spm_t_statistic()

required
std1 ndarray

1D Numpy array standard deviation curve of group 1 SPM input data. See spm.get_spm_t_statistic()

required
std2 ndarray

1D Numpy array standard deviation curve of group 2 SPM input data. See spm.get_spm_t_statistic()

required
t ndarray

1D Numpy array holding the time (or other independent variable) values on which mean1, mean2, std1, and std2 are defined. If not provided, this function will use index values of mean1 as the independent variable.

None
Keyword Arguments

xlabel : str
    Label for x axis
ylabel : str
    Label for y axis
title : str
    Axis title
color1 : str
    Color of the mean value line of the data in group1.
color2 : str
    Color of the mean value line of the data in group2.
linewidth1 : str
    Width of the mean value line of the data in group1.
linewidth2 : str
    Width of the mean value line of the data in group2.
fillcolor1 : str
    Fill color of the standard deviation cloud of the data in group1.
fillcolor2 : str
    Fill color of the standard deviation cloud of the data in group2.
alpha1 : str
    Alpha of the standard deviation cloud of the data in group1.
alpha2 : str
    Alpha of the standard deviation cloud of the data in group2.
label1 : str
    Human-readable label for the group1 data.
label2 : str
    Human-readable label for the group2 data.

Source code in src/tmgtoolkit/plotting.py
def plot_spm_input_data(ax, mean1, mean2, std1, std2, t=None, **kwargs):
    """Plots the data used to compute an SPM t-statistic.

    Plots the mean value curve and standard deviation clouds of the data that
    would be used as input to a function like `spm.get_spm_t_statistic()` to
    compute an SPM t-statistic.

    This function modifies `ax` directly and does not return a new axis.

    Parameters
    ----------
    ax : matplotlib.axes._axes.Axes
        Matplotlib axis object on which to plot
    mean1 : ndarray
        1D Numpy array mean value curve of group 1 SPM input data.
        See `spm.get_spm_t_statistic()`
    mean2 : ndarray
        1D Numpy array mean value curve of group 2 SPM input data.
        See `spm.get_spm_t_statistic()`
    std1 : ndarray
        1D Numpy array standard deviation curve of group 1 SPM input data.
        See `spm.get_spm_t_statistic()`
    std2 : ndarray
        1D Numpy array standard deviation curve of group 2 SPM input data.
        See `spm.get_spm_t_statistic()`
    t : ndarray, optional
        1D Numpy array holding the time (or other independent variable) values
        on which `mean1`, `mean2`, `std1`, and `std2` are defined. If not
        provided, this function will use index values of `mean1` as the
        independent variable.

    Keyword Arguments
    -----------------
    xlabel : str
        Label for x axis
    ylabel : str
        Label for y axis
    title : str
        Axis title
    color1 : str
        Color of the mean value line of the data in `group1`.
    color2 : str
        Color of the mean value line of the data in `group2`.
    linewidth1 : str
        Width of the mean value line of the data in `group1`.
    linewidth2 : str
        Width of the mean value line of the data in `group2`.
    fillcolor1 : str
        Fill color of the standard deviation cloud of the data in `group1`.
    fillcolor2 : str
        Fill color of the standard deviation cloud of the data in `group2`.
    alpha1 : str
        Alpha of the standard deviation cloud of the data in `group1`.
    alpha2 : str
        Alpha of the standard deviation cloud of the data in `group2`.
    label1 : str
        Human-readable label for the `group1` data.
    label2 : str
        Human-readable label for the `group2` data.

    """
    if t is None:
        # Default to sample indices 0, 1, ..., len(mean1) - 1
        t = np.arange(mean1.shape[0])

    xlabel=kwargs.get('xlabel', PlottingConstants.SPM_STATISTIC_DEFAULTS['xlabel'])
    ylabel=kwargs.get('ylabel', PlottingConstants.SPM_STATISTIC_DEFAULTS['ylabel'])
    title=kwargs.get('title', PlottingConstants.SPM_STATISTIC_DEFAULTS['title'])
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.set_title(title)

    _remove_spines(ax)

    color1 = kwargs.get('color1', PlottingConstants.SPM_INPUT_DATA_DEFAULTS['color1'])
    color2 = kwargs.get('color2', PlottingConstants.SPM_INPUT_DATA_DEFAULTS['color2'])
    fillcolor1 = kwargs.get('fillcolor1', PlottingConstants.SPM_INPUT_DATA_DEFAULTS['color1'])
    fillcolor2 = kwargs.get('fillcolor2', PlottingConstants.SPM_INPUT_DATA_DEFAULTS['color2'])
    alpha1 = kwargs.get('alpha1', PlottingConstants.SPM_INPUT_DATA_DEFAULTS['alpha1'])
    alpha2 = kwargs.get('alpha2', PlottingConstants.SPM_INPUT_DATA_DEFAULTS['alpha2'])
    linewidth1 = kwargs.get('linewidth1', PlottingConstants.SPM_INPUT_DATA_DEFAULTS['linewidth'])
    linewidth2 = kwargs.get('linewidth2', PlottingConstants.SPM_INPUT_DATA_DEFAULTS['linewidth'])
    label1 = kwargs.get('label1', PlottingConstants.SPM_INPUT_DATA_DEFAULTS['label1'])
    label2 = kwargs.get('label2', PlottingConstants.SPM_INPUT_DATA_DEFAULTS['label2'])

    # Mean value lines
    ax.plot(t, mean1, color=color1, linewidth=linewidth1, label=label1, zorder=PlottingConstants.SPM_INPUT_DATA_DEFAULTS['z_line1'])
    ax.plot(t, mean2, color=color2, linewidth=linewidth2, label=label2, zorder=PlottingConstants.SPM_INPUT_DATA_DEFAULTS['z_line2'])


    # Standard deviation clouds
    ax.fill_between(t, mean1 - std1, mean1 + std1, color=fillcolor1,
                    alpha=alpha1,
                    zorder=PlottingConstants.SPM_INPUT_DATA_DEFAULTS['z_fill1'])
    ax.fill_between(t, mean2 - std2, mean2 + std2, color=fillcolor2,
                    alpha=alpha2,
                    zorder=PlottingConstants.SPM_INPUT_DATA_DEFAULTS['z_fill2'])

    # Dashed line at y = 0
    ax.axhline(y=0, color=PlottingConstants.SPM_INPUT_DATA_DEFAULTS['x_axis_color'], linestyle=PlottingConstants.SPM_INPUT_DATA_DEFAULTS['x_axis_linestyle'])

    if label1 is not None or label2 is not None:
        ax.legend(framealpha=PlottingConstants.SPM_INPUT_DATA_DEFAULTS['legend_alpha'])
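
The keyword-argument handling in the source above follows a `kwargs.get(key, default)` pattern. The sketch below mirrors it for the group 1 colors. `DEFAULTS` and `resolve_group1_style` are hypothetical stand-ins introduced for illustration, and the default values are invented because the real `PlottingConstants` values are not shown in this reference.

```python
# Hypothetical stand-in for PlottingConstants.SPM_INPUT_DATA_DEFAULTS;
# the real default values are not shown in this reference.
DEFAULTS = {"color1": "blue"}


def resolve_group1_style(**kwargs):
    """Mirror the kwargs.get(...) lookups used for the group 1 colors."""
    color1 = kwargs.get("color1", DEFAULTS["color1"])
    # The fill color falls back to the *default* color1, not the caller's color1
    fillcolor1 = kwargs.get("fillcolor1", DEFAULTS["color1"])
    return color1, fillcolor1


print(resolve_group1_style())                                # ('blue', 'blue')
print(resolve_group1_style(color1="red"))                    # ('red', 'blue')
print(resolve_group1_style(color1="red", fillcolor1="red"))  # ('red', 'red')
```

One consequence of this lookup order, as the code above is written, is that overriding `color1` alone leaves the standard deviation cloud at the default color; pass `fillcolor1` as well to keep line and cloud matched.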

spm

get_spm_t_statistic(group1, group2, mitigate_iir_filter_artefact=True, swap_groups=False)

Computes SPM t-test statistic for the inputted data.

Returns a dict containing the 1D SPM t-test statistic resulting from a paired SPM t-test comparing the time series data in the inputted arrays group1 and group2. The returned SPM t-statistic is implicitly defined on the same time (or other independent variable) grid as the time series in group1 and group2, but it is up to the calling code to keep track of these time values.

Parameters:

Name Type Description Default
group1 ndarray

2D Numpy array holding at least two time series. Each of the time series in group1 should be the same length, and each series should be stored as a column in group1, so that group1 has shape (points_per_series, number_of_series).

required
group2 ndarray

2D Numpy array holding at least two time series, to be compared to the set of time series in group1. group2 must have the same shape as group1 (see Pre-Conditions): the same number of time series, each of the same length.

required
mitigate_iir_filter_artefact bool

If True, passes data through _mitigate_iir_filter_artefact before computing the SPM t-statistic. See _mitigate_iir_filter_artefact for details.

True
swap_groups bool

Controls the order in which group1 and group2 are passed to spm1d's paired t-test function. This can be useful e.g. if client wants to manipulate which group is treated as "potentiated".

False
Pre-Conditions
  1. group1 and group2 must have the same shape—this is a requirement for further analysis by spm1d.
  2. No row in group1 or in group2 can have all equal values—this is to ensure there are no rows in the inputted data with zero variance, which would cause a divide by zero error when computing the SPM t-statistic. This is again a requirement for further analysis by spm1d.

Note on time values: although the time (or other independent variable) values on which group1 and group2 are defined are not needed to compute the SPM t-statistic, group1 and group2 should conceptually be defined on the same time grid; this becomes relevant when performing inference on the t-statistic returned by this function.

Returns:

Name Type Description
spm_ts dict

A dict holding the SPM t-test statistic curve produced by comparing the time series data in the inputted arrays group1 and group2. The dict has the following fields:
- t_statistic (ndarray): a 1D Numpy array holding the SPM t-statistic curve resulting from the comparison of the time series in group1 to the time series in group2. The length of t_statistic will equal the length of the time series in group1 and group2.
- spm_t (spm1d._spm.SPM_T): wrapper object for the t-statistic used by the spm1d library. Meant for internal use only and subject to change in future versions.

Source code in src/tmgtoolkit/spm.py
def get_spm_t_statistic(group1, group2, mitigate_iir_filter_artefact=True, swap_groups=False):
    """Computes SPM t-test statistic for the inputted data.

    Returns a dict containing the 1D SPM t-test statistic resulting from a
    paired SPM t-test comparing the time series data in the inputted arrays
    `group1` and `group2`. The returned SPM t-statistic is implicitly defined
    on the same time (or other independent variable) grid as the time series in
    `group1` and `group2`, but it is up to the calling code to keep track of
    these time values.

    Parameters
    ----------
    group1 : ndarray
        2D Numpy array holding at least two time series. Each of the time
        series in `group1` should be the same length, and each series should be
        stored as a column in `group1`, so that `group1` has shape
        (points_per_series, number_of_series).
    group2 : ndarray
        2D Numpy array holding at least two time series, to be compared to the
        set of time series in `group1`. `group2` must have the same shape as
        `group1` (see Pre-Conditions): the same number of time series, each of
        the same length.
    mitigate_iir_filter_artefact : bool, optional
        If True, passes data through `_mitigate_iir_filter_artefact` before
        computing the SPM t-statistic. See `_mitigate_iir_filter_artefact` for
        details.
    swap_groups : bool, optional
        Controls the order in which `group1` and `group2` are passed to spm1d's
        paired t-test function. This can be useful e.g. if client wants to
        manipulate which group is treated as "potentiated".

    Pre-Conditions
    --------------
    1. `group1` and `group2` must have the same shape—this is a requirement for
       further analysis by spm1d.
    2. No row in `group1` or in `group2` can have all equal values—this is to
       ensure there are no rows in the inputted data with zero variance, which
       would cause a divide by zero error when computing the SPM t-statistic.
       This is again a requirement for further analysis by spm1d.

    Note on time values: although the time (or other independent variable)
    values on which `group1` and `group2` are defined are not needed to compute
    the SPM t-statistic, `group1` and `group2` should conceptually be defined
    on the same time grid; this becomes relevant when performing inference on
    the t-statistic returned by this function.

    Returns
    -------
    spm_ts : dict
        A dict holding the SPM t-test statistic curve produced by comparing the
        time series data in the inputted arrays `group1` and `group2`. The dict
        has the following fields:
        - `t_statistic` (ndarray): a 1D Numpy array holding the SPM t-statistic
            curve resulting from the comparison of the time series in `group1`
            to the time series in `group2`. The length of `t_statistic` will
            equal the length of the time series in `group1` and `group2`.
        - `spm_t` (spm1d._spm.SPM_T): wrapper object for the t-statistic used
            by the spm1d library. Meant for internal use only and subject to
            change in future versions.

    """
    if group1.shape != group2.shape:
        raise ValueError("Group 1 and Group 2 must have the same shape, but have shapes {} and {}.".format(group1.shape, group2.shape))

    if mitigate_iir_filter_artefact:
        group1, group2 = _mitigate_iir_filter_artefact(group1, group2)
    if swap_groups:
        spm_t = spm1d.stats.ttest_paired(group2.T, group1.T)
    else:
        spm_t = spm1d.stats.ttest_paired(group1.T, group2.T)
    return {
            't_statistic': spm_t.z,
            'spm_t': spm_t,
    }
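
The two documented pre-conditions can be checked before calling `get_spm_t_statistic`. The helper below is a hypothetical sketch (`check_spm_preconditions` is not part of tmgtoolkit) that uses only NumPy and random illustrative data.

```python
import numpy as np


def check_spm_preconditions(group1, group2):
    """Hypothetical helper: raise ValueError if a documented pre-condition fails."""
    if group1.shape != group2.shape:
        raise ValueError("group1 and group2 must have the same shape")
    for name, group in (("group1", group1), ("group2", group2)):
        # A row whose values all equal its first entry has zero variance
        constant_rows = np.all(group == group[:, [0]], axis=1)
        if constant_rows.any():
            raise ValueError(f"{name} has constant rows at indices "
                             f"{np.flatnonzero(constant_rows).tolist()}")


rng = np.random.default_rng(0)
group1 = rng.normal(size=(5, 3))  # 5 points per series, 3 series
group2 = rng.normal(size=(5, 3))
check_spm_preconditions(group1, group2)  # passes silently

group2[2, :] = 1.0  # make row 2 constant across all series
try:
    check_spm_preconditions(group1, group2)
except ValueError as err:
    print(err)  # group2 has constant rows at indices [2]
```

Running a check like this up front gives a clearer error message than a divide-by-zero surfacing later inside the t-statistic computation.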

get_spm_t_inference(spm_ts, t=None, alpha=0.05, two_tailed=True)

Performs SPM inference on the inputted SPM t-statistic data.

Returns a dict holding the results of performing inference on the inputted spm_ts dict at the Type I error level alpha.

Parameters:

Name Type Description Default
spm_ts dict

A dict of t-statistic results as returned by get_spm_t_statistic.

required
t ndarray

1D Numpy array holding the time (or other independent variable) values on which the SPM t-statistic curve spm_ts['t_statistic'] is defined. t is used to return inference-related time parameters in correct units. If provided, t must be a 1D Numpy array with the same number of points as spm_ts['t_statistic'] (this is the same time grid on which the time series used to compute spm_ts were defined). If not provided, this function will use index values of spm_ts['t_statistic'] as the independent variable, i.e. t = [0, 1, 2, ..., spm_ts['t_statistic'].shape[0] - 1].

None
alpha float

Type I error rate (probability of rejecting the null hypothesis given that it is true) when performing inference.

0.05
two_tailed bool

Whether to perform two-sided or one-sided inference.

True

Returns:

Name Type Description
spm_t_inference dict

A dict produced by performing statistical inference on the SPM t-statistic results in spm_ts. The dict has the following fields:
- alpha (float): alpha used for inference.
- p (float): p value for entire inference.
- threshold (float): SPM t-statistic significance threshold value.
- clusters (list): a list of SpmCluster dicts summarizing each supra-threshold cluster, or an empty list if the inference did not produce supra-threshold clusters. Each SpmCluster dict has the following keys:
  - idx (int): the cluster's 0-based index within clusters.
  - p (float): the cluster's p value.
  - start_time (float): time at which the cluster begins, in the same units as t.
  - end_time (float): time at which the cluster ends, in the same units as t.
  - centroid_time (float): time of the cluster's centroid, in the same units as t.
  - centroid (float): SPM t-statistic value of the cluster's centroid.
  - extremum_time (float): time of the cluster's extremum (which can in general be either a maximum or a minimum), in the same units as t.
  - extremum (float): SPM t-statistic value of the cluster's extremum.
  - area (float): the cluster's area, i.e. area of the region bounded by the SPM t-statistic curve and the horizontal line at the significance threshold spm_t_inference['threshold'] from the cluster's start_time to the cluster's end_time.

Source code in src/tmgtoolkit/spm.py
def get_spm_t_inference(spm_ts, t=None, alpha=0.05, two_tailed=True):
    """Performs SPM inference on the inputted SPM t-statistic data.

    Returns a dict holding the results of performing inference on the inputted
    `spm_ts` dict at the Type I error level `alpha`.

    Parameters
    ----------
    spm_ts : dict
        A dict of t-statistic results as returned by `get_spm_t_statistic`.
    t : ndarray, optional
        1D Numpy array holding the time (or other independent variable) values
        on which the SPM t-statistic curve `spm_ts['t_statistic']` is
        defined. `t` is used to return inference-related time parameters in
        correct units.
        If provided, `t` must be a 1D Numpy array with the same number of
        points as `spm_ts['t_statistic']` (this is the same time grid on
        which the time series used to compute `spm_ts` were defined).
        If not provided, this function will use index values of
        `spm_ts['t_statistic']` as the independent variable, i.e.
        `t = [0, 1, 2, ..., spm_ts['t_statistic'].shape[0] - 1]`.
    alpha : float
        Type I error rate (probability of rejecting the null hypothesis given
        that it is true) when performing inference.
    two_tailed : bool
        Whether to perform two-sided or one-sided inference.

    Returns
    -------
    spm_t_inference : dict
        A dict produced by performing statistical inference on the SPM
        t-statistic results in `spm_ts`.
        The dict has the following fields:
        - `alpha` (float): alpha used for inference.
        - `p` (float): p value for entire inference.
        - `threshold` (float): SPM t-statistic significance threshold value.
        - `clusters` (list): a list of SpmCluster dicts summarizing each
              supra-threshold cluster, or an empty list if the inference did
              not produce supra-threshold clusters. Each SpmCluster dict
              has the following keys:
              - `idx` (int): the cluster's 0-based index within `clusters`.
              - `p` (float): the cluster's p value.
              - `start_time` (float): time at which the cluster begins, in the
                    same units as `t`.
              - `end_time` (float): time at which the cluster ends, in the same
                    units as `t`.
              - `centroid_time` (float): time of the cluster's centroid, in the
                    same units as `t`.
              - `centroid` (float): SPM t-statistic value of the cluster's
                    centroid.
              - `extremum_time` (float): time of the cluster's extremum (which
                    can in general be either a maximum or a minimum), in the
                    same units as `t`.
              - `extremum` (float): SPM t-statistic value of the cluster's
                    extremum.
              - `area` (float): the cluster's area, i.e. area of the region
                    bounded by the SPM t-statistic curve and the horizontal
                    line at the significance threshold
                    `spm_t_inference['threshold']` from the cluster's
                    `start_time` to the cluster's `end_time`.
    """
    spm_ti = spm_ts['spm_t'].inference(alpha=alpha, two_tailed=two_tailed, interp=True)

    if t is None:
        t = np.arange(spm_ts['t_statistic'].shape[0])

    return {
      'alpha': alpha,
      'p': spm_ti.p_set,
      'threshold': spm_ti.zstar,
      'clusters': _get_spm_clusters(spm_ti, t)
    }
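
The returned dict can be consumed with ordinary dict and list operations. The sketch below is illustrative only: it uses a hand-written dict with the keys documented above (all numeric values are invented, not real inference results), and `summarize_clusters` is a hypothetical helper that is not part of tmgtoolkit.

```python
# Hand-written stand-in for the dict documented above; every value is
# invented for illustration and is not output of tmgtoolkit or spm1d.
example_inference = {
    "alpha": 0.05,
    "p": 0.012,
    "threshold": 3.1,
    "clusters": [
        {"idx": 0, "p": 0.030, "start_time": 12.5, "end_time": 40.0,
         "centroid_time": 25.0, "centroid": 4.2,
         "extremum_time": 24.0, "extremum": 5.6, "area": 30.1},
        {"idx": 1, "p": 0.045, "start_time": 300.0, "end_time": 310.0,
         "centroid_time": 305.0, "centroid": 3.3,
         "extremum_time": 304.0, "extremum": 3.5, "area": 1.2},
    ],
}


def summarize_clusters(inference):
    """Hypothetical helper: one (start, end, duration) tuple per cluster."""
    return [
        (c["start_time"], c["end_time"], c["end_time"] - c["start_time"])
        for c in inference["clusters"]
    ]


summary = summarize_clusters(example_inference)
for start, end, duration in summary:
    print(f"cluster from {start} to {end} (duration {duration})")
```

Because `clusters` is an empty list when nothing exceeds the threshold, the same loop runs safely on a non-significant result.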

time_series

get_tmg_parameters_of_time_series(y, t=None, ignore_maxima_with_idx_less_than=None, ignore_maxima_less_than=None, use_first_max_as_dm=False, interpolate_dm=False)

Returns TMG parameters for a time series.

Returns a dict holding the TMG parameters Dm, Td, Tc, Ts, and Tr for the inputted time series y.

Parameters:

Name Type Description Default
y ndarray

1D Numpy array holding the values of a time series signal. Typically this will be a TMG signal of muscle displacement measured with respect to time.

required
t ndarray

1D Numpy array holding the time (or other independent variable) values on which y is defined. This is used to return the values of the time parameters Td, Tc, Ts, and Tr in correct units.

If provided, t must be a 1D Numpy array with the same number of points as y. If not provided, this function will use index values of y as the independent variable, i.e. t = [0, 1, 2, ..., y.shape[0] - 1].

Suggestion: If you are computing the TMG parameters of a standard TMG signal sampled at 1 kHz, you can leave t as None and rely on the default values t = [0, 1, 2, ..., y.shape[0] - 1] (or, even better, be explicit and use e.g. t=np.arange(y.shape[0])) and interpret the values of t as milliseconds; this works because in a standard TMG signal sampled at 1 kHz, the samples are uniformly spaced in time with spacing of 1 millisecond. The returned values of the parameters Td, Tc, Ts, and Tr will then be in milliseconds.

None
ignore_maxima_with_idx_less_than int

Ignore data points with index less than ignore_maxima_with_idx_less_than when computing Dm. Used in practice to avoid tiny maxima resulting from filtering artefacts in the first few milliseconds of a TMG signal. Will use a sane default value designed for TMG signals if no value is specified.

None
ignore_maxima_less_than float

Ignore data points with values less than ignore_maxima_less_than when computing Dm. Used in practice to avoid tiny maxima resulting from filtering artefacts in the first few milliseconds of a TMG signal. Will use a sane default value designed for TMG signals if no value is specified.

None
use_first_max_as_dm bool

If True, uses the first maximum meeting the criteria imposed by ignore_maxima_with_idx_less_than and ignore_maxima_less_than for Dm; if False, uses the global maximum for Dm. Used in practice to make Dm, and TMG parameters derived from it, correspond to the twitch from fast-twitch muscle fibers, which may have a distinct, earlier maximum than the global maximum caused by slower-twitch fibers.

False
interpolate_dm bool

If True, uses interpolation to fine-tune the value of Dm beyond the granularity of y's discrete samples. If False, uses the maximum sample in y as Dm. See _interpolate_extremum for more context on interpolation.

False

Returns:

Name Type Description
params dict

A dict holding the computed TMG parameter values. The dict has the following keys:
- dm (float): value of Dm, in the same units as y.
- tm (float): time of Dm, in the same units as t.
- td (float): value of Td, in the same units as t.
- tc (float): value of Tc, in the same units as t.
- ts (float): value of Ts, in the same units as t.
- tr (float): value of Tr, in the same units as t.

Source code in src/tmgtoolkit/time_series.py
def get_tmg_parameters_of_time_series(y, t=None,
                                      ignore_maxima_with_idx_less_than=None,
                                      ignore_maxima_less_than=None,
                                      use_first_max_as_dm=False,
                                      interpolate_dm=False):
    """Returns TMG parameters for a time series.

    Returns a dict holding the TMG parameters Dm, Td, Tc, Ts, and Tr for the
    inputted time series `y`.

    Parameters
    ----------
    y : ndarray
        1D Numpy array holding the values of a time series signal. Typically
        this will be a TMG signal of muscle displacement measured with respect
        to time.
    t : ndarray, optional
        1D Numpy array holding the time (or other independent variable) values
        on which `y` is defined. This is used to return the values of the time
        parameters Td, Tc, Ts, and Tr in correct units.

        If provided, `t` must be a 1D Numpy array with the same number of
        points as `y`. If not provided, this function will use index values of
        `y` as the independent variable, i.e.
        `t = [0, 1, 2, ..., y.shape[0] - 1]`.

        Suggestion: If you are computing the TMG parameters of a standard TMG
        signal sampled at 1 kHz, you can leave `t` as `None` and rely on the
        default values `t = [0, 1, 2, ..., y.shape[0] - 1]` (or, even better,
        be explicit and use e.g. `t=np.arange(y.shape[0])`) and interpret the
        values of `t` as milliseconds; this works because in a standard TMG
        signal sampled at 1 kHz, the samples are uniformly spaced in time with
        spacing of 1 millisecond. The returned values of the parameters Td, Tc,
        Ts, and Tr will then be in milliseconds.
    ignore_maxima_with_idx_less_than : int, optional
        Ignore data points with index less than
        `ignore_maxima_with_idx_less_than` when computing Dm. Used in practice
        to avoid tiny maxima resulting from filtering artefacts in the first
        few milliseconds of a TMG signal. Will use a sane default value
        designed for TMG signals if no value is specified.
    ignore_maxima_less_than : float, optional
        Ignore data points with values less than `ignore_maxima_less_than` when
        computing Dm. Used in practice to avoid tiny maxima resulting from
        filtering artefacts in the first few milliseconds of a TMG signal. Will
        use a sane default value designed for TMG signals if no value is
        specified.
    use_first_max_as_dm : bool
        If True, uses the first maximum meeting the criteria imposed by
        `ignore_maxima_with_idx_less_than` and `ignore_maxima_less_than` for
        Dm; if False, uses the global maximum for Dm. Used in practice to make
        Dm, and TMG parameters derived from it, correspond to the twitch from
        fast-twitch muscle fibers, which may have a distinct, earlier maximum
        than the global maximum caused by slower-twitch fibers.
    interpolate_dm : bool
        If True, uses interpolation to fine-tune the value of Dm beyond the
        granularity of `y`'s discrete samples. If False, uses the maximum
        sample in `y` as Dm. See `_interpolate_extremum` for more context on
        interpolation.

    Returns
    -------
    params : dict
        A dict holding the computed TMG parameter values. The dict has the
        following keys:
        - `dm` (float): value of Dm, in the same units as `y`.
        - `tm` (float): time of Dm, in the same units as `t`.
        - `td` (float): value of Td, in the same units as `t`.
        - `tc` (float): value of Tc, in the same units as `t`.
        - `ts` (float): value of Ts, in the same units as `t`.
        - `tr` (float): value of Tr, in the same units as `t`.
    """
    if t is None:
        t = np.arange(y.shape[0])
    if ignore_maxima_with_idx_less_than is None:
        ignore_maxima_with_idx_less_than = TimeSeriesConstants.TMG_PARAMS['ignore_maxima_with_idx_less_than']
    if ignore_maxima_less_than is None:
        ignore_maxima_less_than = TimeSeriesConstants.TMG_PARAMS['ignore_maxima_less_than']

    dm_idx, dm, float_dm_idx = _get_dm_idx_and_value(y,
                                                     ignore_maxima_with_idx_less_than,
                                                     ignore_maxima_less_than,
                                                     use_first_max_as_dm,
                                                     interpolate_dm)

    t10_upcross_idx = _interpolate_idx_of_target_amplitude(y, 0.1*dm, True)
    t50_upcross_idx = _interpolate_idx_of_target_amplitude(y, 0.5*dm, True)
    t90_upcross_idx = _interpolate_idx_of_target_amplitude(y, 0.9*dm, True)
    t90_downcross_idx = _interpolate_idx_of_target_amplitude(y, 0.9*dm, False, start_search_at_idx=dm_idx)
    t50_downcross_idx = _interpolate_idx_of_target_amplitude(y, 0.5*dm, False, start_search_at_idx=dm_idx)

    # Convert indices to time
    tm = _idx_to_time(float_dm_idx, t)
    t10_upcross = _idx_to_time(t10_upcross_idx, t)
    t50_upcross = _idx_to_time(t50_upcross_idx, t)
    t90_upcross = _idx_to_time(t90_upcross_idx, t)
    t90_downcross = _idx_to_time(t90_downcross_idx, t)
    t50_downcross = _idx_to_time(t50_downcross_idx, t)

    # Compute standard TMG time parameters
    td = t10_upcross
    tc = t90_upcross - t10_upcross
    ts = t50_downcross - t50_upcross
    tr = t50_downcross - t90_downcross

    return {
      'dm': dm,
      'tm': tm,
      'td': td,
      'tc': tc,
      'ts': ts,
      'tr': tr,
    }
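
The time parameters above are all differences between threshold-crossing times at 10%, 50%, and 90% of Dm. The sketch below reproduces that arithmetic on a synthetic triangular signal. The internals of `_interpolate_idx_of_target_amplitude` are not shown in this listing, so `crossing_index` is a hypothetical stand-in that assumes simple linear interpolation between the two samples bracketing each crossing; the library's own helper may behave differently.

```python
import numpy as np


def crossing_index(y, target, rising, start=0):
    """Hypothetical helper: fractional index where y first crosses target.

    Scans forward from `start` and linearly interpolates between the two
    samples that bracket the crossing (an assumption, not tmgtoolkit code).
    """
    for i in range(start, len(y) - 1):
        a, b = y[i], y[i + 1]
        crossed = (a < target <= b) if rising else (a > target >= b)
        if crossed:
            return i + (target - a) / (b - a)
    raise ValueError("target amplitude is never crossed")


# Synthetic triangular "twitch": rises 0 -> 10 over 20 samples, then
# falls 10 -> 0 over 40 samples (1 sample = 1 ms, as at 1 kHz).
y = np.concatenate([np.linspace(0.0, 10.0, 21),
                    np.linspace(10.0, 0.0, 41)[1:]])
dm_idx = int(np.argmax(y))
dm = y[dm_idx]

t10_up = crossing_index(y, 0.1 * dm, rising=True)
t50_up = crossing_index(y, 0.5 * dm, rising=True)
t90_up = crossing_index(y, 0.9 * dm, rising=True)
t90_down = crossing_index(y, 0.9 * dm, rising=False, start=dm_idx)
t50_down = crossing_index(y, 0.5 * dm, rising=False, start=dm_idx)

# Same definitions as in the listing above.
td = t10_up
tc = t90_up - t10_up
ts = t50_down - t50_up
tr = t50_down - t90_down
print(td, tc, ts, tr)
```

For this triangular signal the crossings land on sample points, giving Td = 2, Tc = 16, Ts = 30, and Tr = 16 (in ms under the 1 kHz assumption).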

get_derivative_of_time_series(y, t=None)

Returns the derivative of a time series.

Returns the derivative with respect to time of the inputted time series.

Parameters:

Name Type Description Default
y ndarray

1D Numpy array holding the values of a time series signal, as for get_tmg_parameters_of_time_series.

required
t ndarray

1D Numpy array holding the time (or other independent variable) values on which y is defined.

If provided, t must be a 1D Numpy array with the same number of points as y. If not provided, this function will use index values of y as the independent variable, i.e. t = [0, 1, 2, ..., y.shape[0] - 1].

None

Returns:

Name Type Description
dydt ndarray

1D Numpy array holding the derivative of the inputted time series y. The derivative dydt has the same dimensions as y, is defined on the same time grid t as y, and the units of dydt are the units of y divided by the units of t.

Source code in src/tmgtoolkit/time_series.py
def get_derivative_of_time_series(y, t=None):
    """Returns the derivative of a time series.

    Returns the derivative with respect to time of the inputted time series.

    Parameters
    ----------
    y : ndarray
        1D Numpy array holding the values of a time series signal, as for
        `get_tmg_parameters_of_time_series`.
    t : ndarray, optional
        1D Numpy array holding the time (or other independent variable) values
        on which `y` is defined.

        If provided, `t` must be a 1D Numpy array with the same number of
        points as `y`. If not provided, this function will use index values of
        `y` as the independent variable, i.e.
        `t = [0, 1, 2, ..., y.shape[0] - 1]`.

    Returns
    -------
    dydt : ndarray
        1D Numpy array holding the derivative of the inputted time series `y`.
        The derivative `dydt` has the same dimensions as `y`, is defined on the
        same time grid `t` as `y`,  and the units of `dydt` are the units of
        `y` divided by the units of `t`.

    """
    if t is None:
        t = np.arange(y.shape[0])
    return np.gradient(y, t)
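
Since the function delegates to `np.gradient(y, t)`, the units of the result follow directly from the units of `t`. The short sketch below illustrates this with a made-up linear signal; the variable names and values are illustrative assumptions, not part of tmgtoolkit.

```python
import numpy as np

# Made-up displacement rising linearly at 2 mm per ms, sampled at 1 kHz.
t_ms = np.arange(5)            # 0, 1, 2, 3, 4 (milliseconds)
y_mm = 2.0 * t_ms              # millimetres

# With t in milliseconds, the derivative is in mm/ms.
dydt_mm_per_ms = np.gradient(y_mm, t_ms)

# The same samples on a time grid in seconds give mm/s.
t_s = t_ms / 1000.0
dydt_mm_per_s = np.gradient(y_mm, t_s)

print(dydt_mm_per_ms)
print(dydt_mm_per_s)
```

Both results have the same shape as `y_mm`; only the scale changes with the units of the time grid.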

get_extremum_parameters_of_time_series(y, t=None)

Returns extremum parameters of a time series.

Returns a dict holding the maximum value, time of maximum value, minimum value, and time of minimum value of the inputted time series y.

Parameters:

Name Type Description Default
y ndarray

1D Numpy array holding the values of a time series signal, as for get_tmg_parameters_of_time_series. Typically y will be either a TMG signal or a TMG signal's time derivative, (e.g. as computed by get_derivative_of_time_series).

required
t ndarray

1D Numpy array holding the time (or other independent variable) values on which y is defined.

If provided, t must be a 1D Numpy array with the same number of points as y. If not provided, this function will use index values of y as the independent variable, i.e. t = [0, 1, 2, ..., y.shape[0] - 1].

None

Returns:

Name Type Description
params dict

A dict holding the computed parameter values. The dict has the following keys:
- max (float): maximum value of y, in the same units as y.
- max_time (float): time at which max occurs, in the same units as t. If y has multiple equal maximum values, the time of the first maximum value is used.
- min (float): minimum value of y, in the same units as y.
- min_time (float): time at which min occurs, in the same units as t. If y has multiple equal minimum values, the time of the first minimum value is used.

Note: the extremum values and their times are computed by interpolating y and t, so in general the maximum and minimum values will fall between discrete values in y, and the maximum and minimum times will fall between discrete values in t; this is expected because of interpolation. When t is not provided, the maximum and minimum times (which will in general be non-integer, floating-point values) should be interpreted as the index values where max and min would fall if y were defined on a continuous index domain.

Source code in src/tmgtoolkit/time_series.py
def get_extremum_parameters_of_time_series(y, t=None):
    """Returns extremum parameters of a time series.

    Returns a dict holding the maximum value, time of maximum value, minimum
    value, and time of minimum value of the inputted time series `y`.

    Parameters
    ----------
    y : ndarray
        1D Numpy array holding the values of a time series signal, as for
        `get_tmg_parameters_of_time_series`. Typically `y` will be either a TMG
        signal or a TMG signal's time derivative, (e.g. as computed by
        `get_derivative_of_time_series`).
    t : ndarray, optional
        1D Numpy array holding the time (or other independent variable) values
        on which `y` is defined.

        If provided, `t` must be a 1D Numpy array with the same number of
        points as `y`. If not provided, this function will use index values of
        `y` as the independent variable, i.e.
        `t = [0, 1, 2, ..., y.shape[0] - 1]`.

    Returns
    -------
    params : dict
        A dict holding the computed parameter values. The dict has the
        following keys:
        - `max` (float): maximum value of `y`, in the same units as `y`.
        - `max_time` (float): time at which `max` occurs, in the same units as
              `t`. If `y` has multiple equal maximum values, the time of the
              first maximum value is used.
        - `min` (float): minimum value of `y`, in the same units as `y`.
        - `min_time` (float): time at which `min` occurs, in the same units as
              `t`. If `y` has multiple equal minimum values, the time of the
              first minimum value is used.

        Note: the extremum values and their times are computed by interpolating
        `y` and `t`, so in general the maximum and minimum values will fall
        between discrete values in `y`, and the maximum and minimum times will
        fall between discrete values in `t`; this is expected because of
        interpolation.
        When `t` is not provided, the maximum and minimum times (which will in
        general be non-integer, floating-point values) should be interpreted
        as the index values where `max` and `min` would fall if `y` were
        defined on a continuous index domain.

    """
    if t is None:
        t = np.arange(y.shape[0])

    # Construct window around maximum (as index bounds allow)
    max_idx_estimate = np.argmax(y)
    idx_window = []
    y_window = []
    padding = TimeSeriesConstants.EXTREMUM_PARAMS['interpolation_window_padding']
    for i in range(max_idx_estimate - padding, max_idx_estimate + padding + 1):
        if i >= 0 and i < len(y):
            idx_window.append(i)
            y_window.append(y[i])
    # Named max_value/min_value to avoid shadowing the builtins max and min
    max_idx, max_value = _interpolate_extremum(idx_window, y_window, True)

    # Construct window around minimum (as index bounds allow)
    min_idx_estimate = np.argmin(y)
    idx_window = []
    y_window = []
    for i in range(min_idx_estimate - padding, min_idx_estimate + padding + 1):
        if i >= 0 and i < len(y):
            idx_window.append(i)
            y_window.append(y[i])
    min_idx, min_value = _interpolate_extremum(idx_window, y_window, False)

    max_time = _idx_to_time(max_idx, t)
    min_time = _idx_to_time(min_idx, t)

    return {
      'max_time': max_time,
      'max': max_value,
      'min_time': min_time,
      'min': min_value
    }
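
The listing above builds a small window of samples around the coarse `argmax`/`argmin` estimate and hands it to `_interpolate_extremum`, whose implementation is not shown here. As a rough illustration of why the interpolated extremum can fall between samples, the sketch below fits a parabola to such a window. The quadratic fit, the helper name `parabolic_extremum`, and the padding value of 2 are assumptions made for this example and may not match what tmgtoolkit does.

```python
import numpy as np


def parabolic_extremum(idx_window, y_window):
    """Hypothetical helper: vertex of a least-squares parabola fit."""
    a, b, c = np.polyfit(idx_window, y_window, 2)
    vertex_idx = -b / (2.0 * a)
    vertex_val = c - b**2 / (4.0 * a)
    return vertex_idx, vertex_val


# Samples of y = 5 - (i - 2.3)**2, whose true maximum of 5 sits at
# index 2.3, between the integer samples 2 and 3.
idx = np.arange(6)
y = 5.0 - (idx - 2.3) ** 2

estimate = int(np.argmax(y))   # coarse, integer-valued estimate
padding = 2                    # assumed window padding
window = [i for i in range(estimate - padding, estimate + padding + 1)
          if 0 <= i < len(y)]
max_idx, max_val = parabolic_extremum(window, y[window])
print(max_idx, max_val)
```

The coarse estimate is the integer index 2, while the fitted vertex recovers the fractional index and the peak value that no single sample attains.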

tmgio

Functions for reading data from and writing data to measurement files, and for preprocessing of data read from measurement files in preparation for passing the data to analysis functions.

tmg_excel_to_ndarray(fname, skiprows=None, nrows=None, skipcols=None, ncols=None)

Extracts information in a TMG measurement Excel file.

Returns a 2D Numpy array holding the TMG signals in a standard-format TMG measurement Excel file, as produced by the official TMG measurement software distributed with the TMG S1 and S2 measurement systems.

Parameters:

Name Type Description Default
fname string

Path to a TMG measurement Excel file.

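required
skiprows int

Number of rows to skip at the start of the file before reading data. Defaults to IoConstants.TMG_EXCEL_MAGIC_VALUES['data_start_row_idx'].

None
nrows int

Number of data rows to read. Defaults to IoConstants.TMG_EXCEL_MAGIC_VALUES['data_nrows'].

None
skipcols int

Index of the first column to read; columns with a lower index are skipped. Defaults to IoConstants.TMG_EXCEL_MAGIC_VALUES['data_start_col_idx'].

None
ncols int

Number of columns to read, starting at skipcols. If not provided, all columns from skipcols onward are read.

None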

Returns:

Name Type Description
data ndarray

2D Numpy array holding the TMG signals in the inputted Excel file. Measurements are stored in columns, so that data has shape (rows, cols), where rows is the number of data points in each TMG measurement and cols is the number of measurements in the Excel file. Typically rows will be 1000, since a standard TMG signal is sampled for 1000 ms at 1 kHz.

Source code in src/tmgtoolkit/tmgio.py
def tmg_excel_to_ndarray(fname, skiprows=None, nrows=None, skipcols=None, ncols=None):
    """Extracts information in a TMG measurement Excel file.

    Returns a 2D Numpy array holding the TMG signals in a standard-format
    TMG measurement Excel file, as produced by the official TMG measurement
    software distributed with the TMG S1 and S2 measurement systems.

    Parameters
    ----------
    fname : string
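        Path to a TMG measurement Excel file.
    skiprows : int, optional
        Number of rows to skip at the start of the file before reading data.
        Defaults to `IoConstants.TMG_EXCEL_MAGIC_VALUES['data_start_row_idx']`.
    nrows : int, optional
        Number of data rows to read. Defaults to
        `IoConstants.TMG_EXCEL_MAGIC_VALUES['data_nrows']`.
    skipcols : int, optional
        Index of the first column to read; columns with a lower index are
        skipped. Defaults to
        `IoConstants.TMG_EXCEL_MAGIC_VALUES['data_start_col_idx']`.
    ncols : int, optional
        Number of columns to read, starting at `skipcols`. If not provided,
        all columns from `skipcols` onward are read.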

    Returns
    -------
    data : ndarray
        2D Numpy array holding the TMG signals in the
        inputted Excel file. Measurements are stored in columns, so that
        `data` has shape `(rows, cols)`, where `rows` is the number of
        data points in each TMG measurement and `cols` is the number of
        measurements in the Excel file. Typically `rows` will be 1000,
        since a standard TMG signal is sampled for 1000 ms at 1 kHz.
    """
    if skiprows is None:
        skiprows = IoConstants.TMG_EXCEL_MAGIC_VALUES['data_start_row_idx']
    if nrows is None:
        nrows = IoConstants.TMG_EXCEL_MAGIC_VALUES['data_nrows']
    if skipcols is None:
        skipcols = IoConstants.TMG_EXCEL_MAGIC_VALUES['data_start_col_idx']

    usecols = lambda col: col >= skipcols and ((col < (ncols + skipcols)) if ncols is not None else True)
    return pd.read_excel(fname, engine='openpyxl', header=None, skiprows=skiprows, nrows=nrows, usecols=usecols).values
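
The column selection in the listing above is done by the `usecols` callable. To my understanding, pandas evaluates such a callable against each column name, and with `header=None` those names are the integer column positions. The pure-Python sketch below mirrors that filter so its effect can be checked without an Excel file; `make_usecols` is a hypothetical helper written for this example, not a tmgtoolkit or pandas function.

```python
def make_usecols(skipcols, ncols=None):
    """Hypothetical helper mirroring the column filter in the listing above."""
    def usecols(col):
        if col < skipcols:
            return False
        # No upper bound when ncols is None; otherwise keep ncols columns.
        return ncols is None or col < skipcols + ncols
    return usecols


# Pretend the sheet has 8 columns with integer positions 0..7.
all_columns = range(8)
selected = [c for c in all_columns if make_usecols(skipcols=1, ncols=3)(c)]
unbounded = [c for c in all_columns if make_usecols(skipcols=5)(c)]
print(selected)   # [1, 2, 3]
print(unbounded)  # [5, 6, 7]
```

With `ncols` given, exactly `ncols` columns starting at `skipcols` are kept; without it, every column from `skipcols` onward is kept.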

split_data_for_spm(data, numsets, n1, n2, split_mode, skiprows=0, nrows=None, equalize_columns=True)

Splits structured input data into groups for analysis with SPM.

Splits the time series in the inputted 2D array data into groups that can then be compared to each other with SPM analysis. The function assumes data has a well-defined structure, namely that the time series in data are divided into numsets sets, where each set consists of n1 consecutive time series in Group 1 followed by n2 consecutive time series in Group 2.

For conventional split modes, splits data into a list of (group1, group2) tuples, with one tuple for each measurement set in data.

For "all"-type split modes, splits data into two 2D arrays stored in a single tuple (group1, group2).

For documentation of split modes see constants.IoConstants.SPM_SPLIT_MODES.

Parameters:

Name Type Description Default
data ndarray

2D Numpy array holding time series data. The time series should be stored in columns, so that data has shape (rows, cols), where rows is the number of data points in each time series measurement and cols is the number of time series.

required
numsets int

Number of sets in data.

required
n1 int

Number of Group 1 time series in each set.

required
n2 int

Number of Group 2 time series in each set.

required
skiprows int

Skips the first skiprows in data.

0
split_mode int

A symbolic constant from constants.IoConstants controlling how to split the measurements in data.

required
nrows int

If provided, return only the first nrows after skiprows in data. The default is to return all rows in data.

None
equalize_columns boolean

If True, all returned group1 and group2 arrays are guaranteed to have the same shape. If the inputted data does not split into an equal number of Group 1 and Group 2 time series under the given parameters, then the group with fewer time series is padded with additional time series until group1 and group2 have the same shape.

True

Returns:

Name Type Description
data_tuples list

List holding a (group1, group2) tuple for each measurement set analyzed in data, where group1 and group2 are 2D Numpy arrays holding the Group 1 and Group 2 measurements, respectively, for each set. This return type is used for conventional split modes.

data_tuple tuple

Tuple (group1, group2) holding Group 1 and Group 2 measurements. Fields are:
- 0 (group1) : ndarray. 2D Numpy array holding Group 1 measurements.
- 1 (group2) : ndarray. 2D Numpy array holding Group 2 measurements.

This return type is used for "all"-type split modes.

Source code in src/tmgtoolkit/tmgio.py
def split_data_for_spm(data, numsets, n1, n2, split_mode, skiprows=0, nrows=None, equalize_columns=True):
    """Splits structured input data into groups for analysis with SPM.

    Splits the time series in the inputted 2D array `data` into groups that can
    then be compared to each other with SPM analysis. The function assumes
    `data` has a well-defined structure, namely that the time series in `data`
    are divided into `numsets` sets, where each set consists of `n1`
    consecutive time series in Group 1 followed by `n2` consecutive time series
    in Group 2.

    For conventional split modes, splits `data` into a list of
    `(group1, group2)` tuples, with one tuple for each measurement set in
    `data`.

    For "all"-type split modes, splits `data` into two 2D arrays stored in a
    single tuple `(group1, group2)`.

    For documentation of split modes see
    `constants.IoConstants.SPM_SPLIT_MODES`.

    Parameters
    ----------
    data : ndarray
        2D Numpy array holding time series data. The time series should be
        stored in columns, so that `data` has shape `(rows, cols)`, where
        `rows` is the number of data points in each time series measurement and
        `cols` is the number of time series.
    numsets : int
        Number of sets in `data`.
    n1 : int
        Number of Group 1 time series in each set.
    n2 : int
        Number of Group 2 time series in each set.
    skiprows : int, optional
        Skips the first `skiprows` in `data`.
    split_mode : int
        A symbolic constant from `constants.IoConstants` controlling how to
        split the measurements in `data`.
    nrows : int, optional
        If provided, return only the first `nrows` after `skiprows` in `data`.
        The default is to return all rows in `data`.
    equalize_columns : boolean, optional
        If True, all returned `group1` and `group2` arrays are guaranteed to
        have the same shape. If the inputted data does not split into an equal
        number of Group 1 and Group 2 time series under the given parameters,
        then the group with fewer time series is padded with additional time
        series until `group1` and `group2` have the same shape.

    Returns
    -------
    data_tuples : list
        List holding a `(group1, group2)` tuple for each measurement set
        analyzed in `data`, where `group1` and `group2` are 2D Numpy arrays
        holding the Group 1 and Group 2 measurements, respectively, for each
        set. This return type is used for conventional split modes.
    data_tuple : tuple
        Tuple `(group1, group2)` holding Group 1 and Group 2 measurements.
        Fields are
        0 (group1) : ndarray
            2D Numpy array holding Group 1 measurements.
        1 (group2) : ndarray
            2D Numpy array holding Group 2 measurements.
        This return type is used for "all"-type split modes.

    """
    assert len(data.shape) == 2, "Inputted data must be two-dimensional array."
    assert skiprows >= 0, "The number of rows to skip must be non-negative."
    assert skiprows < data.shape[0], "The number of rows to skip ({}) exceeds the number of data rows ({}).".format(skiprows, data.shape[0])
    if nrows is not None:
        assert nrows > 0, "The requested number of rows to return must be greater than zero."
        assert nrows <= data.shape[0] - skiprows, "The requested number of rows to return ({}) exceeds the number of data rows ({})".format(nrows, data.shape[0]) + (" minus the number of rows to skip ({}).".format(skiprows) if skiprows > 0 else ".")
    else:
        nrows = data.shape[0] - skiprows

    if split_mode == IoConstants.SPM_SPLIT_MODES['parallel']:
        return _split_data_parallel(data, numsets, n1, n2, skiprows, nrows, equalize_columns)
    elif split_mode == IoConstants.SPM_SPLIT_MODES['parallel_all']:
        return _split_data_parallel_all(data, numsets, n1, n2, skiprows, nrows, equalize_columns)
    elif split_mode == IoConstants.SPM_SPLIT_MODES['fixed_baseline']:
        return _split_data_fixed_baseline(data, numsets, n1, n2, skiprows, nrows, equalize_columns)
    elif split_mode == IoConstants.SPM_SPLIT_MODES['fixed_baseline_all']:
        return _split_data_fixed_baseline_all(data, numsets, n1, n2, skiprows, nrows, equalize_columns)
    elif split_mode == IoConstants.SPM_SPLIT_MODES['potentiation_creep']:
        return _split_data_potentiation_creep(data, numsets, n1, n2, skiprows, nrows, equalize_columns)
    elif split_mode == IoConstants.SPM_SPLIT_MODES['potentiation_creep_all']:
        return _split_data_potentiation_creep_all(data, numsets, n1, n2, skiprows, nrows, equalize_columns)
    else:
        raise ValueError("Unsupported split_mode ({}) passed to `split_data_for_spm`.".format(split_mode))
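
The column layout that `split_data_for_spm` assumes (each set is `n1` Group 1 columns followed by `n2` Group 2 columns) can be illustrated with plain array slicing. The sketch below is not any of the library's `_split_data_*` helpers, whose exact semantics are documented in `constants.IoConstants.SPM_SPLIT_MODES` and are not shown here; `split_sets` is a hypothetical function that only demonstrates the per-set layout and ignores `skiprows`, `nrows`, and `equalize_columns`.

```python
import numpy as np


def split_sets(data, numsets, n1, n2):
    """Hypothetical helper: one (group1, group2) tuple per set of columns."""
    tuples = []
    for s in range(numsets):
        start = s * (n1 + n2)
        group1 = data[:, start:start + n1]
        group2 = data[:, start + n1:start + n1 + n2]
        tuples.append((group1, group2))
    return tuples


# 4 rows, 2 sets of (2 Group 1 + 1 Group 2) columns; each entry equals
# its column index so the split is easy to read.
data = np.tile(np.arange(6), (4, 1))
pairs = split_sets(data, numsets=2, n1=2, n2=1)
for group1, group2 in pairs:
    print(group1[0], group2[0])
```

Here columns 0 and 1 form Group 1 of the first set and column 2 its Group 2; columns 3 and 4 form Group 1 of the second set and column 5 its Group 2.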