Spike Train Analysis Toolkit
Information-theoretic methods are now widely used for the analysis of spike train data. However, developing robust implementations of these methods can be tedious and time-consuming. To facilitate further adoption of these methods, we have developed the Spike Train Analysis Toolkit, a software package that implements several information-theoretic spike train analysis techniques. This implementation behaves like a typical Matlab toolbox, but the underlying computations are coded in C and optimized for efficiency.
In describing the capabilities of the toolkit, we further distinguish between what we call information methods and entropy methods.
How can I tell how much information is conveyed by the neural responses I have recorded?
To help you get started, we have created demonstrations of the various information methods, which you should use as a template for your own analyses. The demonstrations illustrate how to select and use these methods, and what to look for in the output. They also guide the selection and use of the associated entropy methods, options, and parameters.
The direct method makes the fewest assumptions about the nature of the neural code, but generally requires the most data (e.g., hundreds of repeats for some stimuli). The metric space method and the binless method require only about 10 repeats per stimulus, but make assumptions about the nature of the neural code (see the references below). For multineuronal data, only the direct and metric space methods are applicable. The direct methods require one to consider time in discrete bins (i.e., data are symbol sequences), whereas the metric space and binless methods work with continuous time (i.e., data are point processes).
This is enough to get started; further detail on the various methods contained in the STAToolkit is presented below.
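To make the distinction between discrete and continuous time concrete, the following Python sketch shows one way spike times could be binned into count sequences and grouped into words, which is the kind of symbol sequence the direct method works with. It is an illustration only and does not use the toolkit's Matlab API; the function names `bin_spike_train` and `to_words`, the bin width, and the word length are all hypothetical choices.

```python
import numpy as np

def bin_spike_train(spike_times, t_start, t_end, bin_width):
    """Count spikes in consecutive bins of equal width (times in seconds)."""
    n_bins = int(round((t_end - t_start) / bin_width))
    edges = t_start + bin_width * np.arange(n_bins + 1)
    counts, _ = np.histogram(spike_times, bins=edges)
    return counts

def to_words(counts, word_length):
    """Group consecutive bins into non-overlapping words (tuples of counts)."""
    n_words = len(counts) // word_length
    return [
        tuple(int(c) for c in counts[i * word_length:(i + 1) * word_length])
        for i in range(n_words)
    ]

# Hypothetical example: four spikes in an 80 ms window, 10 ms bins, 4-bin words.
spikes = [0.012, 0.031, 0.034, 0.078]
counts = bin_spike_train(spikes, t_start=0.0, t_end=0.08, bin_width=0.01)
words = to_words(counts, word_length=4)
```

The metric space and binless methods would instead operate on the `spikes` list directly, without this discretization step.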
Information methods are those methods which estimate the mutual information between an ensemble of spike trains and some other experimental variable. We distinguish between formal and attribute-specific information, as proposed by Reich et al. (2001). Formal information concerns all aspects of the response that depend on the stimulus. It is estimated from the difference between the entropy of responses to an ensemble of temporally rich stimuli and the entropy of responses to an ensemble of repeated stimuli. Attribute-specific information refers to the amount of information that responses convey about a particular experimental parameter. If the parameter describes one of several discrete categories, we refer to it as category-specific information.
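As a rough illustration of category-specific information, the Python sketch below computes the mutual information between stimulus category and response symbol from a table of joint counts, using the identity I(S;R) = H(S) + H(R) - H(S,R) with naive plug-in entropies. This is a textbook calculation under assumed inputs, not the toolkit's implementation; the function names and the example count tables are hypothetical.

```python
import numpy as np

def plugin_entropy(counts):
    """Plug-in (maximum-likelihood) entropy in bits from an array of counts."""
    counts = np.asarray(counts, dtype=float).ravel()
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def mutual_information(joint_counts):
    """I(S;R) in bits; rows are stimulus categories, columns are response symbols."""
    joint = np.asarray(joint_counts, dtype=float)
    h_stimulus = plugin_entropy(joint.sum(axis=1))
    h_response = plugin_entropy(joint.sum(axis=0))
    h_joint = plugin_entropy(joint)
    return h_stimulus + h_response - h_joint

# Hypothetical count tables for two categories and two response symbols.
perfectly_informative = [[10, 0], [0, 10]]
uninformative = [[5, 5], [5, 5]]
info_high = mutual_information(perfectly_informative)
info_zero = mutual_information(uninformative)
```

With limited data the plug-in estimate is biased, which is one reason a toolkit of this kind offers several alternative entropy estimators.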
The current version contains implementations of four information methods: the direct method, the metric space method, the binless method, and the context-tree method.
Entropy methods are those methods that estimate entropy from a discrete histogram, a computation common to many information-theoretic methods. For the general user, we recommend adapting the demos to one's own data; the demos select appropriate entropy methods. The advanced user may wish to substitute other entropy methods, or to use the entropy methods as standalone modules (e.g., entropy1d). Entropy methods are chosen by including the appropriate code in the entropy_estimation_method option (see information options and parameters and implied entropy options and parameters). The codes for the included methods are:
- plugin
- tpmc
- jack
- ma
- bub
- chaoshen
- ww
- nsb

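
To give a sense of what an entropy method does with a histogram, here is a Python sketch of two textbook estimators: the naive plug-in estimate and a standard leave-one-out jackknife bias correction. It is offered as an assumption-laden illustration; whether the toolkit's plugin and jack codes use exactly these formulas is not stated here, and the function names are hypothetical.

```python
import numpy as np

def plugin_entropy(counts):
    """Plug-in (maximum-likelihood) entropy in bits from a vector of counts."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def jackknife_entropy(counts):
    """Jackknife bias-corrected entropy: N*H - (N-1) * mean(leave-one-out H)."""
    counts = np.asarray(counts, dtype=int)
    n_total = int(counts.sum())
    h_full = plugin_entropy(counts)
    loo_sum = 0.0
    for k in np.flatnonzero(counts):
        reduced = counts.copy()
        reduced[k] -= 1  # remove one observation from bin k
        # Every observation in bin k gives the same leave-one-out estimate,
        # so weight it by the number of observations in that bin.
        loo_sum += counts[k] * plugin_entropy(reduced)
    return n_total * h_full - (n_total - 1) * loo_sum / n_total

# Hypothetical histogram: 10 observations split evenly over two bins.
hist = [5, 5]
h_plugin = plugin_entropy(hist)
h_jack = jackknife_entropy(hist)
```

The plug-in estimate tends to underestimate entropy for small samples, and the jackknife correction pushes the estimate upward to compensate.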
Variance estimation methods are chosen by including the appropriate code in the variance_estimation_method option (see information options). The jackknife (jack) and bootstrap (boot) methods can be applied to any entropy estimate. Additionally, the toolkit includes a variance estimate that is specific to the NSB entropy method (nsb_var), and may include other method-specific variance estimates in the future.
Each information method has been partitioned into modules corresponding to steps that provide useful intermediate results. Each method also has a top-level function that performs a complete analysis on an input data structure, for users who do not require that flexibility. The table below lists the five top-level functions and the modules they call: directformal performs a formal information analysis via the direct method; directcat performs a categorical information analysis via the direct method; metric performs a categorical information analysis via the metric space method; binless performs a categorical information analysis via the binless method; and ctwmcmc performs a formal information analysis via the context-tree method.
| directformal | directcat | metric | binless | ctwmcmc |
|---|---|---|---|---|
| directbin: Bin spike trains | directbin: Bin spike trains | metricopen: Prepare input data structure | binlessopen: Prepare input data structure | directbin: Bin spike trains |
| directcondformal: Condition data on both category and time slice | | metricdist: Compute distances between sets of spike train pairs | binlesswarp: Warp spike times | ctwmcmctree: Build the context tree(s) from data |
| directcounttotal: Count spike train words disregarding class | directcountcond: Count spike train words in each class and disregarding class | metricclust: Cluster spike trains based on distance matrix | binlessembed: Embed the spike trains | ctwmcmcsample: Perform Markov chain Monte Carlo sampling on context tree(s) |
| | | matrix2hist2d: Convert a 2-D matrix of counts to a 2-D histogram | | |
| infocond: Information and entropies from conditional and total histograms | infocond: Information and entropies from conditional and total histograms | info2d: Information and entropies from a 2-D histogram | binlessinfo: Compute information components | ctwmcmcinfo: Compute information from context tree entropies |
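The distance step in the metric column can be illustrated with a spike-time edit distance of the kind introduced by Victor and Purpura, in which deleting or inserting a spike costs 1 and shifting a spike by dt costs q*|dt|. The Python sketch below is a plain dynamic-programming version written for this page; it is an assumption that metricdist computes a distance of this family, and the function name `spike_time_distance` is hypothetical.

```python
def spike_time_distance(train_a, train_b, q):
    """Edit distance between two spike trains with shift cost q per unit time."""
    a, b = sorted(train_a), sorted(train_b)
    n, m = len(a), len(b)
    # g[i][j] is the distance between the first i spikes of a and first j of b.
    g = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        g[i][0] = float(i)  # delete i spikes
    for j in range(m + 1):
        g[0][j] = float(j)  # insert j spikes
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            g[i][j] = min(
                g[i - 1][j] + 1.0,                                 # delete a spike
                g[i][j - 1] + 1.0,                                 # insert a spike
                g[i - 1][j - 1] + q * abs(a[i - 1] - b[j - 1]),    # shift a spike
            )
    return g[n][m]

# Hypothetical example: the same pair of trains at two temporal precisions.
d_coarse = spike_time_distance([0.1], [0.2], q=1.0)    # shifting is cheap
d_fine = spike_time_distance([0.1], [0.2], q=100.0)    # delete + insert is cheaper
```

At q = 0 the distance reduces to the difference in spike counts, while large q makes the distance sensitive to fine spike timing, so scanning q probes the temporal precision of the code.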
All of the functions are documented in the function reference.
Included demos give examples of how the top-level functions can be used.
We have developed a text-based input file format for the toolkit that is easy to generate. Users also have the option of bypassing the text-based file format and using another means to read the data into the Matlab input data structure.
Documentation of the analysis options and parameters for information methods and entropy methods is available.
Estimated quantities are packaged in data structures with auxiliary information such as variance estimates. See this page for more information.
This toolkit is one component of a larger endeavor in the field of computational neuroinformatics. We are in the process of integrating the toolkit with Neurodatabase.org (a publicly accessible neurophysiology database), developing a web-based analysis interface, and adapting the toolkit for a dedicated parallel cluster.
We are also working with members of the computational neuroscience community to incorporate their information-theoretic techniques, and we are looking beyond information theory to other methodologies for analyzing neuroscience data. Please contact us if you would like to contribute.