Home > chronux_1_15 > spikesort > ss_aggregate.m

ss_aggregate

PURPOSE ^

SS_AGGREGATE ISI-restricted heirarchical cluster aggregation.

SYNOPSIS ^

function spikes = ss_aggregate(spikes, reintegrate_outliers)

DESCRIPTION ^

 SS_AGGREGATE  ISI-restricted heirarchical cluster aggregation.
     SPIKES = SS_AGGREGATE(SPIKES) takes and returns a spike-sorting object
     SPIKES after aggregating clusters (requires a previous overclustering
     and an interface energy matrix calculation).  The aggregation tree is
     stored in SPIKES.HIERARCHY.TREE and the new assignments are stored in
     SPIKES.HIERARCHY.ASSIGNS.

     The algorithm computes a similarity/connection matrix using the interface
     energy.  It then chooses the cluster pair with the highest connection
     strength, aggregates them, recalculates connection strengths, and then
     repeats the process.  Cluster aggregation is contingent on passing an
     interspike interval (ISI) test if SPIKES.SPIKETIMES is defined; if this
     test is not passed, the pair is not aggregated and aggregation continues.
     Aggregation stops when either (1) all remaining pairs fail the ISI test
     or (2) the connection strength drops below a (heuristic) cutoff of 0.01.

     The ISI test needs an idea of the expected refractory period to test
     for violations; as a default, it uses 1.5 times the width of the
     waveforms in the SPIKES structure.  To override this, define the
     value SPIKES.OPTIONS.REFRACTORY_PERIOD with units of seconds (e.g.,
     set to 0.0017 to indicate an expected refractory period of 1.7 msec).

     The SPIKES.HIERARCHY.TREE output is a matrix describing the aggregation.
     Each aggregation step entry produces a row, listed in the order they
     were performed.  The first two columns are the indices of the clusters
     that were aggregated; the index assigned to the aggregate for future
     entries is the lower of the two original indices.  The third column is
     the connection strength between the clusters before aggregation and the
     fourth column is the isi statistic for the aggregate (0 if isi statistics
     are not being used).
 
     After aggregation, outliers that were previously removed are typically 
     reinserted into the spikes list so that the list aligns with the original
     (pre-sorted) list.  The outlier waveforms and spike times are thus by
     default added back to the SPIKES.WAVEFORMS and SPIKES.SPIKETIMES fields
     respectively, after all other aggregation is complete.  These waveforms
     are assigned the label 0 in SPIKES.HIERARCHY.ASSIGNS and the other, 
     non-outlier, spikes are renumbered accordingly.  To prevent this from
     occuring, pass in 0 as the second argument to this function, i.e.,
     SPIKES = SS_AGGREGATE(SPIKES, 0);  

 References:
     Fee MS et al (1996).  J. Neurosci Methods (69): 175-88

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 function spikes = ss_aggregate(spikes, reintegrate_outliers)
0002 % SS_AGGREGATE  ISI-restricted heirarchical cluster aggregation.
0003 %     SPIKES = SS_AGGREGATE(SPIKES) takes and returns a spike-sorting object
0004 %     SPIKES after aggregating clusters (requires a previous overclustering
0005 %     and an interface energy matrix calculation).  The aggregation tree is
0006 %     stored in SPIKES.HIERARCHY.TREE and the new assignments are stored in
0007 %     SPIKES.HIERARCHY.ASSIGNS.
0008 %
0009 %     The algorithm computes a similarity/connection matrix using the interface
0010 %     energy.  It then chooses the cluster pair with the highest connection
0011 %     strength, aggregates them, recalculates connection strengths, and then
0012 %     repeats the process.  Cluster aggregation is contingent on passing an
0013 %     interspike interval (ISI) test if SPIKES.SPIKETIMES is defined; if this
0014 %     test is not passed, the pair is not aggregated and aggregation continues.
0015 %     Aggregation stops when either (1) all remaining pairs fail the ISI test
0016 %     or (2) the connection strength drops below a (heuristic) cutoff of 0.01.
0017 %
0018 %     The ISI test needs an idea of the expected refractory period to test
0019 %     for violations; as a default, it uses 1.5 times the width of the
0020 %     waveforms in the SPIKES structure.  To override this, define the
0021 %     value SPIKES.OPTIONS.REFRACTORY_PERIOD with units of seconds (e.g.,
0022 %     set to 0.0017 to indicate an expected refractory period of 1.7 msec).
0023 %
0024 %     The SPIKES.HIERARCHY.TREE output is a matrix describing the aggregation.
0025 %     Each aggregation step entry produces a row, listed in the order they
0026 %     were performed.  The first two columns are the indices of the clusters
0027 %     that were aggregated; the index assigned to the aggregate for future
0028 %     entries is the lower of the two original indices.  The third column is
0029 %     the connection strength between the clusters before aggregation and the
0030 %     fourth column is the isi statistic for the aggregate (0 if isi statistics
0031 %     are not being used).
0032 %
0033 %     After aggregation, outliers that were previously removed are typically
0034 %     reinserted into the spikes list so that the list aligns with the original
0035 %     (pre-sorted) list.  The outlier waveforms and spike times are thus by
0036 %     default added back to the SPIKES.WAVEFORMS and SPIKES.SPIKETIMES fields
0037 %     respectively, after all other aggregation is complete.  These waveforms
0038 %     are assigned the label 0 in SPIKES.HIERARCHY.ASSIGNS and the other,
0039 %     non-outlier, spikes are renumbered accordingly.  To prevent this from
0040 %     occuring, pass in 0 as the second argument to this function, i.e.,
0041 %     SPIKES = SS_AGGREGATE(SPIKES, 0);
0042 %
0043 % References:
0044 %     Fee MS et al (1996).  J. Neurosci Methods (69): 175-88
0045 
0046 %   Last Modified By: sbm on Sun Oct  9 22:53:16 2005
0047 
0048 debug = 1;
0049 
0050 starttime = clock;
0051 
0052 cutoff = 0.01;   % arbitrarily stop aggregation when overlap density is < 1% of main cluster density
0053 
0054 %%%%% ARGUMENT CHECKING
0055 if (~isfield(spikes, 'hierarchy') | ~isfield(spikes.hierarchy, 'interface_energy'))
0056     error('SS:energy_not_computed', 'An energy matrix must be computed before aggregation.');
0057 elseif (~isfield(spikes, 'overcluster') | ~isfield(spikes.overcluster, 'assigns'))
0058     error('SS:overcluster_not_computed', 'The data must be overclustered before aggregation.');
0059 elseif (isfield(spikes, 'spiketimes') & ~isfield(spikes, 'Fs'))
0060     error('SS:bad_stop_condition', 'A sampling frequency Fs must be supplied for an ISI stop condition.');
0061 end
0062 if (isfield(spikes.hierarchy, 'assigns') && any(spikes.hierarchy.assigns == 0))
0063     error('SS:aggregate_after_outliers', 'Aggregation can not be performed after outliers are reintegrated into the data.');
0064 end
0065 if (nargin < 2),  reintegrate_outliers = 1;  end
0066 
0067 tmin = size(spikes.waveforms,2)./spikes.Fs;  % minimum possible time btw spikes
0068 if (isfield(spikes, 'options') && isfield(spikes.options, 'refractory_period'))
0069     tref = spikes.options.refractory_period;
0070 else
0071     tref = max(0.002, tmin*1.5);                 % crude guess as to refractory period
0072 end
0073 
0074 %%%%% INITIALIZE A FEW THINGS
0075 assignments = spikes.overcluster.assigns;
0076 interface_energy = spikes.hierarchy.interface_energy;
0077 numclusts = max(assignments);
0078 numpts = full(sparse(assignments, 1, 1, numclusts, 1));
0079 tree = [];
0080 untested = 3*ones(numclusts);    % they're all initially untested
0081 
0082 handle_fig = figure;  handle_img = imagesc(untested);  axis square;
0083 colormap([0 0 0; 0.9 0.4 0.4; 1 1 0; 0.4 0.8 0.2]);  % [bk, rd, yl, gn] => (0 combined, 1 not allowed, 2 testing, 3 untested)
0084 %%%%% AGGREGATE HIGHEST CONNECTION STRENGTHS UNTIL ALL TRIED
0085 while (any(any(triu(untested,1)==3)))   % only the upper triangular half is meaningful
0086     % compute connection strengths from interface energies
0087     %   first, normalize energy:
0088     normalize = ((numpts * numpts') - diag(numpts));    % Off diag: Na*Nb, On diag: Na^2-Na ...
0089     normalize = normalize - diag(0.5*diag(normalize));  % ... and divide diagonal by 2
0090     norm_energy = interface_energy ./ normalize;
0091     %   then, compute connection strength
0092     self = repmat(diag(norm_energy), [1,numclusts]);
0093     connect_strength = 2 .* norm_energy ./ (self + self');
0094     connect_strength = connect_strength .* (1-eye(numclusts));  % diag entries <- 0, so we won't agg clusters with themselves
0095 
0096     % Find best remaining pair
0097     remaining = ((untested == 3) .* connect_strength);
0098     best = max(remaining(:));           % highest untested connection strength
0099     
0100     if (best < cutoff)   % No point continuing if connection strengths have gotten really lousy
0101         break;
0102     end
0103     
0104     [clust1 clust2] = find(connect_strength == best);  % who're the lucky winners?
0105     first = min(clust1(1),clust2(1));   % if we get 2 best pairs, just take the 1st
0106     second = max(clust1(1),clust2(1)); 
0107     untested(first,second) = 2;    untested(first,second) = 2;     % mark that we're trying this one
0108     set(handle_img, 'CData', triu(untested)); title(['Trying ' num2str(first) ' and ' num2str(second)]); drawnow;
0109     
0110     % Is this aggregation allowed?
0111     if (isfield(spikes, 'spiketimes'))  % if we were given spike times, use them ...
0112         t1 = spikes.spiketimes(find(assignments == first));
0113         t2 = spikes.spiketimes(find(assignments == second));    
0114         if (debug)
0115             score = sum([diff(sort([t1; t2])) < tref]) ./ (length(t1) + length(t2));
0116             allow = (score < 0.05);
0117             scores = [0 0 score];
0118         else
0119             [allow, scores] = isiQuality(t1, t2, tmin, 0.010, tref, spikes.Fs);
0120         end        
0121     else  % ... otherwise, there are no restrictions on aggregation
0122         allow = 1;
0123         scores = [0 0 0];
0124     end
0125 
0126     if (allow)      % Bookkeeping ...
0127         % Aggregation subsumes the higher index cluster into the lower.  Start by adding
0128         % (denormalized) interaction energies for the second (higher index) cluster
0129         % to those of the first and zeroing the old entries of the second.  Because of the
0130         % triangular structure of the connection matrix, repeat for both rows and columns ...
0131         interface_energy(first,:) = interface_energy(first,:) + interface_energy(second,:);
0132         interface_energy(second,:) = 0;
0133         interface_energy(:,first) = interface_energy(:,first) + interface_energy(:,second);
0134         interface_energy(:,second) = 0;
0135         interface_energy(second,second) = 1;  % keep self-energy at 1 (we may divide by it later)
0136         % since we added rows & columns, some energy values will have spilled over into the
0137         % lower half of the energy matrix (which must be upper triangular).  The next 2 steps
0138         % recover those values.
0139         overflow = tril(interface_energy, -1);   % spillover below diagonal
0140         interface_energy = interface_energy + overflow' - overflow;  % reflect above diagonal
0141 
0142         % update counts vector
0143         numpts(first) = numpts(first) + numpts(second);
0144         numpts(second) = 2;   % leaving this as 2 prevents div by zero during normalization above
0145         
0146         % make a tree entry for the aggregation we just performed
0147         tree = cat(1, tree, [first, second, best, scores(3)]);
0148 
0149         % Now actually change the numbers
0150         assignments(find(assignments == second)) = first;
0151         
0152         % Finally, indicate that potential aggregations between the new cluster and
0153         % other (nonempty) clusters are untested while pairs involving clusters that
0154         % have already been emptied should not be tested.
0155         untested(first,:) = 3;               untested(:,first) = 3;
0156         untested(tree(:,2),:) = 0;           untested(:,tree(:,2)) = 0;
0157     else
0158         untested(first,second) = 1;          untested(second,first) = 1;
0159     end
0160 end
0161 close(handle_fig);
0162 
0163 %spikes.hierarchy.interface_energy_aggregated = interface_energy;
0164 spikes.hierarchy.tree = tree;
0165 spikes.hierarchy.assigns = assignments;
0166 
0167 if (reintegrate_outliers && isfield(spikes, 'outliers') && ~isempty(spikes.outliers.badinds))
0168     % First, we make room by putting all of the non-outliers back into their original places
0169     spikes.waveforms(spikes.outliers.goodinds,:) = spikes.waveforms;
0170     spikes.spiketimes(spikes.outliers.goodinds,:) = spikes.spiketimes;
0171     spikes.hierarchy.assigns(spikes.outliers.goodinds) = spikes.hierarchy.assigns;
0172     
0173     % Then we fill in the outliers ...
0174     spikes.waveforms(spikes.outliers.badinds,:) = spikes.outliers.waveforms;
0175     spikes.spiketimes(spikes.outliers.badinds,:) = spikes.outliers.spiketimes;
0176     spikes.hierarchy.assigns(spikes.outliers.badinds) = 0;  % ... and add the '0' label.
0177     
0178     % We'll also want to add the assignments to the 'overcluster' list (this is
0179     % important for post-clustering splitting).
0180     spikes.overcluster.assigns(spikes.outliers.goodinds) = spikes.overcluster.assigns;
0181     spikes.overcluster.assigns(spikes.outliers.badinds) = 0;
0182     
0183     spikes = rmfield(spikes, 'outliers');  % don't need this any more -- its redundant.
0184 end
0185 
0186 spikes.tictoc.aggregate = etime(clock, starttime);

Generated on Tue 15-Aug-2006 22:51:57 by m2html © 2003