


MEANDIST_ESTIM Estimate the average distance between vectors.
MEANDIST = MEANDIST_ESTIM(DATA) takes an (M x N) matrix DATA and returns
an estimate of the average Euclidean distance between the row vectors
of DATA. The estimate is made by averaging the pairwise distances for
up to 500 randomly selected rows (all M rows when M <= 500).
MEANDIST = MEANDIST_ESTIM(DATA, USE_DATA), when USE_DATA is a scalar, uses
USE_DATA randomly selected rows for the estimate; USE_DATA must not exceed
M. If USE_DATA is a vector, it is taken as a set of indices into the rows
of DATA and the distance estimate is made with the specified rows.
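
The averaging step described above can be illustrated on a tiny data set whose answer is known by hand. This is a minimal sketch, not the function's own code: it computes the distances with an explicit loop instead of the PAIRDIST helper, and it averages over the strictly upper triangle rather than over the nonzero entries, so it differs from the listing when DATA contains identical rows. The example data are an assumption chosen for illustration.

```matlab
% Three 2-D points forming a 3-4-5 right triangle (illustrative data).
data = [0 0; 3 0; 0 4];
M = size(data, 1);

% Pairwise Euclidean distances between rows, computed directly.
dists = zeros(M);
for i = 1:M
    for j = 1:M
        dists(i, j) = norm(data(i, :) - data(j, :));
    end
end

% Count each unordered pair of rows once: strictly upper triangle.
mask = triu(true(M), 1);
meandist = mean(dists(mask));   % averages the distances 3, 4 and 5
```

Here the three pairwise distances are 3, 4 and 5, so the average is 4. With MEANDIST_ESTIM and its PAIRDIST dependency on the path, and assuming PAIRDIST returns the full symmetric Euclidean distance matrix, `meandist_estim(data)` would be expected to give the same value, since all three rows are used when M is below the default row count.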

function meandist = meandist_estim(data, use_data)

% MEANDIST_ESTIM    Estimate the average distance between vectors.
%   MEANDIST = MEANDIST_ESTIM(DATA) takes an (M x N) matrix DATA and returns
%   an estimate of the average Euclidean distance between the row vectors
%   of DATA.  The estimate is made by averaging the pairwise distances for
%   up to 500 randomly selected rows (all M rows when M <= 500).
%
%   MEANDIST = MEANDIST_ESTIM(DATA, USE_DATA), when USE_DATA is a scalar, uses
%   USE_DATA randomly selected rows for the estimate; USE_DATA must not exceed
%   M.  If USE_DATA is a vector, it is taken as a set of indices into the rows
%   of DATA and the distance estimate is made with the specified rows.

% Last Modified By: sbm on Thu Oct 6 20:30:00 2005

%%%%%%%%%% CONSTANTS
default_numrows = 500;    % default sample size, as stated in the help text
M = size(data, 1);

%%%%%%%%%% DEFAULTS & ARGUMENT CHECKING
if ((nargin == 2) && (numel(use_data) > 1))
    % USE_DATA is a vector of row indices: validate by trying to index with it
    try
        data(use_data,:);    % lazy way of checking the indices
    catch
        error('The USE_DATA vector contains invalid indices.');
    end
else
    % USE_DATA is a scalar row count (or absent): draw that many rows at random
    if (nargin == 1)
        use_data = min(M, default_numrows);
    elseif (use_data > M)
        error('USE_DATA cannot contain a scalar higher than the number of rows in DATA.');
    end
    inds = randperm(M);             % randomly shuffle row indices ...
    use_data = inds(1:use_data);    % ... and take the number of indices we're going to use
end

%%%%%%%%%% MEAN DISTANCE ESTIMATION
dists = pairdist(data(use_data,:));
% Average the nonzero entries in the upper triangle of DISTS; zero distances
% (the diagonal and any identical rows) are excluded from the mean.
meandist = mean(dists(triu(logical(dists ~= 0))));
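
The listing relies on a PAIRDIST helper that is not shown here. For readers without it, the following is a hypothetical stand-in, assuming PAIRDIST returns the full symmetric matrix of Euclidean distances between the rows of its input; the variable names are illustrative and the real helper may differ in interface and options.

```matlab
% Hypothetical stand-in for PAIRDIST: Euclidean distances between rows of X.
X = [0 0; 3 0; 0 4];                           % illustrative data

% Use ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a.b for all row pairs at once.
sq = sum(X.^2, 2);                             % squared norm of each row
D2 = bsxfun(@plus, sq, sq') - 2 * (X * X');    % squared distances
D  = sqrt(max(D2, 0));                         % clamp small negative round-off
```

The vectorized form avoids an explicit double loop, at the cost of building an M x M matrix, which is one reason for estimating the mean from a subsample of rows rather than from all of DATA.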