


MEANDIST_ESTIM Estimate the average distance between vectors.
MEANDIST = MEANDIST_ESTIM(DATA) takes an (M x N) matrix DATA and returns
an estimate of the average Euclidean distance between the row vectors
of DATA. The estimate is made by averaging the pairwise distances for
min(M, 800) randomly selected rows.
MEANDIST = MEANDIST_ESTIM(DATA, USE_DATA), when USE_DATA is a scalar, uses
USE_DATA randomly selected rows for the estimate; an error is raised if
USE_DATA exceeds M. If USE_DATA is a vector, it is taken as a set of
indices into the rows of DATA and the distance estimate is made with the
specified rows.
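
The two forms of USE_DATA described above can be illustrated with a small sketch. It does not call MEANDIST_ESTIM itself. The input matrix, the seed, and the variable names (num_rows, rows_scalar, rows_vector) are assumptions made up for illustration; only the randperm-based draw mirrors what the function appears to do.

```matlab
rng(0);                              % fix the seed so the draw is repeatable
data = rand(1000, 3);                % hypothetical (M x N) input, M = 1000
M = size(data, 1);

% Scalar form: draw that many distinct rows at random
num_rows = 200;                      % hypothetical scalar USE_DATA
inds = randperm(M);                  % random permutation of 1..M
rows_scalar = inds(1:num_rows);      % first num_rows entries, no repeats

% Vector form: the caller names the rows explicitly
rows_vector = [1 5 10 50];           % hypothetical index vector USE_DATA

subset_a = data(rows_scalar, :);     % 200 x 3 block used for the estimate
subset_b = data(rows_vector, :);     % 4 x 3 block used for the estimate
```

Because randperm returns each index exactly once, the scalar form samples rows without replacement.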

function meandist = meandist_estim(data, use_data)

% MEANDIST_ESTIM  Estimate the average distance between vectors.
%   MEANDIST = MEANDIST_ESTIM(DATA) takes an (M x N) matrix DATA and returns
%   an estimate of the average Euclidean distance between the row vectors
%   of DATA.  The estimate is made by averaging the pairwise distances for
%   min(M, 800) randomly selected rows.
%
%   MEANDIST = MEANDIST_ESTIM(DATA, USE_DATA), when USE_DATA is a scalar, uses
%   USE_DATA randomly selected rows for the estimate; an error is raised if
%   USE_DATA exceeds M.  If USE_DATA is a vector, it is taken as a set of
%   indices into the rows of DATA and the distance estimate is made with the
%   specified rows.

%%%%%%%%%% CONSTANTS
default_numrows = 800;
M = size(data, 1);

%%%%%%%%%% DEFAULTS & ARGUMENT CHECKING
if ((nargin == 2) && (numel(use_data) > 1))
    try
        data(use_data,:);   % lazy way of checking the indices
    catch
        error('The USE_DATA vector contains invalid indices.');
    end
else
    if (nargin == 1)
        use_data = min(M, default_numrows);
    elseif (use_data > M)
        error('USE_DATA can not contain a scalar higher than the number of rows in DATA.');
    end
    inds = randperm(M);            % randomly shuffle row indices ...
    use_data = inds(1:use_data);   % ... and take the number of indices we're going to use
end

%%%%%%%%%% MEAN DISTANCE ESTIMATION
dists = pairdist(data(use_data,:));
% average the nonzero entries of the upper triangle, so each pair counts once
meandist = mean(dists(triu(logical(dists ~= 0))));
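
The estimation step relies on a helper, pairdist, whose source is not shown here. The following is a minimal sketch of the same idea using only base MATLAB, assuming pairdist returns a symmetric matrix of Euclidean distances between rows. The names X, D, mask, and meandist_sketch are hypothetical, and the expansion of the squared norm is a swapped-in technique, not necessarily how pairdist works.

```matlab
% Hypothetical stand-in for the pairdist call (the real helper is not shown).
X = [0 0; 3 4; 6 8];                 % three example row vectors

% Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a*b'
sq = sum(X.^2, 2);                   % (n x 1) squared row norms
D2 = bsxfun(@plus, sq, sq') - 2 * (X * X');
D  = sqrt(max(D2, 0));               % clamp tiny negatives from round-off

% Average over each unordered pair once (strict upper triangle)
n = size(X, 1);
mask = triu(true(n), 1);             % true above the diagonal only
meandist_sketch = mean(D(mask));     % (5 + 10 + 5) / 3
```

One difference is worth noting: the listing selects entries with dists ~= 0, which drops the diagonal but also drops any pair of identical rows, whereas the strict upper-triangle mask here keeps such pairs in the average.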