locfitraw

PURPOSE

locfitraw: helper function for calling Locfit from MATLAB

SYNOPSIS

function [x,y,e]=locfitraw(varargin)

DESCRIPTION

 locfitraw: helper function for calling Locfit from MATLAB
  
  Usage: [x,y,e]=locfitraw( data )  {most basic usage, all defaults}

    Additional arguments are passed as name-value pairs, e.g.:
    [x,y,e]=locfitraw( x, 'alpha',[0.7,1.5] , 'family','rate' , 'ev','grid' , 'mg',100 ); 
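
As a rough illustration of the name-value convention, the sketch below mimics how the pairs might be assembled into the argument list of the R call, following the loop in the source listing of this function. It runs without R. The cell array `pairs` is a stand-in for the arguments after the first one, and numeric values are only referenced by the variable names (`alphaval`, `mgval`) that the wrapper would create in R.

```matlab
% Sketch only: builds the R command string, does not contact R.
% 'pairs' is a hypothetical stand-in for varargin(2:end).
pairs = {'alpha', [0.7 1.5], 'family', 'rate', 'ev', 'grid', 'mg', 100};

args = '';
for k = 1:2:numel(pairs)
    name  = pairs{k};
    value = pairs{k+1};
    if ischar(value)
        % String values are embedded as quoted R literals.
        args = sprintf('%s,%s="%s"', args, name, value);
    else
        % Numeric values would be sent to R as a variable named <name>val.
        args = sprintf('%s,%s=%sval', args, name, name);
    end
end

command = sprintf('fit<-locfit.raw( xdata %s )', args);
disp(command)
% prints: fit<-locfit.raw( xdata ,alpha=alphaval,family="rate",ev="grid",mg=mgval )
```

Because string values are pasted directly into the R command, names and string values should be valid R identifiers and literals.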

====================================================================

  Argument types:

      The first set of arguments ('x', 'y', 'weights', 'cens', and
      'base') specify the regression variables and associated
      quantities.
 
      Another set ('scale', 'alpha', 'deg', 'kern', 'kt', 'acri' and
      'basis') control the amount of smoothing: bandwidth, smoothing
      weights and the local model.
 
      'deriv' and 'dc' relate to derivative (or local slope) estimation.
 
      'family' and 'link' specify the likelihood family.
 
      'xlim' and 'renorm' may be used in density estimation.
 
      'ev', 'flim', 'mg' and 'cut' control the set of evaluation points.
 
      'maxk',  'itype', 'mint', 'maxit' and 'debug' control the Locfit
      algorithms, and will be rarely used.
 
      'geth' and 'sty' are used by other functions calling 'locfit.raw',
      and should not be used directly.
 
=========================================================================

  Arguments in detail:
 
        x: Vector (or matrix) of the independent variable(s). 
      ******************************
  NOTE:       The first argument (the data) is positional and is passed without a name.
              All other arguments must be given in 'name',value notation.
      ******************************

        y: Response variable for regression models. For density
           families, 'y' can be omitted. 
 
  weights: Prior weights for observations (reciprocal of variance, or
           sample size). 
 
     cens: Censoring indicators for hazard rate or censored regression.
           The coding is '1' (or 'TRUE') for a censored observation, and
           '0' (or 'FALSE') for uncensored observations. 
 
     base: Baseline parameter estimate. If provided, the local
           regression model is fitted as Y_i = b_i + m(x_i) + epsilon_i,
           with Locfit estimating the m(x) term. For regression models,
           this effectively subtracts b_i from Y_i. The advantage of the
           'base' formulation is that it extends to likelihood
           regression models. 
 
    scale: A scale to apply to each variable. This is especially
           important for multivariate fitting, where variables may be
           measured in non-comparable units. It is also used to specify
           the frequency for 'ang' terms. If 'scale=F' (the default) no
           scaling is performed. If 'scale=T', marginal standard
           deviations are used. Alternatively, a numeric vector can
           provide scales for the individual variables. 
 
    alpha: Smoothing parameter. A single number (e.g. 'alpha=0.7') is
           interpreted as a nearest neighbor fraction. With two
           components (e.g. 'alpha=c(0.7,1.2)' in R, or
           'alpha',[0.7,1.2] from MATLAB), the first component is a
           nearest neighbor fraction, and the second component is a
           fixed component. A third component is the penalty term in
           locally adaptive smoothing. 
 
      deg: Degree of local polynomial. Default: 2 (local quadratic).
           Degrees 0 to 3 are supported by almost all parts of the
           Locfit code. Higher degrees may work in some cases. 
 
     kern: Weight function, default = '"tcub"'. Other choices are
           '"rect"', '"trwt"', '"tria"', '"epan"', '"bisq"' and
           '"gauss"'. Choices may be restricted when derivatives are
           required; e.g. for confidence bands and some bandwidth
           selectors. 
 
       kt: Kernel type, '"sph"' (default); '"prod"'. In multivariate
           problems, '"prod"' uses a simplified product model which
           speeds up computations. 
 
     acri: Criterion for adaptive bandwidth selection.
 
    basis: User-specified basis functions. See 'lfbas' for more details
           on this argument.
 
    deriv: Derivative estimation. If 'deriv=1', the returned fit will be
           estimating the derivative (or more correctly, an estimate of
           the local slope). If 'deriv=c(1,1)' the second order
           derivative is estimated. 'deriv=2' is for the partial
           derivative, with respect to the second variable, in
           multivariate settings.  
 
       dc: Derivative adjustment.  
 
   family: Local likelihood family; '"gaussian"'; '"binomial"';
           '"poisson"'; '"gamma"' and '"geom"'. Density and rate
           estimation families are '"dens"', '"rate"' and '"hazard"'
           (hazard rate). If the family is preceded by a ''q'' (for
           example, 'family="qbinomial"'), quasi-likelihood variance
           estimates are used. Otherwise, the residual variance ('rv')
           is fixed at 1. The default family is '"qgauss"' if a response
           'y' is provided; '"density"' if no response is provided. 
 
     link: Link function for local likelihood fitting. Depending on the
           family, choices may be '"ident"', '"log"', '"logit"',
           '"inverse"', '"sqrt"' and '"arcsin"'. 
 
     xlim: For density estimation, Locfit allows the density to be
           supported on a bounded interval (or rectangle, in more than
           one dimension). The format should be 'c(ll,ur)' where 'll' is
           a vector of the lower bounds and 'ur' the upper bounds.
           Bounds such as [0,infty) are not supported, but can be
           effectively implemented by specifying a very large upper
           bound. 
 
   renorm: Local likelihood density estimates may not integrate exactly
           to 1. If 'renorm=T', the integral will be estimated
           numerically and the estimate rescaled. Presently this is
           implemented only in one dimension. 
 
       ev: Evaluation Structure, default = '"tree"'. Also available are
           '"phull"', '"data"', '"grid"', '"kdtree"', '"kdcenter"' and
           '"crossval"'. 'ev="none"' gives no evaluation points,
           effectively producing the global parametric fit. A vector or
           matrix of evaluation points can also be provided. 
 
     flim: A vector of lower and upper bounds for the evaluation
           structure, specified as 'c(ll,ur)'. This should not be
           confused with 'xlim'. It defaults to the data range. 
         
       mg: For the '"grid"' evaluation structure, 'mg' specifies the
           number of points on each margin. Default 10. Can be either a
           single number or vector. 
 
      cut: Refinement parameter for adaptive partitions. Default 0.8;
           smaller values result in more refined partitions. 
 
     maxk: Controls space assignment for evaluation structures. For the
           adaptive evaluation structures, it is impossible to be sure
           in advance how many vertices will be generated. If you get
           warnings about 'Insufficient vertex space', Locfit's default
           assignment can be increased by increasing 'maxk'. The default
           is 'maxk=100'. 
 
    itype: Integration type for density estimation. Available methods
           include '"prod"', '"mult"' and '"mlin"'; and '"haz"' for
           hazard rate estimation problems. The available integration
           methods depend on model specification (e.g. dimension, degree
           of fit). By default, the best available method is used. 
 
     mint: Points for numerical integration rules. Default 20. 
 
    maxit: Maximum iterations for local likelihood estimation. Default
           20. 
 
    debug: If > 0, prints out some debugging information.
 
     geth: Don't use!  
 
      sty: Style for special terms ('left', 'ang', etc.). Do not try to
           set this directly; call 'locfit' instead. 
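
To make the smoothing arguments more concrete, here is a minimal sketch of how 'alpha' and the default 'tcub' (tricube) kernel could translate into local weights around a single fitting point. It does not call Locfit. The data values, the fitting point `x0`, the rounding used to pick the k-th neighbor, and the rule of taking the larger of the nearest-neighbor and fixed components are all assumptions made for illustration; Locfit's internal computation may differ in detail.

```matlab
% Hypothetical 1-D data and fitting point (illustrative values only).
x     = [0.1 0.4 0.5 0.9 1.3 2.0 2.2 3.1 3.5 4.0];
x0    = 1.0;
alpha = [0.7 1.5];   % [nearest-neighbor fraction, fixed component]

% Nearest-neighbor bandwidth: distance to the k-th closest observation.
d   = sort(abs(x - x0));
k   = round(alpha(1) * numel(x));   % rounding rule is an assumption
hnn = d(k);

% Assumed combination rule: use the larger of the two components.
h = max(hnn, alpha(2));

% Tricube weights: W(u) = (1 - |u|^3)^3 for |u| < 1, and 0 otherwise.
u = abs(x - x0) / h;
w = (1 - min(u, 1).^3).^3;
```

Observations farther than `h` from `x0` receive zero weight, so a larger 'alpha' spreads the weight over more points and gives a smoother fit.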
    
==========================================================================

 Requires Windows, since R-(D)COM is Windows-specific.
  I am working on a platform-independent replacement.

 Requires the Matlab-R link package for MATLAB, available from
 http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=5051&objectType=file   
 (file MATLAB_RLINK.zip).

 Requires R to be installed first; see http://r-project.org
 (file rw1091.exe).

 Requires the R locfit package to be installed first.
 From within R, use the "Packages" menu, then "Install from CRAN".

 Requires R-(D)COM to be installed first, from
 http://lib.stat.cmu.edu/R/CRAN/contrib/extra/dcom/
 (get the latest EXE file, approx. 3 MB; file RSrv135.exe).

 The above packages should come bundled with this software for convenience,
 with the exception of locfit, which is easiest to install from within R.

 In values:

SOURCE CODE

function [x,y,e]=locfitraw(varargin)
% locfitraw locfit helper function to call from matlab

% Initialise outputs so that the early returns below do not raise
% "output argument not assigned" errors in the caller.
x = []; y = []; e = [];

% Check that the Matlab-R link toolbox is available
if ~exist('putRdata')
    fprintf('You need to install Matlab-R Link first (do: "help locfitraw" for info)\nThen install R-(D)COM\nThen install R\nThen install locfit from within R\nOnly works on Windows\n');
    return
end

% Connect to R only if not done so already; never disconnect
global RCONNECTED;
if isempty( RCONNECTED )
  % Try the open command
  [status,msg] = openR;
  if status ~= 1
    disp(['Problem connecting to R: ' msg]);
    return
  end
  evalR('library("locfit")'); % attach the locfit library
  RCONNECTED = 1;
end

% Minimal input validation
if nargin < 1
   error( 'At least one input argument required' );
end
if mod(nargin,2)==0
   error( 'Argument count must be odd: data followed by name-value pairs' );
end

% Send the data to R as a column vector
putRdata( 'xdata', varargin{1}(:) );

% Build the R argument list from the name-value pairs
args = '';
n = 2;
while n < length(varargin)
    if ischar(varargin{n+1})
      % String values are passed as quoted R literals
      args = sprintf( '%s,%s="%s"',args, varargin{n}, varargin{n+1} );
    else
      % Other values are sent to R as a variable named <name>val
      putRdata( sprintf('%sval',varargin{n}), varargin{n+1} );
      args = sprintf( '%s,%s=%sval',args, varargin{n}, varargin{n} );
    end
    n=n+2;
end

% Run the fit and extract the evaluation points
command=sprintf( 'fit<-locfit.raw( xdata %s )', args );
evalR( command );
evalR( 'out<-knots(fit,what=c("x","coef","nlx"))' );
%evalR( 'plot(fit)' );

% Fetch the results and sort them by x
out = getRdata( 'out' );
[x,ind]=sort(out(:,1),1);
y=out(ind,2);
e=out(ind,3);
%aic=getRdata('-2*$fit$dp$lk+2*$fit$dp$df1');
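
The last step of the listing reorders the three columns fetched from R (x, coef, nlx) so that all outputs are sorted by x. A small sketch of that step, using a made-up matrix in place of the real result of getRdata('out'), might look like this:

```matlab
% Mock stand-in for the matrix returned from R; the values are invented.
% Columns: evaluation point x, fitted coefficient, nlx.
out = [0.8 2.0 0.30;
       0.2 1.0 0.10;
       0.5 1.5 0.20];

% Sort by the first column and apply the same permutation to the others.
[x, ind] = sort(out(:,1));
y = out(ind, 2);
e = out(ind, 3);
```

Applying the single index vector `ind` to every column keeps each fitted value paired with its own evaluation point.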

Generated on Tue 16-Aug-2005 21:33:45 by m2html © 2003