Decision Tree augmented by Artificial Neural Network

This model divides the input space by fitting a decision tree and then fits an artificial neural network to each leaf. The decision tree is built using the rpart package and the neural networks using deepdive. Stacking predictions from other models is also supported.
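The tree-then-leaf-model idea described above can be sketched with rpart alone. This is a minimal illustration, not the deepdive implementation: `lm` is swapped in as a stand-in for the per-leaf neural network, and the data, the `minsplit` value and the names `dat`, `leaf`, `leafModels` and `pred` are assumptions made for the example.

```r
library(rpart)  # recommended package that ships with R

set.seed(1)
x <- data.frame(a = runif(500) * 100, b = runif(500) * 200)
y <- 20 * x$a + 30 * x$b + 10 + rnorm(500, sd = 5)
dat <- cbind(x, y = y)

# Step 1: partition the input space with a regression tree.
tree <- rpart(y ~ ., data = dat, method = "anova",
              control = rpart.control(minsplit = 150, cp = 0.01))

# tree$where gives, for each row, the index of its leaf in tree$frame.
leaf <- tree$where

# Step 2: fit one model per leaf (lm is a stand-in for the neural network).
leafModels <- lapply(split(dat, leaf), function(d) lm(y ~ a + b, data = d))

# Step 3: predict each row with the model that belongs to its leaf.
pred <- numeric(nrow(dat))
for (id in names(leafModels)) {
  rows <- leaf == as.integer(id)
  pred[rows] <- predict(leafModels[[id]], newdata = dat[rows, ])
}

sqrt(mean((pred - y)^2))  # training RMSE of the combined model
```

Each leaf model only sees the rows routed to it, so it can specialise on a simpler sub-region of the input space than a single global model would have to cover.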

deeptree(
  x,
  y,
  hiddenLayerUnits,
  activation = c("sigmoid", "sigmoid"),
  reluLeak = 0,
  modelType = "regress",
  iterations = 500,
  eta = 10^-2,
  seed = 2,
  gradientClip = 0.8,
  regularisePar = 0,
  optimiser = "adam",
  parMomentum = 0.9,
  inputSizeImpact = 1,
  parRmsPropZeroAdjust = 10^-8,
  parRmsProp = 0.9999,
  treeLeaves = NA,
  treeMinSplitPercent = 0.3,
  treeMinSplitCount = 30,
  treeCp = 0.01,
  stackPred = NA,
  printItrSize = 100,
  showProgress = T,
  stopError = 0.01,
  miniBatchSize = NA,
  useBatchProgress = T,
  ignoreNAerror = F
)

Arguments

x	a data frame with input variables
y	a data frame with output variable
hiddenLayerUnits	a numeric vector. The length of the vector indicates the number of hidden layers and each element gives the number of units in the corresponding layer. E.g. c(6,4) for two layers, one with 6 hidden units and the other with 4 hidden units. Note: the output layer is created automatically.
activation	one of "sigmoid","relu","sin","cos","none". The default is "sigmoid". Choose one activation per hidden layer.
reluLeak	numeric. Applicable when activation is "relu". Specify a value between 0 and 1, typically close to zero, e.g. 0.01 or 0.001. Default is 0.
modelType	one of "regress","binary","multiClass". "regress" for regression will create a linear single unit output layer. "binary" will create a single unit sigmoid activated layer. "multiClass" will create layer with units corresponding to number of output classes with softmax activation.
iterations	integer. Number of iterations (epochs) of backpropagation. The default value is 500.
eta	numeric. Hyperparameter that sets the learning rate for backpropagation. Eta determines whether and how fast training converges. Default is 10^-2.
seed	numeric. Sets the random seed. With "sin" activation, changing the seed can sometimes yield better results. Default is 2.
gradientClip	numeric. Hyperparameter that limits the gradient size for the weight update in backpropagation. It can take any positive value. Default is 0.8.
regularisePar	numeric. L2 regularisation parameter.
optimiser	one of "gradientDescent","momentum","rmsProp","adam". Default value "adam"
parMomentum	numeric. Applicable for optimisers "momentum" and "adam".
inputSizeImpact	numeric. Adjusts the gradient size by a factor of the percentage of rows in the input. For very small data sets, setting this to 0 could yield faster results. Default is 1.
parRmsPropZeroAdjust	numeric. Applicable for optimiser "rmsProp" and "adam"
parRmsProp	numeric. Applicable for optimisers "rmsProp" and "adam".
treeLeaves	vector. Optional. Leaf numbers from an externally trained tree model can be supplied here. If supplied, the model will not build its own tree and will just fit a neural network to the supplied leaves.
treeMinSplitPercent	numeric. Controls the depth of the tree by setting the minimum split count for leaf subdivision as a percentage of observations. The final minimum split is the maximum of the count calculated from treeMinSplitPercent and treeMinSplitCount. Default is 0.3. Range is 0 to 1.
treeMinSplitCount	numeric. Controls the depth of the tree by setting the minimum split count. The final minimum split is the maximum of the count calculated from treeMinSplitPercent and treeMinSplitCount. Default is 30.
treeCp	numeric. Complexity parameter of the tree; see `rpart.control`. Default is 0.01.
stackPred	vector. Predictions from buildnet or other models can be supplied here. If, for a certain leaf, the stackPred accuracy is better, then the stackPred predictions will be chosen.
printItrSize	numeric. Number of iterations after which a progress message is shown. Default value is 100; for fewer than 100 iterations, at least 5 messages will be shown.
showProgress	logical. T will show progress and F will not.
stopError	numeric. RMSE at which iterations can be stopped. Default is 0.01; can be set to NA if all iterations need to run.
miniBatchSize	integer. Set the mini batch size for mini batch gradient
useBatchProgress	logical. Applicable for mini-batch training. Setting T will show the RMSE on the batch and F will show the error on the full dataset. For large datasets set T.
ignoreNAerror	logical. Set T if iterations need to be stopped when predictions become NA.
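The way treeMinSplitPercent and treeMinSplitCount combine can be illustrated with a short calculation. This is a sketch under assumptions: the use of `round()` and the names `nObs`, `minSplit` and `control` are illustrative, and the package may round the percentage-based count differently. The values are those of the example call on this page (1000 rows, 0.4 and 100).

```r
library(rpart)  # recommended package that ships with R

nObs <- 1000                 # number of rows in x
treeMinSplitPercent <- 0.4
treeMinSplitCount <- 100
treeCp <- 0.01

# Final minimum split: the larger of the percentage-based count
# and the absolute count. round() is an assumption.
minSplit <- max(round(treeMinSplitPercent * nObs), treeMinSplitCount)
minSplit  # 400

# Control object of the kind rpart accepts for tree fitting.
control <- rpart.control(minsplit = minSplit, cp = treeCp)
control$minsplit
```

With these settings the percentage-based count (400) exceeds the absolute count (100), so a node needs at least 400 observations before a split is attempted, which keeps the tree shallow.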

Value

Returns a model object which can be passed into predict.deeptree.

Examples


library(deepdive)

# Simulated regression data: y is a linear function of three inputs
x <- data.frame(a = runif(1000) * 100,
                b = runif(1000) * 200,
                c = runif(1000) * 100)
y <- data.frame(y = 20 * x$a + 30 * x$b + 10 * x$c + 10)

# Fit a tree, then a neural network with two hidden layers to each leaf
deepTreeMod <- deeptree(x,
                        y,
                        hiddenLayerUnits = c(4, 4),
                        activation = c("relu", "sin"),
                        reluLeak = 0.01,
                        modelType = "regress",
                        iterations = 1000,
                        eta = 0.4,
                        seed = 2,
                        gradientClip = 0.8,
                        regularisePar = 0,
                        optimiser = "adam",
                        parMomentum = 0.9,
                        inputSizeImpact = 1,
                        parRmsPropZeroAdjust = 10^-8,
                        parRmsProp = 0.9999,
                        treeLeaves = NA,
                        treeMinSplitPercent = 0.4,
                        treeMinSplitCount = 100,
                        stackPred = NA,
                        stopError = 4,
                        miniBatchSize = 64,
                        useBatchProgress = TRUE,
                        ignoreNAerror = FALSE)
#> iteration 99: 3311.57772982348
#> iteration 199: 676.229136371436
#> iteration 299: 681.451736490737
#> iteration 399: 697.954182188333
#> iteration 499: 679.767521373061
#> iteration 599: 695.765674408348
#> iteration 699: 681.716800450833
#> iteration 799: 692.514114036877
#> iteration 899: 565.078296520105
#> iteration 1000: 616.383102030072
#> iteration 99: 325.047688794225
#> iteration 199: 324.560776204087
#> iteration 299: 271.144426390469
#> iteration 399: 346.765869445742
#> iteration 499: 439.995765378202
#> iteration 599: 393.555666167262
#> iteration 699: 326.655713622822
#> iteration 799: 334.13855373762
#> iteration 899: 377.889746444716
#> iteration 1000: 363.989876175523
#> iteration 99: 50.4331408481273
#> iteration 199: 65.9104203241474
#> iteration 299: 76.9476682386133
#> iteration 399: 85.6127089398539
#> iteration 499: 58.3006874413882
#> iteration 599: 23.1033140870675
#> iteration 699: 27.2316510495828
#> iteration 799: 24.2852994870429
#> iteration 899: 23.032737170571
#> iteration 1000: 23.1030235121181
#> iteration 99: 222.113044761898
#> iteration 199: 149.97293450678
#> iteration 299: 232.042283716635
#> iteration 399: 737.872387718822
#> iteration 499: 643.218348462475
#> iteration 599: 632.155467581962
#> iteration 699: 568.323077084781
#> iteration 799: 548.479709658453
#> iteration 899: 579.003428191187
#> iteration 1000: 596.568323681389
