rpart/0000755000176200001440000000000014173604075011411 5ustar liggesusersrpart/NAMESPACE0000644000176200001440000000132113306236017012617 0ustar liggesusersuseDynLib(rpart, .registration = TRUE, .fixes = "C_") export(meanvar, na.rpart, path.rpart, plotcp, post, printcp, prune, prune.rpart, rpart, rpart.control, rsq.rpart, snip.rpart, xpred.rpart) export(rpart.exp) # needed for one of the tests importFrom(grDevices, col2rgb, dev.cur, dev.off, postscript) importFrom(graphics, abline, axis, box, identify, legend, lines, mtext, par, plot, polygon, segments, text, title) import(stats) S3method(labels, rpart) S3method(meanvar, rpart) S3method(model.frame, rpart) S3method(plot, rpart) S3method(post, rpart) S3method(predict, rpart) S3method(print, rpart) S3method(prune, rpart) S3method(residuals, rpart) S3method(summary, rpart) S3method(text, rpart) rpart/ChangeLog0000644000176200001440000004014714170373107013165 0ustar liggesuserszzzz zzz zz 4.1-17 plot.rpart() gained three new arguments (branch.col, branch.lty, branch.lwd) for controlling the color, line type, and width of the branches. 2019 May 21 4.1-16 Updated rpart.matrix to use lapply instead of a loop 2019 Apr 11 4.1-15 Update saved test/example results because of changes to R random number generators 2018 Jul 31 4.1-14 Change post.rpart.Rd example so as to not write outside tempdir() Changed name of solder to solder.balance and added in solder from the survival package (larger than solder.balance). Now data matches between the packages. Modified vignettes slightly 2018 Jan 08 4.1-12 Merge ChangeLog files 2017 Mar 12 4.1-11 Include directly-needed headers, update o/p for R 3.4.0. 2015 Jun 29 4.1-10 Tweak imports. 2015 Feb 11 4.1-9 Update Korean translations. Remove some unused assignments. 2014 Mar 28 4.1-8 Update Polish translations. 2014 Mar 24 4.1-7 Update French and German translations. Fix array-overrun in gini.c (detected by valgrind checks for adabag). 
2014 Mar 07 4.1-6 model.frame() method could fail when the recorded call was rpart::rpart(). 2014 Jan 25 4.1-5 Avoid abbreviation in tests/treble.R More comprehensive Description: field. 2013 Dec 10 4.1-4 Change pre-defined structure sizes to mitigate false positives from Undefined Behaviour Sanitizer. 2013 Sep 01 4.1-3 Document TMT change to predict() output. 2013 Aug 15 4.1-2 Replace calls to as.name(). Remove unused and un-exported rpartpl(). Correct plot.rpart.Rd as to where things are stored. Increase version dependence to >= 2.15.0, remove conditional paste0 from this package. 2013 Mar 20 4.1-1 Add ko translations, update reference output for 3.0.0. 2012 Nov 29 4.1-0 Remove rpconvert() (converted rpart2 trees). Clean up a lot of the R code. C code reformatted with GNU indent and -i4 -nut -ncdb -d1 -br -ce -il0 -npcs -brs then use whitespace-cleanup in Emacs. Call R_CheckUserInterrupts at each phase of cross-validation. Ensure that surrogates really are better than the default split (adj > 1e-10), and that at least 2 cases of non-zero weight go each way. Removed unused file src/s_xpred.c. 2012 Nov 18 4.0-3 Add 'importance' to rpart object and to summary(): from TMT. 'minlength' arg for tree() method. Improved handling of weights with zero fit, including bugfix for ordered factors. free_tree was commented out in rpart.c, and could crash as the structure was not zeroed. As a precaution, pointers which are freed are NULLed. s_xpred.c did not free the first instance of a tree. make_cp_list used calloc but did not free. Spell-check help pages and vignettes. Several examples needed par(xpd = TRUE) at default plotting sizes. 2012 Nov 11 4.0-2 Add car90 dataset from TMT. 'agree' in choose_surg.c needs to be double for fractional weights. Not use paste0() so R 2.14.x works. 2012 Oct 26 4.0-1 Merge in updates from TMT: see inst/NEWS.Rd and below Use .Call for all C code. Use an environment in the package for 'parms' from plot.rpart(). 
Force byte-compilation for consistency. 2012 Oct 03 3.1-55 Force use of registered symbols in R >= 3.0.0 Update Polish translations. Work on message formats. text(fancy = TRUE) gains a 'bg' argument. 2012 Jun 27 3.1-54 Add Polish translations. 2012 Jun 01 3.1-53 rpart, rpart.matrix: allow backticks in formulae. tests/backtick.R: regression test 2012 Mar 04 3.1-52 src/xval.c: ensure unused code is not compiled in. 2012 Jan 11 3.1-51 Change description of 'margin' in ?plot.rpart as suggested by Bill Venables. 2011 Apr 09 3.1-50 Change licence to GPL-2 | GPL-3 Remove set-but-unused variable in src/xval.c 2011 Mar 06 3.1-49 Update testall.Rout.save for R 2.13.0 2010 Dec 08 3.1-48 Avoid partial match to args, unnecessary as.vector. Update reference output for survival change. Correction to plot.rpart(compress = TRUE) from Stephen Milborrow. 2010 Nov 03 3.1-47 Update rpart-Ex.Rout.save for 2.12.x 2010 Jan 03 3.1-46 Update rpart-Ex.Rout.save 2009 Jul 28 3.1-45 Add rpart-Ex.Rout.save file 2009 May 18 3.1-44 Add German translations 2009 Mar 09 3.1-43 Spelling in man/snip.rpart.Rd. Spacing issue in tests/testall.Rout.save Remove environments from fit$functions if basic plotcp() allows 'ylim' to be passed in 2008 Oct 21 3.1-42 Make use of 1L etc, update plot.rpart to use dev.new 2008 Apr 10 3.1-41 Add Russian translations 2008 Mar 28 3.1-40 cosmetics on .Rd files, Date: field in DESCRIPTION 2008 Feb 18 3.1-39 summary.rpart was missing a drop=FALSE. 2007 Oct 03 3.1-38 Remove obsolete \non_function{} notation. 2007 Jul 26 3.1-37 Correct spelling errors in man pages DESCRIPTION: GPL-2 only point to www.r-project.org for GPL-2. 2007 Jun 12 3.1-36 Qualify nchar() where needed Update tests/testall.Rout.save for 2.6.x Add reference to usersplits.R in ?rpart. 2007 Feb 23 3.1-35 Correct 'label' in text.rpart C-level formatg is replaced by sprintf. 
2006 Dec 24 3.1-34 Spelling corrections 2006 Nov 29 3.1-33 Use control=NULL in deparsed calls 2006 Sep 26 3.1-32 Missing 'drop=FALSE' in rpart, add tests/surv_test.R 2006 Sep 04 3.1-31 Add depends on standard packages. 2006 Jul 05 3.1-30 Update output for R 2.4.0's naprint Expand the LICENCE, and install it Update cu.summary.rda 2006 Apr 13 3.1-29 Update tests output for changes in all.equal 2005 Dec 30 3.1-28 Use registered symbols in .C/.Call 2005 Dec 09 3.1-27 Add French and en@quot translations. 2005 Nov 15 3.1-26 Add back entry-point registration. 2005 Nov 09 3.1-25 Drop obsolete test for existence of .checkMFClasses 2005 Oct 17 3.1-24 Add missing drop=FALSE in na.rpart. Clarify predict.rpart.Rd and rpart.object.Rd. Add na.action arg to predict.rpart (instead of using the na.action used during fitting). 2005 Apr 15 3.1-23 Use xpd=NA in example(rpart) 2005 Feb 01 3.1.22 Improve error messages for possible translation. 2004 Nov 17 3.1-21 Change logic for setting params on a device in plot.rpart. text.rpart.Rd: Mention use of xpd=TRUE. 2004 Aug 25 3.1-20 Stop attempts to plot a degenerate tree 2004 Aug 03 3.1-18 Conversion for R 2.0.0 & LazyData 2004 Jun 22 3.1-17 Fix possible use of uninitialized `split' in bsplit.c Add drop=FALSE for probs prediction for a single case. 2004 Jun 06 3.1-16 Replace long* by int* in rpartexp2.c 2003 Dec 08 3.1-15 Update NAMESPACE for R 1.9.0 Capitalization issues in DESCRIPTION file 2003 Nov 18 3.1-14 Test newdata types in predict.rpart Correct documentation for `y' in rpart.Rd 2003 Jul 20 3.1-13 Remove unused vars 2003 Mar 15 3.1-12 Update NAMESPACE file Use post not post.rpart 2003 Mar 03 3.1-11 Reinstate rpart.matrix etc, as ipred used it (even though they were documented as for use in rpart). 2003 Mar 01 3.1-10 Use namespace, REprintf. 2002 Dec 10 3.1-9 Apparent typo in rpartcallback.s spotted by Torsten Hothorn. 
formatg uses e+/-0n not 00n under Windows 2002 Jun 20 3.1-8 Remove use of registration fiasco 2002 Jun 05 3.1-7 T -> TRUE in tests 2002 Mar 26 3.1-6 based on rpart3 'release'. Bug fix from TMT for empty classes in training set. Bug fix for prediction from root-only tree. Register .C/.Call entry points. Add PACKAGE= to .C/.Call calls. Replace is.Surv by its definition Don't need FUN1 in text.rpart any more 2002 Jan 14 3.1-5 path.rpart needs descendants(), node.match(). 2002 Jan 04 3.1-4 Allow ylim to be passed to plotcp. Add NAOK=TRUE to formatg. Workaround for multiple symbols for MacOS X. 2001 Nov 11 3.1-3 Change to zero-split case in rpart. Fixes from TMT re pruning single-node trees. 2001 Sep 25 3.1-2 Fixes to predict.rpart, residuals.rpart. 2001 Aug 23 3.1-1 Further fixes from TMT, xpred.rpart was not interpreting fit$parms correctly. 2001 Aug 08 3.1-0 Further updates from TMT. More documentation updates and corrections. 2001 Jul 25 3.0-2 Correct documentation for predict.rpart. Remove left-over frame$yprob in residuals.rpart. Use >= vs < in the labels for continuous splits. 2001 Jul 03 3.0-1 Restore use of FUN1 in text.rpart, as NAs are handled differently in R. 2001 May 25 3.0-0 New sources from TMT with user-specified splits. Use format(nsmall=) and naresid/naprint from R 1.3.0. Change na.rpart to make use of passing down the terms attribute in R 1.3.0. Explicitly get/set .rpart.parms* in user workspace. 2001 Mar 31 2.0-3 Add priority: recommended Re-licence under GPL2 2000 Dec 05 2.0-2 Update for R 1.2.0: more careful use of malloc 2000 Aug 12 2.0-1 Update for 2000/02/25 release of rpart, which added case weights 2000 Feb 07 1.1-2 Header file changes for 0.99.0 (especially re error) Escape # in post.part.Rd and text.rpart.Rd 1999 Dec 15 1.1-1 New version which uses weights. Change all occurrences of longs. 1999 May 11 1.0-7 Fix bug in graycode.c, from TMT. All the examples now run correctly. 1999 Apr 1.0-6 Add index of (test) datasets. 
1999 Feb 24 1.0-5 Correct rpart.branch.s to get(parms, inherits=TRUE). Add tolerance to tree.depth, needed for the Windows version. 1999 Jan 06 1.0-4 Remove model.frame.rp, which was no longer needed now model.frame.default uses xlevels. Modified rpart.matrix to allow - in model formulae. Examples are now all executable (or commented out). 1998 Jul 24 1.0-3 Added identify.rpart to work around R's limited identify. text is now generic in R, so removed from zzz.R. levels is now generic in R, so removed from zzz.R. Manual pages re-converted with Sd2Rd version 0.3-1. predict now uses xlevels to force agreement of levels of factors in newdata. 1998 Jun 22 1.0-2 Manual pages converted with Details section snip.rpart implemented. 1998 Jun 16 1.0-1 Original port ------------------- Former file PORTING ------------------- src/*.{h,c} long -> int man/*.Rd convert *.d by Sd2Rd Don't implement code using naresid, which R does not have. F -> FALSE, T -> TRUE R/labels.rpart.s replace call to prlabel by S code. R/model.frame.rpart.s deparse calls R/na.rpart.s as x has no attributes in R, redesign R/plot.rpart.s change frame=0 to .GlobalEnv R/post.rpart.s change `title' trickery R/print.rpart.s remove nsmall in call to format attr(x, 'ylevels') not 'ylevel' R/rpart.branch.s get with inherits not frame=0 R/rpart.matrix.s adjust for different terms structure R/rpart.s single -> double sys.parent() -> sys.frame(sys.parent()) R/rpartco.s remove frame=0 several times R/snip.rpart.mouse.s drop frame=0 R/summary.rpart.s remove justify="left" in format remove comma in paste(... ,,collapse ...) attr(x, 'ylevels') not 'ylevel' R/text.rpart.s text.default prints "NA", so remove these remove density=0 in calls to polygon R/xpred.rpart.s single -> double sys.parent() -> sys.frame(sys.parent()) R/zzz.R a few missing functions. 
------------------- Former file ChangeLog.TMT ------------------- The changes documented since 3/2002, which is the date stated in Brian's version of the Description file. In Sept 2012 the current R version and the Mayo version were merged. 14March02: When y was a factor, with no instances of one of the levels in the "middle" of the levels list, the program would generate NA due to division by 0. Fairly simple fixes to gini.c and rpart.s. The bug, and a very nicely documented test case, was supplied by Matthew Wiener. 8Aug02: Small bug pointed out by Kai Yu in rundown.c - a missing pair of {}, which only apply if usesurrogate < 2 and there are lots of missings. Appears that the se of the xval error would be too small. 29Oct02: Error in xpred.rpart.s, for a user split with a response vector >1, the "yback" vector was too short. (Diff the lines creating eframe with those of rpart.s, to see the obvious oversight!) Leads to a core dump. 29Oct02: Fix error in branch.c, which was not watching out for missing values in the surrogate variable. Found due to close reading of the C code by Kai Yu. (I'm impressed!) 12Nov02: Add the "return.all" argument to xpred.rpart, to fit the needs of Dan Schaid. Add the test case xpred.s to test it. 28Nov02: Major changes to how indexing is done. In the older version, at a lower branch on the tree one would find the following code again and again: for (i=0; i [![CRAN_STATUS_BADGE](http://www.r-pkg.org/badges/version/rpart)](https://CRAN.R-project.org/package=rpart) [![Downloads](http://cranlogs.r-pkg.org/badges/rpart)](https://CRAN.R-project.org/package=rpart) [![Travis-CI Build Status](https://travis-ci.org/bethatkinson/rpart.svg?branch=master)](https://travis-ci.org/bethatkinson/rpart) This is the source code for the `rpart` package, which is a recommended package in R. It gets posted to the comprehensive R archive (CRAN) as needed after undergoing a thorough testing. 
## Overview

The `rpart` code builds classification or regression models of a very general structure using a two stage procedure; the resulting models can be represented as binary trees. The package implements many of the ideas found in the CART (Classification and Regression Trees) book and programs of Breiman, Friedman, Olshen and Stone. Because CART is the trademarked name of a particular software implementation of these ideas and `tree` was used for the Splus routines of Clark and Pregibon, a different acronym - Recursive PARTitioning or rpart - was chosen.
rpart/man/0000755000176200001440000000000014170373107012160 5ustar liggesusersrpart/man/rsq.rpart.Rd0000644000176200001440000000151313306236017014401 0ustar liggesusers\name{rsq.rpart} \alias{rsq.rpart} \title{ Plots the Approximate R-Square for the Different Splits } \description{ Produces 2 plots. The first plots the r-square (apparent and apparent - from cross-validation) versus the number of splits. The second plots the Relative Error(cross-validation) +/- 1-SE from cross-validation versus the number of splits. } \usage{ rsq.rpart(x) } \arguments{ \item{x}{ fitted model object of class \code{"rpart"}. This is assumed to be the result of some function that produces an object with the same named components as that returned by the \code{rpart} function. }} \section{Side Effects}{ Two plots are produced. } \note{ The labels are only appropriate for the \code{"anova"} method. } \examples{ z.auto <- rpart(Mileage ~ Weight, car.test.frame) rsq.rpart(z.auto) } \keyword{tree} rpart/man/rpart-internal.Rd0000644000176200001440000000222013306236017015403 0ustar liggesusers\name{rpart-internal} \alias{pred.rpart} \alias{rpart.matrix} \title{ Internal Functions } \description{ Internal functions, only used by packages \pkg{rpart} and \pkg{ipred}. 
} \usage{ pred.rpart(fit, x) rpart.anova(y, offset, parms, wt) rpart.class(y, offset, parms, wt) rpart.matrix(frame) rpart.poisson(y, offset, parms, wt) rpartco(tree, parms) } \arguments{ \item{fit}{a tree fitted by \code{rpart}.} \item{x}{a matrix of predictors.} \item{y}{the responses.} \item{offset}{an offset, or \code{NULL}.} \item{parms}{a list of parameters, usually empty.} \item{wt}{case weights.} \item{frame}{model frame (from call to \code{rpart})} \item{tree}{a tree fitted by \code{rpart}.} } \value{ For \code{rpartco} the x,y plotting coordinates of the nodes. \code{rpart.anova}, \code{rpart.class} and \code{rpart.poisson} return a list with components \item{y}{(adjusting for \code{offset} if necessary),} \item{parms}{as input,} \item{numresp}{the number of responses,} \item{summary}{a function to be invoked by \code{\link{summary.rpart}},} \item{text}{a function to be invoked by \code{\link{text.rpart}}.} } \keyword{internal} rpart/man/predict.rpart.Rd0000644000176200001440000000731713306236017015236 0ustar liggesusers\name{predict.rpart} \alias{predict.rpart} \title{ Predictions from a Fitted Rpart Object } \description{ Returns a vector of predicted responses from a fitted \code{rpart} object. } \usage{ \method{predict}{rpart}(object, newdata, type = c("vector", "prob", "class", "matrix"), na.action = na.pass, \dots) } \arguments{ \item{object}{ fitted model object of class \code{"rpart"}. This is assumed to be the result of some function that produces an object with the same named components as that returned by the \code{rpart} function. } \item{newdata}{ data frame containing the values at which predictions are required. The predictors referred to in the right side of \code{formula(object)} must be present by name in \code{newdata}. If missing, the fitted values are returned. } \item{type}{ character string denoting the type of predicted value returned. 
If the \code{rpart} object is a classification tree, then the default is to return \code{prob} predictions, a matrix whose columns are the probability of the first, second, etc. class. (This agrees with the default behavior of \code{\link[tree]{tree}}). Otherwise, a vector result is returned. } \item{na.action}{a function to determine what should be done with missing values in \code{newdata}. The default is to pass them down the tree using surrogates in the way selected when the model was built. Other possibilities are \code{\link{na.omit}} and \code{\link{na.fail}}. } \item{\dots}{ further arguments passed to or from other methods. } } \value{ A new object is obtained by dropping \code{newdata} down the object. For factor predictors, if an observation contains a level not used to grow the tree, it is left at the deepest possible node and \code{frame$yval} at the node is the prediction. If \code{type = "vector"}:\cr vector of predicted responses. For regression trees this is the mean response at the node, for Poisson trees it is the estimated response rate, and for classification trees it is the predicted class (as a number). If \code{type = "prob"}:\cr (for a classification tree) a matrix of class probabilities. If \code{type = "matrix"}:\cr a matrix of the full responses (\code{frame$yval2} if this exists, otherwise \code{frame$yval}). For regression trees, this is the mean response, for Poisson trees it is the response rate and the number of events at that node in the fitted tree, and for classification trees it is the concatenation of at least the predicted class, the class counts at that node in the fitted tree, and the class probabilities (some versions of \pkg{rpart} may contain further columns). If \code{type = "class"}:\cr (for a classification tree) a factor of classifications based on the responses. } \details{ This function is a method for the generic function predict for class \code{"rpart"}. 
It can be invoked by calling \code{predict} for an object of the appropriate class, or directly by calling \code{predict.rpart} regardless of the class of the object. } \seealso{ \code{\link{predict}}, \code{\link{rpart.object}} } \examples{ z.auto <- rpart(Mileage ~ Weight, car.test.frame) predict(z.auto) fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis) predict(fit, type = "prob") # class probabilities (default) predict(fit, type = "vector") # level numbers predict(fit, type = "class") # factor predict(fit, type = "matrix") # level number, class frequencies, probabilities sub <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25)) fit <- rpart(Species ~ ., data = iris, subset = sub) fit table(predict(fit, iris[-sub,], type = "class"), iris[-sub, "Species"]) } \keyword{tree} rpart/man/labels.rpart.Rd0000644000176200001440000000427013306236017015041 0ustar liggesusers\name{labels.rpart} \alias{labels.rpart} \title{ Create Split Labels For an Rpart Object } \description{ This function provides labels for the branches of an \code{rpart} tree. } \usage{ \method{labels}{rpart}(object, digits = 4, minlength = 1L, pretty, collapse = TRUE, ...) } \arguments{ \item{object}{ fitted model object of class \code{"rpart"}. This is assumed to be the result of some function that produces an object with the same named components as that returned by the \code{rpart} function. } \item{digits}{ the number of digits to be used for numeric values. All of the \code{rpart} functions that call labels explicitly set this value, with \code{options("digits")} as the default. } \item{minlength}{ the minimum length for abbreviation of character or factor variables. If \code{0} no abbreviation is done; if \code{1} single English letters are used, first lower case then upper case (with a maximum of 52 levels). If the value is greater than \code{1}, the \code{\link{abbreviate}} function is used, passed the \code{minlength} argument. 
} \item{pretty}{ an argument included for compatibility with the \pkg{tree} package: \code{pretty = 0} implies \code{minlength = 0L}, \code{pretty = NULL} implies \code{minlength = 1L}, and \code{pretty = TRUE} implies \code{minlength = 4L}. } \item{collapse}{ logical. The returned set of labels is always of the same length as the number of nodes in the tree. If \code{collapse = TRUE} (default), the returned value is a vector of labels for the branch leading into each node, with \code{"root"} as the label for the top node. If \code{FALSE}, the returned value is a two column matrix of labels for the left and right branches leading out from each node, with \code{"leaf"} as the branch labels for terminal nodes. } \item{\dots}{optional arguments to \code{abbreviate}.} } \value{ Vector of split labels (\code{collapse = TRUE}) or matrix of left and right splits (\code{collapse = FALSE}) for the supplied \code{rpart} object. This function is called by printing methods for \code{rpart} and is not intended to be called directly by the users. } \seealso{ \code{\link{abbreviate}} } \keyword{tree} rpart/man/car.test.frame.Rd0000644000176200001440000000305013306236017015257 0ustar liggesusers\name{car.test.frame} \alias{car.test.frame} \title{Automobile Data from 'Consumer Reports' 1990} \description{ The \code{car.test.frame} data frame has 60 rows and 8 columns, giving data on makes of cars taken from the April, 1990 issue of \emph{Consumer Reports}. This is part of a larger dataset, some columns of which are given in \code{\link{cu.summary}}. 
} \usage{ car.test.frame } \format{ This data frame contains the following columns: \describe{ \item{\code{Price}}{ a numeric vector giving the list price in US dollars of a standard model } \item{\code{Country}}{ of origin, a factor with levels \samp{France}, \samp{Germany}, \samp{Japan} , \samp{Japan/USA}, \samp{Korea}, \samp{Mexico}, \samp{Sweden} and \samp{USA} } \item{\code{Reliability}}{ a numeric vector coded \code{1} to \code{5}. } \item{\code{Mileage}}{ fuel consumption miles per US gallon, as tested. } \item{\code{Type}}{ a factor with levels \code{Compact} \code{Large} \code{Medium} \code{Small} \code{Sporty} \code{Van} } \item{\code{Weight}}{ kerb weight in pounds. } \item{\code{Disp.}}{ the engine capacity (displacement) in litres. } \item{\code{HP}}{ the net horsepower of the vehicle. } }} \source{ \emph{Consumer Reports}, April, 1990, pp. 235--288 quoted in John M. Chambers and Trevor J. Hastie eds. (1992) \emph{Statistical Models in S}, Wadsworth and Brooks/Cole, Pacific Grove, CA, pp. 46--47. } \seealso{ \code{\link{car90}}, \code{\link{cu.summary}} } \examples{ z.auto <- rpart(Mileage ~ Weight, car.test.frame) summary(z.auto) } \keyword{datasets} rpart/man/plotcp.Rd0000644000176200001440000000273113306236017013751 0ustar liggesusers\name{plotcp} \alias{plotcp} \title{ Plot a Complexity Parameter Table for an Rpart Fit } \description{ Gives a visual representation of the cross-validation results in an \code{rpart} object. } \usage{ plotcp(x, minline = TRUE, lty = 3, col = 1, upper = c("size", "splits", "none"), \dots) } \arguments{ \item{x}{ an object of class \code{"rpart"} } \item{minline}{ whether a horizontal line is drawn 1SE above the minimum of the curve. } \item{lty}{ line type for this line } \item{col}{ colour for this line } \item{upper}{ what is plotted on the top axis: the size of the tree (the number of leaves), the number of splits or nothing. } \item{\dots}{ additional plotting parameters } } \value{ None. 
} \section{Side Effects}{ A plot is produced on the current graphical device. } \details{ The set of possible cost-complexity prunings of a tree form a nested set. For the geometric means of the intervals of values of \code{cp} for which a pruning is optimal, a cross-validation has (usually) been done in the initial construction by \code{\link{rpart}}. The \code{cptable} in the fit contains the mean and standard deviation of the errors in the cross-validated prediction against each of the geometric means, and these are plotted by this function. A good choice of \code{cp} for pruning is often the leftmost value for which the mean lies below the horizontal line. } \seealso{ \code{\link{rpart}}, \code{\link{printcp}}, \code{\link{rpart.object}} } \keyword{tree} rpart/man/rpart.exp.Rd0000644000176200001440000000222513306236017014371 0ustar liggesusers\name{rpart.exp} \alias{rpart.exp} \title{Initialization function for exponential fitting} \description{ This function does the initialization step for rpart, when the response is a survival object. It rescales the data so as to have an exponential baseline hazard and then uses Poisson methods. This function would rarely if ever be called directly by a user. } \usage{ rpart.exp(y, offset, parms, wt)} \arguments{ \item{y}{the response, which will be of class \code{Surv}} \item{offset}{optional offset} \item{parms}{parameters controlling the fit. This is a list with components \code{shrink} and \code{method}. The first is the prior for the coefficient of variation of the predictions. The second is either \code{"deviance"} or \code{"sqrt"} and is the measure used for cross-validation. 
If values are missing the defaults are used, which are \code{"deviance"} for the method, and a shrinkage of 1.0 for the deviance method and 0 for the square root.} \item{wt}{case weights, if present} } \value{a list with the necessary initialization components} \author{Terry Therneau} \seealso{\code{\link{rpart}}} \keyword{ tree } rpart/man/rpart.object.Rd0000644000176200001440000001232613306236017015046 0ustar liggesusers\name{rpart.object} \alias{rpart.object} \title{ Recursive Partitioning and Regression Trees Object } \description{ These are objects representing fitted \code{rpart} trees. } \section{Structure}{ The following components must be included in a legitimate \code{rpart} object. } \value{ \item{frame}{ data frame with one row for each node in the tree. The \code{row.names} of \code{frame} contain the (unique) node numbers that follow a binary ordering indexed by node depth. Columns of \code{frame} include \code{var}, a factor giving the names of the variables used in the split at each node (leaf nodes are denoted by the level \code{""}), \code{n}, the number of observations reaching the node, \code{wt}, the sum of case weights for observations reaching the node, \code{dev}, the deviance of the node, \code{yval}, the fitted value of the response at the node, and \code{splits}, a two column matrix of left and right split labels for each node. Also included in the frame are \code{complexity}, the complexity parameter at which this split will collapse, \code{ncompete}, the number of competitor splits recorded, and \code{nsurrogate}, the number of surrogate splits recorded. Extra response information which may be present is in \code{yval2}, which contains the number of events at the node (poisson tree), or a matrix containing the fitted class, the class counts for each node, the class probabilities and the \sQuote{node probability} (classification trees). 
} \item{where}{ an integer vector of the same length as the number of observations in the root node, containing the row number of \code{frame} corresponding to the leaf node that each observation falls into. } \item{call}{ an image of the call that produced the object, but with the arguments all named and with the actual formula included as the formula argument. To re-evaluate the call, say \code{update(tree)}. } \item{terms}{ an object of class \code{c("terms", "formula")} (see \code{\link{terms.object}}) summarizing the formula. Used by various methods, but typically not of direct relevance to users. } \item{splits}{ a numeric matrix describing the splits: only present if there are any. The row label is the name of the split variable, and columns are \code{count}, the number of observations (which are not missing and are of positive weight) sent left or right by the split (for competitor splits this is the number that would have been sent left or right had this split been used, for surrogate splits it is the number missing the primary split variable which were decided using this surrogate), \code{ncat}, the number of categories or levels for the variable (\code{+/-1} for a continuous variable), \code{improve}, which is the improvement in deviance given by this split, or, for surrogates, the concordance of the surrogate with the primary, and \code{index}, the numeric split point. The last column \code{adj} gives the adjusted concordance for surrogate splits. For a factor, the \code{index} column contains the row number of the csplit matrix. For a continuous variable, the sign of \code{ncat} determines whether the subset \code{x < cutpoint} or \code{x > cutpoint} is sent to the left. } \item{csplit}{ an integer matrix. (Only present if at least one of the split variables is a factor or ordered factor.) There is a row for each such split, and the number of columns is the largest number of levels in the factors. 
The row to use is given by the \code{index} column of the \code{splits} matrix. The columns record \code{1} if that level of the factor goes to the left, \code{3} if it goes to the right, and \code{2} if that level is not present at this node of the tree (or not defined for the factor). } \item{method}{ character string: the method used to grow the tree. One of \code{"class"}, \code{"exp"}, \code{"poisson"}, \code{"anova"} or \code{"user"} (if splitting functions were supplied). } \item{cptable}{ a matrix of information on the optimal prunings based on a complexity parameter. } \item{variable.importance}{ a named numeric vector giving the importance of each variable. (Only present if there are any splits.) When printed by \code{\link{summary.rpart}} these are rescaled to add to 100. } \item{numresp}{ integer number of responses; the number of levels for a factor response. } \item{parms, control}{ a record of the arguments supplied, with defaults filled in. } \item{functions}{ the \code{summary}, \code{print} and \code{text} functions for the method used. } \item{ordered}{ a named logical vector recording for each variable if it was an ordered factor. } \item{na.action}{ (where relevant) information returned by \code{\link{model.frame}} on the special handling of \code{NA}s derived from the \code{na.action} argument. } There may be \link{attributes} \code{"xlevels"} and \code{"levels"} recording the levels of any factor splitting variables and of a factor response respectively. Optional components include the model frame (\code{model}), the matrix of predictors (\code{x}) and the response variable (\code{y}) used to construct the \code{rpart} object. } \seealso{ \code{\link{rpart}}. 
} \keyword{tree} \keyword{methods} rpart/man/meanvar.rpart.Rd0000644000176200001440000000236313306236017015231 0ustar liggesusers\name{meanvar.rpart} \alias{meanvar} \alias{meanvar.rpart} \title{ Mean-Variance Plot for an Rpart Object } \description{ Creates a plot on the current graphics device of the deviance of the node divided by the number of observations at the node. Also returns the node number. } \usage{ meanvar(tree, \dots) \method{meanvar}{rpart}(tree, xlab = "ave(y)", ylab = "ave(deviance)", \dots) } \arguments{ \item{tree}{ fitted model object of class \code{"rpart"}. This is assumed to be the result of some function that produces an object with the same named components as that returned by the \code{rpart} function. } \item{xlab}{ x-axis label for the plot. } \item{ylab}{ y-axis label for the plot. } \item{\dots}{ additional graphical parameters may be supplied as arguments to this function. } } \value{ an invisible list containing the following vectors is returned. \item{x}{ fitted value at terminal nodes (\code{yval}). } \item{y}{ deviance of node divided by number of observations at node. } \item{label}{ node number. } } \section{Side Effects}{ a plot is put on the current graphics device. } \seealso{ \code{\link{plot.rpart}}. } \examples{ z.auto <- rpart(Mileage ~ Weight, car.test.frame) meanvar(z.auto, log = 'xy') } \keyword{tree} rpart/man/solder.balance.Rd0000644000176200001440000000322513330424646015327 0ustar liggesusers\name{solder.balance} \alias{solder.balance} \alias{solder} \title{Soldering of Components on Printed-Circuit Boards} \description{ The \code{solder.balance} data frame has 720 rows and 6 columns, representing a balanced subset of a designed experiment varying 5 factors on the soldering of components on printed-circuit boards. The \code{solder} data frame is the full version of the data with 900 rows. It is located in both the rpart and the survival packages. 
} \usage{ solder } \format{ This data frame contains the following columns: \describe{ \item{\code{Opening}}{ a factor with levels \samp{L}, \samp{M} and \samp{S} indicating the amount of clearance around the mounting pad. } \item{\code{Solder}}{ a factor with levels \samp{Thick} and \samp{Thin} giving the thickness of the solder used. } \item{\code{Mask}}{ a factor with levels \samp{A1.5}, \samp{A3}, \samp{B3} and \samp{B6} indicating the type and thickness of mask used. } \item{\code{PadType}}{ a factor with levels \samp{D4}, \samp{D6}, \samp{D7}, \samp{L4}, \samp{L6}, \samp{L7}, \samp{L8}, \samp{L9}, \samp{W4} and \samp{W9} giving the size and geometry of the mounting pad. } \item{\code{Panel}}{ \code{1:3} indicating the panel on a board being tested. } \item{\code{skips}}{ a numeric vector giving the number of visible solder skips. } }} \source{ John M. Chambers and Trevor J. Hastie eds. (1992) \emph{Statistical Models in S}, Wadsworth and Brooks/Cole, Pacific Grove, CA. } \examples{ fit <- rpart(skips ~ Opening + Solder + Mask + PadType + Panel, data = solder.balance, method = "anova") summary(residuals(fit)) plot(predict(fit), residuals(fit)) } \keyword{datasets} rpart/man/stagec.Rd0000644000176200001440000000244213306236017013715 0ustar liggesusers\name{stagec} \alias{stagec} \docType{data} \title{Stage C Prostate Cancer} \description{A set of 146 patients with stage C prostate cancer, from a study exploring the prognostic value of flow cytometry.} \usage{data(stagec)} \format{ A data frame with 146 observations on the following 8 variables. 
\describe{ \item{\code{pgtime}}{Time to progression or last follow-up (years)} \item{\code{pgstat}}{1 = progression observed, 0 = censored} \item{\code{age}}{age in years} \item{\code{eet}}{early endocrine therapy, 1 = no, 2 = yes} \item{\code{g2}}{percent of cells in G2 phase, as found by flow cytometry} \item{\code{grade}}{grade of the tumor, Farrow system} \item{\code{gleason}}{grade of the tumor, Gleason system} \item{\code{ploidy}}{the ploidy status of the tumor, from flow cytometry. Values are \samp{diploid}, \samp{tetraploid}, and \samp{aneuploid}} } } \details{ A tumor is called diploid (normal complement of dividing cells) if the fraction of cells in G2 phase was determined to be 13\% or less. Aneuploid cells have a measurable fraction with a chromosome count that is neither 24 nor 48, for these the G2 percent is difficult or impossible to measure. } \examples{ require(survival) rpart(Surv(pgtime, pgstat) ~ ., stagec) } \keyword{datasets} rpart/man/prune.rpart.Rd0000644000176200001440000000177313306236017014735 0ustar liggesusers\name{prune.rpart} \alias{prune.rpart} \alias{prune} \title{ Cost-complexity Pruning of an Rpart Object } \description{ Determines a nested sequence of subtrees of the supplied \code{rpart} object by recursively \code{snipping} off the least important splits, based on the complexity parameter (\code{cp}). } \usage{ prune(tree, \dots) \method{prune}{rpart}(tree, cp, \dots) } \arguments{ \item{tree}{ fitted model object of class \code{"rpart"}. This is assumed to be the result of some function that produces an object with the same named components as that returned by the \code{rpart} function. } \item{cp}{ Complexity parameter to which the \code{rpart} object will be trimmed. } \item{\dots}{further arguments passed to or from other methods.} } \value{ A new \code{rpart} object that is trimmed to the value \code{cp}. 
} \seealso{ \code{\link{rpart}} } \examples{ z.auto <- rpart(Mileage ~ Weight, car.test.frame) zp <- prune(z.auto, cp = 0.1) plot(zp) #plot smaller rpart object } \keyword{tree} rpart/man/post.rpart.Rd0000644000176200001440000000537013306236017014566 0ustar liggesusers\name{post.rpart} \alias{post.rpart} \alias{post} \title{ PostScript Presentation Plot of an Rpart Object } \description{ Generates a PostScript presentation plot of an \code{rpart} object. } \usage{ post(tree, \dots) \method{post}{rpart}(tree, title., filename = paste(deparse(substitute(tree)), ".ps", sep = ""), digits = getOption("digits") - 2, pretty = TRUE, use.n = TRUE, horizontal = TRUE, \dots) } \arguments{ \item{tree}{ fitted model object of class \code{"rpart"}. This is assumed to be the result of some function that produces an object with the same named components as that returned by the \code{rpart} function. } \item{title.}{ a title which appears at the top of the plot. By default, the name of the \code{rpart} endpoint is printed out. } \item{filename}{ ASCII file to contain the output. By default, the name of the file is the name of the object given by \code{rpart} (with the suffix \code{.ps} added). If \code{filename = ""}, the plot appears on the current graphical device. } \item{digits}{ number of significant digits to include in numerical data. } \item{pretty}{ an integer denoting the extent to which factor levels will be abbreviated in the character strings defining the splits; (0) signifies no abbreviation of levels. A \code{NULL} signifies using elements of letters to represent the different factor levels. The default (\code{TRUE}) indicates the maximum possible abbreviation. } \item{use.n}{ Logical. If \code{TRUE} (default), adds to label (\#events level1/ \#events level2/etc. for method \code{class}, \code{n} for method \code{anova}, and \#events/n for methods \code{poisson} and \code{exp}). } \item{horizontal}{ Logical. If \code{TRUE} (default), plot is horizontal. 
If \code{FALSE}, plot appears as portrait. } \item{\dots}{ other arguments to the \code{postscript} function. } } \section{Side Effects}{ a plot of \code{rpart} is created using the \code{postscript} driver, or the current device if \code{filename = ""}. } \details{ The plot created uses the functions \code{plot.rpart} and \code{text.rpart} (with the \code{fancy} option). The settings were chosen because they looked good to us, but other options may be better, depending on the \code{rpart} object. Users are encouraged to write their own function containing favorite options. } \seealso{ \code{\link{plot.rpart}}, \code{\link{rpart}}, \code{\link{text.rpart}}, \code{\link{abbreviate}} } \examples{ \dontrun{ z.auto <- rpart(Mileage ~ Weight, car.test.frame) post(z.auto, file = "") # display tree on active device # now construct postscript version on file "pretty.ps" # with no title post(z.auto, file = "pretty.ps", title = " ") z.hp <- rpart(Mileage ~ Weight + HP, car.test.frame) post(z.hp)} } \keyword{tree} rpart/man/residuals.rpart.Rd0000644000176200001440000000315413312771561015577 0ustar liggesusers\name{residuals.rpart} \alias{residuals.rpart} \title{ Residuals From a Fitted Rpart Object } \usage{ \method{residuals}{rpart}(object, type = c("usual", "pearson", "deviance"), ...) } \description{ Method for \code{residuals} for an \code{rpart} object. } \arguments{ \item{object}{ fitted model object of class \code{"rpart"}. } \item{type}{ Indicates the type of residual desired. For regression or \code{anova} trees all three residual definitions reduce to \code{y - fitted}. This is the residual returned for \code{user} method trees as well. For classification trees the \code{usual} residuals are the misclassification losses L(actual, predicted) where L is the loss matrix. With default losses this residual is 0/1 for correct/incorrect classification. 
The \code{pearson} residual is (1-fitted)/sqrt(fitted(1-fitted)) and the \code{deviance} residual is sqrt(minus twice logarithm of fitted). For \code{poisson} and \code{exp} (or survival) trees, the \code{usual} residual is the observed - expected number of events. The \code{pearson} and \code{deviance} residuals are as defined in McCullagh and Nelder. } \item{\dots}{further arguments passed to or from other methods.} } \value{ Vector of residuals of type \code{type} from a fitted \code{rpart} object. } \references{ McCullagh, P. and Nelder, J. A. (1989) \emph{Generalized Linear Models}. London: Chapman and Hall. } \examples{ fit <- rpart(skips ~ Opening + Solder + Mask + PadType + Panel, data = solder.balance, method = "anova") summary(residuals(fit)) plot(predict(fit),residuals(fit)) } \keyword{tree} rpart/man/rpart.Rd0000644000176200001440000001127614170373107013606 0ustar liggesusers\name{rpart} \alias{rpart} %\alias{rpartcallback} \title{ Recursive Partitioning and Regression Trees } \description{ Fit a \code{rpart} model } \usage{ rpart(formula, data, weights, subset, na.action = na.rpart, method, model = FALSE, x = FALSE, y = TRUE, parms, control, cost, \dots) } \arguments{ \item{formula}{a \link{formula}, with a response but no interaction terms. If this is a data frame, it is taken as the model frame (see \code{\link{model.frame}}).} \item{data}{an optional data frame in which to interpret the variables named in the formula.} \item{weights}{optional case weights.} \item{subset}{optional expression saying that only a subset of the rows of the data should be used in the fit.} \item{na.action}{the default action deletes all observations for which \code{y} is missing, but keeps those in which one or more predictors are missing.} \item{method}{one of \code{"anova"}, \code{"poisson"}, \code{"class"} or \code{"exp"}. If \code{method} is missing then the routine tries to make an intelligent guess. 
If \code{y} is a survival object, then \code{method = "exp"} is assumed; if \code{y} has 2 columns then \code{method = "poisson"} is assumed; if \code{y} is a factor then \code{method = "class"} is assumed; otherwise \code{method = "anova"} is assumed. It is wisest to specify the method directly, especially as more criteria may be added to the function in future. Alternatively, \code{method} can be a list of functions named \code{init}, \code{split} and \code{eval}. Examples are given in the file \file{tests/usersplits.R} in the sources, and in the vignette \sQuote{User Written Split Functions}.} \item{model}{if logical: keep a copy of the model frame in the result? If the input value for \code{model} is a model frame (likely from an earlier call to the \code{rpart} function), then this frame is used rather than constructing new data.} \item{x}{keep a copy of the \code{x} matrix in the result.} \item{y}{keep a copy of the dependent variable in the result. If missing and \code{model} is supplied this defaults to \code{FALSE}.} \item{parms}{optional parameters for the splitting function.\cr Anova splitting has no parameters.\cr Poisson splitting has a single parameter, the coefficient of variation of the prior distribution on the rates. The default value is 1.\cr Exponential splitting has the same parameter as Poisson.\cr For classification splitting, the list can contain any of: the vector of prior probabilities (component \code{prior}), the loss matrix (component \code{loss}) or the splitting index (component \code{split}). The priors must be positive and sum to 1. The loss matrix must have zeros on the diagonal and positive off-diagonal elements. The splitting index can be \code{gini} or \code{information}. The default priors are proportional to the data counts, the losses default to 1, and the split defaults to \code{gini}.} \item{control}{a list of options that control details of the \code{rpart} algorithm. 
See \code{\link{rpart.control}}.} \item{cost}{a vector of non-negative costs, one for each variable in the model. Defaults to one for all variables. These are scalings to be applied when considering splits, so the improvement on splitting on a variable is divided by its cost in deciding which split to choose.} \item{\dots}{arguments to \code{\link{rpart.control}} may also be specified in the call to \code{rpart}. They are checked against the list of valid arguments.} } \details{ This differs from the \code{tree} function in S mainly in its handling of surrogate variables. In most details it follows Breiman \emph{et al.} (1984) quite closely. The \R package \pkg{tree} provides a re-implementation of \code{tree}. } \value{ An object of class \code{rpart}. See \code{\link{rpart.object}}. } \references{ Breiman L., Friedman J. H., Olshen R. A., and Stone, C. J. (1984) \emph{Classification and Regression Trees.} Wadsworth. } \seealso{ \code{\link{rpart.control}}, \code{\link{rpart.object}}, \code{\link{summary.rpart}}, \code{\link{print.rpart}} } \examples{ fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis) fit2 <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, parms = list(prior = c(.65,.35), split = "information")) fit3 <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, control = rpart.control(cp = 0.05)) par(mfrow = c(1,2), xpd = NA) # otherwise on some devices the text is clipped plot(fit) text(fit, use.n = TRUE) plot(fit2) text(fit2, use.n = TRUE) } \keyword{tree} rpart/man/print.rpart.Rd0000644000176200001440000000433114170373107014733 0ustar liggesusers\name{print.rpart} \alias{print.rpart} \title{ Print an Rpart Object } \description{ This function prints an \code{rpart} object. It is a method for the generic function \code{print} of class \code{"rpart"}. 
} \usage{ \method{print}{rpart}(x, minlength = 0, spaces = 2, cp, digits = getOption("digits"), nsmall = min(20, digits), \dots) } \arguments{ \item{x}{ fitted model object of class \code{"rpart"}. This is assumed to be the result of some function that produces an object with the same named components as that returned by the \code{rpart} function. } \item{minlength}{ Controls the abbreviation of labels: see \code{\link{labels.rpart}}. } \item{spaces}{ the number of spaces to indent nodes of increasing depth. } \item{digits}{ the number of digits of numbers to print. } \item{nsmall}{ the number of digits to the right of the decimal. See \code{\link{format}}. } \item{cp}{ prune all nodes with a complexity less than \code{cp} from the printout. Ignored if unspecified. } \item{\dots}{ arguments to be passed to or from other methods. }} \section{Side Effects}{ A semi-graphical layout of the contents of \code{x$frame} is printed. Indentation is used to convey the tree topology. Information for each node includes the node number, split, size, deviance, and fitted value. For the \code{"class"} method, the class probabilities are also printed. } \details{ This function is a method for the generic function \code{print} for class \code{"rpart"}. It can be invoked by calling print for an object of the appropriate class, or directly by calling \code{print.rpart} regardless of the class of the object. 
} \seealso{ \code{\link{print}}, \code{\link{rpart.object}}, \code{\link{summary.rpart}}, \code{\link{printcp}} } \examples{ z.auto <- rpart(Mileage ~ Weight, car.test.frame) z.auto \dontrun{node), split, n, deviance, yval * denotes terminal node 1) root 60 1354.58300 24.58333 2) Weight>=2567.5 45 361.20000 22.46667 4) Weight>=3087.5 22 61.31818 20.40909 * 5) Weight<3087.5 23 117.65220 24.43478 10) Weight>=2747.5 15 60.40000 23.80000 * 11) Weight<2747.5 8 39.87500 25.62500 * 3) Weight<2567.5 15 186.93330 30.93333 * }} \keyword{tree} rpart/man/plot.rpart.Rd0000644000176200001440000000574314170373107014565 0ustar liggesusers\name{plot.rpart} \alias{plot.rpart} \title{ Plot an Rpart Object } \description{ Plots an rpart object on the current graphics device. } \usage{ \method{plot}{rpart}(x, uniform = FALSE, branch = 1, compress = FALSE, nspace, margin = 0, minbranch = 0.3, branch.col = 1, branch.lty = 1, branch.lwd = 1, \dots) } \arguments{ \item{x}{ a fitted object of class \code{"rpart"}, containing a classification, regression, or rate tree. } \item{uniform}{ if \code{TRUE}, uniform vertical spacing of the nodes is used; this may be less cluttered when fitting a large plot onto a page. The default is to use a non-uniform spacing proportional to the error in the fit. } \item{branch}{ controls the shape of the branches from parent to child node. Any number from 0 to 1 is allowed. A value of 1 gives square-shouldered branches, a value of 0 gives V-shaped branches, with other values being intermediate. } \item{compress}{ if \code{FALSE}, the leaf nodes will be at the horizontal plot coordinates of \code{1:nleaves}. If \code{TRUE}, the routine attempts a more compact arrangement of the tree. The compaction algorithm assumes \code{uniform=TRUE}; surprisingly, the result is usually an improvement even when that is not the case. } \item{nspace}{ the amount of extra space between a node with children and a leaf, as compared to the minimal space between leaves. 
Applies to compressed trees only. The default is the value of \code{branch}. } \item{margin}{ an extra fraction of white space to leave around the borders of the tree. (Long labels sometimes get cut off by the default computation.) } \item{minbranch}{ set the minimum length for a branch to \code{minbranch} times the average branch length. This parameter is ignored if \code{uniform=TRUE}. Sometimes a split will give very little improvement, or even (in the classification case) no improvement at all. A tree with branch lengths strictly proportional to improvement leaves no room to squeeze in node labels. } \item{branch.col}{ set the color of the branches. } \item{branch.lty}{ set the line type of the branches. } \item{branch.lwd}{ set the line width of the branches. } \item{\dots}{ arguments to be passed to or from other methods. }} \value{ The coordinates of the nodes are returned as a list, with components \code{x} and \code{y}. } \section{Side Effects}{ An unlabeled plot is produced on the current graphics device: one being opened if needed. In order to build up a plot in the usual S style, e.g., a separate \code{text} command for adding labels, some extra information about the plot needs to be retained. This is kept in an environment in the package. } \details{ This function is a method for the generic function \code{plot}, for objects of class \code{rpart}. The y-coordinate of the top node of the tree will always be 1. } \seealso{ \code{\link{rpart}}, \code{\link{text.rpart}} } \examples{ fit <- rpart(Price ~ Mileage + Type + Country, cu.summary) par(xpd = TRUE) plot(fit, compress = TRUE) text(fit, use.n = TRUE) } \keyword{tree} rpart/man/car90.Rd0000644000176200001440000000755013306236017013372 0ustar liggesusers\name{car90} \alias{car90} \docType{data} \title{Automobile Data from 'Consumer Reports' 1990} \description{ Data on 111 cars, taken from pages 235--255, 281--285 and 287--288 of the April 1990 \emph{Consumer Reports} Magazine. 
} \usage{data(car90)} \format{ The data frame contains the following columns \describe{ \item{Country}{a factor giving the country in which the car was manufactured} \item{Disp}{engine displacement in cubic inches} \item{Disp2}{engine displacement in liters} \item{Eng.Rev}{engine revolutions per mile, or engine speed at 60 mph} \item{Front.Hd}{distance between the car's head-liner and the head of a 5 ft. 9 in. front seat passenger, in inches, as measured by CU} \item{Frt.Leg.Room}{maximum front leg room, in inches, as measured by CU} \item{Frt.Shld}{front shoulder room, in inches, as measured by CU} \item{Gear.Ratio}{the overall gear ratio, high gear, for manual transmission} \item{Gear2}{the overall gear ratio, high gear, for automatic transmission} \item{HP}{net horsepower} \item{HP.revs}{the red line---the maximum safe engine speed in rpm} \item{Height}{height of car, in inches, as supplied by manufacturer} \item{Length}{overall length, in inches, as supplied by manufacturer} \item{Luggage}{luggage space} \item{Mileage}{a numeric vector of gas mileage in miles/gallon as tested by CU; contains NAs.} \item{Model2}{alternate name, if the car was sold under two labels} \item{Price}{list price with standard equipment, in dollars} \item{Rear.Hd}{distance between the car's head-liner and the head of a 5 ft 9 in. 
rear seat passenger, in inches, as measured by CU} \item{Rear.Seating}{rear fore-and-aft seating room, in inches, as measured by CU} \item{RearShld}{rear shoulder room, in inches, as measured by CU} \item{Reliability}{an ordered factor with levels \samp{Much worse} < \samp{worse} < \samp{average} < \samp{better} < \samp{Much better}: contains \code{NA}s.} \item{Rim}{factor giving the rim size} \item{Sratio.m}{Number of turns of the steering wheel required for a turn of 30 foot radius, manual steering} \item{Sratio.p}{Number of turns of the steering wheel required for a turn of 30 foot radius, power steering} \item{Steering}{steering type offered: manual, power, or both} \item{Tank}{fuel refill capacity in gallons} \item{Tires}{factor giving tire size} \item{Trans1}{manual transmission, a factor with levels \samp{}, \samp{man.4}, \samp{man.5} and \samp{man.6}} \item{Trans2}{automatic transmission, a factor with levels \samp{}, \samp{auto.3}, \samp{auto.4}, and \samp{auto.CVT}. No car is missing both the manual and automatic transmission variables, but several had both as options} \item{Turning}{the radius of the turning circle in feet} \item{Type}{a factor giving the general type of car. The levels are: \samp{Small}, \samp{Sporty}, \samp{Compact}, \samp{Medium}, \samp{Large}, \samp{Van}} \item{Weight}{an order statistic giving the relative weights of the cars; 1 is the lightest and 111 is the heaviest} \item{Wheel.base}{length of wheelbase, in inches, as supplied by manufacturer} \item{Width}{width of car, in inches, as supplied by manufacturer} }} \source{ This is derived (with permission) from the data set \code{car.all} in S-PLUS, but with some further clean up of variable names and definitions. } \seealso{ \code{\link{car.test.frame}}, \code{\link{cu.summary}} for extracts from other versions of the dataset. } \examples{ data(car90) plot(car90$Price/1000, car90$Weight, xlab = "Price (thousands)", ylab = "Weight (lbs)") mlowess <- function(x, y, ...) 
{ keep <- !(is.na(x) | is.na(y)) lowess(x[keep], y[keep], ...) } with(car90, lines(mlowess(Price/1000, Weight, f = 0.5))) } \keyword{datasets} rpart/man/printcp.Rd0000644000176200001440000000221113306236017014120 0ustar liggesusers\name{printcp} \alias{printcp} \title{ Displays CP table for Fitted Rpart Object } \description{ Displays the \code{cp} table for a fitted \code{rpart} object. } \usage{ printcp(x, digits = getOption("digits") - 2) } \arguments{ \item{x}{ fitted model object of class \code{"rpart"}. This is assumed to be the result of some function that produces an object with the same named components as that returned by the \code{rpart} function. } \item{digits}{ the number of digits of numbers to print. }} \details{ Prints a table of optimal prunings based on a complexity parameter. } \seealso{ \code{\link{summary.rpart}}, \code{\link{rpart.object}} } \examples{ z.auto <- rpart(Mileage ~ Weight, car.test.frame) printcp(z.auto) \dontrun{ Regression tree: rpart(formula = Mileage ~ Weight, data = car.test.frame) Variables actually used in tree construction: [1] Weight Root node error: 1354.6/60 = 22.576 CP nsplit rel error xerror xstd 1 0.595349 0 1.00000 1.03436 0.178526 2 0.134528 1 0.40465 0.60508 0.105217 3 0.012828 2 0.27012 0.45153 0.083330 4 0.010000 3 0.25729 0.44826 0.076998 }} \keyword{tree} rpart/man/kyphosis.Rd0000644000176200001440000000234513306236017014322 0ustar liggesusers\name{kyphosis} \alias{kyphosis} \title{Data on Children who have had Corrective Spinal Surgery} \description{ The \code{kyphosis} data frame has 81 rows and 4 columns, representing data on children who have had corrective spinal surgery. } \usage{ kyphosis } \format{ This data frame contains the following columns: \describe{ \item{\code{Kyphosis}}{ a factor with levels \code{absent} and \code{present} indicating if a kyphosis (a type of deformation) was present after the operation. 
} \item{\code{Age}}{ in months } \item{\code{Number}}{ the number of vertebrae involved } \item{\code{Start}}{ the number of the first (topmost) vertebra operated on. } }} \source{ John M. Chambers and Trevor J. Hastie eds. (1992) \emph{Statistical Models in S}, Wadsworth and Brooks/Cole, Pacific Grove, CA. } \examples{ fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis) fit2 <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, parms = list(prior = c(0.65, 0.35), split = "information")) fit3 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis, control = rpart.control(cp = 0.05)) par(mfrow = c(1,2), xpd = TRUE) plot(fit) text(fit, use.n = TRUE) plot(fit2) text(fit2, use.n = TRUE) } \keyword{datasets} rpart/man/na.rpart.Rd0000644000176200001440000000067213306236017014177 0ustar liggesusers\name{na.rpart} \alias{na.rpart} \title{ Handles Missing Values in an Rpart Object } \usage{ na.rpart(x) } \description{ Handles missing values in an \code{"rpart"} object. } \arguments{ \item{x}{ a model frame. }} \details{ Default function that handles missing values when calling the function \code{rpart}. It omits cases where part of the response is missing or all the explanatory variables are missing. 
} \keyword{tree}
[@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB P8ҙg?gO<g?.ܾ]O]~U^o}?]Bxp4R{5O昷GAҋ_Y nQx"R|o~oE %@QB PP%E %@Q)a)6￟6~>Nӫ~x_66[,JX$_4om .zZg!,[@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PX_ҥGﻯ=wwG?ײ;;V<Ѫߣ&$B ̩Ϟx/ޏɍxs~h%\tŇAqmU=+'߂'Z{tL_yWBپŮ?z7 n /z@+"燅7NW|!+%/W羘N}f퍟=aKtg ٲZ.z:f'sJؓx(i B nپ%E %@QB P;4_L8edﬔE %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %@QB PPuuN'~*y6~7/̳k_m)ࡄ"}ӟ>H?}띴۰Bδ~;g %x?H>{L_yGIgO/|9}߫39\z+7~n4>&ާo}?_C'm}?„Z_|nG7bа̇#ȥK'KWLsf3=d!tg]8ng` %˝i(]o /|}+@ω޷'o{ Ug`(xo1}:ySKH?7Ci^Xъo_o!O_,H/> W}RķFlӪOenکpGK/ uQl voRBd|*'B ؿ؆#I|@-Il,hW[춿,,G#;׺{JؓS "޻8$bϛ&H+:$^!~smw[v6z锰k{SL?Γ?oBwݙ|ޟ?{xЖ/}#};?0&v=餽>S'~Ryg?nb5+oP®tH8{޳D59ο]}^bQ|Sw1]m78rbo{ylbW]˞tvڛo}{>nW*_߲F^ޛSWqnȼa"ƪוWq_n[2v[ooOnu~]_5;}Zvƨ؊vUXdv _E ]v+֯)Pߎom+$^{["{Mlv<2'۷K|˽o,v{O<^pHߖϷ%8}no5۫+<ω-b󈕕 gۆDj_94/J㾩!ă0R(XQG0ymU駹SjJXm?@blO*a'b$s;]>ƶNP"WR<(H30{ %٢0X(?Ւ(n+s@lbx %m> 󎮍mzy$.JgOB eѵNPP+ 4w}g޴Ϋ—Guů65*Mlx\{VyW7ȫH=U{?]OS^:?>_?7uIH<(~$#?]S[^ߏ^^3MK(>f;,kTklيy{$߮/hޑɱu/z=;4ey:,XQ񠿓4d+ro4+CdԾׇha5$ޫy{;)?7n BHxgDJN&M([q|;_obĶ݉8Fh>.|E9~;;b%'v؍`@9oL/AVHiE'hblQsff+@%v5m-ow2m ;J`kׂI\WCfײĵl_[ &^/WlZl!d`1]gnPo}@۵"$z:%ăktbS? z\l#th?P<(1^x0۵]Ӊwݎ3…_r39س/>'&{[ځJ]q:$^ %Wng\z9c{{?ӷы閽ۿFůa%bnx7No}y}o@Gܜ#<J.FJ]3;%<6x˟S۟$BȕWhq`.Oڶy6\{M7X؞_(m P,DtG(ssN ge+?H]J=@b[B '4a صI[{!v7H=J]373ؕ7޶J,P_,PJ,P-Nou Xi%/~ #sR,P-NrX`((J(g,Pš %@QB PP%E %@QB PP%E %@QB PP%E %@QB PP%E %,/~剄yG?.c %mwwK.~3Pӕmg\N|;?Ho=Y!R2]Ji7~ֿx3W`}Y)Lߥis<͚#X"+ad0_d6Id35۾%.#:%yg $D=I(ӕW45{iῶ[[o/?b=#~" y6~tGlnSRh{yvb. IDATﵵ7jLs_;[}M`ro3tm갶7w-[]i8MYsWS&o> GrwrcF~om(p!xkw ՠ$tSfiZ:!H:Մm4Hp7. 
kFUz@yud~E87`UEazs* %[LAMCiڜ7`LS3}/^a)R:]+>L7i0Hi65Ur9#wޯ^}8G(R:S(F7M &wCadJ~Ia8Tͯ3z&1K굽;}6SJ'|_}NPqg_̎ŊIL2BlFW5n.N^{ycT>>_5i6<ͱ_+#iTG~z#vgr8AD; 0F>ȕk/|8טh|.M58ÓF!qߝFRJUa枝;#\+'U5GtxL>FbuF6syboqvp4UCm^ŋȽ4ý2OoqJҿM.IisD :2WoqJɍG}EkkjB^moqJIS96jR3uj䵗t :ԟɱ\ ӴbVTn=>w^Z?WN}I#z$VDEo$\K{N!<8؆#j6m'j7c~=w^-@򶮪͚i:/MbKי\DD5}-Z[#~?v#y}~{_Zz}y凜O)IK4I&i\< +^j>T_}]yJ=±l.gb4kV -iTF\Edxa`Ŋy#DفkW+(MV?Bxp 'F%z#`zJJZ_Fk y\Zlm]F}A$IE_$j^ʇmormD{#@fهF#+yr$#qd[ tGp Q'xN 0XEMy%؀b8zq$y?4>!M@6z#3 #ޔ9fwq%K}Y:I| &Pv>~.7rJH&NM雄LvVBod Hоɩ j‰ȍm 1#MF91M.^}D$Է#FVPb797?!&{F9%G L4Y!BɚhN77?2q3-C^=Bɚ7E&Et>v^k 7& hDC۟Ļ.l7pr\X@oDhY?BIFBk87 aKHoDތ0;e{PP7NM7i޸-aBߎ0k 0[۟ħ`"D{#< PR훜R]7 .D2J41PRP7GlMf*7`Ѥ7rx[7{LohdӺE}#7%F.J }s©o"[< _J97R&BIo"'˟6\'ᄾ #9]hL"D#;}-.}+:pU;/+k 7R8BIhDV?w]ئoWC%L^kod`%=f_ÉL@F%LN5\f,X!BIR@&< +z#89u^Oljs^hCZWY׺^V_ګ_3x D'GMv2xOִdU1p|M'O͉]7 Cވ7RDoDEςLJM795?!\ R8aB`Wt^H!e_#tB2.&T&.uUk_ PҾ59mL$39Y z#x}s:!Lmg%FfFQ%xP7GlMf| `2oK-Zd7G#Yo2Ήi]uѿoX )Kވ:7 Eo"G~MJdu~+gѺ5T< @&愾 `i7?K11,x)B *͑sɖ~ `{#%պ";, }yk2zׅmN,F⼄k # !`itBxdoR%k]<BRod^TeςJtnN {NxXLZ89u^c80>}7o?88s&_9TuvL&offlU-#X +79˄vȁiɪV!c&~JMN͉]7 7"w0rg%X+훜)0! }"W>sn/օP Z7Msɱ.9( $7rZ =F(A6orn~B8M-JL4A%ȎMOo0}0kS]ʹ729! [ntt"}(:/s_oC(Aִo"E_ʻ.l_7#Cވ:7J`B} '&2!w@Ƥ7r7kéz#9pbo2p]إo9IyzCb:kNFVZH}o䶄ZEoJ`Nd,/7%L^kd,J`N쥵.+S{#MO5\f,JPP7C 'M='EiQވrT % (:!|iv171( EҾt|> ([6Bz#LDM&NMN雄LSo;+72E,}S  )ȍm 1}vNLE"ADIK};Bo@(Ahp ֿ !F7MOo0}=$9ٍi}̟1P us{go\MDt>v^k 7"7"](\|( ֣ZF{/Bk87 aK@Qz#&~ZbY,Jousxd} )F qr<묽:9O;}s^i{\*9sZ1KXՒiN~YO(𨜾N5-YՒ$]Q ?9}tBxTFD71@}#EFdQ"gG((P&'52ӢV)exNs_E*׺Jΐu~zKXՒiN~YL𨜾N`dMKV 3 Pd&77H$fB(oވ "Ҿɩ j €UU_Xf~C(2Xiv10! 
"AD.uUk'B `MɁ˧`"@䈖og%F&F#hN7"LB `P#]&~3]HXz~Jo2Ήi]uѿo\|%F!&B `\}lȄua_N5 #usxdޜ7Dz#sѸG( 9~d+9o,}o䶄ZZd7PH&dhyׅmNؕz#q^﵆I`%@tBxTFd/uxtRod^Te` %@j8)o9 qr<묽:9amO'ι/9Y2ʡrLk]{|qV7/ގ='Ű:'kZUHo䘉__%@OiTߜ%} aLz#3\d,VPMNOWNFHoUO|3 [6_]@$H )ྑK};2Yhp -9U@odaPornBxC'8ϩnLKc$wyǺlMf| v;/k 7%})mSua ̐z#rDJ!Roȍm 1@(27i4x1WM7D$Է#Fip orDnO߃Cz#VP`mornBxC'FL'8gCk9~dEKߤW>X kC(J/]L.>  \Q-z#֎P +n5ȄߥoRȼߩ/3xH%@É@&< %F qr<묽:9@n8 Kݧ7o?8ŋʵk] S7"%^?£r&;< "kZ%oI֣Z~p9#0C'GeMȋoވ,j]d,C愺jp5p1fB8 1-j_&~XB(`N7'u~ŰÄT7rSb9iDޚ$> (X 9Y P79?!ƒ P2J4`@QB7 Odb:%ǵ pLo@I%}vNL®} Cz#JC(P,훜N}Yl#Zw/?j7H usex}AzsB7?K}#cB(nsM\M):pU;ٞlJM M"F⼄k]՚d,f_ÉIJu]od^Te+E(knU@d/s1-jxY{urԣzq߼✓'&_9Tut)\:?(9xg%jŇM (7YKִdU1pP_ uMM77 7"w0r@6%795?!\ R89!,W Ŏe旉_'B <@'.:b13!,AD.uUkjJorn~Byeoe,&G|;+720Bo~PO}s:!QߤkS)720BoPHL/o9vgXJo2Ήi]uѿZyD.>  Cz#<xP7eo"~w%}9u7kéz#XP7NMR&%J`B9l7'FnK8uEvz# X[׻.lw!F⼄k # B(% Q}O퍘jYHXP7C 'M='Ӣ!1Ng/G`%"z~RIDAT:bDŽ7%F I&{YӒUBz#LjJ`Mo"doKx32EC(5Ӿɩ aÎzJ 7iqoGJ #797?!z# B dH&'33_P us?!LeH@%9H~odrDPFpBǦF0&́&$dzګ }0*ԍ;&sZFB £M5L2x3J:! 1} then it represents the number of character widths (for current graphical device) to use. } \item{fheight}{ Relates to option \code{fancy} and the height of the ellipses and rectangles. If \code{fheight <1} then it is a scaling factor (default = 0.8). If \code{fheight > 1} then it represents the number of character heights (for current graphical device) to use. } \item{bg}{ The color used to paint the background to annotations if \code{fancy = TRUE}. } \item{\dots}{ Graphical parameters may also be supplied as arguments to this function (see \code{par}). As labels often extend outside the plot region it can be helpful to specify \code{xpd = TRUE}. } } \section{Side Effects}{ the current plot of a tree dendrogram is labeled. 
}
\seealso{
  \code{\link{text}}, \code{\link{plot.rpart}}, \code{\link{rpart}},
  \code{\link{labels.rpart}}, \code{\link{abbreviate}}
}
\examples{
freen.tr <- rpart(y ~ ., freeny)
par(xpd = TRUE)
plot(freen.tr)
text(freen.tr, use.n = TRUE, all = TRUE)
}
\keyword{tree}
rpart/man/xpred.rpart.Rd0000644000176200001440000000377113306236017014726 0ustar liggesusers
\name{xpred.rpart}
\alias{xpred.rpart}
\title{
  Return Cross-Validated Predictions
}
\description{
  Gives the predicted values for an \code{rpart} fit, under
  cross validation, for a set of complexity parameter values.
}
\usage{
xpred.rpart(fit, xval = 10, cp, return.all = FALSE)
}
\arguments{
  \item{fit}{
    an object of class \code{"rpart"}.
  }
  \item{xval}{
    number of cross-validation groups. This may also be an explicit list
    of integers that define the cross-validation groups.
  }
  \item{cp}{
    the desired list of complexity values. By default it is taken from the
    \code{cptable} component of the fit.
  }
  \item{return.all}{
    if \code{FALSE}, return only the first element of the prediction.
  }
}
\value{
  A matrix with one row for each observation and one column for each
  complexity value. If \code{return.all} is \code{TRUE} and the prediction
  for each node is a vector, then the result will be an array containing
  all of the predictions. When the response is categorical, for instance,
  the result contains the predicted class followed by the class
  probabilities of the selected terminal node; \code{result[1,,]} will be
  the matrix of predicted classes, \code{result[2,,]} the matrix of class 1
  probabilities, etc.
}
\details{
  Complexity penalties are actually ranges, not values. If the
  \code{cp} values found in the table were \eqn{.36}, \eqn{.28},
  and \eqn{.13}, for instance, this means that the first
  row of the table holds for all complexity penalties in the range
  \eqn{[.36, 1]}, the second row for \code{cp} in the range
  \eqn{[.28, .36)} and the third row for \eqn{[.13, .28)}. By default,
  the geometric mean of each interval is used for cross validation.
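
  The interval arithmetic described above can be sketched in base R. This
  is only an illustration using the three example \code{cp} values from the
  text; the names \code{upper} and \code{geo_mean} are made up here, and the
  package may treat the first (top) interval differently internally.

```r
# Example cp thresholds from the text, largest first
cp <- c(0.36, 0.28, 0.13)

# Each row covers the interval from its own cp up to the previous row's cp;
# the first row is bounded above by 1 (hypothetical helper vector)
upper <- c(1, cp[-length(cp)])

# One representative penalty per interval: the geometric mean of its ends
geo_mean <- sqrt(cp * upper)
print(geo_mean)
```

  A vector built this way could be supplied as the \code{cp} argument when
  predictions at specific penalties are wanted.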
}
\seealso{
  \code{\link{rpart}}
}
\examples{
fit <- rpart(Mileage ~ Weight, car.test.frame)
xmat <- xpred.rpart(fit)
xerr <- (xmat - car.test.frame$Mileage)^2
apply(xerr, 2, sum)   # cross-validated error estimate

# approx same result as rel. error from printcp(fit)
apply(xerr, 2, sum)/var(car.test.frame$Mileage)
printcp(fit)
}
\keyword{tree}
rpart/man/snip.rpart.Rd0000644000176200001440000000420513306236017014546 0ustar liggesusers
\name{snip.rpart}
\alias{snip.rpart}
\title{
  Snip Subtrees of an Rpart Object
}
\description{
  Creates a "snipped" rpart object, containing the nodes that remain
  after selected subtrees have been snipped off. The user can snip
  nodes using the \code{toss} argument, or interactively by clicking the
  mouse button on specified nodes within the graphics window.
}
\usage{
snip.rpart(x, toss)
}
\arguments{
  \item{x}{
    fitted model object of class \code{"rpart"}. This is assumed to be
    the result of some function that produces an object with the same
    named components as that returned by the \code{rpart} function.
  }
  \item{toss}{
    an integer vector containing indices (node numbers) of all subtrees
    to be snipped off. If missing, the user selects branches to snip off
    as described below.
  }}
\value{
  An \code{rpart} object containing the nodes that remain after specified
  or selected subtrees have been snipped off.
}
\details{
  A dendrogram of \code{rpart} is expected to be visible on the graphics
  device, and a graphics input device (e.g., a mouse) is required. Clicking
  (the selection button) on a node displays the node number, sample
  size, response y-value, and Error (dev). Clicking a second time on the
  same node snips that subtree off and visually erases the subtree.
  This process may be repeated any number of times. Warnings result from
  selecting the root or leaf nodes. Clicking the exit button will stop
  the snipping process and return the resulting \code{rpart} object.

  See the documentation for the specific graphics device for details on
  graphical input techniques.
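
  When \code{toss} is supplied, no graphics device or mouse input should be
  needed. The following is a minimal sketch of that non-interactive use,
  assuming the \code{kyphosis} data shipped with the package; the choice of
  node 5 and the names \code{snipped} and \code{kept} are illustrative only.

```r
library(rpart)

# Fit a classification tree on the bundled kyphosis data
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Snip the subtree rooted at node 5; node 5 itself remains as a leaf
snipped <- snip.rpart(fit, toss = 5)

# Row names of the frame are the node numbers that remain in the tree
kept <- as.integer(rownames(snipped$frame))
print(kept)
```

  Comparing \code{print(fit)} with \code{print(snipped)} shows which
  descendants of the chosen node were dropped.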
}
\section{Warning}{
  Visually erasing the plot is done by over-plotting with the background
  colour. This will do nothing if the background is transparent
  (often true for screen devices).
}
\seealso{
  \code{\link{plot.rpart}}
}
\examples{
## dataset not in R
\dontrun{
z.survey <- rpart(market.survey)  # grow the rpart object
plot(z.survey)                    # plot the tree
z.survey2 <- snip.rpart(z.survey, toss = 2)  # trim subtree at node 2
plot(z.survey2)                   # plot new tree

# can also interactively select the node using the mouse in the
# graphics window
}}
\keyword{tree}
rpart/man/summary.rpart.Rd0000644000176200001440000000312513453654057015305 0ustar liggesusers
\name{summary.rpart}
\alias{summary.rpart}
\title{
  Summarize a Fitted Rpart Object
}
\description{
  Returns a detailed listing of a fitted \code{rpart} object.
}
\usage{
\method{summary}{rpart}(object, cp = 0, digits = getOption("digits"), file, \dots)
}
\arguments{
  \item{object}{
    fitted model object of class \code{"rpart"}. This is assumed to be
    the result of some function that produces an object with the same
    named components as that returned by the \code{rpart} function.
  }
  \item{digits}{
    Number of significant digits to be used in the result.
  }
  \item{cp}{
    trim nodes with a complexity of less than \code{cp} from the listing.
  }
  \item{file}{
    write the output to a given file name. (Full listings of a tree are
    often quite long.)
  }
  \item{\dots}{
    arguments to be passed to or from other methods.
  }}
\details{
  This function is a method for the generic function \code{summary} for
  class \code{"rpart"}. It can be invoked by calling \code{summary} for an
  object of the appropriate class, or directly by calling
  \code{summary.rpart} regardless of the class of the object.

  It prints the call, the table shown by \code{\link{printcp}}, the
  variable importance (summing to 100) and details for each node (the
  details depending on the type of tree).
}
\seealso{
  \code{\link{summary}}, \code{\link{rpart.object}}, \code{\link{printcp}}.
}
\examples{
## a regression tree
z.auto <- rpart(Mileage ~ Weight, car.test.frame)
summary(z.auto)

## a classification tree with multiple variables and surrogate splits.
summary(rpart(Kyphosis ~ Age + Number + Start, data = kyphosis))
}
\keyword{tree}
rpart/man/rpart.control.Rd0000644000176200001440000000641213306236017015257 0ustar liggesusers
\name{rpart.control}
\alias{rpart.control}
\title{
  Control for Rpart Fits
}
\description{
  Various parameters that control aspects of the \code{rpart} fit.
}
\usage{
rpart.control(minsplit = 20, minbucket = round(minsplit/3), cp = 0.01,
              maxcompete = 4, maxsurrogate = 5, usesurrogate = 2, xval = 10,
              surrogatestyle = 0, maxdepth = 30, \dots)
}
\arguments{
  \item{minsplit}{
    the minimum number of observations that must exist in a node in order
    for a split to be attempted.
  }
  \item{minbucket}{
    the minimum number of observations in any terminal node. If only
    one of \code{minbucket} or \code{minsplit} is specified, the code
    either sets \code{minsplit} to \code{minbucket*3} or \code{minbucket}
    to \code{minsplit/3}, as appropriate.
  }
  \item{cp}{
    complexity parameter. Any split that does not decrease the overall
    lack of fit by a factor of \code{cp} is not attempted. For instance,
    with \code{anova} splitting, this means that the overall R-squared must
    increase by \code{cp} at each step. The main role of this parameter
    is to save computing time by pruning off splits that are obviously
    not worthwhile. Essentially, the user informs the program that any
    split which does not improve the fit by \code{cp} will likely be
    pruned off by cross-validation, and that hence the program need
    not pursue it.
  }
  \item{maxcompete}{
    the number of competitor splits retained in the output. It is useful
    to know not just which split was chosen, but which variable came in
    second, third, etc.
  }
  \item{maxsurrogate}{
    the number of surrogate splits retained in the output.
    If this is set to zero the compute time will be reduced, since
    approximately half of the computational time (other than setup) is
    used in the search for surrogate splits.
  }
  \item{usesurrogate}{
    how to use surrogates in the splitting process. \code{0} means
    display only; an observation with a missing value for the primary
    split rule is not sent further down the tree. \code{1} means use
    surrogates, in order, to split subjects missing the primary variable;
    if all surrogates are missing the observation is not split. For value
    \code{2}, if all surrogates are missing, then send the observation in
    the majority direction. A value of \code{0} corresponds to the action
    of \code{tree}, and \code{2} to the recommendations of Breiman
    \emph{et al.} (1984).
  }
  \item{xval}{
    number of cross-validations.
  }
  \item{surrogatestyle}{
    controls the selection of a best surrogate. If set to \code{0}
    (default) the program uses the total number of correct
    classifications for a potential surrogate variable; if set to \code{1}
    it uses the percent correct, calculated over the non-missing values of
    the surrogate. The first option more severely penalizes covariates
    with a large number of missing values.
  }
  \item{maxdepth}{
    Set the maximum depth of any node of the final tree, with the root
    node counted as depth 0. For values greater than 30, \code{rpart} will
    give nonsense results on 32-bit machines.
  }
  \item{\dots}{
    mop up other arguments.
  }
}
\value{
  A list containing the options.
}
\seealso{
  \code{\link{rpart}}
}
\keyword{tree}
rpart/man/cu.summary.Rd0000644000176200001440000000313313306236017014550 0ustar liggesusers
\name{cu.summary}
\alias{cu.summary}
\title{Automobile Data from 'Consumer Reports' 1990}
\description{
  The \code{cu.summary} data frame has 117 rows and 5 columns, giving data
  on makes of cars taken from the April, 1990 issue of
  \emph{Consumer Reports}.
}
\usage{
cu.summary
}
\format{
  This data frame contains the following columns:
  \describe{
    \item{\code{Price}}{
      a numeric vector giving the list price in US dollars of a standard
      model
    }
    \item{\code{Country}}{
      of origin, a factor with levels
      \samp{Brazil},
      \samp{England},
      \samp{France},
      \samp{Germany},
      \samp{Japan},
      \samp{Japan/USA},
      \samp{Korea},
      \samp{Mexico},
      \samp{Sweden} and
      \samp{USA}
    }
    \item{\code{Reliability}}{
      an ordered factor with levels
      \samp{Much worse} < \samp{worse} < \samp{average} < \samp{better} < \samp{Much better}
    }
    \item{\code{Mileage}}{
      fuel consumption in miles per US gallon, as tested.
    }
    \item{\code{Type}}{
      a factor with levels
      \code{Compact},
      \code{Large},
      \code{Medium},
      \code{Small},
      \code{Sporty} and
      \code{Van}
    }
  }
}
\source{
  \emph{Consumer Reports}, April, 1990, pp. 235--288 quoted in

  John M. Chambers and Trevor J. Hastie eds. (1992)
  \emph{Statistical Models in S},
  Wadsworth and Brooks/Cole, Pacific Grove, CA, pp. 46--47.
}
\seealso{
  \code{\link{car.test.frame}},
  \code{\link{car90}}
}
\examples{
fit <- rpart(Price ~ Mileage + Type + Country, cu.summary)
par(xpd = TRUE)
plot(fit, compress = TRUE)
text(fit, use.n = TRUE)
}
\keyword{datasets}
rpart/man/path.rpart.Rd0000644000176200001440000000454714170373107014544 0ustar liggesusers
\name{path.rpart}
\alias{path.rpart}
\title{
  Follow Paths to Selected Nodes of an Rpart Object
}
\description{
  Returns a named list where each element contains the splits on the
  path from the root to the selected nodes.
}
\usage{
path.rpart(tree, nodes, pretty = 0, print.it = TRUE)
}
\arguments{
  \item{tree}{
    fitted model object of class \code{"rpart"}. This is assumed to be
    the result of some function that produces an object with the same
    named components as that returned by the \code{rpart} function.
  }
  \item{nodes}{
    an integer vector containing indices (node numbers) of all nodes for
    which paths are desired. If missing, the user selects nodes as
    described below.
  }
  \item{pretty}{
    an integer denoting the extent to which factor levels in split labels
    will be abbreviated. A value of \code{0}, the default shown in the
    usage above, signifies no abbreviation. A value of \code{NULL}
    signifies using elements of \code{letters} to represent the different
    factor levels.
  }
  \item{print.it}{
    Logical. Denotes whether paths will be printed out as
    nodes are interactively selected. Irrelevant if the \code{nodes}
    argument is supplied.
  }}
\value{
  A named (by node) list, each element of which contains all
  the splits on the path from the root to the specified or
  selected nodes.
}
\section{Graphical Interaction}{
  A dendrogram of the \code{rpart} object is expected to be visible on
  the graphics device, and a graphics input device (e.g. a mouse) is
  required. Clicking (the selection button) on a node selects that
  node. This process may be repeated any number of times. Clicking the
  exit button will stop the selection process and return the list of
  paths.
}
\details{
  The function requires an \code{rpart} object and takes a vector of
  node numbers as an optional argument. Omitting the nodes will cause
  the function to wait for the user to select nodes from the dendrogram.
  It will return a list, with one component for each node specified or
  selected. The component contains the sequence of splits leading to
  that node. In the graphical interaction, the individual paths are
  printed out as nodes are selected.
}
\references{
  This function was modified from \code{path.tree} in S.
}
\seealso{
  \code{\link{rpart}}
}
\examples{
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
print(fit)
path.rpart(fit, nodes = c(11, 22))
}
\keyword{tree}
rpart/DESCRIPTION0000644000176200001440000000240614173604075013121 0ustar liggesusers
Package: rpart
Priority: recommended
Version: 4.1.16
Date: 2022-01-24
Authors@R: c(person("Terry", "Therneau", role = "aut",
                    email = "therneau@mayo.edu"),
             person("Beth", "Atkinson", role = c("aut", "cre"),
                    email = "atkinson@mayo.edu"),
             person("Brian", "Ripley", role = "trl",
                    email = "ripley@stats.ox.ac.uk",
                    comment = "producer of the initial R port, maintainer 1999-2017"))
Description: Recursive partitioning for classification,
  regression and survival trees. An implementation of most of the
  functionality of the 1984 book by Breiman, Friedman, Olshen and Stone.
Title: Recursive Partitioning and Regression Trees
Depends: R (>= 2.15.0), graphics, stats, grDevices
Suggests: survival
License: GPL-2 | GPL-3
LazyData: yes
ByteCompile: yes
NeedsCompilation: yes
Author: Terry Therneau [aut],
  Beth Atkinson [aut, cre],
  Brian Ripley [trl] (producer of the initial R port, maintainer 1999-2017)
Maintainer: Beth Atkinson
Repository: CRAN
URL: https://github.com/bethatkinson/rpart, https://cran.r-project.org/package=rpart
BugReports: https://github.com/bethatkinson/rpart/issues
Packaged: 2022-01-24 19:17:40 UTC; atkinson
Date/Publication: 2022-01-24 20:12:45 UTC
rpart/build/0000755000176200001440000000000014173575524012516 5ustar liggesusers
rpart/build/vignette.rds0000644000176200001440000000036714173575524015063 0ustar liggesusers
[binary RDS data omitted]
rpart/tests/0000755000176200001440000000000014170373107012547 5ustar liggesusers
rpart/tests/backticks.Rout.save0000644000176200001440000000263513453452640016326 0ustar liggesusers
R Under development (unstable) (2019-04-05 r76323) -- "Unsuffered Consequences"
Copyright (C) 2019 The R Foundation for
Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > ## allow backticks in rpart.matrix: see > ## https://stat.ethz.ch/pipermail/r-help/2012-May/314081.html > > set.seed(10) > library(rpart) > Iris <- iris > names(Iris) <- sub(".", " ", names(iris), fixed=TRUE) > rpart(Species ~ `Sepal Length`, data = Iris) n= 150 node), split, n, loss, yval, (yprob) * denotes terminal node 1) root 150 100 setosa (0.33333333 0.33333333 0.33333333) 2) Sepal Length< 5.45 52 7 setosa (0.86538462 0.11538462 0.01923077) * 3) Sepal Length>=5.45 98 49 virginica (0.05102041 0.44897959 0.50000000) 6) Sepal Length< 6.15 43 15 versicolor (0.11627907 0.65116279 0.23255814) * 7) Sepal Length>=6.15 55 16 virginica (0.00000000 0.29090909 0.70909091) * > > proc.time() user system elapsed 0.105 0.017 0.116 rpart/tests/treble2.R0000644000176200001440000000501313453436374014241 0ustar liggesusers# # Test weights in a regression problem # library(rpart) set.seed(10) mystate <- data.frame(state.x77, region=factor(state.region)) names(mystate) <- c("population","income" , "illiteracy","life" , "murder", "hs.grad", "frost", "area", "region") xgrp <- rep(1:10,5) fit4 <- rpart(income ~ population + region + illiteracy +life + murder + hs.grad + frost , mystate, control=rpart.control(minsplit=10, xval=xgrp)) wts <- rep(3, nrow(mystate)) fit4b <- rpart(income ~ population + region + illiteracy +life + murder + hs.grad + frost , mystate, control=rpart.control(minsplit=10, xval=xgrp), weights=wts) fit4b$frame$wt <- fit4b$frame$wt/3 
fit4b$frame$dev <- fit4b$frame$dev/3 fit4b$cptable[,5] <- fit4b$cptable[,5] * sqrt(3) temp <- c('frame', 'where', 'splits', 'csplit', 'cptable') all.equal(fit4[temp], fit4b[temp]) # Next is a very simple case, but worth keeping dummy <- data.frame(y=1:10, x1=c(10:4, 1:3), x2=c(1,3,5,7,9,2,4,6,8,0)) xx1 <- rpart(y ~ x1 + x2, dummy, minsplit=4, xval=0) xx2 <- rpart(y ~ x1 + x2, dummy, weights=rep(2,10), minsplit=4, xval=0) all.equal(xx1$frame$dev, c(82.5, 10, 2, .5, 10, .5, 2)) all.equal(xx2$frame$dev, c(82.5, 10, 2, .5, 10, .5, 2)*2) # Now for a set of non-equal weights # We need to set maxcompete=3 because there just happens to be, in one # of the lower nodes, an exact tie between variables "life" and "murder". # Round off error causes fit5 to choose one and fit5b the other. # Later -- cut it back to maxdepth=3 for the same reason (a tie). # nn <- nrow(mystate) wts <- rep(1:5, length=nn) temp <- rep(1:nn, wts) #row replicates xgrp <- rep(1:10, length=nn) xgrp2<- rep(xgrp, wts) tempc <- rpart.control(minsplit=2, xval=xgrp2, maxsurrogate=0, maxcompete=3, maxdepth=3) # Direct: replicate rows in the data set, and use unweighted fit5 <- rpart(income ~ population + region + illiteracy +life + murder + hs.grad + frost , data=mystate[temp,], control=tempc) # Weighted tempc <- rpart.control(minsplit=2, xval=xgrp, maxsurrogate=0, maxcompete=3, maxdepth=3) fit5b <- rpart(income ~ population + region + illiteracy +life + murder + hs.grad + frost , data=mystate, control=tempc, weights=wts) all.equal(fit5$frame[-2], fit5b$frame[-2]) # the "n" component won't match all.equal(fit5$cptable, fit5b$cptable) all.equal(fit5$splits[,-1],fit5b$splits[,-1]) all.equal(fit5$csplit, fit5b$csplit) rpart/tests/Examples/0000755000176200001440000000000013453662120014324 5ustar liggesusersrpart/tests/Examples/rpart-Ex.Rout.save0000644000176200001440000010232714173565136017654 0ustar liggesusers R Under development (unstable) (2019-04-05 r76323) -- "Unsuffered Consequences" Copyright (C) 2019 The 
R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > pkgname <- "rpart" > source(file.path(R.home("share"), "R", "examples-header.R")) > options(warn = 1) > library('rpart') > > base::assign(".oldSearch", base::search(), pos = 'CheckExEnv') > base::assign(".old_wd", base::getwd(), pos = 'CheckExEnv') > cleanEx() > nameEx("car.test.frame") > ### * car.test.frame > > flush(stderr()); flush(stdout()) > > ### Name: car.test.frame > ### Title: Automobile Data from 'Consumer Reports' 1990 > ### Aliases: car.test.frame > ### Keywords: datasets > > ### ** Examples > > z.auto <- rpart(Mileage ~ Weight, car.test.frame) > summary(z.auto) Call: rpart(formula = Mileage ~ Weight, data = car.test.frame) n= 60 CP nsplit rel error xerror xstd 1 0.59534912 0 1.0000000 1.0337818 0.18046532 2 0.13452819 1 0.4046509 0.5836606 0.10900973 3 0.01282843 2 0.2701227 0.4409221 0.08652804 4 0.01000000 3 0.2572943 0.4415805 0.08663003 Variable importance Weight 100 Node number 1: 60 observations, complexity param=0.5953491 mean=24.58333, MSE=22.57639 left son=2 (45 obs) right son=3 (15 obs) Primary splits: Weight < 2567.5 to the right, improve=0.5953491, (0 missing) Node number 2: 45 observations, complexity param=0.1345282 mean=22.46667, MSE=8.026667 left son=4 (22 obs) right son=5 (23 obs) Primary splits: Weight < 3087.5 to the right, improve=0.5045118, (0 missing) Node number 3: 15 observations mean=30.93333, MSE=12.46222 Node number 4: 22 
observations mean=20.40909, MSE=2.78719 Node number 5: 23 observations, complexity param=0.01282843 mean=24.43478, MSE=5.115312 left son=10 (15 obs) right son=11 (8 obs) Primary splits: Weight < 2747.5 to the right, improve=0.1476996, (0 missing) Node number 10: 15 observations mean=23.8, MSE=4.026667 Node number 11: 8 observations mean=25.625, MSE=4.984375 > > > > cleanEx() > nameEx("car90") > ### * car90 > > flush(stderr()); flush(stdout()) > > ### Name: car90 > ### Title: Automobile Data from 'Consumer Reports' 1990 > ### Aliases: car90 > ### Keywords: datasets > > ### ** Examples > > data(car90) > plot(car90$Price/1000, car90$Weight, + xlab = "Price (thousands)", ylab = "Weight (lbs)") > mlowess <- function(x, y, ...) { + keep <- !(is.na(x) | is.na(y)) + lowess(x[keep], y[keep], ...) + } > with(car90, lines(mlowess(Price/1000, Weight, f = 0.5))) > > > > cleanEx() > nameEx("cu.summary") > ### * cu.summary > > flush(stderr()); flush(stdout()) > > ### Name: cu.summary > ### Title: Automobile Data from 'Consumer Reports' 1990 > ### Aliases: cu.summary > ### Keywords: datasets > > ### ** Examples > > fit <- rpart(Price ~ Mileage + Type + Country, cu.summary) > par(xpd = TRUE) > plot(fit, compress = TRUE) > text(fit, use.n = TRUE) > > > > graphics::par(get("par.postscript", pos = 'CheckExEnv')) > cleanEx() > nameEx("kyphosis") > ### * kyphosis > > flush(stderr()); flush(stdout()) > > ### Name: kyphosis > ### Title: Data on Children who have had Corrective Spinal Surgery > ### Aliases: kyphosis > ### Keywords: datasets > > ### ** Examples > > fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis) > fit2 <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, + parms = list(prior = c(0.65, 0.35), split = "information")) > fit3 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis, + control = rpart.control(cp = 0.05)) > par(mfrow = c(1,2), xpd = TRUE) > plot(fit) > text(fit, use.n = TRUE) > plot(fit2) > text(fit2, use.n = TRUE) > > > > 
graphics::par(get("par.postscript", pos = 'CheckExEnv')) > cleanEx() > nameEx("meanvar.rpart") > ### * meanvar.rpart > > flush(stderr()); flush(stdout()) > > ### Name: meanvar.rpart > ### Title: Mean-Variance Plot for an Rpart Object > ### Aliases: meanvar meanvar.rpart > ### Keywords: tree > > ### ** Examples > > z.auto <- rpart(Mileage ~ Weight, car.test.frame) > meanvar(z.auto, log = 'xy') > > > > cleanEx() > nameEx("path.rpart") > ### * path.rpart > > flush(stderr()); flush(stdout()) > > ### Name: path.rpart > ### Title: Follow Paths to Selected Nodes of an Rpart Object > ### Aliases: path.rpart > ### Keywords: tree > > ### ** Examples > > fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis) > print(fit) n= 81 node), split, n, loss, yval, (yprob) * denotes terminal node 1) root 81 17 absent (0.79012346 0.20987654) 2) Start>=8.5 62 6 absent (0.90322581 0.09677419) 4) Start>=14.5 29 0 absent (1.00000000 0.00000000) * 5) Start< 14.5 33 6 absent (0.81818182 0.18181818) 10) Age< 55 12 0 absent (1.00000000 0.00000000) * 11) Age>=55 21 6 absent (0.71428571 0.28571429) 22) Age>=111 14 2 absent (0.85714286 0.14285714) * 23) Age< 111 7 3 present (0.42857143 0.57142857) * 3) Start< 8.5 19 8 present (0.42105263 0.57894737) * > path.rpart(fit, nodes = c(11, 22)) node number: 11 root Start>=8.5 Start< 14.5 Age>=55 node number: 22 root Start>=8.5 Start< 14.5 Age>=55 Age>=111 > > > > cleanEx() > nameEx("plot.rpart") > ### * plot.rpart > > flush(stderr()); flush(stdout()) > > ### Name: plot.rpart > ### Title: Plot an Rpart Object > ### Aliases: plot.rpart > ### Keywords: tree > > ### ** Examples > > fit <- rpart(Price ~ Mileage + Type + Country, cu.summary) > par(xpd = TRUE) > plot(fit, compress = TRUE) > text(fit, use.n = TRUE) > > > > graphics::par(get("par.postscript", pos = 'CheckExEnv')) > cleanEx() > nameEx("post.rpart") > ### * post.rpart > > flush(stderr()); flush(stdout()) > > ### Name: post.rpart > ### Title: PostScript Presentation Plot of an Rpart Object > 
### Aliases: post.rpart post > ### Keywords: tree > > ### ** Examples > > ## Not run: > ##D z.auto <- rpart(Mileage ~ Weight, car.test.frame) > ##D post(z.auto, file = "") # display tree on active device > ##D # now construct postscript version on file "pretty.ps" > ##D # with no title > ##D post(z.auto, file = "pretty.ps", title = " ") > ##D z.hp <- rpart(Mileage ~ Weight + HP, car.test.frame) > ##D post(z.hp) > ## End(Not run) > > > > cleanEx() > nameEx("predict.rpart") > ### * predict.rpart > > flush(stderr()); flush(stdout()) > > ### Name: predict.rpart > ### Title: Predictions from a Fitted Rpart Object > ### Aliases: predict.rpart > ### Keywords: tree > > ### ** Examples > > z.auto <- rpart(Mileage ~ Weight, car.test.frame) > predict(z.auto) Eagle Summit 4 Ford Escort 4 30.93333 30.93333 Ford Festiva 4 Honda Civic 4 30.93333 30.93333 Mazda Protege 4 Mercury Tracer 4 30.93333 30.93333 Nissan Sentra 4 Pontiac LeMans 4 30.93333 30.93333 Subaru Loyale 4 Subaru Justy 3 30.93333 30.93333 Toyota Corolla 4 Toyota Tercel 4 30.93333 30.93333 Volkswagen Jetta 4 Chevrolet Camaro V8 30.93333 20.40909 Dodge Daytona Ford Mustang V8 23.80000 20.40909 Ford Probe Honda Civic CRX Si 4 25.62500 30.93333 Honda Prelude Si 4WS 4 Nissan 240SX 4 25.62500 23.80000 Plymouth Laser Subaru XT 4 23.80000 30.93333 Audi 80 4 Buick Skylark 4 25.62500 25.62500 Chevrolet Beretta 4 Chrysler Le Baron V6 25.62500 23.80000 Ford Tempo 4 Honda Accord 4 23.80000 23.80000 Mazda 626 4 Mitsubishi Galant 4 23.80000 25.62500 Mitsubishi Sigma V6 Nissan Stanza 4 20.40909 23.80000 Oldsmobile Calais 4 Peugeot 405 4 25.62500 25.62500 Subaru Legacy 4 Toyota Camry 4 23.80000 23.80000 Volvo 240 4 Acura Legend V6 23.80000 20.40909 Buick Century 4 Chrysler Le Baron Coupe 23.80000 23.80000 Chrysler New Yorker V6 Eagle Premier V6 20.40909 20.40909 Ford Taurus V6 Ford Thunderbird V6 20.40909 20.40909 Hyundai Sonata 4 Mazda 929 V6 23.80000 20.40909 Nissan Maxima V6 Oldsmobile Cutlass Ciera 4 20.40909 23.80000 Oldsmobile 
Cutlass Supreme V6 Toyota Cressida 6 20.40909 20.40909 Buick Le Sabre V6 Chevrolet Caprice V8 20.40909 20.40909 Ford LTD Crown Victoria V8 Chevrolet Lumina APV V6 20.40909 20.40909 Dodge Grand Caravan V6 Ford Aerostar V6 20.40909 20.40909 Mazda MPV V6 Mitsubishi Wagon 4 20.40909 20.40909 Nissan Axxess 4 Nissan Van 4 20.40909 20.40909 > > fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis) > predict(fit, type = "prob") # class probabilities (default) absent present 1 0.4210526 0.5789474 2 0.8571429 0.1428571 3 0.4210526 0.5789474 4 0.4210526 0.5789474 5 1.0000000 0.0000000 6 1.0000000 0.0000000 7 1.0000000 0.0000000 8 1.0000000 0.0000000 9 1.0000000 0.0000000 10 0.4285714 0.5714286 11 0.4285714 0.5714286 12 1.0000000 0.0000000 13 0.4210526 0.5789474 14 1.0000000 0.0000000 15 1.0000000 0.0000000 16 1.0000000 0.0000000 17 1.0000000 0.0000000 18 0.8571429 0.1428571 19 1.0000000 0.0000000 20 1.0000000 0.0000000 21 1.0000000 0.0000000 22 0.4210526 0.5789474 23 0.4285714 0.5714286 24 0.4210526 0.5789474 25 0.4210526 0.5789474 26 1.0000000 0.0000000 27 0.4210526 0.5789474 28 0.4285714 0.5714286 29 1.0000000 0.0000000 30 1.0000000 0.0000000 31 1.0000000 0.0000000 32 0.8571429 0.1428571 33 0.8571429 0.1428571 34 1.0000000 0.0000000 35 0.8571429 0.1428571 36 1.0000000 0.0000000 37 1.0000000 0.0000000 38 0.4210526 0.5789474 39 1.0000000 0.0000000 40 0.4285714 0.5714286 41 0.4210526 0.5789474 42 1.0000000 0.0000000 43 0.4210526 0.5789474 44 0.4210526 0.5789474 45 1.0000000 0.0000000 46 0.8571429 0.1428571 47 1.0000000 0.0000000 48 0.8571429 0.1428571 49 0.4210526 0.5789474 50 0.8571429 0.1428571 51 0.4285714 0.5714286 52 1.0000000 0.0000000 53 0.4210526 0.5789474 54 1.0000000 0.0000000 55 1.0000000 0.0000000 56 1.0000000 0.0000000 57 1.0000000 0.0000000 58 0.4210526 0.5789474 59 1.0000000 0.0000000 60 0.4285714 0.5714286 61 0.4210526 0.5789474 62 0.4210526 0.5789474 63 0.4210526 0.5789474 64 1.0000000 0.0000000 65 1.0000000 0.0000000 66 1.0000000 0.0000000 67 
1.0000000 0.0000000 68 0.8571429 0.1428571 69 1.0000000 0.0000000 70 1.0000000 0.0000000 71 0.8571429 0.1428571 72 0.8571429 0.1428571 73 1.0000000 0.0000000 74 0.8571429 0.1428571 75 1.0000000 0.0000000 76 1.0000000 0.0000000 77 0.8571429 0.1428571 78 1.0000000 0.0000000 79 0.8571429 0.1428571 80 0.4210526 0.5789474 81 1.0000000 0.0000000 > predict(fit, type = "vector") # level numbers 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 2 1 2 2 1 1 1 1 1 2 2 1 2 1 1 1 1 1 1 1 1 2 2 2 2 1 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 2 2 1 1 1 1 1 1 1 1 1 2 1 2 2 1 2 2 1 1 1 1 2 1 2 1 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 2 1 1 1 1 2 1 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 79 80 81 1 2 1 > predict(fit, type = "class") # factor 1 2 3 4 5 6 7 8 9 10 present absent present present absent absent absent absent absent present 11 12 13 14 15 16 17 18 19 20 present absent present absent absent absent absent absent absent absent 21 22 23 24 25 26 27 28 29 30 absent present present present present absent present present absent absent 31 32 33 34 35 36 37 38 39 40 absent absent absent absent absent absent absent present absent present 41 42 43 44 45 46 47 48 49 50 present absent present present absent absent absent absent present absent 51 52 53 54 55 56 57 58 59 60 present absent present absent absent absent absent present absent present 61 62 63 64 65 66 67 68 69 70 present present present absent absent absent absent absent absent absent 71 72 73 74 75 76 77 78 79 80 absent absent absent absent absent absent absent absent absent present 81 absent Levels: absent present > predict(fit, type = "matrix") # level number, class frequencies, probabilities [,1] [,2] [,3] [,4] [,5] [,6] 1 2 8 11 0.4210526 0.5789474 0.23456790 2 1 12 2 0.8571429 0.1428571 0.17283951 3 2 8 11 0.4210526 0.5789474 0.23456790 4 2 8 11 0.4210526 0.5789474 0.23456790 5 1 29 0 1.0000000 0.0000000 0.35802469 6 1 29 0 
1.0000000 0.0000000 0.35802469 7 1 29 0 1.0000000 0.0000000 0.35802469 8 1 29 0 1.0000000 0.0000000 0.35802469 9 1 29 0 1.0000000 0.0000000 0.35802469 10 2 3 4 0.4285714 0.5714286 0.08641975 11 2 3 4 0.4285714 0.5714286 0.08641975 12 1 29 0 1.0000000 0.0000000 0.35802469 13 2 8 11 0.4210526 0.5789474 0.23456790 14 1 12 0 1.0000000 0.0000000 0.14814815 15 1 29 0 1.0000000 0.0000000 0.35802469 16 1 29 0 1.0000000 0.0000000 0.35802469 17 1 29 0 1.0000000 0.0000000 0.35802469 18 1 12 2 0.8571429 0.1428571 0.17283951 19 1 29 0 1.0000000 0.0000000 0.35802469 20 1 12 0 1.0000000 0.0000000 0.14814815 21 1 29 0 1.0000000 0.0000000 0.35802469 22 2 8 11 0.4210526 0.5789474 0.23456790 23 2 3 4 0.4285714 0.5714286 0.08641975 24 2 8 11 0.4210526 0.5789474 0.23456790 25 2 8 11 0.4210526 0.5789474 0.23456790 26 1 12 0 1.0000000 0.0000000 0.14814815 27 2 8 11 0.4210526 0.5789474 0.23456790 28 2 3 4 0.4285714 0.5714286 0.08641975 29 1 29 0 1.0000000 0.0000000 0.35802469 30 1 29 0 1.0000000 0.0000000 0.35802469 31 1 29 0 1.0000000 0.0000000 0.35802469 32 1 12 2 0.8571429 0.1428571 0.17283951 33 1 12 2 0.8571429 0.1428571 0.17283951 34 1 29 0 1.0000000 0.0000000 0.35802469 35 1 12 2 0.8571429 0.1428571 0.17283951 36 1 29 0 1.0000000 0.0000000 0.35802469 37 1 12 0 1.0000000 0.0000000 0.14814815 38 2 8 11 0.4210526 0.5789474 0.23456790 39 1 12 0 1.0000000 0.0000000 0.14814815 40 2 3 4 0.4285714 0.5714286 0.08641975 41 2 8 11 0.4210526 0.5789474 0.23456790 42 1 12 0 1.0000000 0.0000000 0.14814815 43 2 8 11 0.4210526 0.5789474 0.23456790 44 2 8 11 0.4210526 0.5789474 0.23456790 45 1 29 0 1.0000000 0.0000000 0.35802469 46 1 12 2 0.8571429 0.1428571 0.17283951 47 1 29 0 1.0000000 0.0000000 0.35802469 48 1 12 2 0.8571429 0.1428571 0.17283951 49 2 8 11 0.4210526 0.5789474 0.23456790 50 1 12 2 0.8571429 0.1428571 0.17283951 51 2 3 4 0.4285714 0.5714286 0.08641975 52 1 29 0 1.0000000 0.0000000 0.35802469 53 2 8 11 0.4210526 0.5789474 0.23456790 54 1 29 0 1.0000000 0.0000000 0.35802469 55 1 29 0 
1.0000000 0.0000000 0.35802469 56 1 29 0 1.0000000 0.0000000 0.35802469 57 1 12 0 1.0000000 0.0000000 0.14814815 58 2 8 11 0.4210526 0.5789474 0.23456790 59 1 12 0 1.0000000 0.0000000 0.14814815 60 2 3 4 0.4285714 0.5714286 0.08641975 61 2 8 11 0.4210526 0.5789474 0.23456790 62 2 8 11 0.4210526 0.5789474 0.23456790 63 2 8 11 0.4210526 0.5789474 0.23456790 64 1 29 0 1.0000000 0.0000000 0.35802469 65 1 29 0 1.0000000 0.0000000 0.35802469 66 1 12 0 1.0000000 0.0000000 0.14814815 67 1 29 0 1.0000000 0.0000000 0.35802469 68 1 12 2 0.8571429 0.1428571 0.17283951 69 1 12 0 1.0000000 0.0000000 0.14814815 70 1 29 0 1.0000000 0.0000000 0.35802469 71 1 12 2 0.8571429 0.1428571 0.17283951 72 1 12 2 0.8571429 0.1428571 0.17283951 73 1 29 0 1.0000000 0.0000000 0.35802469 74 1 12 2 0.8571429 0.1428571 0.17283951 75 1 29 0 1.0000000 0.0000000 0.35802469 76 1 29 0 1.0000000 0.0000000 0.35802469 77 1 12 2 0.8571429 0.1428571 0.17283951 78 1 12 0 1.0000000 0.0000000 0.14814815 79 1 12 2 0.8571429 0.1428571 0.17283951 80 2 8 11 0.4210526 0.5789474 0.23456790 81 1 12 0 1.0000000 0.0000000 0.14814815 > > sub <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25)) > fit <- rpart(Species ~ ., data = iris, subset = sub) > fit n= 75 node), split, n, loss, yval, (yprob) * denotes terminal node 1) root 75 50 setosa (0.33333333 0.33333333 0.33333333) 2) Petal.Length< 2.5 25 0 setosa (1.00000000 0.00000000 0.00000000) * 3) Petal.Length>=2.5 50 25 versicolor (0.00000000 0.50000000 0.50000000) 6) Petal.Length< 4.85 26 2 versicolor (0.00000000 0.92307692 0.07692308) * 7) Petal.Length>=4.85 24 1 virginica (0.00000000 0.04166667 0.95833333) * > table(predict(fit, iris[-sub,], type = "class"), iris[-sub, "Species"]) setosa versicolor virginica setosa 25 0 0 versicolor 0 22 1 virginica 0 3 24 > > > > cleanEx() > nameEx("print.rpart") > ### * print.rpart > > flush(stderr()); flush(stdout()) > > ### Name: print.rpart > ### Title: Print an Rpart Object > ### Aliases: print.rpart > ### Keywords: 
tree > > ### ** Examples > > z.auto <- rpart(Mileage ~ Weight, car.test.frame) > z.auto n= 60 node), split, n, deviance, yval * denotes terminal node 1) root 60 1354.58300 24.58333 2) Weight>=2567.5 45 361.20000 22.46667 4) Weight>=3087.5 22 61.31818 20.40909 * 5) Weight< 3087.5 23 117.65220 24.43478 10) Weight>=2747.5 15 60.40000 23.80000 * 11) Weight< 2747.5 8 39.87500 25.62500 * 3) Weight< 2567.5 15 186.93330 30.93333 * > ## Not run: > ##D node), split, n, deviance, yval > ##D * denotes terminal node > ##D > ##D 1) root 60 1354.58300 24.58333 > ##D 2) Weight>=2567.5 45 361.20000 22.46667 > ##D 4) Weight>=3087.5 22 61.31818 20.40909 * > ##D 5) Weight<3087.5 23 117.65220 24.43478 > ##D 10) Weight>=2747.5 15 60.40000 23.80000 * > ##D 11) Weight<2747.5 8 39.87500 25.62500 * > ##D 3) Weight<2567.5 15 186.93330 30.93333 * > ## End(Not run) > > > cleanEx() > nameEx("printcp") > ### * printcp > > flush(stderr()); flush(stdout()) > > ### Name: printcp > ### Title: Displays CP table for Fitted Rpart Object > ### Aliases: printcp > ### Keywords: tree > > ### ** Examples > > z.auto <- rpart(Mileage ~ Weight, car.test.frame) > printcp(z.auto) Regression tree: rpart(formula = Mileage ~ Weight, data = car.test.frame) Variables actually used in tree construction: [1] Weight Root node error: 1354.6/60 = 22.576 n= 60 CP nsplit rel error xerror xstd 1 0.595349 0 1.00000 1.03378 0.180465 2 0.134528 1 0.40465 0.58366 0.109010 3 0.012828 2 0.27012 0.44092 0.086528 4 0.010000 3 0.25729 0.44158 0.086630 > ## Not run: > ##D Regression tree: > ##D rpart(formula = Mileage ~ Weight, data = car.test.frame) > ##D > ##D Variables actually used in tree construction: > ##D [1] Weight > ##D > ##D Root node error: 1354.6/60 = 22.576 > ##D > ##D CP nsplit rel error xerror xstd > ##D 1 0.595349 0 1.00000 1.03436 0.178526 > ##D 2 0.134528 1 0.40465 0.60508 0.105217 > ##D 3 0.012828 2 0.27012 0.45153 0.083330 > ##D 4 0.010000 3 0.25729 0.44826 0.076998 > ## End(Not run) > > > cleanEx() > 
nameEx("prune.rpart") > ### * prune.rpart > > flush(stderr()); flush(stdout()) > > ### Name: prune.rpart > ### Title: Cost-complexity Pruning of an Rpart Object > ### Aliases: prune.rpart prune > ### Keywords: tree > > ### ** Examples > > z.auto <- rpart(Mileage ~ Weight, car.test.frame) > zp <- prune(z.auto, cp = 0.1) > plot(zp) #plot smaller rpart object > > > > cleanEx() > nameEx("residuals.rpart") > ### * residuals.rpart > > flush(stderr()); flush(stdout()) > > ### Name: residuals.rpart > ### Title: Residuals From a Fitted Rpart Object > ### Aliases: residuals.rpart > ### Keywords: tree > > ### ** Examples > > fit <- rpart(skips ~ Opening + Solder + Mask + PadType + Panel, + data = solder.balance, method = "anova") > summary(residuals(fit)) Min. 1st Qu. Median Mean 3rd Qu. Max. -13.8000 -1.0361 -0.6833 0.0000 0.9639 16.2000 > plot(predict(fit),residuals(fit)) > > > > cleanEx() > nameEx("rpart") > ### * rpart > > flush(stderr()); flush(stdout()) > > ### Name: rpart > ### Title: Recursive Partitioning and Regression Trees > ### Aliases: rpart > ### Keywords: tree > > ### ** Examples > > fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis) > fit2 <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, + parms = list(prior = c(.65,.35), split = "information")) > fit3 <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, + control = rpart.control(cp = 0.05)) > par(mfrow = c(1,2), xpd = NA) # otherwise on some devices the text is clipped > plot(fit) > text(fit, use.n = TRUE) > plot(fit2) > text(fit2, use.n = TRUE) > > > > graphics::par(get("par.postscript", pos = 'CheckExEnv')) > cleanEx() > nameEx("rsq.rpart") > ### * rsq.rpart > > flush(stderr()); flush(stdout()) > > ### Name: rsq.rpart > ### Title: Plots the Approximate R-Square for the Different Splits > ### Aliases: rsq.rpart > ### Keywords: tree > > ### ** Examples > > z.auto <- rpart(Mileage ~ Weight, car.test.frame) > rsq.rpart(z.auto) Regression tree: rpart(formula = Mileage ~ Weight, data 
= car.test.frame) Variables actually used in tree construction: [1] Weight Root node error: 1354.6/60 = 22.576 n= 60 CP nsplit rel error xerror xstd 1 0.595349 0 1.00000 1.03378 0.180465 2 0.134528 1 0.40465 0.58366 0.109010 3 0.012828 2 0.27012 0.44092 0.086528 4 0.010000 3 0.25729 0.44158 0.086630 > > > > cleanEx() > nameEx("snip.rpart") > ### * snip.rpart > > flush(stderr()); flush(stdout()) > > ### Name: snip.rpart > ### Title: Snip Subtrees of an Rpart Object > ### Aliases: snip.rpart > ### Keywords: tree > > ### ** Examples > > ## dataset not in R > ## Not run: > ##D z.survey <- rpart(market.survey) # grow the rpart object > ##D plot(z.survey) # plot the tree > ##D z.survey2 <- snip.rpart(z.survey, toss = 2) # trim subtree at node 2 > ##D plot(z.survey2) # plot new tree > ##D > ##D # can also interactively select the node using the mouse in the > ##D # graphics window > ## End(Not run) > > > cleanEx() > nameEx("solder.balance") > ### * solder.balance > > flush(stderr()); flush(stdout()) > > ### Name: solder.balance > ### Title: Soldering of Components on Printed-Circuit Boards > ### Aliases: solder.balance solder > ### Keywords: datasets > > ### ** Examples > > fit <- rpart(skips ~ Opening + Solder + Mask + PadType + Panel, + data = solder.balance, method = "anova") > summary(residuals(fit)) Min. 1st Qu. Median Mean 3rd Qu. Max. 
-13.8000 -1.0361 -0.6833 0.0000 0.9639 16.2000 > plot(predict(fit), residuals(fit)) > > > > cleanEx() > nameEx("stagec") > ### * stagec > > flush(stderr()); flush(stdout()) > > ### Name: stagec > ### Title: Stage C Prostate Cancer > ### Aliases: stagec > ### Keywords: datasets > > ### ** Examples > > require(survival) Loading required package: survival > rpart(Surv(pgtime, pgstat) ~ ., stagec) n= 146 node), split, n, deviance, yval * denotes terminal node 1) root 146 192.111100 1.0000000 2) grade< 2.5 61 44.799010 0.3634439 4) g2< 11.36 33 9.117405 0.1229835 * 5) g2>=11.36 28 27.602190 0.7345610 10) gleason< 5.5 20 14.297110 0.5304115 * 11) gleason>=5.5 8 11.094650 1.3069940 * 3) grade>=2.5 85 122.441500 1.6148600 6) age>=56.5 75 103.062900 1.4255040 12) gleason< 7.5 50 66.119800 1.1407320 24) g2< 13.475 24 27.197170 0.8007306 * 25) g2>=13.475 26 36.790960 1.4570210 50) g2>=17.915 15 20.332740 0.9789825 * 51) g2< 17.915 11 13.459010 2.1714480 * 13) gleason>=7.5 25 33.487250 2.0307290 26) g2>=15.29 10 11.588480 1.2156230 * 27) g2< 15.29 15 18.939150 2.7053610 * 7) age< 56.5 10 13.769010 3.1822320 * > > > > cleanEx() detaching ‘package:survival’ > nameEx("summary.rpart") > ### * summary.rpart > > flush(stderr()); flush(stdout()) > > ### Name: summary.rpart > ### Title: Summarize a Fitted Rpart Object > ### Aliases: summary.rpart > ### Keywords: tree > > ### ** Examples > > ## a regression tree > z.auto <- rpart(Mileage ~ Weight, car.test.frame) > summary(z.auto) Call: rpart(formula = Mileage ~ Weight, data = car.test.frame) n= 60 CP nsplit rel error xerror xstd 1 0.59534912 0 1.0000000 1.0337818 0.18046532 2 0.13452819 1 0.4046509 0.5836606 0.10900973 3 0.01282843 2 0.2701227 0.4409221 0.08652804 4 0.01000000 3 0.2572943 0.4415805 0.08663003 Variable importance Weight 100 Node number 1: 60 observations, complexity param=0.5953491 mean=24.58333, MSE=22.57639 left son=2 (45 obs) right son=3 (15 obs) Primary splits: Weight < 2567.5 to the right, improve=0.5953491, (0 
missing) Node number 2: 45 observations, complexity param=0.1345282 mean=22.46667, MSE=8.026667 left son=4 (22 obs) right son=5 (23 obs) Primary splits: Weight < 3087.5 to the right, improve=0.5045118, (0 missing) Node number 3: 15 observations mean=30.93333, MSE=12.46222 Node number 4: 22 observations mean=20.40909, MSE=2.78719 Node number 5: 23 observations, complexity param=0.01282843 mean=24.43478, MSE=5.115312 left son=10 (15 obs) right son=11 (8 obs) Primary splits: Weight < 2747.5 to the right, improve=0.1476996, (0 missing) Node number 10: 15 observations mean=23.8, MSE=4.026667 Node number 11: 8 observations mean=25.625, MSE=4.984375 > > ## a classification tree with multiple variables and surrogate splits. > summary(rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)) Call: rpart(formula = Kyphosis ~ Age + Number + Start, data = kyphosis) n= 81 CP nsplit rel error xerror xstd 1 0.17647059 0 1.0000000 1.000000 0.2155872 2 0.01960784 1 0.8235294 1.058824 0.2200975 3 0.01000000 4 0.7647059 1.058824 0.2200975 Variable importance Start Age Number 64 24 12 Node number 1: 81 observations, complexity param=0.1764706 predicted class=absent expected loss=0.2098765 P(node) =1 class counts: 64 17 probabilities: 0.790 0.210 left son=2 (62 obs) right son=3 (19 obs) Primary splits: Start < 8.5 to the right, improve=6.762330, (0 missing) Number < 5.5 to the left, improve=2.866795, (0 missing) Age < 39.5 to the left, improve=2.250212, (0 missing) Surrogate splits: Number < 6.5 to the left, agree=0.802, adj=0.158, (0 split) Node number 2: 62 observations, complexity param=0.01960784 predicted class=absent expected loss=0.09677419 P(node) =0.7654321 class counts: 56 6 probabilities: 0.903 0.097 left son=4 (29 obs) right son=5 (33 obs) Primary splits: Start < 14.5 to the right, improve=1.0205280, (0 missing) Age < 55 to the left, improve=0.6848635, (0 missing) Number < 4.5 to the left, improve=0.2975332, (0 missing) Surrogate splits: Number < 3.5 to the left, 
agree=0.645, adj=0.241, (0 split) Age < 16 to the left, agree=0.597, adj=0.138, (0 split) Node number 3: 19 observations predicted class=present expected loss=0.4210526 P(node) =0.2345679 class counts: 8 11 probabilities: 0.421 0.579 Node number 4: 29 observations predicted class=absent expected loss=0 P(node) =0.3580247 class counts: 29 0 probabilities: 1.000 0.000 Node number 5: 33 observations, complexity param=0.01960784 predicted class=absent expected loss=0.1818182 P(node) =0.4074074 class counts: 27 6 probabilities: 0.818 0.182 left son=10 (12 obs) right son=11 (21 obs) Primary splits: Age < 55 to the left, improve=1.2467530, (0 missing) Start < 12.5 to the right, improve=0.2887701, (0 missing) Number < 3.5 to the right, improve=0.1753247, (0 missing) Surrogate splits: Start < 9.5 to the left, agree=0.758, adj=0.333, (0 split) Number < 5.5 to the right, agree=0.697, adj=0.167, (0 split) Node number 10: 12 observations predicted class=absent expected loss=0 P(node) =0.1481481 class counts: 12 0 probabilities: 1.000 0.000 Node number 11: 21 observations, complexity param=0.01960784 predicted class=absent expected loss=0.2857143 P(node) =0.2592593 class counts: 15 6 probabilities: 0.714 0.286 left son=22 (14 obs) right son=23 (7 obs) Primary splits: Age < 111 to the right, improve=1.71428600, (0 missing) Start < 12.5 to the right, improve=0.79365080, (0 missing) Number < 3.5 to the right, improve=0.07142857, (0 missing) Node number 22: 14 observations predicted class=absent expected loss=0.1428571 P(node) =0.1728395 class counts: 12 2 probabilities: 0.857 0.143 Node number 23: 7 observations predicted class=present expected loss=0.4285714 P(node) =0.08641975 class counts: 3 4 probabilities: 0.429 0.571 > > > > cleanEx() > nameEx("text.rpart") > ### * text.rpart > > flush(stderr()); flush(stdout()) > > ### Name: text.rpart > ### Title: Place Text on a Dendrogram Plot > ### Aliases: text.rpart > ### Keywords: tree > > ### ** Examples > > freen.tr <- rpart(y ~ ., 
freeny) > par(xpd = TRUE) > plot(freen.tr) > text(freen.tr, use.n = TRUE, all = TRUE) > > > > graphics::par(get("par.postscript", pos = 'CheckExEnv')) > cleanEx() > nameEx("xpred.rpart") > ### * xpred.rpart > > flush(stderr()); flush(stdout()) > > ### Name: xpred.rpart > ### Title: Return Cross-Validated Predictions > ### Aliases: xpred.rpart > ### Keywords: tree > > ### ** Examples > > fit <- rpart(Mileage ~ Weight, car.test.frame) > xmat <- xpred.rpart(fit) > xerr <- (xmat - car.test.frame$Mileage)^2 > apply(xerr, 2, sum) # cross-validated error estimate 0.79767456 0.28300396 0.04154257 0.01132626 1396.6687 773.1546 577.8990 594.1341 > > # approx same result as rel. error from printcp(fit) > apply(xerr, 2, sum)/var(car.test.frame$Mileage) 0.79767456 0.28300396 0.04154257 0.01132626 60.83306 33.67539 25.17087 25.87800 > printcp(fit) Regression tree: rpart(formula = Mileage ~ Weight, data = car.test.frame) Variables actually used in tree construction: [1] Weight Root node error: 1354.6/60 = 22.576 n= 60 CP nsplit rel error xerror xstd 1 0.595349 0 1.00000 1.03378 0.180465 2 0.134528 1 0.40465 0.58366 0.109010 3 0.012828 2 0.27012 0.44092 0.086528 4 0.010000 3 0.25729 0.44158 0.086630 > > > > ### *