segmented/0000755000175100001440000000000012404065367012252 5ustar hornikuserssegmented/inst/0000755000175100001440000000000012404051010013203 5ustar hornikuserssegmented/inst/CITATION0000644000175100001440000000236112404051010014342 0ustar hornikuserscitHeader("To cite segmented in publications use:") citEntry(entry="Article", title = "Estimating regression models with unknown break-points.", author = personList(as.person("Vito M.R. Muggeo")), journal = "Statistics in Medicine", year = "2003", volume = "22", pages = "3055--3071", textVersion = paste("Vito M. R. Muggeo (2003).", "Estimating regression models with unknown break-points.", "Statistics in Medicine, 22, 3055-3071.") ) citEntry(entry="Article", title = "segmented: an R Package to Fit Regression Models with Broken-Line Relationships.", author = personList(as.person("Vito M.R. Muggeo")), journal = "R News", year = "2008", volume = "8", number = "1", pages = "20--25", url = "http://cran.r-project.org/doc/Rnews/", textVersion = paste("Vito M. R. 
Muggeo (2008).", "segmented: an R Package to Fit Regression Models with Broken-Line Relationships.", "R News, 8/1, 20-25.", "URL http://cran.r-project.org/doc/Rnews/.") ) segmented/NAMESPACE0000644000175100001440000000133512404051010013447 0ustar hornikusersexport(segmented, segmented.default, segmented.lm, segmented.glm, broken.line ,confint.segmented,davies.test,draw.history, intercept,lines.segmented,plot.segmented,print.segmented, seg.control,seg.lm.fit,seg.glm.fit,seg.lm.fit.boot,seg.glm.fit.boot, seg.def.fit,seg.def.fit.boot, slope, summary.segmented,print.summary.segmented,vcov.segmented, predict.segmented, points.segmented) S3method(segmented,default) S3method(segmented,lm) S3method(segmented,glm) S3method(plot,segmented) S3method(print,segmented) S3method(summary,segmented) S3method(print, summary.segmented) S3method(lines,segmented) S3method(confint,segmented) S3method(vcov,segmented) S3method(predict,segmented) S3method(points,segmented) segmented/NEWS0000644000175100001440000002467312404051010012741 0ustar hornikusers************************************* * * * Changes in segmented * * * ************************************* =============== version 0.5-0.0 =============== * segmented.default() introduced. Now it is possible to estimate segmented relationships in arbitrary regression models (besides lm and glm) where specific methods do not exist (e.g. cox or quantile regression models). =============== version 0.4-0.1 (not on CRAN) =============== * segmented.lm() and segmented.glm() did not work if the starting model included additional "variables", such as 'threshold' in 'subset=age0. * The breakpoint starting values when automatic selection is performed are now specified as equally spaced values (optionally as quantiles). 
See argument 'quant' in seg.control().
* added 'Authors@R' entry in the DESCRIPTION file

=============== version 0.2-9.1 ===============

* Some bugs fixed: segmented.lm() and segmented.glm() did not finish correctly when no breakpoint was found; now segmented.lm() and segmented.glm() take care of flat relationships; plot.segmented() did not correctly compute the partial residuals for segmented glm fits.

=============== version 0.2-9.0 ===============

* Bootstrap restarting implemented to deal with problems coming from flat segmented relationships. segmented is now less sensitive to the starting values supplied for 'psi'.
* At convergence, segmented now constrains the gap coefficients to be exactly zero. This is the default and it can be altered by the 'gap' argument in seg.control().
* plot.segmented() has been re-written. It gains argument `res' for plotting partial residuals along with the fitted piecewise lines, and now it produces nicer (and typically smaller) plots.
* Some bugs fixed: davies.test() did not work correctly for deterministic data (thanks to Glenn Roberts for finding the error). davies.test() also returns the `process', i.e. the different values of the evaluation points and the corresponding test statistic.

=============== version 0.2-8.4 ===============

* Some bugs fixed: segmented.glm() fitted a simple "lm" (and not "glm") (the error was inadvertently introduced in 0.2-8.3, thanks to Véronique Storme for finding the error); broken.line() was not working for models without intercept and a null left slope; intercept() was not working correctly with multiple segmented variables.

=============== version 0.2-8.3 ===============

* Some minor bugs fixed: segmented.lm() and segmented.glm() did not find the offset variable in the dataframe used in the initial (g)lm call; segmented.lm() and segmented.glm() sometimes returned an error when the automated algorithm was used (thanks to Paul Cohen for finding the error).

=============== version 0.2-8.2 ===============

* Some minor bugs fixed: segmented.lm() and segmented.glm() *always* included the left slope in the estimation process, although the number of parameters was correct in the returned final fit; confint.segmented() did not order the estimated breakpoints for the variable having rev.sgn=TRUE; intercept() missed the (currently meaningless) argument var.diff (thanks to Eric Fuchs for pointing that out).

=============== version 0.2-8.1 ===============

* Some minor bugs fixed: segmented.lm() and segmented.glm() were not working correctly with dataframe subset or when the starting linear model included several intercepts (e.g., see the example about data("plant"); thanks to Nicola Ferrari for finding the error); davies.test() did not work when the variable name of its argument `seg.Z' included reserved words, e.g. `seg.Z~dist' (thanks to Thom White for finding the error).

=============== version 0.2-8 ===============

* intercept() added. It computes the intercepts of the regression lines for each segment of the fitted segmented relationship.
* plot.segmented() now accepts a vector `col' argument to draw the fitted piecewise linear relationships with different colors.
* Some minor bugs fixed (summary.segmented() was not working correctly).

=============== version 0.2-7.3 ===============

* argument APC added to the slope() function to compute the `annual percent change'.
* Some minor bugs fixed: confint and slope were not working correctly when the estimated breakpoints were returned in non-increasing order; offset was ignored in segmented.lm and segmented.glm; broken.line() was not working correctly and its argument gap was unimplemented (thanks to M. Rennie for pointing that out); summary.segmented() was not working for models with no linear term, i.e. fitted via segmented(lm(y~0),..).

=============== version 0.2-7.2 ===============

* segmented.lm and segmented.glm now accept objects with formulas y~., thanks to G.
Ferrara for finding the error.
* Some bugs fixed (slope and confint were using the normal (rather than the t) distribution to compute the CIs in gaussian models).

=============== version 0.2-7.1 ===============

* segmented.lm and segmented.glm now accept objects without 'explicit' formulas, namely those returned by lm(my_fo,..) (and glm(my_fo,..)) where my_fo was defined earlier. Thanks to Y. Iwasaki for finding the error.

=============== version 0.2-7 ===============

* A sort of automatic procedure for breakpoint estimation is implemented. See argument stop.if.error in seg.control().
* davies.test() now accepts a one-sided formula (~x) rather than a character string ("x") to specify the segmented variable to be tested. davies.test() also gains the arguments `beta0' and `dispersion'.
* Some bugs fixed.

=============== version 0.2-6 ===============

* vcov.segmented() added.
* option var.diff for robust covariance matrix has been added in summary.segmented(), print.summary.segmented(), slope(), and confint().
* Some bugs fixed.
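
The intercept() and slope() entries above describe per-segment intercepts and slopes of a fitted broken line. As a rough illustration of that idea only (this is not the package's own code), the sketch below uses base R, made-up noise-free data, and a breakpoint psi that is assumed known rather than estimated; all variable names are hypothetical.

```r
# Illustrative sketch only: base R, hypothetical data, breakpoint assumed known.
x   <- 0:10
psi <- 4                                      # assumed (not estimated) breakpoint
y   <- 2 + 0.5 * x + 1.5 * pmax(x - psi, 0)   # noise-free broken line

U   <- pmax(x - psi, 0)                       # the (x - psi)_+ term
fit <- lm(y ~ x + U)
b   <- unname(coef(fit))                      # intercept, left slope, difference-in-slope

slopes     <- c(b[2], b[2] + b[3])            # slope of each segment
intercepts <- c(b[1], b[1] - psi * b[3])      # continuity at psi fixes the second intercept
print(rbind(intercepts, slopes))
```

With a fitted segmented object, the exported intercept() and slope() functions report the analogous quantities using the estimated breakpoints.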
segmented/data/0000755000175100001440000000000012404051010013137 5ustar hornikusers
segmented/data/down.rda0000644000175100001440000000062112404051010014575 0ustar hornikusers
segmented/data/plant.rda0000644000175100001440000000123012404051010014741 0ustar hornikusers
segmented/data/stagnant.rda0000644000175100001440000000054012404051010015445 0ustar hornikusers
segmented/R/0000755000175100001440000000000012404051010012427 5ustar hornikusers
segmented/R/predict.segmented.r0000644000175100001440000000724612404051010016227 0ustar hornikusers
predict.segmented<-function(object, newdata, ...){
#rev: 30/10/2013: it seems to work correctly, even with the minus variable (null right slope..)
#rev: 14/4/2014 now it works like predict.lm/glm
#BUT problems if type="terms" (it actually works; the problem is that
# it returns one column each for "x", "U.x", "psi.x".. (they should possibly be summed..)
#if(!is.null(object$orig.call$offset)) stop("predict.segmented can not handle argument 'offset'. 
Include it in formula!") dummy.matrix<-function(x.values, x.name, obj.seg, psi.est=TRUE){ #given the segmented fit 'obj.seg' and a segmented variable x.name with corresponding values x.values, #this function simply returns a matrix with columns (x, (x-psi)_+, -b*I(x>psi)) #or ((x-psi)_+, -b*I(x>psi)) if obj.seg does not include the coef for the linear "x" f.U<-function(nomiU, term=NULL){ #trasforma i nomi dei coeff U (o V) nei nomi delle variabili corrispondenti #and if 'term' is provided (i.e. it differs from NULL) the index of nomiU matching term are returned k<-length(nomiU) nomiUsenzaU<-strsplit(nomiU, "\\.") nomiU.ok<-vector(length=k) for(i in 1:k){ nomi.i<-nomiUsenzaU[[i]][-1] if(length(nomi.i)>1) nomi.i<-paste(nomi.i,collapse=".") nomiU.ok[i]<-nomi.i } if(!is.null(term)) nomiU.ok<-(1:k)[nomiU.ok%in%term] return(nomiU.ok) } n<-length(x.values) #le seguenti righe selezionavano (ERRONEAMENTE) sia "U1.x" sia "U1.neg.x" (se "x" e "neg.x" erano segmented covariates) #nameU<- grep(paste("\\.",x.name,"$", sep=""), obj.seg$nameUV$U, value = TRUE) #nameV<- grep(paste("\\.",x.name,"$", sep=""), obj.seg$nameUV$V, value = TRUE) nameU<-obj.seg$nameUV$U[f.U(obj.seg$nameUV$U,x.name)] nameV<-obj.seg$nameUV$V[f.U(obj.seg$nameUV$V,x.name)] diffSlope<-coef(obj.seg)[nameU] est.psi<-obj.seg$psi[nameV,2] k<-length(est.psi) PSI <- matrix(rep(est.psi, rep(n, k)), ncol = k) newZ<-matrix(x.values, nrow=n,ncol=k, byrow = FALSE) dummy1<-pmax(newZ-PSI,0) if(psi.est){ V<-ifelse(newZ>PSI,-1,0) dummy2<- if(k==1) V*diffSlope else V%*%diag(diffSlope) #t(diffSlope*t(-I(newZ>PSI))) newd<-cbind(x.values,dummy1,dummy2) colnames(newd)<-c(x.name,nameU, nameV) } else { newd<-cbind(x.values,dummy1) colnames(newd)<-c(x.name,nameU) } if(!x.name%in%names(coef(obj.seg))) newd<-newd[,-1,drop=FALSE] return(newd) } #-------------------------------------------------------------- if(missing(newdata)){ newd.ok<-model.frame(object) } else { #devi trasformare la variabili segmented attraverso dummy.matrix() 
nameU<-object$nameUV$U nameV<-object$nameUV$V nameZ<-object$nameUV$Z n<-nrow(newdata) r<-NULL for(i in 1:length(nameZ)){ x.values<-newdata[[nameZ[i]]] DM<-dummy.matrix(x.values, nameZ[i], object) r[[i]]<-DM } newd.ok<-data.frame(matrix(unlist(r), nrow=n, byrow = FALSE)) names(newd.ok)<- unlist(sapply(r, colnames)) idZ<-match(nameZ, names( newdata)) newdata<-cbind(newdata[,-idZ, drop=FALSE], newd.ok) # newdata<-subset(newdata, select=-idZ) newdata<-cbind(newdata, newd.ok) } class(object)<-class(object)[-1] f<-predict(object, newdata=newdata, ...) #f<-if(inherits(object, what = "glm", which = FALSE)) predict.glm(object, newdata=newd.ok, ...) else predict.lm(object, newdata=newd.ok, ...) return(f) #sommare se "terms"? } segmented/R/seg.lm.fit.boot.r0000644000175100001440000001322012404051010015520 0ustar hornikusersseg.lm.fit.boot<-function(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE){ #random se TRUE prende valori random quando errore: comunque devi modificare qualcosa (magari con it.max) # per fare restituire la dev in corrispondenza del punto psi-random #nonParm. se TRUE implemneta il case resampling. Quello semiparam dipende dal non-errore di #---------------------------------- # sum.of.squares<-function(obj.seg){ # #computes the "correct" SumOfSquares from a segmented" fit # b<-obj.seg$obj$coef # X<-qr.X(obj.seg$obj$qr) #X<-model.matrix(obj.seg) # X<-X[,!is.na(b)] # b<-b[!is.na(b)] # rev.b<-rev(b) # rev.b[1:length(obj.seg$psi)]<-0 # b<-rev(rev.b) # new.fitted<-drop(X%*%b) # new.res<- obj.seg$obj$residuals + obj.seg$obj$fitted - new.fitted # ss<-sum(new.res^2) # ss # } #-------- extract.psi<-function(lista){ #serve per estrarre il miglior psi.. 
dev.values<-lista[[1]] psi.values<-lista[[2]] dev.ok<-min(dev.values) id.dev.ok<-which.min(dev.values) if(is.list(psi.values)) psi.values<-matrix(unlist(psi.values), nrow=length(dev.values), byrow=TRUE) if(!is.matrix(psi.values)) psi.values<-matrix(psi.values) psi.ok<-psi.values[id.dev.ok,] r<-list(SumSquares.no.gap=dev.ok, psi=psi.ok) r } #------------- visualBoot<-opz$visualBoot opz.boot<-opz opz.boot$pow=c(1.1,1.2) opz1<-opz opz1$it.max <-1 n<-length(y) o0<-try(seg.lm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol=FALSE), silent=TRUE) rangeZ <- apply(Z, 2, range) #serve sempre if(!is.list(o0)) { o0<- seg.lm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol=TRUE) o0<-extract.psi(o0) if(!nonParam) {warning("using nonparametric boot");nonParam<-TRUE} } if(is.list(o0)){ est.psi00<-est.psi0<-o0$psi ss00<-o0$SumSquares.no.gap if(!nonParam) fitted.ok<-fitted(o0) } else { if(!nonParam) stop("the first fit failed and I cannot extract fitted values for the semipar boot") if(random) { est.psi00<-est.psi0<-apply(rangeZ,2,function(r)runif(1,r[1],r[2])) PSI1 <- matrix(rep(est.psi0, rep(nrow(Z), length(est.psi0))), ncol = length(est.psi0)) o0<-try(seg.lm.fit(y, XREG, Z, PSI1, w, offs, opz1), silent=TRUE) ss00<-o0$SumSquares.no.gap } else { est.psi00<-est.psi0<-apply(PSI,2,mean) ss00<-opz$dev0 } } all.est.psi.boot<-all.selected.psi<-all.est.psi<-matrix(, nrow=n.boot, ncol=length(est.psi0)) all.ss<-all.selected.ss<-rep(NA, n.boot) if(is.null(size.boot)) size.boot<-n # na<- ,,apply(...,2,function(x)mean(is.na(x))) Z.orig<-Z if(visualBoot) cat(0, " ", formatC(opz$dev0, 3, format = "f"),"", "(No breakpoint(s))", "\n") count.random<-0 for(k in seq(n.boot)){ PSI <- matrix(rep(est.psi0, rep(nrow(Z), length(est.psi0))), ncol = length(est.psi0)) if(jt) Z<-apply(Z.orig,2,jitter) if(nonParam){ id<-sample(n, size=size.boot, replace=TRUE) o.boot<-try(seg.lm.fit(y[id], XREG[id,,drop=FALSE], Z[id,,drop=FALSE], PSI[id,,drop=FALSE], w[id], offs[id], opz.boot), silent=TRUE) } else { 
      yy<-fitted.ok+sample(residuals(o0),size=n, replace=TRUE)
      #semiparametric bootstrap: refit on the resampled response, using the weights vector 'w'
      o.boot<-try(seg.lm.fit(yy, XREG, Z.orig, PSI, w, offs, opz.boot), silent=TRUE)
      }
    if(is.list(o.boot)){
      all.est.psi.boot[k,]<-est.psi.boot<-o.boot$psi
      } else {
      est.psi.boot<-apply(rangeZ,2,function(r)runif(1,r[1],r[2]))
      }
    PSI <- matrix(rep(est.psi.boot, rep(nrow(Z), length(est.psi.boot))), ncol = length(est.psi.boot))
    opz$h<-max(opz$h*.9, .2)
    opz$it.max<-opz$it.max+1
    o<-try(seg.lm.fit(y, XREG, Z.orig, PSI, w, offs, opz, return.all.sol=TRUE), silent=TRUE)
    if(!is.list(o) && random){
      est.psi0<-apply(rangeZ,2,function(r)runif(1,r[1],r[2]))
      PSI1 <- matrix(rep(est.psi0, rep(nrow(Z), length(est.psi0))), ncol = length(est.psi0))
      o<-try(seg.lm.fit(y, XREG, Z, PSI1, w, offs, opz1), silent=TRUE)
      count.random<-count.random+1
      }
    if(is.list(o)){
      if(!"coefficients"%in%names(o$obj)) o<-extract.psi(o)
      all.est.psi[k,]<-o$psi
      all.ss[k]<-o$SumSquares.no.gap
      if(o$SumSquares.no.gap<=ifelse(is.list(o0), o0$SumSquares.no.gap, 10^12)) o0<-o
      est.psi0<-o0$psi
      all.selected.psi[k,] <- est.psi0
      all.selected.ss[k]<-o0$SumSquares.no.gap #min(c(o$SumSquares.no.gap, o0$SumSquares.no.gap))
      }
    if(visualBoot) {
      flush.console()
      spp <- if (k < 10) "" else NULL
      cat(k, spp, "", formatC(o0$SumSquares.no.gap, 3, format = "f"), "\n")
      }
    } #end n.boot
  all.selected.psi<-rbind(est.psi00,all.selected.psi)
  all.selected.ss<-c(ss00, all.selected.ss)
  ris<-list(all.selected.psi=drop(all.selected.psi),all.selected.ss=all.selected.ss, all.psi=all.est.psi, all.ss=all.ss)
  if(is.null(o0$obj)){
    PSI1 <- matrix(rep(est.psi0, rep(nrow(Z), length(est.psi0))), ncol = length(est.psi0))
    o0<-try(seg.lm.fit(y, XREG, Z, PSI1, w, offs, opz1), silent=TRUE)
    }
  if(!is.list(o0)) return(0)
  o0$boot.restart<-ris
  return(o0)
  }
segmented/R/summary.segmented.R0000644000175100001440000001174512404051010016231 0ustar hornikusers
`summary.segmented` <- function(object, short=FALSE, var.diff=FALSE, ...){
  if(is.null(object$psi)) object<-object[[length(object)]]
  #i seguenti per calcolare aa,bb,cc 
funzionano per lm e glm, da verificare con arima.... # nome<-rownames(object$psi) # nome<-as.character(parse("",text=nome)) # aa<-grep("U",names(coef(object)[!is.na(coef(object))])) # bb<-unlist(sapply(nome,function(x){grep(x,names(coef(object)[!is.na(coef(object))]))},simplify=FALSE,USE.NAMES=FALSE)) # cc<-intersect(aa,bb) #indices of diff-slope parameters # iV<- -grep("psi.",names(coef(object)[!is.na(coef(object))]))#indices of all but the Vs if(var.diff && length(object$nameUV$Z)>1) { var.diff<-FALSE warning("var.diff set to FALSE with multiple segmented variables", call.=FALSE) } nomiU<-object$nameUV[[1]] nomiV<-object$nameUV[[2]] idU<-match(nomiU,names(coef(object)[!is.na(coef(object))])) idV<-match(nomiV,names(coef(object)[!is.na(coef(object))])) beta.c<- coef(object)[nomiU] #per metodo default.. if( !inherits(object, "segmented")){ summ <- c(summary(object, ...), object["psi"]) summ[c("it","epsilon")]<-object[c("it","epsilon")] coeff<-coef(object) v<-try(vcov(object), silent=TRUE) if(class(v)!="try-error"){ v<-sqrt(diag(v)) summ$gap<-cbind(coeff[idV]*beta.c,abs(v[idV]*beta.c),coeff[idV]/v[idV]) colnames(summ$gap)<-c("Est.","SE","t value") rownames(summ$gap)<-nomiU } else { summ$gap<-cbind(coeff[idV]*beta.c,NA,NA) colnames(summ$gap)<-c("Est.","SE","t value") rownames(summ$gap)<-nomiU } return(summ) } if("lm"%in%class(object) && !"glm"%in%class(object)){ #if(!inherits(object, "glm")){ summ <- c(summary.lm(object, ...), object["psi"]) summ$Ttable<-summ$coefficients if(var.diff){ sigma2.new<-tapply(object$residuals, object$id.group, function(xx){sum(xx^2)}) summ$df.new<-tapply(object$residuals, object$id.group, function(xx){(length(xx)-length(object$coef))}) summ$sigma.new<-sqrt(sigma2.new/summ$df.new) #modifica gli SE Qr <- object$qr p <- object$rank p1 <- 1L:p inv.XtX <- chol2inv(Qr$qr[p1, p1, drop = FALSE]) X <- qr.X(Qr,FALSE) attr(X, "assign") <- NULL sigma.i<-rowSums(model.matrix(~0+factor(object$id.group))%*%diag(summ$sigma.new)) 
      var.b<-inv.XtX%*%crossprod(X*sigma.i)%*%inv.XtX
      dimnames(var.b)<-dimnames(summ$cov.unscaled)
      summ$cov.var.diff<-var.b
      summ$Ttable[,2]<-sqrt(diag(var.b))
      summ$Ttable[,3]<-summ$Ttable[,1]/summ$Ttable[,2]
      summ$Ttable[,4]<- 2 * pnorm(abs(summ$Ttable[,3]), lower.tail = FALSE)
      dimnames(summ$Ttable) <- list(names(object$coefficients)[Qr$pivot[p1]], c("Estimate", "Std. Error", "z value", "Pr(>|z|)"))
      }
    coeff<-summ$Ttable[,1] #summ$coefficients[,1]
    v<-summ$Ttable[,2] #summ$coefficients[,2]
    summ$gap<-cbind(coeff[idV]*beta.c,abs(v[idV]*beta.c),coeff[idV]/v[idV])
    summ$Ttable[idU,4]<-NA
    summ$Ttable<-summ$Ttable[-idV,]
    #dimnames(summ$gap)<-list(rep("",nrow(object$psi)),c("Est.","SE","t value"))
    colnames(summ$gap)<-c("Est.","SE","t value")
    rownames(summ$gap)<-nomiU
    summ[c("it","epsilon")]<-object[c("it","epsilon")]
    summ$var.diff<-var.diff
    summ$short<-short
    class(summ) <- c("summary.segmented", "summary.lm")
    return(summ)
    }
  #if("glm"%in%class(object)){
  if(inherits(object, "glm")){
    summ <- c(summary.glm(object, ...), object["psi"])
    summ$Ttable<-summ$coefficients[-idV,]
    summ$Ttable[idU,4]<-NA
    coeff<-summ$coefficients[,1]
    v<-summ$coefficients[,2]
    summ$gap<-cbind(coeff[idV]*beta.c,abs(v[idV]*beta.c),coeff[idV]/v[idV])
    #dimnames(summ$gap)<-list(rep("",nrow(object$psi)),c("Est.","SE","t value"))
    colnames(summ$gap)<-c("Est.","SE","t value")
    rownames(summ$gap)<-nomiU
    summ[c("it","epsilon")]<-object[c("it","epsilon")]
    summ$short<-short
    class(summ) <- c("summary.segmented", "summary.glm")
    return(summ)}
  if("Arima"%in%class(object)){ #to be checked
    coeff<-object$coef
    v<-sqrt(diag(object$var.coef))
    Ttable<-cbind(coeff[-idV],v[-idV],coeff[-idV]/v[-idV])
    object$gap<-cbind(coeff[idV]*beta.c,v[idV]*beta.c,coeff[idV]/v[idV])
    #dimnames(object$gap)<-list(rep("",nrow(object$psi)),c("Est.","SE","t value"))
    #label the gap table stored in 'object' ('summ' is only assigned below)
    colnames(object$gap)<-c("Est.","SE","t value")
    rownames(object$gap)<-nomiU
    colnames(Ttable)<-c("Estimate","Std. Error","t value")
    object$Ttable<-Ttable
    object$short<-short
    summ<-object
    class(summ) <- "summary.segmented"
    return(summ)}
  }
segmented/R/intercept.r0000644000175100001440000000676712404051010014627 0ustar hornikusers
intercept<-function (ogg, parm, gap=TRUE, rev.sgn = FALSE, var.diff = FALSE, digits = max(3, getOption("digits") - 3)){
  #corrects for the case of no model intercept -- WHAT DID I MEAN??
  #perhaps that it now works when the model has no intercept.
  #--
  f.U<-function(nomiU, term=NULL){
    #converts the names of the U (or V) coefficients into the names of the corresponding variables
    #and if 'term' is provided (i.e. it differs from NULL) the indices of nomiU matching term are returned
    k<-length(nomiU)
    nomiUsenzaU<-strsplit(nomiU, "\\.")
    nomiU.ok<-vector(length=k)
    for(i in 1:k){
      nomi.i<-nomiUsenzaU[[i]][-1]
      if(length(nomi.i)>1) nomi.i<-paste(nomi.i,collapse=".")
      nomiU.ok[i]<-nomi.i
      }
    if(!is.null(term)) nomiU.ok<-(1:k)[nomiU.ok%in%term]
    return(nomiU.ok)
    }
  #--
  #if (!"segmented" %in% class(ogg)) stop("A segmented model is needed")
  if (var.diff && length(ogg$nameUV$Z) > 1) {
    var.diff <- FALSE
    warning("var.diff set to FALSE with multiple segmented variables", call. = FALSE)
    }
  nomepsi <- rownames(ogg$psi)
  nomeU <- ogg$nameUV[[1]]
  nomeZ <- ogg$nameUV[[3]]
  if (missing(parm)) {
    nomeZ <- ogg$nameUV[[3]]
    if (length(rev.sgn) == 1) rev.sgn <- rep(rev.sgn, length(nomeZ))
    } else {
    if (!all(parm %in% ogg$nameUV[[3]])) {
      stop("invalid parm")
      } else {
      nomeZ <- parm
      }
    }
  if (length(rev.sgn) != length(nomeZ)) rev.sgn <- rep(rev.sgn, length.out = length(nomeZ))
  nomi <- names(coef(ogg))
  nomi <- nomi[-match(nomepsi, nomi)]
  Allpsi <- index <- vector(mode = "list", length = length(nomeZ))
  gapCoef<-summary.segmented(ogg)$gap
  Ris <- list()
  rev.sgn <- rep(rev.sgn, length.out = length(nomeZ))
  if("(Intercept)"%in%names(coef(ogg))){
    alpha0 <- alpha00 <- coef(ogg)["(Intercept)"]} else {alpha0 <- alpha00 <-0}
  #for each segmented variable...
for (i in 1:length(nomeZ)) { # id.cof.U <- grep(paste("\\.", nomeZ[i], "$", sep = ""), nomi, value = FALSE) # psii <- ogg$psi[grep(paste("\\.", nomeZ[i], "$", sep = ""), rownames(ogg$psi), value = FALSE), 2] id.cof.U <- f.U(ogg$nameUV$U, nomeZ[i]) + (match(ogg$nameUV$U[1], nomi)-1) psii<- ogg$psi[f.U(ogg$nameUV$V, nomeZ[i]) , "Est."] Allpsi[[i]] <- sort(psii, decreasing = FALSE) id.cof.U <- id.cof.U[order(psii)] index[[i]] <- id.cof.U alpha0<-if("(Intercept)"%in%names(coef(ogg))) coef(ogg)["(Intercept)"] else 0 ind <- as.numeric(na.omit(unlist(index[[i]]))) cof <- coef(ogg)[ind] alpha <- vector(length = length(ind)) #gapCoef.i<-gapCoef[grep(paste("\\.",nomeZ[i],"$",sep=""), rownames(gapCoef), value = FALSE),"Est."] gapCoef.i<-gapCoef[f.U(rownames(gapCoef), nomeZ[i]) ,"Est."] for (j in 1:length(cof)) { alpha[j] <- alpha0 - Allpsi[[i]][j] * cof[j] if(gap) alpha[j] <- alpha[j] - gapCoef.i[j] alpha0 <- alpha[j] } #if(gap) alpha<-alpha -gapCoef[grep(paste("\\.",nomeZ[i],"$",sep=""), rownames(gapCoef), value = FALSE),"Est."] cof.out <- c(alpha00, alpha) ris <- matrix(cof.out) dimnames(ris) <- list(paste("intercept", 1:nrow(ris), sep = ""), "Est.") Ris[[nomeZ[i]]] <- signif(ris, digits) } Ris } segmented/R/seg.control.R0000644000175100001440000000103612404051010015007 0ustar hornikusers`seg.control` <- function(toll=.0001, it.max=10, display=FALSE, stop.if.error=TRUE, K=10, quant=FALSE, last=TRUE, maxit.glm=25, h=1, n.boot=20, size.boot=NULL, gap=FALSE, jt=FALSE, nonParam=TRUE, random=TRUE, powers=c(1,1), seed=NULL, fn.obj=NULL){ list(toll=toll,it.max=it.max,visual=display,stop.if.error=stop.if.error, K=K,last=last,maxit.glm=maxit.glm,h=h,n.boot=n.boot, size.boot=size.boot, gap=gap, jt=jt, nonParam=nonParam, random=random, pow=powers, seed=seed, quant=quant, fn.obj=fn.obj)} segmented/R/seg.def.fit.boot.r0000644000175100001440000001237112404051010015654 0ustar hornikusersseg.def.fit.boot<-function(obj, Z, PSI, mfExt, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, 
random=FALSE){ #random se TRUE prende valori random quando errore: comunque devi modificare qualcosa (magari con it.max) # per fare restituire la dev in corrispondenza del punto psi-random #nonParm. se TRUE implemneta il case resampling. Quello semiparam dipende dal non-errore di extract.psi<-function(lista){ #serve per estrarre il miglior psi.. dev.values<-lista[[1]] psi.values<-lista[[2]] dev.ok<-min(dev.values) id.dev.ok<-which.min(dev.values) if(is.list(psi.values)) psi.values<-matrix(unlist(psi.values), nrow=length(dev.values), byrow=TRUE) if(!is.matrix(psi.values)) psi.values<-matrix(psi.values) psi.ok<-psi.values[id.dev.ok,] r<-list(SumSquares.no.gap=dev.ok, psi=psi.ok) r } #------------- visualBoot<-opz$visualBoot opz.boot<-opz opz.boot$pow=c(1.1,1.2) opz1<-opz opz1$it.max <-1 n<-nrow(mfExt) o0<-try(seg.def.fit(obj, Z, PSI, mfExt, opz), silent=TRUE) rangeZ <- apply(Z, 2, range) #serve sempre if(!is.list(o0)) { o0<- seg.def.fit(obj, Z, PSI, mfExt, opz, return.all.sol=TRUE) o0<-extract.psi(o0) if(!nonParam) {warning("using nonparametric boot");nonParam<-TRUE} } if(is.list(o0)){ est.psi00<-est.psi0<-o0$psi ss00<-o0$SumSquares.no.gap if(!nonParam) fitted.ok<-fitted(o0) } else { if(!nonParam) stop("the first fit failed and I cannot extract fitted values for the semipar boot") if(random) { est.psi00<-est.psi0<-apply(rangeZ,2,function(r)runif(1,r[1],r[2])) PSI1 <- matrix(rep(est.psi0, rep(nrow(Z), length(est.psi0))), ncol = length(est.psi0)) o0<-try(seg.def.fit(obj, Z, PSI1, mfExt, opz1), silent=TRUE) ss00<-o0$SumSquares.no.gap } else { est.psi00<-est.psi0<-apply(PSI,2,mean) ss00<-opz$dev0 } } all.est.psi.boot<-all.selected.psi<-all.est.psi<-matrix(, nrow=n.boot, ncol=length(est.psi0)) all.ss<-all.selected.ss<-rep(NA, n.boot) if(is.null(size.boot)) size.boot<-n # na<- ,,apply(...,2,function(x)mean(is.na(x))) Z.orig<-Z if(visualBoot) cat(0, " ", formatC(opz$dev0, 3, format = "f"),"", "(No breakpoint(s))", "\n") count.random<-0 for(k in seq(n.boot)){ PSI <- 
matrix(rep(est.psi0, rep(nrow(Z), length(est.psi0))), ncol = length(est.psi0)) if(jt) Z<-apply(Z.orig,2,jitter) if(nonParam){ id<-sample(n, size=size.boot, replace=TRUE) o.boot<-try(seg.def.fit(obj, Z[id,,drop=FALSE], PSI[id,,drop=FALSE], mfExt[id,,drop=FALSE], opz.boot), silent=TRUE) } else { yy<-fitted.ok+sample(residuals(o0),size=n, replace=TRUE) ##----> o.boot<-try(seg.lm.fit(yy, XREG, Z.orig, PSI, weights, offs, opz.boot), silent=TRUE) #in realt la risposta dovrebbe essere "yy" da cambiare in mfExt o.boot<- try(seg.def.fit(obj, Z.orig, PSI, mfExt, opz.boot), silent=TRUE) } if(is.list(o.boot)){ all.est.psi.boot[k,]<-est.psi.boot<-o.boot$psi } else { est.psi.boot<-apply(rangeZ,2,function(r)runif(1,r[1],r[2])) } PSI <- matrix(rep(est.psi.boot, rep(nrow(Z), length(est.psi.boot))), ncol = length(est.psi.boot)) opz$h<-max(opz$h*.9, .2) opz$it.max<-opz$it.max+1 o <- try(seg.def.fit(obj, Z.orig, PSI, mfExt, opz, return.all.sol=TRUE), silent=TRUE) if(!is.list(o) && random){ est.psi0<-apply(rangeZ,2,function(r)runif(1,r[1],r[2])) PSI1 <- matrix(rep(est.psi0, rep(nrow(Z), length(est.psi0))), ncol = length(est.psi0)) o <- try(seg.def.fit(obj, Z, PSI1, mfExt, opz1), silent=TRUE) count.random<-count.random+1 } if(is.list(o)){ if(!"coefficients"%in%names(o$obj)) o<-extract.psi(o) all.est.psi[k,]<-o$psi all.ss[k]<-o$SumSquares.no.gap if(o$SumSquares.no.gap<=ifelse(is.list(o0), o0$SumSquares.no.gap, 10^12)) o0<-o est.psi0<-o0$psi all.selected.psi[k,] <- est.psi0 all.selected.ss[k]<-o0$SumSquares.no.gap #min(c(o$SumSquares.no.gap, o0$SumSquares.no.gap)) } if(visualBoot) { flush.console() spp <- if (k < 10) "" else NULL cat(k, spp, "", formatC(o0$SumSquares.no.gap, 3, format = "f"), "\n") } } #end n.boot all.selected.psi<-rbind(est.psi00,all.selected.psi) all.selected.ss<-c(ss00, all.selected.ss) ris<-list(all.selected.psi=drop(all.selected.psi),all.selected.ss=all.selected.ss, all.psi=all.est.psi, all.ss=all.ss) if(is.null(o0$obj)){ PSI1 <- matrix(rep(est.psi0, rep(nrow(Z), 
length(est.psi0))), ncol = length(est.psi0)) o0 <- try(seg.def.fit(obj, Z, PSI1, mfExt, opz1), silent=TRUE) } if(!is.list(o0)) return(0) o0$boot.restart<-ris return(o0) }segmented/R/plot.segmented.R0000644000175100001440000002034612404051010015507 0ustar hornikusersplot.segmented<-function (x, term, add = FALSE, res = FALSE, conf.level = 0, interc=TRUE, link = TRUE, res.col = 1, rev.sgn = FALSE, const = 0, shade=FALSE, rug=TRUE, show.gap=FALSE, ...){ #funzione plot.segmented che consente di disegnare anche i pointwise CI f.U<-function(nomiU, term=NULL){ #trasforma i nomi dei coeff U (o V) nei nomi delle variabili corrispondenti #and if 'term' is provided (i.e. it differs from NULL) the index of nomiU matching term are returned k<-length(nomiU) nomiUsenzaU<-strsplit(nomiU, "\\.") nomiU.ok<-vector(length=k) for(i in 1:k){ nomi.i<-nomiUsenzaU[[i]][-1] if(length(nomi.i)>1) nomi.i<-paste(nomi.i,collapse=".") nomiU.ok[i]<-nomi.i } if(!is.null(term)) nomiU.ok<-(1:k)[nomiU.ok%in%term] return(nomiU.ok) } #-------------- linkinv <- !link if (inherits(x, what = "glm", which = FALSE) && linkinv && !is.null(x$offset) && res) stop("residuals with offset on the response scale?") if(conf.level< 0 || conf.level>.9999) stop("meaningless 'conf.level'") show.gap<-FALSE if (missing(term)) { if (length(x$nameUV$Z) > 1) { stop("please, specify `term'") } else { term <- x$nameUV$Z } } else { if (!term %in% x$nameUV$Z) stop("invalid `term'") } opz <- list(...) 
cols <- opz$col if (length(cols) <= 0) cols <- 1 lwds <- opz$lwd if (length(lwds) <= 0) lwds <- 1 ltys <- opz$lty if (length(ltys) <= 0) ltys <- 1 cexs <- opz$cex if (length(cexs) <= 0) cexs <- 1 pchs <- opz$pch if (length(pchs) <= 0) pchs <- 1 xlabs <- opz$xlab if (length(xlabs) <= 0) xlabs <- term ylabs <- opz$ylab if (length(ylabs) <= 0) ylabs <- paste("Effect of ", term, sep = " ") a <- intercept(x, term, gap = show.gap)[[1]][, "Est."] #Poich intercept() restituisce quantit che includono sempre l'intercetta del modello, questa va eliminata se interc=FALSE if(!interc && ("(Intercept)" %in% names(coef(x)))) a<- a-coef(x)["(Intercept)"] b <- slope(x, term)[[1]][, "Est."] #id <- grep(paste("\\.", term, "$", sep = ""), rownames(x$psi), value = FALSE) #confondeva "psi1.x","psi1.neg.x" id <- f.U(rownames(x$psi), term) est.psi <- x$psi[id, "Est."] K <- length(est.psi) val <- sort(c(est.psi, x$rangeZ[, term])) #---------aggiunta per gli IC rangeCI<-NULL fit0<-NULL n<-length(x$fitted.values) tipo<- if(inherits(x, what = "glm", which = FALSE) && link) "link" else "response" vall<-sort(c(seq(min(val), max(val), l=120), est.psi)) #ciValues<-predict.segmented(x, newdata=vall, se.fit=TRUE, type=tipo, level=conf.level) vall.list<-list(vall) names(vall.list)<-term ciValues<-try(broken.line(x, vall.list, link=link, interc=interc, se.fit=TRUE), silent=TRUE) if(class(ciValues)=="try-error") ciValues<-broken.line(x, vall.list, link=link, interc=interc, se.fit=FALSE) if(conf.level>0) { k.alpha<- abs(qnorm((1-conf.level)/2)) if(identical(class(x),c("segmented","lm"))) k.alpha<-abs(qt((1-conf.level)/2, x$df.residual)) # k.alpha<-if(inherits(x, what = "glm", which = FALSE)) abs(qnorm((1-conf.level)/2)) else abs(qt((1-conf.level)/2, x$df.residual)) ciValues<-cbind(ciValues$fit, ciValues$fit- k.alpha*ciValues$se.fit, ciValues$fit + k.alpha*ciValues$se.fit) rangeCI<-range(ciValues) #ciValues una matrice di length(val)x3. 
        #The 3 columns: estimates, lower, upper
        #polygon(c(vall, rev(vall)), c(ciValues[,2],rev(ciValues[,3])), col = "gray", border=NA)
        }
    #---------
    a.ok <- c(a[1], a)
    b.ok <- c(b[1], b)
    y.val <- a.ok + b.ok * val + const
    a.ok1 <- c(a, a[length(a)])
    b.ok1 <- c(b, b[length(b)])
    y.val <- y.val1 <- a.ok1 + b.ok1 * val + const
    s <- 1:(length(val) - 1)
    xvalues <- x$model[, term]
    if (rev.sgn) {
        val <- -val
        xvalues <- -xvalues
    }
    m <- cbind(val[s], y.val1[s], val[s + 1], y.val[s + 1])
    #values where to compute predictions (useful only if res=TRUE)
    if(res){
        new.d<-data.frame(ifelse(rep(rev.sgn, length(xvalues)),-xvalues, xvalues))
        names(new.d)<-term
        fit0 <- broken.line(x, new.d, link = link, interc=interc, se.fit=FALSE)$fit
        }
#-------------------------------------------------------------------------------
    if (inherits(x, what = "glm", which = FALSE) && linkinv) { #if GLM with linkinv
        fit <- if (res)
            #predict.segmented(x, ifelse(rep(rev.sgn, length(xvalues)),-xvalues,xvalues), type=tipo) + resid(x, "response") + const
            #broken.line(x, term, gap = show.gap, link = link) + resid(x, "response") + const
            fit0 + resid(x, "response") + const
        else x$family$linkinv(c(y.val, y.val1))
        xout <- sort(c(seq(val[1], val[length(val)], l = 120), val[-c(1, length(val))]))
        l <- approx(as.vector(m[, c(1, 3)]), as.vector(m[, c(2, 4)]), xout = xout)
        id.group <- cut(l$x, val, FALSE, TRUE)
        yhat <- l$y
        xhat <- l$x
        m[, c(2, 4)] <- x$family$linkinv(m[, c(2, 4)])
        if (!add) {
            plot(as.vector(m[, c(1, 3)]), as.vector(m[, c(2, 4)]), type = "n", xlab = xlabs,
                ylab = ylabs, main = opz$main, sub = opz$sub, xlim = opz$xlim,
                ylim = if(is.null(opz$ylim)) range(fit, fit0, rangeCI) else opz$ylim )
            if(rug) {segments(xvalues, rep(par()$usr[3],length(xvalues)), xvalues,
                rep(par()$usr[3],length(xvalues))+ abs(diff(par()$usr[3:4]))/40)}
            }
        if(conf.level>0){
            if(rev.sgn) vall<- -vall
            if(shade) polygon(c(vall, rev(vall)), c(ciValues[,2],rev(ciValues[,3])),
                col = "gray", border=NA) else matlines(vall, ciValues[,-1], type="l", lty=2, col=cols)
            }
        if (res) points(xvalues, fit, cex = cexs, pch = pchs, col = res.col)
        yhat <- x$family$linkinv(yhat)
        if (length(cols) == 1) cols <- rep(cols, max(id.group))
        if (length(lwds) == 1) lwds <- rep(lwds, max(id.group))
        if (length(ltys) == 1) ltys <- rep(ltys, max(id.group))
        for (i in 1:max(id.group)) {
            lines(xhat[id.group == i], yhat[id.group == i], col = cols[i],
                lwd = lwds[i], lty = ltys[i])
        }
#-------------------------------------------------------------------------------
    } else { #if LM or "GLM with link=TRUE (i.e. linkinv=FALSE)"
        r <- cbind(val, y.val)
        r1 <- cbind(val, y.val1)
        rr <- rbind(r, r1)
        fit <- c(y.val, y.val1)
        if (res) {
            ress <- if (inherits(x, what = "glm", which = FALSE)) residuals(x, "working") * sqrt(x$weights) else resid(x)
            #if(!is.null(x$offset)) ress<- ress - x$offset
            #fit <- broken.line(x, term, gap = show.gap, link = link, interc = TRUE) + ress + const
            #fit <- predict.segmented(x, ifelse(rep(rev.sgn, length(xvalues)),-xvalues,xvalues), type=tipo) + ress + const
            fit <- fit0 + ress + const
        }
        if (!add) plot(rr, type = "n", xlab = xlabs, ylab = ylabs, main = opz$main, sub = opz$sub,
            xlim = opz$xlim, ylim = if(is.null(opz$ylim)) range(fit, fit0, rangeCI) else opz$ylim)
        if(rug) {segments(xvalues, rep(par()$usr[3],length(xvalues)), xvalues,
            rep(par()$usr[3],length(xvalues))+ abs(diff(par()$usr[3:4]))/40)}
        if(conf.level>0) {
            if(rev.sgn) vall<- -vall
            if(shade) polygon(c(vall, rev(vall)), c(ciValues[,2],rev(ciValues[,3])),
                col = "gray", border=NA) else matlines(vall, ciValues[,-1], type="l", lty=2, col=cols)
            }
        if (res) points(xvalues, fit, cex = cexs, pch = pchs, col = res.col)
        segments(m[, 1], m[, 2], m[, 3], m[, 4], col = cols, lwd = lwds, lty = ltys)
    }
    invisible(NULL)
}
segmented/R/confint.segmented.R0000644000175100001440000000700412404051010016165 0ustar hornikusers
`confint.segmented` <-
function(object, parm, level=0.95, rev.sgn=FALSE, var.diff=FALSE, digits=max(3, getOption("digits") - 3), ...){
#--
        f.U<-function(nomiU, term=NULL){
        #converts the names
        #of the U (or V) coefficients into the names of the corresponding variables
        #and if 'term' is provided (i.e. it differs from NULL), the indices of nomiU matching 'term' are returned
            k<-length(nomiU)
            nomiUsenzaU<-strsplit(nomiU, "\\.")
            nomiU.ok<-vector(length=k)
            for(i in 1:k){
                nomi.i<-nomiUsenzaU[[i]][-1]
                if(length(nomi.i)>1) nomi.i<-paste(nomi.i,collapse=".")
                nomiU.ok[i]<-nomi.i
                }
            if(!is.null(term)) nomiU.ok<-(1:k)[nomiU.ok%in%term]
            return(nomiU.ok)
        }
#--
#        if(!"segmented"%in%class(object)) stop("A segmented model is needed")
        if(var.diff && length(object$nameUV$Z)>1) {
            var.diff<-FALSE
            warning("var.diff set to FALSE with multiple segmented variables", call.=FALSE)
            }
        #names of the segmented variables:
        if(missing(parm)) {
            nomeZ<- object$nameUV[[3]]
            if(length(rev.sgn)==1) rev.sgn<-rep(rev.sgn,length(nomeZ))
            } else {
                if(! parm %in% object$nameUV[[3]]) {stop("invalid parm")} else {nomeZ<-parm}
            }
        if(length(rev.sgn)!=length(nomeZ)) rev.sgn<-rep(rev.sgn, length.out=length(nomeZ))
        rr<-list()
        z<-if("lm"%in%class(object)) abs(qt((1-level)/2,df=object$df.residual)) else abs(qnorm((1-level)/2))
        for(i in 1:length(nomeZ)){ #for each segmented variable `parm' (all of them, or the selected one)..
            #nomi.U<-grep(paste("\\.",nomeZ[i],"$",sep=""),object$nameUV$U,value=TRUE)
            #nomi.V<-grep(paste("\\.",nomeZ[i],"$",sep=""),object$nameUV$V,value=TRUE)
            nomi.U<- object$nameUV$U[f.U(object$nameUV$U, nomeZ[i])]
            nomi.V<- object$nameUV$V[f.U(object$nameUV$V, nomeZ[i])]
            m<-matrix(,length(nomi.U),3)
            rownames(m)<-nomi.V
            colnames(m)<-c("Est.",paste("CI","(",level*100,"%",")",c(".l",".u"),sep=""))
            for(j in 1:length(nomi.U)){ #for each psi of the same segmented variable..
                    sel<-c(nomi.V[j],nomi.U[j])
                    V<-vcov(object,var.diff=var.diff)[sel,sel] #this is the vcov of (psi,U)
                    b<-coef(object)[sel[2]] #diff-Slope
                    th<-c(b,1)
                    orig.coef<-drop(diag(th)%*%coef(object)[sel]) #these are the (gamma,beta): th*coef(ogg)[sel]
                    gammma<-orig.coef[1]
                    est.psi<-object$psi[sel[1],2]
                    V<-diag(th)%*%V%*%diag(th) #2x2 vcov() of gamma and beta
                    se.psi<-sqrt((V[1,1]+V[2,2]*(gammma/b)^2-2*V[1,2]*(gammma/b))/b^2)
                    r<-c(est.psi, est.psi-z*se.psi, est.psi+z*se.psi)
                    if(rev.sgn[i]) r<-c(-r[1],rev(-r[2:3]))
                    m[j,]<-r
                    } #end loop j (each psi of the same segmented variable)
            #CHECK THIS:..it would be nicer
            if(nrow(m)==1) rownames(m)<-"" else m<-m[order(m[,1]),]
            if(rev.sgn[i]) {
                #m<-m[nrow(m):1,]
                rownames(m)<-rev(rownames(m))
                }
            rr[[length(rr)+1]]<-signif(m,digits)
            } #end loop i (each segmented variable)
        names(rr)<-nomeZ
        return(rr)
          } #end_function
segmented/R/seg.glm.fit.r0000644000175100001440000001427212404051010014735 0ustar hornikusers
seg.glm.fit<-function(y,XREG,Z,PSI,w,offs,opz,return.all.sol=FALSE){
#-------------------------
dpmax<-function(x,y,pow=1){
#deriv pmax
        if(pow==1) ifelse(x>y, -1, 0)
         else -pow*pmax(x-y,0)^(pow-1)
         }
#--------------------
    c1 <- apply((Z <= PSI), 2, all) #previously it was only <
    c2 <- apply((Z >= PSI), 2, all) #previously it was only >
    if(sum(c1 + c2) != 0 || is.na(sum(c1 + c2))) stop("psi out of the range")
    pow<-opz$pow
    eta0<-opz$eta0
    fam<-opz$fam
    maxit.glm<-opz$maxit.glm
    #--------------
    nomiOK<-opz$nomiOK
    toll<-opz$toll
    h<-opz$h
    stop.if.error<-opz$stop.if.error
    dev.new<-opz$dev0
    visual<-opz$visual
    id.psi.group<-opz$id.psi.group
    it.max<-old.it.max<-opz$it.max
    id.psi.group<-opz$id.psi.group
    gap<-opz$gap
    rangeZ <- apply(Z, 2, range)
    #k<-ncol(Z)
    psi<-PSI[1,]
    names(psi)<-id.psi.group
    it <- 1
    epsilon <- 10
    dev.values<-psi.values <- NULL
    id.psi.ok<-rep(TRUE, length(psi))
    sel.col.XREG<-unique(sapply(colnames(XREG), function(x)match(x,colnames(XREG))))
    if(is.numeric(sel.col.XREG))XREG<-XREG[,sel.col.XREG,drop=FALSE] #removes duplicated columns, e.g. the two intercepts..
    while (abs(epsilon) > toll) {
        k<-ncol(Z)
        U <- pmax((Z - PSI), 0)^pow[1]
        V <- dpmax(Z,PSI,pow=pow[2])# ifelse((Z > PSI), -1, 0)
        X <- cbind(XREG, U, V)
        rownames(X) <- NULL
        if (ncol(V) == 1)
            colnames(X)[(ncol(XREG) + 1):ncol(X)] <- c("U", "V")
        else colnames(X)[(ncol(XREG) + 1):ncol(X)] <- c(paste("U", 1:ncol(U), sep = ""), paste("V", 1:ncol(V), sep = ""))
        #obj <- lm.wfit(x = X, y = y, w = w, offset = o) #check******************
        obj <- suppressWarnings(glm.fit(x = X, y = y, offset = offs,
            weights = w, family = fam, control = glm.control(maxit = maxit.glm),
            etastart = eta0))
        eta0 <- obj$linear.predictors
        dev.old<-dev.new
        dev.new <- dev.new1<- obj$dev
        if(return.all.sol) dev.new1 <- glm.fit(x=cbind(XREG, U),y=y, family=fam, weights=w, offset=offs, etastart=eta0)$dev
        dev.values[[length(dev.values) + 1]] <- dev.new1
        if (visual) {
            flush.console()
            if (it == 1)
                cat(0, " ", formatC(dev.old, 3, format = "f"),
                    "", "(No breakpoint(s))", "\n")
            spp <- if (it < 10) "" else NULL
            cat(it, spp, "", formatC(dev.new1, 3, format = "f"), "",length(psi),"\n")
            #cat(paste("iter = ", it, spp," dev = ",formatC(dev.new,digits=3,format="f"), " n.psi = ",formatC(length(psi),digits=0,format="f"), sep=""), "\n")
        }
        epsilon <- (dev.new - dev.old)/(dev.old+.001)
#        epsilon <- (dev.new1 - dev.old)/dev.old #if you want to use the *true* deviance (not the working one, which accounts for the gaps)
        obj$epsilon <- epsilon
        it <- it + 1
        obj$it <- it
        #class(obj) <- c("segmented", class(obj))
        #list.obj[[length(list.obj) + ifelse(last == TRUE, 0, 1)]] <- obj
        if (k == 1) {
            beta.c <- coef(obj)["U"]
            gamma.c <- coef(obj)["V"]
        } else {
            beta.c <- coef(obj)[paste("U", 1:ncol(U), sep = "")]
            gamma.c <- coef(obj)[paste("V", 1:ncol(V), sep = "")]
        }
        if (it > it.max) break
        psi.values[[length(psi.values) + 1]] <- psi.old <- psi
        #if(it>=old.it.max && h<1) H<-h
        psi <- psi.old + h*gamma.c/beta.c
        PSI <- matrix(rep(psi, rep(nrow(Z), ncol(Z))), ncol = ncol(Z))
        #check if psi is admissible..
        a <- apply((Z <= PSI), 2, all) #previously it was only <
        b <- apply((Z >= PSI), 2, all) #previously it was only >
        if(stop.if.error) {
            isErr <- sum(a + b) != 0 || is.na(sum(a + b))
            if(isErr) {
                if(return.all.sol) return(list(dev.values, psi.values)) else stop("(Some) estimated psi out of its range")
                }
            } else {
            id.psi.ok<-!is.na((a+b)<=0)&(a+b)<=0
            Z <- Z[,id.psi.ok,drop=FALSE]
            psi <- psi[id.psi.ok]
            PSI <- PSI[,id.psi.ok,drop=FALSE]
            nomiOK<-nomiOK[id.psi.ok] #store the names of the U terms for the admissible psi
            id.psi.group<-id.psi.group[id.psi.ok]
            names(psi)<-id.psi.group
            if(ncol(PSI)<=0) return(0)
            } #end else
        #obj$psi <- psi
    } #end while
    #these two lines were added in version 0.2.9-3 (the breakpoints are now always sorted)
    psi<-unlist(tapply(psi, id.psi.group, sort))
    names(psi)<-id.psi.group
    PSI <- matrix(rep(psi, rep(nrow(Z), length(psi))), ncol = length(psi))
    #added from here..
    U <- pmax((Z - PSI), 0)
    V <- ifelse((Z > PSI), -1, 0)
    X <- cbind(XREG, U, V)
    rownames(X) <- NULL
    if (ncol(V) == 1)
        colnames(X)[(ncol(XREG) + 1):ncol(X)] <- c("U", "V")
    else colnames(X)[(ncol(XREG) + 1):ncol(X)] <- c(paste("U", 1:ncol(U), sep = ""), paste("V", 1:ncol(V), sep = ""))
    #removed the suppressWarnings(
    obj <- glm.fit(x = X, y = y, offset = offs,
        weights = w, family = fam, control = glm.control(maxit = maxit.glm),
        etastart = eta0)
    obj$epsilon <- epsilon
    obj$it <- it
    obj.new <- glm.fit(x = cbind(XREG, U), y = y, offset = offs,
        weights = w, family = fam, control = glm.control(maxit = maxit.glm),
        etastart = eta0)
    SS.new<- obj.new$dev
    if(!gap){
        names.coef<-names(obj$coefficients)
        obj$coefficients<-c(obj.new$coefficients, rep(0,ncol(V)))
        names(obj$coefficients)<-names.coef
        obj$residuals<-obj.new$residuals
        obj$fitted.values<-obj.new$fitted.values
        obj$linear.predictors<-obj.new$linear.predictors
        obj$deviance<-obj.new$deviance
        obj$weights<-obj.new$weights
        obj$aic<-obj.new$aic #+ 2*ncol(V) #the change was made in segmented.glm(): "objF$aic<-obj$aic + 2*k"
        }
    #..up to here
    obj<-list(obj=obj,it=it,psi=psi,psi.values=psi.values,U=U,V=V,rangeZ=rangeZ,
        epsilon=epsilon,nomiOK=nomiOK, dev.no.gap=SS.new, id.psi.group=id.psi.group)
    return(obj)
    }
segmented/R/segmented.lm.R0000644000175100001440000004346012404051010015143 0ustar hornikusers
`segmented.lm` <-
function(obj, seg.Z, psi=stop("provide psi"), control = seg.control(), model = TRUE, ...) {
    n.Seg<-1
    if(length(all.vars(seg.Z))>1 & !is.list(psi)) stop("`psi' should be a list with more than one covariate in `seg.Z'")
    if(is.list(psi)){
      if(length(all.vars(seg.Z))!=length(psi)) stop("A wrong number of terms in `seg.Z' or `psi'")
      if(any(is.na(match(all.vars(seg.Z),names(psi), nomatch = NA)))) stop("Variables in `seg.Z' and `psi' do not match")
      n.Seg <- length(psi)
      }
    if(length(all.vars(seg.Z))!=n.Seg) stop("A wrong number of terms in `seg.Z' or `psi'")
    it.max <- old.it.max<- control$it.max
    toll <- control$toll
    visual <- control$visual
    stop.if.error<-control$stop.if.error
    n.boot<-control$n.boot
    size.boot<-control$size.boot
    gap<-control$gap
    random<-control$random
    pow<-control$pow
    visualBoot<-FALSE
    if(n.boot>0){
        if(!is.null(control$seed)) {
            set.seed(control$seed)
            employed.Random.seed<-control$seed
              } else {
            employed.Random.seed<-eval(parse(text=paste(sample(0:9, size=6), collapse="")))
            set.seed(employed.Random.seed)
              }
        if(visual) {visual<-FALSE; visualBoot<-TRUE}# warning("`display' set to FALSE with bootstrap restart", call.=FALSE)}
        if(!stop.if.error) stop("Bootstrap restart only with a fixed number of breakpoints")
     }
    last <- control$last
    K<-control$K
    h<-min(abs(control$h),1)
    if(h<1) it.max<-it.max+round(it.max/2)
#    if(!stop.if.error) objInitial<-obj
    #-------------------------------
#    #a better solution.........
#    objframe <- update(obj, model = TRUE, x = TRUE, y = TRUE)
#    y <- objframe$y
#    a <- model.matrix(seg.Z, data = eval(obj$call$data))
#    a <- subset(a, select = colnames(a)[-1])
    orig.call<-Call<-mf<-obj$call
    orig.call$formula<- mf$formula<-formula(obj) #to allow lm(y~.)
m <- match(c("formula", "data", "subset", "weights", "na.action","offset"), names(mf), 0L) mf <- mf[c(1, m)] mf$drop.unused.levels <- TRUE mf[[1L]] <- as.name("model.frame") if(class(mf$formula)=="name" && !"~"%in%paste(mf$formula)) mf$formula<-eval(mf$formula) #orig.call$formula<-update.formula(orig.call$formula, paste("~.-",all.vars(seg.Z))) # #genn 2013. dalla versione 0.2.9-4 ho tolto if(length(.. Tra l'altro non capisco perch lo avevo fatto # if(length(all.vars(formula(obj)))>1){ # mf$formula<-update.formula(mf$formula,paste(paste(seg.Z,collapse=".+"),"+",paste(all.vars(formula(obj))[-1],collapse="+"))) # } else { # mf$formula<-update.formula(mf$formula,paste(seg.Z,collapse=".+")) # } #nov 2013 dalla versione 0.3-0.0 (che dovrebbe essere successiva alla 0.2-9.5) viene creato anche il modelframe esteso che comprende # termini "originali", prima che fossero trasformati (Ad es., x prima che ns(x) costruisca le basi). Questo permette di avere termini # ns(), poly(), bs() nel modello di partenza mfExt<- mf mf$formula<-update.formula(mf$formula,paste(seg.Z,collapse=".+")) #mfExt$formula<- update.formula(mfExt$formula,paste(paste(seg.Z,collapse=".+"),"+",paste(all.vars(formula(obj)),collapse="+"))) # mfExt$formula<- if(!is.null(obj$call$data)) # update.formula(mf$formula,paste(".~",paste(all.vars(obj$call), collapse="+"),"-",obj$call$data,sep="")) # else update.formula(mf$formula,paste(".~",paste(all.vars(obj$call), collapse="+"),sep="")) #----------- # browser() if(!is.null(obj$call$offset) || !is.null(obj$call$weights) || !is.null(obj$call$subset)){ mfExt$formula<- update.formula(mf$formula,paste(".~.+", paste( paste(all.vars(obj$call$offset), collapse="+"), paste(all.vars(obj$call$weights), collapse="+"), paste(all.vars(obj$call$subset), collapse="+"), sep="+" ) ,sep="")) } mf <- eval(mf, parent.frame()) n<-nrow(mf) #questo serve per inserire in mfExt le eventuali variabili contenute nella formula con offset(..) 
nomiOff<-setdiff(all.vars(formula(obj)), names(mf)) if(length(nomiOff)>=1) mfExt$formula<-update.formula(mfExt$formula,paste(".~.+", paste( nomiOff, collapse="+"), sep="")) #---------------------------------------------------- # browser() #ago 2014 c' la questione di variabili aggiuntive... nomiTUTTI<-all.vars(mfExt$formula) #comprende anche altri nomi (ad es., threshold) "variabili" nomiNO<-NULL #dovrebbe contenere for(i in nomiTUTTI){ r<-try(eval(parse(text=i), parent.frame()), silent=TRUE) if(class(r)!="try-error" && length(r)==1) nomiNO[[length(nomiNO)+1]]<-i } #nomiNO dovrebbe contenere i nomi delle "altre variabili" (come th in subset=x0 if(any(id.duplic)) { #new.mf<-mf[,id.duplic,drop=FALSE] new.mf<-mf[,all.vars(formula(obj))[id.duplic],drop=FALSE] new.XREGseg<-data.matrix(new.mf) XREG<-cbind(XREG,new.XREGseg) } n.psi<- length(unlist(psi)) id.n.Seg<-(ncol(XREG)-n.Seg+1):ncol(XREG) XREGseg<-XREG[,id.n.Seg,drop=FALSE] #XREG<-XREG[,-id.n.Seg,drop=FALSE] #XREG<-model.matrix(obj0) non va bene perch non elimina gli eventuali mancanti in seg.Z.. #Due soluzioni #XREG<-XREG[,colnames(model.matrix(obj)),drop=FALSE] #XREG<-XREG[,match(c("(Intercept)",all.vars(formula(obj))[-1]),colnames(XREG),nomatch =0),drop=FALSE] XREG <- XREG[, match(c("(Intercept)", namesXREG0),colnames(XREG), nomatch = 0), drop = FALSE] XREG<-XREG[,unique(colnames(XREG)), drop=FALSE] n <- nrow(XREG) #Z <- list(); for (i in colnames(XREGseg)) Z[[length(Z) + 1]] <- XREGseg[, i] Z<-lapply(apply(XREGseg,2,list),unlist) #prende anche i nomi! 
name.Z <- names(Z) <- colnames(XREGseg) if(length(Z)==1 && is.vector(psi) && (is.numeric(psi)||is.na(psi))){ psi <- list(as.numeric(psi)) names(psi)<-name.Z } if (!is.list(Z) || !is.list(psi) || is.null(names(Z)) || is.null(names(psi))) stop("Z and psi have to be *named* list") id.nomiZpsi <- match(names(Z), names(psi)) if ((length(Z)!=length(psi)) || any(is.na(id.nomiZpsi))) stop("Length or names of Z and psi do not match") #dd <- match(names(Z), names(psi)) nome <- names(psi)[id.nomiZpsi] psi <- psi[nome] initial.psi<-psi for(i in 1:length(psi)) { if(any(is.na(psi[[i]]))) psi[[i]]<-if(control$quant) {quantile(Z[[i]], prob= seq(0,1,l=K+2)[-c(1,K+2)], names=FALSE)} else {(min(Z[[i]])+ diff(range(Z[[i]]))*(1:K)/(K+1))} } a <- sapply(psi, length) #per evitare che durante il processo iterativo i psi non siano ordinati id.psi.group <- rep(1:length(a), times = a) #identificativo di apparteneza alla variabile # #Znew <- list() #for (i in 1:length(psi)) Znew[[length(Znew) + 1]] <- rep(Z[i], a[i]) #Z <- matrix(unlist(Znew), nrow = n) Z<-matrix(unlist(mapply(function(x,y)rep(x,y),Z,a,SIMPLIFY = TRUE)),nrow=n) psi <- unlist(psi) #se psi numerico, la seguente linea restituisce i valori ordinati all'interno della variabile.. psi<-unlist(tapply(psi,id.psi.group,sort)) k <- ncol(Z) PSI <- matrix(rep(psi, rep(n, k)), ncol = k) #controllo se psi ammissibile.. c1 <- apply((Z <= PSI), 2, all) #dovrebbero essere tutti FALSE (prima era solo <) c2 <- apply((Z >= PSI), 2, all) #dovrebbero essere tutti FALSE (prima era solo >) if(sum(c1 + c2) != 0 || is.na(sum(c1 + c2)) ) stop("starting psi out of the admissible range") #questo dovrebbe eliminare i psi non-ammissib. ma non sono sicuro cosa succede se ci sono pi variabili # if(sum(c1 + c2) != 0){ # id.val.psi<-!((c1+c1)>0) #individua i psi ammissibili (i.e. 
interni) # psi<-psi[id.val.psi] # Z<-Z[,id.val.psi] # PSI<-PSI[,id.val.psi] # } # if(is.na(sum(c1 + c2))) stop("psi out of the range") colnames(Z) <- nomiZ <- rep(nome, times = a) ripetizioni <- as.numeric(unlist(sapply(table(nomiZ)[order(unique(nomiZ))], function(xxx) {1:xxx}))) nomiU <- paste("U", ripetizioni, sep = "") nomiU <- paste(nomiU, nomiZ, sep = ".") nomiV <- paste("V", ripetizioni, sep = "") nomiV <- paste(nomiV, nomiZ, sep = ".") #forse non serve crearsi l'ambiente KK, usa mf.. #obj <- update(obj, formula = Fo, data = mf) #if (model.frame) obj$model <- mf #controlla che model.frame() funzioni sull'oggetto restituito # KK <- new.env() # for (i in 1:ncol(objframe$model)) assign(names(objframe$model[i]), objframe$model[[i]], envir = KK) if (it.max == 0) { #mf<-cbind(mf, mfExt) U <- pmax((Z - PSI), 0) colnames(U) <- paste(ripetizioni, nomiZ, sep = ".") nomiU <- paste("U", colnames(U), sep = "") #for (i in 1:ncol(U)) assign(nomiU[i], U[, i], envir = KK) # necessario il for? puoi usare colnames(U)<-nomiU;mf[nomiU]<-U for(i in 1:ncol(U)) mfExt[nomiU[i]]<-mf[nomiU[i]]<-U[,i] Fo <- update.formula(formula(obj), as.formula(paste(".~.+", paste(nomiU, collapse = "+")))) obj <- update(obj, formula = Fo, evaluate=FALSE) #data = mf, #if(!is.null(obj[["subset"]])) obj[["subset"]]<-NULL obj<-eval(obj, envir=mfExt) if (model) obj$model <-mf #obj$model <- data.frame(as.list(KK)) names(psi)<-paste(paste("psi", ripetizioni, sep = ""), nomiZ, sep=".") obj$psi <- psi return(obj) } #XREG <- model.matrix(obj) creata sopra #o <- model.offset(objframe) #w <- model.weights(objframe) if (is.null(weights)) weights <- rep(1, n) if (is.null(offs)) offs <- rep(0, n) initial <- psi obj0 <- obj dev0<-sum(obj$residuals^2) list.obj <- list(obj) # psi.values <- NULL nomiOK<-nomiU opz<-list(toll=toll,h=h,stop.if.error=stop.if.error,dev0=dev0,visual=visual,it.max=it.max, nomiOK=nomiOK,id.psi.group=id.psi.group,gap=gap,visualBoot=visualBoot,pow=pow) if(n.boot<=0){ 
obj<-seg.lm.fit(y,XREG,Z,PSI,weights,offs,opz) } else { obj<-seg.lm.fit.boot(y, XREG, Z, PSI, weights, offs, opz, n.boot=n.boot, size.boot=size.boot, random=random) #jt, nonParam } if(!is.list(obj)){ warning("No breakpoint estimated", call. = FALSE) return(obj0) } if(obj$obj$df.residual==0) warning("no residual degrees of freedom (other warnings expected)", call.=FALSE) id.psi.group<-obj$id.psi.group nomiOK<-obj$nomiOK it<-obj$it psi<-obj$psi psi.values<-if(n.boot<=0) obj$psi.values else obj$boot.restart U<-obj$U V<-obj$V # return(obj) #if(any(table(rowSums(V))<=1)) stop("only 1 datum in an interval: breakpoint(s) at the boundary or too close") for(jj in colnames(V)) { VV<-V[, which(colnames(V)==jj), drop=FALSE] sumV<-abs(rowSums(VV)) if( (any(diff(sumV)>=2) #se ci sono due breakpoints uguali || any(table(sumV)<=1)) && stop.if.error) stop("only 1 datum in an interval: breakpoint(s) at the boundary or too close each other") } rangeZ<-obj$rangeZ obj<-obj$obj k<-length(psi) beta.c<-if(k == 1) coef(obj)["U"] else coef(obj)[paste("U", 1:ncol(U), sep = "")] psi.values[[length(psi.values) + 1]] <- psi id.warn <- FALSE if (n.boot<=0 && it > it.max) { #it >= (it.max+1) warning("max number of iterations attained", call. = FALSE) id.warn <- TRUE } Vxb <- V %*% diag(beta.c, ncol = length(beta.c)) #se usi una procedura automatica devi cambiare ripetizioni, nomiU e nomiV, e quindi: length.psi<-tapply(as.numeric(as.character(names(psi))), as.numeric(as.character(names(psi))), length) forma.nomiU<-function(xx,yy)paste("U",1:xx, ".", yy, sep="") forma.nomiVxb<-function(xx,yy)paste("psi",1:xx, ".", yy, sep="") nomiU <- unlist(mapply(forma.nomiU, length.psi, name.Z)) #invece di un ciclo #paste("U",1:length.psi[i], ".", name.Z[i]) nomiVxb <- unlist(mapply(forma.nomiVxb, length.psi, name.Z)) # nomiVxb <-paste("psi",ripetizioni, ".",nomiZ ,sep="") #Dalla 0.2.9-5 eliminati i seguenti. 
La linea sopra sembra sufficiente # colnames(U) <- colnames(Vxb) <-sapply(strsplit(nomiOK,"U"),function(x)x[2]) # #colnames(U) <- paste(ripetizioni, nomiZ, sep = ".") # #colnames(Vxb) <- paste(ripetizioni, nomiZ, sep = ".") # nomiU <- paste("U", colnames(U), sep = "") # nomiVxb <- paste("psi", colnames(Vxb), sep = "") # for (i in 1:ncol(U)) { # assign(nomiU[i], U[, i], envir = KK) # assign(nomiVxb[i], Vxb[, i], envir = KK) # } #mf<-cbind(mf, mfExt) #questo creava ripetizioni.. for(i in 1:ncol(U)) { mfExt[nomiU[i]]<-mf[nomiU[i]]<-U[,i] mfExt[nomiVxb[i]]<-mf[nomiVxb[i]]<-Vxb[,i] } nnomi <- c(nomiU, nomiVxb) # browser() Fo <- update.formula(formula(obj0), as.formula(paste(".~.+", paste(nnomi, collapse = "+")))) #objF <- update(obj0, formula = Fo, data = KK) objF <- update(obj0, formula = Fo, evaluate=FALSE, data = mfExt) #if(!is.null(objF[["subset"]])) objF[["subset"]]<-NULL objF<-eval(objF, envir=mfExt) #Pu capitare che psi sia ai margini e ci sono 1 o 2 osservazioni in qualche intervallo. Oppure ce ne #sono di pi ma hanno gli stessi valori di x #objF$coef pu avere mancanti.. names(which(is.na(coef(objF)))) if(any(is.na(objF$coefficients)) && stop.if.error){ stop("at least one coef estimate is NA: breakpoint(s) at the boundary? (possibly with many x-values replicated)", call. = FALSE) } objF$offset<- obj0$offset if(!gap){ names.coef<-names(objF$coefficients) #questi codici funzionano e si basano sull'assunzioni che le U e le V siano ordinate.. 
      if(k==1) {names(obj$coefficients)[match(c("U","V"), names(coef(obj)))]<- nnomi
          } else {
      names(obj$coefficients)[match(c(paste("U",1:k, sep=""), paste("V",1:k, sep="")), names(coef(obj)))]<- nnomi
          }
      objF$coefficients[names.coef]<-obj$coefficients[names.coef]
      #objF$coefficients<-obj$coefficients
      #names(objF$coefficients)<-names.coef
      objF$fitted.values<-obj$fitted.values
      objF$residuals<-obj$residuals
      }
    if(any(is.na(objF$coefficients))){ #If gap==FALSE there can be no NA here (they are replaced by 0)
     stop("some estimate is NA: premature stopping with a large number of breakpoints?", call. = FALSE)
      }
    Cov <- vcov(objF)
    id <- match(nomiVxb, names(coef(objF)))
    vv <- if (length(id) == 1) Cov[id, id] else diag(Cov[id, id])
    #if(length(initial)!=length(psi)) initial<-rep(NA,length(psi))
    a<-tapply(id.psi.group, id.psi.group, length) #this overwrites the "a" defined above, which should no longer be needed..
    initial<-unlist(mapply(function(x,y){if(is.na(x)[1])rep(x,y) else x }, initial.psi, a))
    psi <- cbind(initial, psi, sqrt(vv))
    rownames(psi) <- colnames(Cov)[id]
    colnames(psi) <- c("Initial", "Est.", "St.Err")
    objF$rangeZ <- rangeZ
    objF$psi.history <- psi.values
    objF$psi <- psi
    objF$it <- (it - 1)
    objF$epsilon <- obj$epsilon
    objF$call <- match.call()
    objF$nameUV <- list(U = nomiU, V = rownames(psi), Z = name.Z)
    objF$id.group <- if(length(name.Z)<=1) -rowSums(as.matrix(V))
    objF$id.psi.group <- id.psi.group
    objF$id.warn <- id.warn
    objF$orig.call<-orig.call
    if (model) objF$model <- mf #objF$mframe <- data.frame(as.list(KK))
    if(n.boot>0) objF$seed<-employed.Random.seed
    class(objF) <- c("segmented", class(obj0))
    list.obj[[length(list.obj) + 1]] <- objF
    class(list.obj) <- "segmented"
    if (last) list.obj <- list.obj[[length(list.obj)]]
    return(list.obj)
}
segmented/R/lines.segmented.R0000644000175100001440000000172212404051010015640 0ustar hornikusers
lines.segmented<-function(x, term, bottom=TRUE, shift=TRUE, conf.level=0.95, k=50,
    pch=18, rev.sgn=FALSE,...){
    if(missing(term)){
          if(length(x$nameUV$Z)>1 ) {stop("please, specify `term'")}
               else {term<-x$nameUV$Z}
               }
    ss<-list(...)
    colore<- if(is.null(ss$col)) 1 else ss$col
    usr <- par("usr")
    h<-(usr[4]-usr[3])/abs(k)
    y<- if(bottom) usr[3]+h else usr[4]-h
    r<- confint.segmented(object=x,parm=term,level=conf.level,rev.sgn=rev.sgn,digits=15)
    m<-r[[term]]
    #PERHAPS not needed
    #if(rev.sgn) m<- -m
    #but the following is needed instead (if length(psi)=1 and rev.sgn=T):
    m<-matrix(m,ncol=3)
    if(nrow(m)>1) m<-m[order(m[,1]),]
    est.psi<-m[,1]
    lower.psi<-m[,2]
    upper.psi<-m[,3]
    if(length(est.psi)>1) {
        #the else branch now assigns its result (the original discarded it; recycling made that harmless)
        if(shift) y<-y+seq(-h/2,h/2,length=length(est.psi)) else y<-rep(y,length(est.psi))
        }
    segments(lower.psi, y, upper.psi, y, ...)
    points(est.psi,y,type="p",pch=pch,col=colore)
    }
segmented/R/seg.lm.fit.r0000644000175100001440000001323012404051010014559 0ustar hornikusers
seg.lm.fit<-function(y,XREG,Z,PSI,w,offs,opz,return.all.sol=FALSE){
#adds SS.ok (which excludes the gaps)
#return.all.sol argument
#opz$pow (passed from seg.control), which requires dpmax()
#step halving more straightforward (deleted H)
#-----------------
dpmax<-function(x,y,pow=1){
#deriv pmax
        if(pow==1) ifelse(x>y, -1, 0)
         else -pow*pmax(x-y,0)^(pow-1)
         }
#-----------
mylm<-function(x,y,w,offs=rep(0,length(y))){
        x1<-x*sqrt(w)
        y<-y-offs
        y1<-y*sqrt(w)
        b<-drop(solve(crossprod(x1),crossprod(x1,y1)))
        fit<-drop(tcrossprod(x,t(b)))
        r<-y-fit
        o<-list(coefficients=b,fitted.values=fit,residuals=r)
        o
        }
#-----------
    c1 <- apply((Z <= PSI), 2, all)
    c2 <- apply((Z >= PSI), 2, all)
    if(sum(c1 + c2) != 0 || is.na(sum(c1 + c2))) stop("psi out of the range")
    #
    pow<-opz$pow
    nomiOK<-opz$nomiOK
    toll<-opz$toll
    h<-opz$h
    gap<-opz$gap
    stop.if.error<-opz$stop.if.error
    dev.new<-opz$dev0
    visual<-opz$visual
    id.psi.group<-opz$id.psi.group
    it.max<-old.it.max<-opz$it.max
    rangeZ <- apply(Z, 2, range)
    psi<-PSI[1,]
    names(psi)<-id.psi.group
    #H<-1
    it <- 1
    epsilon <- 10
    dev.values<-psi.values <- NULL
    id.psi.ok<-rep(TRUE, length(psi))
    sel.col.XREG<-unique(sapply(colnames(XREG),
        function(x)match(x,colnames(XREG))))
    if(is.numeric(sel.col.XREG))XREG<-XREG[,sel.col.XREG,drop=FALSE] #removes duplicated columns, e.g. the two intercepts..
    while (abs(epsilon) > toll) {
        k<-ncol(Z)
        U <- pmax((Z - PSI), 0)^pow[1]#U <- pmax((Z - PSI), 0)
        V <- dpmax(Z,PSI,pow=pow[2])# ifelse((Z > PSI), -1, 0)
        X <- cbind(XREG, U, V)
        rownames(X) <- NULL
        if (ncol(V) == 1)
            colnames(X)[(ncol(XREG) + 1):ncol(X)] <- c("U", "V")
        else colnames(X)[(ncol(XREG) + 1):ncol(X)] <- c(paste("U", 1:ncol(U), sep = ""), paste("V", 1:ncol(V), sep = ""))
        obj <- lm.wfit(x = X, y = y, w = w, offset = offs)
        dev.old<-dev.new
        dev.new <- dev.new1 <-sum(obj$residuals^2)
        if(return.all.sol) dev.new1 <- sum(mylm(x = cbind(XREG, U), y = y, w = w, offs = offs)$residuals^2)
        dev.values[[length(dev.values) + 1]] <- dev.new1
        if (visual) {
            flush.console()
            if (it == 1)
                cat(0, " ", formatC(dev.old, 3, format = "f"),
                    "", "(No breakpoint(s))", "\n")
            spp <- if (it < 10) "" else NULL
            cat(it, spp, "", formatC(dev.new, 3, format = "f"), "",length(psi),"\n")
            #cat(paste("iter = ", it, spp," dev = ",formatC(dev.new,digits=3,format="f"), " n.psi = ",formatC(length(psi),digits=0,format="f"), sep=""), "\n")
        }
        epsilon <- (dev.new - dev.old)/(dev.old + .001)
        obj$epsilon <- epsilon
        it <- it + 1
        obj$it <- it
        #class(obj) <- c("segmented", class(obj))
        #list.obj[[length(list.obj) + ifelse(last == TRUE, 0, 1)]] <- obj
        if (k == 1) {
            beta.c <- coef(obj)["U"]
            gamma.c <- coef(obj)["V"]
        } else {
            beta.c <- coef(obj)[paste("U", 1:ncol(U), sep = "")]
            gamma.c <- coef(obj)[paste("V", 1:ncol(V), sep = "")]
        }
        if (it > it.max) break
        psi.values[[length(psi.values) + 1]] <- psi.old <- psi
        # if(it>=old.it.max && h<1) H<-h
        psi <- psi.old + h*gamma.c/beta.c
        PSI <- matrix(rep(psi, rep(nrow(Z), length(psi))), ncol = length(psi))
        #check if psi is admissible..
        a <- apply((Z <= PSI), 2, all) #previously it was only <
        b <- apply((Z >= PSI), 2, all) #previously it was only >
        if(stop.if.error) {
            isErr<- (sum(a + b) != 0 || is.na(sum(a + b)))
            if(isErr) {
                if(return.all.sol) return(list(dev.values, psi.values)) else stop("(Some) estimated psi out of its range")
                }
            } else {
            id.psi.ok<-!is.na((a+b)<=0)&(a+b)<=0
            Z <- Z[,id.psi.ok,drop=FALSE]
            psi <- psi[id.psi.ok]
            PSI <- PSI[,id.psi.ok,drop=FALSE]
            nomiOK<-nomiOK[id.psi.ok] #store the names of the U terms for the admissible psi
            id.psi.group<-id.psi.group[id.psi.ok]
            names(psi)<-id.psi.group
            if(ncol(PSI)<=0) return(0)
            } #end else
        #obj$psi <- psi
    } #end while
    #these two lines were added in version 0.2.9-3 (the breakpoints are now always sorted)
    psi<-unlist(tapply(psi, id.psi.group, sort))
    names(psi)<-id.psi.group
    PSI <- matrix(rep(psi, rep(nrow(Z), length(psi))), ncol = length(psi))
    #added from here..
    U <- pmax((Z - PSI), 0)
    V <- ifelse((Z > PSI), -1, 0)
    X <- cbind(XREG, U, V)
    rownames(X) <- NULL
    if (ncol(V) == 1)
        colnames(X)[(ncol(XREG) + 1):ncol(X)] <- c("U", "V")
    else colnames(X)[(ncol(XREG) + 1):ncol(X)] <- c(paste("U", 1:ncol(U), sep = ""), paste("V", 1:ncol(V), sep = ""))
    obj <- lm.wfit(x = X, y = y, w = w, offset = offs)
    obj$epsilon <- epsilon
    obj$it <- it
    obj.new <- lm.wfit(x = cbind(XREG, U), y = y, w = w, offset = offs)
    SS.new<-sum(obj.new$residuals^2)
    if(!gap){
        names.coef<-names(obj$coefficients)
        obj$coefficients<-c(obj.new$coefficients, rep(0,ncol(V)))
        names(obj$coefficients)<-names.coef
        obj$residuals<-obj.new$residuals
        obj$fitted.values<-obj.new$fitted.values
        }
    #..up to here
    obj<-list(obj=obj,it=it,psi=psi,psi.values=psi.values,U=U,V=V,rangeZ=rangeZ,
        epsilon=epsilon,nomiOK=nomiOK, SumSquares.no.gap=SS.new, id.psi.group=id.psi.group) #include id.psi.ok?
    return(obj)
    }
segmented/R/print.summary.segmented.R0000644000175100001440000000637612404051010017370 0ustar hornikusers
`print.summary.segmented` <-
function(x, short = x$short, var.diff = x$var.diff, digits = max(3, getOption("digits") - 3),
    signif.stars = getOption("show.signif.stars"),...){
    cat("\n\t***Regression Model with Segmented Relationship(s)***\n\n")
    cat( "Call: \n" )
    print( x$call )
    cat("\nEstimated Break-Point(s):\n ")
    print(signif(x$psi[,-1],4))
    cat("\nt value for the gap-variable(s) V: ",x$gap[,3],"\n")
    if(any(abs(x$gap[,3])>1.96)) cat(" Warning:", sum(abs(x$gap[,3])>1.96),"gap coefficient(s) significant at 0.05 level\n")
    if(short){
        cat("\nDifference-in-slopes parameter(s):\n")
        #print(x$Ttable[(nrow(x$Ttable)-nrow(x$psi)+1):nrow(x$Ttable),])}
        nome<-rownames(x$psi)
        #nome<-as.character(parse("",text=nome))
        #aa<-grep("U",rownames(x$Ttable))
        #bb<-unlist(sapply(nome,function(xx){grep(xx,rownames(x$Ttable))},simplify=FALSE,USE.NAMES=FALSE))
        #cc<-intersect(aa,bb) #indices of diff-slope parameters
        nomiU<-rownames(x$gap)
        #idU<-match(nomiU,rownames(x$Ttable))
        print(x$Ttable[nomiU,])
        } else {cat("\nMeaningful coefficients of the linear terms:\n")
            if(is.null(dim(x$Ttable))){
                print(x$Ttable)
                #printCoefmat(matrix(x$Ttable,nrow=1,ncol=4,dimnames=list(" ",names(x$Ttable))),has.Pvalue=FALSE)
                } else {
                printCoefmat(x$Ttable, digits = digits, signif.stars = signif.stars,na.print = "NA", ...)
} } if("summary.lm"%in%class(x)){ #for lm if(var.diff){ for(i in 1:length(x$sigma.new)){ cat("\nResidual standard error ",i,":", format(signif(x$sigma.new[i], digits)), "on", x$df.new[i], "degrees of freedom")} cat("\n") } else { cat("\nResidual standard error:", format(signif(x$sigma, digits)), "on", x$df[2], "degrees of freedom\n")} if (!is.null(x$fstatistic)) { cat("Multiple R-Squared:", formatC(x$r.squared, digits = digits)) cat(", Adjusted R-squared:", formatC(x$adj.r.squared, digits = digits), "\n")} } if("summary.glm"%in%class(x)){ #for glm cat("(Dispersion parameter for ", x$family$family, " family taken to be ", format(x$dispersion), ")\n\n", apply(cbind(paste(format.default(c("Null", "Residual"), width = 8, flag = ""), "deviance:"), format(unlist(x[c("null.deviance", "deviance")]), digits = max(5, digits + 1)), " on", format(unlist(x[c("df.null", "df.residual")])), " degrees of freedom\n"), 1, paste, collapse = " "), "AIC: ", format(x$aic, digits = max(4, digits + 1)), "\n", sep = "") } if(!"summary.lm"%in%class(x) && !"summary.glm"%in%class(x)){#for Arima cm <- x$call$method if (is.null(cm) || cm != "CSS") cat("sigma^2 estimated as ", format(x$sigma2, digits = digits), ", log likelihood = ", format(round(x$loglik, 2)), ", aic = ", format(round(x$aic, 2)), "\n", sep = "") else cat("\nsigma^2 estimated as ", format(x$sigma2, digits = digits), ", part log likelihood = ", format(round(x$loglik, 2)), "\n", sep = "") } invisible(x) cat("\nConvergence attained in",x$it,"iterations with relative change",x$epsilon,"\n") } segmented/R/slope.R0000644000175100001440000001206512404051010013700 0ustar hornikusers`slope` <- function(ogg, parm, conf.level=0.95, rev.sgn=FALSE, var.diff=FALSE, APC=FALSE, digits = max(3, getOption("digits") - 3)){ #-- f.U<-function(nomiU, term=NULL){ #trasforma i nomi dei coeff U (o V) nei nomi delle variabili corrispondenti #and if 'term' is provided (i.e. 
it differs from NULL) the index of nomiU matching term are returned k<-length(nomiU) nomiUsenzaU<-strsplit(nomiU, "\\.") nomiU.ok<-vector(length=k) for(i in 1:k){ nomi.i<-nomiUsenzaU[[i]][-1] if(length(nomi.i)>1) nomi.i<-paste(nomi.i,collapse=".") nomiU.ok[i]<-nomi.i } if(!is.null(term)) nomiU.ok<-(1:k)[nomiU.ok%in%term] return(nomiU.ok) } #-- # if(!"segmented"%in%class(ogg)) stop("A segmented model is needed") if(var.diff && length(ogg$nameUV$Z)>1) { var.diff<-FALSE warning("var.diff set to FALSE with multiple segmented variables", call.=FALSE) } nomepsi<-rownames(ogg$psi) #OK nomeU<-ogg$nameUV$U nomeZ<-ogg$nameUV$Z if(missing(parm)) { nomeZ<- ogg$nameUV[[3]] if(length(rev.sgn)==1) rev.sgn<-rep(rev.sgn,length(nomeZ)) } else { if(! all(parm %in% ogg$nameUV$Z)) {stop("invalid parm")} else {nomeZ<-parm} } if(length(rev.sgn)!=length(nomeZ)) rev.sgn<-rep(rev.sgn, length.out=length(nomeZ)) nomi<-names(coef(ogg)) nomi<-nomi[-match(nomepsi,nomi)] #escludi i coef delle V index<-vector(mode = "list", length = length(nomeZ)) for(i in 1:length(nomeZ)) { #id.cof.U<-grep(paste("\\.",nomeZ[i],"$",sep=""), nomi, value=FALSE) #psii<-ogg$psi[grep(paste("\\.",nomeZ[i],"$",sep=""), rownames(ogg$psi), value=FALSE),2] #id.cof.U<- match(grep(nomeZ[i], ogg$nameUV$U, value=TRUE), nomi) #psii<-ogg$psi[grep(nomeZ[i], ogg$nameUV$V, value=TRUE),2] #il paste con "$" (paste("\\.",nomeZ[i],"$",sep="")) utile perch serve a distinguere variabili con nomi simili (ad es., "x" e "xx") #Comunque nella versione dopo la 0.3-1.0 ho (FINALMENTE) risolto mettendo f.U id.cof.U<- f.U(ogg$nameUV$U, nomeZ[i]) #id.cof.U la posizione nel vettore ogg$nameUV$U; la seguente corregge per eventuali variabili che ci sono prima (ad es., interc) id.cof.U<- id.cof.U + (match(ogg$nameUV$U[1], nomi)-1) psii<- ogg$psi[f.U(ogg$nameUV$V, nomeZ[i]) , "Est."] id.cof.U <- id.cof.U[order(psii)] index[[i]]<-c(match(nomeZ[i],nomi), id.cof.U) } Ris<-list() digits <- max(3, getOption("digits") - 3) rev.sgn<-rep(rev.sgn, 
length.out=length(nomeZ)) # transf=c("x","1") # if( (length(transf)!=2) || !(length(transf)==1 && transf=="APC")) stop("'error in transf'") # if(transf=="APC") transf<-c("100*(exp(x)-1)", "100*exp(x)") # my.f<-function(x)eval(parse(text=transf[1])) # my.f.deriv<-function(x)eval(parse(text=transf[2])) for(i in 1:length(index)){ ind<-as.numeric(na.omit(unlist(index[[i]]))) M<-matrix(1,length(ind),length(ind)) M[row(M)1) nomi.i<-paste(nomi.i,collapse=".") nomiU.ok[i]<-nomi.i } if(!is.null(term)) nomiU.ok<-(1:k)[nomiU.ok%in%term] return(nomiU.ok) } #------------- if(missing(term)){ if(length(x$nameUV$Z)>1 ) {stop("please, specify `term'")} else {term<-x$nameUV$Z} } opz<-list(...) nameV<- x$nameUV$V[f.U(x$nameUV$V, term)] psii<- x$psi[nameV, "Est."] d<-data.frame(a=psii) names(d)<-term opz$y<-broken.line(x,d, se.fit=FALSE, interc=interc, link=link)[[1]] opz$x<-psii if(is.null(opz$cex)) opz$cex<-1.5 if(is.null(opz$lwd)) opz$lwd<-2 do.call(points, opz) invisible(NULL) } segmented/R/segmented.default.r0000644000175100001440000003357512404051010016225 0ustar hornikusers#`segmented.default` <- #o1<-segmented(out.lm, seg.Z=~x +z,psi=list(x=c(30,60),z=.3), control=seg.control(display=FALSE, n.boot=20, seed=1515)) #o2<-ss(out.lm, seg.Z=~x +z,psi=list(x=c(30,60),z=.3), control=seg.control(display=FALSE, n.boot=20, seed=1515)) #o2<-ss(out.lm, seg.Z=~x +z,psi=list(x=c(30,60),z=.3), control=seg.control(display=FALSE, n.boot=0)) #o2<-ss(o, seg.Z=~age, psi=41, control=seg.control(display=FALSE, n.boot=0)) segmented.default<- function(obj, seg.Z, psi=stop("provide psi"), control = seg.control(), model = TRUE, ...) 
{ #Richiede control$f.obj that should be a string like "sum(x$residuals^2)" or "x$dev" #----------------- dpmax<-function(x,y,pow=1){ #deriv pmax if(pow==1) ifelse(x>y, -1, 0) else -pow*pmax(x-y,0)^(pow-1) } #----------- # control$fn.obj<-"sum(x$residuals^2)" # control$fn.obj<-"x$dev" # control$fn.obj<-"-x$loglik[2]" # control$fn.obj<-"x$rho" # if(is.null(control$fn.obj)) stop("'segmented.default' needs 'fn.obj' specified in seg.control") else fn.obj<-control$fn.obj if(is.null(control$fn.obj)) fn.obj<-"-as.numeric(logLik(x))" else fn.obj<-control$fn.obj #----------- n.Seg<-1 if(length(all.vars(seg.Z))>1 & !is.list(psi)) stop("`psi' should be a list with more than one covariate in `seg.Z'") if(is.list(psi)){ if(length(all.vars(seg.Z))!=length(psi)) stop("A wrong number of terms in `seg.Z' or `psi'") if(any(is.na(match(all.vars(seg.Z),names(psi), nomatch = NA)))) stop("Variables in `seg.Z' and `psi' do not match") n.Seg <- length(psi) } if(length(all.vars(seg.Z))!=n.Seg) stop("A wrong number of terms in `seg.Z' or `psi'") it.max <- old.it.max<- control$it.max toll <- control$toll visual <- control$visual stop.if.error<-control$stop.if.error n.boot<-control$n.boot # n.boot<-0 size.boot<-control$size.boot gap<-control$gap random<-control$random pow<-control$pow visualBoot<-FALSE if(n.boot>0){ if(!is.null(control$seed)) { set.seed(control$seed) employed.Random.seed<-control$seed } else { employed.Random.seed<-eval(parse(text=paste(sample(0:9, size=6), collapse=""))) set.seed(employed.Random.seed) } if(visual) {visual<-FALSE; visualBoot<-TRUE}# warning("`display' set to FALSE with bootstrap restart", call.=FALSE)} if(!stop.if.error) stop("Bootstrap restart only with a fixed number of breakpoints") } last <- control$last K<-control$K h<-min(abs(control$h),1) if(h<1) it.max<-it.max+round(it.max/2) orig.call<-Call<-mf<-obj$call orig.call$formula<- mf$formula<-formula(obj) #per consentire lm(y~.) 
m <- match(c("formula", "data", "subset", "weights", "na.action","offset"), names(mf), 0L) mf <- mf[c(1, m)] mf$drop.unused.levels <- TRUE mf[[1L]] <- as.name("model.frame") if(class(mf$formula)=="name" && !"~"%in%paste(mf$formula)) mf$formula<-eval(mf$formula) mf$formula<-update.formula(mf$formula,paste(seg.Z,collapse=".+")) mfExt<- mf if(!is.null(obj$call$offset) || !is.null(obj$call$weights) || !is.null(obj$call$subset)){ mfExt$formula<- update.formula(mf$formula,paste(".~.+", paste( paste(all.vars(obj$call$offset), collapse="+"), paste(all.vars(obj$call$weights), collapse="+"), paste(all.vars(obj$call$subset), collapse="+"), sep="+" ) ,sep="")) } mf <- eval(mf, parent.frame()) n<-nrow(mf) #questo serve per inserire in mfExt le eventuali variabili contenute nella formula con offset(..) nomiOff<-setdiff(all.vars(formula(obj)), names(mf)) if(length(nomiOff)>=1) mfExt$formula<-update.formula(mfExt$formula,paste(".~.+", paste( nomiOff, collapse="+"), sep="")) nomiTUTTI<-all.vars(mfExt$formula) #comprende anche altri nomi (ad es., threshold) "variabili" nomiNO<-NULL #dovrebbe contenere for(i in nomiTUTTI){ r<-try(eval(parse(text=i), parent.frame()), silent=TRUE) if(class(r)!="try-error" && length(r)==1) nomiNO[[length(nomiNO)+1]]<-i } if(!is.null(nomiNO)) mfExt$formula<-update.formula(mfExt$formula,paste(".~.-", paste( nomiNO, collapse="-"), sep="")) mfExt<-eval(mfExt, parent.frame()) #apply(mfExt,2,function(x) {if(is.Surv(x)) x[,1:ncol(x)] else x}) if(inherits(obj, "coxph")){ is.Surv<-NA rm(is.Surv) for(i in 1:ncol(mfExt)){ if(is.Surv(mfExt[,i])) aa<-mfExt[,i][,1:ncol(mfExt[,i])] } mfExt<-cbind(aa,mfExt) } id.seg<-match(all.vars(seg.Z), names(mfExt)) name.Z<-names(mfExt)[id.seg] Z<-mfExt[,id.seg,drop=FALSE] # name.Z <- names(Z) if(ncol(Z)==1 && is.vector(psi) && (is.numeric(psi)||is.na(psi))){ psi <- list(as.numeric(psi)) names(psi)<-name.Z } if (!is.list(psi) || is.null(names(psi))) stop("psi should be a *named* list") id.nomiZpsi <- match(colnames(Z), names(psi)) 
if ((ncol(Z)!=length(psi)) || any(is.na(id.nomiZpsi))) stop("Length or names of Z and psi do not match") nome <- names(psi)[id.nomiZpsi] psi <- psi[nome] initial.psi<-psi for(i in 1:length(psi)) { if(any(is.na(psi[[i]]))) psi[[i]]<-if(control$quant) {quantile(Z[,i], prob= seq(0,1,l=K+2)[-c(1,K+2)], names=FALSE)} else {(min(Z[,i])+ diff(range(Z[,i]))*(1:K)/(K+1))} } a <- sapply(psi, length) #per evitare che durante il processo iterativo i psi non siano ordinati id.psi.group <- rep(1:length(a), times = a) #identificativo di apparteneza alla variabile Z<-matrix(unlist(mapply(function(x,y)rep(x,y),Z,a,SIMPLIFY = TRUE)),nrow=n) colnames(Z) <- nomiZ.vett <- rep(nome, times = a) #SERVE??? s perch Z senza colnames psi <- unlist(psi) #se psi numerico, la seguente linea restituisce i valori ordinati all'interno della variabile.. psi<-unlist(tapply(psi,id.psi.group,sort)) k <- ncol(Z) PSI <- matrix(rep(psi, rep(n, k)), ncol = k) #controllo se psi ammissibile.. c1 <- apply((Z <= PSI), 2, all) #dovrebbero essere tutti FALSE (prima era solo <) c2 <- apply((Z >= PSI), 2, all) #dovrebbero essere tutti FALSE (prima era solo >) if(sum(c1 + c2) != 0 || is.na(sum(c1 + c2)) ) stop("starting psi out of the admissible range") U <- pmax((Z - PSI), 0)^pow[1]#U <- pmax((Z - PSI), 0) #V <- dpmax(Z,PSI,pow=pow[2])# ifelse((Z > PSI), -1, 0) V<-ifelse((Z > PSI), -1, 0) #ripetizioni <- as.numeric(unlist(sapply(table(nomiZ)[order(unique(nomiZ))], function(xxx) {1:xxx}))) ripetizioni <- as.vector(unlist(tapply(id.psi.group, id.psi.group, function(x) 1:length(x) ))) nomiU <- paste("U", ripetizioni, sep = "") nomiU <- paste(nomiU, nomiZ.vett, sep = ".") nomiV <- paste("V", ripetizioni, sep = "") nomiV <- paste(nomiV, nomiZ.vett, sep = ".") nnomi <- c(nomiU, nomiV) for(i in 1:k) { mfExt[nomiU[i]] <- U[,i] mfExt[nomiV[i]] <- V[,i] } Fo <- update.formula(formula(obj), as.formula(paste(".~.+", paste(nnomi, collapse = "+")))) Fo.noV <- update.formula(formula(obj), as.formula(paste(".~.+", paste(nomiU, 
collapse = "+")))) call.ok <- update(obj, formula = Fo, evaluate=FALSE, data = mfExt) #objF <- update(obj0, formula = Fo, data = KK) call.noV <- update(obj, formula = Fo.noV, evaluate=FALSE, data = mfExt) #objF <- update(obj0, formula = Fo, data = KK) if (it.max == 0) { obj1 <- eval(call.noV, envir=mfExt) return(obj1) } #obj1 <- eval(call.ok, envir=mfExt) initial <- psi obj0 <- obj #browser() dev0<- eval(parse(text=fn.obj), list(x=obj)) if(is.na(dev0)) dev0<-10 list.obj <- list(obj) nomiOK<-nomiU opz<-list(toll=toll,h=h,stop.if.error=stop.if.error,dev0=dev0,visual=visual,it.max=it.max, nomiOK=nomiOK, id.psi.group=id.psi.group, gap=gap, visualBoot=visualBoot, pow=pow) opz$call.ok<-call.ok opz$call.noV<-call.noV opz$nomiU<-nomiU opz$nomiV<-nomiV opz$fn.obj <- fn.obj if(n.boot<=0){ obj<-seg.def.fit(obj, Z, PSI, mfExt, opz) } else { obj<-seg.def.fit.boot(obj, Z, PSI, mfExt, opz, n.boot=n.boot, size.boot=size.boot, random=random) #jt, nonParam } if(!is.list(obj)){ warning("No breakpoint estimated", call. 
= FALSE) return(obj0) } if(!is.null(obj$obj$df.residual)){ if(obj$obj$df.residual==0) warning("no residual degrees of freedom (other warnings expected)", call.=FALSE) } id.psi.group<-obj$id.psi.group nomiOK<-obj$nomiOK #sarebbe nomiU it<-obj$it psi<-obj$psi psi.values<-if(n.boot<=0) obj$psi.values else obj$boot.restart U<-obj$U V<-obj$V # return(obj) #if(any(table(rowSums(V))<=1)) stop("only 1 datum in an interval: breakpoint(s) at the boundary or too close") for(jj in colnames(V)) { VV<-V[, which(colnames(V)==jj), drop=FALSE] sumV<-abs(rowSums(VV)) if( (any(diff(sumV)>=2) #se ci sono due breakpoints uguali || any(table(sumV)<=1)) && stop.if.error) stop("only 1 datum in an interval: breakpoint(s) at the boundary or too close each other") } rangeZ<-obj$rangeZ obj<-obj$obj k<-length(psi) # beta.c<-if(k == 1) coef(obj)["U"] else coef(obj)[paste("U", 1:ncol(U), sep = "")] beta.c<- coef(obj)[nomiU] psi.values[[length(psi.values) + 1]] <- psi id.warn <- FALSE if (n.boot<=0 && it > it.max) { #it >= (it.max+1) warning("max number of iterations attained", call. = FALSE) id.warn <- TRUE } Vxb <- V %*% diag(beta.c, ncol = length(beta.c)) #se usi una procedura automatica devi cambiare ripetizioni, nomiU e nomiV, e quindi: length.psi<-tapply(as.numeric(as.character(names(psi))), as.numeric(as.character(names(psi))), length) forma.nomiU<-function(xx,yy)paste("U",1:xx, ".", yy, sep="") forma.nomiVxb<-function(xx,yy)paste("psi",1:xx, ".", yy, sep="") nomiU <- unlist(mapply(forma.nomiU, length.psi, name.Z)) #in realt non serve, c'era gi! 
nomiVxb <- unlist(mapply(forma.nomiVxb, length.psi, name.Z)) for(i in 1:ncol(U)) { mfExt[nomiU[i]]<-mf[nomiU[i]]<-U[,i] mfExt[nomiVxb[i]]<-mf[nomiVxb[i]]<-Vxb[,i] } nnomi <- c(nomiU, nomiVxb) # browser() Fo <- update.formula(formula(obj0), as.formula(paste(".~.+", paste(nnomi, collapse = "+")))) objF <- update(obj0, formula = Fo, evaluate=FALSE, data = mfExt) #if(!is.null(objF[["subset"]])) objF[["subset"]]<-NULL objF<- eval(objF, envir=mfExt) #Pu capitare che psi sia ai margini e ci sono 1 o 2 osservazioni in qualche intervallo. Oppure ce ne #sono di pi ma hanno gli stessi valori di x #objF$coef pu avere mancanti.. names(which(is.na(coef(objF)))) if(any(is.na(objF$coefficients)) && stop.if.error){ stop("at least one coef estimate is NA: breakpoint(s) at the boundary? (possibly with many x-values replicated)", call. = FALSE) } # CONTROLLARE!!!! # objF$offset<- obj0$offset #sostituire i valori: objF include le U e V, obj solo le U if(!gap){ #names.coef <- names(objF$coefficients) #names(obj$coefficients)[match(nomiV, names(coef(obj)))]<-nomiVxb #objF$coefficients[names.coef]<-obj$coefficients[names.coef] names.coef <- names(obj$coefficients) objF$coefficients[names.coef]<-obj$coefficients[names.coef] objF$coefficients[nomiVxb]<-rep(0, k) if(!is.null(objF$fitted.values)) objF$fitted.values<-obj$fitted.values if(!is.null(objF$residuals)) objF$residuals<-obj$residuals if(!is.null(objF$linear.predictors)) objF$linear.predictors<-obj$linear.predictors if(!is.null(objF$deviance)) objF$deviance<-obj$deviance if(!is.null(objF$weights)) objF$weights<-obj$weights if(!is.null(objF$aic)) objF$aic<-obj$aic + 2*k if(!is.null(objF$loglik)) objF$loglik<-obj$loglik #per coxph if(!is.null(objF$rho)) objF$rho<-obj$rho #per rq if(!is.null(objF$dual)) objF$dual<-obj$dual #per rq } if(any(is.na(objF$coefficients))){ #Se gap==FALSE qui non ci possono essere NA (sono sostituiti dagli 0) stop("some estimate is NA: premature stopping with a large number of breakpoints?", call. 
= FALSE) } a<-tapply(id.psi.group, id.psi.group, length) #ho sovrascritto "a" di sopra, ma non dovrebbe servire.. initial<-unlist(mapply(function(x,y){if(is.na(x)[1])rep(x,y) else x }, initial.psi, a)) id <- match(nomiVxb, names(coef(objF))) Cov <- try(vcov(objF), silent=TRUE) if(class(Cov)!="try-error") { vv <- if (length(id) == 1) Cov[id, id] else diag(Cov[id, id]) #if(length(initial)!=length(psi)) initial<-rep(NA,length(psi)) psi <- cbind(initial, psi, sqrt(vv)) rownames(psi) <- colnames(Cov)[id] colnames(psi) <- c("Initial", "Est.", "St.Err") } else { psi <- cbind(initial, psi) rownames(psi) <- nomiVxb colnames(psi) <- c("Initial", "Est.") } objF$rangeZ <- rangeZ objF$psi.history <- psi.values objF$psi <- psi objF$it <- (it - 1) objF$epsilon <- obj$epsilon objF$call <- match.call() objF$nameUV <- list(U = nomiU, V = rownames(psi), Z = name.Z) objF$id.group <- if(length(name.Z)<=1) -rowSums(as.matrix(V)) objF$id.psi.group <- id.psi.group objF$id.warn <- id.warn objF$orig.call<-orig.call if (model) objF$model <- mf #objF$mframe <- data.frame(as.list(KK)) if(n.boot>0) objF$seed<-employed.Random.seed # class(objF) <- c("segmented", class(obj0)) list.obj[[length(list.obj) + 1]] <- objF class(list.obj) <- "segmented" if (last) list.obj <- list.obj[[length(list.obj)]] return(list.obj) } #end function segmented/R/segmented.glm.R0000644000175100001440000004266012404051010015313 0ustar hornikusers`segmented.glm` <- #objF$id.group??? function(obj, seg.Z, psi=stop("provide psi"), control = seg.control(), model = TRUE, ...) 
{ n.Seg<-1 if(length(all.vars(seg.Z))>1 & !is.list(psi)) stop("`psi' should be a list with more than one covariate in `seg.Z'") if(is.list(psi)){ if(length(all.vars(seg.Z))!=length(psi)) stop("A wrong number of terms in `seg.Z' or `psi'") if(any(is.na(match(all.vars(seg.Z),names(psi), nomatch = NA)))) stop("Variables in `seg.Z' and `psi' do not match") n.Seg <- length(psi) } if(length(all.vars(seg.Z))!=n.Seg) stop("A wrong number of terms in `seg.Z' or `psi'") maxit.glm <- control$maxit.glm it.max <- old.it.max<- control$it.max toll <- control$toll visual <- control$visual stop.if.error<-control$stop.if.error n.boot<-control$n.boot size.boot<-control$size.boot gap<-control$gap random<-control$random pow<-control$pow visualBoot<-FALSE if(n.boot>0){ if(!is.null(control$seed)) { set.seed(control$seed) employed.Random.seed<-control$seed } else { employed.Random.seed<-eval(parse(text=paste(sample(0:9, size=6), collapse=""))) set.seed(employed.Random.seed) } if(visual) {visual<-FALSE; visualBoot<-TRUE}#warning("`display' set to FALSE with bootstrap restart", call.=FALSE)} if(!stop.if.error) stop("Bootstrap restart only with a fixed number of breakpoints") } last <- control$last K<-control$K h<-min(abs(control$h),1) if(h<1) it.max<-it.max+round(it.max/2) # if(!stop.if.error) objInitial<-obj #------------------------------- # #una migliore soluzione......... # objframe <- update(obj, model = TRUE, x = TRUE, y = TRUE) # y <- objframe$y # a <- model.matrix(seg.Z, data = eval(obj$call$data)) # a <- subset(a, select = colnames(a)[-1]) orig.call<-Call<-mf<-obj$call orig.call$formula<-mf$formula<-formula(obj) #per consentire lm(y~.) m <- match(c("formula", "data", "subset", "weights", "na.action","etastart","mustart","offset"), names(mf), 0L) mf <- mf[c(1, m)] mf$drop.unused.levels <- TRUE mf[[1L]] <- as.name("model.frame") #non so a che serva la seguente linea.. 
if(class(mf$formula)=="name" && !"~"%in%paste(mf$formula)) mf$formula<-eval(mf$formula) #orig.call$formula<-update.formula(orig.call$formula, paste("~.-",all.vars(seg.Z))) #utile per plotting # nomeRispo<-strsplit(paste(formula(obj))[2],"/")[[1]] #eventuali doppi nomi separati da "/" (tipo "y/n" per GLM binom) #la linea sotto aggiunge nel mf anche la variabile offs.. # if(length(all.vars(formula(obj)))>1){ # id.rispo<-1 # if(length(nomeRispo)>=2) id.rispo<-1:2 # #questo serve quando formula(obj) ha solo l'intercept # agg<-if(length(all.vars(formula(obj))[-id.rispo])==0) "" else "+" # mf$formula<-update.formula(mf$formula,paste(paste(seg.Z,collapse=".+"),agg,paste(all.vars(formula(obj))[-id.rispo],collapse="+"))) # } else { # mf$formula<-update.formula(mf$formula,paste(seg.Z,collapse=".+")) # } mfExt<- mf mf$formula<-update.formula(mf$formula,paste(seg.Z,collapse=".+")) # mfExt$formula<- update.formula(mfExt$formula,paste(paste(seg.Z,collapse=".+"),"+",paste(all.vars(formula(obj)),collapse="+"))) # mfExt$formula<- if(!is.null(obj$call$data)) # update.formula(mf$formula,paste(".~",paste(all.vars(obj$call), collapse="+"),"-",obj$call$data,sep="")) # else update.formula(mf$formula,paste(".~",paste(all.vars(obj$call), collapse="+"),sep="")) #----------- if(!is.null(obj$call$offset) || !is.null(obj$call$weights) || !is.null(obj$call$subset)){ mfExt$formula<- update.formula(mf$formula,paste(".~.+", paste( paste(all.vars(obj$call$offset), collapse="+"), paste(all.vars(obj$call$weights), collapse="+"), paste(all.vars(obj$call$subset), collapse="+"), sep="+" ) ,sep="")) } mf <- eval(mf, parent.frame()) n<-nrow(mf) #La linea sotto serve per inserire in mfExt le eventuali variabili contenute nella formula con offset(..) 
# o anche variabili che rientrano in espressioni (ad es., y/n o I(y*n)) nomiOff<-setdiff(all.vars(formula(obj)), names(mf)) if(length(nomiOff)>=1) mfExt$formula<-update.formula(mfExt$formula,paste(".~.+", paste( nomiOff, collapse="+"), sep="")) #ago 2014 c' la questione di variabili aggiuntive... nomiTUTTI<-all.vars(mfExt$formula) #comprende anche altri nomi (ad es., threshold) "variabili" nomiNO<-NULL #dovrebbe contenere for(i in nomiTUTTI){ r<-try(eval(parse(text=i), parent.frame()), silent=TRUE) if(class(r)!="try-error" && length(r)==1) nomiNO[[length(nomiNO)+1]]<-i } #nomiNO dovrebbe contenere i nomi delle "altre variabili" (come th in subset=x=2) mf[nomeRispo[1]]<-weights*y id.duplic<-match(all.vars(formula(obj)),all.vars(seg.Z),nomatch=0)>0 if(any(id.duplic)) { #new.mf<-mf[,id.duplic,drop=FALSE] new.mf<-mf[,all.vars(formula(obj))[id.duplic],drop=FALSE] new.XREGseg<-data.matrix(new.mf) XREG<-cbind(XREG,new.XREGseg) } n.psi<- length(unlist(psi)) id.n.Seg<-(ncol(XREG)-n.Seg+1):ncol(XREG) XREGseg<-XREG[,id.n.Seg,drop=FALSE] #XREG<-XREG[,-id.n.Seg,drop=FALSE] #XREG<-model.matrix(obj0) non va bene perch non elimina gli eventuali mancanti in seg.Z.. #Due soluzioni #XREG<-XREG[,colnames(model.matrix(obj)),drop=FALSE] #XREG<-XREG[,match(c("(Intercept)",all.vars(formula(obj))[-1]),colnames(XREG),nomatch =0),drop=FALSE] XREG <- XREG[, match(c("(Intercept)", namesXREG0),colnames(XREG), nomatch = 0), drop = FALSE] XREG<-XREG[,unique(colnames(XREG)), drop=FALSE] n <- nrow(XREG) #Z <- list(); for (i in colnames(XREGseg)) Z[[length(Z) + 1]] <- XREGseg[, i] Z<-lapply(apply(XREGseg,2,list),unlist) #prende anche i nomi! 
name.Z <- names(Z) <- colnames(XREGseg) if(length(Z)==1 && is.vector(psi) && (is.numeric(psi)||is.na(psi))){ psi <- list(as.numeric(psi)) names(psi)<-name.Z } if (!is.list(Z) || !is.list(psi) || is.null(names(Z)) || is.null(names(psi))) stop("Z and psi have to be *named* list") id.nomiZpsi <- match(names(Z), names(psi)) if ((length(Z)!=length(psi)) || any(is.na(id.nomiZpsi))) stop("Length or names of Z and psi do not match") #dd <- match(names(Z), names(psi)) nome <- names(psi)[id.nomiZpsi] psi <- psi[nome] initial.psi<-psi for(i in 1:length(psi)) { if(any(is.na(psi[[i]]))) psi[[i]]<-quantile(Z[[i]], prob= seq(0,1,l=K+2)[-c(1,K+2)], names=FALSE) } a <- sapply(psi, length)#b <- rep(1:length(a), times = a) id.psi.group <- rep(1:length(a), times = a) #identificativo di apparteneza alla variabile #Znew <- list() #for (i in 1:length(psi)) Znew[[length(Znew) + 1]] <- rep(Z[i], a[i]) #Z <- matrix(unlist(Znew), nrow = n) Z<-matrix(unlist(mapply(function(x,y)rep(x,y),Z,a,SIMPLIFY = TRUE)),nrow=n) psi <- unlist(psi) psi<-unlist(tapply(psi,id.psi.group,sort)) k <- ncol(Z) PSI <- matrix(rep(psi, rep(n, k)), ncol = k) colnames(Z) <- nomiZ <- rep(nome, times = a) ripetizioni <- as.numeric(unlist(sapply(table(nomiZ)[order(unique(nomiZ))], function(xxx) {1:xxx}))) nomiU <- paste("U", ripetizioni, sep = "") nomiU <- paste(nomiU, nomiZ, sep = ".") nomiV <- paste("V", ripetizioni, sep = "") nomiV <- paste(nomiV, nomiZ, sep = ".") #forse non serve crearsi l'ambiente KK, usa mf.. #obj <- update(obj, formula = Fo, data = mf) #if (model.frame) obj$model <- mf #controlla che model.frame() funzioni sull'oggetto restituito # KK <- new.env() # for (i in 1:ncol(objframe$model)) assign(names(objframe$model[i]), objframe$model[[i]], envir = KK) if (it.max == 0) { #mf<-cbind(mf, mfExt) U <- pmax((Z - PSI), 0) colnames(U) <- paste(ripetizioni, nomiZ, sep = ".") nomiU <- paste("U", colnames(U), sep = "") #for (i in 1:ncol(U)) assign(nomiU[i], U[, i], envir = KK) # necessario il for? 
puoi usare colnames(U)<-nomiU;mf[nomiU]<-U for(i in 1:ncol(U)) mfExt[nomiU[i]]<-mf[nomiU[i]]<-U[,i] Fo <- update.formula(formula(obj), as.formula(paste(".~.+", paste(nomiU, collapse = "+")))) #obj <- update(obj, formula = Fo, data = KK) obj <- update(obj, formula = Fo, data = mfExt, evaluate=FALSE) #if(!is.null(obj[["subset"]])) obj[["subset"]]<-NULL obj<-eval(obj, envir=mfExt) if (model) obj$model <-mf #obj$model <- data.frame(as.list(KK)) names(psi)<-paste(paste("psi", ripetizioni, sep = ""), nomiZ, sep=".") obj$psi <- psi return(obj) } #XREG <- model.matrix(obj) creata sopra #o <- model.offset(objframe) #w <- model.weights(objframe) if (is.null(weights)) weights <- rep(1, n) if (is.null(offs)) offs <- rep(0, n) fam <- family(obj) initial <- psi obj0 <- obj dev0<-obj$dev list.obj <- list(obj) # psi.values <- NULL nomiOK<-nomiU opz<-list(toll=toll,h=h,stop.if.error=stop.if.error,dev0=dev0,visual=visual,it.max=it.max,nomiOK=nomiOK, fam=fam, eta0=obj$linear.predictors, maxit.glm=maxit.glm, id.psi.group=id.psi.group, gap=gap, pow=pow, visualBoot=visualBoot) if(n.boot<=0){ obj<-seg.glm.fit(y,XREG,Z,PSI,weights,offs,opz) } else { obj<-seg.glm.fit.boot(y, XREG, Z, PSI, weights, offs, opz, n.boot=n.boot, size.boot=size.boot, random=random) #jt, nonParam } if(!is.list(obj)){ warning("No breakpoint estimated", call. 
= FALSE) return(obj0) } id.psi.group<-obj$id.psi.group nomiOK<-obj$nomiOK it<-obj$it psi<-obj$psi k<-length(psi) psi.values<-if(n.boot<=0) obj$psi.values else obj$boot.restart U<-obj$U V<-obj$V #if(any(table(rowSums(V))<=1)) stop("only 1 datum in an interval: breakpoint(s) at the boundary or too close") for(jj in colnames(V)) { VV<-V[, which(colnames(V)==jj),drop=FALSE] sumV<-abs(rowSums(VV)) if( (any(diff(sumV)>=2) #se ci sono due breakpoints equivalenti || any(table(sumV)<=1))) stop("only 1 datum in an interval: breakpoint(s) at the boundary or too close each other") } rangeZ<-obj$rangeZ obj<-obj$obj beta.c<-if(k == 1) coef(obj)["U"] else coef(obj)[paste("U", 1:ncol(U), sep = "")] psi.values[[length(psi.values) + 1]] <- psi id.warn <- FALSE if (n.boot<=0 && it > it.max) { #it >= (it.max+1) warning("max number of iterations attained", call. = FALSE) id.warn <- TRUE } Vxb <- V %*% diag(beta.c, ncol = length(beta.c)) #se usi una procedura automatica devi cambiare ripetizioni, nomiU e nomiV, e quindi: length.psi<-tapply(as.numeric(as.character(names(psi))), as.numeric(as.character(names(psi))), length) forma.nomiU<-function(xx,yy)paste("U",1:xx, ".", yy, sep="") forma.nomiVxb<-function(xx,yy)paste("psi",1:xx, ".", yy, sep="") nomiU <- unlist(mapply(forma.nomiU, length.psi, name.Z)) #invece di un ciclo #paste("U",1:length.psi[i], ".", name.Z[i]) nomiVxb <- unlist(mapply(forma.nomiVxb, length.psi, name.Z)) #mf<-cbind(mf, mfExt) for(i in 1:ncol(U)) { mfExt[nomiU[i]]<-mf[nomiU[i]]<-U[,i] mfExt[nomiVxb[i]]<-mf[nomiVxb[i]]<-Vxb[,i] } # for (i in 1:ncol(U)) { # assign(nomiU[i], U[, i], envir = KK) # assign(nomiVxb[i], Vxb[, i], envir = KK) # } nnomi <- c(nomiU, nomiVxb) Fo <- update.formula(formula(obj0), as.formula(paste(".~.+", paste(nnomi, collapse = "+")))) #la seguente linea si potrebbe rimuovere perch in mfExt c' gi tutto.. 
if(is.matrix(y)&& (fam$family=="binomial" || fam$family=="quasibinomial")){ mfExt<-cbind(mfExt[[1]], mfExt[,-1]) } objF <- update(obj0, formula = Fo, data = mfExt, evaluate=FALSE) # if(!is.null(objF[["subset"]])) objF[["subset"]]<-NULL objF<-eval(objF, envir=mfExt) #C' un problema..controlla obj (ha due "(Intercepts)" - bhu.. al 27/03/14 non mi sembra! #Pu capitare che psi sia ai margini e ci sono 1 o 2 osservazioni in qualche intervallo. Oppure ce ne # sono di pi ma hanno gli stessi valori di x if(any(is.na(objF$coefficients))){ stop("at least one coef estimate is NA: breakpoint(s) at the boundary? (possibly with many x-values replicated)", call. = FALSE) } #aggiornare qui i weights???? (piuttosto che sotto) #------>>> #------>>> #------>>> objF$offset<- obj0$offset if(!gap){ names.coef<-names(objF$coefficients) if(k==1) {names(obj$coefficients)[match(c("U","V"), names(coef(obj)))]<- nnomi } else { names(obj$coefficients)[match(c(paste("U",1:k, sep=""), paste("V",1:k, sep="")), names(coef(obj)))]<- nnomi } objF$coefficients[names.coef]<-obj$coefficients[names.coef] # objF$coefficients<- if(sum("(Intercept)"==names(obj$coef))==2) obj$coefficients[-2] else obj$coefficients objF$fitted.values<-obj$fitted.values objF$linear.predictors<-obj$linear.predictors objF$residuals<-obj$residuals objF$deviance<-obj$deviance objF$aic<-obj$aic + 2*k objF$weights<-obj$weights } if(any(is.na(objF$coefficients))){ stop("some estimate is NA: premature stopping with a large number of breakpoints?", call. = FALSE) } Cov <- vcov(objF) id <- match(nomiVxb, names(coef(objF))) #cat(id,"\n") #return(objF) vv <- if (length(id) == 1) Cov[id, id] else diag(Cov[id, id]) #if(length(initial)!=length(psi)) initial<-rep(NA,length(psi)) a<-tapply(id.psi.group, id.psi.group, length) #ho sovrascritto "a" di sopra, ma non dovrebbe servire.. 
    initial<-unlist(mapply(function(x,y){if(is.na(x)[1])rep(x,y) else x }, initial.psi, a))
    psi <- cbind(initial, psi, sqrt(vv))
    rownames(psi) <- colnames(Cov)[id]
    colnames(psi) <- c("Initial", "Est.", "St.Err")
    objF$rangeZ <- rangeZ
    objF$psi.history <- psi.values
    objF$psi <- psi
    objF$it <- (it - 1)
    objF$epsilon <- obj$epsilon
    objF$call <- match.call()
    objF$nameUV <- list(U = nomiU, V = rownames(psi), Z = name.Z)
    objF$id.group <- if(length(name.Z)<=1) -rowSums(as.matrix(V))
    objF$id.psi.group <- id.psi.group
    objF$id.warn <- id.warn
    objF$orig.call<-orig.call
    if (model) objF$model <- mf #objF$mframe <- data.frame(as.list(KK))
    if(n.boot>0) objF$seed<-employed.Random.seed
    class(objF) <- c("segmented", class(obj0))
    list.obj[[length(list.obj) + 1]] <- objF
    class(list.obj) <- "segmented"
    if (last) list.obj <- list.obj[[length(list.obj)]]
    return(list.obj)
}
segmented/R/seg.glm.fit.boot.r0000644000175100001440000001263612404051010015701 0ustar hornikusers
seg.glm.fit.boot<-function(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE,
    nonParam=TRUE, random=FALSE){
#random: if TRUE, when the algorithm fails in minimizing f(y), random numbers are used as final estimates.
#   If the algorithm fails in minimizing f(y*), the final estimates (to be used as starting values with
#   the original responses y) *always* are replaced by random numbers (regardless of the random argument)
#nonParam: if TRUE, case resampling is implemented. The semiparametric one depends on the first attempt not failing
#show.history(): if there was a boot restart, it could produce a 2x1 plot of "dev vs it" and "no.of distinct vs it"
#--------
    extract.psi<-function(lista){
        # used to extract the best psi..
        dev.values<-lista[[1]]
        psi.values<-lista[[2]]
        dev.ok<-min(dev.values)
        id.dev.ok<-which.min(dev.values)
        if(is.list(psi.values)) psi.values<-matrix(unlist(psi.values), nrow=length(dev.values), byrow=TRUE)
        if(!is.matrix(psi.values)) psi.values<-matrix(psi.values)
        psi.ok<-psi.values[id.dev.ok,]
        r<-list(dev.no.gap=dev.ok, psi=psi.ok)
        r
    }
#-------------
    if(!nonParam){
        nonParam<-TRUE
        warning("`nonParam' set to TRUE for segmented glm..", call.=FALSE)
    }
    visualBoot<-opz$visualBoot
    opz.boot<-opz
    opz.boot$pow=c(1.1,1.2)
    opz1<-opz
    opz1$it.max <-1
    n<-length(y)
    o0<-try(seg.glm.fit(y, XREG, Z, PSI, w, offs, opz), silent=TRUE)
    rangeZ <- apply(Z, 2, range) # always needed
    if(!is.list(o0)) {
        o0<- seg.glm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol=TRUE)
        o0<-extract.psi(o0)
        if(!nonParam) {warning("using nonparametric boot");nonParam<-TRUE}
    }
    if(is.list(o0)){
        est.psi00<-est.psi0<-o0$psi
        ss00<-o0$dev.no.gap
        if(!nonParam) fitted.ok<-fitted(o0)
    } else {
        if(!nonParam) stop("semiparametric boot requires reasonable fitted values. try a different psi or use nonparam boot")
        if(random) {
            est.psi00<-est.psi0<-apply(rangeZ,2,function(r)runif(1,r[1],r[2]))
            PSI1 <- matrix(rep(est.psi0, rep(nrow(Z), length(est.psi0))), ncol = length(est.psi0))
            o0<-try(seg.glm.fit(y, XREG, Z, PSI1, w, offs, opz1), silent=TRUE)
            ss00<-o0$dev.no.gap
        } else {
            est.psi00<-est.psi0<-apply(PSI,2,mean)
            ss00<-opz$dev0
        }
    }
    all.est.psi.boot<-all.selected.psi<-all.est.psi<-matrix(, nrow=n.boot, ncol=length(est.psi0))
    all.ss<-all.selected.ss<-rep(NA, n.boot)
    if(is.null(size.boot)) size.boot<-n
    Z.orig<-Z
    if(visualBoot) cat(0, " ", formatC(opz$dev0, 3, format = "f"),"", "(No breakpoint(s))", "\n")
    count.random<-0
    for(k in seq(n.boot)){
        PSI <- matrix(rep(est.psi0, rep(nrow(Z), length(est.psi0))), ncol = length(est.psi0))
        if(jt) Z<-apply(Z.orig,2,jitter)
        if(nonParam){
            id<-sample(n, size=size.boot, replace=TRUE)
            o.boot<-try(seg.glm.fit(y[id], XREG[id,,drop=FALSE], Z[id,,drop=FALSE], PSI[id,,drop=FALSE],
                w[id], offs[id], opz), silent=TRUE)
        } else {
            yy<-fitted.ok+sample(residuals(o0),size=n, replace=TRUE)
            # the weights argument of this function is named 'w' ('weights' would resolve to stats::weights)
            o.boot<-try(seg.glm.fit(yy, XREG, Z.orig, PSI, w, offs, opz), silent=TRUE)
        }
        if(is.list(o.boot)){
            all.est.psi.boot[k,]<-est.psi.boot<-o.boot$psi
        } else {
            est.psi.boot<-apply(rangeZ,2,function(r)runif(1,r[1],r[2]))
        }
        PSI <- matrix(rep(est.psi.boot, rep(nrow(Z), length(est.psi.boot))), ncol = length(est.psi.boot))
        opz$h<-max(opz$h*.9, .2)
        opz$it.max<-opz$it.max+1
        o<-try(seg.glm.fit(y, XREG, Z.orig, PSI, w, offs, opz), silent=TRUE)
        if(!is.list(o) && random){
            est.psi00<-est.psi0<-apply(rangeZ,2,function(r)runif(1,r[1],r[2]))
            PSI1 <- matrix(rep(est.psi0, rep(nrow(Z), length(est.psi0))), ncol = length(est.psi0))
            o<-try(seg.glm.fit(y, XREG, Z, PSI1, w, offs, opz1), silent=TRUE)
            count.random<-count.random+1
        }
        if(is.list(o)){
            if(!"coefficients"%in%names(o$obj)) o<-extract.psi(o)
            all.est.psi[k,]<-o$psi
            all.ss[k]<-o$dev.no.gap
            if(o$dev.no.gap<=ifelse(is.list(o0), o0$dev.no.gap, 10^12)) o0<-o
            est.psi0<-o0$psi
            all.selected.psi[k,] <- est.psi0
            all.selected.ss[k]<-o0$dev.no.gap #min(c(o$SumSquares.no.gap, o0$SumSquares.no.gap))
            }
        if(visualBoot) {
            flush.console()
            spp <- if (k < 10) "" else NULL
            cat(k, spp, "", formatC(o0$dev.no.gap, 3, format = "f"), "\n")
            }
        } #end n.boot
    all.selected.psi<-rbind(est.psi00,all.selected.psi)
    all.selected.ss<-c(ss00, all.selected.ss)
    ris<-list(all.selected.psi=drop(all.selected.psi),all.selected.ss=all.selected.ss, all.psi=all.est.psi, all.ss=all.ss)
    if(is.null(o0$obj)){
        PSI1 <- matrix(rep(est.psi0, rep(nrow(Z), length(est.psi0))), ncol = length(est.psi0))
        o0<-try(seg.glm.fit(y, XREG, Z, PSI1, w, offs, opz1), silent=TRUE)
        }
    if(!is.list(o0)) return(0)
    o0$boot.restart<-ris
    return(o0)
    }
segmented/R/print.segmented.R0000644000175100001440000000275212404051010015666 0ustar hornikusers
`print.segmented` <- function(x,digits = max(3, getOption("digits") - 3),...){
    #revised 15/05/03; 24/02/04
    if(is.null(x$psi)) x<-x[[length(x)]]
    if(!"segmented"%in%class(x)) stop("a `segmented' object is requested")
    cat( "Call: " )
    print( x$call )
    cat("\nMeaningful coefficients of the linear terms:\n")
    #print(x$coef[(1:(length(x$coef)-length(x$psi[,2])))])
    iV<- -match(x$nameUV[[2]],names(coef(x)))#iV<- -grep("psi.",names(coef(x)))#indices all but V
    #print(x$coef[iV])
    print.default(format(x$coef[iV], digits = digits), print.gap = 2, quote = FALSE)
    cat("\n")
    cat("Estimated Break-Point(s)",dimnames(x$psi)[[1]],":", format(signif(x$psi[,2],digits)),"\n")
    if("glm"%in%class(x)){
        cat("\nDegrees of Freedom:", x$df.null, "Total (i.e. 
Null); ", x$df.residual, "Residual\n")
        cat("Null Deviance: ", format(signif(x$null.deviance, digits)),
            "\nResidual Deviance:", format(signif(x$deviance, digits)),
            " AIC:", format(signif(x$aic, digits)), "\n")
        }
    if("Arima"%in%class(x)){
        cm <- x$call$method
        if (is.null(cm) || cm != "CSS")
            cat("\nsigma^2 estimated as ", format(x$sigma2, digits = digits),
                ": log likelihood = ", format(round(x$loglik, 2)),
                ", aic = ", format(round(x$aic, 2)), "\n", sep = "")
        else cat("\nsigma^2 estimated as ", format(x$sigma2, digits = digits),
                ": part log likelihood = ", format(round(x$loglik, 2)), "\n", sep = "")
        }
    invisible(x)
    }
segmented/R/vcov.segmented.R0000644000175100001440000000146112404051010015503 0ustar hornikusers
vcov.segmented<-function (object, var.diff=FALSE, ...){
    if(inherits(object, "glm")){
        if(var.diff) warning("option var.diff=TRUE ignored with `glm' objects", call.=FALSE)
        so <- summary.glm(object, correlation = FALSE, ...)
        v<-so$dispersion * so$cov.unscaled
    } else {
        #with multiple segmented variables var.diff is switched off *before* branching,
        #so that the standard covariance matrix is actually the one returned
        if(var.diff && length(object$nameUV$Z)>1) {
            var.diff<-FALSE
            warning("var.diff set to FALSE with multiple segmented variables", call.=FALSE)
            }
        if(var.diff){
            v<-summary.segmented(object, var.diff=TRUE, correlation = FALSE, ...)$cov.var.diff
        } else {
            so<-summary.segmented(object, var.diff=FALSE, correlation = FALSE, ...)
v<-so$sigma^2 * so$cov.unscaled } } return(v) } segmented/R/broken.line.r0000644000175100001440000001275412404051010015031 0ustar hornikusersbroken.line<-function(ogg, term=NULL, link=TRUE, interc=TRUE, se.fit=TRUE){ #ogg: l'oggetto segmented #term: una lista *nominata* con i valori rispetto a cui calcolare i fitted # OPPURE una stringa per indicare la variabile segmented OPPURE NULL (se c' solo una variabile) dummy.matrix<-function(x.values, x.name, obj.seg, psi.est=TRUE){ #given the segmented fit 'obj.seg' and a segmented variable x.name with corresponding values x.values, #this function simply returns a matrix with columns (x, (x-psi)_+, -b*I(x>psi)) #or ((x-psi)_+, -b*I(x>psi)) if obj.seg does not include the coef for the linear "x" f.U<-function(nomiU, term=NULL){ #trasforma i nomi dei coeff U (o V) nei nomi delle variabili corrispondenti #and if 'term' is provided (i.e. it differs from NULL) the index of nomiU matching term are returned k<-length(nomiU) nomiUsenzaU<-strsplit(nomiU, "\\.") nomiU.ok<-vector(length=k) for(i in 1:k){ nomi.i<-nomiUsenzaU[[i]][-1] if(length(nomi.i)>1) nomi.i<-paste(nomi.i,collapse=".") nomiU.ok[i]<-nomi.i } if(!is.null(term)) nomiU.ok<-(1:k)[nomiU.ok%in%term] return(nomiU.ok) } n<-length(x.values) #le seguenti righe selezionavano (ERRONEAMENTE) sia "U1.x" sia "U1.neg.x" (se "x" e "neg.x" erano segmented covariates) #nameU<- grep(paste("\\.",x.name,"$", sep=""), obj.seg$nameUV$U, value = TRUE) #nameV<- grep(paste("\\.",x.name,"$", sep=""), obj.seg$nameUV$V, value = TRUE) nameU<-obj.seg$nameUV$U[f.U(obj.seg$nameUV$U,x.name)] nameV<-obj.seg$nameUV$V[f.U(obj.seg$nameUV$V,x.name)] diffSlope<-coef(obj.seg)[nameU] est.psi<-obj.seg$psi[nameV, "Est."] k<-length(est.psi) PSI <- matrix(rep(est.psi, rep(n, k)), ncol = k) newZ<-matrix(x.values, nrow=n,ncol=k, byrow = FALSE) dummy1<-pmax(newZ-PSI,0) if(psi.est){ V<-ifelse(newZ>PSI,-1,0) dummy2<- if(k==1) V*diffSlope else V%*%diag(diffSlope) #t(diffSlope*t(-I(newZ>PSI))) 
newd<-cbind(x.values,dummy1,dummy2) colnames(newd)<-c(x.name,nameU, nameV) } else { newd<-cbind(x.values,dummy1) colnames(newd)<-c(x.name,nameU) } if(!x.name%in%names(coef(obj.seg))) newd<-newd[,-1,drop=FALSE] return(newd) } #-------------- f.U<-function(nomiU, term=NULL){ #trasforma i nomi dei coeff U (o V) nei nomi delle variabili corrispondenti #and if 'term' is provided (i.e. it differs from NULL) the index of nomiU matching term are returned k<-length(nomiU) nomiUsenzaU<-strsplit(nomiU, "\\.") nomiU.ok<-vector(length=k) for(i in 1:k){ nomi.i<-nomiUsenzaU[[i]][-1] if(length(nomi.i)>1) nomi.i<-paste(nomi.i,collapse=".") nomiU.ok[i]<-nomi.i } if(!is.null(term)) nomiU.ok<-(1:k)[nomiU.ok%in%term] return(nomiU.ok) } #------------- xvalues<-term nomeV <- ogg$nameUV$V nomeU <- ogg$nameUV$U nomeZ <- ogg$nameUV$Z n.seg<-length(nomeZ) if(is.null(xvalues)){ if(n.seg>1) stop("there are multiple segmented covariates. Please specify one.") xvalues<-ogg$model[nomeZ] } if(is.character(xvalues)){ if(!xvalues %in% nomeZ) stop("'xvalues' is not a segmented covariate") xvalues<-ogg$model[xvalues] } nomeOK<-names(xvalues) if(length(nomeOK)>1) stop("Please specify one variable") if(!nomeOK %in% nomeZ) stop("'names(xvalues)' is not a segmented covariate") #if(n.seg>1 && !is.list(x.values)) stop("with multiple segmented covariates, please specify a named dataframe") #x.values<-data.frame(x.values) #names(x.values)<-nomeZ nomi <- names(coef(ogg)) nomiSenzaV <- nomiSenzaU <- nomi nomiSenzaU[match(nomeU, nomi)] <- "" nomiSenzaV[match(nomeV, nomi)] <- "" index <- vector(mode = "list", length = length(nomeZ)) for (i in 1:n.seg) { index[[i]] <- c(match(nomeZ[i], nomi), f.U(ogg$nameUV$U, nomeZ[i]) + (match(ogg$nameUV$U[1], nomi)-1), f.U(ogg$nameUV$V, nomeZ[i]) + (match(ogg$nameUV$V[1], nomi)-1)) #grep(paste("\\.", nomeZ[i], "$", sep = ""), nomiSenzaV, value = FALSE), #grep(paste("\\.", nomeZ[i], "$", sep = ""), nomiSenzaU, value = FALSE)) } ste.fit<-fit <- vector(mode = "list", length = 
length(nomeZ)) for (i in 1:n.seg) { x.name <- nomeZ[i] X<-dummy.matrix(unlist(xvalues), x.name, ogg)#<--NB: xvalues non varia con i!!! perch farlo calcolare comunque? ind <- as.numeric(na.omit(unlist(index[[i]]))) if(interc && "(Intercept)"%in%nomi) { ind<- c(match("(Intercept)",nomi),ind) X<-cbind(1,X) } cof <- coef(ogg)[ind] fit[[i]]<-drop(X%*%cof) ste.fit[[i]] <- if(!se.fit) 10 else sqrt(rowSums((X %*% vcov(ogg)[ind,ind]) * X)) #sqrt(diag(X%*%Var%*%t(X))) } names(fit)<- names(ste.fit)<- nomeZ r<-list(fit=fit[[nomeOK]], se.fit=ste.fit[[nomeOK]]) if (inherits(ogg, what = "glm", FALSE) && !link){ r[[2]] <- ogg$family$mu.eta(r[[1]])*r[[2]] r[[1]] <- ogg$family$linkinv(r[[1]]) } if(!se.fit) r<-r[1] return(r) } segmented/R/draw.history.R0000644000175100001440000000775612404051010015226 0ustar hornikusersdraw.history<-function(obj,term,...){ #show.history() se c' stato boot restart potrebbe produrre un grafico 2x1 di "dev vs it" and "no.of distinct vs it" #-- f.U<-function(nomiU, term=NULL){ #trasforma i nomi dei coeff U (o V) nei nomi delle variabili corrispondenti #and if 'term' is provided (i.e. 
it differs from NULL) the index of nomiU matching term are returned k<-length(nomiU) nomiUsenzaU<-strsplit(nomiU, "\\.") nomiU.ok<-vector(length=k) for(i in 1:k){ nomi.i<-nomiUsenzaU[[i]][-1] if(length(nomi.i)>1) nomi.i<-paste(nomi.i,collapse=".") nomiU.ok[i]<-nomi.i } if(!is.null(term)) nomiU.ok<-(1:k)[nomiU.ok%in%term] return(nomiU.ok) } #-- if(missing(term)){ if(length(obj$nameUV$Z)>1 ) {stop("please, specify `term'")} else {term<-obj$nameUV$Z} } range.ok<-obj$rangeZ[,term] #id.ok<-grep(paste("\\.",term,"$",sep=""), rownames(obj$psi),value=FALSE) id.ok<- f.U(rownames(obj$psi), term) est.psi<-obj$psi[id.ok,2] if(length(obj$psi.history)==5) { #boot (non-autom) par(mfrow=c(1,2)) plot(obj$psi.history$all.selected.ss, type="b", xlab="bootstrap replicates", ylab="RSS (selected values)", xaxt="n", pch=20) axis(1,at=1:length(obj$psi.history$all.selected.ss),cex.axis=.7) #unicit delle soluzioni if(is.vector(obj$psi.history$all.selected.psi)){ psi.matr<-m<-matrix(obj$psi.history$all.selected.psi, ncol=1) } else { psi.matr<-m<-obj$psi.history$all.selected.psi[,id.ok,drop=FALSE] } for(i in 1:nrow(m)) m[i,]<-apply(psi.matr[1:i,,drop=FALSE],2,function(xx)length(unique(xx))) m<-t(t(m)+.1*(0:(ncol(m)-1))) matplot(1:nrow(m),m, pch=1:ncol(m), type="b", col=1:ncol(m), ylab="no. of distinct solutions",xlab="bootstrap replicates", xaxt="n") axis(1,at=1:nrow(m),cex.axis=.7) } else { if(all(diff(sapply(obj$psi.history, length))==0)){ #non-boot, non-autom A<-t(matrix(unlist(obj$psi.history),nrow=nrow(obj$psi),byrow=FALSE)) colnames(A)<-rownames(obj$psi) matplot(1:nrow(A),A[,id.ok],type="o",pch=1:length(est.psi),col=1, xlab="iterations", ylab=paste("breakpoint ","(",term,")",sep=""), ylim=range.ok, xaxt="n",...) 
axis(1,at=1:nrow(A),cex.axis=.7) #if(rug) points(rep(1)) abline(h=est.psi,lty=3) } else { #non-boot, Autom id.iter<-rep(1:length(obj$psi.history), times=sapply(obj$psi.history, length)) psi.history<-unlist(obj$psi.history) nomi<-unlist(sapply(obj$psi.history, names)) d<-data.frame(iter=id.iter, psi=psi.history, nomi=nomi) #associa i nomi delle componenti di $psi.history (che sono indici 1,2,..) con i nomi della variabile term ii<-unique(names(obj$psi.history[[length(obj$psi.history)]])[id.ok]) if(length(ii)>1) stop("some error in the names?..") with(d[d$nomi==ii,], plot(iter, psi, xlab="iterations", ylab=paste("breakpoint ","(",term,")",sep=""), xaxt="n",...)) axis(1,at=unique(d$iter),cex.axis=.7) #se vuoi proprio associare le stime tra le diverse iterazioni #(per poi unire nel grafico i punti con le linee. Ovviamente alcune linee saranno interrotte) # for(i in 1:length(obj$psi.history)) { # a<-obj$psi.history[[i]] # for(j in 1:length(est.psi)){ # psij<-est.psi[j] #a<- ..names match # r[i,j]<-a[which.min(abs(a-psij))] # a<-setdiff(a, r[i,j]) # } # } } } } #end_fn segmented/R/segmented.R0000644000175100001440000000020712404051010014524 0ustar hornikusers`segmented` <- function(obj, seg.Z, psi, control=seg.control(), model=TRUE, ...){ UseMethod("segmented") } segmented/R/seg.def.fit.r0000644000175100001440000001255312404051010014714 0ustar hornikusersseg.def.fit<-function(obj, Z, PSI, mfExt, opz, return.all.sol=FALSE){ #----------------- dpmax<-function(x,y,pow=1){ #deriv pmax if(pow==1) ifelse(x>y, -1, 0) else -pow*pmax(x-y,0)^(pow-1) } #----------- c1 <- apply((Z <= PSI), 2, all) c2 <- apply((Z >= PSI), 2, all) if(sum(c1 + c2) != 0 || is.na(sum(c1 + c2))) stop("psi out of the range") # pow<-opz$pow nomiOK<-opz$nomiOK toll<-opz$toll h<-opz$h gap<-opz$gap stop.if.error<-opz$stop.if.error dev.new<-opz$dev0 visual<-opz$visual id.psi.group<-opz$id.psi.group it.max<-old.it.max<-opz$it.max rangeZ <- apply(Z, 2, range) psi<-PSI[1,] names(psi)<-id.psi.group #H<-1 it <- 1 
epsilon <- 10 dev.values<-psi.values <- NULL id.psi.ok<-rep(TRUE, length(psi)) nomiU<- opz$nomiU nomiV<- opz$nomiV call.ok <- opz$call.ok call.noV <- opz$call.noV fn.obj<-opz$fn.obj toll<-opz$toll k<-ncol(Z) while (abs(epsilon) > toll) { U <- pmax((Z - PSI), 0)^pow[1]#U <- pmax((Z - PSI), 0) V <- dpmax(Z,PSI,pow=pow[2])# ifelse((Z > PSI), -1, 0) for(i in 1:k) { mfExt[nomiU[i]] <- U[,i] mfExt[nomiV[i]] <- V[,i] } obj <- suppressWarnings(eval(call.ok, envir=mfExt)) dev.old<-dev.new dev.new <- dev.new1 <- eval(parse(text=fn.obj), list(x=obj)) #control$f.obj should be something like "sum(x$residuals^2)" or "x$dev" if(return.all.sol) { obj.noV <- suppressWarnings(eval(call.noV, envir=mfExt)) dev.new1 <- eval(parse(text=fn.obj), list(x=obj.noV)) #dev.new1 <- sum(mylm(x = cbind(XREG, U), y = y, w = w, offs = offs)$residuals^2) } dev.values[[length(dev.values) + 1]] <- dev.new1 if (visual) { flush.console() if (it == 1) cat(0, " ", formatC(dev.old, 3, format = "f"), "", "(No breakpoint(s))", "\n") spp <- if (it < 10) "" else NULL cat(it, spp, "", formatC(dev.new, 3, format = "f"), "",length(psi),"\n") #cat(paste("iter = ", it, spp," dev = ",formatC(dev.new,digits=3,format="f"), " n.psi = ",formatC(length(psi),digits=0,format="f"), sep=""), "\n") } epsilon <- (dev.new - dev.old)/(dev.old + .001) obj$epsilon <- epsilon it <- it + 1 obj$it <- it beta.c<-coef(obj)[nomiU] gamma.c<-coef(obj)[nomiV] if (it > it.max) break psi.values[[length(psi.values) + 1]] <- psi.old <- psi # if(it>=old.it.max && h<1) H<-h psi <- psi.old + h*gamma.c/beta.c PSI <- matrix(rep(psi, rep(nrow(Z), length(psi))), ncol = length(psi)) #check if psi is admissible.. 
a <- apply((Z <= PSI), 2, all) #prima era solo < b <- apply((Z >= PSI), 2, all) #prima era solo > if(stop.if.error) { isErr<- (sum(a + b) != 0 || is.na(sum(a + b))) if(isErr) { if(return.all.sol) return(list(dev.values, psi.values)) else stop("(Some) estimated psi out of its range") } } else { id.psi.ok<-!is.na((a+b)<=0)&(a+b)<=0 Z <- Z[,id.psi.ok,drop=FALSE] psi <- psi[id.psi.ok] PSI <- PSI[,id.psi.ok,drop=FALSE] nomiOK<-nomiOK[id.psi.ok] #salva i nomi delle U per i psi ammissibili id.psi.group<-id.psi.group[id.psi.ok] names(psi)<-id.psi.group if(ncol(PSI)<=0) return(0) k<-ncol(Z) } #end else #obj$psi <- psi } #end while psi<-unlist(tapply(psi, id.psi.group, sort)) names(psi)<-id.psi.group PSI <- matrix(rep(psi, rep(nrow(Z), length(psi))), ncol = length(psi)) #aggiunto da qua.. U <- pmax((Z - PSI), 0) V <- ifelse((Z > PSI), -1, 0) for(i in 1:k) { mfExt[nomiU[i]] <- U[,i] mfExt[nomiV[i]] <- V[,i] } ##LA DOMANDA E': PERCHE' QUI STIMA UN MODELLO SENZA V SE POI VIENE RISTIMATO in segmented.default (o segmented.lm o segmented.glm?) ##RE: il valore di SS.new serve per il boot restart. #Invece la domanda : non si pu restituire direttamente obj.new senza bisogno di sostituire i valori in obj ? obj.new <- suppressWarnings(eval(call.noV, envir=mfExt)) SS.new <- eval(parse(text=fn.obj), list(x=obj.new)) #sum(obj.new$residuals^2) if(!gap){ obj<-obj.new #names.coef<-names(obj$coefficients) #obj$coefficients<-c(obj.new$coefficients, rep(0,ncol(V))) #names(obj$coefficients)<-names.coef #obj$residuals<-obj.new$residuals #obj$fitted.values<-obj.new$fitted.values #obj$linear.predictors<-obj.new$linear.predictors #obj$deviance<-obj.new$deviance #obj$weights<-obj.new$weights #obj$aic<-obj.new$aic #+ 2*ncol(V) #ho fatto la modifica in segmented.glm(): "objF$aic<-obj$aic + 2*k" } else { obj <- suppressWarnings(eval(call.ok, envir=mfExt)) } obj$epsilon <- epsilon obj$it <- it #fino a qua.. 
obj<-list(obj=obj,it=it,psi=psi,psi.values=psi.values,U=U,V=V,rangeZ=rangeZ, epsilon=epsilon,nomiOK=nomiOK, SumSquares.no.gap=SS.new, id.psi.group=id.psi.group) #inserire id.psi.ok? return(obj) } segmented/R/davies.test.r0000644000175100001440000002622612404051010015053 0ustar hornikusers#se n=1000: value out of range in 'gammafn' #warning se "lm" con "glm"??? `davies.test` <- function (obj, seg.Z, k = 10, alternative = c("two.sided", "less", "greater"), type=c("lrt","wald"), values=NULL, dispersion=NULL) { # extract.t.value.U<-function(x){ # #estrae il t-value dell'ultimo coeff in un oggetto restituito da lm.fit # #non serve... in realt viene usata extract.t.value.U.glm() # #x<-x$obj # R<-qr.R(x$qr) # p<-ncol(R) # n<-length(x$fitted.values) # invR<-backsolve(R,diag(p)) # hat.sigma2<-sum(x$residuals^2)/(n-p) # #solve(crossprod(qr.X(x$qr))) # V<-tcrossprod(invR)*hat.sigma2 # tt<-x$coefficients[p]/sqrt(V[p,p]) # tt} #------------------------------------------------------------------------------- daviesLM<-function(y, z, xreg, weights, offs, values, k, alternative){ #Davies test with sigma unknown #-------------- #> gammaA<-function(x){ # x^(x-.5)*exp(-x)*sqrt(2*pi)*(1+1/(12*x)+1/(288*x^2)-139/(51840*x^3) -571/(2488320*x^4)) # } #exp(lgamma()) fn="pmax(x-p,0)" y<-y-offs n<-length(y) n1<-length(values) RIS<-matrix(,n1,2) X.psi<-matrix(,n,length(fn)) df.res<- n - ncol(xreg) - length(fn) for(i in 1:n1){ for(j in 1:length(fn)) X.psi[,j]<-eval(parse(text=fn[[j]]), list(x=z, p=values[i])) xx1.new<-cbind(X.psi,xreg) #lrt #mu1.new<-xx1.new%*%solve(crossprod(xx1.new), crossprod(xx1.new,y)) #rss1<-sum((y-mu1.new)^2) #sigma2<-if(missing(sigma)) rss1/(n-ncol(xx1.new)) else sigma^2 #RIS[i]<-((rss0-rss1)/ncol(X.psi))/sigma2 #Wald invXtX1<-try(solve(crossprod(sqrt(weights)*xx1.new)), silent=TRUE) if(class(invXtX1)!="try-error"){ hat.b<-drop(invXtX1%*%crossprod(weights*xx1.new,y)) mu1.new<-xx1.new%*%hat.b devE<-sum((weights*(y-mu1.new)^2)) hat.sigma<- sqrt(devE/df.res) 
RIS[i,1]<-hat.b[1]/(hat.sigma*sqrt(invXtX1[1, 1])) Z<-hat.b[1]/(sqrt(invXtX1[1, 1])) D2<- Z^2 + devE RIS[i,2]<-Z^2/D2 #beta } } valori<-values[!is.na(RIS[,1])] RIS<- RIS[!is.na(RIS[,1]),] V<-sum(abs(diff(asin(RIS[,2]^.5)))) onesided <- TRUE if (alternative == "less") { M <- min(RIS[,1]) best<-valori[which.min(RIS[,1])] p.naiv <- pt(M, df=df.res, lower.tail = TRUE) } else if (alternative == "greater") { M <- max(RIS[,1]) best<-valori[which.max(RIS[,1])] p.naiv <- pt(M, df=df.res, lower.tail = FALSE) } else { M <- max(abs(RIS[,1])) best<-valori[which.max(abs(RIS[,1]))] p.naiv <- pt(M, df=df.res, lower.tail = FALSE) onesided <- FALSE } u<-M^2/((n-ncol(xx1.new))+ M^2) approxx<-V*(((1-u)^((df.res-1)/2))*gamma(df.res/2+.5))/(2*gamma(df.res/2)*pi^.5) p.adj <- p.naiv + approxx p.adj <- ifelse(onesided, 1, 2) * p.adj p.adj<-list(p.adj=p.adj, valori=valori, ris.valori=RIS[,1], best=best) return(p.adj) # M<-max(abs(RIS[,1])) # u<-M^2/((n-ncol(xx1.new))+ M^2) # approxx<-V*(((1-u)^((df.res-1)/2))*gamma(df.res/2+.5))/(2*gamma(df.res/2)*pi^.5) # p.naiv<-pt(-abs(M), df=df.res) #naive p-value # p.adj<-2*(p.naiv+approxx) #adjusted p-value (upper bound) # p.adj<-min(p.adj, 1) # p.adj<-list(p.adj=p.adj, valori=values, ris.valori=RIS[,1], approxx=approxx, p.naiv=p.naiv) # return(p.adj) } #-------------------------------- daviesGLM<-function(y, z, xreg, weights, offs, values=NULL, k, list.glm, alternative){ #Davies test for GLM (via LRT or Wald) est.dispGLM<-function(object){ df.r <- object$df.residual dispersion <- if(object$family$family%in%c("poisson","binomial")) 1 else object$dev/df.r dispersion } extract.t.value.U.glm<-function(object,dispersion,isGLM=TRUE){ #estrae il t-value dell'ultimo coeff in un oggetto restituito da lm.wfit/glm.fit est.disp <- FALSE df.r <- object$df.residual if (is.null(dispersion)) dispersion <- if(isGLM&&(object$family$family%in%c("poisson","binomial"))) 1 else if (df.r > 0) { est.disp <- TRUE if (any(object$weights == 0)) warning("observations with zero 
weight not used for calculating dispersion") sum((object$weights * object$residuals^2)[object$weights > 0])/df.r } else { est.disp <- TRUE NaN } dispersion<-max(c(dispersion, 1e-10)) p <- object$rank p1 <- 1L:p Qr <- object$qr coef.p <- object$coefficients[Qr$pivot[p1]] covmat.unscaled <- chol2inv(Qr$qr[p1, p1, drop = FALSE]) dimnames(covmat.unscaled) <- list(names(coef.p), names(coef.p)) covmat <- dispersion * covmat.unscaled tvalue <- coef.p[1]/sqrt(covmat[1,1]) #<0.4.0-0 era coef.p[p]/sqrt(covmat[p,p]) tvalue }#end extract.t.value.U.glm #-------------- fn<-"pmax(x-p,0)" dev0<-list.glm$dev0 eta0<-list.glm$eta0 family=list.glm$family type<-list.glm$type dispersion<-list.glm$dispersion n<-length(y) r<-length(fn) n1<-length(values) RIS<-rep(NA, n1) X.psi<-matrix(,n,length(fn)) for(i in 1:n1){ for(j in 1:length(fn)) X.psi[,j]<-eval(parse(text=fn[[j]]), list(x=z, p=values[i])) xreg1<-cbind(X.psi,xreg) o1<-glm.fit(x = xreg1, y = y, weights = weights, offset = offs, family=family, etastart=eta0) dev<-o1$dev if (is.list(o1) && ncol(xreg1)==o1$rank) { RIS[i]<- if(type=="lrt") sqrt((dev0-dev)/est.dispGLM(o1))*sign(o1$coef[1]) else extract.t.value.U.glm(o1,dispersion) } } valori<-values[!is.na(RIS)] ris.valori<-RIS[!is.na(RIS)] V<-sum(abs(diff(ris.valori))) #-----Questo se il test di riferimento una \chi^2_r. 
(Dovresti considerare il LRT non segnato) #V<-sum(abs(diff(sqrt(RIS))))#nota sqrt #M<- max(RIS) #approxx<-(V*(M^((r-1)/2))*exp(-M/2)*2^(-r/2))/gamma(r/2) #p.naiv<-1-pchisq(M,df=r) #naive p-value #p.adj<-min(p.naiv+approxx,1) #adjusted p-value (upper bound) onesided <- TRUE if (alternative == "less") { M <- min(ris.valori) best<-valori[which.min(ris.valori)] p.naiv <- pnorm(M, lower.tail = TRUE) } else if (alternative == "greater") { M <- max(ris.valori) best<-valori[which.max(ris.valori)] p.naiv <- pnorm(M, lower.tail = FALSE) } else { M <- max(abs(ris.valori)) best<-valori[which.max(abs(ris.valori))] p.naiv <- pnorm(M, lower.tail = FALSE) onesided <- FALSE } approxx<-V*exp(-(M^2)/2)/sqrt(8*pi) p.adj <- p.naiv + approxx p.adj <- ifelse(onesided, 1, 2) * p.adj p.adj<-list(p.adj=p.adj, valori=valori, ris.valori=ris.valori, best=best) return(p.adj) } #------------------------------------------------------------------------------- if(!inherits(obj, "lm")) stop("A 'lm', 'glm', or 'segmented' model is requested") if(class(seg.Z)!="formula") stop("'seg.Z' should be an one-sided formula") alternative <- match.arg(alternative) type <- match.arg(type) if(length(all.vars(seg.Z))>1) warning("multiple segmented variables ignored in 'seg.Z'",call.=FALSE) isGLM<-"glm"%in%class(obj) Call<-mf<-obj$call mf$formula<-formula(obj) m <- match(c("formula", "data", "subset", "weights", "na.action","offset"), names(mf), 0L) mf <- mf[c(1, m)] mf$drop.unused.levels <- TRUE mf[[1L]] <- as.name("model.frame") mf$formula<-update.formula(mf$formula,paste(seg.Z,collapse=".+")) formulaOrig<-formula(obj) if(class(obj)[1]=="segmented"){ mf$formula<-update.formula(mf$formula,paste("~.-",paste(obj$nameUV$V, collapse="-"))) for(i in 1:length(obj$nameUV$U)) assign(obj$nameUV$U[i], obj$model[,obj$nameUV$U[i]], envir=parent.frame()) formulaOrig<-update.formula(formulaOrig, paste("~.-",paste(obj$nameUV$V, collapse="-"))) } mf <- eval(mf, parent.frame()) weights <- as.vector(model.weights(mf)) offs <- 
as.vector(model.offset(mf)) if(!is.null(Call$weights)){ #"(weights)"%in%names(mf) names(mf)[which(names(mf)=="(weights)")]<-all.vars(Call$weights) #as.character(Call$weights) #aggiungere??? # mf["(weights)"]<-weights } mt <- attr(mf, "terms") interc<-attr(mt,"intercept") y <- model.response(mf, "any") XREG <- if (!is.empty.model(mt)) model.matrix(mt, mf, contrasts) n <- nrow(XREG) if (is.null(weights)) weights <- rep(1, n) if (is.null(offs)) offs <- rep(0, n) name.Z <- all.vars(seg.Z) Z<-XREG[,match(name.Z, colnames(XREG))] if(!name.Z %in% names(coef(obj))) XREG<-XREG[,-match(name.Z, colnames(XREG)),drop=FALSE] list.glm<-list(dev0=obj$dev, eta0=obj$linear.predictor, family=family(obj), type=type, dispersion=dispersion) if(is.null(values)) values<-seq(sort(Z)[2], sort(Z)[(n - 1)], length = k) #values<-seq(min(z), max(z), length=k+2) #values<-values[-c(1,length(values))] if(class(obj)=="lm" || identical(class(obj),c("segmented","lm")) ) { if(n<=300) { rr<-daviesLM(y=y, z=Z, xreg=XREG, weights=weights, offs=offs, values=values, k=k, alternative=alternative) } else { list.glm$family<-gaussian() list.glm$type<-"wald" rr<-daviesGLM(y=y, z=Z, xreg=XREG, weights=weights, offs=offs, values=values, k=k, list.glm=list.glm, alternative=alternative) } } if(identical(class(obj),c("glm","lm")) || identical(class(obj),c("segmented","glm","lm"))) rr<-daviesGLM(y=y, z=Z, xreg=XREG, weights=weights, offs=offs, values=values, k=k, list.glm=list.glm, alternative=alternative) best<-rr$best p.adj<-rr$p.adj valori<-rr$valori ris.valori<-rr$ris.valori if(is.null(obj$family$family)) { famiglia<-"gaussian" legame<-"identity"} else { famiglia<-obj$family$family legame<-obj$family$link } out <- list(method = "Davies' test for a change in the slope", # data.name=paste("Model = ",famiglia,", link =", legame, # "\nformula =", as.expression(formulaOrig), # "\nsegmented variable =", name.Z), data.name=paste("formula =", as.expression(formulaOrig), ", method =", obj$call[[1]] , "\nmodel 
=",famiglia,", link =", legame, if(isGLM) paste(", statist =", type) else NULL , "\nsegmented variable =", name.Z), statistic = c("'best' at" = best), parameter = c(n.points = length(valori)), p.value = min(p.adj,1), alternative = alternative, process=cbind(psi.values=valori, stat.values=ris.valori)) class(out) <- "htest" return(out) } segmented/MD50000644000175100001440000000523312404065367012565 0ustar hornikusers89c5b426059216bc58d2bd9a10454824 *DESCRIPTION 03e16ef6581d9f8c6ee9b367cacd0d37 *NAMESPACE 88077ce27941e15ddd228ff0622b7102 *NEWS a481e58102fe61bad190eba6b1deac8d *R/broken.line.r 1bd8474f32b800fb3f2053d46c5db6d6 *R/confint.segmented.R 3c2bc1e4ad1daf2ebf3115d2ab13e36b *R/davies.test.r 7a4a8900dd3874e96b48845a570cf034 *R/draw.history.R 59d13ca441f44ba0c78f16f18bad6ff4 *R/intercept.r d13b70896a274648f648d69805d10bdc *R/lines.segmented.R 16599033c41efebba3a608fe0a60b208 *R/plot.segmented.R 42b5fbaa6fc832cfb5ae7ff899751c57 *R/points.segmented.r 8680f6075d6d37d25683d9dfa25aabd1 *R/predict.segmented.r cb1b9ef0474e68189b698bd8444fa472 *R/print.segmented.R 720d02db63f44c4811f7f62bf314c86e *R/print.summary.segmented.R 1897a3de17c45e04d74c97ba862551e8 *R/seg.control.R 2f486654546b6e73570925f515c8e820 *R/seg.def.fit.boot.r a63495696fc3cc796c78de78ef5adb57 *R/seg.def.fit.r 10eebb3f2c255bf0a825beef1c8e7344 *R/seg.glm.fit.boot.r 10fabbc8bdf949e91f18afd80266b90e *R/seg.glm.fit.r 3bcb168b00f34c433a482379d56a2da5 *R/seg.lm.fit.boot.r 5bb2070b1c3c8368a17c2a19e6b9da4b *R/seg.lm.fit.r 6784ac55ef0763e1f1923bab5a59835d *R/segmented.R 9d860b5368ab0f9ac962c2444c87048f *R/segmented.default.r 1ce16238b0e3732d4db3f03f28361b37 *R/segmented.glm.R f8bca39ce78d419556c79970a95d889f *R/segmented.lm.R c1240c856962a8db2c96de2600bdc10b *R/slope.R 60be9690ff10a73600058bdfd0cd7119 *R/summary.segmented.R 0550842891c0fc3a46bef2420a2a85ed *R/vcov.segmented.R cd89d85a25986a2107d5565f5fce91c9 *data/down.rda 2fbc9f6e83ea01a4975e7de43e76eaf7 *data/plant.rda 890397cdbc744a6cf1f2d25b4af26c21 
*data/stagnant.rda 6227e4f7236f49f3debc821d4c98f7d1 *inst/CITATION 9d64af4e32959d9c47aadb932600d285 *man/broken.line.Rd e38b11eb6601e9544619dfac019d96a6 *man/confint.segmented.Rd f64106876c1b67f7b80b5c7c2656a597 *man/davies.test.Rd a1fd6bbde564db5be2a9dde1ecedfb2d *man/down.Rd bb4e4f96d8eee8acb68b5773ab2a5c5a *man/draw.history.Rd 9da286da419f12546ea551a5370141c7 *man/intercept.Rd 7a543b5123d9c64b5a4da6aba4e1d3ea *man/lines.segmented.Rd 925cd1c6b4a05d3fca632a7f93999d00 *man/plant.Rd cf5677a603d20c26097d50cce848d121 *man/plot.segmented.Rd 2118ad3e7a591dde00a9ac1d2f9688e7 *man/points.segmented.Rd 1ba7c089bae8411532d0f40ff4091c7b *man/predict.segmented.Rd c5ff81f292c40cdc317fe2ddfc99c4cd *man/print.segmented.Rd fbfd4ad065b408d53ccb43f348e110cc *man/seg.control.Rd 76d1c89e7213478643225ee87c7d287a *man/seg.lm.fit.Rd 27063e9405c643a3e072fa1be06011ab *man/segmented-package.Rd af2f6eb22adae7d5d1c1ed058bc06e19 *man/segmented.Rd 866f3920741a92dd16afa46d31a38ca8 *man/slope.Rd 1c8095f4472b4cabf3dc3cae5132260f *man/stagnant.Rd 706dec93d7feda2b0e6ef5f878529db2 *man/summary.segmented.Rd 99a8c632f6534f724d479cd4bca71824 *man/vcov.segmented.Rd segmented/DESCRIPTION0000644000175100001440000000126512404065367013764 0ustar hornikusersPackage: segmented Type: Package Title: Segmented relationships in regression models with breakpoints/changepoints estimation Version: 0.5-0.0 Date: 2014-09-10 Maintainer: Vito M. R. Muggeo Authors@R: c(person(given = c("Vito","M.","R."), family = "Muggeo", role = c("aut", "cre"), email = "vito.muggeo@unipa.it")) Description: Given a regression model, segmented `updates' the model by adding one or more segmented relationships. Several variables with multiple breakpoints are allowed. License: GPL Packaged: 2014-09-10 13:28:40 UTC; user Author: Vito M. R. 
Muggeo [aut, cre]
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2014-09-10 17:15:03
segmented/man/0000755000175100001440000000000012404051010013001 5ustar hornikusers
segmented/man/stagnant.Rd0000644000175100001440000000161012404051010015105 0ustar hornikusers
\name{stagnant}
\alias{stagnant}
\docType{data}
\title{Stagnant band height data}
\description{
  The \code{stagnant} data frame has 28 rows and 2 columns.
}
\usage{data(stagnant)}
\format{
  A data frame with 28 observations on the following 2 variables.
  \describe{
    \item{\code{x}}{log of flow rate in g/cm sec.}
    \item{\code{y}}{log of band height in cm}
  }
}
\details{
  Bacon and Watts report that such data were obtained by R.A. Cook during his investigation of the
  behaviour of stagnant surface layer height in a controlled flow of water.
}
\source{
  Bacon D.W., Watts D.G. (1971) Estimating the transition between two intersecting straight lines.
  \emph{Biometrika} \bold{58}: 525 -- 534.

  Originally from the PhD thesis by R.A. Cook.
}
%\references{
% PhD thesis by R.A. Cook
%}
\examples{
data(stagnant)
## plot(stagnant)
}
\keyword{datasets}
segmented/man/predict.segmented.Rd0000644000175100001440000000440212404051010016674 0ustar hornikusers
\name{predict.segmented}
\alias{predict.segmented}
%- Also NEED an '\alias' for EACH other topic documented here.
\title{
Predict method for segmented model fits
}
\description{
Returns predictions and optionally associated quantities (standard errors or confidence intervals)
from a fitted segmented model object.
}
\usage{
\method{predict}{segmented}(object, newdata, ...)
}
%- maybe also 'usage' for other objects documented here.
\arguments{
  \item{object}{
a fitted segmented model coming from \code{segmented.lm} or \code{segmented.glm}.
}
  \item{newdata}{
An optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used.
}
  \item{\dots}{
further arguments passed to \code{predict.lm} or \code{predict.glm}. 
Usually these are \code{se.fit}, or \code{interval} or \code{type}. } } \details{ Basically \code{predict.segmented} builds the right design matrix accounting for breakpoint and passes it to \code{predict.lm} or \code{predict.glm} depending on the actual model fit \code{object}. } \value{ \code{predict.segmented} produces a vector of predictions with possibly associated standard errors or confidence intervals. See \code{predict.lm} or \code{predict.glm}. } %\references{ %% ~put references to the literature/web site here ~ %} \author{ Vito Muggeo } \note{ If \code{type="terms"}, \code{predict.segmented} returns predictions for each component of the segmented term. Namely if `my.x' is the segmented variable, predictions for `my.x', `U1.my.x' and `psi1.my.x' are returned. These are meaningless individually, however their sum provides the predictions for the segmented term. } %% ~Make other sections like Warning with \section{Warning }{....} ~ \seealso{ \code{\link{plot.segmented}}, \code{\link{broken.line}}, \code{\link{predict.lm}}, \code{\link{predict.glm}} } \examples{ n=10 x=seq(-3,3,l=n) set.seed(1515) y <- (x<0)*x/2 + 1 + rnorm(x,sd=0.15) segm <- segmented(lm(y ~ x), ~ x, psi=0.5) predict(segm,se.fit = TRUE)$se.fit #wrong (smaller) st.errors (assuming known the breakpoint) olm<-lm(y~x+pmax(x-segm$psi[,2],0)) predict(olm,se.fit = TRUE)$se.fit } % \dontrun{..} % KEYWORDS - R documentation directory. \keyword{models} \keyword{regression} segmented/man/vcov.segmented.Rd0000644000175100001440000000306712404051010016225 0ustar hornikusers\name{vcov.segmented} \alias{vcov.segmented} %- Also NEED an '\alias' for EACH other topic documented here. \title{ Variance-Covariance Matrix for a Fitted Segmented Model} \description{ Returns the variance-covariance matrix of the parameters (including breakpoints) of a fitted segmented model object.} \usage{ \method{vcov}{segmented}(object, var.diff = FALSE, ...) } %- maybe also 'usage' for other objects documented here. 
\arguments{ \item{object}{a fitted model object of class "segmented", returned by any \code{segmented} method.} \item{var.diff}{logical. If \code{var.diff=TRUE} and there is a single segmented variable, the covariance matrix is computed using a sandwich-type formula. See Details in \code{\link{summary.segmented}}.} \item{\dots}{additional arguments. } } \details{ The returned covariance matrix is based on an approximation of the nonlinear segmented term. Therefore covariances corresponding to breakpoints are reliable only in large samples and/or clear cut segmented relationships. } \value{ The full matrix of the estimated covariances between the parameter estimates, including the breakpoints. } %\references{} \author{Vito M. R. Muggeo, \email{vito.muggeo@unipa.it}} \note{\code{var.diff=TRUE} works when there is a single segmented variable.} \seealso{\code{\link{summary.segmented}}} \examples{ ##continues example from summary.segmented() # vcov(oseg) # vcov(oseg,var.diff=TRUE) } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. \keyword{regression} segmented/man/points.segmented.Rd0000644000175100001440000000364612404051010016567 0ustar hornikusers\name{points.segmented} \alias{points.segmented} %- Also NEED an '\alias' for EACH other topic documented here. \title{ Points method for segmented objects } \description{ Takes a fitted \code{segmented} object returned by \code{segmented()} and adds on the current plot the joinpoints of the fitted broken-line relationships. } \usage{ \method{points}{segmented}(x, term, interc = TRUE, link = TRUE, ...) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{x}{ an object of class \code{segmented}. } \item{term}{ the segmented variable of interest. It may be unspecified when there is a single segmented variable. } \item{interc}{ If \code{TRUE} the computed joinpoints include the model intercept (if it exists). 
} \item{link}{ when \code{TRUE} (default), the fitted joinpoints are plotted on the link scale } \item{\dots}{ other graphics parameters to pass on to \code{points()} function. } } \details{ We call 'joinpoint' the plane point having as coordinates the breakpoint (on the x scale) and the fitted value of the segmented relationship at that breakpoint (on the y scale). \code{points.segmented()} simply adds the fitted joinpoints on the current plot. This could be useful to emphasize the changes of the piecewise linear relationship. } %\value{ %% ~Describe the value returned %% If it is a LIST, use %% \item{comp1 }{Description of 'comp1'} %% \item{comp2 }{Description of 'comp2'} %% ... %} %\references{ %% ~put references to the literature/web site here ~ %} %\author{ %% ~~who you are~~ %} %\note{ %% ~~further notes~~ %} %% ~Make other sections like Warning with \section{Warning }{....} ~ \seealso{ \code{\link{plot.segmented}} to plot the fitted segmented lines. } \examples{ \dontrun{ #continues from ?plot.segmented points(o.seg,col=2) } } \keyword{ nonlinear } \keyword{ regression }% __ONLY ONE__ keyword per line segmented/man/plot.segmented.Rd0000644000175100001440000001172712404051010016230 0ustar hornikusers\name{plot.segmented} \alias{plot.segmented} %- Also NEED an '\alias' for EACH other topic documented here. \title{ Plot method for segmented objects } \description{ Takes a fitted \code{segmented} object returned by \code{segmented()} and plots (or adds) the fitted broken-line for the selected segmented term. } \usage{ \method{plot}{segmented}(x, term, add=FALSE, res=FALSE, conf.level=0, interc=TRUE, link=TRUE, res.col=1, rev.sgn=FALSE, const=0, shade=FALSE, rug=TRUE, show.gap=FALSE, ...) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{x}{ a fitted \code{segmented} object. } \item{term}{ the segmented variable having the piece-wise relationship to be plotted. 
If there is a single segmented variable in the fitted model \code{x}, \code{term} can be omitted.} \item{add}{ when \code{TRUE} the fitted lines are added to the current device.} \item{res}{ when \code{TRUE} the fitted lines are plotted along with corresponding partial residuals. See Details.} \item{conf.level}{ If greater than zero, it means the confidence level at which the pointwise confidence intervals have to be plotted.} \item{interc}{ If \code{TRUE} the computed segmented components include the model intercept (if it exists).} \item{link}{ when \code{TRUE} (default), the fitted lines are plotted on the link scale, otherwise they are transformed to the response scale before plotting. Ignored for linear segmented fits. } \item{res.col}{when \code{res=TRUE} it means the color of the points representing the partial residuals.} \item{rev.sgn}{ when \code{TRUE} it is assumed that current \code{term} is `minus' the actual segmented variable, therefore the sign is reversed before plotting. This is useful when a null-constraint has been set on the last slope.} \item{const}{ constant to add to each fitted segmented relationship (on the scale of the linear predictor) before plotting.} \item{shade}{if \code{TRUE} and \code{conf.level>0} it produces shaded regions (in grey color) for the pointwise confidence intervals embracing the fitted segmented line. } \item{rug}{when \code{TRUE} (default) then the covariate values are displayed as a rug plot at the foot of the plot.} \item{show.gap}{ when \code{FALSE} the (possible) gaps between the fitted lines at the estimated breakpoints are hidden.
When bootstrap restarting has been employed (default in \code{segmented}), \code{show.gap} is meaningless as the gap coefficients are always set to zero in the fitted model.} \item{\dots}{ other graphics parameters to pass to plotting commands: `col', `lwd' and `lty' (that can be vectors, see the example below) for the fitted piecewise lines; `ylab', `xlab', `main', `sub', `xlim' and `ylim' when a new plot is produced (i.e. when \code{add=FALSE}); `pch' and `cex' for the partial residuals (when \code{res=TRUE}). } } \details{ Produces (or adds to the current device) the fitted segmented relationship between the response and the selected \code{term}. If the fitted model includes just a single `segmented' variable, \code{term} may be omitted. Due to the parameterization of the segmented terms, sometimes the fitted lines may not appear to join at the estimated breakpoints. If this is the case, the apparent `gap' would indicate some lack-of-fit. However, since version 0.2-9.0, the gap coefficients are set to zero by default (see argument \code{gap} in \code{\link{seg.control}}). The partial residuals are computed as `fitted + residuals', where `fitted' are the fitted values of the segmented relationship. Notice that for GLMs the residuals are the response residuals if \code{link=FALSE} and the working residuals weighted by the IWLS weights if \code{link=TRUE}. } \value{ None. } %\references{ } \author{ Vito M. R. Muggeo } \note{ For models with offset, partial residuals on the response scale are not defined. Thus \code{plot.segmented} does not work when \code{link=FALSE}, \code{res=TRUE}, and the fitted model includes an offset.} % % ~Make other sections like Warning with \section{Warning }{....} ~ %} \seealso{ \code{\link{lines.segmented}} to add the estimated breakpoints on the current plot. \code{\link{points.segmented}} to add the joinpoints of the segmented relationship.
\code{\link{predict.segmented}} to compute standard errors and confidence intervals for predictions from a "segmented" fit. } \examples{ set.seed(1234) z<-runif(100) y<-rpois(100,exp(2+1.8*pmax(z-.6,0))) o<-glm(y~z,family=poisson) o.seg<-segmented(o,seg.Z=~z,psi=list(z=.5)) par(mfrow=c(2,1)) plot(o.seg, conf.level=0.95, shade=TRUE) plot(z,y) ## add the fitted lines using different colors and styles.. plot(o.seg,add=TRUE,link=FALSE,lwd=2,col=2:3, lty=c(1,3)) lines(o.seg,col=2,pch=19,bottom=FALSE,lwd=2) } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. \keyword{ regression } \keyword{ nonlinear } \keyword{ hplot }segmented/man/draw.history.Rd0000644000175100001440000000322312404051010015725 0ustar hornikusers\name{draw.history} \alias{draw.history} %- Also NEED an '\alias' for EACH other topic documented here. \title{ History for the breakpoint estimates } \description{ Displays breakpoint iteration values for segmented fits. } \usage{ draw.history(obj, term, ...) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{obj}{ a segmented fit returned by any "segmented" method. } \item{term}{ a character to mean the `segmented' variable whose breakpoint values throughout iterations have to be displayed. } \item{\dots}{ graphic parameters to be passed to \code{matplot()}. } } \details{ For a given \code{term} in a segmented fit, \code{draw.history()} displays the different breakpoint values obtained during the estimating process, since the starting values up to the final ones. When bootstrap restarting is employed, \code{draw.history()} produces two plots, the values of objective function and the number of distinct solutions against the bootstrap replicates. } \value{ None. } %\references{ } \author{ Vito M.R. 
Muggeo } %\note{ ~~further notes~~ % ~Make other sections like Warning with \section{Warning }{....} ~ %} %\seealso{ ~~objects to See Also as \code{\link{help}}, ~~~ } \examples{ data(stagnant) os<-segmented(lm(y~x,data=stagnant),seg.Z=~x,psi=-.8) draw.history(os) #diagnostics with boot restarting os<-segmented(lm(y~x,data=stagnant),seg.Z=~x,psi=-.8, control=seg.control(n.boot=0)) draw.history(os) #diagnostics without boot restarting } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. \keyword{ regression } \keyword{ nonlinear } segmented/man/down.Rd0000644000175100001440000000203512404051010014237 0ustar hornikusers\name{down} \alias{down} \docType{data} \title{ Down syndrome in babies} \description{ The \code{down} data frame has 30 rows and 3 columns. Variable \code{cases} means the number of babies with Down syndrome out of total number of births \code{births} for mothers with mean age \code{age}. } \usage{data(down)} \format{ A data frame with 30 observations on the following 3 variables. \describe{ \item{\code{age}}{the mothers' mean age.} \item{\code{births}}{count of total births.} \item{\code{cases}}{count of babies with Down syndrome.} } } %\details{ % ~~ If necessary, more details than the description above ~~ %} \source{ Davison, A.C. and Hinkley, D. V. (1997) \emph{Bootstrap Methods and their Application}. Cambridge University Press. } \references{ Geyer, C. J. (1991) Constrained maximum likelihood exemplified by isotonic convex logistic regression. \emph{Journal of the American Statistical Association} \bold{86}, 717--724. } \examples{ data(down) } \keyword{datasets} segmented/man/seg.control.Rd0000644000175100001440000001746412404051010015541 0ustar hornikusers\name{seg.control} \alias{seg.control} %- Also NEED an '\alias' for EACH other topic documented here. \title{ Auxiliary for controlling segmented model fitting } \description{ Auxiliary function as user interface for 'segmented' fitting. 
Typically only used when calling any 'segmented' method (\code{segmented.lm} or \code{segmented.glm}). } \usage{ seg.control(toll = 1e-04, it.max = 10, display = FALSE, stop.if.error = TRUE, K = 10, quant = FALSE, last = TRUE, maxit.glm = 25, h = 1, n.boot=20, size.boot=NULL, gap=FALSE, jt=FALSE, nonParam=TRUE, random=TRUE, powers=c(1,1), seed=NULL, fn.obj=NULL) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{toll}{ positive convergence tolerance. } \item{it.max}{ integer giving the maximal number of iterations. } \item{display}{ logical indicating if the value of the \emph{working} objective function should be printed at each iteration. The \emph{working} objective function is the objective function of the working model including the gap coefficients (and therefore it should not be compared with the value at convergence). If bootstrap restarting is employed, the value of the \emph{real} objective function (without gap coefficients) after every bootstrap iteration is printed. This value should decrease throughout the iterations.} \item{stop.if.error}{ logical indicating if non-admissible break-points should be removed during the estimating algorithm. Set it to \code{FALSE} if you want to perform a sort of `automatic' breakpoint selection, provided that several starting values are provided for the breakpoints. See argument \code{psi} in \code{\link{segmented.lm}} or \code{\link{segmented.glm}}. The idea of removing `non-admissible' break-points during the iterative process is discussed in Muggeo and Adelfio (2011) and it is not compatible with the bootstrap restart algorithm. This approach, indeed, should be considered as a preliminary and tentative approach to deal with an unknown number of breakpoints. } \item{K}{ the number of quantiles (or equally-spaced values) to supply as starting values for the breakpoints when the \code{psi} argument of \code{segmented} is set to \code{NA}. 
\code{K} is ignored when \code{psi} is different from \code{NA}. } \item{quant}{logical, indicating how the starting values should be selected. If \code{FALSE} equally-spaced values are used, otherwise the quantiles. Ignored when \code{psi} is different from \code{NA}.} \item{last}{ logical indicating if output should include only the last fitted model.} \item{maxit.glm}{ integer giving the maximum number of inner IWLS iterations (see details). } \item{h}{ positive factor (from zero to one) modifying the increments in breakpoint updates during the estimation process (see details). } \item{n.boot}{ number of bootstrap samples used in the bootstrap restarting algorithm. If 0 the standard algorithm, i.e. without bootstrap restart, is used. Default to 20 that appears to be sufficient in most of problems. However when multiple breakpoints have to be estimated it is suggested to increase \code{n.boot}, e.g. \code{n.boot=50}.} \item{size.boot}{the size of the bootstrap samples. If \code{NULL}, it is taken equal to the actual sample size.} \item{gap}{logical, if \code{FALSE} the gap coefficients are \emph{always} constrained to zero at the convergence.} \item{jt}{logical. If \code{TRUE} the values of the segmented variable(s) are jittered before fitting the model to the bootstrap resamples.} \item{nonParam}{ if \code{TRUE} nonparametric bootstrap (i.e. case-resampling) is used, otherwise residual-based. Currently working only for LM fits. It is not clear what residuals should be used for GLMs.} \item{random}{ if \code{TRUE}, when the algorithm fails to obtain a solution, random values are employed to obtain candidate values. } \item{powers}{ The powers of the pseudo covariates employed by the algorithm. These are possibly altered during the iterative process to stabilize the estimation procedure. Usually of no interest for the user. } \item{seed}{ The seed to be passed on to \code{set.seed()} when \code{n.boot>0}. 
Setting the seed can be useful to replicate the results when the bootstrap restart algorithm is employed. In fact a segmented fit includes \code{seed} representing the integer vector saved just before the bootstrap resampling. Re-use it if you want to replicate the bootstrap restarting algorithm with the \emph{same} samples. } \item{fn.obj}{ A character string to be used (optionally) only when \code{segmented.default} is used. It represents the function (with argument \code{'x'}) to be applied to the fit object to extract the objective function to be \emph{minimized}. Thus for \code{"lm"} fits (although unnecessary) it should be \code{fn.obj="sum(x$residuals^2)"}, for \code{"coxph"} fits it should be \code{fn.obj="-x$loglik[2]"}. If \code{NULL} the `minus log likelihood' extracted from the object, namely \code{"-logLik(x)"}, is used. See \code{\link{segmented.default}}. } } \details{ A `segmented' GLM is fitted by iteratively fitting standard GLMs. The number of (outer) iterations is governed by \code{it.max}, while the (maximum) number of (inner) iterations to fit the GLM at each fixed value of psi is fixed via \code{maxit.glm}. Usually three or four inner iterations are sufficient. When the starting value for the breakpoints is set to \code{NA} for any segmented variable specified in \code{seg.Z}, \code{K} values (quantiles or equally-spaced) are selected as starting values for the breakpoints. In this case, it may be useful to set also \code{stop.if.error=FALSE} to automate the procedure, see Muggeo and Adelfio (2011). The maximum number of iterations (\code{it.max}) should be also increased when the `automatic' procedure is used. If \code{last=FALSE}, the object resulting from \code{segmented.lm} (or \code{segmented.glm}) is a list of fitted models; the i-th model is the segmented model with the values of the breakpoints at the i-th iteration.
Sometimes, to stabilize the procedure, it can be useful to set \code{h<1} to reduce the increments in the breakpoint updates. At each iteration the updated estimate is usually given by \code{psi.new=psi.old+increm}. Setting \code{h<1} (actually \code{min(abs(h),1)} is considered) yields the following update of the breakpoint estimate: \code{psi.new=psi.old+h*increm}. Since version 0.2-9.0 \code{segmented} implements the bootstrap restarting algorithm described in Wood (2001). The bootstrap restarting is expected to escape the local optima of the objective function when the segmented relationship is flat. Notice that bootstrap restart runs \code{n.boot} iterations regardless of \code{toll}, which only affects convergence within the inner loop. } \value{ A list with the arguments as components. } \references{ Muggeo, V.M.R., Adelfio, G. (2011) Efficient change point detection in genomic sequences of continuous measurements. \emph{Bioinformatics} \bold{27}, 161--166. Wood, S. N. (2001) Minimizing model fitting objectives that contain spurious local minima by bootstrap restarting. \emph{Biometrics} \bold{57}, 240--244. } \author{ Vito Muggeo } %\note{ ~~further notes~~ % ~Make other sections like Warning with \section{Warning }{....} ~ %} %\seealso{ ~~objects to See Also as \code{\link{help}}, ~~~ } \examples{ #decrease the maximum number of inner iterations and display the #evolution of the (outer) iterations seg.control(display = TRUE, maxit.glm=4) } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. \keyword{ regression } segmented/man/confint.segmented.Rd0000644000175100001440000000535112404051010016706 0ustar hornikusers\name{confint.segmented} \alias{confint.segmented} %- Also NEED an '\alias' for EACH other topic documented here. \title{ Confidence intervals for breakpoints} \description{ Computes confidence intervals for the breakpoints in a fitted `segmented' model.
} \usage{ \method{confint}{segmented}(object, parm, level=0.95, rev.sgn=FALSE, var.diff=FALSE, digits=max(3, getOption("digits") - 3), ...) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{object}{a fitted \code{segmented} object. } \item{parm}{the segmented variable of interest. If missing all the segmented variables are considered. } \item{level}{the confidence level required (default to 0.95).} \item{rev.sgn}{vector of logicals. The length should be equal to the length of \code{parm}; recycled otherwise. When \code{TRUE} it is assumed that the current \code{parm} is `minus' the actual segmented variable, therefore the sign is reversed before printing. This is useful when a null-constraint has been set on the last slope.} \item{var.diff}{logical. If \code{var.diff=TRUE} and there is a single segmented variable, the standard error is based on a sandwich-type formula of the covariance matrix. See Details in \code{\link{summary.segmented}}.} \item{digits}{controls the number of digits to print when printing the output. } \item{\dots}{additional parameters } } \details{ Currently \code{confint.segmented} computes confidence limits for the breakpoints using the standard error coming from the Delta method for the ratio of two random variables. This value is an approximation (slightly) better than the one reported in the `psi' component of the list returned by any \code{segmented} method. The resulting confidence intervals are based on the asymptotic Normal distribution of the breakpoint estimator which is reliable just for clear-cut kink relationships. See Details in \code{\link{segmented}}. } \value{ A list of matrices. Each matrix includes point estimate and confidence limits of the breakpoint(s) for each segmented variable in the model. } %\references{ } \author{ Vito M.R.
Muggeo } %\note{ ~~further notes~~ % % ~Make other sections like Warning with \section{Warning }{....} ~ %} \seealso{ \code{\link{segmented}} and \code{\link{lines.segmented}} to plot the estimated breakpoints with corresponding confidence intervals. } \examples{ set.seed(10) x<-1:100 z<-runif(100) y<-2+1.5*pmax(x-35,0)-1.5*pmax(x-70,0)+10*pmax(z-.5,0)+rnorm(100,0,2) out.lm<-lm(y~x) o<-segmented(out.lm,seg.Z=~x+z,psi=list(x=c(30,60),z=.4)) confint(o) } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. \keyword{ regression } \keyword{ nonlinear } segmented/man/segmented.Rd0000644000175100001440000002225612404051010015252 0ustar hornikusers\name{segmented} \alias{segmented} \alias{segmented.lm} \alias{segmented.glm} \alias{segmented.default} %\alias{print.segmented} %\alias{summary.segmented} %\alias{print.summary.segmented} %- Also NEED an '\alias' for EACH other topic documented here. \title{ Segmented relationships in regression models } \description{ Fits regression models with segmented relationships between the response and one or more explanatory variables. Break-point estimates are provided. } \usage{ segmented(obj, seg.Z, psi, control = seg.control(), model = TRUE, ...) \method{segmented}{default}(obj, seg.Z, psi, control = seg.control(), model = TRUE, ...) \method{segmented}{lm}(obj, seg.Z, psi, control = seg.control(), model = TRUE, ...) \method{segmented}{glm}(obj, seg.Z, psi, control = seg.control(), model = TRUE, ...) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{obj}{ standard `linear' model of class "lm" or "glm". Since version 0.5-0.0 any regression fit may be supplied.} \item{seg.Z}{ a formula with no response variable, such as \code{seg.Z=~x1+x2}, indicating the (continuous) explanatory variables having segmented relationships with the response.
Currently, formulas involving functions, such as \code{seg.Z=~log(x1)} or \code{seg.Z=~sqrt(x1)}, or selection operators, such as \code{seg.Z=~d[,"x1"]} or \code{seg.Z=~d$x1}, are \emph{not} allowed. } \item{psi}{ named list of vectors. The names have to match the variables of the \code{seg.Z} argument. Each vector includes starting values for the break-point(s) for the corresponding variable in \code{seg.Z}. If \code{seg.Z} includes a single variable, \code{psi} may be a numeric vector. A \code{NA} value means that `\code{K}' quantiles (or equally spaced values) are used as starting values; \code{K} is fixed via the \code{\link{seg.control}} auxiliary function. } \item{control}{ a list of parameters for controlling the fitting process. See the documentation for \code{\link{seg.control}} for details. } \item{model}{ logical value indicating if the model.frame should be returned.} \item{\dots}{ optional arguments. } } \details{ Given a linear regression model (usually of class "lm" or "glm"), segmented tries to estimate a new model having broken-line relationships with the variables specified in \code{seg.Z}. A segmented (or broken-line) relationship is defined by the slope parameters and the break-points where the linear relation changes. The number of breakpoints of each segmented relationship is fixed via the \code{psi} argument, where initial values for the break-points must be specified. The model is estimated simultaneously yielding point estimates and relevant approximate standard errors of all the model parameters, including the break-points. Since version 0.2-9.0 \code{segmented} implements the bootstrap restarting algorithm described in Wood (2001). The bootstrap restarting is expected to escape the local optima of the objective function when the segmented relationship is flat and the log likelihood can have multiple local optima.
Since version 0.5-0.0 the default method \code{segmented.default} has been added to estimate segmented relationships in general (besides "lm" and "glm" fits) regression models, such as Cox regression or quantile regression (for a single percentile). The objective function to be minimized is the (minus) value extracted by the \code{logLik} function or it may be passed on via the \code{fn.obj} argument in \code{seg.control}. See example below. While the default method is expected to work with any regression fit (where the usual \code{coef()}, \code{update()}, and \code{logLik()} returns appropriate results), it is not recommended for "lm" or "glm" fits (as \code{segmented.default} is slower than the specific methods \code{segmented.lm} and \code{segmented.glm}), although final results are the same. However the object returned by \code{segmented.default} is \emph{not} of class "segmented" as currently the segmented methods are not guaranteed to work for `generic' (i.e., besides "lm" and "glm") regression fits. The user could try each "segmented" method on the returned object by calling it explicitly (e.g. via \code{confint.segmented()}). } \value{ The returned object depends on the \code{last} component returned by \code{seg.control}. If last=TRUE, the default, segmented returns an object of class "segmented" which inherits from the class "lm" or "glm" depending on the class of \code{obj}. Otherwise a list is returned, where the last component is the fitted model at the final iteration, see \code{\link{seg.control}}. 
\cr An object of class "segmented" is a list containing the components of the original object \code{obj} with additionally the followings: \item{psi}{estimated break-points and relevant (approximate) standard errors} \item{it}{number of iterations employed} \item{epsilon}{difference in the objective function when the algorithm stops} \item{model}{the model frame} \item{psi.history}{a list or a vector including the breakpoint estimates at each step} \item{seed}{the integer vector containing the seed just before the bootstrap resampling. Returned only if bootstrap restart is employed} \item{..}{Other components are not of direct interest of the user} } \references{ Muggeo, V.M.R. (2003) Estimating regression models with unknown break-points. \emph{Statistics in Medicine} \bold{22}, 3055--3071. Muggeo, V.M.R. (2008) Segmented: an R package to fit regression models with broken-line relationships. \emph{R News} \bold{8/1}, 20--25. } \author{ Vito M. R. Muggeo, \email{vito.muggeo@unipa.it} } \note{ \enumerate{ \item The algorithm will start if the \code{it.max} argument returned by \code{seg.control} is greater than zero. If \code{it.max=0} \code{segmented} will estimate a new linear model with break-point(s) fixed at the values reported in \code{psi}. \item In the returned fit object, `U.' is put before the name of the segmented variable to mean the difference-in-slopes coefficient. \item Methods specific to the class \code{"segmented"} are \itemize{ \item \code{print.segmented} \item \code{summary.segmented} \item \code{print.summary.segmented} \item \code{plot.segmented} \item \code{lines.segmented} \item \code{confint.segmented} \item \code{vcov.segmented} \item \code{predict.segmented} \item \code{points.segmented} } Others are inherited from the class \code{"lm"} or \code{"glm"} depending on the class of \code{obj}. 
} } \section{ Warning }{It is well-known that the log-likelihood function for the break-point may be not concave, especially for poor clear-cut kink-relationships. In these circumstances the initial guess for the break-point, i.e. the \code{psi} argument, must be provided with care. For instance visual inspection of a, possibly smoothed, scatter-plot is usually a good way to obtain some idea on breakpoint location. However bootstrap restarting, implemented since version 0.2-9.0, is relatively more robust to starting values specified in \code{psi}. Alternatively an automatic procedure may be implemented by specifying \code{psi=NA} and \code{stop.if.error=FALSE} in \code{\link{seg.control}}. This automatic procedure, however, is expected to overestimate the number of breakpoints. } \seealso{ \code{\link{lm}}, \code{\link{glm}} } \examples{ set.seed(12) xx<-1:100 zz<-runif(100) yy<-2+1.5*pmax(xx-35,0)-1.5*pmax(xx-70,0)+15*pmax(zz-.5,0)+rnorm(100,0,2) dati<-data.frame(x=xx,y=yy,z=zz) out.lm<-lm(y~x,data=dati) o<-segmented(out.lm,seg.Z=~x,psi=list(x=c(30,60)), control=seg.control(display=FALSE)) slope(o) out.lm<-lm(y~z,data=dati) o1<-update(o,seg.Z=~x+z,psi=list(x=c(30,60),z=.3)) #the default method leads to the same results (but it is slower) #o1<-segmented.default(o,seg.Z=~x+z,psi=list(x=c(30,60),z=.3)) #o1<-segmented.default(o,seg.Z=~x+z,psi=list(x=c(30,60),z=.3), # control=seg.control(fn.obj="sum(x$residuals^2)")) #automatic procedure to estimate breakpoints in the covariate x # Notice: bootstrap restart is not allowed! 
o<-segmented.lm(out.lm,seg.Z=~x+z,psi=list(x=NA,z=.3), control=seg.control(stop.if.error=FALSE,n.boot=0)) #assess the progress of the breakpoint estimates throughout the iterations \dontrun{ par(mfrow=c(2,1)) draw.history(o, "x") draw.history(o, "z") } #try to increase the number of iterations and re-assess the #convergence diagnostics #An example using the default method: # Cox regression with a segmented relationship \dontrun{ library(survival) data(stanford2) o<-coxph(Surv(time, status)~age, data=stanford2) os<-segmented(o, ~age, psi=40) #estimate the breakpoint in the age effect summary(os) #actually it means summary.coxph(os) plot(os) #it does not work plot.segmented(os) #call explicitly plot.segmented() to plot the fitted piecewise lines } } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. \keyword{regression} \keyword{nonlinear } segmented/man/intercept.Rd0000644000175100001440000000502712404051010015271 0ustar hornikusers\name{intercept} \alias{intercept} %- Also NEED an '\alias' for EACH other topic documented here. \title{ Intercept estimates from segmented relationships } \description{ Computes the intercepts of each `segmented' relationship in the fitted model. } \usage{ intercept(ogg, parm, gap = TRUE, rev.sgn = FALSE, var.diff=FALSE, digits = max(3, getOption("digits") - 3)) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{ogg}{ an object of class "segmented", returned by any \code{segmented} method. } \item{parm}{ the segmented variable whose intercepts have to be computed. If missing all the segmented variables in the model are considered. } \item{gap}{ logical. should the intercepts account for the (possible) gaps? } \item{rev.sgn}{vector of logicals. The length should be equal to the length of \code{parm}, but it is recycled otherwise. 
When \code{TRUE} it is assumed that the current \code{parm} is `minus' the actual segmented variable, therefore the sign is reversed before printing. This is useful when a null-constraint has been set on the last slope. } \item{var.diff}{Currently ignored as only point estimates are computed. %logical. If \code{var.diff=TRUE} and there is a single segmented variable, the computed standard errors % are based on a sandwich-type formula of the covariance matrix. See Details in \code{\link{summary.segmented}}. } \item{digits}{controls number of digits in output.} } \details{ A broken-line relationship means that a regression equation exists in the intervals `\eqn{min(x)}{min(x)} to \eqn{\psi_1}{psi1}', `\eqn{\psi_1}{psi1} to \eqn{\psi_2}{psi2}', and so on. \code{intercept} computes point estimates of the intercepts of the different regression equations for each segmented relationship in the fitted model. } \value{ \code{intercept} returns a list of one-column matrices. Each matrix represents a segmented relationship. } %\references{ %% ~put references to the literature/web site here ~ %} \author{Vito M. R. Muggeo, \email{vito.muggeo@unipa.it}} %\note{ %% ~~further notes~~ %} %% ~Make other sections like Warning with \section{Warning }{....} ~ \seealso{ See also \code{\link{slope}} to compute the slopes of the different regression equations for each segmented relationship in the fitted model. } \examples{ ## see ?slope \dontrun{ intercept(out.seg) } } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. \keyword{ regression } segmented/man/segmented-package.Rd0000644000175100001440000000550212404051010016636 0ustar hornikusers\name{segmented-package} \alias{segmented-package} %\alias{segmented} \docType{package} \title{ Segmented relationships in regression models with breakpoints/changepoints estimation } \description{ Estimation of Regression Models with piecewise linear relationships having a fixed number of break-points. 
} \details{ \tabular{ll}{ Package: \tab segmented\cr Type: \tab Package\cr Version: \tab 0.5-0.0\cr Date: \tab 2014-09-10\cr License: \tab GPL\cr } Package \code{segmented} aims to estimate linear and generalized linear models (and virtually any regression model) having one or more segmented relationships in the linear predictor. Estimates of the slopes and of the possibly multiple breakpoints are provided. The package includes testing/estimating functions and methods to print, summarize and plot the results. \cr The algorithm used by \code{segmented} is \emph{not} grid-search. It is an iterative procedure (Muggeo, 2003) that needs starting values \emph{only} for the breakpoint parameters and therefore it is quite efficient even with several breakpoints to be estimated. Moreover since version 0.2-9.0, \code{segmented} implements the bootstrap restarting (Wood, 2001) to make the algorithm less sensitive to starting values. \cr Since version 0.5-0.0 a default method \code{segmented.default} has been added. It may be employed to include segmented relationships in \emph{general} regression models where specific methods do not exist. Examples include quantile and Cox regressions. See examples in \code{\link{segmented.default}}.\cr A tentative approach to deal with unknown number of breakpoints is also provided, see option \code{stop.if.error} in \code{\link{seg.control}}. } \author{ Vito M.R. Muggeo } \references{ Davies, R.B. (1987) Hypothesis testing when a nuisance parameter is present only under the alternative. \emph{Biometrika} \bold{74}, 33--43. Seber, G.A.F. and Wild, C.J. (1989) \emph{Nonlinear Regression}. Wiley, New York. Bacon D.W., Watts D.G. (1971) Estimating the transition between two intersecting straight lines. \emph{Biometrika} \bold{58}: 525 -- 534. Muggeo, V.M.R. (2003) Estimating regression models with unknown break-points. \emph{Statistics in Medicine} \bold{22}, 3055--3071. Muggeo, V.M.R.
(2008) Segmented: an R package to fit regression models with broken-line relationships. \emph{R News} \bold{8/1}, 20--25. Muggeo, V.M.R., Adelfio, G. (2011) Efficient change point detection in genomic sequences of continuous measurements. \emph{Bioinformatics} \bold{27}, 161--166. Wood, S. N. (2001) Minimizing model fitting objectives that contain spurious local minima by bootstrap restarting. \emph{Biometrics} \bold{57}, 240--244. } \keyword{ regression } \keyword{ nonlinear } segmented/man/print.segmented.Rd0000644000175100001440000000127012404051010016376 0ustar hornikusers\name{print.segmented} \alias{print.segmented} %- Also NEED an '\alias' for EACH other topic documented here. \title{ Print method for the segmented class } \description{ Prints the most important features of a segmented model. } \usage{ \method{print}{segmented}(x, digits = max(3, getOption("digits") - 3), ...) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{x}{ object of class \code{segmented} } \item{digits}{ number of digits to be printed } \item{\dots}{ arguments passed to other functions } } \author{ Vito M.R. Muggeo } \seealso{ \code{\link{summary.segmented}}, \code{\link{print.summary.segmented}} } \keyword{ models } segmented/man/plant.Rd0000644000175100001440000000226612404051010014414 0ustar hornikusers\name{plant} \alias{plant} \docType{data} \title{ Plant organ dataset} \description{ The \code{plant} data frame has 103 rows and 3 columns. } \usage{data(plant)} \format{ A data frame with 103 observations on the following 3 variables: \describe{ \item{\code{y}}{measurements of the plant organ.} \item{\code{time}}{times where measurements took place.} \item{\code{group}}{three attributes of the plant organ, \code{RKV}, \code{RKW}, \code{RWC}.} } } \details{ Three attributes of a plant organ measured over time where biological reasoning indicates likelihood of multiple breakpoints.
The data are scaled to the maximum value for each attribute and all attributes are measured at each time. } \source{ The data have been kindly provided by Dr Zongjian Yang at School of Land, Crop and Food Sciences, The University of Queensland, Brisbane, Australia. } %\references{ % ~~ possibly secondary sources and usages ~~ %} \examples{ \dontrun{ data(plant) attach(plant) %lattice::xyplot(y~time,groups=group,pch=19,col=2:4,auto.key=list(space="right")) lattice::xyplot(y~time,groups=group,auto.key=list(space="right")) } } \keyword{datasets} segmented/man/seg.lm.fit.Rd0000644000175100001440000000762612404051010015251 0ustar hornikusers\name{seg.lm.fit} \alias{seg.lm.fit} \alias{seg.glm.fit} \alias{seg.def.fit} \alias{seg.lm.fit.boot} \alias{seg.glm.fit.boot} \alias{seg.def.fit.boot} %- Also NEED an '\alias' for EACH other topic documented here. \title{ Fitter Functions for Segmented Linear Models } \description{ \code{seg.lm.fit} is called by \code{segmented.lm} to fit segmented linear (gaussian) models. Likewise, \code{seg.glm.fit} is called by \code{segmented.glm} to fit generalized segmented linear models, and \code{seg.def.fit} is called by \code{segmented.default} to fit segmented relationships in general regression models (e.g., quantile regression and Cox regression). \code{seg.lm.fit.boot}, \code{seg.glm.fit.boot}, and \code{seg.def.fit.boot} are employed to perform bootstrap restart. These functions should usually not be used directly by the user. 
} \usage{ seg.lm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol=FALSE) seg.lm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE) seg.glm.fit(y, XREG, Z, PSI, w, offs, opz, return.all.sol=FALSE) seg.glm.fit.boot(y, XREG, Z, PSI, w, offs, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE) seg.def.fit(obj, Z, PSI, mfExt, opz, return.all.sol=FALSE) seg.def.fit.boot(obj, Z, PSI, mfExt, opz, n.boot=10, size.boot=NULL, jt=FALSE, nonParam=TRUE, random=FALSE) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{y}{ vector of observations of length \code{n}. } \item{XREG}{ design matrix for standard linear terms. } \item{Z}{ appropriate matrix including the segmented variables whose breakpoints have to be estimated. } \item{PSI}{ appropriate matrix including the starting values of the breakpoints to be estimated. } \item{w}{ possible weights vector. } \item{offs}{ possible offset vector. } \item{opz}{ a list including information useful for model fitting. } \item{n.boot}{ the number of bootstrap samples employed in the bootstrap restart algorithm. } \item{size.boot}{ the size of the bootstrap resamples. If \code{NULL} (default), it is taken equal to the sample size. Values smaller than the sample size are expected to increase perturbation in the bootstrap resamples. } \item{jt}{ logical. If \code{TRUE} the values of the segmented variable(s) are jittered before fitting the model to the bootstrap resamples. } \item{nonParam}{ if \code{TRUE} nonparametric bootstrap (i.e. case-resampling) is used, otherwise residual-based. } \item{random}{ if \code{TRUE}, when the algorithm fails to obtain a solution, random values are used as candidate values. } \item{return.all.sol}{ if \code{TRUE}, when the algorithm fails to obtain a solution, the values visited by the algorithm with corresponding deviances are returned.
} \item{obj}{ the starting regression model where the segmented relationships have to be added. } \item{mfExt}{ the model frame. } } \details{ The functions call iteratively \code{lm.wfit} (or \code{glm.fit}) with proper design matrix depending on \code{XREG}, \code{Z} and \code{PSI}. \code{seg.lm.fit.boot} (and \code{seg.glm.fit.boot}) implements the bootstrap restarting idea discussed in Wood (2001). } \value{ A list of fit information. } \references{ Wood, S. N. (2001) Minimizing model fitting objectives that contain spurious local minima by bootstrap restarting. \emph{Biometrics} \bold{57}, 240--244. } \author{ Vito Muggeo } \note{ These functions should usually not be used directly by the user. } %% ~Make other sections like Warning with \section{Warning }{....} ~ \seealso{ \code{\link{segmented.lm}}, \code{\link{segmented.glm}} } \examples{ ##See ?segmented } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. \keyword{regression} \keyword{nonlinear } segmented/man/slope.Rd0000644000175100001440000000704112404051010014414 0ustar hornikusers\name{slope} \alias{slope} %- Also NEED an '\alias' for EACH other topic documented here. \title{ Slope estimates from segmented relationships } \description{ Computes the slopes of each `segmented' relationship in the fitted model. } \usage{ slope(ogg, parm, conf.level = 0.95, rev.sgn=FALSE, var.diff=FALSE, APC=FALSE, digits = max(3, getOption("digits") - 3)) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{ogg}{ an object of class "segmented", returned by any \code{segmented} method. } \item{parm}{ the segmented variable whose slopes have to be computed. If missing all the segmented variables are considered. } \item{conf.level}{ the confidence level required. } \item{rev.sgn}{vector of logicals. The length should be equal to the length of \code{parm}, but it is recycled otherwise. 
When \code{TRUE} it is assumed that the current \code{parm} is `minus' the actual segmented variable, therefore the sign is reversed before printing. This is useful when a null-constraint has been set on the last slope.} \item{var.diff}{logical. If \code{var.diff=TRUE} and there is a single segmented variable, the computed standard errors are based on a sandwich-type formula of the covariance matrix. See Details in \code{\link{summary.segmented}}.} \item{APC}{logical. If \code{APC=TRUE} the `annual percent changes', i.e. \eqn{100\times(\exp(\beta)-1)}{100*(exp(b)-1)}, are computed for each interval (\eqn{\beta}{b} is the slope). Only point estimates and confidence intervals are returned. } \item{digits}{controls number of digits printed in output.} } \details{ To fit broken-line relationships, \code{segmented} uses a parameterization whose coefficients are not the slopes. Therefore given an object \code{"segmented"}, \code{slope} computes point estimates, standard errors, t-values and confidence intervals of the slopes of each segmented relationship in the fitted model. } \value{ \code{slope} returns a list of matrices. Each matrix represents a segmented relationship and its number of rows equals the number of segments, while five columns summarize the results. } \references{Muggeo, V.M.R. (2003) Estimating regression models with unknown break-points. \emph{Statistics in Medicine} \bold{22}, 3055--3071. } \author{Vito M. R. Muggeo, \email{vito.muggeo@unipa.it} } \note{The returned summary is based on the limiting Gaussian distribution for the model parameters involved in the computations. Sometimes, even with large sample sizes such approximations are questionable (e.g., with small difference-in-slope parameters) and the results returned by \code{slope} might be unreliable. Therefore it is the responsibility of the user to gauge the applicability of such asymptotic approximations.
In any case, the t values should not be used for testing purposes; they serve only as guidelines to assess the estimate uncertainty. } \seealso{See also \code{\link{davies.test}} to test for a nonzero difference-in-slope parameter. } \examples{ set.seed(16) x<-1:100 y<-2+1.5*pmax(x-35,0)-1.5*pmax(x-70,0)+rnorm(100,0,3) out<-glm(y~1) out.seg<-segmented(out,seg.Z=~x,psi=list(x=c(20,80))) ## the slopes of the three segments.... slope(out.seg) rm(x,y,out,out.seg) # ## a heteroscedastic example.. set.seed(123) n<-100 x<-1:n/n y<- -x+1.5*pmax(x-.5,0)+rnorm(n,0,1)*ifelse(x<=.5,.4,.1) o<-lm(y~x) oseg<-segmented(o,seg.Z=~x,psi=.6) slope(oseg) slope(oseg,var.diff=TRUE) #better CI } \keyword{ regression } \keyword{ htest } segmented/man/lines.segmented.Rd0000644000175100001440000000564112404051010016362 0ustar hornikusers\name{lines.segmented} \alias{lines.segmented} %- Also NEED an '\alias' for EACH other topic documented here. \title{ Bars for interval estimate of the breakpoints } \description{ Draws bars relevant to breakpoint estimates (point estimate and confidence limits) on the current device } \usage{ \method{lines}{segmented}(x, term, bottom = TRUE, shift=TRUE, conf.level = 0.95, k = 50, pch = 18, rev.sgn = FALSE, ...) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{x}{ an object of class \code{segmented}. } \item{term}{ the segmented variable of the breakpoints being drawn. It may be unspecified when there is a single segmented variable.} \item{bottom}{ logical, indicating if the bars should be plotted at the bottom (\code{TRUE}) or at the top (\code{FALSE}).} \item{shift}{ logical, indicating if the bars should be `shifted' on the y-axis before plotting. Useful for multiple breakpoints with overlapped confidence intervals.} \item{conf.level}{ the confidence level of the confidence intervals for the breakpoints. } \item{k}{ a positive integer regulating the vertical position of the drawn bars. See Details.
} \item{pch}{ either an integer specifying a symbol or a single character to be used in plotting the point estimates of the breakpoints. See \code{\link{points}}. } \item{rev.sgn}{ should the signs of the breakpoint estimates be changed before plotting? see Details. } \item{\dots}{ further arguments passed to \code{\link{segments}}, for instance `col' that can be a vector. } } \details{ \code{lines.segmented} simply draws on the current device the point estimates and relevant confidence limits of the estimated breakpoints from a "segmented" object. The y coordinate where the bars are drawn is computed as \code{usr[3]+h} if \code{bottom=TRUE} or \code{usr[4]-h} when \code{bottom=FALSE}, where \code{h=(usr[4]-usr[3])/abs(k)} and \code{usr} are the extremes of the user coordinates of the plotting region. Therefore for larger values of \code{k} the bars are plotted on the edges. The argument \code{rev.sgn} allows to change the sign of the breakpoints before plotting. This may be useful when a null-right-slope constraint is set. } %\value{ % ~Describe the value returned % If it is a LIST, use % \item{comp1 }{Description of 'comp1'} % \item{comp2 }{Description of 'comp2'} % ... %} %\references{ ~put references to the literature/web site here ~ } %\author{ ~~who you are~~ } %\note{ ~~further notes~~ % ~Make other sections like Warning with \section{Warning }{....} ~ %} \seealso{ \code{\link{plot.segmented}} to plot the fitted segmented lines, and \code{\link{points.segmented}} to add the fitted joinpoints. } \examples{ ## See ?plot.segmented } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. 
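\section{Sketch of the bar position}{
The vertical position described in Details can be reproduced with a few lines of base R. This is only an illustrative sketch: \code{bar.y} is a hypothetical helper that is not part of the package, and in practice \code{usr} would be obtained from \code{par("usr")} on an open device.
\preformatted{
# hypothetical helper (not exported by segmented): y coordinate of the bars
bar.y <- function(usr, k = 50, bottom = TRUE) {
  h <- (usr[4] - usr[3])/abs(k)          # offset from the edge of the plot region
  if (bottom) usr[3] + h else usr[4] - h
}
usr <- c(0, 1, 0, 10)                    # x-range 0..1, y-range 0..10
bar.y(usr, k = 50)                       # 0.2, just above the bottom edge
bar.y(usr, k = 50, bottom = FALSE)       # 9.8, just below the top edge
bar.y(usr, k = 5)                        # 2, a smaller k moves the bars inwards
}
Since \code{h} shrinks as \code{k} grows, larger values of \code{k} place the bars closer to the edge of the plotting region.
}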
\keyword{ regression } \keyword{ nonlinear } segmented/man/davies.test.Rd0000644000175100001440000001374612404051010015534 0ustar hornikusers\name{davies.test} \alias{davies.test} \title{ Testing for a change in the slope } \description{ Given a generalized linear model, the Davies test can be employed to test for a non-constant regression parameter in the linear predictor. } \usage{ davies.test(obj, seg.Z, k = 10, alternative = c("two.sided", "less", "greater"), type=c("lrt","wald"), values=NULL, dispersion=NULL) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{obj}{ a fitted model typically returned by \code{glm} or \code{lm}. Even an object returned by \code{segmented} can be supplied (e.g. if interest lies in testing for an additional breakpoint).} \item{seg.Z}{ a formula with no response variable, such as \code{seg.Z=~x1}, indicating the (continuous) segmented variable being tested. Only a single variable may be tested and a warning is printed when \code{seg.Z} includes two or more terms. } \item{k}{ number of points where the test should be evaluated. See Details. } \item{alternative}{ a character string specifying the alternative hypothesis. } \item{type}{ the test statistic to be used (only for GLM; defaults to \code{"lrt"}). Ignored if \code{obj} is a simple linear model.} \item{values}{ optional. The evaluation points where the Davies approximation is computed. See Details for default values.} \item{dispersion}{ the dispersion parameter for the family to be used to compute the test statistic. When \code{NULL} (the default), it is inferred from \code{obj}. Namely it is taken as \code{1} for the Binomial and Poisson families, and otherwise estimated by the residual Chi-squared statistic (calculated from cases with non-zero weights) divided by the residual degrees of freedom. } } \details{ \code{davies.test} tests for a non-zero difference-in-slope parameter of a segmented relationship.
Namely, the null hypothesis is \eqn{H_0:\beta=0}{H_0:beta=0}, where \eqn{\beta}{beta} is the difference-in-slopes, i.e. the coefficient of the segmented function \eqn{\beta(x-\psi)_+}{beta*(x-psi)_+}. The hypothesis of interest \eqn{\beta=0}{beta=0} means no breakpoint. Roughly speaking, the procedure computes \code{k} `naive' (i.e. assuming the breakpoint is fixed and known) test statistics for the difference-in-slope, seeks the `best' value and corresponding naive p-value (according to the alternative hypothesis), and then corrects the selected (minimum) p-value by means of the \code{k} values of the test statistic. If \code{obj} is a LM, the Davies (2002) test is implemented. This approach works even for small samples. If \code{obj} represents a GLM fit, relevant methods are described in Davies (1987), and the Wald or the Likelihood ratio test statistics can be used, see argument \code{type}. This is an asymptotic test. # The \code{k} evaluation points are \code{k} equally spaced values between the minimum and the maximum (excluded) values of the variable reported in \code{seg.Z}. The \code{k} evaluation points are \code{k} equally spaced values between the second and the second-last values of the variable reported in \code{seg.Z}. } \value{ A list with class '\code{htest}' containing the following components: \item{method}{title (character)} \item{data.name}{the regression model and the segmented variable being tested} \item{statistic }{the point within the range of the covariate in \code{seg.Z} at which the maximum (or the minimum if \code{alternative="less"}) occurs} \item{parameter }{number of evaluation points} \item{p.value }{the adjusted p-value} \item{process}{a two-column matrix including the evaluation points and corresponding values of the test statistic} } \references{ Davies, R.B. (1987) Hypothesis testing when a nuisance parameter is present only under the alternative. \emph{Biometrika} \bold{74}, 33--43. Davies, R.B.
(2002) Hypothesis testing when a nuisance parameter is present only under the alternative: linear model case. \emph{Biometrika} \bold{89}, 484--489. } \author{ Vito M.R. Muggeo } \note{ Strictly speaking, the Davies test is not confined to the segmented regression; the procedure can be applied when a nuisance parameter vanishes under the null hypothesis. The test is slightly conservative, as the computed p-value is actually an upper bound. Results should change slightly with respect to previous versions where the evaluation points were computed as \code{k} equally spaced values between the second and the second last observed values of the segmented variable. } \section{Warning }{ The Davies test is \emph{not} aimed at obtaining the estimate of the breakpoint. The Davies test is based on \code{k} evaluation points, thus the value returned in the \code{statistic} component (and printed as "'best' at") is the best among the \code{k} points, and typically it will differ from the maximum likelihood estimate returned by \code{segmented}. Use \code{\link{segmented}} if you are interested in the point estimate. To test for a breakpoint in linear models with small samples, it is suggested to use \code{davies.test()} with objects of class "lm". If \code{obj} is a \code{"glm"} object with gaussian family, \code{davies.test()} will use an approximate test resulting in smaller p-values when the sample is small. However if the sample size is large (n>300), the exact Davies (2002) upper bound cannot be computed (as it relies on \code{gamma()} function) and the \emph{approximate} upper bound of Davies (1987) is returned. 
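
Why the `best' value is restricted to the evaluation points can be seen from a minimal base-R sketch of the `naive' statistics. It is not the code used by \code{davies.test}: the grid (here \code{k} equally spaced points strictly inside the observed range), the object names and the use of the absolute t value are assumptions made only for illustration, and no Davies correction of the p-value is applied.
\preformatted{
set.seed(20)
z <- runif(100)
y <- 2 + 10*pmax(z - .5, 0) + rnorm(100, 0, 3)
k <- 10
# assumed grid: k equally spaced candidate breakpoints inside the range of z
grid <- seq(min(z), max(z), length.out = k + 2)[-c(1, k + 2)]
# naive t statistic of the difference-in-slope term, breakpoint held fixed
tval <- sapply(grid, function(psi) {
  u <- pmax(z - psi, 0)
  summary(lm(y ~ z + u))$coefficients["u", "t value"]
})
best <- grid[which.max(abs(tval))]   # by construction one of the k grid points
cbind(grid, tval)                    # analogous in spirit to the 'process' component
}
The value in \code{best} can only be one of the \code{k} candidates, which is why it generally differs from the breakpoint estimate returned by \code{segmented}.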
} %%\section{Warning }{Currently \code{davies.test} does not work if the fitted model \code{ogg} %% does not include the segmented variable \code{term} being tested.} \examples{ \dontrun{ set.seed(20) z<-runif(100) x<-rnorm(100,2) y<-2+10*pmax(z-.5,0)+rnorm(100,0,3) o<-lm(y~z+x) davies.test(o,~z) davies.test(o,~x) o<-glm(y~z+x) davies.test(o,~z) #it works but the p-value is too small.. } } \keyword{ htest } segmented/man/summary.segmented.Rd0000644000175100001440000000723412404051010016745 0ustar hornikusers\name{summary.segmented} \alias{summary.segmented} \alias{print.summary.segmented} \title{ Summarizing model fits for segmented regression } \description{ summary method for class \code{segmented}. } \usage{ \method{summary}{segmented}(object, short = FALSE, var.diff = FALSE, ...) \method{print}{summary.segmented}(x, short=x$short, var.diff=x$var.diff, digits = max(3, getOption("digits") - 3), signif.stars = getOption("show.signif.stars"),...) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{object}{ Object of class "segmented". } \item{short}{ logical indicating if the `short' summary should be printed. } \item{var.diff}{ logical indicating if different error variances should be computed in each interval of the segmented variable, see Details. } \item{x}{a \code{summary.segmented} object produced by \code{summary.segmented()}.} \item{digits}{controls number of digits printed in output.} \item{signif.stars}{logical, should stars be printed on summary tables of coefficients?} \item{\dots}{ further arguments. } } \details{ If short=TRUE only coefficients of the segmented relationships are printed. If var.diff=TRUE and there is only one segmented variable, different error variances are computed in the intervals defined by the estimated breakpoints of the segmented variable. 
For the jth interval with nj observations the error variance is estimated via \eqn{RSS_j/(n_j-p)}{RSSj/(nj-p)}, where \eqn{RSS_j} is the residual sum of squares in the jth interval, and \eqn{p} is the number of model parameters. Note \code{var.diff=TRUE} does \emph{not} affect the parameter estimation which is performed via ordinary (and not weighted) least squares. However if \code{var.diff=TRUE} the variance-covariance matrix of the estimates is computed via the sandwich formula, \deqn{(X^TX)^{-1}X^TVX(X^TX)^{-1}}{(X'X)^{-1}X'VX(X'X)^{-1}} where V is the diagonal matrix including the different error variance estimates. Standard errors are the square root of the main diagonal of this matrix. } \value{ A list (similar to one returned by \code{segmented.lm} or \code{segmented.glm}) with additional components: \item{psi }{estimated break-points and relevant (approximate) standard errors} \item{Ttable }{estimates and standard errors of the model parameters. This is similar to the matrix \code{coefficients} returned by \code{summary.lm} or \code{summary.glm}, but without the rows corresponding to the breakpoints. Even the p-values relevant to the difference-in-slope parameters have been replaced by NA, since they are meaningless in this case, see \code{\link{davies.test}}.} \item{gap}{estimated coefficients, standard errors and t-values for the `gap' variables} \item{cov.var.diff}{if \code{var.diff=TRUE}, the covariance matrix accounting for heteroscedastic errors.} \item{sigma.new}{if \code{var.diff=TRUE}, the square root of the estimated error variances in each interval.} \item{df.new}{if \code{var.diff=TRUE}, the residual degrees of freedom in each interval.} } %\references{ ~put references to the literature/web site here ~ } \author{ Vito M.R. Muggeo } \seealso{ \code{\link{print.segmented}}, \code{\link{davies.test}} } \examples{ ##continues example from segmented() # summary(segmented.model,short=TRUE) ## a heteroscedastic example..
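## A base-R sketch of the sandwich formula given in Details. It is NOT the
## package's internal code: the breakpoint is assumed known (psi0), p is taken
## as the number of coefficients of the working linear model, and all object
## names below are hypothetical.
set.seed(123)
n <- 100
x <- 1:n/n
y <- -x + 1.5*pmax(x - .5, 0) + rnorm(n, 0, 1)*ifelse(x <= .5, .4, .1)
psi0 <- .5                                  # assumed (not estimated) breakpoint
fit <- lm(y ~ x + pmax(x - psi0, 0))        # ordinary least squares fit
X <- model.matrix(fit)
p <- ncol(X)
grp <- ifelse(x <= psi0, 1, 2)              # interval membership
rss <- tapply(residuals(fit)^2, grp, sum)   # RSS_j
nj <- tapply(x, grp, length)                # n_j
s2 <- as.vector(rss/(nj - p))               # error variance in each interval
v <- s2[grp]                                # diagonal of V
A <- solve(crossprod(X))                    # (X'X)^{-1}
M <- crossprod(X*sqrt(v))                   # X'VX
V.sand <- tcrossprod(crossprod(A, M), A)    # (X'X)^{-1} X'VX (X'X)^{-1}
se.sand <- sqrt(diag(V.sand))               # sandwich standard errors
sqrt(s2)                                    # one error standard deviation per interval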
# set.seed(123) # n<-100 # x<-1:n/n # y<- -x+1.5*pmax(x-.5,0)+rnorm(n,0,1)*ifelse(x<=.5,.4,.1) # o<-lm(y~x) # oseg<-segmented(o,seg.Z=~x,psi=.6) # summary(oseg,var.diff=TRUE)$sigma.new } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. \keyword{ regression } segmented/man/broken.line.Rd0000644000175100001440000000505212404051010015500 0ustar hornikusers\name{broken.line} \alias{broken.line} \title{ Fitted values for segmented relationships} \description{ Given a segmented model (typically returned by a \code{segmented} method), \code{broken.line} computes the fitted values (and relevant standard errors) for each `segmented' relationship. } \usage{ broken.line(ogg, term = NULL, link = TRUE, interc=TRUE, se.fit=TRUE) } \arguments{ \item{ogg}{ A fitted object of class segmented (returned by any \code{segmented} method). } \item{term}{ Three options. A list (whose name should be one of the segmented covariates) including values for which segmented predictions should be computed. A character meaning the name of any segmented covariate in the model. \code{NULL} if the model includes a single segmented covariate. } \item{link}{ Should the predictions be computed on the scale of the link function? Default to \code{TRUE}. } \item{interc}{ Should the model intercept be added? (provided it exists).} \item{se.fit}{ If \code{TRUE} also standard errors for predictions are returned.} } \details{ If \code{term=NULL} or \code{term} is a valid segmented covariate name, predictions for each segmented variable are the relevant fitted values from the model. If \code{term} is a (correctly named) list with numerical values, predictions corresponding to such specified values are computed. If \code{link=FALSE} and \code{ogg} inherits from the class "glm", predictions and standard errors are returned on the response scale. The standard errors come from the Delta method. 
Argument \code{link} is ignored if \code{ogg} does not inherit from the class "glm". } \value{ A 2-component (if \code{se.fit=TRUE}) list representing predictions and standard errors for the segmented covariate values. } %\references{ ~put references to the literature/web site here ~ } \author{ Vito M. R. Muggeo } %\note{ %This function will be probably removed in the next versions. See \code{predict.segmented} instead. %} % ~Make other sections like Warning with \section{Warning }{....} ~ %} \seealso{ \code{\link{segmented}}, \code{\link{predict.segmented}}, \code{\link{plot.segmented}}} \examples{ set.seed(1234) z<-runif(100) y<-rpois(100,exp(2+1.8*pmax(z-.6,0))) o<-glm(y~z,family=poisson) o.seg<-segmented(o,seg.Z=~z,psi=.5) \dontrun{plot(z,y)} \dontrun{points(z,broken.line(o.seg,link=FALSE)$fit,col=2,pch=20)} } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. \keyword{ regression } \keyword{ nonlinear }
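\section{Sketch of the Delta method}{
The Delta method mentioned in Details can be illustrated on an ordinary Poisson GLM with base R only. This sketch does not call \code{broken.line} and makes no claim about its internals; the object names are hypothetical. With the log link, the response-scale standard error is approximately the link-scale standard error multiplied by \code{exp(eta)}.
\preformatted{
set.seed(1234)
z <- runif(100)
y <- rpois(100, exp(2 + 1.8*pmax(z - .6, 0)))
o <- glm(y ~ z, family = poisson)
# predictions and standard errors on the link (log) scale
p.link <- predict(o, type = "link", se.fit = TRUE)
# Delta method: se(mu) ~ |d mu / d eta| * se(eta), and d mu / d eta = exp(eta)
se.resp <- exp(p.link$fit)*p.link$se.fit
# compare with the response-scale standard errors returned by predict.glm
p.resp <- predict(o, type = "response", se.fit = TRUE)
all.equal(as.vector(se.resp), as.vector(p.resp$se.fit))
}
}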