Package 'Trendtwosub' reference manual

Title:	Two Sample Order Free Trend Nonparametric Inference
Description:	The package contains functions for non-parametric trend comparison of two independent samples with sequential subsamples.
Authors:	Yishi Wang Developer [aut, cre], Matthew Villanueva Developer [aut], Ann Stapleton Developer [aut]
Maintainer:	Yishi Wang Developer <[email protected]>
License:	GPL (>= 2)
Version:	0.0.2
Built:	2025-03-05 02:46:04 UTC
Source:	https://github.com/wangyuncw/trendtwosub

chi.stat function

Description

This function calculates the value of the $M$ statistic defined in the reference paper.

Usage

chi.stat(ftab)

Arguments

ftab

A matrix with dimension 2 by $K$.

Details

The $M$ statistic is defined as:

$M=\sum_{l=1}^{K}\left(\frac{(O_{x,l}-E_{x,l})^2}{E_{x,l}}+\frac{(O_{x,l}-E_{x,l})^2}{\left(n_ln_{l+1}-E_{x,l}\right)}\right)+\sum_{l=1}^{K}\left(\frac{(O_{y,l}-E_{y,l})^2}{E_{y,l}}+\frac{(O_{y,l}-E_{y,l})^2}{\left(m_lm_{l+1}-E_{y,l}\right)}\right).$
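
As a rough illustration of how the terms of $M$ combine, the sketch below evaluates the formula given observed counts, expected counts, and the products of neighbouring subsample sizes. The helper name m_terms and the expected counts used here are assumptions made for illustration only; this manual does not show how chi.stat derives the expected counts from ftab.

```r
# Hypothetical helper (not part of Trendtwosub): contribution of one sample to M.
# obs   : observed counts O_l
# expd  : expected counts E_l (supplied by the caller; an assumption here)
# total : products of neighbouring subsample sizes, n_l * n_{l+1}
m_terms <- function(obs, expd, total) {
  stopifnot(length(obs) == length(expd), length(obs) == length(total))
  sum((obs - expd)^2 / expd + (obs - expd)^2 / (total - expd))
}

# Illustrative numbers only.
total <- c(25, 25)
obs_x <- c(20, 10)
obs_y <- c(15, 15)
expd  <- c(17.5, 12.5)   # assumed expected counts, used for both samples

# M is the sum of the x-sample and y-sample contributions.
m_value <- m_terms(obs_x, expd, total) + m_terms(obs_y, expd, total)
print(m_value)
```

Each squared deviation appears twice, once scaled by the expected count and once by its complement, which is what gives the statistic its chi-square type form.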

Value

chi.val, the value of the chi-square type statistic.

References

Wang, Y., Stapleton, A. E., & Chen, C. (2018). Two-sample nonparametric stochastic order inference with an application in plant physiology. Journal of Statistical Computation and Simulation, 88(14), 2668-2683.

Examples

chi.stat(ftab=rbind(c(20,10,20),c(15,15,20)))

freq.less function

Description

This function counts, over all pairs of observations, how often an x-sample observation is less than a y-sample observation and how often it is greater.

Usage

freq.less(x, y)

Arguments

x, y

x and y are numeric vectors holding two different subsamples. The two vectors may differ in length.

Details

When there is a tie between a pair of observations, 0.5 is added to the count. Missing values are allowed. A missing value enters the calculation only when it is compared with another missing value from the other subsample.

Value

Two values are returned: less.count and more.count. The first is the total count of pairs in which the x-sample observation is less than the y-sample observation, and the second is the total count of pairs in which the x-sample observation is greater. For a tied pair, 0.5 is added to the count instead of 1 or 0.
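
The pairwise counting can be sketched as follows. This is an illustrative re-implementation, not the package's source: the name pair_counts is hypothetical, and it simplifies the missing-value rule by dropping missing values before counting, so its results may differ from freq.less when NA is present in both vectors.

```r
# Hypothetical re-implementation for illustration; not the package's source.
# Assumption: missing values are dropped before counting.
pair_counts <- function(x, y) {
  x <- x[!is.na(x)]
  y <- y[!is.na(y)]
  ties <- sum(outer(x, y, "=="))          # each tied pair contributes 0.5
  c(less.count = sum(outer(x, y, "<")) + 0.5 * ties,
    more.count = sum(outer(x, y, ">")) + 0.5 * ties)
}

res <- pair_counts(c(1, 2, 4, 9, 0, 0, NA), c(1, 4, 9, NA))
print(res)
```

In this sketch the two counts add up to the number of non-missing pairs, here 6 * 3 = 18, because every pair is either less, greater, or tied.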

Examples

freq.less(x=c(1,2,4,9,0,0,NA),y=c(1,4,9,NA))

gen.decision function

Description

This function compares the frequency comparison counts of subsamples from two independent samples, and calculates a simulated p-value with the bootstrap method proposed in the reference paper.

Usage

gen.decision(est.prob, effn.subsam1, effn.subsam2, fn.rep = 10^3,
  alpha = 0.05)

Arguments

`est.prob`	a matrix with two rows, where each row holds the sequential comparison results of the subsamples from one sample.
`effn.subsam1`	the subsample sizes from sample 1.
`effn.subsam2`	the subsample sizes from sample 2.
`fn.rep`	the total number of replications.
`alpha`	the type I error level of the test.

Details

The dimensions of est.prob, effn.subsam1 and effn.subsam2 need to match. For example, the first two entries of the first row of est.prob are the comparison results for subsample 1 and subsample 2 of sample 1, so the sum of these two entries equals the product of the two subsample sizes.
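
The matching condition can be checked before calling the function. The sketch below is an assumption-laden illustration: the helper name pairs_match_sizes is hypothetical, and it assumes that the entries of each row come in consecutive pairs, one pair per neighbouring pair of subsamples. The input values are taken from the Examples section of this page.

```r
# Hypothetical check (not part of Trendtwosub): do the paired entries of one row
# of est.prob sum to the products of neighbouring subsample sizes?
pairs_match_sizes <- function(counts, sizes) {
  pair_sums <- colSums(matrix(counts, nrow = 2))   # entries (1,2), (3,4), ...
  products  <- head(sizes, -1) * tail(sizes, -1)   # n_l * n_{l+1}
  length(pair_sums) == length(products) && all(pair_sums == products)
}

freq.mat <- rbind(c(40, 10, 20, 30, 40, 10), c(30, 20, 30, 20, 40, 10))
ok1 <- pairs_match_sizes(freq.mat[1, ], c(5, 10, 5, 10))
ok2 <- pairs_match_sizes(freq.mat[2, ], c(10, 5, 10, 5))
print(c(ok1, ok2))
```

With four subsamples per sample there are three neighbouring pairs, which is why each row of the matrix has six entries.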

Value

critical.value the critical value of the test at the alpha level provided.

chi-stat the value of the chi-square type test statistic computed from the sample provided.

pvalue the simulated p-value.
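
One common way to obtain a critical value and a simulated p-value from replicated statistics is sketched below. It is a generic Monte Carlo illustration, not the bootstrap scheme of the reference paper: the observed statistic is a made-up number and the replicated statistics are placeholder chi-square draws.

```r
# Generic Monte Carlo sketch; the replicated statistics are placeholder draws,
# not the bootstrap scheme used by gen.decision.
set.seed(1)
alpha     <- 0.05
obs_stat  <- 7.2                      # hypothetical observed statistic
null_stat <- rchisq(1000, df = 3)     # placeholder for statistics from replications

# Critical value: upper alpha quantile of the replicated statistics.
critical.value <- unname(quantile(null_stat, probs = 1 - alpha))
# Simulated p-value: share of replications at least as large as the observed value.
pvalue <- mean(null_stat >= obs_stat)
print(c(critical.value = critical.value, pvalue = pvalue))
```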

References

Examples

freq.mat<-rbind(c(20,5,10,15,20,5),c(15,10,15,10,20,5));
n.sam1<-rep(5,4);n.sam2<-rep(5,4); n.rep=1000;
gen.decision(freq.mat,n.sam1,n.sam2,n.rep);
### This command will replicate the first p-value in Table 4 of the reference paper.
freq.mat<-rbind(c(40,10,20,30,40,10),c(30,20,30,20,40,10));
n.sam1<-c(5,10,5,10);n.sam2<-c(10,5,10,5); n.rep=1000;
gen.decision(freq.mat,n.sam1,n.sam2,n.rep)
### This command will replicate the second p-value in Table 4 of the reference paper.

multi.freq function

Description

This function finds the trend in a sample by comparing neighboring subsamples. The subsamples are stored in an R list.

Usage

multi.freq(fsam)

Arguments

fsam

A list of numeric vectors, one per subsample. The order of the vectors in the list follows the order of the subsamples.

Details

The first vector in the list is compared with the second vector by using the function freq.less. The second vector is then compared with the third vector, if there is one, and so on. The statistics collected are based on computing:

$\frac{1}{n_ln_{l+1}}\sum_{i=1}^{n_l}\sum_{j=1}^{n_{l+1}}1(x_{li}<x_{(l+1)j})$
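
The proportion above can be evaluated directly for each pair of neighbouring subsamples. The sketch below is illustrative and not the package's implementation: the name neighbour_props is hypothetical, missing values are dropped as a simplifying assumption, and ties count as 0 because the formula uses a strict inequality.

```r
# Hypothetical helper (not the package's implementation): proportion of pairs
# with x_{l,i} < x_{l+1,j} for each pair of neighbouring subsamples.
# Assumption: missing values are dropped; ties count as 0 under the strict "<".
neighbour_props <- function(fsam) {
  fsam <- lapply(fsam, function(v) v[!is.na(v)])
  vapply(seq_len(length(fsam) - 1), function(l) {
    mean(outer(fsam[[l]], fsam[[l + 1]], "<"))
  }, numeric(1))
}

sam <- list(c(1, 2, 4, 9, 0, 0, NA), c(1, 4, 9, NA), c(2, 5, 10))
props <- neighbour_props(sam)
print(props)
```

A list of three subsamples yields two proportions, one for each neighbouring pair.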

Value

count.vec the sequence of less.count and more.count values returned by the freq.less function for the neighboring subsamples.

References

Examples

x1=c(1,2,4,9,0,0,NA);x2=c(1,4,9,NA);x3=c(2,5,10);
sam=list(x1,x2,x3); #
multi.freq(sam);

pow.ana.gen.decision function

Description

This function evaluates the type I error of the proposed test.

Usage

pow.ana.gen.decision(mean.prob1, mean.prob2, effn.subsam1, effn.subsam2,
  N.rep = 10^1, boot.rep = 10^1, rseed = 1234, alpha.level = 0.05)

Arguments

`mean.prob1`	the probability that an observation from one subsample is less than one from another subsample, in sample #1.
`mean.prob2`	the probability that an observation from one subsample is less than one from another subsample, in sample #2.
`effn.subsam1`	the subsample sizes from sample 1.
`effn.subsam2`	the subsample sizes from sample 2.
`N.rep`	the total number of bootstrap repetitions needed for calculating type I errors.
`boot.rep`	the number of repetitions needed to calculate the simulated p-value.
`rseed`	a random seed.
`alpha.level`	the type I error level that will be assessed.

Value

the simulated type I error.
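
A simulated type I error is typically the share of replications in which the null hypothesis is rejected. The sketch below is a generic illustration and does not call the package: the p-values are placeholder uniform draws, which is what a well-calibrated test would produce when the null hypothesis holds.

```r
# Generic sketch: estimate a rejection rate from replicated p-values.
# The p-values are placeholder uniform draws, not output of pow.ana.gen.decision.
set.seed(1234)
alpha.level <- 0.05
n_rep       <- 2000
p_values    <- runif(n_rep)
type1_error <- mean(p_values < alpha.level)   # share of rejections at alpha.level
print(type1_error)
```

With the small defaults N.rep = 10^1 and boot.rep = 10^1, an estimate of this kind can only take coarse values, so larger repetition counts are advisable for a stable result.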

References

Examples

prob.vec<-c(.4,.2,.3,.6);
sub.sizes1<-c(2,4,3,5,3);sub.sizes2<-c(6,3,2,4,2)
pow.ana.gen.decision(prob.vec,prob.vec,sub.sizes1, sub.sizes1)
pow.ana.gen.decision(prob.vec,prob.vec,sub.sizes1, sub.sizes1,alpha.level=0.1)

seedwt.multi.subsample dataset

Description

seedwt.multi.subsample dataset

Usage

seedwt.multi.subsample

Format

An object of class data.frame with 2916 rows and 10 columns.

Details

Multiple maize inbreds were exposed to all combinations of the following stressors: drought, nitrogen, and density stress. Plants were grown in an experimental plot divided into eight sections, and each section received a combination of between zero and three of these stresses, so that all possible stress combinations were included. More details about the experiment can be found in the references.
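
The count of eight sections follows from three stressors that are each either absent or present. The sketch below enumerates the combinations; the column names are hypothetical labels and are not claimed to match the columns of the dataset.

```r
# Enumerate all absent/present combinations of the three stressors.
# Column names are illustrative labels, not columns of seedwt.multi.subsample.
stress_grid <- expand.grid(drought  = c(FALSE, TRUE),
                           nitrogen = c(FALSE, TRUE),
                           density  = c(FALSE, TRUE))
# Number of stresses applied in each combination (between zero and three).
stress_grid$n.stress <- rowSums(stress_grid[, c("drought", "nitrogen", "density")])
print(stress_grid)
```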

References

Stutts, L., Wang, Y., & Stapleton, A. E. (2018). Plant growth regulators ameliorate or exacerbate abiotic, biotic and combined stress interaction effects on Zea mays kernel weight with inbred-specific patterns. Environmental and Experimental Botany, 147, 179-188.

simu.ustat.pattern function

Description

This function creates two independent subsamples of given subsample sizes, with a given probability vector.

Usage

simu.ustat.pattern(mean.prob.vec, effn.subs, n.rep = 10^2)

Arguments

`mean.prob.vec`	a vector of length 2. Its first element represents the probability that a random observation from one subsample is less than one from the other subsample.
`effn.subs`	a vector containing the two subsample sizes.
`n.rep`	the total number of repetitions.

Details

Each subsample is generated from a normal distribution, with a mean derived from mean.prob.vec.
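
One way to turn a target probability into a normal mean is sketched below. It assumes unit-variance normal distributions, which this manual does not confirm for simu.ustat.pattern: for X ~ N(0, 1) and Y ~ N(mu, 1), P(X < Y) = pnorm(mu / sqrt(2)), so mu = sqrt(2) * qnorm(p).

```r
# Assumption: unit-variance normals. For X ~ N(0, 1) and Y ~ N(mu, 1),
# Y - X ~ N(mu, 2), hence P(X < Y) = pnorm(mu / sqrt(2)).
set.seed(42)
p  <- 0.8                        # target probability that x is less than y
mu <- sqrt(2) * qnorm(p)         # mean shift that yields this probability

x <- rnorm(1e5, mean = 0)
y <- rnorm(1e5, mean = mu)
est <- mean(x < y)               # empirical check of the target probability
print(c(target = p, estimate = est))
```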

Value

simu.tab a list of length n.rep. Each element of the list is a 2 by 2 matrix, showing the comparison results from function multi.freq.

References

Examples

simu.ustat.pattern(c(0.8,0.2),c(5,8),n.rep=100)

sub.test function

Description

This function calculates the simulated p-value for comparing the trend in subsamples from two independent samples.

Usage

sub.test(sam1, sam2, fn.rep2)

Arguments

`sam1`	the first sample.
`sam2`	the second sample.
`fn.rep2`	the total number of bootstrap repetitions needed for calculating the simulated p-value.

Value

critical.value the critical value of the test at the alpha level provided.

chi-stat the value of the chi-square type test statistic computed from the sample provided.

pvalue the simulated p-value.

References

Examples

attach(seedwt.multi.subsample)
Lev.TN<-levels(TreatmentName);
Lev.Line<-levels(Line);
n<-dim(seedwt.multi.subsample)[1];
level.show=c(1:8);fn.rep3=10^2;
line.name<-Lev.Line[1]; t1.name<-Lev.TN[1];t2.name<-Lev.TN[3];
### To compare the GA treatment and the PACGA treatment from line B73
par(mfrow=c(1,2))
idx<-subset((TreatmentName==t1.name)*(Line==line.name)*(1:n),Env %in% level.show)
idx2<-subset((TreatmentName==t2.name)*(Line==line.name)*(1:n),Env %in% level.show)
boxplot(seedwt[idx]~Env[idx],xlab="ENV levels",ylab=paste('seedwt from',t1.name),
         ylim=c(0,12),cex.lab=1.5,cex.axis=1.8);
boxplot(seedwt[idx2]~Env[idx2], xlab="ENV levels",ylab=paste('seedwt from',t2.name),
         cex.lab=1.5,cex.axis=1.8);
mtext( paste ("Line Name:",line.name), side = 3,outer = TRUE, cex = 2.2,line = -3)
temp.sw1<-seedwt[idx];lab<-Env[idx]; uni.lab<-unique(lab)
sam.1<-lapply(1:length(uni.lab), function(x) temp.sw1[lab==uni.lab[x]])
temp.sw2<-seedwt[idx2];lab2<-Env[idx2]; uni.lab2<-unique(lab2)
sam.2<-lapply(1:length(uni.lab2), function(x) temp.sw2[lab2==uni.lab2[x]])
print(paste("working with line ",line.name,'and treatment',t1.name ,'vs',t2.name ))
resu<-sub.test(sam.1,sam.2,fn.rep2=fn.rep3);
## This will show a similar result as the first experiment of section 5 in the paper.
