## Quantile

Quantiles are cut points that divide the range of a probability distribution into continuous intervals with equal probabilities, or divide the observations in a sample in the same way. $$q$$-quantiles are values that partition a finite, ordered set of data into $$q$$ subsets of (nearly) equal size. There are $$q-1$$ of the $$q$$-quantiles, one for each integer $$k$$ satisfying $$0<k<q$$. The $$k$$-th $$q$$-quantile of a random variable is a value $$x$$ such that the probability that the random variable is less than $$x$$ is at most $$\frac{k}{q}$$ and the probability that it is greater than $$x$$ is at most $$\frac{q-k}{q}$$.

Quantiles also apply to continuous distributions, which generalizes rank statistics to continuous variables. When the cumulative distribution function of a random variable is known, the $$q$$-quantiles are the values of the quantile function (the inverse of the cumulative distribution function) at $$\{\frac{1}{q},\frac{2}{q},\dots,\frac{q-1}{q}\}$$.
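To make the definition concrete, here is a minimal sketch in plain Kotlin that picks the $$k$$-th $$q$$-quantile of a sample as the smallest order statistic whose empirical cumulative probability reaches $$\frac{k}{q}$$. The helper `sampleQuantile` is hypothetical and written only for illustration; it is not part of NM Dev and is just one of several possible sample-quantile definitions.

```kotlin
// Hypothetical helper (not part of NM Dev): the k-th q-quantile of a sample,
// taken as the smallest order statistic whose empirical CDF is at least k/q.
fun sampleQuantile(data: DoubleArray, k: Int, q: Int): Double {
    require(data.isNotEmpty()) { "data must not be empty" }
    require(k in 1 until q) { "k must satisfy 0 < k < q" }
    val sorted = data.sortedArray()
    // 1-based rank ceil(n * k / q), computed with integer arithmetic
    val rank = (sorted.size * k + q - 1) / q
    return sorted[rank - 1]
}

fun main() {
    val sample = doubleArrayOf(0.0, 1.0, 2.0, 3.0, 3.0, 3.0, 6.0, 7.0, 8.0, 9.0)
    // The three quartiles: q = 4 and k = 1, 2, 3
    val quartiles = (1..3).map { k -> sampleQuantile(sample, k, 4) }
    println(quartiles) // prints [2.0, 3.0, 7.0]
}
```

For the first quartile the result is 2.0: two of the ten observations are smaller than 2.0 (probability 0.2, at most $$\frac{1}{4}$$) and seven are larger (probability 0.7, at most $$\frac{3}{4}$$), so both conditions in the definition hold.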

Code

In NM Dev, the class Quantile computes the quantiles of a data set. It implements 9 different quantile definitions, selected through Quantile.QuantileType:

1. INVERSE_OF_EMPIRICAL_CDF: the inverse of empirical distribution function
2. INVERSE_OF_EMPIRICAL_CDF_WITH_AVERAGING_AT_DISCONTINUITIES: the inverse of empirical distribution function with averaging at discontinuities
3. NEAREST_EVEN_ORDER_STATISTICS: the nearest even order statistic as in SAS
4. LINEAR_INTERPOLATION_OF_EMPIRICAL_CDF: the linear interpolation of the empirical CDF
5. MIDWAY_THROUGH_STEPS_OF_EMPIRICAL_CDF: a piecewise linear function where the knots are the values midway through the steps of the empirical CDF
6. MINITAB_SPSS: the definition in Minitab and SPSS
7. S: the definition in S
8. APPROXIMATELY_MEDIAN_UNBIASED: the resulting quantile estimates are approximately median-unbiased regardless of the distribution of the sample
9. APPROXIMATELY_UNBIASED_IF_DATA_IS_NORMAL: the resulting quantile estimates are approximately unbiased for the expected order statistics if the sample is normally distributed
				
// NM Dev's Quantile class (package path assumed here; adjust it to your NM Dev version)
import dev.nm.stat.descriptive.rank.Quantile

// create arrays of doubles for our data set and for the quantile levels to evaluate
val values = doubleArrayOf(0.0, 1.0, 2.0, 3.0, 3.0, 3.0, 6.0, 7.0, 8.0, 9.0)
val qs = doubleArrayOf(
    1e-10, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8,
    0.9, 0.95, 1.0
)

// APPROXIMATELY_MEDIAN_UNBIASED
println("APPROXIMATELY_MEDIAN_UNBIASED")
// create a Quantile object that uses this definition
val quantile1 = Quantile(values, Quantile.QuantileType.APPROXIMATELY_MEDIAN_UNBIASED)

println("Sample size: ${quantile1.N()}")

// evaluate the quantile at each level
for (q in qs) {
    println("Q($q) = ${quantile1.value(q)}")
}

println()

// NEAREST_EVEN_ORDER_STATISTICS
println("NEAREST_EVEN_ORDER_STATISTICS")
// create a Quantile object that uses this definition
val quantile2 = Quantile(values, Quantile.QuantileType.NEAREST_EVEN_ORDER_STATISTICS)

println("Sample size: ${quantile2.N()}")

// evaluate the quantile at each level
for (q in qs) {
    println("Q($q) = ${quantile2.value(q)}")
}

Output

APPROXIMATELY_MEDIAN_UNBIASED
Sample size: 10
Q(1.0E-10) = 0.0
Q(0.1) = 0.3666666666666667
Q(0.15) = 0.8833333333333333
Q(0.2) = 1.4
Q(0.3) = 2.4333333333333336
Q(0.4) = 3.0
Q(0.5) = 3.0
Q(0.6) = 4.6
Q(0.7) = 6.566666666666666
Q(0.8) = 7.6
Q(0.9) = 8.633333333333333
Q(0.95) = 9.0
Q(1.0) = 9.0

NEAREST_EVEN_ORDER_STATISTICS
Sample size: 10
Q(1.0E-10) = 0.0
Q(0.1) = 0.0
Q(0.15) = 1.0
Q(0.2) = 1.0
Q(0.3) = 2.0
Q(0.4) = 3.0
Q(0.5) = 3.0
Q(0.6) = 3.0
Q(0.7) = 6.0
Q(0.8) = 7.0
Q(0.9) = 8.0
Q(0.95) = 9.0
Q(1.0) = 9.0
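
The APPROXIMATELY_MEDIAN_UNBIASED values above can be checked by hand. The sketch below assumes that this definition corresponds to type 8 in Hyndman and Fan's classification (the same numbering used by R's `quantile`), which interpolates linearly between order statistics at the position $$h = (N + \frac{1}{3})p + \frac{1}{3}$$. That correspondence is an assumption, and the helper `medianUnbiasedQuantile` is hypothetical; it is not NM Dev's implementation.

```kotlin
import kotlin.math.floor

// Hypothetical helper, not NM Dev's implementation: a type 8 sample quantile
// (assumed to match APPROXIMATELY_MEDIAN_UNBIASED).
fun medianUnbiasedQuantile(data: DoubleArray, p: Double): Double {
    require(data.isNotEmpty()) { "data must not be empty" }
    require(p in 0.0..1.0) { "p must lie in [0, 1]" }
    val x = data.sortedArray()
    val n = x.size
    // fractional 1-based position, clamped to the valid range [1, n]
    val h = ((n + 1.0 / 3.0) * p + 1.0 / 3.0).coerceIn(1.0, n.toDouble())
    val j = floor(h).toInt()         // 1-based index of the lower order statistic
    val g = h - j                    // interpolation weight in [0, 1)
    val lower = x[j - 1]
    val upper = x[minOf(j, n - 1)]   // next order statistic, or the maximum at the top end
    return lower + g * (upper - lower)
}

fun main() {
    val values = doubleArrayOf(0.0, 1.0, 2.0, 3.0, 3.0, 3.0, 6.0, 7.0, 8.0, 9.0)
    for (p in doubleArrayOf(0.1, 0.5, 0.6, 0.95)) {
        println("Q($p) = ${medianUnbiasedQuantile(values, p)}")
    }
}
```

For example, at $$p = 0.6$$ the position is $$h = 10.333 \times 0.6 + 0.333 \approx 6.533$$, so the estimate lies 53.3% of the way from the 6th order statistic (3.0) to the 7th (6.0), which gives 4.6. Under the stated assumption, the printed values should agree with the listing above up to floating-point rounding.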