- surpyval.univariate.nonparametric.plotting_positions.plotting_positions(x, c=None, n=None, t=None, heuristic='Blom', turnbull_estimator='Fleming-Harrington')
This function takes in data in the xcnt format and outputs an approximation of the CDF. This function can be used to produce estimates of F using the Nelson-Aalen, Kaplan-Meier, Fleming-Harrington, and the Turnbull estimates. Additionally, it can be used to create ‘plotting heuristics.’
Plotting heuristics are the values plotted on probability paper; they can be used to estimate the parameters of a distribution. Probability plotting is one of the traditional ways to estimate the parameters of a distribution.
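As an illustration of what a plotting heuristic is, many of the classical ones share the form F_i = (i - a) / (n + 1 - 2a) for the i-th ordered value out of n. The sketch below uses the commonly quoted textbook offsets; the names `plotting_heuristic` and `OFFSETS` are hypothetical and not part of surpyval, and the library's own definitions (see the probability plotting section) may differ in detail.

```python
def plotting_heuristic(n, a):
    """Return F_i = (i - a) / (n + 1 - 2a) for ranks i = 1..n."""
    return [(i - a) / (n + 1 - 2 * a) for i in range(1, n + 1)]


# Commonly quoted textbook offsets (an assumption, not read from surpyval).
OFFSETS = {
    "Weibull": 0.0,   # i / (n + 1)
    "Benard": 0.3,    # (i - 0.3) / (n + 0.4)
    "Blom": 0.375,    # (i - 0.375) / (n + 0.25)
    "Hazen": 0.5,     # (i - 0.5) / n
}

# Plotting positions for eight complete (uncensored) observations.
blom = plotting_heuristic(8, OFFSETS["Blom"])
hazen = plotting_heuristic(8, OFFSETS["Hazen"])
```

Each list is strictly increasing and stays inside (0, 1), which is what allows the values to be placed on a transformed probability axis.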
Right censored data can be used with the regular plotting positions. If there is right censored data, this method adjusts the ranks of the values using the mean order number.
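The rank adjustment can be sketched with the mean order number method as it is usually described (often attributed to Johnson): each observed failure gets an adjusted rank equal to the previous adjusted rank plus (n + 1 - previous rank) / (1 + number of items still at risk). The function name `mean_order_numbers` is hypothetical, the handling of ties (failures before censored items) is an assumption, and this is not necessarily how surpyval implements the adjustment internally.

```python
def mean_order_numbers(times, censored):
    """Adjusted ranks for observed failures in right censored data.

    censored uses the flags from this page: 0 = observed, 1 = right censored.
    Returns a list of (time, adjusted_rank) pairs for the observed values.
    """
    n = len(times)
    # Sort by time; at ties, put failures (flag 0) before censored items.
    order = sorted(range(n), key=lambda i: (times[i], censored[i]))
    previous_rank = 0.0
    adjusted = []
    for position, idx in enumerate(order):
        reverse_rank = n - position  # items at or after this position
        if censored[idx] == 0:
            previous_rank += (n + 1 - previous_rank) / (1 + reverse_rank)
            adjusted.append((times[idx], previous_rank))
    return adjusted


# Five units: failures at 10, 45, 49; right censored at 30 and 82.
ranks = mean_order_numbers([10, 30, 45, 49, 82], [0, 1, 0, 0, 1])
```

The adjusted ranks then take the place of the integer ranks i in whichever plotting heuristic is chosen.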
- Parameters
x (array like, optional) – Array of observations of the random variables. If x is None, xl and xr must be provided.
c (array like, optional) – Array of censoring flags. -1 is left censored, 0 is observed, 1 is right censored, and 2 is interval censored. If not provided, all values are assumed to be observed.
n (array like, optional) – Array of counts for each x. If the data is provided as counts, then this can be provided. If None, each observation is assumed to have a count of 1.
t (2D-array like, optional) – 2D array like of the left and right values at which the respective observation was truncated. If not provided, it is assumed that no truncation occurs.
heuristic (("Blom", "Median", "ECDF", "ECDF_Adj", "Modal", "Midpoint", "Mean", "Weibull", "Benard", "Beard", "Hazen", "Gringorten", "None", "Larsen", "Tukey", "DPW"), str, optional) – Method to use to compute the heuristic of F. See details of each heuristic in the probability plotting section.
turnbull_estimator (('Nelson-Aalen', 'Kaplan-Meier'), str, optional) – If using the Turnbull heuristic, you can elect to use the NA or KM method to compute R with the Turnbull estimates of the risk and death sets.
- Returns
x (numpy array) – x values for the plotting points
r (numpy array) – risk set at each x
d (numpy array) – death set at each x
F (numpy array) – estimate of F to use in plotting positions.
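To show how the returned risk set r and death set d relate to an estimate of F, the following sketch applies the standard textbook forms of the Kaplan-Meier, Nelson-Aalen, and Fleming-Harrington estimators to plain lists. The function names are hypothetical and the code does not call surpyval; treat it as an illustration of the estimators under those assumptions, not as the library's implementation.

```python
import math


def kaplan_meier(r, d):
    """F = 1 - prod(1 - d_i / r_i)."""
    survival, out = 1.0, []
    for ri, di in zip(r, d):
        survival *= 1 - di / ri
        out.append(1 - survival)
    return out


def nelson_aalen(r, d):
    """F = 1 - exp(-H), with H = cumulative sum of d_i / r_i."""
    hazard, out = 0.0, []
    for ri, di in zip(r, d):
        hazard += di / ri
        out.append(1 - math.exp(-hazard))
    return out


def fleming_harrington(r, d):
    """Like Nelson-Aalen, but tied deaths are removed from the risk set one at a time."""
    hazard, out = 0.0, []
    for ri, di in zip(r, d):
        hazard += sum(1 / (ri - k) for k in range(int(di)))
        out.append(1 - math.exp(-hazard))
    return out


# Four items, one death at each distinct time, no censoring.
r, d = [4, 3, 2, 1], [1, 1, 1, 1]
F_km = kaplan_meier(r, d)
F_na = nelson_aalen(r, d)
F_fh = fleming_harrington(r, d)
```

Without tied deaths the Nelson-Aalen and Fleming-Harrington estimates coincide; they differ only where some d is greater than 1.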
Examples
>>> from surpyval.nonparametric import plotting_positions
>>> import numpy as np
>>> x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
>>> x, r, d, F = plotting_positions(x, heuristic="Filliben")
>>> F
array([0.08299596, 0.20113568, 0.32068141, 0.44022714, 0.55977286,
       0.67931859, 0.79886432, 0.91700404])