Non-Parametric
-
class surpyval.nonparametric.nonparametric.NonParametric
Bases: object
Result of the .fit() method for every non-parametric surpyval distribution. This means that each of the methods in this class can be called with a model created from the NelsonAalen, KaplanMeier, FlemingHarrington, or Turnbull estimators.
-
Hf(x, interp='step')
Cumulative hazard rate with the non-parametric estimates from the data. This is calculated using the relationship between the cumulative hazard function and the survival function:
\[H(x) = -\ln(R(x))\]
Parameters: x (array like or scalar) – The values of the random variables at which the cumulative hazard function will be calculated
Returns: Hf – The value(s) of the cumulative hazard function at x
Return type: scalar or numpy array
Examples
>>> import numpy as np
>>> from surpyval import NelsonAalen
>>> x = np.array([1, 2, 3, 4, 5])
>>> model = NelsonAalen.fit(x)
>>> model.Hf(2)
array([0.45])
>>> model.Hf([1., 1.5, 2., 2.5])
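The relationship H(x) = -ln(R(x)) can be checked by hand without the library. A minimal sketch, assuming the survival estimates that this section's sf example reports for the same five-point data at x = [1., 1.5, 2., 2.5]; the variable names are illustrative only:

```python
import math

# Survival estimates quoted in this section's sf example
# for x = [1., 1.5, 2., 2.5] (five uncensored failures at 1..5).
R = [0.81873075, 0.81873075, 0.63762815, 0.63762815]

# H(x) = -ln(R(x)), applied element-wise
H = [-math.log(r) for r in R]

print([round(h, 6) for h in H])
```

The third entry corresponds to x = 2 and agrees with the scalar Hf(2) value of 0.45 shown in the example.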
-
R_cb(x, bound='two-sided', interp='step', alpha_ci=0.05, bound_type='exp', dist='z')
Confidence bounds of the survival function with the non-parametric estimates from the data.
Parameters: - x (array like or scalar) – The values of the random variables at which the confidence bounds will be calculated
- bound (('two-sided', 'upper', 'lower'), str, optional) – Compute either the two-sided, upper or lower confidence bound(s). Defaults to two-sided
- interp (('step', 'linear', 'cubic'), optional) – How to interpolate the values between observations. Survival statistics traditionally uses step functions, but can use interpolated values if desired. Defaults to step.
- alpha_ci (scalar, optional) – The level of significance at which the bound will be computed.
- bound_type (('exp', 'regular'), str, optional) – The method with which the bounds will be calculated. Using regular will allow for the bounds to exceed 1 or be less than 0. Defaults to exp as this ensures the bounds are within 0 and 1.
- dist (('t', 'z'), str, optional) – The distribution to use in finding the bounds. Defaults to the normal (z) distribution.
Returns: R_cb – The value(s) of the upper, lower, or both confidence bound(s) of the survival function at x
Return type: scalar or numpy array
Examples
>>> import numpy as np
>>> from surpyval import NelsonAalen
>>> x = np.array([1, 2, 3, 4, 5])
>>> model = NelsonAalen.fit(x)
>>> model.R_cb([1., 1.5, 2., 2.5], bound='lower', dist='t')
array([0.11434813, 0.11434813, 0.04794404, 0.04794404])
>>> model.R_cb([1., 1.5, 2., 2.5])
array([[0.97789387, 0.16706394],
       [0.97789387, 0.16706394],
       [0.91235117, 0.10996882],
       [0.91235117, 0.10996882]])
References
http://reliawiki.org/index.php/Non-Parametric_Life_Data_Analysis
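To make the 'exp' bound type concrete, the sketch below builds two-sided bounds from a Greenwood-style variance and a log(-log) transform, which keeps both bounds between 0 and 1. This is one common construction and is an assumption here, not a statement of how surpyval implements R_cb; the helper name `exp_greenwood_bounds` is hypothetical. For the five-point data and the normal (z) distribution it gives values close to the two-sided figures in the example above.

```python
import math
from statistics import NormalDist


def exp_greenwood_bounds(r, d, alpha_ci=0.05):
    """Two-sided bounds on R after each failure time (hypothetical helper).

    r, d: at-risk and failure counts at each ordered failure time.
    Returns a list of (lower, upper) tuples.
    """
    z = NormalDist().inv_cdf(1 - alpha_ci / 2)
    H = 0.0    # Nelson-Aalen cumulative hazard
    var = 0.0  # Greenwood variance sum: d / (r * (r - d))
    bounds = []
    for r_i, d_i in zip(r, d):
        if r_i == d_i:
            break  # variance term is undefined once nothing survives
        H += d_i / r_i
        var += d_i / (r_i * (r_i - d_i))
        R = math.exp(-H)
        # Shift on the log(-log) scale; log(R) is negative, so shift is negative
        shift = z * math.sqrt(var) / math.log(R)
        lower = R ** math.exp(-shift)
        upper = R ** math.exp(shift)
        bounds.append((lower, upper))
    return bounds


# Five uncensored failures at x = 1..5
bounds = exp_greenwood_bounds(r=[5, 4, 3, 2, 1], d=[1, 1, 1, 1, 1])
for lower, upper in bounds:
    print(round(lower, 4), round(upper, 4))
```

The loop stops before the last failure because the variance term divides by r - d, which is zero when the final unit fails.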
-
df(x, interp='step')
Density function with the non-parametric estimates from the data. This is calculated using the relationship between the hazard function and the density:
\[f(x) = h(x)e^{-H(x)}\]
Parameters: x (array like or scalar) – The values of the random variables at which the density function will be calculated
Returns: df – The value(s) of the density function at x
Return type: scalar or numpy array
Examples
>>> import numpy as np
>>> from surpyval import NelsonAalen
>>> x = np.array([1, 2, 3, 4, 5])
>>> model = NelsonAalen.fit(x)
>>> model.df(2)
array([0.28693267])
>>> model.df([1., 1.5, 2., 2.5])
array([0.16374615, 0. , 0.15940704, 0. ])
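The relation f(x) = h(x)e^{-H(x)} can be traced step by step. A minimal sketch, assuming five uncensored failures at x = 1..5, Nelson-Aalen increments d/r for H, and a hazard taken as the difference between consecutive H values; all variable names are illustrative. Under these assumptions the results at x = 1 and x = 2 match the first and third entries of the array example above.

```python
import math

# Five uncensored failures: at-risk counts r and failure counts d at x = 1..5
x = [1, 2, 3, 4, 5]
r = [5, 4, 3, 2, 1]
d = [1, 1, 1, 1, 1]

# Cumulative hazard H at each observed x (running sum of d / r)
H = []
total = 0.0
for r_i, d_i in zip(r, d):
    total += d_i / r_i
    H.append(total)

# Hazard as the difference between consecutive H values
# (H is taken as 0 before the first failure)
h = [H[0]] + [H[i] - H[i - 1] for i in range(1, len(H))]

# f(x) = h(x) * exp(-H(x)) at each observed x
f = [h_i * math.exp(-H_i) for h_i, H_i in zip(h, H)]

print([round(v, 8) for v in f[:2]])
```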
-
ff(x, interp='step')
CDF (failure or unreliability) function with the non-parametric estimates from the data
Parameters: x (array like or scalar) – The values of the random variables at which the failure function will be calculated
Returns: ff – The value(s) of the failure function at each x
Return type: scalar or numpy array
Examples
>>> import numpy as np
>>> from surpyval import NelsonAalen
>>> x = np.array([1, 2, 3, 4, 5])
>>> model = NelsonAalen.fit(x)
>>> model.ff(2)
array([0.36237185])
>>> model.ff([1., 1.5, 2., 2.5])
array([0.18126925, 0.18126925, 0.36237185, 0.36237185])
-
hf(x, interp='step')
Instantaneous hazard function with the non-parametric estimates from the data. This is calculated simply as the difference between consecutive values of H(x).
Parameters: x (array like or scalar) – The values of the random variables at which the hazard function will be calculated
Returns: hf – The value(s) of the hazard function at each x
Return type: scalar or numpy array
Examples
>>> import numpy as np
>>> from surpyval import NelsonAalen
>>> x = np.array([1, 2, 3, 4, 5])
>>> model = NelsonAalen.fit(x)
>>> model.hf(2)
>>> model.hf([1., 1.5, 2., 2.5])
-
plot(**kwargs)
Creates a plot of the survival function.
-
sf(x, interp='step')
Survival (or Reliability) function with the non-parametric estimates from the data
Parameters: x (array like or scalar) – The values of the random variables at which the survival function will be calculated
Returns: sf – The value(s) of the survival function at each x
Return type: scalar or numpy array
Examples
>>> import numpy as np
>>> from surpyval import NelsonAalen
>>> x = np.array([1, 2, 3, 4, 5])
>>> model = NelsonAalen.fit(x)
>>> model.sf(2)
array([0.63762815])
>>> model.sf([1., 1.5, 2., 2.5])
array([0.81873075, 0.81873075, 0.63762815, 0.63762815])
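The interp argument controls what happens between observations. The sketch below contrasts a step lookup with linear interpolation, using the Nelson-Aalen survival estimates quoted in this section for x = 1..5. It is an illustration of the two ideas only: the function names are hypothetical, and the handling of points outside the observed range (R = 1 before the first observation, the last estimate afterwards) is an assumption rather than documented surpyval behaviour.

```python
from bisect import bisect_right

# Observed x and survival estimates quoted in this section
xs = [1, 2, 3, 4, 5]
Rs = [0.81873075, 0.63762815, 0.45688054, 0.27711205, 0.10194383]


def sf_step(x):
    """Step lookup: hold the estimate from the last observation <= x."""
    i = bisect_right(xs, x)
    return 1.0 if i == 0 else Rs[i - 1]


def sf_linear(x):
    """Linear interpolation between neighbouring observations."""
    i = bisect_right(xs, x)
    if i == 0 or i == len(xs):
        return sf_step(x)  # outside the observed range: use the step value
    x0, x1 = xs[i - 1], xs[i]
    return Rs[i - 1] + (Rs[i] - Rs[i - 1]) * (x - x0) / (x1 - x0)


print(sf_step(1.5), sf_linear(1.5))
```

At x = 1.5 the step lookup returns the estimate at x = 1, as in the sf example, while the linear version returns a value halfway between the estimates at x = 1 and x = 2.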
-
-
class surpyval.nonparametric.kaplan_meier.KaplanMeier_
Bases: surpyval.nonparametric.nonparametric_fitter.NonParametricFitter
Kaplan-Meier estimator class. Calculates the Non-Parametric estimate of the survival function using:
\[R(x) = \prod_{i:x_{i} \leq x}^{} \left ( 1 - \frac{d_{i} }{r_{i}} \right )\]
where d_i is the number of deaths and r_i is the number at risk at x_i.
Examples
>>> import numpy as np
>>> from surpyval import KaplanMeier
>>> x = np.array([1, 2, 3, 4, 5])
>>> model = KaplanMeier.fit(x)
>>> model.R
array([0.8, 0.6, 0.4, 0.2, 0. ])
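The product formula can be evaluated directly from the risk and death counts. A minimal sketch, assuming five uncensored failures at x = 1..5 so that the at-risk counts are 5, 4, 3, 2, 1; the function `kaplan_meier` is an illustrative helper, not the surpyval implementation:

```python
def kaplan_meier(r, d):
    """Return R after each ordered failure time.

    r: number at risk just before each time; d: failures at each time.
    """
    R = []
    surviving = 1.0
    for r_i, d_i in zip(r, d):
        surviving *= 1 - d_i / r_i  # one factor of the product
        R.append(surviving)
    return R


# Five uncensored failures at x = 1..5
R = kaplan_meier(r=[5, 4, 3, 2, 1], d=[1, 1, 1, 1, 1])
print([round(v, 4) for v in R])
```

Up to floating point rounding, the result matches the model.R values in the example above.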
-
class surpyval.nonparametric.nelson_aalen.NelsonAalen_
Bases: surpyval.nonparametric.nonparametric_fitter.NonParametricFitter
Nelson-Aalen estimator class. Returns a NonParametric object from method fit(). Calculates the Non-Parametric estimate of the survival function using:
\[R(x) = e^{-\sum_{i:x_{i} \leq x}^{} \frac{d_{i} }{r_{i}}}\]
Examples
>>> import numpy as np
>>> from surpyval import NelsonAalen
>>> x = np.array([1, 2, 3, 4, 5])
>>> model = NelsonAalen.fit(x)
>>> model.R
array([0.81873075, 0.63762815, 0.45688054, 0.27711205, 0.10194383])
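The same counts give the Nelson-Aalen estimate by summing d/r and exponentiating. A minimal sketch under the same assumption of five uncensored failures at x = 1..5; `nelson_aalen` is an illustrative helper, not the library's code:

```python
import math


def nelson_aalen(r, d):
    """Return R = exp(-H) after each ordered failure time."""
    H = 0.0
    R = []
    for r_i, d_i in zip(r, d):
        H += d_i / r_i  # cumulative hazard increment
        R.append(math.exp(-H))
    return R


# Five uncensored failures at x = 1..5
R = nelson_aalen(r=[5, 4, 3, 2, 1], d=[1, 1, 1, 1, 1])
print([round(v, 8) for v in R])
```

Unlike the Kaplan-Meier product, this estimate never reaches exactly zero, which is why the last value in the example is about 0.10 and not 0.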
-
class surpyval.nonparametric.fleming_harrington.FlemingHarrington_
Bases: surpyval.nonparametric.nonparametric_fitter.NonParametricFitter
Fleming-Harrington estimation of survival distribution. Returns a NonParametric object from method fit(). Calculates the Non-Parametric estimate of the survival function using:
\[R(x) = e^{-\sum_{i:x_{i} \leq x} \sum_{j=0}^{d_{i}-1} \frac{1}{r_{i} - j}}\]
where d_i is the number of deaths and r_i is the number at risk at x_i. See the NonParametric section for details of how H is computed.
Examples
>>> import numpy as np
>>> from surpyval import FlemingHarrington
>>> x = np.array([1, 2, 3, 4, 5])
>>> model = FlemingHarrington.fit(x)
>>> model.R
array([0.81873075, 0.63762815, 0.45688054, 0.27711205, 0.10194383])
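The inner sum only changes the result when several failures share the same time. A minimal sketch of the double sum, with an illustrative helper `fleming_harrington` and made-up tied data (two failures at the first time out of five units); neither is taken from surpyval:

```python
import math


def fleming_harrington(r, d):
    """Return R after each ordered failure time using the double sum."""
    H = 0.0
    R = []
    for r_i, d_i in zip(r, d):
        # Spread the d_i tied failures over successively smaller risk sets
        H += sum(1 / (r_i - j) for j in range(d_i))
        R.append(math.exp(-H))
    return R


# No ties: every inner sum is the single term 1 / r_i
R_no_ties = fleming_harrington(r=[5, 4, 3, 2, 1], d=[1, 1, 1, 1, 1])

# Two tied failures at the first time: increment is 1/5 + 1/4, not 2/5
R_ties = fleming_harrington(r=[5, 3, 2, 1], d=[2, 1, 1, 1])

print(round(R_no_ties[0], 8), round(R_ties[0], 8))
```

With no ties the inner sum reduces to d/r with d = 1, so the estimate coincides with Nelson-Aalen; that is why the example above prints the same array as the Nelson-Aalen example.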
-
class surpyval.nonparametric.turnbull.Turnbull_
Bases: surpyval.nonparametric.nonparametric_fitter.NonParametricFitter
Turnbull estimator class. Returns a NonParametric object from method fit(). Calculates the Non-Parametric estimate of the survival function using the Turnbull NPMLE.
Examples
>>> import numpy as np
>>> from surpyval import Turnbull
>>> x = np.array([[1, 5], [2, 3], [3, 6], [1, 8], [9, 10]])
>>> model = Turnbull.fit(x)
>>> model.R
array([1. , 0.59999999, 0.20000002, 0.2 , 0.2 , 0.2 , 0. , 0. ])
-
surpyval.nonparametric.success_run.success_run(n, confidence=0.95, alpha=None)
Function that can be used to estimate the confidence given that n samples all survive a test.
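For context, the classical success-run relation links the demonstrated reliability R, the confidence C, and the number n of units that all survive: C = 1 - R^n, or equivalently R = (1 - C)^(1/n). The sketch below evaluates that relation directly. That success_run follows exactly this relation is an assumption, and both helper names are hypothetical.

```python
def success_run_reliability(n, confidence=0.95):
    """Reliability demonstrated at the given confidence by n survivors."""
    return (1 - confidence) ** (1 / n)


def success_run_confidence(n, reliability):
    """Confidence that reliability is at least the given value after n survivors."""
    return 1 - reliability ** n


R_demo = success_run_reliability(n=10, confidence=0.95)
C_back = success_run_confidence(n=10, reliability=R_demo)
print(round(R_demo, 4), round(C_back, 4))
```

The second call inverts the first, so the confidence printed is the 0.95 that was passed in.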
-
surpyval.nonparametric.plotting_positions.plotting_positions(x, c=None, n=None, t=None, heuristic='Blom', turnbull_estimator='Fleming-Harrington')
This function takes in data in the xcnt format and outputs an approximation of the CDF. This function can be used to produce estimates of F using the Nelson-Aalen, Kaplan-Meier, Fleming-Harrington, and the Turnbull estimates. Additionally, it can be used to create ‘plotting heuristics.’
Plotting heuristics are the values that are used to plot on probability paper and can be used to estimate the parameters of a distribution. The use of probability plots is one of the traditional ways to estimate the parameters of a distribution.
Right censored data can be used by the regular plotting positions. If there is right censored data, this method adjusts the ranks of the values using the mean order number.
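The rank adjustment for right censored data can be sketched with the mean order number recurrence as it is commonly stated: each failure's adjusted rank is the previous adjusted rank plus (n + 1 - previous rank) / (1 + number of units still on test). Whether surpyval uses exactly this recurrence is an assumption, and `mean_order_numbers` is a hypothetical helper.

```python
def mean_order_numbers(failed):
    """Adjusted ranks for the failures in time-ordered data.

    failed: booleans in ascending time order; True is a failure,
    False is a right censored observation.
    """
    n = len(failed)
    ranks = []
    previous = 0.0
    for i, is_failure in enumerate(failed):
        if not is_failure:
            continue  # censored units get no rank but shrink the risk set
        remaining = n - i  # units still on test, including this one
        previous += (n + 1 - previous) / (1 + remaining)
        ranks.append(previous)
    return ranks


# Failure, censored, failure, failure
ranks = mean_order_numbers([True, False, True, True])
print([round(v, 4) for v in ranks])
```

With no censoring the recurrence returns the ordinary ranks 1, 2, ..., n, so the plotting heuristics are unchanged for complete data.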
Parameters: - x (array like, optional) – Array of observations of the random variables. If x is
None
, xl and xr must be provided. - c (array like, optional) – Array of censoring flag. -1 is left censored, 0 is observed, 1 is right censored, and 2 is intervally censored. If not provided will assume all values are observed.
- n (array like, optional) – Array of counts for each x. If data is proivded as counts, then this can be provided. If
None
will assume each observation is 1. - t (2D-array like, optional) – 2D array like of the left and right values at which the respective observation was truncated. If not provided it assumes that no truncation occurs.
- heuristic (("Blom", "Median", "ECDF", "ECDF_Adj", "Modal", "Midpoint", "Mean", "Weibull", "Benard", "Beard", "Hazen", "Gringorten", "None", "Larsen", "Tukey", "DPW"), str, optional) – Method to use to compute the heuristic of F. See details of each heursitic in the probability plotting section.
- turnbull_estimator (('Nelson-Aalen', 'Kaplan-Meier'), str, optional) – If using the Turnbull heuristic, you can elect to use the NA or KM method to compute R with the Turnbull estimates of the risk and deat sets.
Returns: - x (numpy array) – x values for the plotting points
- r (numpy array) – risk set at each x
- d (numpy array) – death set at each x
- F (numpy array) – estimate of F to use in plotting positions.
Examples
>>> from surpyval.nonparametric import plotting_positions
>>> import numpy as np
>>> x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
>>> x, r, d, F = plotting_positions(x, heuristic="Filliben")
>>> F
array([0.08299596, 0.20113568, 0.32068141, 0.44022714, 0.55977286, 0.67931859, 0.79886432, 0.91700404])
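Many rank-based heuristics share the form (i - a) / (n + 1 - 2a) for rank i of n, differing only in the constant a. The sketch below uses the textbook constants (Blom a = 0.375; Filliben a = 0.3175 for the interior ranks with a closed form at both ends). The helper names are hypothetical and this is not the library's code; for n = 8 complete observations the Filliben values agree with the example above.

```python
def plotting_position(i, n, a):
    """Generic rank-based estimate of F: (i - a) / (n + 1 - 2a)."""
    return (i - a) / (n + 1 - 2 * a)


def filliben(n):
    """Filliben's estimate: closed form at the ends, a = 0.3175 in between."""
    last = 0.5 ** (1 / n)
    middle = [plotting_position(i, n, 0.3175) for i in range(2, n)]
    return [1 - last] + middle + [last]


n = 8
F_blom = [plotting_position(i, n, 0.375) for i in range(1, n + 1)]
F_filliben = filliben(n)

print([round(v, 8) for v in F_filliben])
```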