Conventions

Variable Names

Before discussing the formats, the conventions for vairable names needs to be clarified. These conventions are:

  • x = The random variable (time, stress etc.) array
  • xl = The random variable (time, stress etc.) array for the left interval of interval censored data.
  • xr = The random variable (time, stress etc.) array for the right interval of interval censored data.
  • c = Refers to the censoring data array associated with x
  • n = The count array associated with x
  • t = the truncation values for the left and right truncation at x (must be two dim, or use tl and tr instead)
  • r = risk set at x
  • d = failures/deaths at x
  • t = two dimensional array of the truncation interval at x
  • tl = one dimensional array of the value at which x is left truncated
  • tr = one dimensional array of the value at which x is right truncated

Data Formats

The conventional formats use in surpyval are:

  • xcnt = x variables, with c as the censoring scheme, n as the counts, and t as the truncation
  • xrd = x variables, with the risk set, r, at x and the deaths, d, also at x

All functions in surpyval have default handling conditions for c and n. That is, if these variables aren’t passed, it is assumed that there was one observation and it was a failure for every x.

Surpyval fit() functions use the xcnt format. But the package has handlers for other formats to rearrange it to the needed format. Other formats are:

wranglers for formats:

  • fs = failure time array, f, and right censored time array, s
  • fsl = fs format plus an array for left censored times.

Censoring conventions

For the censoring values, surpyval uses the convention used in Meeker and Escobar, that is:

  • -1 = left
  • 0 = failure / event
  • 1 = right
  • 2 = interval censoring. Must have left and right value in x

This convention gives an intuitive feel for the placement of the data on a timeline.

Function Conventions

The conventions for SurPyval are that each object returned from a fit() call has the ability to compute the following:

  • df() - The density function
  • ff() - The CDF
  • sf() - The survival function, or reliability function
  • hf() - The (instantaneous) hazard function
  • Hf() - The cumulative hazard function

These functions can be used to plot or even in optimisers so that you can optimize decisions that you are guiding with your survival analysis.