Proportional Intensity Regression Model

class surpyval.recurrent.regression.proportional_intensity.ProportionalIntensityModel

Bases: RecurrenceSimulationMixin, LikelihoodInferenceMixin

Provides the methods and attributes of a fitted proportional intensity model.

Simulation reuses the shared RecurrenceSimulationMixin (seeding, max_events backstop and the count/time-terminated drivers); the only addition here is the per-item covariate vector Z, which the simulation entry points take and thread through to the sampler. When the model was fitted by maximum likelihood it also carries the likelihood-inference behaviour (log_likelihood, aic, bic, standard_errors) from LikelihoodInferenceMixin.

Examples

>>> from surpyval.datasets import load_rossi_static
>>> from surpyval.recurrent import CrowAMSAA
>>> from surpyval.recurrent import ProportionalIntensityNHPP
>>> data = load_rossi_static()
>>> x = data['week'].values
>>> c = data['arrest'].values
>>> Z = data[["fin", "age", "race", "wexp", "mar", "paro", "prio"]].values
>>> model = ProportionalIntensityNHPP.fit(x, Z, c, dist=CrowAMSAA)
>>> type(model)
surpyval.recurrent.regression.proportional_intensity.ProportionalIntensityModel
>>> model.cif([1, 2, 3], Z.mean(axis=0))
array([0.00625402, 0.04304137, 0.13302238])

cif(x, Z)

Compute the cumulative incidence function of the model with the parameters found by the fit method.

Parameters
  • x (array_like) – The times to compute the CIF at.

  • Z (array_like) – The covariates for the item.
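As a rough illustration of how the covariates enter, the sketch below assumes the usual proportional intensity form, in which a baseline cumulative intensity is multiplied by exp(Z · coefficients). The power-law baseline alpha * x**beta, the parameter values and the names sketch_cif and sketch_iif are assumptions made for this example; they are not taken from the library and may differ from its internal parameterisation.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
ALPHA, BETA = 0.5, 2.0            # assumed power-law baseline parameters
COEFFS = np.array([0.3, -0.1])    # assumed covariate coefficients


def sketch_cif(x, Z):
    """Baseline cumulative intensity ALPHA * x**BETA scaled by exp(Z . COEFFS)."""
    x = np.asarray(x, dtype=float)
    return ALPHA * x ** BETA * np.exp(np.dot(Z, COEFFS))


def sketch_iif(x, Z):
    """Derivative of sketch_cif with respect to x."""
    x = np.asarray(x, dtype=float)
    return ALPHA * BETA * x ** (BETA - 1.0) * np.exp(np.dot(Z, COEFFS))


baseline = sketch_cif([1.0, 2.0, 3.0], Z=[0.0, 0.0])   # covariates at zero
shifted = sketch_cif([1.0, 2.0, 3.0], Z=[1.0, 0.0])    # first covariate raised by 1
```

Under this assumed form, raising the first covariate by one multiplies the value at every time by exp(0.3), which is what "proportional" refers to.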

count_terminated_simulation(events, Z, items=1, seed=None)

Simulate count-terminated recurrence data based on the fitted model.

Parameters
  • events (int) – Number of events to simulate per sequence.

  • Z (array_like) – Covariate vector applied to every simulated sequence.

  • items (int, optional) – Number of items (or sequences) to simulate. Default is 1.

  • seed (int or numpy.random.Generator, optional) – Seed for a reproducible simulation.

Returns

A NonParametricCounting model built from the simulated data.

Return type

NonParametricCounting

count_terminated_simulation_data(events, Z, items=1, seed=None)

Simulate count-terminated recurrence data and return the raw events. Like count_terminated_simulation() but yields the simulated RecurrentEventData rather than the fitted MCF.
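One way to picture count-terminated simulation is the inversion sketch below. It is an assumption-laden illustration, not the library's sampler: it assumes a power-law baseline scaled by exp(Z · coeffs), and the function name and parameters are hypothetical. Cumulative sums of unit exponentials give the event times of a unit-rate process, and mapping them through the inverse cumulative intensity yields exactly `events` event times per item.

```python
import numpy as np


def sketch_count_terminated(events, Z, alpha, beta, coeffs, items=1, seed=None):
    """Return an (items, events) array of event times (hypothetical helper)."""
    rng = np.random.default_rng(seed)           # accepts an int seed or a Generator
    scale = np.exp(np.dot(Z, coeffs))           # proportional covariate effect
    # Event times of a unit-rate Poisson process, one row per item.
    unit_times = np.cumsum(rng.exponential(size=(items, events)), axis=1)
    # Invert the assumed cumulative intensity scale * alpha * t**beta.
    return (unit_times / (scale * alpha)) ** (1.0 / beta)


times = sketch_count_terminated(
    events=5, Z=[1.0, 0.0], alpha=0.5, beta=1.5, coeffs=[0.3, -0.1], items=3, seed=42
)
repeat = sketch_count_terminated(
    events=5, Z=[1.0, 0.0], alpha=0.5, beta=1.5, coeffs=[0.3, -0.1], items=3, seed=42
)
```

Passing the same seed reproduces the same sequences, which mirrors the purpose of the seed argument documented above.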

covariance()

Approximate parameter covariance matrix, ordered to match parameter_names. Computed as the inverse of the numerical Hessian of the negative log-likelihood at the MLE.
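The idea of inverting a numerical Hessian can be sketched with a toy negative log-likelihood whose curvature is known in advance. The finite-difference scheme, the step size and all names here are assumptions for illustration; the library may use a different differentiation method.

```python
import numpy as np


def numerical_hessian(f, theta, h=1e-4):
    """Central-difference Hessian of a scalar function f at theta."""
    theta = np.asarray(theta, dtype=float)
    n = theta.size
    hessian = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            hessian[i, j] = (
                f(theta + ei + ej) - f(theta + ei - ej)
                - f(theta - ei + ej) + f(theta - ei - ej)
            ) / (4.0 * h * h)
    return hessian


# Toy negative log-likelihood: a quadratic whose Hessian is the matrix A.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
mle = np.array([1.0, -2.0])


def neg_log_like(theta):
    d = theta - mle
    return 0.5 * d @ A @ d


# Approximate covariance: inverse of the Hessian evaluated at the optimum.
cov = np.linalg.inv(numerical_hessian(neg_log_like, mle))
```

For this quadratic the result should agree with the inverse of A up to floating-point error.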

iif(x, Z)

Compute the instantaneous incidence function of the model with the parameters found by the fit method.

Parameters
  • x (array_like) – The times at which to compute the IIF.

  • Z (array_like) – The covariates for the item.

mcf(x, Z, items=1000, seed=None)

Estimate the mean cumulative function at x for covariates Z by simulating items time-terminated sequences out to max(x).
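The simulation-based estimate can be sketched as follows: simulate many time-terminated sequences out to max(x), count the events at or before each requested time, and average across sequences. The power-law baseline, the inversion sampler and every name in this block are assumptions made for illustration rather than the library's implementation.

```python
import numpy as np


def simulate_sequence(T, scale, alpha, beta, rng, max_events=10_000):
    """Event times up to T for the assumed intensity (hypothetical helper)."""
    times = []
    total = 0.0
    while len(times) < max_events:
        total += rng.exponential()                        # next unit-rate arrival
        t = (total / (scale * alpha)) ** (1.0 / beta)     # invert cumulative intensity
        if t > T:
            break
        times.append(t)
    return np.array(times)


def sketch_mcf(x, Z, alpha, beta, coeffs, items=1000, seed=None):
    """Average event count at each x over `items` simulated sequences."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    scale = np.exp(np.dot(Z, coeffs))
    counts = np.zeros_like(x)
    for _ in range(items):
        times = simulate_sequence(x.max(), scale, alpha, beta, rng)
        # Number of events at or before each requested time.
        counts += np.searchsorted(times, x, side="right")
    return counts / items


estimate = sketch_mcf([1.0, 2.0, 3.0], Z=[1.0], alpha=0.5, beta=1.5,
                      coeffs=[0.2], items=2000, seed=0)
# Closed-form cumulative intensity of the assumed model, for comparison.
expected = 0.5 * np.exp(0.2) * np.array([1.0, 2.0, 3.0]) ** 1.5
```

Because the estimate is a Monte Carlo average, it only approaches the closed-form value as items grows; a larger items reduces the noise at the cost of run time.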

plot(ax=None)

Plots the CIF of the model against the data used to fit it.

To do this, the plot method averages the covariates and uses the averages to calculate the CIF of the model. This is then plotted against the non-parametric MCF of the raw data. That is, the raw MCF is created without considering the covariates.

Parameters

ax (matplotlib.axes.Axes, optional) – The axes to plot the data on. If None, the current axes will be used.

Returns

ax – The axes the data was plotted on.

Return type

matplotlib.axes.Axes

standard_errors()

Standard errors of the fitted parameters (the square roots of the diagonal of covariance()), ordered to match parameter_names. Entries are NaN where the variance is non-positive, which typically indicates a boundary optimum.
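The rule described above can be sketched in a few lines of NumPy. The function name and the example matrix are hypothetical; the second variance is made negative on purpose to show the NaN case.

```python
import numpy as np


def sketch_standard_errors(cov):
    """Square roots of the diagonal, with NaN where the variance is not positive."""
    variances = np.diag(np.asarray(cov, dtype=float))
    errors = np.full(variances.shape, np.nan)
    positive = variances > 0
    errors[positive] = np.sqrt(variances[positive])
    return errors


# Second diagonal entry is negative, as can happen at a boundary optimum.
se = sketch_standard_errors([[0.04, 0.0], [0.0, -1e-3]])
```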

time_terminated_simulation(T, Z, items=1, tol=1e-08, max_events=10000, seed=None)

Simulate time-terminated recurrence data based on the fitted model.

Parameters
  • T (float) – Time termination value.

  • Z (array_like) – Covariate vector applied to every simulated sequence.

  • items (int, optional) – Number of items (or sequences) to simulate. Default is 1.

  • tol (float, optional) – Interarrival times below this value end a sequence early, since they may indicate that event times are converging to an asymptote. Default is 1e-8.

  • max_events (int, optional) – Hard per-sequence event cap that guarantees termination. Default is 10000.

  • seed (int or numpy.random.Generator, optional) – Seed for a reproducible simulation.

Returns

A NonParametricCounting model built from the simulated data.

Return type

NonParametricCounting

Warning

A sequence is terminated early and right-censored at its last event if an interarrival time falls below tol or it reaches max_events before T. A warning is raised in either case.
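The early-termination rules can be illustrated with a small stand-alone loop. This is a sketch under assumptions: next_gap is a hypothetical sampler that returns the waiting time after the current time, and the function returns the event times together with the time at which observation of the sequence ends.

```python
import warnings


def sketch_time_terminated(next_gap, T, tol=1e-8, max_events=10_000):
    """Return (event_times, end_of_observation) for one sequence."""
    times = []
    t = 0.0
    while True:
        gap = next_gap(t)
        if gap < tol:
            # Possible asymptote: stop and censor at the last event.
            warnings.warn("interarrival time below tol; sequence ended early")
            return times, t
        if t + gap > T:
            # Normal case: the next event falls beyond T.
            return times, T
        t += gap
        times.append(t)
        if len(times) >= max_events:
            # Hard cap reached before T: stop and censor at the last event.
            warnings.warn("max_events reached before T; sequence ended early")
            return times, t


# Gaps halve each time, so event times converge towards 1.0 and never reach T.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    asymptote_times, asymptote_end = sketch_time_terminated(
        lambda t: (1.0 - t) / 2.0, T=2.0, tol=1e-3
    )
```

In the example the gaps shrink geometrically, so without the tol check the loop would keep producing events without ever reaching T; max_events provides the same guarantee when the gaps stay above tol.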

time_terminated_simulation_data(T, Z, items=1, tol=1e-08, max_events=10000, seed=None)

Simulate time-terminated recurrence data and return the raw events. Like time_terminated_simulation() but yields the simulated RecurrentEventData rather than the fitted MCF.