Constraint Programming Backend
Constraint programming backend for optimal counterfactual search.
- class BaseModel[source]
Bases:
ABC,CpModelBase OR-Tools model used by the constraint programming backend.
- class ConstraintProgramBuilder(*args, **kwargs)[source]
Bases:
ModelBuilderBuild CP constraints linking feature variables to tree paths.
- build(model, *, trees, mapper)[source]
Build the model constraints for the given trees and features.
- Parameters:
model (BaseModel) – The model to which the constraints will be added.
trees (tuple[TreeVar, ...]) – The tree variables for which the constraints will be built.
mapper (Mapper[FeatureVar]) – The feature variables for which the constraints will be built.
- class Explainer(ensemble, *, mapper, weights=None, epsilon=1, model_type=Type.CP)[source]
Bases:
Model,BaseExplainerConstraint programming explainer for tree ensemble classifiers.
Initialize an empty CP-SAT model for a parsed ensemble.
- Parameters:
trees – Parsed trees contributing to the ensemble class scores.
mapper – Feature mapper aligning processed query coordinates with the finite-domain decision variables.
weights – Optional tree weights.
max_samples – Reserved parameter used by compatible higher-level explainers.
epsilon – Integer classification margin used in the pairwise score constraints.
model_type – Backend builder variant.
- explain(x, *, y, norm, return_callback=False, verbose=False, max_time=60, num_workers=None, random_seed=42, clean_up=True)[source]
Solve one counterfactual query with the CP-SAT backend.
- Parameters:
x – Query instance in the processed feature space.
y – Target class enforced by the counterfactual.
norm – Integer distance norm used by the CP objective.
return_callback – Whether to record incumbent solutions during the search.
verbose – Whether to enable CP-SAT search logging.
max_time – Time limit in seconds.
num_workers – Optional number of CP-SAT workers.
random_seed – Random seed passed to CP-SAT.
clean_up – Whether to remove query-specific constraints after the solve.
- Returns:
The decoded counterfactual, or
Nonewhen no feasible counterfactual is found within the given limits.- Return type:
Explanation | None
- Raises:
RuntimeError – If CP-SAT reports an invalid model or an unexpected status.
- get_anytime_solutions()[source]
Return intermediate solutions collected during the last solve.
- Returns:
Time-stamped incumbent objective values when
return_callbackwas enabled inexplain(), otherwiseNone.- Return type:
list[dict[str, float]] | None
- get_distance()[source]
Return the post-processed distance of the last CF.
- Returns:
Post-processed \(L_p\) distance for the last successful solve.
- Return type:
float
- Raises:
RuntimeError – If no explanation has been computed yet.
- class Explanation(mapping=None, *, columns=None, validate=True)[source]
Bases:
Mapper[FeatureVar],BaseExplanationConcrete explanation container returned by the CP backend.
Initialize a mapper from feature metadata and transformed columns.
- Parameters:
mapping – Mapping from original feature names to metadata objects.
columns – Processed pandas index describing the transformed coordinates.
validate – Whether to verify that
mappingandcolumnsare consistent.
- property query
- property value
- property x
- class FeatureManager(mapper)[source]
Bases:
objectManage CP feature variables for the counterfactual point \(x\).
This manager owns the finite-domain variables representing the processed coordinates of the counterfactual explanation. The query \(\hat{x}\) only appears later in the objective terms.
Wrap the parsed feature mapper in CP-specific feature variables.
- FEATURE_VAR_FMT = 'feature[{key}]'
- property explanation
- property mapper
- property n_columns
- property n_features
- class FeatureVar(feature, name)[source]
Bases:
Var,FeatureKeeperCP variable bundle associated with a single parsed feature.
- X_VAR_NAME_FMT = 'x[{name}]'
- class Model(trees, mapper, *, weights=None, max_samples=0, epsilon=1, model_type=Type.CP)[source]
Bases:
BaseModel,FeatureManager,TreeManager,GarbageManagerConstraint-programming formulation behind
ocean.cp.Explainer.The feature variables encode the counterfactual point
xand the tree variables encode the active leaf decisionsp_(t,l).Initialize an empty CP-SAT model for a parsed ensemble.
- Parameters:
trees – Parsed trees contributing to the ensemble class scores.
mapper – Feature mapper aligning processed query coordinates with the finite-domain decision variables.
weights – Optional tree weights.
max_samples – Reserved parameter used by compatible higher-level explainers.
epsilon – Integer classification margin used in the pairwise score constraints.
model_type – Backend builder variant.
- DEFAULT_EPSILON = 1
- L1(x, v, code=None, *, norm=1)[source]
Build the CP contribution of one feature to \(d_p(x, \hat{x})^p\).
- Parameters:
x – Query coordinate \(\hat{x}_j\).
v – CP feature variable representing one original feature or one one-hot block in the counterfactual point \(x\).
code – Optional category code \(k\) when the feature is represented by one-hot variables \(u_{j,k}\).
norm – Distance norm \(p\) used to build \(d_p(x, \hat{x})^p\).
- Returns:
Scaled linear expression for the contribution of that feature to \(d_p(x, \hat{x})^p\).
- Return type:
cp.LinearExpr
- add_objective(x, *, norm=1)[source]
Minimize the scaled integer approximation of \(d_p(x, \hat{x})\).
- Parameters:
x – Query point \(\hat{x}\) in the processed feature space. The code parameter is named
x, but mathematically it represents the query.norm – Distance norm. The default is \(L_1\).
- build()[source]
Create the CP variables and structural constraints of the formulation.
This builds the finite-domain feature variables for \(x\), the leaf/path variables \(p_{t,\ell}\), and the consistency constraints connecting both representations.
- cleanup()[source]
Remove query-specific constraints from the CP model.
The persistent structure over \(x\) and \(p_{t,\ell}\) is kept, while objective auxiliaries and class constraints for the last query \(\hat{x}\) are discarded.
- get_intervals_cost(levels, x, *, norm=1)[source]
Return interval costs for a continuous feature encoded by thresholds.
For a continuous coordinate \(x_j\), CP does not optimize directly over the real value. Instead it selects an interval between successive threshold levels. This helper assigns each interval the scaled \(L_p^p\) distance to the query coordinate \(\hat{x}_j\).
- Returns:
Scaled costs for the threshold intervals representing that continuous feature.
- Return type:
list[int]
- class TreeManager(trees, *, weights=None, scale=10000000000)
Bases:
objectManage CP tree variables and the scaled decision function \(f\).
Each
TreeVarencodes the active leaf decisions \(p_{t,\ell}\) of one tree. Aggregating those variables yields the integer-scaled CP representation of \(f(x)\).Initialize the tree manager and the score-scaling factor.
- DEFAULT_SCORE_SCALE = 10000000000
- NUM_BINARY_CLASS = 2
- TREE_VAR_FMT = 'tree[{t}]'
- XGBOOST_DEFAULT_CLASS = 1
- build_trees(model)
Create the CP tree variables and cache \(f(x)\).
- property estimators
- property function
- property n_classes
- property n_estimators
- property n_trees
- property score_scale
- property shape
- property trees
- weighted_function(weights)
Return the integer-scaled CP representation of \(f(x)\).
The returned dictionary maps each output-class pair
(op, c)to the corresponding linear expression for \(f_c(x)\).- Returns:
Dictionary of scaled class-score expressions.
- Return type:
dict[tuple[NonNegativeInt, NonNegativeInt], cp.LinearExpr]
- property weights