Assisted specification¶
Assisted specification algorithm
biogeme.assisted module¶
File assisted.py
Author: Michel Bierlaire, EPFL
Date: Thu Sep 17 16:21:01 2020
Assisted specification for choice models
-
class
biogeme.assisted.
groupOfVariables
(name, variables, nonlinearSpecs)[source]¶ Bases:
object
Class representing groups of variables. All variables in the group will have the same nonlinear spec. They can also share the same coefficient.
-
__init__
(name, variables, nonlinearSpecs)[source]¶ Ctor
- Parameters
name (str) – name of the group of variables
variables – list of variables in the group
nonlinearSpecs (list(function)) – list of possible nonlinear specifications
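Each entry of nonlinearSpecs is a function of one argument that returns a (name, expression) pair, as described for theNonlinearSpecs in specificationProblem. The sketch below illustrates that shape only. It assumes plain floats in place of Biogeme expressions, and the names sqrt_transform, log_transform and apply_specs are hypothetical, not part of biogeme.assisted.

```python
import math


def sqrt_transform(x):
    """Return the label of the transform and the transformed value."""
    return 'sqrt', x ** 0.5


def log_transform(x):
    """Return the label of the transform and the transformed value."""
    return 'log', math.log(x)


# Hypothetical list of candidate specifications for one group of variables.
nonlinear_specs = [sqrt_transform, log_transform]


def apply_specs(specs, x):
    """Apply every candidate transform to x and collect the results by label."""
    return dict(spec(x) for spec in specs)


results = apply_specs(nonlinear_specs, 4.0)
print(results['sqrt'])
```

With real Biogeme expressions the functions would return an expression instead of a number, but the (name, expression) contract would be the same.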
-
activate
(yes)[source]¶ A group of variables can have one of two statuses: activated or not. This function changes the status.
- Parameters
yes (bool) – if True, activates the group. If False, deactivates the group.
-
active
¶ True if the group is active.
-
alwaysActive
¶ True if the group is always active.
-
generic
¶ True if the group is generic.
-
genericForbiden
¶ True if the group cannot be made generic.
-
getDecisions
()[source]¶ The decision is an integer representing the decisions with respect to the group of variables:
- Returns
decision with respect to the group of variables
-3 if it is inactive
-2 if it is active, generic and linear
-1 if it is active, alt. specific and linear
index of the nonlinear specification if active, generic and nonlinear.
100 plus the index of the nonlinear specification if active, alt. specific and nonlinear.
- Return type
int
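The integer encoding listed above can be sketched as a pair of helper functions. This is an illustration of the coding scheme, not the library's implementation. The names encode_decision and decode_decision are hypothetical, and the sketch assumes fewer than 100 nonlinear specifications per group, which the offset of 100 implies.

```python
def encode_decision(active, generic, linear, selection=None):
    """Encode the status of a group of variables as a single integer."""
    if not active:
        return -3
    if linear:
        return -2 if generic else -1
    if selection is None or not 0 <= selection < 100:
        raise ValueError('A nonlinear group needs an index between 0 and 99.')
    return selection if generic else 100 + selection


def decode_decision(decision):
    """Return (active, generic, linear, selection) for an encoded decision."""
    if decision == -3:
        return False, None, None, None
    if decision == -2:
        return True, True, True, None
    if decision == -1:
        return True, False, True, None
    if 0 <= decision < 100:
        return True, True, False, decision
    if decision >= 100:
        return True, False, False, decision - 100
    raise ValueError(f'Invalid decision: {decision}')


print(encode_decision(active=True, generic=False, linear=False, selection=2))
```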
-
linear
¶ True if linear specification.
-
makeGeneric
(yes)[source]¶ A group of variables can be generic or alternative specific. This function changes the status.
- Parameters
yes (bool) – if True, status is set to “generic”. If False, status is set to “alt. specific”.
- Raises
biogemeError – if the variable cannot be made generic.
-
name
¶ name of the group of variables.
-
nonlinearSpecs
¶ list of possible nonlinear specifications
-
selection
¶ Index of the selected nonlinear specification.
-
setDecisions
(decision)[source]¶ Implement the decision, after verifying its validity
- Parameters
decision –
-3 if it is inactive
-2 if it is active, generic and linear
-1 if it is active, alt. specific and linear
index of the nonlinear specification if active, generic and nonlinear.
100 plus the index of the nonlinear specification if active, alt. specific and nonlinear.
- Raises
biogemeError – if the decision is to deactivate the group while it must always be active.
-
setLinear
(yes)[source]¶ A group of variables can be linear or not. This function changes the status.
- Parameters
yes (bool) – if True, status is set to “linear”. If False, status is set to “nonlinear”.
-
setSelection
(sel)[source]¶ Set the selection of the nonlinear specification.
- Parameters
sel (int) – index of list self.nonlinearSpecs corresponding to the nonlinear spec.
- Raises
biogemeError – if the index is out of range.
-
variables
¶ list of variables in the group.
-
-
class
biogeme.assisted.
segmentation
(dictOfSocioEco)[source]¶ Bases:
object
Class representing the possible segmentations
-
alwaysActive
¶ True if it must always be active
-
dictOfSocioEco
¶ dict of objects of class socioEconomic characterizing the segmentation.
-
getDecisions
()[source]¶ The decision is a dict, where the keys are the names of the socioeconomic variables, and each value is a boolean indicating whether the variable is active.
- Returns
decision of activation of the socio-eco variables for the segmentation.
- Return type
dict(str: bool)
-
getExpression
(coef_name, bounds)[source]¶ Obtain the Biogeme expression.
- Parameters
coef_name (str) – name of the coefficient
bounds (tuple(float, float)) – bounds on the coefficient
- Returns
biogeme expression for the segmentation
- Return type
-
listOfVariables
¶ list of variables involved
-
setDecisions
(decisions)[source]¶ Implement the specification decisions, represented as a dict, where the keys are the names of the socioeconomic variables, and each value is a boolean indicating whether the variable is active.
- Parameters
decisions (dict(str: bool)) – decision of activation of the socio-eco variables for the segmentation.
- Raises
biogemeError – if the name of a segmentation is unknown
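A minimal sketch of how such a decision dict might be applied, assuming the activation status is held in a plain dict. The function set_decisions below is a hypothetical stand-alone helper, and ValueError stands in for biogemeError.

```python
def set_decisions(active_flags, decisions):
    """Apply activation decisions to a dict mapping variable name to status."""
    unknown = [name for name in decisions if name not in active_flags]
    if unknown:
        # The documented method raises biogemeError; ValueError is a stand-in.
        raise ValueError(f'Unknown socio-economic variable(s): {unknown}')
    updated = dict(active_flags)
    updated.update(decisions)
    return updated


flags = {'Income': False, 'Gender': False}
flags = set_decisions(flags, {'Gender': True})
print(flags)
```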
-
used
¶ True if used.
-
-
class
biogeme.assisted.
socioEconomic
(name, expression, values)[source]¶ Bases:
object
Class representing a socio-economic characteristic
-
__init__
(name, expression, values)[source]¶ Ctor
- Parameters
name (str) – name of the segmentation variable
expression (biogeme.expressions.Expression) – Biogeme expression of the variable
values (dict(int: str)) – dict with values that it can take as keys, and a name describing them as values.
-
active
¶ True if the segmentation variable is active.
-
combine
(existingValues)[source]¶ Generates the possible combinations of values, corresponding to segments.
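The argument and return value of combine are not documented here. One plausible reading, offered only as an assumption, is that segments are the Cartesian product of the values of the active socio-economic variables. The helper combine_segments below is hypothetical and works on plain dicts rather than socioEconomic objects.

```python
from itertools import product


def combine_segments(values_by_variable):
    """Enumerate all segments as one value label per variable."""
    names = list(values_by_variable)
    label_lists = [list(values_by_variable[name].values()) for name in names]
    return [dict(zip(names, combo)) for combo in product(*label_lists)]


segments = combine_segments({
    'Gender': {1: 'male', 2: 'female'},
    'GA': {0: 'no GA', 1: 'GA'},
})
print(len(segments))
```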
-
expression
¶ Biogeme expression of the variable
-
name
¶ name of the segmentation variable
-
values
¶ dict with values that it can take as keys, and a name describing them as values.
-
-
class
biogeme.assisted.
solution
[source]¶ Bases:
biogeme.vns.solutionClass
Class representing one solution, that is, one model specification.
-
causeInvalidity
¶ If the solution is invalid, contains the cause of the invalidity
-
decisions
¶ The decisions consist of:
a dict of decisions for each group of variables.
for each utility, a list of decisions for each term.
the selected model
Type: tuple(dict(str: int), dict(str: list(dict(str: bool))), int)
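The three-part structure of decisions can be illustrated with a small shape check. The values in example are made up for illustration, and has_expected_shape is a hypothetical helper, not a function of biogeme.assisted.

```python
def has_expected_shape(decisions):
    """Check the tuple(dict(str: int), dict(str: list(dict(str: bool))), int) shape."""
    if not isinstance(decisions, tuple) or len(decisions) != 3:
        return False
    groups, utilities, model = decisions
    groups_ok = isinstance(groups, dict) and all(
        isinstance(name, str) and isinstance(code, int)
        for name, code in groups.items()
    )
    utilities_ok = isinstance(utilities, dict) and all(
        isinstance(alt, str)
        and isinstance(terms, list)
        and all(
            isinstance(term, dict)
            and all(isinstance(flag, bool) for flag in term.values())
            for term in terms
        )
        for alt, terms in utilities.items()
    )
    return groups_ok and utilities_ok and isinstance(model, int)


# Hypothetical decisions: group codes, per-term segmentation flags, model index.
example = (
    {'Travel time': -1, 'Travel cost': 100},
    {'pt': [{'GA': True}, {'gender': False}], 'car': [{'GA': False}]},
    0,
)
print(has_expected_shape(example))
```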
-
description
¶ description of the solution
-
objectives
¶ values of the objectives
-
objectivesNames
¶ Names of the objective functions
-
valid
¶ True if the solution is valid
-
-
class
biogeme.assisted.
specificationProblem
(name, database, theVariables, theGroups, genericForbiden, forceActive, theNonlinearSpecs, theSegmentations, utilities, availabilities, choice, models)[source]¶ Bases:
biogeme.vns.problemClass
Class defining the choice model specification problem
-
__init__
(name, database, theVariables, theGroups, genericForbiden, forceActive, theNonlinearSpecs, theSegmentations, utilities, availabilities, choice, models)[source]¶ Ctor.
- Parameters
name (str) – name of the problem.
database (biogeme.database.Database) – data for the estimation
theVariables (dict(str: biogeme.expressions.Expression)) – variables involved in the model and their names
theGroups (dict(str: list(str))) – variables in the same group share the same transforms and activation status. Each group is characterized by its name, and is associated with a list of variables, identified by their names.
genericForbiden (list(str)) – groups of variables that must be alternative specific.
forceActive (list(str)) – groups of variables that must be in the model.
theNonlinearSpecs (dict(str: list( fct() ))) –
associates a group of variables or a variable with a list of possible nonlinear transformations. Each transformation is a function that takes one argument (the variable), and returns a tuple with
the name of the nonlinear transform
the expression of the transform.
Examples of such a function:
    def sqrt(x):
        return 'sqrt', x**0.5

    def boxcox(x):
        ell = Beta(f'lambda', 1, 0.0001, 3.0, 0)
        return 'Box-Cox', models.boxcox(x, ell)
theSegmentations (dict(str, tuple(biogeme.expressions.Expression, dict(int, str)))) –
a dictionary, with keys being names and values being tuples (var, segments), where
var is the name of the variable
segments is a dict with keys being the value of the variable characterizing a segment, and the value being the name of the segment.
Example:
    {'Income': (Income, {1: '<2500',
                         2: '2051_4000',
                         3: '4001_6000',
                         4: '6001_8000',
                         5: '8001_10000',
                         6: '>10000',
                         -1: 'unknown'}),
     'Gender': (Gender, {1: 'male',
                         2: 'female',
                         -1: 'unknown'}),
    }
utilities (dict(int, tuple(str, list(tuple(str, str, tuple(float, float),function))))) –
specification of the utility functions. It is a dict where
the keys are the ID of the alternatives.
the values are a tuple containing the name of the alternative and the specification.
The specification is a list of terms. A term is a tuple with the name of the variable, the name of the segmentation, the bounds on the coefficient, and a function checking the validity of the corresponding parameter (typically, checking its sign). All can be None. If they are all None, the term corresponds to the alternative specific constant, without any segmentation and without any assumption on the sign.
Example:
    utility_pt = [('PT cte', 'Seg. cte', (None, None), None),
                  ('PT travel time', 'Seg. time', (None, 0), None),
                  ('PT travel cost', 'Seg. PT cost', (None, None), isNegative),
                  ('PT Waiting time', 'Seg. wait', (None, 0), None)]

    utility_car = [('Car cte', 'Seg. cte', (None, None), None),
                   ('Car travel time', 'Seg. time', (None, 0), None),
                   ('Car travel cost', 'Seg. car cost', (None, None), isNegative),
                   ('Nbr of cars', 'Seg nbr cars', (None, None), None)]

    utility_sm = [('Distance', 'Seg. dist', (None, None), isNegative)]

    choiceModel = {0: ('pt', utility_pt),
                   1: ('car', utility_car),
                   2: ('sm', utility_sm)}
availabilities (dict(int, biogeme.expressions.Expression)) – dict describing the availability of the alternatives.
choice (biogeme.expressions.Expression) – expression for the observed choice
models (dict(str, fct)) – dict of possible models. A model is a function that takes the utilities and the availabilities, and returns the loglikelihood expression.
- Raises
biogemeError – if a variable is found in two different groups.
biogemeError – if the list of groups forbidden to be generic contains an unknown group.
biogemeError – if the list of groups forced to be active contains an unknown group.
biogemeError – if some variables are not in any group.
biogemeError – if a segmentation in a utility function is unknown.
biogemeError – if a variable in a utility function is unknown.
biogemeError – if some variables are not used.
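Two of the consistency conditions listed above (a variable in two groups, a variable in no group) can be sketched as a stand-alone check. The function check_groups is hypothetical and operates on plain names. ValueError stands in for biogemeError, and the real constructor performs further checks that are not reproduced here.

```python
def check_groups(variable_names, groups):
    """Raise ValueError unless every variable belongs to exactly one group."""
    owner = {}
    for group_name, members in groups.items():
        for var in members:
            if var in owner:
                raise ValueError(
                    f'Variable {var} is in groups {owner[var]} and {group_name}'
                )
            owner[var] = group_name
    missing = sorted(set(variable_names) - set(owner))
    if missing:
        raise ValueError(f'Variables not in any group: {missing}')
    return owner


owner = check_groups(
    ['PT travel time', 'Car travel time', 'Distance'],
    {'Travel time': ['PT travel time', 'Car travel time'],
     'Distance': ['Distance']},
)
print(owner['Car travel time'])
```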
-
applyOperator
(name, size=1)[source]¶ Apply an operator.
- Parameters
name (str) – name of the operator to apply
size (int) – size of the neighborhood
- Returns
total number of changes actually made on the solution
- Return type
int
- Raises
biogemeError – if the name of the operator is unknown.
-
archive
¶ Dictionary, where the keys are solutions (objects of type biogeme.vns.solutionClass) and the values are the estimation results (objects of type biogeme.results.bioResults).
-
availability
¶ dict describing the availability of the alternatives.
-
changeModel
(size=1, audit=False)[source]¶ Select randomly another model from the list
- Parameters
size (int) – not used here. Must be there for compliance with the call of operators.
audit (bool) – if True, returns the number of changes without actually implementing them.
- Returns
0 if no other model could be found. 1 otherwise.
- Return type
int
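The behavior described for changeModel (pick another model at random, report 0 or 1 change, and only count the change in audit mode) might look as follows. This is a sketch under stated assumptions: change_model is a hypothetical free function that works on a model index, and it returns the new index alongside the number of changes, whereas the documented method returns only the count.

```python
import random


def change_model(selected, number_of_models, rng=random, audit=False):
    """Return (new_selection, changes) after trying to pick another model."""
    candidates = [i for i in range(number_of_models) if i != selected]
    if not candidates:
        # No other model exists: nothing can change.
        return selected, 0
    if audit:
        # Report that one change is possible without applying it.
        return selected, 1
    return rng.choice(candidates), 1


new_index, changes = change_model(1, 3, rng=random.Random(0))
print(changes)
```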
-
changeSegmentation
(size=1, audit=False)[source]¶ Change the interactions while keeping their number constant.
-
checkAvailability
()[source]¶ Check the availability of each operator.
- Returns
a dictionary with the availability status of each operator
- Return type
dict(str: bool)
-
choice
¶ expression for the observed choice
-
database
¶ object of type biogeme.database.Database, containing the data.
-
decisions
¶ The decisions consist of:
a dict of decisions for each group of variables.
for each utility, a list of decisions for each term.
the selected model
Type: tuple(dict(str: int), dict(str: list(dict(str: bool))), int)
-
describe
(aSolution)[source]¶ Generates (if necessary) and returns a short description of the solution.
- Parameters
aSolution (class solution) – solution that must be described.
- Returns
description of the solution
- Return type
str
-
describeCurrentModel
()[source]¶ Generates a description of the current model
- Returns
model description.
- Return type
str
-
describeHtml
()[source]¶ Generates a description of the model in HTML format.
- Returns
description of the model in HTML format.
- Return type
str
-
evaluate
(aSolution)[source]¶ Evaluate the objective functions of the solution and store them in the solution object.
- Parameters
aSolution (solutionClass) – solution to be evaluated
- Returns
results of the estimation
- Return type
-
generateNeighbor
(aSolution, neighborhoodSize)[source]¶ Generates a neighbor of the solution.
- Parameters
aSolution (class solution) – solution to be modified
neighborhoodSize (int) – size of the neighborhood to be applied
- Returns
a neighbor solution, and the number of changes that have been actually applied.
- Return type
tuple(class solution, int)
-
generateSolution
(nonlinearSpecs, segmentations, model)[source]¶ Generate a solution for the VNS algorithm
- Parameters
nonlinearSpecs (dict(str: tuple(int, bool))) –
nonlinear specifications. It is a dictionary where
the keys correspond to groups of variables,
the values are tuple with two entries:
index in the list self.nonlinearSpecs corresponding to the nonlinear spec, or None if linear,
a boolean that is True if the coefficient is generic, False otherwise.
Example:
nl = {'Travel time': (0, False), 'Travel cost': (0, False), 'Headway': (0, False)}
segmentations (dict(str: list(str))) –
dictionary where the keys are the name of the segmentations, and the values are lists of socio-economic characteristics that must be activated.
Example:
sg = {'Seg. cte': ['GA'], 'Seg. cost': ['class', 'who'], 'Seg. time': ['gender'], 'Seg. headway': ['class']}
model (str) – selected model
- Returns
the solution that has been generated
- Return type
class solution
- Raises
biogemeError – if a variable is set to generic. Only groups of variables can be made generic.
biogemeError – if a group of variables is unknown
biogemeError – if an error occurs in setting the segmentation decisions
biogemeError – if the model is unknown.
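The two dictionary arguments can be related to the decision formats used elsewhere in this module. The sketch below is an assumption about that relationship, based on the decision codes listed for groupOfVariables.getDecisions, not a transcription of the library's code. Both function names are hypothetical.

```python
def group_decisions(nonlinear_specs):
    """Map {group: (index or None, generic)} to integer decision codes."""
    decisions = {}
    for group, (index, generic) in nonlinear_specs.items():
        if index is None:
            # Linear specification: -2 if generic, -1 if alt. specific.
            decisions[group] = -2 if generic else -1
        else:
            # Nonlinear specification: index if generic, 100 + index otherwise.
            decisions[group] = index if generic else 100 + index
    return decisions


def segmentation_decisions(selected, available):
    """Mark as active the characteristics listed for each segmentation."""
    return {
        seg: {name: name in selected.get(seg, []) for name in names}
        for seg, names in available.items()
    }


nl = {'Travel time': (0, False), 'Headway': (None, True)}
sg = {'Seg. cte': ['GA']}
print(group_decisions(nl))
print(segmentation_decisions(sg, {'Seg. cte': ['GA', 'gender']}))
```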
-
getDecisions
()[source]¶ The decisions consist of:
a dict of decisions for each group of variables.
for each utility, a list of decisions for each term.
the selected model
- Returns
all decisions
- Return type
tuple(dict(str: int), dict(str: list(dict(str: bool))), int)
-
getSolution
()[source]¶ Generate an object of the class
solution
- Returns
the solution that has been generated
- Return type
class solution
-
isValid
(aSolution)[source]¶ Evaluate the validity of the solution.
- Parameters
aSolution (class solution) – solution to be checked
- Returns
valid, why, where valid is True if the solution is valid and False otherwise, and why contains an explanation of why it is invalid.
- Return type
tuple(bool, str)
-
lastOperator
¶ Last operator used
-
maximumNumberOfParameters
¶ maximum number of parameters allowed in a specification. If the current model has more parameters, it is declared invalid and rejected by the algorithm.
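As an illustration of this rule, a validity check on the parameter count could return the same (valid, why) pair as isValid. The function check_parameter_count is hypothetical, and treating None as "no limit" is an assumption of this sketch.

```python
def check_parameter_count(number_of_parameters, maximum):
    """Return (valid, why) for a model with the given number of parameters."""
    if maximum is not None and number_of_parameters > maximum:
        why = f'Too many parameters: {number_of_parameters} > {maximum}'
        return False, why
    return True, None


print(check_parameter_count(12, 10))
```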
-
models
¶ List of tuples (name, model). A model is a function that takes the utilities and the availabilities, and returns the loglikelihood expression.
-
name
¶ name of the problem.
-
neighborAccepted
(aSolution, aNeighbor)[source]¶ Notify that a neighbor has been accepted by the algorithm. Used to update the statistics on the operators.
- Parameters
aSolution (solutionClass) – solution modified. Not used in this implementation.
aNeighbor (solutionClass) – neighbor
- Raises
biogemeError – if no operator has been used yet.
-
neighborRejected
(aSolution, aNeighbor)[source]¶ Notify that a neighbor has been rejected by the algorithm. Used to update the statistics on the operators.
- Parameters
aSolution (class solution) – solution modified. Not used in this implementation.
aNeighbor (solutionClass) – neighbor
- Raises
biogemeError – if no operator has been used yet.
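The bookkeeping implied by neighborAccepted and neighborRejected can be sketched with a small counter class. OperatorStats is a hypothetical illustration and does not reproduce biogeme.vns.operatorsManagement. RuntimeError stands in for biogemeError, and the operator names in the usage lines are only examples.

```python
class OperatorStats:
    """Count accepted and rejected neighbors per operator."""

    def __init__(self, operator_names):
        self.accepted = {name: 0 for name in operator_names}
        self.rejected = {name: 0 for name in operator_names}
        self.last_operator = None

    def record_use(self, name):
        """Remember which operator produced the latest neighbor."""
        if name not in self.accepted:
            raise ValueError(f'Unknown operator: {name}')
        self.last_operator = name

    def _last(self):
        if self.last_operator is None:
            # Stand-in for the biogemeError raised by the documented methods.
            raise RuntimeError('No operator has been used yet.')
        return self.last_operator

    def neighbor_accepted(self):
        """Credit the last operator with an accepted neighbor."""
        self.accepted[self._last()] += 1

    def neighbor_rejected(self):
        """Charge the last operator with a rejected neighbor."""
        self.rejected[self._last()] += 1


stats = OperatorStats(['changeModel', 'changeSegmentation'])
stats.record_use('changeModel')
stats.neighbor_accepted()
stats.neighbor_rejected()
print(stats.accepted, stats.rejected)
```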
-
operators
¶ Dict of operators, where the keys are the names, and the values are the functions implementing the operators.
-
operatorsManagement
¶ Object of type
biogeme.vns.operatorsManagement
-
selectedModel
¶ index of the selected model
-
setDecisions
(decisions)[source]¶ Implement the specification decisions
- Parameters
decisions (tuple(dict(str: int), dict(str: list(dict(str: bool))), int)) – specification decisions
- Raises
biogemeError – if a group of variables is unknown.
biogemeError – if an alternative is unknown.
biogemeError – if the length of decisions is inconsistent with the number of terms in the utility function.
biogemeError – if the number of the model is out of range.
-
setSolution
(aSolution)[source]¶ Import a solution defined as an object of class solution.
- Parameters
aSolution (class solution) – solution to be imported.
- Raises
biogemeError – if the object has the wrong type.
-
theAlternatives
¶ Dict of utility functions, where the keys are the id of the alternatives, and the values are objects of type
biogeme.assisted.utility
-
theGroups
¶ dict of groups of variables, where the keys are the names, and the values are objects of class
groupOfVariables
-
theSegmentations
¶ dict of segmentations, where the keys are the names, and the values are objects of class
segmentation
-
theVariables
¶ dict of variables, where the keys are the names, and the values are objects of class
variable
-
utilities
¶ specification of the utility functions. See
biogeme.assisted.specificationProblem.__init__
-
-
class
biogeme.assisted.
term
(var, aSegmentation, bounds, validity)[source]¶ Bases:
object
Class representing the possible specifications of one term of the utility function
-
__init__
(var, aSegmentation, bounds, validity)[source]¶ Ctor
- Parameters
var (variable) – variable of the term.
aSegmentation (segmentation) – discrete segmentation for the parameter
bounds (tuple(float, float)) – bounds on the coefficient
validity (bool f(float)) – function checking the validity of the coefficient.
-
bounds
¶ bounds on the coefficient
-
coef_names
¶ names of the Beta parameters involved in the specification of the term.
-
describe
()[source]¶ Provides a short description of the term.
- Returns
short description.
- Return type
str
-
getBeta
(altname)[source]¶ Obtain the name of the coefficient of the term
- Parameters
altname (str) – name of the alternative
- Returns
name of the coefficient
- Return type
str
-
getDecisions
()[source]¶ The decision is a dict, where the keys are the names of the socioeconomic variables, and each value is a boolean indicating whether the variable is active.
- Returns
decision of activation of the socio-eco variables for the segmentation.
- Return type
dict(str: bool)
-
getExpression
(altname)[source]¶ Build the Biogeme expression for the term
- Parameters
altname (str) – name of the alternative
- Returns
expression for the term
- Return type
-
isValid
(altname, estimationResults)[source]¶ Check the validity of the estimated coefficient
- Parameters
altname (str) – name of the alternative
estimationResults (biogeme.results.bioResults) – results of the estimation with Biogeme
- Returns
True if the value is valid, False otherwise.
- Return type
bool
- Raises
biogemeError – if the parameter has not been estimated.
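A validity function has the signature bool f(float). The sketch below shows one such function and how a check on an estimated coefficient might use it. It assumes the estimates are available as a plain dict of name to value rather than a bioResults object. is_negative and coefficient_is_valid are hypothetical names, and KeyError stands in for biogemeError.

```python
def is_negative(value):
    """Validity function: the coefficient must be strictly negative."""
    return value < 0


def coefficient_is_valid(beta_name, estimates, validity=None):
    """Check one estimated coefficient against its validity function."""
    if beta_name not in estimates:
        # The documented method raises biogemeError; KeyError is a stand-in.
        raise KeyError(f'Parameter {beta_name} has not been estimated')
    if validity is None:
        # No validity function means no restriction on the value.
        return True
    return bool(validity(estimates[beta_name]))


# Hypothetical estimation results.
estimates = {'beta_cost_pt': -0.8, 'beta_cost_car': 0.1}
print(coefficient_is_valid('beta_cost_pt', estimates, is_negative))
```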
-
segmentation
¶ discrete segmentation of the parameter
-
setDecisions
(decisions)[source]¶ Implement the specification decisions, represented as a dict, where the keys are the names of the socioeconomic variables, and each value is a boolean indicating whether the variable is active.
- Parameters
decisions (dict(str: bool)) – decision of activation of the socio-eco variables for the segmentation.
-
validity
¶ function checking the validity of the coefficient.
-
var
¶ variable of the term
-
-
class
biogeme.assisted.
utility
(alternativeId, name, terms)[source]¶ Bases:
object
Class representing the possible specifications of a utility function
-
__init__
(alternativeId, name, terms)[source]¶ Ctor.
- Parameters
alternativeId (int) – id of the alternative.
name (str) – name of the alternative
terms (list(term)) – terms of the utility function.
-
getExpression
()[source]¶ Obtain the Biogeme expression for the utility function.
- Returns
Biogeme expression
- Return type
-
id
¶ id of the alternative
-
name
¶ name of the alternative
-
terms
¶ list of terms in the utility function
-
-
class
biogeme.assisted.
variable
(name, expression)[source]¶ Bases:
object
Class representing the possible specifications of a variable
-
__init__
(name, expression)[source]¶
- Parameters
name (str) – name of the variable
expression (biogeme.expressions.Expression) – Biogeme expression of the variable.
-
active
¶ True if variable is active
-
expression
¶ Biogeme expression for the variable
-
generic
¶ True if the variable is generic.
-
genericName
¶ Name of the generic variable
-
getExpression
()[source]¶ Returns the biogeme expression of the specification of the variable.
- Returns
expression accounting for the status and the nonlinear specification.
- Return type
-
makeGeneric
(yes, name)[source]¶ A variable can have one of two statuses: generic or alternative specific. This function changes the status.
- Parameters
yes (bool) – if True, status is set to “generic”. If False, status is set to “alt. specific”.
name (str) – name of the generic version of the variable
-
name
¶ name of the variable
-
nonlinearSpec
¶ Function with the nonlinear specifications.
-
used
¶ True if variable used in the model.
-