Assisted specification

Assisted specification algorithm

biogeme.assisted module

File assisted.py. Author: Michel Bierlaire, EPFL. Date: Thu Sep 17 16:21:01 2020

Assisted specification for choice models

class biogeme.assisted.SegmentedParameterTuple(dict, combinatorial)

Bases: tuple

combinatorial

Alias for field number 1

dict

Alias for field number 0

class biogeme.assisted.TermTuple(attribute, segmentation, bounds, validity)

Bases: tuple

attribute

Alias for field number 0

bounds

Alias for field number 2

segmentation

Alias for field number 1

validity

Alias for field number 3
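
The field aliases above behave like those of a standard named tuple. The following is a minimal sketch, assuming both tuples are built with collections.namedtuple (the documentation only states that they derive from tuple). The sample values are illustrative.

```python
from collections import namedtuple

# Hypothetical reconstruction: field names and order follow the aliases listed above.
SegmentedParameterTuple = namedtuple('SegmentedParameterTuple', 'dict combinatorial')
TermTuple = namedtuple('TermTuple', 'attribute segmentation bounds validity')

a_term = TermTuple(
    attribute='PT travel time',
    segmentation='Seg. time',
    bounds=(None, 0),
    validity=None,
)

# Each field can be read by name or by position.
print(a_term.bounds)
print(a_term[2] == a_term.bounds)
```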

class biogeme.assisted.groupOfVariables(name, variables, nonlinearSpecs)[source]

Bases: object

Class representing groups of variables. All variables in the group will have the same nonlinear spec. They can also share the same coefficient.

__init__(name, variables, nonlinearSpecs)[source]

Constructor.

Parameters
  • name (str) – name of the group of variables

  • variables – list of variables in the group

  • nonlinearSpecs (list(function)) – list of possible nonlinear specifications

activate(yes)[source]

A group of variables has one of two statuses: active or inactive. This function changes the status.

Parameters

yes (bool) – if True, activates the group. If False, deactivates the group.

active

True if the group is active.

alwaysActive

True if the group is always active.

forbidGeneric()[source]

Forbid the generic specification in the group

forceActive()[source]

Force the variables in the group to be active.

generic

True if the group is generic.

genericForbiden

True if the group cannot be made generic

getDecisions()[source]

The decision is an integer representing the decisions with respect to the group of variables:

Returns

decision with respect to the group of variables

  • -3 if it is inactive

  • -2 if it is active, generic and linear

  • -1 if it is active, alt. specific and linear

  • index of the nonlinear specification if active, generic and nonlinear.

  • 100 plus the index of the nonlinear specification if active, alt. specific and nonlinear.

Return type

int
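
The integer encoding listed above can be illustrated with a pair of helper functions. This is only a sketch of the mapping as documented, not the library's implementation. The function names and the (active, generic, linear, selection) layout are assumptions, and the scheme implicitly requires fewer than 100 nonlinear specifications per group.

```python
def encode_decision(active, generic, linear, selection=0):
    """Map the status of a group to the integer code listed above."""
    if not active:
        return -3
    if linear:
        return -2 if generic else -1
    # Nonlinear: the index itself if generic, 100 + index if alt. specific.
    return selection if generic else 100 + selection


def decode_decision(decision):
    """Inverse mapping: return (active, generic, linear, selection)."""
    if decision == -3:
        return False, None, None, None
    if decision == -2:
        return True, True, True, None
    if decision == -1:
        return True, False, True, None
    if decision >= 100:
        return True, False, False, decision - 100
    if decision >= 0:
        return True, True, False, decision
    raise ValueError(f'Invalid decision: {decision}')
```

For instance, an active, alternative specific group using the nonlinear specification of index 2 is encoded as 102.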

linear

True if linear specification.

makeGeneric(yes)[source]

A group of variables can be generic or alternative specific. This function changes the status.

Parameters

yes (bool) – if True, status is set to “generic”. If False, status is set to “alt. specific”.

Raises

biogemeError – if the variable cannot be made generic.

name

name of the group of variables.

nonlinearSpecs

list of possible nonlinear specifications

selection

Index of the selected nonlinear specification.

setDecisions(decision)[source]

Implement the decision, after verifying its validity

Parameters

decision

  • -3 if it is inactive

  • -2 if it is active, generic and linear

  • -1 if it is active, alt. specific and linear

  • index of the nonlinear specification if active, generic and nonlinear.

  • 100 plus the index of the nonlinear specification if active, alt. specific and nonlinear.

Raises

biogemeError – if the decision is to deactivate, while the group should always be active.

setLinear(yes)[source]

A group of variables can be linear or not. This function changes the status.

Parameters

yes (bool) – if True, status is set to “linear”. If False, status is set to “nonlinear”.

setSelection(sel)[source]

Set the selection of the nonlinear specification.

Parameters

sel (int) – index of list self.nonlinearSpecs corresponding to the nonlinear spec.

Raises

biogemeError – if the index is out of range.

swapActivate()[source]

Change the activation status

swapGeneric()[source]

Change the generic/alt. specific status

swapLinear()[source]

Change the linearity status.

variables

list of variables in the group.

class biogeme.assisted.segmentation(name, dictOfSocioEco, combinatorial)[source]

Bases: object

Class representing the possible segmentations

__init__(name, dictOfSocioEco, combinatorial)[source]

Constructor.

alwaysActive

True if it must always be active

combinatorial

True if all combinations are considered

describe()[source]

Description of the segmentation

Returns

description

Return type

str

dictOfSocioEco

dict of objects of class socioEconomic characterizing the segmentation.

getBetaNames(coef_name)[source]

Get the name of the parameters for all combinations.

getDecisions()[source]

The decision is a dict, where the keys are the names of the socioeconomic variables, and the values are booleans indicating whether each variable is active.

Returns

decision of activation of the socio-eco variables for the segmentation.

Return type

dict(str: bool)

getExpression(coef_name, bounds)[source]

Obtain the Biogeme expression.

Parameters
  • coef_name (str) – name of the coefficient

  • bounds (tuple(float, float)) – bounds on the coefficient

Returns

biogeme expression for the segmentation

Return type

biogeme.expressions.bioMultSum

isActive()[source]

Check if there are active variables for this segmentation.

listOfVariables

list of variables involved

name

name of the segmentation

setDecisions(decisions)[source]

Implement the specification decisions, represented as a dict, where the keys are the names of the socioeconomic variables, and the values are booleans indicating whether each variable is active.

Parameters

decisions (dict(str: bool)) – decision of activation of the socio-eco variables for the segmentation.

Raises

biogemeError – if the name of a segmentation is unknown
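
The dict-based decisions of a segmentation can be sketched with a small stand-in class. Everything below is hypothetical: the class name, the snake_case method names and the use of ValueError in place of biogemeError are assumptions made so that the example runs without Biogeme.

```python
class SegmentationSketch:
    """Hypothetical stand-in tracking which socio-economic variables are active."""

    def __init__(self, names):
        # All variables start inactive.
        self.active = {name: False for name in names}

    def set_decisions(self, decisions):
        """Apply a dict(str: bool) of activation decisions."""
        for name, status in decisions.items():
            if name not in self.active:
                # The library raises biogemeError; ValueError stands in for it.
                raise ValueError(f'Unknown socio-economic variable: {name}')
            self.active[name] = status

    def get_decisions(self):
        """Return a copy of the current decisions."""
        return dict(self.active)

    def is_active(self):
        """True if at least one variable is active."""
        return any(self.active.values())


seg = SegmentationSketch(['class', 'who'])
seg.set_decisions({'class': True})
print(seg.get_decisions())
```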

used

True if used.

class biogeme.assisted.socioEconomic(name, expression, values)[source]

Bases: object

Class representing socio-economic characteristic

__init__(name, expression, values)[source]

Constructor.

Parameters
  • name (str) – name of the segmentation variable

  • expression (biogeme.expressions.Expression) – Biogeme expression of the variable

  • values (dict(int: str)) – dict with values that it can take as keys, and a name describing them as values.

active

True if the segmentation variable is active.

combine(existingValues)[source]

Generates the possible combinations of values, corresponding to segments.
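
As an illustration of what "combinations of values" means, the sketch below enumerates segments with itertools.product. It does not reproduce the signature of combine(existingValues), and the helper name and the underscore-joined labels are assumptions.

```python
from itertools import product

# Illustrative value dictionaries, in the format of the `values` attribute.
gender = {1: 'male', 2: 'female'}
income = {1: 'low', 2: 'high'}


def combine_segments(*value_dicts):
    """Return one label per combination of values (hypothetical naming scheme)."""
    return ['_'.join(labels) for labels in product(*(d.values() for d in value_dicts))]


segments = combine_segments(gender, income)
print(segments)  # ['male_low', 'male_high', 'female_low', 'female_high']
```

With two variables taking two values each, four segments are obtained. The number of segments grows multiplicatively with each additional variable.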

expression

Biogeme expression of the variable

name

name of the segmentation variable

values

dict with values that it can take as keys, and a name describing them as values.

class biogeme.assisted.solution[source]

Bases: solutionClass

Class representing one solution, that is, one model specification.

__init__()[source]

causeInvalidity

If the solution is invalid, contains the cause of the invalidity

decisions

The decisions consist of:

  • a dict of decisions for each group of variables.

  • for each utility, a list of decisions for each term.

  • the selected model

Type: tuple(dict(str: int), dict(str: list(dict(str: bool))), int)
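
A concrete value of this type may help in reading the signature. The sketch below is illustrative only: the group, alternative and socio-economic names are made up, the keys of the second dict are assumed to be alternative names, and is_well_formed is a hypothetical helper, not part of the module.

```python
# Decisions for each group of variables (integer codes).
group_decisions = {'Travel time': -2, 'Travel cost': 101}

# For each utility, one dict of activation decisions per term.
term_decisions = {
    'pt': [{'GA': True}, {'gender': False}],
    'car': [{'GA': False}],
}

# Index of the selected model.
selected_model = 0

decisions = (group_decisions, term_decisions, selected_model)


def is_well_formed(decisions):
    """Check the shape tuple(dict(str: int), dict(str: list(dict(str: bool))), int)."""
    if not (isinstance(decisions, tuple) and len(decisions) == 3):
        return False
    groups, terms, model = decisions
    groups_ok = all(isinstance(k, str) and isinstance(v, int) for k, v in groups.items())
    terms_ok = all(
        isinstance(name, str)
        and all(isinstance(flag, bool) for term in term_list for flag in term.values())
        for name, term_list in terms.items()
    )
    return groups_ok and terms_ok and isinstance(model, int)


print(is_well_formed(decisions))
```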

description

description of the solution

objectives

values of the objectives

objectivesNames

Names of the objective functions

valid

True if the solution is valid

class biogeme.assisted.specificationProblem(name, database, theVariables, theGroups, genericForbiden, forceActive, theNonlinearSpecs, theSegmentations, utilities, availabilities, choice, models)[source]

Bases: problemClass

Class defining the choice model specification problem

__init__(name, database, theVariables, theGroups, genericForbiden, forceActive, theNonlinearSpecs, theSegmentations, utilities, availabilities, choice, models)[source]

Constructor.

Parameters
  • name (str) – name of the problem.

  • database (biogeme.database.Database) – data for the estimation

  • theVariables (dict(str: biogeme.expressions.Expression)) – variables involved in the model and their names

  • theGroups (dict(str: list(str))) – variables in the same groups share the same transforms and activation status. Each group is characterized by its name, and is associated to a list of variables, identified by their name.

  • genericForbiden (list(str)) – groups of variables that must be alternative specific.

  • forceActive (list(str)) – groups of variables that must be in the model.

  • theNonlinearSpecs (dict(str: list( fct() ))) –

    associates a group of variables or a variable with a list of possible nonlinear transformations. Each transformation is a function that takes one argument (the variable), and returns a tuple with

    • the name of the nonlinear transform

    • the expression of the transform.

    Examples of such a function:

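    from biogeme import models
    from biogeme.expressions import Beta

    def sqrt(x):
        # Square-root transform of the variable
        return 'sqrt', x**0.5

    def boxcox(x):
        # Box-Cox transform with a parameter lambda bounded in [0.0001, 3.0]
        ell = Beta('lambda', 1, 0.0001, 3.0, 0)
        return 'Box-Cox', models.boxcox(x, ell)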
    

  • theSegmentations (dict(str, tuple(biogeme.expressions.Expression, dict(int, str)))) –

    a dictionary, with keys being names and values being tuples (var, segments), where

    • var is the name of the variable

    • segments is a dict with keys being the value of the variable characterizing a segment, and the value being the name of the segment.

    Example:

    {'Income': (Income, {1: '<2500',
                         2: '2501_4000',
                         3: '4001_6000',
                         4: '6001_8000',
                         5: '8001_10000',
                         6: '>10000',
                         -1: 'unknown'}),
     'Gender': (Gender, {1: 'male',
                         2: 'female',
                         -1: 'unknown'})}
    

  • utilities (dict(int, tuple(str, list(tuple(str, str, tuple(float, float),function))))) –

    specification of the utility functions. It is a dict where

    • the keys are the ID of the alternatives.

    • the values are a tuple containing the name of the alternative and the specification.

    The specification is a list of terms. A term is a tuple with the name of the variable, the name of the segmentation, the bounds on the coefficient, and a function checking the validity of the corresponding parameter (typically, checking its sign). All can be None. If they are all None, the term corresponds to the alternative specific constant, without any segmentation and without any assumption on the sign.

    Example:

    utility_pt = [
        ('PT cte', 'Seg. cte', (None, None), None),
        ('PT travel time', 'Seg. time', (None, 0), None),
        ('PT travel cost', 'Seg. PT cost', (None, None), isNegative),
        ('PT Waiting time', 'Seg. wait', (None, 0), None),
    ]

    utility_car = [
        ('Car cte', 'Seg. cte', (None, None), None),
        ('Car travel time', 'Seg. time', (None, 0), None),
        ('Car travel cost', 'Seg. car cost', (None, None), isNegative),
        ('Nbr of cars', 'Seg nbr cars', (None, None), None),
    ]

    utility_sm = [('Distance', 'Seg. dist', (None, None), isNegative)]

    choiceModel = {0: ('pt', utility_pt),
                   1: ('car', utility_car),
                   2: ('sm', utility_sm)}
    

  • availabilities (dict(int, biogeme.expressions.Expression)) – dict describing the availability of the alternatives.

  • choice (biogeme.expressions.Expression) – expression for the observed choice

  • models (dict(str, fct)) – dict of possible models. A model is a function that takes the utilities and the availabilities, and returns the loglikelihood expression.

Raises
  • biogemeError – if a variable is found in two different groups.

  • biogemeError – if the list of groups forbidden to be generic contains an unknown group.

  • biogemeError – if the list of groups forced to be active contains an unknown group.

  • biogemeError – if some variables are not in any group.

  • biogemeError – if a segmentation in a utility function is unknown.

  • biogemeError – if a variable in a utility function is unknown.

  • biogemeError – if some variables are not used.
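
Two of the consistency checks listed above (a variable in two groups, a variable in no group) can be sketched in a few lines. This is not the library's code: the function name is hypothetical, and ValueError stands in for biogemeError so that the example runs without Biogeme.

```python
def check_groups(variables, groups):
    """Return a variable -> group mapping, or raise if the grouping is inconsistent.

    variables: iterable of variable names.
    groups: dict(str: list(str)) mapping group names to variable names.
    """
    owner = {}
    for group, members in groups.items():
        for var in members:
            if var in owner:
                raise ValueError(
                    f'Variable {var} is in groups {owner[var]} and {group}'
                )
            owner[var] = group
    missing = sorted(set(variables) - set(owner))
    if missing:
        raise ValueError(f'Variables not in any group: {missing}')
    return owner


the_variables = ['PT travel time', 'Car travel time', 'PT travel cost']
the_groups = {
    'Travel time': ['PT travel time', 'Car travel time'],
    'Travel cost': ['PT travel cost'],
}
print(check_groups(the_variables, the_groups))
```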

applyOperator(name, size=1)[source]

Apply an operator.

Parameters
  • name (str) – name of the operator to apply

  • size (int) – size of the neighborhood

Returns

total number of changes actually made on the solution

Return type

int

Raises

biogemeError – if the name of the operator is unknown.

archive

Dictionary, where the keys are solutions (objects of type biogeme.vns.solutionClass) and the values are the estimation results (objects of type biogeme.results.bioResults).

availability

dict describing the availability of the alternatives.

changeGeneric(size=1, audit=False)[source]

Change generic vs alternative specific status

changeLinearity(size=1, audit=False)[source]

Make linear if nonlinear, and the other way around.

changeModel(size=1, audit=False)[source]

Randomly select another model from the list

Parameters
  • size (int) – not used here. Must be there for compliance with the call of operators.

  • audit (bool) – if True, returns the number of changes without actually implementing them.

Returns

0 if no other model could be found. 1 otherwise.

Return type

int
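
The behavior described for changeModel, including the audit flag, can be sketched as a pure function. Unlike the method, which modifies the problem and returns only the number of changes, this hypothetical helper returns the new index as well. Its name and arguments are assumptions.

```python
import random


def change_model(selected, n_models, audit=False, rng=random):
    """Return (new_selected, number_of_changes).

    selected: index of the currently selected model.
    n_models: total number of models in the list.
    audit: if True, report the number of changes without applying them.
    """
    candidates = [i for i in range(n_models) if i != selected]
    if not candidates:
        # No other model is available: nothing can change.
        return selected, 0
    if audit:
        # One change would be possible, but it is not applied.
        return selected, 1
    return rng.choice(candidates), 1


new_index, n_changes = change_model(0, 3)
print(new_index, n_changes)
```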

changeNonlinearity(size=1, audit=False)[source]

Change the nature of the nonlinear specification

changeSegmentation(size=1, audit=False)[source]

Change the interaction, while keeping the number of them

changeVariables(size=1, audit=False)[source]

Activate groups of variables

checkAvailability()[source]

Check the availability of each operator.

Returns

a dictionary with the availability status of each operator

Return type

dict(str: bool)

choice

expression for the observed choice

clone()[source]

Clone the model, in order to generate neighbors

Returns

a clone

Return type

specificationProblem

database

object of type biogeme.database.Database, containing the data.

decisions

The decisions consist of:

  • a dict of decisions for each group of variables.

  • for each utility, a list of decisions for each term.

  • the selected model

Type: tuple(dict(str: int), dict(str: list(dict(str: bool))), int)

decreaseSegmentation(size=1, audit=False)[source]

Remove a level of segmentation.

describe(aSolution)[source]

Generates (if necessary) and returns a short description of the solution.

Parameters

aSolution (class solution) – solution that must be described.

Returns

description of the solution

Return type

str

describeCurrentModel()[source]

Generates a description of the current model

Returns

model description.

Return type

str

describeHtml()[source]

Generates a description of the model in HTML format.

Returns

description of the model in HTML format.

Return type

str

evaluate(aSolution)[source]

Evaluate the objective functions of the solution and store them in the solution object.

Parameters

aSolution (solutionClass) – solution to be evaluated

Returns

results of the estimation

Return type

class biogeme.results.bioResults

generateNeighbor(aSolution, neighborhoodSize)[source]

Generates a neighbor of the solution.

Parameters
  • aSolution (class solution) – solution to be modified

  • neighborhoodSize (int) – size of the neighborhood to be applied

Returns

a neighbor solution, and the number of changes that have been actually applied.

Return type

tuple(class solution, int)

generateSolution(nonlinearSpecs, segmentations, model)[source]

Generate a solution for the VNS algorithm

Parameters
  • nonlinearSpecs (dict(str: tuple(int, bool))) –

    nonlinear specifications. It is a dictionary where

    • the keys correspond to groups of variables,

    • the values are tuple with two entries:

      • index in the list self.nonlinearSpecs corresponding to the nonlinear spec, or None if linear,

      • a boolean that is True if the coefficient is generic, False otherwise.

    Example:

    nl = {'Travel time': (0, False),
          'Travel cost': (0, False),
          'Headway': (0, False)}
    

  • segmentations (dict(str: list(str))) –

    dictionary where the keys are the name of the segmentations, and the values are lists of socio-economic characteristics that must be activated.

    Example:

    sg = {'Seg. cte': ['GA'],
          'Seg. cost': ['class', 'who'],
          'Seg. time': ['gender'],
          'Seg. headway': ['class']}
    

  • model (str) – selected model

Returns

the solution that has been generated

Return type

class solution

Raises
  • biogemeError – if a variable is set to generic. Only groups of variables can be made generic.

  • biogemeError – if a group of variables is unknown

  • biogemeError – if an error occurs in setting the segmentation decisions

  • biogemeError – if the model is unknown.

getBiogemeModel()[source]

Build the Biogeme expressions of a given specification

getDecisions()[source]

The decisions consist of:

  • a dict of decisions for each group of variables.

  • for each utility, a list of decisions for each term.

  • the selected model

Returns

all decisions

Return type

tuple(dict(str: int), dict(str: list(dict(str: bool))), int)

getSolution()[source]

Generate an object of the class solution

Returns

the solution that has been generated

Return type

class solution

increaseSegmentation(size=1, audit=False)[source]

Add a level of segmentation.

isValid(aSolution)[source]

Evaluate the validity of the solution.

Parameters

aSolution (class solution) – solution to be checked

Returns

valid, why, where valid is True if the solution is valid and False otherwise, and why explains the cause of the invalidity.

Return type

tuple(bool, str)

lastOperator

Last operator used

maximumNumberOfParameters

maximum number of parameters allowed in a specification. If the current model has more parameters, it is declared invalid and rejected by the algorithm.
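
The rejection rule tied to maximumNumberOfParameters, combined with the (valid, why) convention of isValid, can be sketched as follows. The function name, the wording of the message and the use of None as the explanation of a valid solution are assumptions.

```python
def check_number_of_parameters(n_parameters, maximum_number_of_parameters):
    """Return (valid, why) in the style of isValid."""
    if n_parameters > maximum_number_of_parameters:
        why = (
            f'Too many parameters: {n_parameters} > '
            f'{maximum_number_of_parameters}'
        )
        return False, why
    # Assumption: no explanation is needed when the solution is valid.
    return True, None


print(check_number_of_parameters(12, 50))
print(check_number_of_parameters(60, 50))
```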

models

List of tuples (name, model). A model is a function that takes the utilities and the availabilities, and returns the loglikelihood expression.

name

name of the problem.

neighborAccepted(aSolution, aNeighbor)[source]

Notify that a neighbor has been accepted by the algorithm. Used to update the statistics on the operators.

Parameters
  • aSolution – solution that was modified

  • aNeighbor – neighbor that was accepted

Raises

biogemeError – if no operator has been used yet.

neighborRejected(aSolution, aNeighbor)[source]

Notify that a neighbor has been rejected by the algorithm. Used to update the statistics on the operators.

Parameters
  • aSolution (class solution) – solution modified. Not used in this implementation.

  • aNeighbor (solutionClass) – neighbor

Raises

biogemeError – if no operator has been used yet.

operators

Dict of operators, where the keys are the names, and the values are the functions implementing the operators.

operatorsManagement

Object of type biogeme.vns.operatorsManagement

reset()[source]

Deactivate the variables of the model that can be deactivated.

selectedModel

index of the selected model

setDecisions(decisions)[source]

Implement the specification decisions

Parameters

decisions (tuple(dict(str: int), dict(str: list(dict(str: bool))), int)) – specification decisions

Raises
  • biogemeError – if a group of variables is unknown.

  • biogemeError – if an alternative is unknown.

  • biogemeError – if the length of decisions is inconsistent with the number of terms in the utility function.

  • biogemeError – if the number of the model is out of range.

setSolution(aSolution)[source]

Import a solution defined as an object of class solution.

Parameters

aSolution (class solution) – solution to be imported.

Raises

biogemeError – if the object has the wrong type.

theAlternatives

Dict of utility functions, where the keys are the id of the alternatives, and the values are objects of type biogeme.assisted.utility

theGroups

dict of groups of variables, where the keys are the names, and the values are objects of class groupOfVariables

theSegmentations

dict of segmentations, where the keys are the names, and the values are objects of class segmentation

theVariables

dict of variables, where the keys are the names, and the values are objects of class variable

utilities

specification of the utility functions. See biogeme.assisted.specificationProblem.__init__

class biogeme.assisted.term(var, aSegmentation, bounds, validity)[source]

Bases: object

Class representing the possible specifications of one term of the utility function

__init__(var, aSegmentation, bounds, validity)[source]

Constructor.

Parameters
  • var (variable) – variable of the term.

  • aSegmentation (segmentation) – discrete segmentation for the parameter

  • bounds (tuple(float, float)) – bounds on the coefficient

  • validity (bool f(float)) – function checking the validity of the coefficient.

bounds

bounds on the coefficient

coef_names

names of the Beta parameters involved in the specification of the term.

describe()[source]

Provides a short description of the term.

Returns

short description.

Return type

str

getBeta(altname)[source]

Obtain the name of the coefficient of the term

Parameters

altname (str) – name of the alternative

Returns

name of the coefficient

Return type

str

getDecisions()[source]

The decision is a dict, where the keys are the names of the socioeconomic variables, and the values are booleans indicating whether each variable is active.

Returns

decision of activation of the socio-eco variables for the segmentation.

Return type

dict(str: bool)

getExpression(altname)[source]

Build the Biogeme expression for the term

Parameters

altname (str) – name of the alternative

Returns

expression for the term

Return type

biogeme.expressions.Expression

isValid(altname, estimationResults)[source]

Check the validity of the estimated coefficient

Parameters
  • altname (str) – name of the alternative

  • estimationResults (biogeme.results.bioResults) – results of the estimation with Biogeme

Returns

True if the value is valid, False otherwise.

Return type

bool

Raises

biogemeError – if the parameter has not been estimated.
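
The validity check can be illustrated with a sign test on the estimated value. In this sketch, a plain dict of estimates stands in for the bioResults object, KeyError stands in for biogemeError, and both function names are hypothetical (isNegative mirrors the name used in the utility examples, whose definition is not shown in this documentation).

```python
def isNegative(value):
    """Hypothetical validity function: the coefficient must be negative."""
    return value < 0


def term_is_valid(estimates, coef_name, validity):
    """Check one estimated coefficient against its validity function.

    estimates: dict(str: float) of estimated values (stand-in for bioResults).
    coef_name: name of the coefficient to check.
    validity: function float -> bool, or None if there is no restriction.
    """
    if coef_name not in estimates:
        # The library raises biogemeError when the parameter was not estimated.
        raise KeyError(f'Parameter {coef_name} has not been estimated')
    if validity is None:
        return True
    return validity(estimates[coef_name])


estimates = {'beta_cost_pt': -0.8, 'beta_cost_car': 0.1}
print(term_is_valid(estimates, 'beta_cost_pt', isNegative))
print(term_is_valid(estimates, 'beta_cost_car', isNegative))
```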

segmentation

discrete segmentation of the parameter

setDecisions(decisions)[source]

Implement the specification decisions, represented as a dict, where the keys are the names of the socioeconomic variables, and the values are booleans indicating whether each variable is active.

Parameters

decisions (dict(str: bool)) – decision of activation of the socio-eco variables for the segmentation.

validity

function checking the validity of the coefficient.

var

variable of the term

class biogeme.assisted.utility(alternativeId, name, terms)[source]

Bases: object

Class representing the possible specifications of a utility function

__init__(alternativeId, name, terms)[source]

Constructor.

Parameters
  • alternativeId (int) – id of the alternative.

  • name (str) – name of the alternative

  • terms (list(term)) – terms of the utility function.

getExpression()[source]

Obtain the Biogeme expression for the utility function.

Returns

Biogeme expression

Return type

biogeme.expressions.Expression

id

id of the alternative

name

name of the alternative

terms

list of terms in the utility function

class biogeme.assisted.variable(name, expression)[source]

Bases: object

Class representing the possible specifications of a variable

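__init__(name, expression)[source]

Parameters
  • name (str) – name of the variable

  • expression (biogeme.expressions.Expression) – Biogeme expression for the variable

active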

True if variable is active

expression

Biogeme expression for the variable

generic

True if the variable is generic.

genericName

Name of the generic variable

getExpression()[source]

Returns the biogeme expression of the specification of the variable.

Returns

expression accounting for the status and the nonlinear specification.

Return type

biogeme.expressions.Expression

makeGeneric(yes, name)[source]

A variable has one of two statuses: generic or alternative specific. This function changes the status.

Parameters
  • yes (bool) – if True, status is set to “generic”. If False, status is set to “alt. specific”.

  • name (str) – name of the generic version of the variable

name

name of the variable

nonlinearSpec

Function with the nonlinear specifications.

used

True if variable used in the model.