.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/assisted/plot_b04segmentation.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_assisted_plot_b04segmentation.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_assisted_plot_b04segmentation.py:


Catalog for segmented parameters
================================

Investigate the segmentations of parameters.

We consider 4 specifications for the constants:

- Not segmented
- Segmented by GA (yearly subscription to public transport)
- Segmented by luggage
- Segmented by both GA and luggage

We consider 3 specifications for the time coefficient:

- Not segmented
- Segmented by first class
- Segmented by trip purpose

Combining them, we obtain a total of 4 x 3 = 12 specifications.
See Bierlaire and Ortelli (2023).

Michel Bierlaire, EPFL

Sun Apr 27 2025, 15:52:48

.. GENERATED FROM PYTHON SOURCE LINES 28-51

.. code-block:: Python

    import numpy as np
    from IPython.core.display_functions import display

    from biogeme.biogeme import BIOGEME
    from biogeme.catalog import segmentation_catalogs
    from biogeme.data.swissmetro import (
        CAR_AV_SP,
        CAR_CO_SCALED,
        CAR_TT_SCALED,
        CHOICE,
        SM_AV,
        SM_COST_SCALED,
        SM_TT_SCALED,
        TRAIN_AV_SP,
        TRAIN_COST_SCALED,
        TRAIN_TT_SCALED,
        read_data,
    )
    from biogeme.expressions import Beta
    from biogeme.models import loglogit
    from biogeme.results_processing import compile_estimation_results, pareto_optimal

.. GENERATED FROM PYTHON SOURCE LINES 52-53

Read the data.

.. GENERATED FROM PYTHON SOURCE LINES 53-55

.. code-block:: Python

    database = read_data()

.. GENERATED FROM PYTHON SOURCE LINES 56-57

Definition of the segmentations.

.. GENERATED FROM PYTHON SOURCE LINES 57-70

.. code-block:: Python

    segmentation_ga = database.generate_segmentation(
        variable='GA', mapping={0: 'noGA', 1: 'GA'}
    )

    segmentation_luggage = database.generate_segmentation(
        variable='LUGGAGE', mapping={0: 'no_lugg', 1: 'one_lugg', 3: 'several_lugg'}
    )

    segmentation_first = database.generate_segmentation(
        variable='FIRST', mapping={0: '2nd_class', 1: '1st_class'}
    )

.. GENERATED FROM PYTHON SOURCE LINES 71-73

We consider two trip purposes: 'commuters' and anything else.
We first need to define the corresponding binary variable.

.. GENERATED FROM PYTHON SOURCE LINES 73-79

.. code-block:: Python

    database.dataframe['COMMUTERS'] = np.where(database.dataframe['PURPOSE'] == 1, 1, 0)

    segmentation_purpose = database.generate_segmentation(
        variable='COMMUTERS', mapping={0: 'non_commuters', 1: 'commuters'}
    )

.. GENERATED FROM PYTHON SOURCE LINES 80-81

Parameters to be estimated.

.. GENERATED FROM PYTHON SOURCE LINES 81-86

.. code-block:: Python

    asc_car = Beta('asc_car', 0, None, None, 0)
    asc_train = Beta('asc_train', 0, None, None, 0)
    b_time = Beta('b_time', 0, None, None, 0)
    b_cost = Beta('b_cost', 0, None, None, 0)

.. GENERATED FROM PYTHON SOURCE LINES 87-88

Catalogs for the alternative specific constants. Up to two segmentations
can be combined (``maximum_number=2``), which gives the four
specifications of the constants listed above.

.. GENERATED FROM PYTHON SOURCE LINES 88-98

.. code-block:: Python

    asc_train_catalog, asc_car_catalog = segmentation_catalogs(
        generic_name='asc',
        beta_parameters=[asc_train, asc_car],
        potential_segmentations=(
            segmentation_ga,
            segmentation_luggage,
        ),
        maximum_number=2,
    )

.. GENERATED FROM PYTHON SOURCE LINES 99-103

Catalog for the travel time coefficient. At most one segmentation is
applied (``maximum_number=1``), which gives the three specifications of
the time coefficient listed above. Note that the function returns a list
of catalogs. Here, the list contains only one of them. This is why the
left-hand side is written ``(b_time_catalog,)``, with a trailing comma:
it unpacks the single element of the list.

.. GENERATED FROM PYTHON SOURCE LINES 103-113

.. code-block:: Python

    (b_time_catalog,) = segmentation_catalogs(
        generic_name='b_time',
        beta_parameters=[b_time],
        potential_segmentations=(
            segmentation_first,
            segmentation_purpose,
        ),
        maximum_number=1,
    )

.. GENERATED FROM PYTHON SOURCE LINES 114-115

Definition of the utility functions.

.. GENERATED FROM PYTHON SOURCE LINES 115-121

.. code-block:: Python

    v_train = (
        asc_train_catalog + b_time_catalog * TRAIN_TT_SCALED + b_cost * TRAIN_COST_SCALED
    )
    v_swissmetro = b_time_catalog * SM_TT_SCALED + b_cost * SM_COST_SCALED
    v_car = asc_car_catalog + b_time_catalog * CAR_TT_SCALED + b_cost * CAR_CO_SCALED

.. GENERATED FROM PYTHON SOURCE LINES 122-123

Associate utility functions with the numbering of alternatives.

.. GENERATED FROM PYTHON SOURCE LINES 123-125

.. code-block:: Python

    v = {1: v_train, 2: v_swissmetro, 3: v_car}

.. GENERATED FROM PYTHON SOURCE LINES 126-127

Associate the availability conditions with the alternatives.

.. GENERATED FROM PYTHON SOURCE LINES 127-129

.. code-block:: Python

    av = {1: TRAIN_AV_SP, 2: SM_AV, 3: CAR_AV_SP}

.. GENERATED FROM PYTHON SOURCE LINES 130-132

Definition of the model. This is the contribution of each
observation to the log likelihood function.

.. GENERATED FROM PYTHON SOURCE LINES 132-134

.. code-block:: Python

    log_probability = loglogit(v, av, CHOICE)

.. GENERATED FROM PYTHON SOURCE LINES 135-136

Create the Biogeme object.

.. GENERATED FROM PYTHON SOURCE LINES 136-141

.. code-block:: Python

    the_biogeme = BIOGEME(
        database, log_probability, generate_html=False, generate_yaml=False
    )
    the_biogeme.model_name = 'b04segmentation'

.. GENERATED FROM PYTHON SOURCE LINES 142-143

Estimate the parameters of every specification in the catalog.

.. GENERATED FROM PYTHON SOURCE LINES 143-145

.. code-block:: Python

    dict_of_results = the_biogeme.estimate_catalog()

.. GENERATED FROM PYTHON SOURCE LINES 146-147

Number of estimated models.

.. GENERATED FROM PYTHON SOURCE LINES 147-149

.. code-block:: Python

    print(f'A total of {len(dict_of_results)} models have been estimated')

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    A total of 12 models have been estimated

.. GENERATED FROM PYTHON SOURCE LINES 150-151

All estimation results.

.. GENERATED FROM PYTHON SOURCE LINES 151-155

.. code-block:: Python

    compiled_results, specs = compile_estimation_results(
        dict_of_results, use_short_names=True
    )

.. GENERATED FROM PYTHON SOURCE LINES 156-159

.. code-block:: Python

    display('All estimated models')
    display(compiled_results)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    All estimated models
                                              Model_000000  ...     Model_000011
    Number of estimated parameters                      11  ...                7
    Sample size                                      10719  ...            10719
    Final log likelihood                         -8280.199  ...        -8311.858
    Akaike Information Criterion                   16582.4  ...         16637.72
    Bayesian Information Criterion                16662.47  ...         16688.68
    asc_train_ref (t-test)                    -1.5 (-17.4)  ...    -1.12 (-17.3)
    asc_train_diff_GA (t-test)                 1.37 (19.2)  ...      1.53 (22.3)
    asc_train_diff_one_lugg (t-test)          0.562 (7.06)  ...
    asc_train_diff_several_lugg (t-test)      0.643 (3.72)  ...
    b_time_ref (t-test)                      -1.17 (-21.5)  ...    -1.18 (-21.6)
    b_time_diff_commuters (t-test)          -0.17 (-0.792)  ...  -0.168 (-0.784)
    b_cost (t-test)                         -0.702 (-13.5)  ...   -0.702 (-13.3)
    asc_car_ref (t-test)                      0.03 (0.583)  ...   0.0163 (0.396)
    asc_car_diff_GA (t-test)                 -1.22 (-7.84)  ...    -1.26 (-8.18)
    asc_car_diff_one_lugg (t-test)        -0.0306 (-0.608)  ...
    asc_car_diff_several_lugg (t-test)       -0.46 (-2.11)  ...
    asc_train (t-test)                                      ...
    b_time_diff_1st_class (t-test)                          ...
    asc_car (t-test)                                        ...
    b_time (t-test)                                         ...

    [20 rows x 12 columns]

.. GENERATED FROM PYTHON SOURCE LINES 160-161

Glossary: the specification behind each short model name.

.. GENERATED FROM PYTHON SOURCE LINES 161-164

.. code-block:: Python

    for short_name, spec in specs.items():
        print(f'{short_name}\t{spec}')

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Model_000000    asc:GA-LUGGAGE;b_time:COMMUTERS
    Model_000001    asc:no_seg;b_time:FIRST
    Model_000002    asc:GA;b_time:FIRST
    Model_000003    asc:LUGGAGE;b_time:FIRST
    Model_000004    asc:GA-LUGGAGE;b_time:FIRST
    Model_000005    asc:GA;b_time:no_seg
    Model_000006    asc:LUGGAGE;b_time:no_seg
    Model_000007    asc:no_seg;b_time:no_seg
    Model_000008    asc:GA-LUGGAGE;b_time:no_seg
    Model_000009    asc:LUGGAGE;b_time:COMMUTERS
    Model_000010    asc:no_seg;b_time:COMMUTERS
    Model_000011    asc:GA;b_time:COMMUTERS

.. GENERATED FROM PYTHON SOURCE LINES 165-166

Estimation results of the Pareto optimal models.

.. GENERATED FROM PYTHON SOURCE LINES 166-171

.. code-block:: Python

    pareto_results = pareto_optimal(dict_of_results)
    compiled_pareto_results, pareto_specs = compile_estimation_results(
        pareto_results, use_short_names=True
    )

.. GENERATED FROM PYTHON SOURCE LINES 172-175

.. code-block:: Python

    display('Non dominated models')
    display(compiled_pareto_results)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Non dominated models
                                            Model_000000  ...    Model_000004
    Number of estimated parameters                     4  ...               6
    Sample size                                    10719  ...           10719
    Final log likelihood                       -8670.163  ...       -8313.613
    Akaike Information Criterion                17348.33  ...        16639.23
    Bayesian Information Criterion              17377.45  ...         16682.9
    asc_train (t-test)                      -0.652 (-12)  ...
    b_time (t-test)                        -1.28 (-19.5)  ...   -1.19 (-18.3)
    b_cost (t-test)                        -0.79 (-15.5)  ...  -0.704 (-13.3)
    asc_car (t-test)                      0.0162 (0.438)  ...
    asc_train_ref (t-test)                                ...   -1.12 (-18.2)
    asc_train_diff_GA (t-test)                            ...     1.52 (22.1)
    asc_train_diff_one_lugg (t-test)                      ...
    asc_train_diff_several_lugg (t-test)                  ...
    b_time_ref (t-test)                                   ...
    b_time_diff_1st_class (t-test)                        ...
    asc_car_ref (t-test)                                  ...  0.0143 (0.361)
    asc_car_diff_GA (t-test)                              ...   -1.26 (-8.18)
    asc_car_diff_one_lugg (t-test)                        ...
    asc_car_diff_several_lugg (t-test)                    ...

    [19 rows x 5 columns]

.. GENERATED FROM PYTHON SOURCE LINES 176-177

Glossary of the Pareto optimal models. The short names are assigned
again here, so they do not match the names in the previous glossary.

.. GENERATED FROM PYTHON SOURCE LINES 177-179

.. code-block:: Python

    for short_name, spec in pareto_specs.items():
        print(f'{short_name}\t{spec}')

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Model_000000    asc:no_seg;b_time:no_seg
    Model_000001    asc:GA-LUGGAGE;b_time:FIRST
    Model_000002    asc:GA;b_time:FIRST
    Model_000003    asc:no_seg;b_time:FIRST
    Model_000004    asc:GA;b_time:no_seg

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 14.706 seconds)


.. _sphx_glr_download_auto_examples_assisted_plot_b04segmentation.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_b04segmentation.ipynb <plot_b04segmentation.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_b04segmentation.py <plot_b04segmentation.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_b04segmentation.zip <plot_b04segmentation.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
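
The count of 12 estimated models reported by the script follows from
crossing the four specifications of the constants with the three
specifications of the time coefficient. As a minimal sketch in plain
Python, independent of Biogeme, the enumeration can be reproduced as
follows. The names ``asc_options`` and ``b_time_options`` are illustrative
assumptions, and the labels only imitate the glossary format printed by
the example; this is not how Biogeme builds its catalogs internally.

.. code-block:: Python

    from itertools import product

    # Hypothetical labels mirroring the glossary: segmentations of the constants.
    asc_options = ['no_seg', 'GA', 'LUGGAGE', 'GA-LUGGAGE']
    # Hypothetical labels mirroring the glossary: segmentations of b_time.
    b_time_options = ['no_seg', 'FIRST', 'COMMUTERS']

    # Each specification pairs one option for the constants with one for b_time.
    specifications = [
        f'asc:{asc};b_time:{b_time}'
        for asc, b_time in product(asc_options, b_time_options)
    ]

    # 4 options x 3 options = 12 specifications.
    print(len(specifications))

Every label in the glossary of the full estimation, such as
``asc:GA;b_time:no_seg``, appears exactly once in ``specifications``.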