region.max_p_regions.heuristics module

class region.max_p_regions.heuristics.MaxPRegionsHeu(local_search=None, random_state=None)

Bases: object

assign_enclaves(partition, enclave_areas, neigh_dict, attr)

Takes a partial partition (one in which not every area is assigned to a region) and a list of enclave areas (i.e. areas not present in the partial partition), assigns every enclave area to a region of the partial partition, and returns the resulting partition.

Parameters:
  • partition (list) – Each element (of type set) represents a region.
  • enclave_areas (list) – Each element represents an area.
  • neigh_dict (dict) – Each key represents an area. Each value is an iterable of the corresponding neighbors.
  • attr (numpy.ndarray) – See the corresponding argument in fit_from_scipy_sparse_matrix().
Returns:

partition – Each element (of type set) represents a region.

Return type:

list
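The enclave assignment described above can be sketched in plain Python. This is only an illustration, not the library's implementation: the function name assign_enclaves_sketch is hypothetical, attr is assumed to be a dict of scalar attributes instead of a numpy.ndarray, and the rule "join the neighboring region with the smallest added absolute difference" is an assumption.

```python
def assign_enclaves_sketch(partition, enclave_areas, neigh_dict, attr):
    """Assign every enclave to a neighboring region (illustrative only)."""
    partition = [set(region) for region in partition]  # work on a copy
    remaining = list(enclave_areas)
    while remaining:
        progressed = False
        for area in list(remaining):
            # Regions that already contain at least one neighbor of this enclave.
            candidates = [
                idx for idx, region in enumerate(partition)
                if any(n in region for n in neigh_dict[area])
            ]
            if not candidates:
                continue  # so far only bordered by other enclaves; retry later
            # Assumed criterion: smallest added dissimilarity to the region.
            best = min(
                candidates,
                key=lambda idx: sum(abs(attr[area] - attr[m]) for m in partition[idx]),
            )
            partition[best].add(area)
            remaining.remove(area)
            progressed = True
        if not progressed:
            raise ValueError("some enclaves are not connected to any region")
    return partition


# Five areas on a line: 0 - 1 - 2 - 3 - 4
neigh_dict = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
attr = {0: 1.0, 1: 1.5, 2: 5.0, 3: 9.0, 4: 9.5}
result = assign_enclaves_sketch([{0}, {4}], [1, 2, 3], neigh_dict, attr)
```

The input partition is copied, so the caller's list is left unchanged; whether the library mutates its argument is not stated in this documentation.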

find_best_area(region, candidates, attr)
Parameters:
  • region (iterable) – Each element represents an area.
  • candidates (iterable) – Each element represents an area bordering on region.
  • attr (numpy.ndarray) – See the corresponding argument in fit_from_scipy_sparse_matrix().
Returns:

best_area – An element of candidates with minimal dissimilarity when moved to the region region.
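As a rough illustration of this selection step, the sketch below picks the candidate that adds the least dissimilarity to a region. The name find_best_area_sketch is hypothetical, attr is assumed to be a dict of scalars, and the sum of absolute differences is an assumed dissimilarity measure; the library may use a different objective function.

```python
def find_best_area_sketch(region, candidates, attr):
    """Return the candidate whose move into `region` adds the least dissimilarity."""

    def added_dissimilarity(candidate):
        # Assumed measure: sum of absolute differences to all region members.
        return sum(abs(attr[candidate] - attr[member]) for member in region)

    return min(candidates, key=added_dissimilarity)


attr = {0: 1.0, 1: 1.2, 2: 4.0, 3: 1.1}
best = find_best_area_sketch({0, 1}, [2, 3], attr)
```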

find_best_region_idx(area, partition, candidate_regions_idx, attr)
Parameters:
  • area – The area to be moved to one of the regions specified by candidate_regions_idx.
  • partition (list) – Each element (of type set) represents a region.
  • candidate_regions_idx (iterable) – Each element is the index of a region in the partition list.
  • attr (numpy.ndarray) – See the corresponding argument in fit_from_scipy_sparse_matrix().
Returns:

best_idx – The index (w.r.t. partition) of the region which has the smallest sum of dissimilarities after the area area is moved to it.

Return type:

int
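A minimal sketch of this choice, assuming that "sum of dissimilarities" means the sum of pairwise absolute differences within the region after the move. The names pairwise_sum and find_best_region_idx_sketch are hypothetical, and attr is assumed to be a dict of scalars.

```python
from itertools import combinations


def pairwise_sum(areas, attr):
    """Sum of absolute attribute differences over all pairs of areas."""
    return sum(abs(attr[a] - attr[b]) for a, b in combinations(areas, 2))


def find_best_region_idx_sketch(area, partition, candidate_regions_idx, attr):
    """Index of the candidate region with the smallest pairwise sum after adding `area`."""
    return min(
        candidate_regions_idx,
        key=lambda idx: pairwise_sum(partition[idx] | {area}, attr),
    )


attr = {0: 1, 1: 2, 2: 10, 3: 11, 4: 50, 5: 9}
partition = [{0, 1}, {2, 3}, {4}]
best_idx = find_best_region_idx_sketch(5, partition, [0, 1], attr)
```

Only the regions listed in candidate_regions_idx are evaluated, so region 2 is never considered here even though it exists in partition.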

fit(adj, attr, spatially_extensive_attr, threshold, max_it=10, objective_func=<region.objective_function.ObjectiveFunctionPairwise object>)

Alias for fit_from_scipy_sparse_matrix().

Solve the max-p-regions problem in a heuristic way (see [DAR2012]).

The resulting region labels are assigned to the instance’s labels_ attribute.

Parameters:
  • adj (scipy.sparse.csr_matrix) – Adjacency matrix representing the areas’ contiguity relation.
  • attr (numpy.ndarray) – Array (number of areas x number of attributes) of areas’ attributes relevant to clustering.
  • spatially_extensive_attr (numpy.ndarray) – Array (number of areas x number of attributes) of areas’ attributes relevant to ensuring the threshold condition.
  • threshold (numbers.Real or numpy.ndarray) – The lower bound for a region’s sum of spatially extensive attributes. The argument’s type is numbers.Real if there is only one spatially extensive attribute per area, otherwise it is a one-dimensional array with as many entries as there are spatially extensive attributes per area.
  • max_it (int, default: 10) – The maximum number of partitions produced in the algorithm’s construction phase.
  • objective_func (region.objective_function.ObjectiveFunction, default: ObjectiveFunctionPairwise()) – The objective function to use.
fit_from_dict(neighbors_dict, attr, spatially_extensive_attr, threshold, max_it=10, objective_func=<region.objective_function.ObjectiveFunctionPairwise object>)

Solve the max-p-regions problem in a heuristic way (see [DAR2012]).

The resulting region labels are assigned to the instance’s labels_ attribute.

Parameters:
  • neighbors_dict (dict) – Each key represents an area and each value is an iterable of neighbors of this area.
  • attr (dict) – A dict with the same keys as neighbors_dict and values representing the attributes for calculating homo-/heterogeneity. A value can be scalar (e.g. float or int) or a numpy.ndarray.
  • spatially_extensive_attr (dict) – A dict with the same keys as neighbors_dict and values representing the spatially extensive attribute (scalar or iterable of scalars). In the max-p-regions problem each region’s sum of spatially extensive attributes must be greater than a specified threshold. If the dict values are iterables of scalars, every element of the iterable has to fulfill the condition.
  • threshold (numbers.Real or numpy.ndarray) – Refer to the corresponding argument in fit_from_scipy_sparse_matrix().
  • max_it (int, default: 10) – Refer to the corresponding argument in fit_from_scipy_sparse_matrix().
  • objective_func (region.ObjectiveFunction, default: ObjectiveFunctionPairwise()) – Refer to the corresponding argument in fit_from_scipy_sparse_matrix().
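The three dict arguments can be prepared as in the following sketch. The area labels, attribute values, and the helper check_inputs are made up for illustration; the symmetry check reflects the usual assumption that contiguity is a symmetric relation, which this documentation does not state explicitly.

```python
# Four areas on a line: a - b - c - d
neighbors_dict = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
attr = {"a": 1.0, "b": 1.5, "c": 8.0, "d": 9.0}          # clustering criterion
spatially_extensive_attr = {"a": 30, "b": 40, "c": 50, "d": 20}  # e.g. population


def check_inputs(neighbors_dict, attr, spatially_extensive_attr):
    """Check that all dicts share the same keys and that contiguity is symmetric."""
    same_keys = set(neighbors_dict) == set(attr) == set(spatially_extensive_attr)
    symmetric = all(
        area in neighbors_dict[neighbor]
        for area, neighbors in neighbors_dict.items()
        for neighbor in neighbors
    )
    return same_keys and symmetric


inputs_ok = check_inputs(neighbors_dict, attr, spatially_extensive_attr)

# With the region package installed, a call following the signature documented
# above would look like this (not executed here):
# from region.max_p_regions.heuristics import MaxPRegionsHeu
# model = MaxPRegionsHeu()
# model.fit_from_dict(neighbors_dict, attr, spatially_extensive_attr, threshold=60)
# print(model.labels_)
```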
fit_from_geodataframe(gdf, attr, spatially_extensive_attr, threshold, max_it=10, objective_func=<region.objective_function.ObjectiveFunctionPairwise object>, contiguity='rook')

Alternative API for fit_from_scipy_sparse_matrix().

Parameters:
  • gdf (geopandas.GeoDataFrame) –
  • attr (str or list) – The clustering criteria (columns of the GeoDataFrame gdf) are specified as string (for one column) or list of strings (for multiple columns).
  • spatially_extensive_attr (str or list) – The name (str) or names (list of strings) of column(s) in gdf containing the spatially extensive attributes.
  • threshold (numbers.Real or numpy.ndarray) – Refer to the corresponding argument in fit_from_scipy_sparse_matrix().
  • max_it (int, default: 10) – Refer to the corresponding argument in fit_from_scipy_sparse_matrix().
  • objective_func (region.ObjectiveFunction, default: ObjectiveFunctionPairwise()) – Refer to the corresponding argument in fit_from_scipy_sparse_matrix().
  • contiguity ({"rook", "queen"}, default: "rook") –

    Defines the contiguity relationship between areas. Possible contiguity definitions are:

    • "rook" - Rook contiguity (areas sharing an edge are neighbors).
    • "queen" - Queen contiguity (areas sharing an edge or a vertex are neighbors).
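The difference between the two contiguity definitions is easiest to see on a regular lattice. The sketch below is independent of the library; lattice_neighbors is a hypothetical helper that numbers the cells of a grid row by row and returns a neighbor dict.

```python
def lattice_neighbors(n_rows, n_cols, contiguity="rook"):
    """Neighbor dict for a regular grid whose cells are numbered row by row."""
    if contiguity == "rook":
        # Cells sharing an edge.
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif contiguity == "queen":
        # Cells sharing an edge or a corner.
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    else:
        raise ValueError("contiguity must be 'rook' or 'queen'")
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            neighbors[r * n_cols + c] = [
                (r + dr) * n_cols + (c + dc)
                for dr, dc in offsets
                if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols
            ]
    return neighbors


rook = lattice_neighbors(3, 3, "rook")
queen = lattice_neighbors(3, 3, "queen")
```

On the 3 x 3 grid, the corner cell 0 has two rook neighbors (1 and 3) and three queen neighbors (1, 3 and 4).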
fit_from_networkx(graph, attr, spatially_extensive_attr, threshold, max_it=10, objective_func=<region.objective_function.ObjectiveFunctionPairwise object>)

Alternative API for fit_from_scipy_sparse_matrix().

Parameters:
  • graph (networkx.Graph) –
  • attr (str, list or dict) – If the clustering criteria are present in the networkx.Graph graph as node attributes, then they can be specified as a string (for one criterion) or as a list of strings (for multiple criteria). Alternatively, a dict can be used with each key being a node of the networkx.Graph graph and each value being the corresponding clustering criterion (a scalar (e.g. float or int) or a numpy.ndarray). If there are no clustering criteria present in the networkx.Graph graph as node attributes, then a dictionary must be used for this argument. Refer to the corresponding argument in fit_from_dict() for more details about the expected dict.
  • spatially_extensive_attr (str, list or dict) – If the spatially extensive attributes are present in the networkx.Graph graph as node attributes, then they can be specified as a string (for one attribute) or as a list of strings (for multiple attributes). Alternatively, a dict can be used with each key being a node of the networkx.Graph graph and each value being the corresponding spatially extensive attribute (a scalar (e.g. float or int) or a numpy.ndarray). If there are no spatially extensive attributes present in the networkx.Graph graph as node attributes, then a dictionary must be used for this argument. Refer to the corresponding argument in fit_from_dict() for more details about the expected dict.
  • threshold (numbers.Real or numpy.ndarray) – Refer to the corresponding argument in fit_from_scipy_sparse_matrix().
  • max_it (int, default: 10) – Refer to the corresponding argument in fit_from_scipy_sparse_matrix().
  • objective_func (region.ObjectiveFunction, default: ObjectiveFunctionPairwise()) – Refer to the corresponding argument in fit_from_scipy_sparse_matrix().
fit_from_scipy_sparse_matrix(adj, attr, spatially_extensive_attr, threshold, max_it=10, objective_func=<region.objective_function.ObjectiveFunctionPairwise object>)

Solve the max-p-regions problem in a heuristic way (see [DAR2012]).

The resulting region labels are assigned to the instance’s labels_ attribute.

Parameters:
  • adj (scipy.sparse.csr_matrix) – Adjacency matrix representing the areas’ contiguity relation.
  • attr (numpy.ndarray) – Array (number of areas x number of attributes) of areas’ attributes relevant to clustering.
  • spatially_extensive_attr (numpy.ndarray) – Array (number of areas x number of attributes) of areas’ attributes relevant to ensuring the threshold condition.
  • threshold (numbers.Real or numpy.ndarray) – The lower bound for a region’s sum of spatially extensive attributes. The argument’s type is numbers.Real if there is only one spatially extensive attribute per area, otherwise it is a one-dimensional array with as many entries as there are spatially extensive attributes per area.
  • max_it (int, default: 10) – The maximum number of partitions produced in the algorithm’s construction phase.
  • objective_func (region.objective_function.ObjectiveFunction, default: ObjectiveFunctionPairwise()) – The objective function to use.
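The threshold condition described for the threshold parameter can be checked for a given partition as sketched below. The helper names are hypothetical, plain lists and dicts stand in for numpy arrays, and the comparison "sum is at least the threshold" follows the "lower bound" wording above; whether the library treats equality as feasible is an assumption.

```python
def _as_list(value):
    """Wrap a scalar in a list; turn lists and tuples into lists."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def satisfies_threshold(partition, spatially_extensive_attr, threshold):
    """True if every region reaches every threshold entry (assumed: sum >= bound)."""
    bounds = _as_list(threshold)
    for region in partition:
        for k, bound in enumerate(bounds):
            total = sum(_as_list(spatially_extensive_attr[area])[k] for area in region)
            if total < bound:
                return False
    return True


population = {0: 30, 1: 40, 2: 50, 3: 20}
partition = [{0, 1}, {2, 3}]
ok_70 = satisfies_threshold(partition, population, 70)
ok_71 = satisfies_threshold(partition, population, 71)

# Two spatially extensive attributes per area, hence two threshold entries.
two_attrs = {0: (30, 1), 1: (40, 1), 2: (50, 1), 3: (20, 0)}
ok_multi = satisfies_threshold(partition, two_attrs, [70, 2])
```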
fit_from_w(w, attr, spatially_extensive_attr, threshold, max_it=10, objective_func=<region.objective_function.ObjectiveFunctionPairwise object>)

Alternative API for fit_from_scipy_sparse_matrix().

Parameters:
  • w (libpysal.weights.weights.W) – W object representing the contiguity relation.
  • attr (numpy.ndarray) – Each element specifies an area’s attribute which is used for calculating the objective function.
  • spatially_extensive_attr (numpy.ndarray) – Each element specifies an area’s spatially extensive attribute, which is used to ensure that the sum of spatially extensive attributes in each region reaches the threshold defined by the threshold argument.
  • threshold (numbers.Real or numpy.ndarray) – Refer to the corresponding argument in fit_from_scipy_sparse_matrix().
  • max_it (int, default: 10) – Refer to the corresponding argument in fit_from_scipy_sparse_matrix().
  • objective_func (region.ObjectiveFunction, default: ObjectiveFunctionPairwise()) – Refer to the corresponding argument in fit_from_scipy_sparse_matrix().
grow_regions(adj, attr, spatially_extensive_attr, threshold)
Parameters:
Returns:

result – result[0] is a list. Each list element is a set of a region’s areas. Note that not every area is necessarily assigned to a region after this function has terminated, so such areas won’t be in any of the sets in result[0]. result[1] is a list of areas not assigned to any region.

Return type:

tuple
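The shape of the returned tuple can be illustrated with a simplified region-growing sketch. It is not the library's algorithm: grow_regions_sketch is a hypothetical name, it takes a neighbor dict and a dict with one scalar spatially extensive attribute per area instead of adj and arrays, it ignores attr, and it visits seeds and neighbors in sorted order, whereas the actual heuristic may choose them differently (for instance at random).

```python
def grow_regions_sketch(neigh_dict, spatially_extensive_attr, threshold):
    """Grow regions until each reaches the threshold; leftovers become enclaves."""
    unassigned = set(neigh_dict)
    regions, enclaves = [], []
    for seed in sorted(neigh_dict):  # deterministic seed order (assumption)
        if seed not in unassigned:
            continue
        region = {seed}
        unassigned.remove(seed)
        total = spatially_extensive_attr[seed]
        while total < threshold:
            # Unassigned areas bordering the growing region.
            frontier = sorted(
                n for area in region for n in neigh_dict[area] if n in unassigned
            )
            if not frontier:
                break  # the region cannot grow any further
            nxt = frontier[0]
            region.add(nxt)
            unassigned.remove(nxt)
            total += spatially_extensive_attr[nxt]
        if total >= threshold:
            regions.append(region)
        else:
            enclaves.extend(sorted(region))  # threshold not reached
    return regions, enclaves


# Five areas on a line: 0 - 1 - 2 - 3 - 4
neigh_dict = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
values = {0: 30, 1: 40, 2: 50, 3: 20, 4: 10}
result = grow_regions_sketch(neigh_dict, values, threshold=60)
```

Here result[0] holds the grown regions and result[1] the areas left without a region, which is the kind of input that assign_enclaves() expects.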