ragraph.analysis.similarity._jaccard

Jaccard Similarity Index.

The index compares two objects and is calculated as the number of properties they share divided by the total number of distinct properties they possess between them.
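Written as a formula, with A and B denoting the property sets of the two objects (notation introduced here for illustration, not taken from the module), the definition and a small worked example read:

```latex
J(A, B) = \frac{|A \cap B|}{|A \cup B|}, \qquad
A = \{a, b, c\},\; B = \{b, c, d\} \;\Rightarrow\;
J(A, B) = \frac{|\{b, c\}|}{|\{a, b, c, d\}|} = \frac{2}{4} = 0.5
```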

For examples of ‘object description functions’, please refer to ragraph.analysis.similarity.utils.

References:

Kosub, S. (2016). A note on the triangle inequality for the Jaccard distance. Retrieved from https://arxiv.org/pdf/1612.02696.pdf

Jaccard, P. (1901). Étude comparative de la distribution florale dans une portion des Alpes et du Jura. Bulletin de La Société Vaudoise Des Sciences Naturelles. https://doi.org/10.5169/seals-266450

Module Contents

Functions

_calculate(→ float)

Calculate the Jaccard Index from two boolean object description arrays.

jaccard_index(→ float)

Calculate the Jaccard Similarity Index between two objects based on an object description function.

jaccard_matrix(→ numpy.ndarray)

Calculate the Jaccard Similarity Index for a set of objects based on an object description function.

mapping_matrix(→ numpy.ndarray)

Calculate an object-property mapping matrix where each entry (i,j) indicates the possession of property j by object i.

ragraph.analysis.similarity._jaccard._calculate(props1: numpy.array, props2: numpy.array) → float

Calculate the Jaccard Index from two boolean object description arrays.
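The following is a minimal sketch of how such a calculation could look with NumPy; it is not the library's source. The name `jaccard_from_bools` is hypothetical, and returning 0.0 when neither object has any property is an assumption made here to avoid dividing by zero.

```python
import numpy as np


def jaccard_from_bools(props1: np.ndarray, props2: np.ndarray) -> float:
    """Hypothetical stand-in for a boolean-array Jaccard calculation."""
    props1 = np.asarray(props1, dtype=bool)
    props2 = np.asarray(props2, dtype=bool)

    # Number of properties possessed by at least one object (the union).
    union = np.logical_or(props1, props2).sum()
    if union == 0:
        # Assumption: two objects without any properties get similarity 0.0.
        return 0.0

    # Number of properties possessed by both objects (the overlap).
    overlap = np.logical_and(props1, props2).sum()
    return float(overlap / union)


a = np.array([True, True, False, True])
b = np.array([True, False, False, True])
similarity = jaccard_from_bools(a, b)  # 2 shared properties out of 3 in total
print(similarity)
```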

ragraph.analysis.similarity._jaccard.jaccard_index(obj1: Any, obj2: Any, on: Callable[[Any], List[bool]]) → float

Calculate the Jaccard Similarity Index between two objects based on an object description function.

Parameters:
  • obj1 – First object to compare.

  • obj2 – Second object to compare.

  • on – Callable that takes an object and describes it with a list of booleans. Each entry indicates the possession of a property.

Returns:

Jaccard Similarity between the two objects, calculated as the number of properties they share divided by the total number of distinct properties either of them possesses.
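To illustrate the role of the `on` argument, here is a self-contained sketch that mirrors the documented signature without importing the library. The property names, the `describe` function, the dictionary objects, and `jaccard_index_sketch` are all hypothetical; the 0.0 result for objects without properties is likewise an assumption.

```python
from typing import Any, Callable, List

# Hypothetical, fixed list of properties that an object may possess.
PROPERTIES = ["steel", "welded", "painted", "load_bearing"]


def describe(obj: dict) -> List[bool]:
    """Hypothetical description function: one boolean per known property."""
    return [prop in obj["props"] for prop in PROPERTIES]


def jaccard_index_sketch(
    obj1: Any, obj2: Any, on: Callable[[Any], List[bool]]
) -> float:
    """Illustrative re-implementation, not the library function itself."""
    d1, d2 = on(obj1), on(obj2)
    both = sum(p and q for p, q in zip(d1, d2))    # shared properties
    either = sum(p or q for p, q in zip(d1, d2))   # properties of either object
    # Assumption: objects without any properties get similarity 0.0.
    return both / either if either else 0.0


beam = {"name": "beam", "props": {"steel", "welded", "load_bearing"}}
panel = {"name": "panel", "props": {"steel", "painted"}}

# The objects share "steel" and possess four distinct properties in total.
score = jaccard_index_sketch(beam, panel, on=describe)
print(score)
```

Because the description function is passed in, the same comparison can be reused for any object type as long as `on` returns booleans in a consistent property order.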

ragraph.analysis.similarity._jaccard.jaccard_matrix(objects: List[Any], on: Callable[[Any], List[bool]]) → numpy.ndarray

Calculate the Jaccard Similarity Index for a set of objects based on an object description function.

Parameters:
  • objects – List of objects to generate a similarity matrix for.

  • on – Callable that takes an object and describes it with a list of booleans. Each entry indicates the possession of a property.
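One way to obtain such a pairwise similarity matrix is sketched below. It is an assumption about a possible approach, not the library's implementation: `jaccard_matrix_sketch`, `TAGS`, and `has_tags` are hypothetical names, the input is assumed to be a non-empty list whose descriptions all have the same length, and pairs of objects without any properties are assumed to score 0.0.

```python
from typing import Any, Callable, List

import numpy as np


def jaccard_matrix_sketch(
    objects: List[Any], on: Callable[[Any], List[bool]]
) -> np.ndarray:
    """Illustrative pairwise Jaccard matrix; assumes a non-empty object list."""
    # Rows are objects, columns are properties (1 = possessed, 0 = not).
    mapping = np.array([on(obj) for obj in objects], dtype=int)

    # Entry (i, k) counts the properties shared by objects i and k.
    overlap = mapping @ mapping.T
    # Number of properties per object.
    counts = mapping.sum(axis=1)
    # Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|.
    union = counts[:, None] + counts[None, :] - overlap

    # Assumption: pairs with an empty union keep the default value 0.0.
    result = np.zeros(overlap.shape, dtype=float)
    np.divide(overlap, union, out=result, where=union > 0)
    return result


TAGS = ["a", "b", "c"]  # hypothetical property names


def has_tags(obj: set) -> List[bool]:
    """Hypothetical description function for objects stored as tag sets."""
    return [tag in obj for tag in TAGS]


matrix = jaccard_matrix_sketch([{"a", "b"}, {"b", "c"}, {"a", "b"}], on=has_tags)
print(matrix)
```

The resulting matrix is symmetric, and every object that has at least one property scores 1.0 against itself.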

ragraph.analysis.similarity._jaccard.mapping_matrix(objects: List[Any], on: Callable[[Any], List[bool]]) → numpy.ndarray

Calculate an object-property mapping matrix where each entry (i,j) indicates the possession of property j by object i.

Parameters:
  • objects – List of objects to describe.

  • on – Callable that takes an object and describes it with a list of booleans. Each entry indicates the possession of a property.
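As a rough illustration of the object-property layout, the sketch below stacks one description per object into a 2D array. The names `mapping_matrix_sketch` and `has_vowels` are hypothetical, and the boolean dtype is an assumption; the library may use a different element type.

```python
from typing import Any, Callable, List

import numpy as np


def mapping_matrix_sketch(
    objects: List[Any], on: Callable[[Any], List[bool]]
) -> np.ndarray:
    """Illustrative mapping matrix: row i describes object i."""
    # Assumption: every description has the same length and property order.
    return np.array([on(obj) for obj in objects], dtype=bool)


VOWELS = "aeiou"  # hypothetical properties: does the word contain this vowel?


def has_vowels(word: str) -> List[bool]:
    """Hypothetical description function for string objects."""
    return [vowel in word for vowel in VOWELS]


words = ["graph", "node", "edge"]
mapping = mapping_matrix_sketch(words, on=has_vowels)

# Entry (i, j) is True when word i contains vowel j.
print(mapping.shape)
print(mapping[1, 3])  # does "node" contain "o"?
```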