Clustering analysis
This page describes what clustering algorithms do and which ones RaGraph includes.
Most of them live in the ragraph.analysis.cluster module, and some are found in the
ragraph.analysis.heuristics module.
In general, clustering a Graph involves grouping components that have many mutual or
very strong dependencies (edges). Complete hierarchies can also be computed by
clustering nodes iteratively (e.g. clusters of clusters).
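As a rough illustration of what "many mutual or very strong dependencies" means, the sketch below scores a grouping by the share of total edge weight that stays inside clusters. The edge list, the function name internal_weight_fraction, and the score itself are assumptions made for this example; this is not how RaGraph's algorithms evaluate clusters.

```python
# Hypothetical toy data: (source, target, weight) dependencies between components.
edges = [
    ("a", "b", 3.0),
    ("b", "a", 3.0),
    ("a", "c", 2.0),
    ("c", "d", 0.5),
    ("d", "e", 4.0),
    ("e", "d", 4.0),
]


def internal_weight_fraction(edges, clusters):
    """Return the share of total edge weight that stays inside a cluster."""
    # Map every component to the index of the cluster that contains it.
    membership = {node: i for i, group in enumerate(clusters) for node in group}
    total = sum(w for _, _, w in edges)
    internal = sum(w for s, t, w in edges if membership[s] == membership[t])
    return internal / total if total else 0.0


# A grouping that follows the strong dependencies keeps most weight internal.
good = internal_weight_fraction(edges, [{"a", "b", "c"}, {"d", "e"}])
# A grouping that cuts through them keeps none of it internal.
poor = internal_weight_fraction(edges, [{"a", "d"}, {"b", "c", "e"}])
print(good > poor)
```

A grouping with a higher fraction separates the graph along its weak dependencies, which is the intuition behind the clustering results shown on this page.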
The algorithms accept quite a few optional settings, but at a minimum it suffices to
supply a Graph:
>>> from ragraph import datasets
>>> from ragraph.analysis import cluster
>>> g = datasets.get("climate_control")
>>> cluster.markov(g, names=True, inplace=True)
(<ragraph.graph.Graph(name='climate_control', kind='dataset', labels=['default'], weights={'default': 1}, annotations=Annotations({}), uuid=UUID(...)), 19 nodes, 68 edges at 0x...>, ['node.node0', 'node.node1', 'node.node2'])
This shows that our climate_control graph has been clustered into three new nodes,
which have been given default names. Let's review the hierarchy dictionary to view
this in more detail:
>>> import json
>>> h = g.get_hierarchy_dict()
>>> print(json.dumps(h, indent=2))
{
  "node.node0": {
    "Radiator": {},
    "Engine Fan": {},
    "Condenser": {},
    "Compressor": {},
    "Evaporator Core": {},
    "Accumulator": {}
  },
  "node.node1": {
    "Heater Core": {},
    "Heater Hoses": {},
    "Evaporator Case": {},
    "Actuators": {},
    "Blower Controller": {},
    "Blower Motor": {}
  },
  "node.node2": {
    "Refrigeration Controls": {},
    "Air Controls": {},
    "Sensors": {},
    "Command Distribution": {}
  }
}
That is, we have two clusters of six components each and a third cluster containing the remaining four.
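To check cluster sizes programmatically, the hierarchy dictionary can be walked with plain Python. The sketch below is a minimal illustration: it copies the dictionary printed above into a literal, and count_leaves is a hypothetical helper name that is not part of RaGraph.

```python
# Hierarchy dictionary copied from the printed output above.
h = {
    "node.node0": {
        "Radiator": {}, "Engine Fan": {}, "Condenser": {},
        "Compressor": {}, "Evaporator Core": {}, "Accumulator": {},
    },
    "node.node1": {
        "Heater Core": {}, "Heater Hoses": {}, "Evaporator Case": {},
        "Actuators": {}, "Blower Controller": {}, "Blower Motor": {},
    },
    "node.node2": {
        "Refrigeration Controls": {}, "Air Controls": {},
        "Sensors": {}, "Command Distribution": {},
    },
}


def count_leaves(subtree):
    """Count the leaf nodes below a subtree; an empty dict is a leaf itself."""
    if not subtree:
        return 1
    return sum(count_leaves(child) for child in subtree.values())


# Number of leaf components under each top-level cluster.
sizes = {name: count_leaves(children) for name, children in h.items()}
print(sizes)  # {'node.node0': 6, 'node.node1': 6, 'node.node2': 4}
```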
An example of hierarchical clustering would be the following:
>>> g = datasets.get("climate_control") # Reload the graph
>>> g, roots = cluster.hierarchical_markov(g, inplace=True)
>>> h = g.get_hierarchy_dict()
>>> print(json.dumps(h, indent=2))
{
  "node.node3": {
    "node.node0": {
      "Radiator": {},
      "Engine Fan": {},
      "Condenser": {},
      "Compressor": {},
      "Evaporator Core": {},
      "Accumulator": {}
    },
    "node.node1": {
      "Heater Core": {},
      "Heater Hoses": {},
      "Evaporator Case": {},
      "Actuators": {},
      "Blower Controller": {},
      "Blower Motor": {}
    },
    "node.node2": {
      "Refrigeration Controls": {},
      "Air Controls": {},
      "Sensors": {},
      "Command Distribution": {}
    }
  }
}
Here we can see that the same three clusters we found earlier have now been grouped
under a new parent cluster node, "node.node3".
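Nested hierarchy dictionaries like this one can be traversed recursively to inspect each level. The sketch below assumes only the nested-dict shape shown in the output; the dictionary is abbreviated for brevity and the walk generator is a hypothetical helper, not a RaGraph function.

```python
# Abbreviated copy of the hierarchy above (only two leaves kept per cluster).
h = {
    "node.node3": {
        "node.node0": {"Radiator": {}, "Compressor": {}},
        "node.node1": {"Heater Core": {}, "Blower Motor": {}},
        "node.node2": {"Sensors": {}, "Air Controls": {}},
    }
}


def walk(hierarchy, depth=0):
    """Yield (depth, name) pairs in depth-first order."""
    for name, children in hierarchy.items():
        yield depth, name
        yield from walk(children, depth + 1)


levels = list(walk(h))
max_depth = max(depth for depth, _ in levels)

# Print the tree with two spaces of indentation per level.
for depth, name in levels:
    print("  " * depth + name)
```

Here the root sits at depth 0, the three clusters at depth 1, and the leaf components at depth 2.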