Produce a Graph Summary
You can get summary information about a graph database, including the number of edges and vertices, without scanning the entire graph. Use the following Gremlin command to create a new virtual vertex containing summary information:
Vertex v = g.V("~graph_summary").next();
The virtual vertex v contains the following properties:
Property Key | Data Type | Notes |
---|---|---|
vertex_count | long | Total vertex count. |
edge_count | long | Total edge count. |
vertex_count_per_label | Map<String, Long> | Total vertex count for a given vertex label. |
edge_count_per_label | Map<String, Long> | Total edge count for a given edge label. |
vertex_properties_per_label | Map<String, Set<String>> | Union of vertex property keys for a given vertex label. |
edge_properties_per_label | Map<String, Set<String>> | Union of edge property keys for a given edge label. |
Vertex and edge property keys are recorded as a union and stored as persistent data. A key stays in the summary even after the associated property has been removed from every vertex or edge in the graph. To refresh the graph's property metadata, you must delete the entire database, which removes all vertices and edges. You can delete the entire database with the following command:
g.V().drop().iterate();
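Before dropping the graph, it may be worth checking whether the summary actually holds stale keys. The following is a minimal sketch that uses only the Java standard library. `StaleKeys` and `find` are hypothetical names, and the set of keys currently in use is assumed to come from a separate query that you run yourself.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: finds keys the summary still records but the graph no longer uses. */
final class StaleKeys {
    private StaleKeys() {}

    /**
     * recorded: keys listed for one label in vertex_properties_per_label.
     * inUse: keys currently present on vertices with that label. Assumption: obtained
     * with a query such as g.V().hasLabel(label).properties().key().dedup().toSet(),
     * which scans the matching vertices.
     */
    static Set<String> find(Set<String> recorded, Set<String> inUse) {
        Set<String> stale = new TreeSet<>(recorded); // copy so the input is not modified
        stale.removeAll(inUse);                      // keep only keys no longer in use
        return stale;
    }
}
```

An empty result suggests the property metadata for that label is still accurate and no refresh is needed.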
The summary collection task runs asynchronously and is intended to provide approximate statistics very quickly without having to scan the entire graph. The summary metadata may lag behind the actual graph, and may be skewed if the write load on any given node is high.
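Because the summary can lag behind the graph, a fixed sleep may be too short or longer than necessary. One alternative is to poll until the summary reports an expected count or a timeout expires. The following is a minimal sketch under stated assumptions, not part of the product API: `SummaryPoller` and `awaitCount` are hypothetical names, and the supplier is assumed to wrap a lookup of the summary vertex. Because the statistics are approximate, the count may never reach the expected value, so the timeout is essential.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

/** Hypothetical helper: polls a count until it reaches a target or a timeout expires. */
final class SummaryPoller {
    private SummaryPoller() {}

    /**
     * Returns true once readCount reports at least expected; returns false on
     * timeout or if the thread is interrupted.
     * Assumption: readCount wraps a summary lookup, for example
     *   () -> g.V("~graph_summary").next().<Long>value("vertex_count")
     */
    static boolean awaitCount(LongSupplier readCount, long expected,
                              Duration timeout, Duration interval) {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (readCount.getAsLong() >= expected) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without seeing the expected count
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

A call such as `SummaryPoller.awaitCount(reader, 4, Duration.ofSeconds(5), Duration.ofMillis(100))` would wait up to five seconds for four vertices to appear in the summary.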
The following Java example demonstrates usage of the summary vertex:
// Add vertices and edges.
final Vertex alfred = g.addV("person").property("name", "Alfred Simmons").property("age", 30).next();
final Vertex susan = g.addV("person").property("name", "Susan Field").property("location", "Vancouver").next();
final Vertex terence = g.addV("person").property("name", "Terence Tom").next();
final Vertex mycroft = g.addV("cat").property("breed", "Maine Coon").next();
g.addE("knows").from(alfred).to(susan).property("weight", 1.0).iterate();
g.addE("knows").from(alfred).to(terence).property("since", "2022").iterate();
g.addE("knows").from(susan).to(terence).iterate();
g.addE("swonk").from(terence).to(alfred).iterate();
g.addE("owns").from(alfred).to(mycroft).iterate();
// Add a sleep to allow the summary task to run.
Thread.sleep(1000);
// Get summary vertex.
final Vertex summary = g.V("~graph_summary").next();
// Loop through properties.
final Iterator<VertexProperty<Object>> properties = summary.properties();
while (properties.hasNext()) {
    VertexProperty<Object> property = properties.next();
    System.out.println(property.key() + ": " + property.value());
}
The above example produces the following output:
vertex_count_per_label: {person=3, cat=1}
edge_count_per_label: {swonk=1, owns=1, knows=3}
vertex_properties_per_label: {person=[name, location, age], cat=[breed]}
vertex_count: 4
edge_count: 5
edge_properties_per_label: {swonk=[], owns=[], knows=[weight, since]}
You can also use the call step to retrieve summary data. The following Java example uses call:
// Get summary data with the call step.
final Object summary = g.call("summary").next();
System.out.println(summary);
The returned summary Object is a Map, so the above example produces the following output:
{Edge properties by label={swonk=[], owns=[], knows=[weight, since]}, Total vertex count=4, Vertex count by label={person=3, cat=1}, Vertex properties by label={person=[name, location, age], cat=[breed]}, Total edge count=5, Edge count by label={swonk=1, owns=1, knows=3}}
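Since the result is a Map, individual values can be read by key instead of printing the whole object. The following is a minimal sketch that uses only the Java standard library. `SummaryMapReader` is a hypothetical helper, the key names are taken from the output shown above, and the value types (numeric totals, nested maps of label to count) are assumptions inferred from that output.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for reading values from the Map returned by g.call("summary"). */
final class SummaryMapReader {
    private SummaryMapReader() {}

    /** Reads a total such as "Total vertex count"; returns 0 if the key is absent. */
    static long total(Object summary, String key) {
        Object value = ((Map<?, ?>) summary).get(key);
        // Assumption: totals are numeric. Number avoids depending on Long vs. Integer.
        return value instanceof Number ? ((Number) value).longValue() : 0L;
    }

    /** Reads a per-label breakdown such as "Vertex count by label". */
    static Map<String, Long> countsByLabel(Object summary, String key) {
        Map<String, Long> counts = new LinkedHashMap<>();
        Object value = ((Map<?, ?>) summary).get(key);
        if (value instanceof Map) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                counts.put(String.valueOf(entry.getKey()),
                           ((Number) entry.getValue()).longValue());
            }
        }
        return counts;
    }
}
```

For example, `SummaryMapReader.total(summary, "Total edge count")` would read the edge total from the object returned by the call step.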
You can also use pretty print support for more readable output:
// Get pretty-printed summary data with the call step.
final Object summary = g.call("summary").with("pretty").next();
System.out.println(summary);
Output:
Total vertex count: 4.
Vertex count by label: {person=3, cat=1}.
Vertex properties by label: {person=[name, location, age], cat=[breed]}.
Total edge count: 5.
Edge count by label: {swonk=1, owns=1, knows=3}.
Edge properties by label: {swonk=[], owns=[], knows=[weight, since]}.