Querying Social Graphs in SQL
An example of querying a cinema relation graph with Kinetica (no external dataset required)
An example of querying a cinema relation graph with Kinetica (no external dataset required)
The following is a complete example, using SQL, of querying a graph of movies & TV shows and the people who star in and direct them. For more information on Graphs & Solvers, see Graphs & Solvers Concepts.
The prerequisites for running this query graph example are listed below:
This example is going to demonstrate querying a cinema-based graph of relationships between actors, directors, & shows for:
Three source data tables will be used to construct the graph:
This table contains the name, semicolon-separated list of production roles, and date of birth of actors, directors, & producers used in this example. The list of production roles will be used later in graph creation to associate all of a given person's career-wide production roles with them, not just their production role on any particular show.
|
|
|
|
This table contains the title, show type (movie or TV series), semicolon-separated list of genres, year of release, and score for the shows used in this example. The list of genres will be used in graph creation to associate each of the genres with the corresponding show.
|
|
|
|
This table contains the production role each person had on each show within the example data set. The production role values are a semicolon-separated list of roles, each of which will be applied to the relation between the person and the show within the graph.
|
|
|
|
This example will demonstrate both constructing a graph from source tables and amending the graph with additional data after construction.
One graph is used for these query graph examples, qsgs_cinema, which will use several queries against the source tables to build the edges, nodes, & labels. The graph is created with the CREATE GRAPH command:
|
|
The graph will contain 4 node types:
The graph will also contain 3 edge types:
Lastly, the graph is created with the following characteristics:
The created graph can have data (nodes, edges, & labels) added to it after creation, as demonstrated with the ALTER GRAPH command:
|
|
Here, the show Stargate and the person Roland Emmerich are added to the graph, along with their associated connections. Note that the graph entities are added as constants instead of queries here, though either or a mixture of the two is acceptable in both CREATE GRAPH & ALTER GRAPH.
Though not required for supplementing the graph data, a backing record for Stargate is added to the show table for use in a supplemental data query later in this guide.
|
|
This example will demonstrate four ways graph queries can be used to answer relationship questions:
To find the directors and titles of movies (no TV series) in which James Spader has acted:
|
|
The graph query is constructed in the following way:
To perform the same query as in Query by Hops above, but employing the Kinetica full text search capability using the FILTER_BY_STRING function to find the same James Spader starting node:
|
|
The graph query is constructed in the following way:
To perform the same query as in Query by Hops above, but employing a SQL join to supplement the query results with movie scores from the show table:
|
|
The graph query is constructed in the following way:
To perform the same query as in Query by Hops above, but nesting it within another graph query to find the titles and release years of all shows directed by directors of James Spader movies:
|
|
The graph query is constructed in the following way:
Included below is a complete example containing all the above requests and corresponding output.
The script can be run via Workbooks in Workbench or any SQL client.
To execute using KiSQL, download the SQL script, switch to the directory in which it has been downloaded, and run:
|
|