In the Athens meeting we agreed that
- we will use a GraphQL API for the SPIRIT components to access the content database (DB), and
- we will use the GraphQL schema language for defining the schema of the content DB.
For the first point (the GraphQL API) we need a GraphQL schema. Notice that, conceptually, this schema is different from the schema we are creating for the content DB as per the second point. More precisely, while the schema for the content DB defines exactly what the objects that we store in the content DB look like, the schema for the API defines how these objects can be queried via the GraphQL API (i.e., what data can be requested for these objects) and how these objects can be inserted and modified via the GraphQL API. Essentially, the schema for the GraphQL API needs to contain more than what we have in the schema for the DB. A natural question at this point is: How do we get this schema for the API?
The answer is to generate this schema automatically from the DB schema! The advantage of this approach is that we do not need to do any manual work for creating the API schema. Instead, that schema is simply generated by pushing a button (or, calling a command-line tool, to be more precise ;) Moreover, if we later extend or modify the DB schema, we can easily generate an extended API schema that reflects the changes of the DB schema. Now, the question is: How is the API schema generated from the DB schema?
We (LiU) will define an approach to generate the API schema from the DB schema, and we will develop the tool that implements this approach. In the following, I provide an overview of the approach.
The idea of the approach to generate the API schema from the DB schema is to copy the DB schema into a new file and, then, extend the schema in this new file with all the additional things needed for the API schema. These additional things needed are:
- an `ID` field in every object type that enables the GraphQL queries to access the system-generated identifier of each object,
- a query type that specifies the starting points of queries sent to the GraphQL API,
- additional fields in the object types that enable the GraphQL queries to traverse relationships between the objects in the reverse direction,
- additional fields in the object types that enable the GraphQL queries to access the data associated with these relationships, and
- a mutation type and corresponding input types that specify how data can be inserted and modified via the GraphQL API.
1. ID Fields
When inserting a data object into the database, the database management system (ArangoDB in our case) generates an identifier for it. While these identifiers do not need to be part of the DB schema, they should be contained in the schema for the GraphQL API so that they can be requested in GraphQL queries (and, then, used later in subsequent queries). Therefore, when extending the DB schema into the schema for the GraphQL API, each object type is augmented with a field named `ID` whose value type is `ID!`.
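As a minimal sketch of this step, consider the `Investigation` type whose DB-schema definition appears in Section 3 below. Only the name and the value type of the new field are stated above; that the field is placed first and that all other fields are copied unchanged are assumptions made for this illustration.

```graphql
# Investigation as it might appear in the API schema after the ID step only.
# Every field except the first is copied unchanged from the DB schema.
type Investigation {
  ID: ID!              # added: system-generated identifier
  UserID: ID!
  Title: String!
  Description: String!
  CaseNumber: String!
  Authorization: Authorization
  Searches: [Search]
  Created: Date!
  Hide: Boolean!
}
```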
2. Query Type
Every GraphQL schema for a GraphQL API must have one special type called the query type. The schema for the DB does not need such a query type and, in fact, it should not contain one. The purpose of the query type is to specify the possible starting points of any kind of query that can be sent to the API. For instance, consider the following snippet of a GraphQL API schema which defines the query type of the corresponding API.
type Query {
Investigation(ID:ID!): Investigation
}
Based on this query type, it is (only) possible to write queries (API requests) that start from an `Investigation` object specified by a given ID. For instance, it is possible to write the following query.
query {
  Investigation(ID:371) {
    Title
    Description
    Authorization {
      SearchPurpose
      Necessity
    }
  }
}
However, with a query type like the one above, it would not be possible to query directly for, say, an `Authorization` object specified by its ID.
Now, when extending the DB schema into the API schema, the plan is to generate a query type that contains two fields (i.e., starting points for queries) for every object type in the DB schema: one of these fields can be used to query for one object of the type based on the ID of that object, and the second field can be used to access a paginated list of all objects of the corresponding type. The list is paginated, which means that it can be accessed in chunks.
For example, for the `Investigation` type that we have in our DB schema, the generated query type would contain the following two fields.
Investigation(ID:ID!): Investigation
ListOfInvestigations(first:Int after:ID): ListOfInvestigations!
The additional type called `ListOfInvestigations` that is used here will be defined as follows.
type ListOfInvestigations {
totalCount: Int
isEndOfWholeList: Boolean
content: [Investigation]
}
Then, it will be possible to write queries such as the following.
query {
  ListOfInvestigations(first:10 after:371) {
    totalCount
    isEndOfWholeList
    content {
      Title
      Description
      Authorization {
        SearchPurpose
      }
    }
  }
}
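To illustrate the rule of two fields per object type, the following sketch shows what the generated query type might look like for a DB schema that contains only the `Investigation` and `Authorization` types. The name `ListOfAuthorizations` and the fields of that type are not given in the text above; they are assumed here by analogy with `ListOfInvestigations`.

```graphql
type Query {
  # lookup of a single object by its ID
  Investigation(ID:ID!): Investigation
  Authorization(ID:ID!): Authorization
  # paginated access to all objects of a type
  ListOfInvestigations(first:Int after:ID): ListOfInvestigations!
  ListOfAuthorizations(first:Int after:ID): ListOfAuthorizations!   # assumed name
}

# Assumed by analogy with ListOfInvestigations.
type ListOfAuthorizations {
  totalCount: Int
  isEndOfWholeList: Boolean
  content: [Authorization]
}
```

Under these assumptions, the first chunk of authorizations could be requested as follows.

```graphql
query {
  ListOfAuthorizations(first:5) {
    isEndOfWholeList
    content {
      ID
      SearchPurpose
    }
  }
}
```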
3. Additional Fields For Traversal
In the DB schema, each type of relationship (edge) between objects of particular types is defined in only one of the two related object types. For instance, consider the two object types `Investigation` and `Authorization`, whose definitions in the DB schema look as follows.
type Investigation {
UserID: ID!
Title: String!
Description: String!
CaseNumber: String!
Authorization: Authorization
Searches: [Search]
Created: Date!
Hide: Boolean!
}
type Authorization {
SearchPurpose: String!
Necessity: [Necessity]
Proportionality: [Proportionality]
TrainedAndAuthorized: Boolean
DarkWeb: Boolean!
AuthBy: Authority!
Created: Date!
}
Notice that the relationship (i.e., the possible edges) between `Investigation` objects and `Authorization` objects is defined only in the definition of the type `Investigation` (see the field named `Authorization`) but not in the type `Authorization`. Specifying every edge type only once is sufficient for the purpose of defining the schema of a (graph) database. However, it is not sufficient for supporting bidirectional traversal of these edges in GraphQL queries. Hence, the schema for the API needs to mention possible edges twice; that is, in both of the corresponding object types. For the aforementioned example of the relationships between `Investigation` objects and `Authorization` objects, the API schema thus needs to contain an additional field in the type `Authorization` such that this field can be used to query from an `Authorization` object to the `Investigation` objects that point to it via their `Authorization` fields. Hence, when extending the aforementioned part of the DB schema into the schema for the GraphQL API, the definition of the `Authorization` type will be extended as follows.
type Authorization {
ID: ID!
SearchPurpose: String!
Necessity: [Necessity]
Proportionality: [Proportionality]
TrainedAndAuthorized: Boolean
DarkWeb: Boolean!
AuthBy: Authority!
Created: Date!
Investigation: [Investigation]
}
Observe that the value type of the added field named `Investigation` is a list of `Investigation` objects. This is because, according to the DB schema, multiple different `Investigation` objects may point to the same `Authorization` object; i.e., the relationship between `Investigation` objects and `Authorization` objects is a many-to-one relationship (N:1). Therefore, from an `Authorization` object, we may come to multiple `Investigation` objects.
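As a sketch of what this added field enables, the following query starts from an `Authorization` object and follows the edges in the reverse direction to all investigations that point to it. It assumes that the generated query type contains an `Authorization(ID:ID!)` field, as planned in Section 2; the ID value 512 is made up for the example.

```graphql
query {
  Authorization(ID:512) {
    SearchPurpose
    # reverse traversal: all Investigation objects whose
    # Authorization field points to this Authorization object
    Investigation {
      ID
      Title
    }
  }
}
```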
Perhaps this was not the intention and, instead, the relationship between `Investigation` objects and `Authorization` objects was meant to be a one-to-one relationship. This could have been captured by adding the `@uniqueForTarget` directive to the field named `Authorization` in the DB schema (as described in the text before Example 7 of http://blog.liu.se/olafhartig/documents/graphql-schemas-for-property-graphs/). Assuming such a `@uniqueForTarget` directive were present, the new field named `Investigation` that is added when extending the DB schema into the API schema would be defined differently:
type Authorization {
ID: ID!
SearchPurpose: String!
Necessity: [Necessity]
Proportionality: [Proportionality]
TrainedAndAuthorized: Boolean
DarkWeb: Boolean!
AuthBy: Authority!
Created: Date!
Investigation: Investigation
}
This example demonstrates that the exact definition of the fields that are added when extending the DB schema into the API schema depends on the constraints that are captured by directives in the DB schema. To elaborate a bit further on this point, let us assume that the aforementioned field named `Authorization` in the DB schema is additionally annotated with the `@requiredForTarget` directive (in addition to the `@uniqueForTarget` directive). In this case, the extension of the type `Authorization` for the API schema would look as follows (notice the additional exclamation mark at the end of the value type for the new `Investigation` field).
type Authorization {
ID: ID!
SearchPurpose: String!
Necessity: [Necessity]
Proportionality: [Proportionality]
TrainedAndAuthorized: Boolean
DarkWeb: Boolean!
AuthBy: Authority!
Created: Date!
Investigation: Investigation!
}
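For reference, the DB-schema side of this last variant might be written as in the following sketch. It assumes that the two directives are attached to the field in the same way as in the `Blogger` example of the next section; all other fields of `Investigation` are unchanged.

```graphql
type Investigation {
  UserID: ID!
  Title: String!
  Description: String!
  CaseNumber: String!
  # Constraints on the target (Authorization) side of the edge:
  #   @uniqueForTarget   -> at most one incoming edge per Authorization
  #   @requiredForTarget -> at least one incoming edge per Authorization
  # Together they yield the non-null, non-list field
  # "Investigation: Investigation!" in the API schema.
  Authorization: Authorization @uniqueForTarget @requiredForTarget
  Searches: [Search]
  Created: Date!
  Hide: Boolean!
}
```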
4. Additional Fields and Types For Edges
Edges in a Property Graph database may have properties (key-value pairs) associated with them. When defining the DB schema, these properties can be defined as field arguments as demonstrated in the following snippet of a DB schema.
type Blogger {
Name: String!
Blogs(certainty:Int! comment:String): [Blog] @uniqueForTarget @requiredForTarget
}
type Blog {
Title: String!
Text: String!
}
By this definition, every edge from a `Blogger` object to a `Blog` object has a `certainty` property and, optionally, it may have a `comment` property.
Field arguments such as `certainty` and `comment` would have a different meaning when used in a schema for the GraphQL API and, thus, they have to be removed from the field definitions when extending the DB schema into the API schema. Hence, after removing the field arguments (and adding the aforementioned `ID` fields and the fields for traversing edges in the opposite direction), the API schema for the aforementioned DB schema would look as follows.
type Blogger {
ID: ID!
Name: String!
Blogs: [Blog] @uniqueForTarget @requiredForTarget
}
type Blog {
ID: ID!
Title: String!
Text: String!
Blogger: Blogger!
}
Although we have to remove the field arguments from the fields that define edge types in the DB schema, we may want to enable GraphQL queries to access the values of the edge properties that the edges of these types have. For instance, we may want to query for the `certainty` of edges between bloggers and blogs. To this end, the edges have to be represented as objects in the GraphQL API. Hence, it is necessary to generate an object type for each type of edges and integrate these object types into the schema for the API. For instance, for the edges between bloggers and blogs, an object type called `BlogsEdgeFromBlogger` will be generated, and access to objects of this new type will be integrated into the schema by adding a new field to the `Blogger` type and to the `Blog` type, respectively.
type Blogger {
ID: ID!
Name: String!
Blogs: [Blog] @uniqueForTarget @requiredForTarget
OutgoingBlogsEdges: [BlogsEdgeFromBlogger]
}
type Blog {
ID: ID!
Title: String!
Text: String!
Blogger: Blogger!
IncomingBlogsEdgeFromBlogger: BlogsEdgeFromBlogger!
}
type BlogsEdgeFromBlogger {
ID: ID!
source: Blogger!
target: Blog!
certainty:Int!
comment:String
}
Given this extension, it is now possible to write GraphQL queries that access properties of the edges along which they are traversing. The following query demonstrates this option.
query {
  Blogger(ID:3991) {
    Name
    OutgoingBlogsEdges {
      certainty
      target {
        Title
        Text
      }
    }
  }
}
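The same edge objects can also be reached from the other end. The following sketch starts from a `Blog` object and uses the `IncomingBlogsEdgeFromBlogger` field to read the edge properties together with the blogger at the source of the edge. It assumes a generated `Blog(ID:ID!)` query field as planned in Section 2; the ID value 7520 is made up for the example.

```graphql
query {
  Blog(ID:7520) {
    Title
    IncomingBlogsEdgeFromBlogger {
      certainty
      comment
      # the Blogger object at the source end of the edge
      source {
        ID
        Name
      }
    }
  }
}
```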
5. Mutation Type and Corresponding Input Types
In addition to the aforementioned query type, another special type that a GraphQL API schema may contain is the mutation type. The fields of this type specify how data can be inserted and modified via the GraphQL API. For instance, the following snippet of a GraphQL API schema defines a mutation type.
type Mutation {
setTitleOfInvestigation(ID:ID! Title:String!): Investigation
}
Given this mutation type, it is possible to modify the title of an `Investigation` object specified by a given ID; the result of this operation is defined to be an `Investigation` object (we may assume that this is the modified `Investigation`, which may then be retrieved as the response for the operation).
Now, when extending the DB schema into the API schema, the plan is to generate a mutation type that contains three operations for every object type in the DB schema and another three operations for every edge type. The three operations for an object type `XYZ` are called `createXYZ`, `updateXYZ`, and `deleteXYZ`; as their names suggest, these operations can be used to create, to update, and to delete an object of the corresponding type, respectively. In the following, we discuss the mutation operations in more detail.
#### 5.1 Creating an Object
Consider the aforementioned object type `Investigation` of our DB schema. The create operation for `Investigation` objects will be defined as follows.
createInvestigation(data:DataToCreateInvestigation!): Investigation
The value of the argument `data` is a complex input object that provides the data for the `Investigation` object that is to be created. This input object must be of the type `DataToCreateInvestigation`. This input type, which will be generated from the object type `Investigation` of the DB schema, will be defined as follows.
input DataToCreateInvestigation {
UserID: ID!
Title: String!
Description: String!
CaseNumber: String!
Authorization: DataToConnectAuthorizationOfInvestigation
Searches: [DataToConnectSearchesOfInvestigation]
Created: Date!
Hide: Boolean!
}
input DataToConnectAuthorizationOfInvestigation {
connect: ID
create: DataToCreateAuthorization
}
input DataToConnectSearchesOfInvestigation {
connect: ID
create: DataToCreateSearch
}
input DataToConnectScheduleSearchesOfInvestigation {
connect: ID
create: DataToCreateSearch
}
Notice that all fields that are mandatory in `Investigation` are also mandatory in `DataToCreateInvestigation` (and optional fields remain optional). Moreover, fields whose value type in `Investigation` is a scalar type (or a list thereof) have the same value type in `DataToCreateInvestigation`. In contrast, fields that represent outgoing edges in `Investigation` have a new input type that can be used to create the corresponding outgoing edge(s). This can be done in one of two ways: either by identifying the target node of the edge via the `connect` field or by creating a new target node via the `create` field.
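As a sketch of how this create operation might be used, the following mutation creates a new investigation and connects it to an already existing `Authorization` object via `connect`. All values are made up for the example. In particular, the string format used for the custom `Date` scalar is an assumption, since the serialization of `Date` is not specified here.

```graphql
mutation {
  createInvestigation(
    data: {
      UserID: 87
      Title: "A Scandal in Bohemia"
      Description: "Recovery of a compromising photograph"
      CaseNumber: "1891-07"
      # create an edge to an existing Authorization object (hypothetical ID)
      Authorization: { connect: 512 }
      Created: "2019-05-20"     # assumed Date format
      Hide: false
      # Searches is optional and therefore omitted
    }
  ) {
    ID
    Title
    Authorization {
      ID
    }
  }
}
```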
#### 5.2 Updating an Object
The update operation for `Investigation` objects will be defined as follows.
updateInvestigation(ID:ID! data:DataToUpdateInvestigation!): Investigation
The `Investigation` object to be updated is identified by the argument `ID`. The value of the argument `data` in this case is an input object of another input type, which provides values for the fields to be modified. This input type is defined as follows.
input DataToUpdateInvestigation {
UserID: ID
Title: String
Description: String
CaseNumber: String
Authorization: DataToConnectAuthorizationOfInvestigation
Searches: [DataToConnectSearchesOfInvestigation]
Created: Date
Hide: Boolean
}
Notice that all fields in this input type are optional to allow users to freely choose the fields of the `Investigation` object that have to be updated. For instance, to update the `UserID` and the `Title` of the `Investigation` object with the ID 371 we may write the following.
mutation {
  updateInvestigation(
    ID:371
    data: {
      UserID:87
      Title:"Five Orange Pips"
    }
  ) {
    ID
    UserID
    Title
  }
}
Updates like this overwrite the previous values of the updated fields. In the case of outgoing edges, this means that the new edges replace all previously existing edges (without removing the target nodes of the replaced edges). If you want to add edges instead, use the operations for creating edges described below (Section 5.4).
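To make the replacement behavior for edges concrete, the following sketch updates only the `Authorization` edge of the investigation with the ID 371. According to the description above, any previously existing `Authorization` edge of that investigation would be replaced, while the previously connected `Authorization` object itself would remain in the database. The ID value 513 is made up for the example.

```graphql
mutation {
  updateInvestigation(
    ID:371
    data: {
      # replaces the existing Authorization edge of investigation 371;
      # the old target node is not removed
      Authorization: { connect: 513 }
    }
  ) {
    ID
    Authorization {
      ID
      SearchPurpose
    }
  }
}
```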
#### 5.3 Deleting an Object
The delete operation for `Investigation` objects will be defined as follows.
deleteInvestigation(ID:ID!): Investigation
The argument `ID` can be used to identify the `Investigation` object to be deleted.
Notice that deleting an object implicitly deletes all incoming and all outgoing edges of that object.
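A request that uses this operation might look like the following sketch. It assumes that the `Investigation` object returned by the operation is the object that has just been deleted, so that its scalar fields can still be requested in the response; this is not stated explicitly above.

```graphql
mutation {
  deleteInvestigation(ID:371) {
    # assumed: fields of the deleted object, returned one last time
    ID
    Title
  }
}
```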
#### 5.4 Creating an Edge
Consider the type of edges that represent the aforementioned relationship between bloggers and blogs. In the DB schema, this type of edges is defined implicitly by the field definition `Blogs` in the object type `Blogger`. The create operation for these edges will be defined as follows.
createBlogsEdgeFromBlogger(data:DataToCreateBlogsEdgeFromBlogger!): BlogsEdgeFromBlogger
The new input type for this operation, `DataToCreateBlogsEdgeFromBlogger`, will be generated as follows.
input DataToCreateBlogsEdgeFromBlogger {
sourceID: ID! # assumed to be the ID of a Blogger object
targetID: ID! # assumed to be the ID of a Blog object
certainty:Int!
comment:String
}
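As a sketch, an edge between an existing blogger and an existing blog could then be created as follows. Both IDs and the property values are made up for the example.

```graphql
mutation {
  createBlogsEdgeFromBlogger(
    data: {
      sourceID: 3991      # ID of an existing Blogger (hypothetical)
      targetID: 7520      # ID of an existing Blog (hypothetical)
      certainty: 90
      comment: "confirmed by profile page"
    }
  ) {
    ID
    certainty
    source { Name }
    target { Title }
  }
}
```

Because the `Blogs` field carries the `@uniqueForTarget` directive in the DB schema, one would expect such a request to fail if the blog already has an incoming edge from a blogger; how the generated API reports this case is not specified here.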
#### 5.5 Updating an Edge
Update operations for edges will be generated only for the types of edges that have edge properties. The edges that represent the aforementioned relationship between bloggers and blogs are an example of such edges. The update operation that will be generated for these edges is:
updateBlogsEdgeFromBlogger(ID:ID! data:DataToUpdateBlogsEdgeFromBlogger!): BlogsEdgeFromBlogger
The argument `ID` can be used to identify the `BlogsEdgeFromBlogger` object that represents the edge to be updated.
The new input type for this operation, `DataToUpdateBlogsEdgeFromBlogger`, will be generated as follows.
input DataToUpdateBlogsEdgeFromBlogger {
certainty:Int
comment:String
}
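For instance, the `certainty` of an existing edge could be changed as in the following sketch, which leaves the `comment` property untouched because it is not mentioned in `data`. The edge ID 9001 is made up; in practice it would be obtained from an earlier query or from the response of the create operation.

```graphql
mutation {
  updateBlogsEdgeFromBlogger(
    ID:9001
    data: {
      certainty: 60
    }
  ) {
    ID
    certainty
    comment
  }
}
```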
#### 5.6 Deleting an Edge
The delete operation for the edges between bloggers and blogs will be defined as follows.
deleteBlogsEdgeFromBlogger(ID:ID!): BlogsEdgeFromBlogger
The argument `ID` can be used to identify the `BlogsEdgeFromBlogger` object that represents the edge to be deleted.
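Collecting the operations discussed in this section, the mutation type generated for the `Blogger`/`Blog` schema of Section 4 could look roughly as follows. Only the three edge operations are spelled out above; the object operations and the input type names `DataToCreateBlogger`, `DataToUpdateBlogger`, `DataToCreateBlog`, and `DataToUpdateBlog` are extrapolated from the naming pattern used for `Investigation` and should be read as assumptions.

```graphql
type Mutation {
  # object operations (names extrapolated from the Investigation example)
  createBlogger(data:DataToCreateBlogger!): Blogger
  updateBlogger(ID:ID! data:DataToUpdateBlogger!): Blogger
  deleteBlogger(ID:ID!): Blogger
  createBlog(data:DataToCreateBlog!): Blog
  updateBlog(ID:ID! data:DataToUpdateBlog!): Blog
  deleteBlog(ID:ID!): Blog
  # edge operations (as defined in Sections 5.4 to 5.6)
  createBlogsEdgeFromBlogger(data:DataToCreateBlogsEdgeFromBlogger!): BlogsEdgeFromBlogger
  updateBlogsEdgeFromBlogger(ID:ID! data:DataToUpdateBlogsEdgeFromBlogger!): BlogsEdgeFromBlogger
  deleteBlogsEdgeFromBlogger(ID:ID!): BlogsEdgeFromBlogger
}
```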