Data
Graph
spektral.data.graph.Graph(x=None, a=None, e=None, y=None)
A container to represent a graph. The data associated with the Graph is stored in its attributes:
- x, for the node features;
- a, for the adjacency matrix;
- e, for the edge attributes;
- y, for the node or graph labels.
All of these default to None if you don't specify them in the constructor.
If you want to read all non-None attributes at once, you can call the numpy() method, which returns them in a tuple (in the order listed above).
Graphs also have the following attributes that are computed automatically from the data:
- n_nodes: number of nodes;
- n_edges: number of edges;
- n_node_features: size of the node features, if available;
- n_edge_features: size of the edge features, if available;
- n_labels: size of the labels, if available.
Any additional kwargs passed to the constructor will be automatically assigned as instance attributes of the graph.
Data can be stored in Numpy arrays or Scipy sparse matrices, and labels can also be scalars.
Spektral usually assumes that the different data matrices have specific shapes, although this is not strictly enforced, to allow more flexibility.
In general, node attributes should have shape (n_nodes, n_node_features) and the adjacency matrix should have shape (n_nodes, n_nodes).
Edge attributes can be stored in a dense format as arrays of shape (n_nodes, n_nodes, n_edge_features) or in a sparse format as arrays of shape (n_edges, n_edge_features) (so that you don't have to store all the zeros for missing edges). Most components of Spektral will know how to deal with both situations automatically.
Labels can refer to the entire graph (shape (n_labels, )) or to each individual node (shape (n_nodes, n_labels)).
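The shape conventions above can be illustrated with plain NumPy arrays. The following is a minimal sketch for a made-up 3-node graph; all names and values are hypothetical, and the row-major edge ordering produced by np.nonzero is an assumption of this sketch, not something the text above specifies.

```python
import numpy as np

n_nodes, n_node_features, n_edge_features, n_labels = 3, 4, 2, 5

# Node features: one row per node.
x = np.ones((n_nodes, n_node_features))

# Adjacency matrix of an undirected path graph 0 - 1 - 2.
a = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

# Dense edge attributes: one vector per (i, j) pair, zeros where there is no edge.
e_dense = np.zeros((n_nodes, n_nodes, n_edge_features))
e_dense[a == 1] = [0.5, 2.0]

# Sparse edge attributes: keep only the rows that belong to existing edges.
rows, cols = np.nonzero(a)
e_sparse = e_dense[rows, cols]

# Labels can describe the whole graph or each node.
y_graph = np.zeros((n_labels,))
y_nodes = np.zeros((n_nodes, n_labels))

print(x.shape, a.shape, e_dense.shape, e_sparse.shape)
```

Here the sparse format stores 4 rows (one per non-zero entry of the adjacency matrix) instead of the 9 rows implied by the dense format.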
Arguments
- x: np.array, the node features (shape (n_nodes, n_node_features));
- a: np.array or scipy.sparse matrix, the adjacency matrix (shape (n_nodes, n_nodes));
- e: np.array, the edge features (shape (n_nodes, n_nodes, n_edge_features) or (n_edges, n_edge_features));
- y: np.array, the node or graph labels (shape (n_nodes, n_labels) or (n_labels, )).
Dataset
spektral.data.dataset.Dataset(transforms=None)
A container for Graph objects. This class can be extended to represent a graph dataset.
To create a Dataset, you must implement the Dataset.read() method, which must return a list of spektral.data.Graph objects:
from spektral.data import Dataset, Graph

class MyDataset(Dataset):
    def read(self):
        # Graph takes the adjacency matrix through its 'a' argument
        return [Graph(x=x, a=adj, y=y) for x, adj, y in some_magic_list]
The download() method is automatically called if the path returned by Dataset.path does not exist (default ~/spektral/datasets/ClassName/).
In this case, download() will be called before read().
Datasets should generally behave like Numpy arrays for any operation that uses simple 1D indexing:
>>> dataset[0]
Graph(...)
>>> dataset[[1, 2, 3]]
Dataset(n_graphs=3)
>>> dataset[1:10]
Dataset(n_graphs=9)
>>> np.random.shuffle(dataset) # shuffle in-place
>>> for graph in dataset[:3]:
...     print(graph)
Graph(...)
Graph(...)
Graph(...)
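To make the indexing behaviour above concrete, here is a minimal sketch of a container that supports integer, list, and slice indexing as well as in-place shuffling. TinyDataset is a hypothetical stand-in written for illustration; it is not Spektral's implementation, and strings are used in place of Graph objects.

```python
import random


class TinyDataset:
    """Hypothetical stand-in for a dataset; not Spektral's implementation."""

    def __init__(self, graphs):
        self.graphs = list(graphs)

    def __len__(self):
        return len(self.graphs)

    def __getitem__(self, key):
        if isinstance(key, int):
            return self.graphs[key]                        # a single item
        if isinstance(key, slice):
            return TinyDataset(self.graphs[key])           # a sub-dataset
        return TinyDataset([self.graphs[i] for i in key])  # a list of indices

    def __setitem__(self, key, value):
        # Item assignment is what lets in-place shuffling swap elements.
        self.graphs[key] = value

    def __repr__(self):
        return f"TinyDataset(n_graphs={len(self)})"


dataset = TinyDataset(f"graph_{i}" for i in range(20))
single = dataset[0]
subset = dataset[[1, 2, 3]]
window = dataset[1:10]
random.shuffle(dataset)  # relies on __len__, __getitem__ and __setitem__
print(single, subset, window)
```

The sketch returns a new container for list and slice indexing, which mirrors the Dataset(n_graphs=...) results shown in the session above.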
Datasets have the following properties that are automatically computed:
- n_nodes: the number of nodes in the dataset (always None, except in single and mixed mode datasets);
- n_node_features: the size of the node features (assumed to be equal for all graphs);
- n_edge_features: the size of the edge features (assumed to be equal for all graphs);
- n_labels: the size of the labels (assumed to be equal for all graphs); this is computed as y.shape[-1].
Any additional kwargs passed to the constructor will be automatically assigned as instance attributes of the dataset.
Datasets also offer three main manipulation functions to apply callables to their graphs:
- apply(transform): replaces each graph with the output of transform(graph). See spektral.transforms for some ready-to-use transforms.
  Example: apply(spektral.transforms.NormalizeAdj()) normalizes the adjacency matrix of each graph in the dataset.
- map(transform, reduce=None): returns a list containing the output of transform(graph) for each graph. If reduce is a callable, then returns reduce(output_list).
  Example: map(lambda g: g.n_nodes, reduce=np.mean) will return the average number of nodes in the dataset.
- filter(function): removes from the dataset any graph for which function(graph) is False.
  Example: filter(lambda g: g.n_nodes < 100) removes from the dataset all graphs with 100 or more nodes.
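The semantics of the three functions can be sketched on a plain Python list. The helper names below (sketch_apply, sketch_map, sketch_filter) and the SimpleNamespace stand-ins for graphs are hypothetical; unlike the dataset methods described above, these helpers return new lists rather than modifying a dataset.

```python
from statistics import mean
from types import SimpleNamespace

# Hypothetical stand-ins for graphs: only n_nodes is modelled here.
graphs = [SimpleNamespace(n_nodes=n) for n in (10, 50, 150, 200)]


def sketch_apply(graphs, transform):
    """Replace each graph with transform(graph)."""
    return [transform(g) for g in graphs]


def sketch_map(graphs, transform, reduce=None):
    """Collect transform(graph) for each graph, optionally reducing the list."""
    out = [transform(g) for g in graphs]
    return reduce(out) if callable(reduce) else out


def sketch_filter(graphs, function):
    """Keep only the graphs for which function(graph) evaluates to true."""
    return [g for g in graphs if function(g)]


doubled = sketch_apply(graphs, lambda g: SimpleNamespace(n_nodes=2 * g.n_nodes))
avg_nodes = sketch_map(graphs, lambda g: g.n_nodes, reduce=mean)
small = sketch_filter(graphs, lambda g: g.n_nodes < 100)
print(avg_nodes, [g.n_nodes for g in small])
```

Note that the lambda passed to each helper takes the graph as its argument, and that the filter keeps the graphs for which the predicate holds.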
Datasets in mixed mode (one adjacency matrix, many instances of node features) are expected to have a particular structure.
The graphs returned by read() should not have an adjacency matrix, which should instead be stored as a singleton in the dataset's a attribute.
For example:
class MyMixedModeDataset(Dataset):
    def read(self):
        self.a = compute_adjacency_matrix()
        return [Graph(x=x, y=y) for x, y in some_magic_list]
Have a look at the spektral.datasets module for examples of popular datasets already implemented.
Arguments
- transforms: a callable or list of callables that are automatically applied to the graphs after loading the dataset.
Data utils
to_disjoint
spektral.data.utils.to_disjoint(x_list=None, a_list=None, e_list=None)
Converts lists of node features, adjacency matrices and edge features to disjoint mode.
Either the node features or the adjacency matrices must be provided as input.
The i-th element of each list must be associated with the i-th graph.
The method also computes the batch index to retrieve individual graphs from the disjoint union.
Edge attributes can be represented as:
- a dense array of shape (n_nodes, n_nodes, n_edge_features);
- a sparse edge list of shape (n_edges, n_edge_features);
and they will always be returned as a stacked edge list.
Arguments
- x_list: a list of np.arrays of shape (n_nodes, n_node_features) -- note that n_nodes can change between graphs;
- a_list: a list of np.arrays or scipy.sparse matrices of shape (n_nodes, n_nodes);
- e_list: a list of np.arrays of shape (n_nodes, n_nodes, n_edge_features) or (n_edges, n_edge_features).
Return
Only if the corresponding list is given as input:
- x: np.array of shape (n_nodes, n_node_features);
- a: scipy.sparse matrix of shape (n_nodes, n_nodes);
- e: np.array of shape (n_edges, n_edge_features);
- i: np.array of shape (n_nodes, ), the batch index.
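The idea of a disjoint union and its batch index can be sketched with NumPy alone. The function below, sketch_to_disjoint, is a hypothetical illustration and not the library's code: it ignores edge features and builds a dense block-diagonal adjacency matrix for readability, whereas to_disjoint is documented to return a scipy.sparse matrix.

```python
import numpy as np


def sketch_to_disjoint(x_list, a_list):
    """Stack node features, build a block-diagonal adjacency and a batch index."""
    n_nodes = [x.shape[0] for x in x_list]
    n_total = sum(n_nodes)

    # Node features of all graphs, stacked along the node dimension.
    x = np.vstack(x_list)

    # Block-diagonal adjacency: graph k occupies rows/cols [start, start + n_k).
    a = np.zeros((n_total, n_total))
    start = 0
    for a_k, n_k in zip(a_list, n_nodes):
        a[start:start + n_k, start:start + n_k] = a_k
        start += n_k

    # Batch index: i[j] is the position in the list of the graph owning node j.
    i = np.repeat(np.arange(len(n_nodes)), n_nodes)
    return x, a, i


x_list = [np.ones((2, 3)), 2 * np.ones((3, 3))]
a_list = [np.array([[0, 1], [1, 0]]), np.ones((3, 3)) - np.eye(3)]
x, a, i = sketch_to_disjoint(x_list, a_list)
print(i)  # [0 0 1 1 1]
```

With the batch index, the nodes of a single graph can be recovered by boolean masking, for example x[i == 1] for the second graph.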
to_batch
spektral.data.utils.to_batch(x_list=None, a_list=None, e_list=None, mask=False)
Converts lists of node features, adjacency matrices and edge features to batch mode, by zero-padding all tensors to have the same node dimension n_max.
Either the node features or the adjacency matrices must be provided as input.
The i-th element of each list must be associated with the i-th graph.
If a_list contains sparse matrices, they will be converted to dense np.arrays.
The edge attributes of a graph can be represented as:
- a dense array of shape (n_nodes, n_nodes, n_edge_features);
- a sparse edge list of shape (n_edges, n_edge_features);
and they will always be returned as dense arrays.
Arguments
- x_list: a list of np.arrays of shape (n_nodes, n_node_features) -- note that n_nodes can change between graphs;
- a_list: a list of np.arrays or scipy.sparse matrices of shape (n_nodes, n_nodes);
- e_list: a list of np.arrays of shape (n_nodes, n_nodes, n_edge_features) or (n_edges, n_edge_features);
- mask: bool, if True, node attributes will be extended with a binary mask that indicates valid nodes (the last feature of each node will be 1 if the node is valid and 0 otherwise). Use this flag in conjunction with layers.base.GraphMasking to start the propagation of masks in a model.
Return
Only if the corresponding list is given as input:
- x: np.array of shape (batch, n_max, n_node_features);
- a: np.array of shape (batch, n_max, n_max);
- e: np.array of shape (batch, n_max, n_max, n_edge_features).
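Zero-padding and the optional mask can be sketched as follows. The function sketch_to_batch is a hypothetical illustration written for this page, not the library's code; it assumes dense inputs, requires both lists, and leaves out edge features.

```python
import numpy as np


def sketch_to_batch(x_list, a_list, mask=False):
    """Zero-pad node features and adjacency matrices to a common n_max."""
    n_max = max(x.shape[0] for x in x_list)
    batch = len(x_list)
    n_feat = x_list[0].shape[1] + (1 if mask else 0)

    x_out = np.zeros((batch, n_max, n_feat))
    a_out = np.zeros((batch, n_max, n_max))
    for k, (x, a) in enumerate(zip(x_list, a_list)):
        n = x.shape[0]
        x_out[k, :n, :x.shape[1]] = x
        if mask:
            x_out[k, :n, -1] = 1.0  # last feature marks valid (non-padded) nodes
        a_out[k, :n, :n] = a
    return x_out, a_out


x_list = [np.ones((2, 3)), np.ones((4, 3))]
a_list = [np.ones((2, 2)), np.ones((4, 4))]
x, a = sketch_to_batch(x_list, a_list, mask=True)
print(x.shape, a.shape)  # (2, 4, 4) (2, 4, 4)
```

In this example the first graph has two real nodes, so its mask column reads 1, 1, 0, 0, and the padded rows and columns of its adjacency matrix stay at zero.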
to_mixed
spektral.data.utils.to_mixed(x_list=None, a=None, e_list=None)
Converts lists of node features and edge features to mixed mode.
The adjacency matrix must be passed as a singleton, i.e., a single np.array or scipy.sparse matrix shared by all graphs.
Edge attributes can be represented as:
- a dense array of shape (n_nodes, n_nodes, n_edge_features);
- a sparse edge list of shape (n_edges, n_edge_features);
and they will always be returned as a batch of edge lists.
Arguments
- x_list: a list of np.arrays of shape (n_nodes, n_node_features) -- note that n_nodes must be the same between graphs;
- a: a np.array or scipy.sparse matrix of shape (n_nodes, n_nodes);
- e_list: a list of np.arrays of shape (n_nodes, n_nodes, n_edge_features) or (n_edges, n_edge_features).
Return
Only if the corresponding element is given as input:
- x: np.array of shape (batch, n_nodes, n_node_features);
- a: scipy.sparse matrix of shape (n_nodes, n_nodes);
- e: np.array of shape (batch, n_edges, n_edge_features).
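A rough sketch of the mixed-mode layout is shown below. sketch_to_mixed is a hypothetical helper, not the library's code; it assumes a dense adjacency matrix and assumes that edge lists follow the row-major order of the non-zero entries of a, which the text above does not specify.

```python
import numpy as np


def sketch_to_mixed(x_list, a, e_list):
    """Stack per-graph features that share a single adjacency matrix."""
    x = np.stack(x_list)        # (batch, n_nodes, n_node_features)
    rows, cols = np.nonzero(a)  # edges of the shared graph
    edge_lists = []
    for e_k in e_list:
        if e_k.ndim == 3:
            e_k = e_k[rows, cols]  # dense -> one row per existing edge
        edge_lists.append(e_k)
    e = np.stack(edge_lists)    # (batch, n_edges, n_edge_features)
    return x, a, e


a = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])
x_list = [np.random.rand(3, 2) for _ in range(2)]
e_dense = np.random.rand(3, 3, 4)  # dense edge attributes
e_sparse = np.random.rand(4, 4)    # already one row per edge
x, a_out, e = sketch_to_mixed(x_list, a, [e_dense, e_sparse])
print(x.shape, e.shape)  # (2, 3, 2) (2, 4, 4)
```

The adjacency matrix is passed through unchanged because it is shared by every graph in the batch.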
batch_generator
spektral.data.utils.batch_generator(data, batch_size=32, epochs=None, shuffle=True)
Iterates over the data for the given number of epochs, yielding batches of size batch_size.
Arguments
- data: np.array or list of np.arrays with the same first dimension;
- batch_size: number of samples in a batch;
- epochs: number of times to iterate over the data (default None, iterates indefinitely);
- shuffle: whether to shuffle the data at the beginning of each epoch.
Return
Batches of size batch_size.
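The behaviour described above can be approximated with a short generator. sketch_batch_generator is a hypothetical re-implementation for illustration, not the library's code; in particular, yielding a smaller final batch when the data size is not a multiple of batch_size is an assumption of this sketch.

```python
import numpy as np


def sketch_batch_generator(data, batch_size=32, epochs=None, shuffle=True):
    """Yield consecutive batches, reshuffling at the start of each epoch."""
    arrays = data if isinstance(data, (list, tuple)) else [data]
    n = len(arrays[0])
    epoch = 0
    while epochs is None or epoch < epochs:
        # One shared permutation keeps all arrays aligned with each other.
        order = np.random.permutation(n) if shuffle else np.arange(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            batch = [arr[idx] for arr in arrays]
            yield batch if len(batch) > 1 else batch[0]
        epoch += 1


features = np.arange(10)
labels = np.arange(10) * 2
batches = list(
    sketch_batch_generator([features, labels], batch_size=4, epochs=1, shuffle=False)
)
print([len(b[0]) for b in batches])  # [4, 4, 2]
```

With epochs=None the generator never terminates, so it should be consumed with next() or inside a loop that has its own stopping condition.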
to_tf_signature
spektral.data.utils.to_tf_signature(signature)
Converts a Dataset signature to a TensorFlow signature.
Arguments
- signature: a Dataset signature.
Return
A TensorFlow signature.