Pooling layers
The following pooling layers are available in Spektral.
See the convolutional layers page for the notation.
SRCPool
spektral.layers.SRCPool(return_selection=False)
A general class for graph pooling layers based on the "Select, Reduce, Connect" framework presented in:
Understanding Pooling in Graph Neural Networks.
Daniele Grattarola et al.
This layer computes:
$$
\mathcal{S} = \textrm{Sel}(\mathcal{G}); \;\;\;\;
\mathcal{X}' = \textrm{Red}(\mathcal{G}, \mathcal{S}); \;\;\;\;
\mathcal{A}' = \textrm{Con}(\mathcal{G}, \mathcal{S}),
$$
where $\textrm{Sel}$ is a node-equivariant selection function that computes the supernode assignments $\mathcal{S}$, $\textrm{Red}$ is a permutation-invariant function to reduce the supernodes into the new node attributes, and $\textrm{Con}$ is a permutation-invariant connection function that computes the links between the pooled nodes.
By extending this class, it is possible to create any pooling layer in the SRC formalism.
Input
- x: Tensor of shape ([batch], N, F) representing node features;
- a: Tensor or SparseTensor of shape ([batch], N, N) representing the adjacency matrix;
- i: (optional) Tensor of integers with shape (N, ) representing the batch index;
Output
- x_pool: Tensor of shape ([batch], K, F) representing the node features of the output. K is the number of output nodes and depends on the specific pooling strategy;
- a_pool: Tensor or SparseTensor of shape ([batch], K, K) representing the adjacency matrix of the output;
- i_pool: (only if i was given as input) Tensor of integers with shape (K, ) representing the batch index of the output;
- s: (if return_selection=True) Tensor or SparseTensor representing the supernode assignments;
API
- pool(x, a, i, **kwargs): pools the graph and returns the reduced node features and adjacency matrix. If the batch index i is not None, a reduced version of i will be returned as well. Any given kwargs will be passed as keyword arguments to select(), reduce() and connect() if any matching key is found. The mandatory arguments of pool() must be computed in call() by calling self.get_inputs(inputs).
- select(x, a, i, **kwargs): computes supernode assignments mapping the nodes of the input graph to the nodes of the output.
- reduce(x, s, **kwargs): reduces the supernodes to form the nodes of the pooled graph.
- connect(a, s, **kwargs): connects the reduced supernodes.
- reduce_index(i, s, **kwargs): helper function to reduce the batch index (only called if i is given as input).
When overriding any function of the API, it is possible to access the true number of nodes of the input (n_nodes) as a Tensor in the instance variable self.n_nodes (this is populated by self.get_inputs() at the beginning of call()).
Arguments:
- return_selection: if True, the Tensor used to represent supernode assignments will be returned with x_pool, a_pool, and i_pool;
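For illustration, a minimal DiffPool-style layer written against this API could look as follows. This is only a sketch, not one of Spektral's built-in layers: it supports single-graph inputs with a dense selection matrix, and it assumes that get_inputs() returns (x, a, i) with i=None when no batch index is passed.

```python
import tensorflow as tf
from spektral.layers import SRCPool


class DenseSelectPool(SRCPool):
    """Illustrative only: soft-assigns N nodes to k supernodes via a learned
    projection (select), sums their features (reduce), and re-wires the graph
    as S^T A S (connect). Single mode only."""

    def __init__(self, k, return_selection=False, **kwargs):
        super().__init__(return_selection=return_selection, **kwargs)
        self.k = k

    def build(self, input_shape):
        n_features = input_shape[0][-1]
        # Trainable projection used by select() to compute soft assignments
        self.kernel = self.add_weight(shape=(n_features, self.k), name="kernel")
        super().build(input_shape)

    def call(self, inputs, **kwargs):
        # get_inputs() unpacks the inputs and populates self.n_nodes
        x, a, i = self.get_inputs(inputs)
        return self.pool(x, a, i=i)

    def select(self, x, a, i=None, **kwargs):
        # S: (N, k) soft assignment of each node to a supernode
        return tf.nn.softmax(tf.matmul(x, self.kernel), axis=-1)

    def reduce(self, x, s, **kwargs):
        # X' = S^T X: (k, F) pooled node features
        return tf.matmul(s, x, transpose_a=True)

    def connect(self, a, s, **kwargs):
        # A' = S^T A S: (k, k) pooled adjacency (a may be sparse in single mode)
        a_s = tf.sparse.sparse_dense_matmul(a, s) if isinstance(a, tf.SparseTensor) else tf.matmul(a, s)
        return tf.matmul(s, a_s, transpose_a=True)
```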
AsymCheegerCutPool
spektral.layers.AsymCheegerCutPool(k, mlp_hidden=None, mlp_activation='relu', totvar_coeff=1.0, balance_coeff=1.0, return_selection=False, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None)
An Asymmetric Cheeger Cut Pooling layer from the paper
Total Variation Graph Neural Networks
Jonas Berg Hansen and Filippo Maria Bianchi
Mode: single, batch.
This layer learns a soft clustering of the input graph as follows:
$$
S = \textrm{MLP}(X); \;\;\;\;
X' = S^\top X,
$$
where $\textrm{MLP}$ is a multi-layer perceptron with softmax output.
The layer includes two auxiliary loss terms/components. A graph total variation component given by
$$
L_{\textrm{GTV}} = \frac{1}{2E} \sum_{k=1}^{K} \sum_{i=1}^{N} \sum_{j=i}^{N} a_{ij} \, |s_{ik} - s_{jk}|,
$$
where $E$ is the number of edges/links, $K$ is the number of clusters or output nodes, and $N$ is the number of nodes.
An asymmetrical norm component given by
$$
L_{\textrm{AN}} = \frac{N(K-1) - \sum_{k=1}^{K} \left\| \mathbf{s}_{:,k} - \textrm{quant}_{K-1}(\mathbf{s}_{:,k}) \right\|_{1, K-1}}{N(K-1)},
$$
where $\textrm{quant}_{K-1}(\cdot)$ denotes the $(K-1)$-quantile and $\|\cdot\|_{1, K-1}$ the asymmetric $\ell_1$ norm used in the paper.
The layer can be used without a supervised loss to compute node clustering by minimizing the two auxiliary losses.
Input
- Node features of shape (batch, n_nodes_in, n_node_features);
- Adjacency matrix of shape (batch, n_nodes_in, n_nodes_in);
Output
- Reduced node features of shape (batch, n_nodes_out, n_node_features);
- If return_selection=True, the selection matrix of shape (batch, n_nodes_in, n_nodes_out).
Arguments
- k: number of output nodes;
- mlp_hidden: list of integers, number of hidden units for each hidden layer in the MLP used to compute cluster assignments (if None, the MLP has only one output layer);
- mlp_activation: activation for the MLP layers;
- totvar_coeff: coefficient for graph total variation loss component;
- balance_coeff: coefficient for asymmetric norm loss component;
- return_selection: boolean, whether to return the selection matrix;
- use_bias: use bias in the MLP;
- kernel_initializer: initializer for the weights of the MLP;
- bias_initializer: initializer for the bias of the MLP;
- kernel_regularizer: regularization applied to the weights of the MLP;
- bias_regularizer: regularization applied to the bias of the MLP;
- kernel_constraint: constraint applied to the weights of the MLP;
- bias_constraint: constraint applied to the bias of the MLP;
DiffPool
spektral.layers.DiffPool(k, channels=None, return_selection=False, activation=None, kernel_initializer='glorot_uniform', kernel_regularizer=None, kernel_constraint=None)
A DiffPool layer from the paper
Hierarchical Graph Representation Learning with Differentiable Pooling
Rex Ying et al.
Mode: single, batch.
This layer learns a soft clustering of the input graph as follows:
$$
S = \textrm{GNN}_{\textrm{pool}}(A, X); \;\;\;\;
Z = \textrm{GNN}_{\textrm{emb}}(A, X); \;\;\;\;
X' = S^\top Z; \;\;\;\;
A' = S^\top A S,
$$
where $\textrm{GNN}_{\textrm{pool}}$ and $\textrm{GNN}_{\textrm{emb}}$ are graph convolutional layers and $S$ is passed through a softmax along the cluster dimension.
The number of output channels of $\textrm{GNN}_{\textrm{emb}}$ is controlled by the channels parameter.
Two auxiliary loss terms are also added to the model: the link prediction loss
$$
L_{\textrm{LP}} = \big\| A - S S^\top \big\|_F
$$
and the entropy loss
$$
L_{\textrm{E}} = - \frac{1}{N} \sum_{i=1}^{N} S_i \log S_i .
$$
The layer can be used without a supervised loss to compute node clustering by minimizing the two auxiliary losses.
Input
- Node features of shape (batch, n_nodes_in, n_node_features);
- Adjacency matrix of shape (batch, n_nodes_in, n_nodes_in);
Output
- Reduced node features of shape (batch, n_nodes_out, channels);
- Reduced adjacency matrix of shape (batch, n_nodes_out, n_nodes_out);
- If return_selection=True, the selection matrix of shape (batch, n_nodes_in, n_nodes_out).
Arguments
- k: number of output nodes;
- channels: number of output channels (if None, the number of output channels is the same as the input);
- return_selection: boolean, whether to return the selection matrix;
- activation: activation to apply after reduction;
- kernel_initializer: initializer for the weights;
- kernel_regularizer: regularization applied to the weights;
- kernel_constraint: constraint applied to the weights;
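As a usage sketch (not taken from the official docs), DiffPool can be placed between two convolutions in a batch-mode graph-classification model. The sizes and random inputs below are assumptions made only to show the expected shapes; in practice the adjacency would be preprocessed as required by the chosen convolution.

```python
import numpy as np
import tensorflow as tf
from spektral.layers import DiffPool, GCNConv, GlobalSumPool

N, F = 30, 8  # nodes and features per graph (arbitrary example sizes)
x_in = tf.keras.Input(shape=(N, F))   # batch-mode node features
a_in = tf.keras.Input(shape=(N, N))   # batch-mode dense adjacency

x = GCNConv(16, activation="relu")([x_in, a_in])
x1, a1 = DiffPool(k=10, channels=16)([x, a_in])      # 30 nodes -> 10 supernodes
x1 = GCNConv(16, activation="relu")([x1, a1])
out = tf.keras.layers.Dense(1, activation="sigmoid")(GlobalSumPool()(x1))

model = tf.keras.Model([x_in, a_in], out)
x_batch = np.random.rand(4, N, F).astype("f4")
a_batch = np.random.randint(0, 2, (4, N, N)).astype("f4")
print(model([x_batch, a_batch]).shape)  # (4, 1)
# model.losses now also contains DiffPool's two auxiliary losses
```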
LaPool
spektral.layers.LaPool(shortest_path_reg=True, return_selection=False)
A Laplacian pooling (LaPool) layer from the paper
Towards Interpretable Sparse Graph Representation Learning with Laplacian Pooling
Emmanuel Noutahi et al.
Mode: disjoint.
This layer computes a soft clustering of the graph by first identifying a set of leaders, and then assigning every remaining node to the cluster of the closest leader:
$$
\mathbf{v}_i = \| (L X)_i \|; \;\;\;\;
\mathbf{i} = \{ i : \mathbf{v}_i > \mathbf{v}_j, \; \forall j \in \mathcal{N}(i) \}; \;\;\;\;
S = \textrm{SparseMax}\left( \beta \, \frac{X X_{\mathbf{i}}^\top}{\|X\| \, \|X_{\mathbf{i}}\|} \right),
$$
where $L$ is the graph Laplacian, $\mathbf{i}$ indexes the leader nodes, and $\beta$ is a regularization vector that is applied element-wise to the selection matrix.
If shortest_path_reg=True
, it is equal to the inverse of the shortest path between
each node and its corresponding leader (this can be expensive since it runs on CPU).
Otherwise it is equal to 1.
The reduction and connection are computed as $X' = S^\top X$ and $A' = S^\top A S$, respectively.
Note that the number of nodes in the output graph depends on the input node features.
Input
- Node features of shape (n_nodes_in, n_node_features);
- Adjacency matrix of shape (n_nodes_in, n_nodes_in);
Output
- Reduced node features of shape (n_nodes_out, channels);
- Reduced adjacency matrix of shape (n_nodes_out, n_nodes_out);
- If return_selection=True, the selection matrix of shape (n_nodes_in, n_nodes_out).
Arguments
- shortest_path_reg: boolean, apply the shortest path regularization described in the paper (can be expensive);
- return_selection: boolean, whether to return the selection matrix;
MinCutPool
spektral.layers.MinCutPool(k, mlp_hidden=None, mlp_activation='relu', return_selection=False, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None)
A MinCut pooling layer from the paper
Spectral Clustering with Graph Neural Networks for Graph Pooling
Filippo Maria Bianchi et al.
Mode: single, batch.
This layer learns a soft clustering of the input graph as follows:
$$
S = \textrm{MLP}(X); \;\;\;\;
X' = S^\top X; \;\;\;\;
A' = S^\top A S,
$$
where $\textrm{MLP}$ is a multi-layer perceptron with softmax output.
Two auxiliary loss terms are also added to the model: the minimum cut loss
$$
L_{\textrm{cut}} = - \frac{ \textrm{Tr}(S^\top A S) }{ \textrm{Tr}(S^\top D S) }
$$
and the orthogonality loss
$$
L_{\textrm{orth}} = \left\| \frac{S^\top S}{\| S^\top S \|_F} - \frac{I_K}{\sqrt{K}} \right\|_F .
$$
The layer can be used without a supervised loss to compute node clustering by minimizing the two auxiliary losses.
Input
- Node features of shape (batch, n_nodes_in, n_node_features);
- Symmetrically normalized adjacency matrix of shape (batch, n_nodes_in, n_nodes_in);
Output
- Reduced node features of shape (batch, n_nodes_out, n_node_features);
- Reduced adjacency matrix of shape (batch, n_nodes_out, n_nodes_out);
- If return_selection=True, the selection matrix of shape (batch, n_nodes_in, n_nodes_out).
Arguments
- k: number of output nodes;
- mlp_hidden: list of integers, number of hidden units for each hidden layer in the MLP used to compute cluster assignments (if None, the MLP has only one output layer);
- mlp_activation: activation for the MLP layers;
- return_selection: boolean, whether to return the selection matrix;
- use_bias: use bias in the MLP;
- kernel_initializer: initializer for the weights of the MLP;
- bias_initializer: initializer for the bias of the MLP;
- kernel_regularizer: regularization applied to the weights of the MLP;
- bias_regularizer: regularization applied to the bias of the MLP;
- kernel_constraint: constraint applied to the weights of the MLP;
- bias_constraint: constraint applied to the bias of the MLP;
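Below is a hedged sketch of the unsupervised use on a single random graph: the layer is trained by minimizing only its own auxiliary losses, and hard clusters are then read off the selection matrix. The data, the by-hand symmetric normalization, and the hyperparameters are assumptions made for the example.

```python
import numpy as np
import tensorflow as tf
from spektral.layers import MinCutPool

# Single mode: one random graph with N nodes, F features, clustered into K groups
N, F, K = 100, 4, 5
x = np.random.rand(N, F).astype("f4")
a = np.random.randint(0, 2, (N, N))
a = np.maximum(a, a.T).astype("f4")                  # symmetric adjacency
deg = a.sum(-1)
a_norm = (a / np.sqrt(np.outer(deg, deg) + 1e-9)).astype("f4")

pool = MinCutPool(k=K, return_selection=True)
opt = tf.keras.optimizers.Adam(1e-2)

for step in range(200):
    with tf.GradientTape() as tape:
        x_pool, a_pool, s = pool([x, a_norm])
        loss = tf.add_n(pool.losses)     # minimum cut loss + orthogonality loss
    grads = tape.gradient(loss, pool.trainable_weights)
    opt.apply_gradients(zip(grads, pool.trainable_weights))

clusters = tf.argmax(s, axis=-1)         # hard cluster index for each node
```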
SAGPool
spektral.layers.SAGPool(ratio, return_selection=False, return_score=False, sigmoid_gating=False, kernel_initializer='glorot_uniform', kernel_regularizer=None, kernel_constraint=None)
A self-attention graph pooling layer from the paper
Self-Attention Graph Pooling
Junhyun Lee et al.
Mode: single, disjoint.
This layer computes:
$$
y = \textrm{GNN}(A, X); \;\;\;\;
\mathbf{i} = \textrm{rank}(y, K); \;\;\;\;
X' = (X \odot \tanh(y))_{\mathbf{i}}; \;\;\;\;
A' = A_{\mathbf{i}, \mathbf{i}},
$$
where $\textrm{rank}(y, K)$ returns the indices of the top K values of $y$, and K is defined for each graph as a fraction of the number of nodes, controlled by the ratio argument.
The gating operation (Cangea et al.) can be replaced with a sigmoid (Gao & Ji).
Input
- Node features of shape (n_nodes_in, n_node_features);
- Adjacency matrix of shape (n_nodes_in, n_nodes_in);
- Graph IDs of shape (n_nodes, ) (only in disjoint mode);
Output
- Reduced node features of shape (ratio * n_nodes_in, n_node_features);
- Reduced adjacency matrix of shape (ratio * n_nodes_in, ratio * n_nodes_in);
- Reduced graph IDs of shape (ratio * n_nodes_in, ) (only in disjoint mode);
- If return_selection=True, the selection mask of shape (ratio * n_nodes_in, );
- If return_score=True, the scoring vector of shape (n_nodes_in, ).
Arguments
- ratio: float between 0 and 1, ratio of nodes to keep in each graph;
- return_selection: boolean, whether to return the selection mask;
- return_score: boolean, whether to return the node scoring vector;
- sigmoid_gating: boolean, use a sigmoid activation for gating instead of a tanh;
- kernel_initializer: initializer for the weights;
- kernel_regularizer: regularization applied to the weights;
- kernel_constraint: constraint applied to the weights;
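A short sketch of the disjoint-mode behaviour is given below (TopKPool, described next, has the same interface). The random two-graph batch is an assumption made only to show how node features, adjacency, and graph IDs are all reduced together; in practice the adjacency would also be normalized for the convolution.

```python
import numpy as np
import tensorflow as tf
from scipy import sparse
from spektral.layers import GCNConv, GlobalSumPool, SAGPool

# Disjoint mode: two random graphs with 10 and 16 nodes stacked into one batch
n1, n2, F = 10, 16, 4
x = np.random.rand(n1 + n2, F).astype("f4")
a = sparse.block_diag([sparse.random(n1, n1, 0.3), sparse.random(n2, n2, 0.3)]).tocoo()
a = tf.SparseTensor(np.stack([a.row, a.col], -1).astype("int64"), a.data.astype("f4"), a.shape)
a = tf.sparse.reorder(a)
i = np.array([0] * n1 + [1] * n2)            # graph ID of each node

x = GCNConv(16, activation="relu")([x, a])
x_pool, a_pool, i_pool = SAGPool(ratio=0.5)([x, a, i])   # keeps ~half of each graph
out = GlobalSumPool()([x_pool, i_pool])                  # one vector per graph
print(x_pool.shape, out.shape)                           # roughly (13, 16) and (2, 16)
```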
TopKPool
spektral.layers.TopKPool(ratio, return_selection=False, return_score=False, sigmoid_gating=False, kernel_initializer='glorot_uniform', kernel_regularizer=None, kernel_constraint=None)
A gPool/Top-K layer from the papers
Graph U-Nets
Hongyang Gao and Shuiwang Ji
and
Towards Sparse Hierarchical Graph Classifiers
Cătălina Cangea et al.
Mode: single, disjoint.
This layer computes:
$$
y = \frac{X \mathbf{p}}{\|\mathbf{p}\|}; \;\;\;\;
\mathbf{i} = \textrm{rank}(y, K); \;\;\;\;
X' = (X \odot \tanh(y))_{\mathbf{i}}; \;\;\;\;
A' = A_{\mathbf{i}, \mathbf{i}},
$$
where $\textrm{rank}(y, K)$ returns the indices of the top K values of $y$, and $\mathbf{p}$ is a learnable parameter vector of size F. K is defined for each graph as a fraction of the number of nodes, controlled by the ratio argument.
The gating operation (Cangea et al.) can be replaced with a sigmoid (Gao & Ji).
Input
- Node features of shape (n_nodes_in, n_node_features);
- Adjacency matrix of shape (n_nodes_in, n_nodes_in);
- Graph IDs of shape (n_nodes, ) (only in disjoint mode);
Output
- Reduced node features of shape (ratio * n_nodes_in, n_node_features);
- Reduced adjacency matrix of shape (ratio * n_nodes_in, ratio * n_nodes_in);
- Reduced graph IDs of shape (ratio * n_nodes_in, ) (only in disjoint mode);
- If return_selection=True, the selection mask of shape (ratio * n_nodes_in, );
- If return_score=True, the scoring vector of shape (n_nodes_in, ).
Arguments
- ratio: float between 0 and 1, ratio of nodes to keep in each graph;
- return_selection: boolean, whether to return the selection mask;
- return_score: boolean, whether to return the node scoring vector;
- sigmoid_gating: boolean, use a sigmoid activation for gating instead of a tanh;
- kernel_initializer: initializer for the weights;
- kernel_regularizer: regularization applied to the weights;
- kernel_constraint: constraint applied to the weights;
JustBalancePool
spektral.layers.JustBalancePool(k, mlp_hidden=None, mlp_activation='relu', normalized_loss=False, return_selection=False, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None)
The Just Balance pooling layer from the paper
Simplifying Clustering with Graph Neural Networks
Filippo Maria Bianchi
Mode: single, batch.
This layer learns a soft clustering of the input graph as follows:
$$
S = \textrm{MLP}(X); \;\;\;\;
X' = S^\top X; \;\;\;\;
A' = S^\top A S,
$$
where $\textrm{MLP}$ is a multi-layer perceptron with softmax output.
The layer adds the following auxiliary loss to the model:
$$
L = - \textrm{Tr}\left( \sqrt{S^\top S} \right).
$$
The layer can be used without a supervised loss to compute node clustering by minimizing the auxiliary loss.
The layer is originally designed to be used in conjunction with a GCNConv layer operating on the following connectivity matrix:
$$
\hat{A} = I - \delta L = I - \delta \left( I - D^{-1/2} A D^{-1/2} \right).
$$
Input
- Node features of shape (batch, n_nodes_in, n_node_features);
- Connectivity matrix of shape (batch, n_nodes_in, n_nodes_in);
Output
- Reduced node features of shape (batch, n_nodes_out, n_node_features);
- Reduced adjacency matrix of shape (batch, n_nodes_out, n_nodes_out);
- If return_selection=True, the selection matrix of shape (batch, n_nodes_in, n_nodes_out).
Arguments
- k: number of output nodes;
- mlp_hidden: list of integers, number of hidden units for each hidden layer in the MLP used to compute cluster assignments (if None, the MLP has only one output layer);
- mlp_activation: activation for the MLP layers;
- normalized_loss: boolean, whether to normalize the auxiliary loss in [0, 1];
- return_selection: boolean, whether to return the selection matrix;
- kernel_initializer: initializer for the weights of the MLP;
- bias_initializer: initializer for the bias of the MLP;
- kernel_regularizer: regularization applied to the weights of the MLP;
- bias_regularizer: regularization applied to the bias of the MLP;
- kernel_constraint: constraint applied to the weights of the MLP;
- bias_constraint: constraint applied to the bias of the MLP;
DMoNPool
spektral.layers.DMoNPool(k, mlp_hidden=None, mlp_activation='relu', return_selection=False, collapse_regularization=0.1, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None)
The DMoN pooling layer from the paper
Graph Clustering with Graph Neural Networks
Anton Tsitsulin et al.
Mode: single, batch.
This layer learns a soft clustering of the input graph as follows:
$$
S = \textrm{MLP}(X); \;\;\;\;
X' = S^\top X; \;\;\;\;
A' = S^\top A S,
$$
where $\textrm{MLP}$ is a multi-layer perceptron with softmax output.
Two auxiliary loss terms are also added to the model: the modularity loss
$$
L_{\textrm{mod}} = - \frac{1}{2m} \, \textrm{Tr}(S^\top B S),
$$
where $B = A - \frac{\mathbf{d}\mathbf{d}^\top}{2m}$ is the modularity matrix ($\mathbf{d}$ is the degree vector and $m$ the number of edges), and the collapse regularization loss
$$
L_{\textrm{col}} = \frac{\sqrt{K}}{N} \left\| \sum_{i=1}^{N} S_i \right\|_F - 1 .
$$
This layer is based on the original implementation found here.
Input
- Node features of shape (batch, n_nodes_in, n_node_features);
- Symmetrically normalized adjacency matrix of shape (batch, n_nodes_in, n_nodes_in);
Output
- Reduced node features of shape (batch, n_nodes_out, n_node_features);
- Reduced adjacency matrix of shape (batch, n_nodes_out, n_nodes_out);
- If return_selection=True, the selection matrix of shape (batch, n_nodes_in, n_nodes_out).
Arguments
- k: number of output nodes;
- mlp_hidden: list of integers, number of hidden units for each hidden layer in the MLP used to compute cluster assignments (if None, the MLP has only one output layer);
- mlp_activation: activation for the MLP layers;
- collapse_regularization: strength of the collapse regularization;
- return_selection: boolean, whether to return the selection matrix;
- use_bias: use bias in the MLP;
- kernel_initializer: initializer for the weights of the MLP;
- bias_initializer: initializer for the bias of the MLP;
- kernel_regularizer: regularization applied to the weights of the MLP;
- bias_regularizer: regularization applied to the bias of the MLP;
- kernel_constraint: constraint applied to the weights of the MLP;
- bias_constraint: constraint applied to the bias of the MLP;
Global pooling layers
GlobalAvgPool
spektral.layers.GlobalAvgPool()
An average pooling layer. Pools a graph by computing the average of its node features.
Mode: single, disjoint, mixed, batch.
Input
- Node features of shape ([batch], n_nodes, n_node_features);
- Graph IDs of shape (n_nodes, ) (only in disjoint mode);
Output
- Pooled node features of shape (batch, n_node_features) (if single mode, shape will be (1, n_node_features)).
Arguments
None.
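All of the global pooling layers are used as a readout in the same way. The batch-mode sketch below (with assumed sizes and a surrounding GCN, not taken from the docs) shows GlobalAvgPool turning per-node features into one vector per graph.

```python
import tensorflow as tf
from spektral.layers import GCNConv, GlobalAvgPool

N, F = 20, 8                              # assumed number of nodes and features
x_in = tf.keras.Input(shape=(N, F))       # batch-mode node features
a_in = tf.keras.Input(shape=(N, N))       # batch-mode adjacency
x = GCNConv(32, activation="relu")([x_in, a_in])
graph_emb = GlobalAvgPool()(x)            # (batch, 32): average over the node axis
out = tf.keras.layers.Dense(1, activation="sigmoid")(graph_emb)
model = tf.keras.Model([x_in, a_in], out)
```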
GlobalMaxPool
spektral.layers.GlobalMaxPool()
A max pooling layer. Pools a graph by computing the maximum of its node features.
Mode: single, disjoint, mixed, batch.
Input
- Node features of shape ([batch], n_nodes, n_node_features);
- Graph IDs of shape (n_nodes, ) (only in disjoint mode);
Output
- Pooled node features of shape (batch, n_node_features) (if single mode, shape will be (1, n_node_features)).
Arguments
None.
GlobalSumPool
spektral.layers.GlobalSumPool()
A global sum pooling layer. Pools a graph by computing the sum of its node features.
Mode: single, disjoint, mixed, batch.
Input
- Node features of shape ([batch], n_nodes, n_node_features);
- Graph IDs of shape (n_nodes, ) (only in disjoint mode);
Output
- Pooled node features of shape (batch, n_node_features) (if single mode, shape will be (1, n_node_features)).
Arguments
None.
GlobalAttentionPool
spektral.layers.GlobalAttentionPool(channels, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None)
A gated attention global pooling layer from the paper
Gated Graph Sequence Neural Networks
Yujia Li et al.
This layer computes:
$$
X' = \sum\limits_{i=1}^{N} \big( \sigma(X W_1 + \mathbf{b}_1) \odot (X W_2 + \mathbf{b}_2) \big)_i ,
$$
where $\sigma$ is the sigmoid activation function.
Mode: single, disjoint, mixed, batch.
Input
- Node features of shape ([batch], n_nodes, n_node_features);
- Graph IDs of shape (n_nodes, ) (only in disjoint mode);
Output
- Pooled node features of shape (batch, channels) (if single mode, shape will be (1, channels)).
Arguments
- channels: integer, number of output channels;
- kernel_initializer: initializer for the kernel matrices;
- bias_initializer: initializer for the bias vectors;
- kernel_regularizer: regularization applied to the kernel matrices;
- bias_regularizer: regularization applied to the bias vectors;
- kernel_constraint: constraint applied to the kernel matrices;
- bias_constraint: constraint applied to the bias vectors.
GlobalAttnSumPool
spektral.layers.GlobalAttnSumPool(attn_kernel_initializer='glorot_uniform', attn_kernel_regularizer=None, attn_kernel_constraint=None)
A node-attention global pooling layer. Pools a graph by learning attention coefficients to sum node features.
This layer computes:
$$
\alpha = \textrm{softmax}(X \mathbf{a}); \;\;\;\;
X' = \sum\limits_{i=1}^{N} \alpha_i \, X_i ,
$$
where $\mathbf{a} \in \mathbb{R}^F$ is a trainable vector. Note that the softmax is applied across nodes, and not across features.
Mode: single, disjoint, mixed, batch.
Input
- Node features of shape ([batch], n_nodes, n_node_features);
- Graph IDs of shape (n_nodes, ) (only in disjoint mode);
Output
- Pooled node features of shape (batch, n_node_features) (if single mode, shape will be (1, n_node_features)).
Arguments
- attn_kernel_initializer: initializer for the attention weights;
- attn_kernel_regularizer: regularization applied to the attention kernel matrix;
- attn_kernel_constraint: constraint applied to the attention kernel matrix;
SortPool
spektral.layers.SortPool(k)
A SortPool layer as described by Zhang et al. This layer takes a graph signal $X$ (the node features) and returns the topmost k rows according to the values in the last column. If $X$ has fewer than k rows, the result is zero-padded to k.
Mode: single, disjoint, batch.
Input
- Node features of shape ([batch], n_nodes, n_node_features);
- Graph IDs of shape (n_nodes, ) (only in disjoint mode);
Output
- Pooled node features of shape (batch, k, n_node_features) (if single mode, shape will be (1, k, n_node_features)).
Arguments
- k: integer, number of nodes to keep;
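For example (batch mode, with assumed sizes and a surrounding convolution that is not part of the layer itself), SortPool can serve as a fixed-size readout whose output is flattened and fed to a dense classifier:

```python
import tensorflow as tf
from spektral.layers import GCNConv, SortPool

N, F = 10, 6                                  # assumed nodes and features per graph
x_in = tf.keras.Input(shape=(N, F))           # batch-mode node features
a_in = tf.keras.Input(shape=(N, N))           # batch-mode adjacency
x = GCNConv(16, activation="tanh")([x_in, a_in])
x_top = SortPool(k=4)(x)                      # (batch, 4, 16): top-4 rows per graph
out = tf.keras.layers.Dense(2, activation="softmax")(tf.keras.layers.Flatten()(x_top))
model = tf.keras.Model([x_in, a_in], out)
```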