## Pooling layers

The following pooling layers are available in Spektral.

See the convolutional layers page for the notation.

[source]

#### DiffPool

spektral.layers.DiffPool(k, channels=None, return_mask=False, activation=None, kernel_initializer='glorot_uniform', kernel_regularizer=None, kernel_constraint=None)


A DiffPool layer from the paper

Hierarchical Graph Representation Learning with Differentiable Pooling
Rex Ying et al.

Mode: batch.

This layer computes a soft clustering $\mathbf{S}$ of the input graphs using a GNN, and reduces graphs as follows:

$$
\mathbf{S} = \textrm{GNN}(\mathbf{A}, \mathbf{X}); \quad
\mathbf{A}' = \mathbf{S}^\top \mathbf{A} \mathbf{S}; \quad
\mathbf{X}' = \mathbf{S}^\top \mathbf{X}
$$

where GNN consists of one GraphConv layer with softmax activation. Two auxiliary loss terms are also added to the model: the link prediction loss

$$
\big\| \mathbf{A} - \mathbf{S}\mathbf{S}^\top \big\|_F
$$

and the entropy loss

$$
- \frac{1}{N} \sum\limits_{i=1}^{N} \mathbf{S} \log(\mathbf{S}).
$$

The layer also applies a 1-layer GCN to the input features and returns the updated graph signal (the number of output channels is controlled by the channels parameter). The layer can be used without a supervised loss to compute node clustering, simply by minimizing the two auxiliary losses.

Input

• Node features of shape ([batch], n_nodes, n_node_features);
• Binary adjacency matrix of shape ([batch], n_nodes, n_nodes);

Output

• Reduced node features of shape ([batch], K, channels);
• Reduced adjacency matrix of shape ([batch], K, K);
• If return_mask=True, the soft clustering matrix of shape ([batch], n_nodes, K).

Arguments

• k: number of output nodes (clusters);
• channels: number of output channels (if None, the number of output channels is assumed to be the same as the input);
• return_mask: boolean, whether to return the cluster assignment matrix;
• activation: activation to apply after reduction;
• kernel_initializer: initializer for the weights;
• kernel_regularizer: regularization applied to the weights;
• kernel_constraint: constraint applied to the weights;
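
For example, a minimal batch-mode sketch (all sizes are arbitrary, chosen only for illustration):

```python
import numpy as np
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from spektral.layers import DiffPool

N, F = 30, 8  # nodes and features per graph (arbitrary)

X_in = Input(shape=(N, F))
A_in = Input(shape=(N, N))

# Pool every graph down to k=10 clusters with 16 output channels
X_pool, A_pool = DiffPool(k=10, channels=16)([X_in, A_in])
model = Model(inputs=[X_in, A_in], outputs=[X_pool, A_pool])

X = np.random.rand(32, N, F).astype("float32")             # batch of node features
A = np.random.randint(0, 2, (32, N, N)).astype("float32")  # batch of binary adjacencies
x_out, a_out = model.predict([X, A])
print(x_out.shape, a_out.shape)  # (32, 10, 16) (32, 10, 10)
```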

[source]

#### MinCutPool

spektral.layers.MinCutPool(k, mlp_hidden=None, mlp_activation='relu', return_mask=False, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None)


A MinCut pooling layer from the paper

Spectral Clustering with Graph Neural Networks for Graph Pooling
Filippo Maria Bianchi et al.

Mode: batch.

This layer computes a soft clustering $\mathbf{S}$ of the input graphs using an MLP, and reduces graphs as follows:

$$
\mathbf{S} = \textrm{MLP}(\mathbf{X}); \quad
\mathbf{A}' = \mathbf{S}^\top \mathbf{A} \mathbf{S}; \quad
\mathbf{X}' = \mathbf{S}^\top \mathbf{X}
$$

where MLP is a multi-layer perceptron with softmax output. Two auxiliary loss terms are also added to the model: the minCUT loss

$$
- \frac{ \mathrm{Tr}(\mathbf{S}^\top \mathbf{A} \mathbf{S}) }{ \mathrm{Tr}(\mathbf{S}^\top \mathbf{D} \mathbf{S}) }
$$

and the orthogonality loss

$$
\left\| \frac{\mathbf{S}^\top \mathbf{S}}{\|\mathbf{S}^\top \mathbf{S}\|_F} - \frac{\mathbf{I}_K}{\sqrt{K}} \right\|_F.
$$

The layer can be used without a supervised loss to compute node clustering, simply by minimizing the two auxiliary losses.

Input

• Node features of shape ([batch], n_nodes, n_node_features);
• Symmetrically normalized adjacency matrix of shape ([batch], n_nodes, n_nodes);

Output

• Reduced node features of shape ([batch], K, n_node_features);
• Reduced adjacency matrix of shape ([batch], K, K);
• If return_mask=True, the soft clustering matrix of shape ([batch], n_nodes, K).

Arguments

• k: number of output nodes (clusters);
• mlp_hidden: list of integers, number of hidden units for each hidden layer in the MLP used to compute cluster assignments (if None, the MLP has only the output layer);
• mlp_activation: activation for the MLP layers;
• return_mask: boolean, whether to return the cluster assignment matrix;
• activation: activation to apply after reduction;
• use_bias: use bias in the MLP;
• kernel_initializer: initializer for the weights of the MLP;
• bias_initializer: initializer for the bias of the MLP;
• kernel_regularizer: regularization applied to the weights of the MLP;
• bias_regularizer: regularization applied to the bias of the MLP;
• kernel_constraint: constraint applied to the weights of the MLP;
• bias_constraint: constraint applied to the bias of the MLP;
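
For example, a minimal batch-mode sketch, assuming spektral.utils.normalized_adjacency is available to normalize the adjacency matrix (all sizes are arbitrary):

```python
import numpy as np
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from spektral.layers import MinCutPool
from spektral.utils import normalized_adjacency

N, F = 30, 8  # nodes and features per graph (arbitrary)

X_in = Input(shape=(N, F))
A_in = Input(shape=(N, N))

# Pool to k=10 clusters and also return the soft cluster assignments
X_pool, A_pool, S = MinCutPool(k=10, mlp_hidden=[16], return_mask=True)([X_in, A_in])
model = Model(inputs=[X_in, A_in], outputs=[X_pool, A_pool, S])

X = np.random.rand(1, N, F).astype("float32")
A = np.random.randint(0, 2, (N, N))
A = np.maximum(A, A.T).astype("float32")  # symmetrize the random adjacency
A_norm = normalized_adjacency(A, symmetric=True)[None, ...].astype("float32")
x_out, a_out, s_out = model.predict([X, A_norm])
print(s_out.shape)  # (1, 30, 10): soft assignment of each node to each cluster
```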

[source]

#### SAGPool

spektral.layers.SAGPool(ratio, return_mask=False, sigmoid_gating=False, kernel_initializer='glorot_uniform', kernel_regularizer=None, kernel_constraint=None)


A self-attention graph pooling layer (SAG) from the paper

Self-Attention Graph Pooling
Junhyun Lee et al.

Mode: single, disjoint.

This layer computes the following operations:

$$
\mathbf{y} = \textrm{GNN}(\mathbf{A}, \mathbf{X}); \quad
\mathbf{i} = \textrm{rank}(\mathbf{y}, K); \quad
\mathbf{X}' = (\mathbf{X} \odot \tanh(\mathbf{y}))_{\mathbf{i}}; \quad
\mathbf{A}' = \mathbf{A}_{\mathbf{i}, \mathbf{i}}
$$

where $\textrm{rank}(\mathbf{y}, K)$ returns the indices of the top K values of $\mathbf{y}$, and $\textrm{GNN}$ consists of one GraphConv layer with no activation. $K$ is defined for each graph as a fraction of the number of nodes.

This layer temporarily makes the adjacency matrix dense in order to compute $\mathbf{A}'$. Converting a graph from sparse to dense and back to sparse is an expensive operation, so if memory is not an issue, considerable speedups can be achieved by using dense graphs directly.

Input

• Node features of shape (n_nodes, n_node_features);
• Binary adjacency matrix of shape (n_nodes, n_nodes);
• Graph IDs of shape (n_nodes, ) (only in disjoint mode);

Output

• Reduced node features of shape (ratio * n_nodes, n_node_features);
• Reduced adjacency matrix of shape (ratio * n_nodes, ratio * n_nodes);
• Reduced graph IDs of shape (ratio * n_nodes, ) (only in disjoint mode);
• If return_mask=True, the binary pooling mask of shape (ratio * n_nodes, ).

Arguments

• ratio: float between 0 and 1, ratio of nodes to keep in each graph;
• return_mask: boolean, whether to return the binary mask used for pooling;
• sigmoid_gating: boolean, use a sigmoid gating activation instead of a tanh;
• kernel_initializer: initializer for the weights;
• kernel_regularizer: regularization applied to the weights;
• kernel_constraint: constraint applied to the weights;
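
For example, a minimal single-mode sketch that calls the layer eagerly on a random sparse graph (size and density are arbitrary):

```python
import numpy as np
import scipy.sparse as sp
import tensorflow as tf
from spektral.layers import SAGPool

N, F = 20, 8  # graph size (arbitrary)

X = tf.random.normal((N, F))
a = sp.random(N, N, density=0.2, format="coo")
A = tf.SparseTensor(
    indices=np.stack([a.row, a.col], axis=1).astype(np.int64),
    values=np.ones_like(a.data, dtype=np.float32),  # binarize the random values
    dense_shape=(N, N),
)
A = tf.sparse.reorder(A)  # SparseTensor indices must be in canonical order

X_pool, A_pool = SAGPool(ratio=0.5)([X, A])
print(X_pool.shape)  # (10, 8): half of the nodes are kept
```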

[source]

#### TopKPool

spektral.layers.TopKPool(ratio, return_mask=False, sigmoid_gating=False, kernel_initializer='glorot_uniform', kernel_regularizer=None, kernel_constraint=None)


A gPool/Top-K layer from the papers

Graph U-Nets
Hongyang Gao and Shuiwang Ji

and

Towards Sparse Hierarchical Graph Classifiers
Cătălina Cangea et al.

Mode: single, disjoint.

This layer computes the following operations:

$$
\mathbf{y} = \frac{\mathbf{X}\mathbf{p}}{\|\mathbf{p}\|}; \quad
\mathbf{i} = \textrm{rank}(\mathbf{y}, K); \quad
\mathbf{X}' = (\mathbf{X} \odot \tanh(\mathbf{y}))_{\mathbf{i}}; \quad
\mathbf{A}' = \mathbf{A}_{\mathbf{i}, \mathbf{i}}
$$

where $\textrm{rank}(\mathbf{y}, K)$ returns the indices of the top K values of $\mathbf{y}$, and $\mathbf{p}$ is a learnable parameter vector of size $F$. $K$ is defined for each graph as a fraction of the number of nodes. Note that the gating operation $\tanh(\mathbf{y})$ (Cangea et al.) can be replaced with a sigmoid (Gao & Ji).

Input

• Node features of shape (n_nodes, n_node_features);
• Binary adjacency matrix of shape (n_nodes, n_nodes);
• Graph IDs of shape (n_nodes, ) (only in disjoint mode);

Output

• Reduced node features of shape (ratio * n_nodes, n_node_features);
• Reduced adjacency matrix of shape (ratio * n_nodes, ratio * n_nodes);
• Reduced graph IDs of shape (ratio * n_nodes, ) (only in disjoint mode);
• If return_mask=True, the binary pooling mask of shape (ratio * n_nodes, ).

Arguments

• ratio: float between 0 and 1, ratio of nodes to keep in each graph;
• return_mask: boolean, whether to return the binary mask used for pooling;
• sigmoid_gating: boolean, use a sigmoid gating activation instead of a tanh;
• kernel_initializer: initializer for the weights;
• kernel_regularizer: regularization applied to the weights;
• kernel_constraint: constraint applied to the weights;
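
For example, a minimal disjoint-mode sketch, where two graphs are stacked into one block-diagonal adjacency matrix and distinguished by their graph IDs (all sizes are arbitrary):

```python
import numpy as np
import scipy.sparse as sp
import tensorflow as tf
from spektral.layers import TopKPool

n, F = 10, 8  # nodes per graph and features (arbitrary)

# Two disjoint graphs stacked along the node dimension
X = tf.random.normal((2 * n, F))
a = sp.block_diag([sp.random(n, n, density=0.3) for _ in range(2)]).tocoo()
A = tf.SparseTensor(
    indices=np.stack([a.row, a.col], axis=1).astype(np.int64),
    values=np.ones_like(a.data, dtype=np.float32),  # binarize the random values
    dense_shape=(2 * n, 2 * n),
)
A = tf.sparse.reorder(A)
I = tf.constant([0] * n + [1] * n)  # graph ID of every node

X_pool, A_pool, I_pool = TopKPool(ratio=0.5)([X, A, I])
print(X_pool.shape)  # (10, 8): half of the nodes of each graph are kept
```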

### Global pooling layers

[source]

#### GlobalAvgPool

spektral.layers.GlobalAvgPool()


An average pooling layer. Pools a graph by computing the average of its node features.

Mode: single, disjoint, mixed, batch.

Input

• Node features of shape ([batch], n_nodes, n_node_features);
• Graph IDs of shape (n_nodes, ) (only in disjoint mode);

Output

• Pooled node features of shape (batch, n_node_features) (if single mode, shape will be (1, n_node_features)).

Arguments

None.
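
For example, a minimal sketch of batch and disjoint usage (GlobalMaxPool and GlobalSumPool expose the same interface; all sizes are arbitrary):

```python
import tensorflow as tf
from spektral.layers import GlobalAvgPool

# Batch mode: 4 graphs of 15 nodes with 8 features each
X_batch = tf.random.normal((4, 15, 8))
print(GlobalAvgPool()(X_batch).shape)  # (4, 8)

# Disjoint mode: two graphs (3 and 2 nodes) stacked along the node axis
X_disjoint = tf.random.normal((5, 8))
I = tf.constant([0, 0, 0, 1, 1])  # graph ID of every node
print(GlobalAvgPool()([X_disjoint, I]).shape)  # (2, 8)
```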

[source]

#### GlobalMaxPool

spektral.layers.GlobalMaxPool()


A max pooling layer. Pools a graph by computing the maximum of its node features.

Mode: single, disjoint, mixed, batch.

Input

• Node features of shape ([batch], n_nodes, n_node_features);
• Graph IDs of shape (n_nodes, ) (only in disjoint mode);

Output

• Pooled node features of shape (batch, n_node_features) (if single mode, shape will be (1, n_node_features)).

Arguments

None.

[source]

#### GlobalSumPool

spektral.layers.GlobalSumPool()


A global sum pooling layer. Pools a graph by computing the sum of its node features.

Mode: single, disjoint, mixed, batch.

Input

• Node features of shape ([batch], n_nodes, n_node_features);
• Graph IDs of shape (n_nodes, ) (only in disjoint mode);

Output

• Pooled node features of shape (batch, n_node_features) (if single mode, shape will be (1, n_node_features)).

Arguments

None.

[source]

#### GlobalAttentionPool

spektral.layers.GlobalAttentionPool(channels, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None)


A gated attention global pooling layer from the paper

Gated Graph Sequence Neural Networks
Yujia Li et al.

This layer computes:

$$
\mathbf{X}' = \sum\limits_{i=1}^{N} \big( \sigma(\mathbf{X}\mathbf{W}_1 + \mathbf{b}_1) \odot (\mathbf{X}\mathbf{W}_2 + \mathbf{b}_2) \big)_i
$$

where $\sigma$ is the sigmoid activation function.

Mode: single, disjoint, mixed, batch.

Input

• Node features of shape ([batch], n_nodes, n_node_features);
• Graph IDs of shape (n_nodes, ) (only in disjoint mode);

Output

• Pooled node features of shape (batch, channels) (if single mode, shape will be (1, channels)).

Arguments

• channels: integer, number of output channels;
• kernel_initializer: initializer for the kernel matrices;
• bias_initializer: initializer for the bias vectors;
• kernel_regularizer: regularization applied to the kernel matrices;
• bias_regularizer: regularization applied to the bias vectors;
• kernel_constraint: constraint applied to the kernel matrices;
• bias_constraint: constraint applied to the bias vectors.
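
For example, a minimal batch-mode sketch; note that the output size is determined by channels rather than by the input features (sizes are arbitrary):

```python
import tensorflow as tf
from spektral.layers import GlobalAttentionPool

# Batch mode: 4 graphs of 15 nodes with 8 features each
X = tf.random.normal((4, 15, 8))
print(GlobalAttentionPool(channels=32)(X).shape)  # (4, 32)
```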

[source]

#### GlobalAttnSumPool

spektral.layers.GlobalAttnSumPool(attn_kernel_initializer='glorot_uniform', attn_kernel_regularizer=None, attn_kernel_constraint=None)


A node-attention global pooling layer. Pools a graph by learning attention coefficients to sum node features.

This layer computes:

$$
\alpha = \textrm{softmax}(\mathbf{X}\mathbf{a}); \quad
\mathbf{X}' = \sum\limits_{i=1}^{N} \alpha_i \cdot \mathbf{X}_i
$$

where $\mathbf{a} \in \mathbb{R}^F$ is a trainable vector. Note that the softmax is applied across nodes, and not across features.

Mode: single, disjoint, mixed, batch.

Input

• Node features of shape ([batch], n_nodes, n_node_features);
• Graph IDs of shape (n_nodes, ) (only in disjoint mode);

Output

• Pooled node features of shape (batch, n_node_features) (if single mode, shape will be (1, n_node_features)).

Arguments

• attn_kernel_initializer: initializer for the attention weights;
• attn_kernel_regularizer: regularization applied to the attention kernel matrix;
• attn_kernel_constraint: constraint applied to the attention kernel matrix;
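
For example, a minimal disjoint-mode sketch, where attention coefficients are computed per graph according to the graph IDs (sizes are arbitrary):

```python
import tensorflow as tf
from spektral.layers import GlobalAttnSumPool

# Disjoint mode: two graphs (3 and 2 nodes) stacked along the node axis
X = tf.random.normal((5, 8))
I = tf.constant([0, 0, 0, 1, 1])  # graph ID of every node
print(GlobalAttnSumPool()([X, I]).shape)  # (2, 8)
```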

[source]

#### SortPool

spektral.layers.SortPool(k)


A SortPool layer from the paper

An End-to-End Deep Learning Architecture for Graph Classification
Muhan Zhang et al.

This layer takes a graph signal $\mathbf{X}$, sorts its rows by their last column, and returns the top $k$ rows. If $\mathbf{X}$ has fewer than $k$ rows, the output is zero-padded to $k$.

Mode: single, disjoint, batch.

Input

• Node features of shape ([batch], n_nodes, n_node_features);
• Graph IDs of shape (n_nodes, ) (only in disjoint mode);

Output

• Pooled node features of shape (batch, k, n_node_features) (if single mode, shape will be (1, k, n_node_features)).

Arguments

• k: integer, number of nodes to keep;
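
For example, a minimal batch-mode sketch (sizes are arbitrary):

```python
import tensorflow as tf
from spektral.layers import SortPool

# Batch mode: 4 graphs of 10 nodes with 8 features each;
# nodes are sorted by their last feature and the top k=6 are kept
X = tf.random.normal((4, 10, 8))
print(SortPool(k=6)(X).shape)  # (4, 6, 8)
```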