## Pooling layers

The following pooling layers are available in Spektral.

See the convolutional layers page for the notation.

#### DiffPool

```
spektral.layers.DiffPool(k, channels=None, return_mask=False, activation=None, kernel_initializer='glorot_uniform', kernel_regularizer=None, kernel_constraint=None)
```

A DiffPool layer from the paper

Hierarchical Graph Representation Learning with Differentiable Pooling

Rex Ying et al.

**Mode**: batch.

This layer computes a soft clustering $S$ of the input graphs using a GNN, and reduces graphs as follows:

$$S = \textrm{GNN}(A, X); \quad A' = S^T A S; \quad X' = S^T X$$

where GNN consists of one GraphConv layer with softmax activation.
Two auxiliary loss terms are also added to the model: the *link prediction loss*

$$\|A - S S^T\|_F$$

and the *entropy loss*

$$-\frac{1}{N} \sum\limits_{i=1}^N S_i \log S_i.$$

The layer also applies a 1-layer GCN to the input features, and returns the updated graph signal (the number of output channels is controlled by the `channels` parameter).
The layer can be used without a supervised loss, to compute node clustering simply by minimizing the two auxiliary losses.

**Input**

- Node features of shape `([batch], n_nodes, n_node_features)`;
- Binary adjacency matrix of shape `([batch], n_nodes, n_nodes)`.

**Output**

- Reduced node features of shape `([batch], K, channels)`;
- Reduced adjacency matrix of shape `([batch], K, K)`;
- If `return_mask=True`, the soft clustering matrix of shape `([batch], n_nodes, K)`.

**Arguments**

- `k`: number of nodes to keep;
- `channels`: number of output channels (if None, the number of output channels is assumed to be the same as the input);
- `return_mask`: boolean, whether to return the cluster assignment matrix;
- `activation`: activation function to use;
- `kernel_initializer`: initializer for the weights;
- `kernel_regularizer`: regularization applied to the weights;
- `kernel_constraint`: constraint applied to the weights.
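The reduction above can be illustrated with plain NumPy. This is a minimal sketch, not Spektral's implementation: the sizes, random inputs, and the single untrained weight matrix `W` are all made up for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
N, F, K = 6, 4, 2                        # nodes, features, clusters to keep
X = rng.normal(size=(N, F))              # node features
A = (rng.random((N, N)) < 0.4).astype(float)
A = np.maximum(A, A.T)                   # symmetric binary adjacency

# S = GNN(A, X): here a single graph conv with softmax over clusters
W = rng.normal(size=(F, K))
S = softmax(A @ X @ W)                   # soft assignments, shape (N, K)

X_pool = S.T @ X                         # reduced features, shape (K, F)
A_pool = S.T @ A @ S                     # reduced adjacency, shape (K, K)

# auxiliary losses
L_lp = np.linalg.norm(A - S @ S.T)                     # link prediction loss
L_e = -np.mean(np.sum(S * np.log(S + 1e-9), axis=-1))  # entropy of the assignments
```

In the actual layer, the assignment weights are trained jointly with the rest of the model, and the two auxiliary losses are added to the model's total loss automatically.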

#### MinCutPool

```
spektral.layers.MinCutPool(k, mlp_hidden=None, mlp_activation='relu', return_mask=False, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None)
```

A MinCut pooling layer from the paper

Spectral Clustering with Graph Neural Networks for Graph Pooling

Filippo Maria Bianchi et al.

**Mode**: batch.

This layer computes a soft clustering $S$ of the input graphs using an MLP, and reduces graphs as follows:

$$S = \textrm{MLP}(X); \quad A' = S^T A S; \quad X' = S^T X$$

where MLP is a multi-layer perceptron with softmax output.
Two auxiliary loss terms are also added to the model: the *minCUT loss*

$$-\frac{\mathrm{Tr}(S^T A S)}{\mathrm{Tr}(S^T D S)}$$

and the *orthogonality loss*

$$\left\| \frac{S^T S}{\|S^T S\|_F} - \frac{I_K}{\sqrt{K}} \right\|_F.$$

The layer can be used without a supervised loss, to compute node clustering simply by minimizing the two auxiliary losses.

**Input**

- Node features of shape `([batch], n_nodes, n_node_features)`;
- Binary adjacency matrix of shape `([batch], n_nodes, n_nodes)`.

**Output**

- Reduced node features of shape `([batch], K, n_node_features)`;
- Reduced adjacency matrix of shape `([batch], K, K)`;
- If `return_mask=True`, the soft clustering matrix of shape `([batch], n_nodes, K)`.

**Arguments**

- `k`: number of nodes to keep;
- `mlp_hidden`: list of integers, number of hidden units for each hidden layer in the MLP used to compute cluster assignments (if None, the MLP has only the output layer);
- `mlp_activation`: activation for the MLP layers;
- `return_mask`: boolean, whether to return the cluster assignment matrix;
- `activation`: activation function to use;
- `use_bias`: boolean, whether to use bias in the MLP;
- `kernel_initializer`: initializer for the weights of the MLP;
- `bias_initializer`: initializer for the bias of the MLP;
- `kernel_regularizer`: regularization applied to the weights of the MLP;
- `bias_regularizer`: regularization applied to the bias of the MLP;
- `kernel_constraint`: constraint applied to the weights of the MLP;
- `bias_constraint`: constraint applied to the bias of the MLP.
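The clustering and the two auxiliary losses can be sketched in NumPy. All sizes and the single-layer "MLP" are illustrative, not Spektral's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
N, F, K = 8, 3, 3                        # nodes, features, clusters
X = rng.normal(size=(N, F))
A = (rng.random((N, N)) < 0.5).astype(float)
A = np.maximum(A, A.T)                   # symmetric binary adjacency
np.fill_diagonal(A, 0)
D = np.diag(A.sum(axis=-1))              # degree matrix

# S = MLP(X): here a single dense layer with softmax output
S = softmax(X @ rng.normal(size=(F, K)))

X_pool = S.T @ X                         # (K, F)
A_pool = S.T @ A @ S                     # (K, K)

# minCUT loss: rewards assignments that keep strongly connected nodes together
L_cut = -np.trace(S.T @ A @ S) / np.trace(S.T @ D @ S)
# orthogonality loss: pushes S^T S towards a scaled identity (balanced clusters)
StS = S.T @ S
L_orth = np.linalg.norm(StS / np.linalg.norm(StS) - np.eye(K) / np.sqrt(K))
```

Note that the minCUT loss is bounded in $[-1, 0]$, with $-1$ corresponding to a perfect partition of disconnected clusters.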

#### SAGPool

```
spektral.layers.SAGPool(ratio, return_mask=False, sigmoid_gating=False, kernel_initializer='glorot_uniform', kernel_regularizer=None, kernel_constraint=None)
```

A self-attention graph pooling layer (SAG) from the paper

Self-Attention Graph Pooling

Junhyun Lee et al.

**Mode**: single, disjoint.

This layer computes the following operations:

$$y = \textrm{GNN}(A, X); \quad \mathbf{i} = \textrm{rank}(y, K); \quad X' = (X \odot \tanh(y))_{\mathbf{i}}; \quad A' = A_{\mathbf{i}, \mathbf{i}}$$

where $\textrm{rank}(y, K)$ returns the indices of the top $K$ values of $y$, and GNN consists of one GraphConv layer with no activation. $K$ is defined for each graph as a fraction of the number of nodes, controlled by the `ratio` argument.

This layer temporarily makes the adjacency matrix dense in order to compute $A'$. If memory is not an issue, considerable speedups can be achieved by using dense graphs directly; converting a graph from sparse to dense and back to sparse is an expensive operation.

**Input**

- Node features of shape `(n_nodes, n_node_features)`;
- Binary adjacency matrix of shape `(n_nodes, n_nodes)`;
- Graph IDs of shape `(n_nodes, )` (only in disjoint mode).

**Output**

- Reduced node features of shape `(ratio * n_nodes, n_node_features)`;
- Reduced adjacency matrix of shape `(ratio * n_nodes, ratio * n_nodes)`;
- Reduced graph IDs of shape `(ratio * n_nodes, )` (only in disjoint mode);
- If `return_mask=True`, the binary pooling mask of shape `(ratio * n_nodes, )`.

**Arguments**

- `ratio`: float between 0 and 1, ratio of nodes to keep in each graph;
- `return_mask`: boolean, whether to return the binary mask used for pooling;
- `sigmoid_gating`: boolean, use a sigmoid gating activation instead of a tanh;
- `kernel_initializer`: initializer for the weights;
- `kernel_regularizer`: regularization applied to the weights;
- `kernel_constraint`: constraint applied to the weights.
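The top-K selection and gating can be sketched in NumPy for a single graph (toy sizes, one untrained graph conv as the score function; not Spektral's implementation):

```python
import numpy as np

rng = np.random.default_rng(2)
N, F = 8, 4
ratio = 0.5
X = rng.normal(size=(N, F))
A = (rng.random((N, N)) < 0.4).astype(float)
A = np.maximum(A, A.T)                   # symmetric binary adjacency

# y = GNN(A, X): one graph conv with a single output channel, no activation
a = rng.normal(size=(F, 1))
y = (A @ X @ a).ravel()                  # one self-attention score per node

K = int(np.ceil(ratio * N))
idx = np.sort(np.argsort(y)[-K:])        # rank(y, K): indices of the top K scores

X_pool = X[idx] * np.tanh(y[idx])[:, None]  # gate the kept features with tanh(y)
A_pool = A[np.ix_(idx, idx)]                # reduced adjacency
```

The `tanh` gating keeps the scoring layer differentiable: gradients reach `a` through the gated features even though the index selection itself is not differentiable.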

#### TopKPool

```
spektral.layers.TopKPool(ratio, return_mask=False, sigmoid_gating=False, kernel_initializer='glorot_uniform', kernel_regularizer=None, kernel_constraint=None)
```

A gPool/Top-K layer from the papers

Graph U-Nets

Hongyang Gao and Shuiwang Ji

and

Towards Sparse Hierarchical Graph Classifiers

Cătălina Cangea et al.

**Mode**: single, disjoint.

This layer computes the following operations:

$$y = \frac{Xp}{\|p\|}; \quad \mathbf{i} = \textrm{rank}(y, K); \quad X' = (X \odot \tanh(y))_{\mathbf{i}}; \quad A' = A_{\mathbf{i}, \mathbf{i}}$$

where $\textrm{rank}(y, K)$ returns the indices of the top $K$ values of $y$, and $p$ is a learnable parameter vector of size $F$. $K$ is defined for each graph as a fraction of the number of nodes, controlled by the `ratio` argument. Note that the gating operation $\tanh(y)$ (Cangea et al.) can be replaced with a sigmoid (Gao & Ji).

This layer temporarily makes the adjacency matrix dense in order to compute $A'$. If memory is not an issue, considerable speedups can be achieved by using dense graphs directly; converting a graph from sparse to dense and back to sparse is an expensive operation.

**Input**

- Node features of shape `(n_nodes, n_node_features)`;
- Binary adjacency matrix of shape `(n_nodes, n_nodes)`;
- Graph IDs of shape `(n_nodes, )` (only in disjoint mode).

**Output**

- Reduced node features of shape `(ratio * n_nodes, n_node_features)`;
- Reduced adjacency matrix of shape `(ratio * n_nodes, ratio * n_nodes)`;
- Reduced graph IDs of shape `(ratio * n_nodes, )` (only in disjoint mode);
- If `return_mask=True`, the binary pooling mask of shape `(ratio * n_nodes, )`.

**Arguments**

- `ratio`: float between 0 and 1, ratio of nodes to keep in each graph;
- `return_mask`: boolean, whether to return the binary mask used for pooling;
- `sigmoid_gating`: boolean, use a sigmoid gating activation instead of a tanh;
- `kernel_initializer`: initializer for the weights;
- `kernel_regularizer`: regularization applied to the weights;
- `kernel_constraint`: constraint applied to the weights.
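Compared to SAGPool, only the score function changes: a projection onto a learnable vector instead of a graph conv. A NumPy sketch with illustrative sizes (not Spektral's implementation):

```python
import numpy as np

rng = np.random.default_rng(3)
N, F = 10, 3
ratio = 0.4
X = rng.normal(size=(N, F))
A = (rng.random((N, N)) < 0.3).astype(float)
A = np.maximum(A, A.T)                   # symmetric binary adjacency

p = rng.normal(size=F)                   # learnable projection vector of size F
y = X @ p / np.linalg.norm(p)            # normalized projection score per node

K = int(np.ceil(ratio * N))
idx = np.sort(np.argsort(y)[-K:])        # rank(y, K)

X_pool = X[idx] * np.tanh(y[idx])[:, None]  # tanh gating (sigmoid if sigmoid_gating=True)
A_pool = A[np.ix_(idx, idx)]
```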

### Global pooling layers

#### GlobalAvgPool

```
spektral.layers.GlobalAvgPool()
```

An average pooling layer. Pools a graph by computing the average of its node features.

**Mode**: single, disjoint, mixed, batch.

**Input**

- Node features of shape `([batch], n_nodes, n_node_features)`;
- Graph IDs of shape `(n_nodes, )` (only in disjoint mode).

**Output**

- Pooled node features of shape `(batch, n_node_features)` (if single mode, shape will be `(1, n_node_features)`).

**Arguments**

None.
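In disjoint mode, the graph IDs indicate which rows belong to which graph, and the pooling reduces to a per-graph (segment) average. A NumPy sketch with two toy graphs:

```python
import numpy as np

X = np.array([[1., 2.],
              [3., 4.],
              [5., 6.],
              [7., 8.]])          # 4 nodes, 2 features
I = np.array([0, 0, 1, 1])        # graph IDs: nodes 0-1 -> graph 0, nodes 2-3 -> graph 1

# one output row per graph: the mean of that graph's node features
out = np.stack([X[I == g].mean(axis=0) for g in np.unique(I)])
print(out)  # [[2. 3.]
            #  [6. 7.]]
```

GlobalMaxPool and GlobalSumPool below follow the same pattern with `max` and `sum` in place of `mean`.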

#### GlobalMaxPool

```
spektral.layers.GlobalMaxPool()
```

A max pooling layer. Pools a graph by computing the maximum of its node features.

**Mode**: single, disjoint, mixed, batch.

**Input**

- Node features of shape `([batch], n_nodes, n_node_features)`;
- Graph IDs of shape `(n_nodes, )` (only in disjoint mode).

**Output**

- Pooled node features of shape `(batch, n_node_features)` (if single mode, shape will be `(1, n_node_features)`).

**Arguments**

None.

#### GlobalSumPool

```
spektral.layers.GlobalSumPool()
```

A global sum pooling layer. Pools a graph by computing the sum of its node features.

**Mode**: single, disjoint, mixed, batch.

**Input**

- Node features of shape `([batch], n_nodes, n_node_features)`;
- Graph IDs of shape `(n_nodes, )` (only in disjoint mode).

**Output**

- Pooled node features of shape `(batch, n_node_features)` (if single mode, shape will be `(1, n_node_features)`).

**Arguments**

None.

#### GlobalAttentionPool

```
spektral.layers.GlobalAttentionPool(channels, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, kernel_constraint=None, bias_constraint=None)
```

A gated attention global pooling layer from the paper

Gated Graph Sequence Neural Networks

Yujia Li et al.

This layer computes:

$$X' = \sum\limits_{i=1}^{N} (\sigma(X W_1 + b_1) \odot (X W_2 + b_2))_i$$

where $\sigma$ is the sigmoid activation function.

**Mode**: single, disjoint, mixed, batch.

**Input**

- Node features of shape `([batch], n_nodes, n_node_features)`;
- Graph IDs of shape `(n_nodes, )` (only in disjoint mode).

**Output**

- Pooled node features of shape `(batch, channels)` (if single mode, shape will be `(1, channels)`).

**Arguments**

- `channels`: integer, number of output channels;
- `kernel_initializer`: initializer for the kernel matrices;
- `bias_initializer`: initializer for the bias vectors;
- `kernel_regularizer`: regularization applied to the kernel matrices;
- `bias_regularizer`: regularization applied to the bias vectors;
- `kernel_constraint`: constraint applied to the kernel matrices;
- `bias_constraint`: constraint applied to the bias vectors.
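The gated sum can be sketched in NumPy: two parallel dense transforms, where one produces per-node gates and the other produces the values to be summed. Sizes and untrained weights are illustrative only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(4)
N, F, channels = 5, 3, 2
X = rng.normal(size=(N, F))

# two parallel dense transforms: one for the gates, one for the values
W1, b1 = rng.normal(size=(F, channels)), np.zeros(channels)
W2, b2 = rng.normal(size=(F, channels)), np.zeros(channels)

gate = sigmoid(X @ W1 + b1)                 # per-node, per-channel gates in (0, 1)
out = np.sum(gate * (X @ W2 + b2), axis=0)  # gated sum over nodes, shape (channels,)
```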

#### GlobalAttnSumPool

```
spektral.layers.GlobalAttnSumPool(attn_kernel_initializer='glorot_uniform', attn_kernel_regularizer=None, attn_kernel_constraint=None)
```

A node-attention global pooling layer. Pools a graph by learning attention coefficients to sum node features.

This layer computes:

$$\alpha = \textrm{softmax}(Xa); \quad X' = \sum\limits_{i=1}^{N} \alpha_i X_i$$

where $a \in \mathbb{R}^F$ is a trainable vector. Note that the softmax is applied across nodes, and not across features.

**Mode**: single, disjoint, mixed, batch.

**Input**

- Node features of shape `([batch], n_nodes, n_node_features)`;
- Graph IDs of shape `(n_nodes, )` (only in disjoint mode).

**Output**

- Pooled node features of shape `(batch, n_node_features)` (if single mode, shape will be `(1, n_node_features)`).

**Arguments**

- `attn_kernel_initializer`: initializer for the attention weights;
- `attn_kernel_regularizer`: regularization applied to the attention kernel matrix;
- `attn_kernel_constraint`: constraint applied to the attention kernel matrix.
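A NumPy sketch of the attention-weighted sum for one graph (illustrative sizes and an untrained attention vector):

```python
import numpy as np

rng = np.random.default_rng(5)
N, F = 4, 3
X = rng.normal(size=(N, F))
a = rng.normal(size=F)                   # trainable attention vector

e = X @ a                                # one raw score per node
alpha = np.exp(e - e.max())
alpha = alpha / alpha.sum()              # softmax across nodes (not features)
out = alpha @ X                          # attention-weighted sum, shape (F,)
```

Because the coefficients sum to one over the nodes, the output is a convex combination of the node features, unlike the unnormalized gating of GlobalAttentionPool.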

#### SortPool

```
spektral.layers.SortPool(k)
```

A SortPool layer as described by Zhang et al. This layer takes a graph signal $X$ and returns the topmost $k$ rows, sorted according to the values in the last column. If $X$ has fewer than $k$ rows, the result is zero-padded to $k$.

**Mode**: single, disjoint, batch.

**Input**

- Node features of shape `([batch], n_nodes, n_node_features)`;
- Graph IDs of shape `(n_nodes, )` (only in disjoint mode).

**Output**

- Pooled node features of shape `(batch, k, n_node_features)` (if single mode, shape will be `(1, k, n_node_features)`).

**Arguments**

- `k`: integer, number of nodes to keep.
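The sort-and-pad behavior can be sketched in NumPy for a single graph (an illustration of the operation, not Spektral's implementation):

```python
import numpy as np

def sort_pool(X, k):
    """Keep the top-k rows of X, sorted by the last column; zero-pad to k rows."""
    order = np.argsort(-X[:, -1])        # row indices, descending by last column
    out = X[order[:k]]
    if out.shape[0] < k:
        pad = np.zeros((k - out.shape[0], X.shape[1]))
        out = np.vstack([out, pad])      # fewer than k rows: pad with zeros
    return out

X = np.array([[0.1, 3.0],
              [0.2, 1.0],
              [0.3, 2.0]])
print(sort_pool(X, 2))   # rows with last-column values 3.0 and 2.0
print(sort_pool(X, 5))   # all three rows, plus two rows of zeros
```

In the full architecture, the last column is typically the output of the final graph convolution, so nodes are ranked by their learned "importance" before being handed to a 1-D convolutional readout.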