opengt.layer

class opengt.layer.appnp_layer.APPNP(layer_config: LayerConfig, **kwargs)[source]

Bases: Module

Wrapper layer for the APPNP layer from torch_geometric.nn.

Parameters:
  • dim_in (int) – Number of input features. Handled by GraphGym.

  • dim_out (int) – Number of output features. Handled by GraphGym.

  • K (int) – Number of propagation steps. Default is 10.

  • alpha (float) – Teleport probability. Default is 0.1.

Input:

  • batch.x (Tensor) – Input node features.

  • batch.edge_index (Tensor) – Edge indices of the graph.

Output:

ret.x (Tensor): Output node features after applying the APPNP layer.
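
The roles of K and alpha can be illustrated without any graph library. The sketch below assumes the standard APPNP update, Z ← (1 − alpha)·Â·Z + alpha·H, where Â is the symmetrically normalized adjacency with self-loops. The helper name `appnp_propagate` and the list-based graph representation are illustrative only and are not part of opengt or torch_geometric.

```python
import math

def appnp_propagate(h, edges, K=10, alpha=0.1):
    """Plain-Python sketch of APPNP propagation (illustrative, not opengt code).

    h     : list of per-node feature lists, shape [n_nodes][n_feats]
    edges : list of undirected (u, v) pairs
    """
    n = len(h)
    # Neighbour sets including a self-loop for every node.
    nbrs = [{i} for i in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    deg = [len(s) for s in nbrs]

    z = [row[:] for row in h]
    for _ in range(K):
        new_z = []
        for i in range(n):
            row = []
            for f in range(len(h[i])):
                # Symmetrically normalized neighbourhood sum.
                agg = sum(z[j][f] / math.sqrt(deg[i] * deg[j]) for j in nbrs[i])
                # Teleport: mix the propagated signal with the original features.
                row.append((1 - alpha) * agg + alpha * h[i][f])
            new_z.append(row)
        z = new_z
    return z
```

With alpha = 1.0 the output equals the input features regardless of K, and larger K spreads information over more hops.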

class opengt.layer.bga_layer.BGALayer(n_head, channels, dropout=0.1)[source]

Bases: Module

Bilevel Graph Attention layer, used in the CoBFormer model. Adapted from https://github.com/null-xyj/CoBFormer

Parameters:
  • n_head (int) – Number of attention heads. Handled by GraphGym.

  • channels (int) – Number of input channels. Handled by GraphGym.

  • dropout (float) – Dropout rate.

Input:

  • x (Tensor) – Input node features.

  • patch (Tensor) – Patch indices.

  • attn_mask (Tensor) – Attention mask.

  • need_attn (bool) – Whether to return attention weights.

Output:

x (Tensor): Output node features after applying the BGA layer.

class opengt.layer.bga_layer.FFN(channels, dropout=0.1)[source]

Bases: Module

A two-layer feed-forward module.

class opengt.layer.bga_layer.MultiHeadAttention(n_head, channels, dropout=0.1)[source]

Bases: Module

Multi-Head Attention module

class opengt.layer.bga_layer.ScaledDotProductAttention(temperature, attn_dropout=0.1)[source]

Bases: Module

Scaled Dot-Product Attention
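
As a rough illustration of what the temperature argument usually controls, the sketch below computes softmax(q·kᵀ / temperature)·v for plain Python lists. It assumes the common convention that scores are divided by the temperature (often the square root of the key dimension) and omits dropout. The function name is hypothetical, and the class itself may differ in detail.

```python
import math

def scaled_dot_product_attention(q, k, v, temperature):
    """softmax(q k^T / temperature) v for lists of vectors (illustrative sketch)."""
    out = []
    for qi in q:
        # Scaled similarity between this query and every key.
        scores = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted average of the value vectors.
        out.append([sum(w * vj[d] for w, vj in zip(weights, v))
                    for d in range(len(v[0]))])
    return out
```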

class opengt.layer.degta_layer.DeGTAConv(dim_in)[source]

Bases: Module

Decoupled Graph Triple Attention layer. Adapted from https://github.com/wangxiaotang0906/DeGTA

Parameters:

dim_in (int) – Number of input features.

Input:

  • batch.x (Tensor) – Input node features. Should be concatenated from three different encoders.

  • batch.edge_index (Tensor) – Edge indices of the graph.

Output:

ret.x (Tensor): Output node features after applying the DeGTA layer.

class opengt.layer.ETransformer.ETransformer(in_dim, out_dim, num_heads, use_bias, edge_index='edge_index', use_edge_attr=False, edge_attr='edge_attr')[source]

Bases: Module

Mostly Multi-Head Graph Attention Layer.

Ported to PyG from original repo: https://github.com/DevinKreuzer/SAN/blob/main/layers/graph_transformer_layer.py

class opengt.layer.Exphormer.ExphormerAttention(in_dim, out_dim, num_heads, use_bias, dim_edge=None, use_virt_nodes=False)[source]

Bases: Module

class opengt.layer.Exphormer.ExphormerFullLayer(in_dim, out_dim, num_heads, dropout=0.0, dim_edge=None, layer_norm=False, batch_norm=True, activation='relu', residual=True, use_bias=False, use_virt_nodes=False)[source]

Bases: Module

Exphormer attention + FFN. Adapted from https://github.com/hamed1375/Exphormer

Parameters:
  • in_dim (int) – Number of input features.

  • out_dim (int) – Number of output features.

  • num_heads (int) – Number of attention heads.

  • dropout (float) – Dropout rate.

  • dim_edge (int) – Number of edge features. Default: None.

  • layer_norm (bool) – Whether to use layer normalization. Default: False.

  • batch_norm (bool) – Whether to use batch normalization. Default: True.

  • activation (str) – Activation function. Default: ‘relu’.

  • residual (bool) – Whether to use residual connection. Default: True.

  • use_bias (bool) – Whether to use bias in linear layers. Default: False.

  • use_virt_nodes (bool) – Whether to use virtual nodes. Default: False.

Input:

  • batch.x (Tensor) – Input node features.

  • batch.edge_index (Tensor) – Edge indices of the graph.

  • batch.expander_edge_attr (Tensor) – Edge features for attention.

  • batch.expander_edge_index (Tensor) – Edge indices for attention.

  • batch.virt_h (Tensor) – Virtual node features.

Output:

batch.x (Tensor): Output node features after applying the Exphormer layer.

class opengt.layer.gatedgcn_layer.GatedGCNGraphGymLayer(layer_config: LayerConfig, **kwargs)[source]

Bases: Module

GatedGCN layer. Reference: Residual Gated Graph ConvNets, https://arxiv.org/pdf/1711.07553.pdf

Parameters:
  • in_dim (int) – Number of input features. Handled by GraphGym.

  • out_dim (int) – Number of output features. Handled by GraphGym.

class opengt.layer.gatedgcn_layer.GatedGCNLayer(in_dim, out_dim, dropout, residual, act='relu', equivstable_pe=False, **kwargs)[source]

Bases: MessagePassing

GatedGCN layer. Reference: Residual Gated Graph ConvNets, https://arxiv.org/pdf/1711.07553.pdf

aggregate(sigma_ij, index, Bx_j, Bx)[source]

  • sigma_ij – [n_edges, out_dim]; the output of the message() function.

  • index – [n_edges].

  • Bx_j – [n_edges, out_dim].

message(Dx_i, Ex_j, PE_i, PE_j, Ce)[source]

  • Dx_i – [n_edges, out_dim].

  • Ex_j – [n_edges, out_dim].

  • Ce – [n_edges, out_dim].

update(aggr_out, Ax)[source]

  • aggr_out – [n_nodes, out_dim]; the output of the aggregate() function after aggregation.

  • Ax – [n_nodes, out_dim].
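
Taken together, message(), aggregate() and update() can be read as a gated neighbourhood average. The sketch below follows the Residual Gated Graph ConvNets formulation as it is commonly implemented, in plain Python. It is an assumption about the flow, not a copy of this class: positional-encoding terms, dropout, normalization and residual connections are omitted, and the function name `gated_aggregate` is hypothetical.

```python
import math

def gated_aggregate(edge_index, Ax, Bx, Dx, Ex, Ce, eps=1e-6):
    """Sketch of the gated message/aggregate/update flow (illustrative only).

    edge_index     : (sources, targets), two lists of length n_edges
    Ax, Bx, Dx, Ex : per-node projections, shape [n_nodes][out_dim]
    Ce             : per-edge projection, shape [n_edges][out_dim]
    """
    src, dst = edge_index
    n_nodes, out_dim = len(Ax), len(Ax[0])
    num = [[0.0] * out_dim for _ in range(n_nodes)]
    den = [[0.0] * out_dim for _ in range(n_nodes)]
    e_out = []
    for e, (j, i) in enumerate(zip(src, dst)):
        # message(): edge pre-activation and its sigmoid gate
        e_ij = [Dx[i][d] + Ex[j][d] + Ce[e][d] for d in range(out_dim)]
        sigma_ij = [1.0 / (1.0 + math.exp(-t)) for t in e_ij]
        e_out.append(e_ij)
        # aggregate(): accumulate gated neighbour features per target node
        for d in range(out_dim):
            num[i][d] += sigma_ij[d] * Bx[j][d]
            den[i][d] += sigma_ij[d]
    # update(): add the normalized aggregate to the node's own projection
    x_out = [[Ax[i][d] + num[i][d] / (den[i][d] + eps) for d in range(out_dim)]
             for i in range(n_nodes)]
    return x_out, e_out
```

Nodes without incoming edges keep their own projection Ax unchanged, because the numerator of the aggregate is zero for them.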

class opengt.layer.gine_conv_layer.GINEConvESLapPE(nn, eps=0.0, train_eps=False, edge_dim=None, **kwargs)[source]

Bases: MessagePassing

GINEConv Layer with EquivStableLapPE implementation.

A modified torch_geometric.nn.conv.GINEConv layer that scales messages according to the equivariant and stable PEG layer with Laplacian eigenmap positional encodings (LapPE), ICLR 2022: https://openreview.net/pdf?id=e95i1IHcWj

message(x_j, edge_attr, PE_i, PE_j)[source]

Constructs messages from node \(j\) to node \(i\) in analogy to \(\phi_{\mathbf{\Theta}}\) for each edge in edge_index. This function can take any argument as input that was initially passed to propagate(). Furthermore, tensors passed to propagate() can be mapped to the respective nodes \(i\) and \(j\) by appending _i or _j to the variable name, e.g. x_i and x_j.

reset_parameters()[source]

Resets all learnable parameters of the module.

class opengt.layer.gine_conv_layer.GINEConvGraphGymLayer(layer_config: LayerConfig, **kwargs)[source]

Bases: Module

Graph Isomorphism Network with Edge features (GINE) layer.

Parameters:
  • dim_in (int) – Number of input features. Handled by GraphGym.

  • dim_out (int) – Number of output features. Handled by GraphGym.

Input:

  • batch.x (Tensor) – Input node features.

  • batch.edge_index (Tensor) – Edge indices of the graph.

  • batch.edge_attr (Tensor) – Edge features.

Output:

ret.x (Tensor): Output node features after applying the GINE layer.

class opengt.layer.gine_conv_layer.GINEConvLayer(dim_in, dim_out, dropout, residual)[source]

Bases: Module

Graph Isomorphism Network with Edge features (GINE) layer.

class opengt.layer.gps_layer.GPSLayer(dim_h, local_gnn_type, global_model_type, num_heads, act='relu', pna_degrees=None, equivstable_pe=False, dropout=0.0, attn_dropout=0.0, layer_norm=False, batch_norm=True, bigbird_cfg=None, log_attn_weights=False)[source]

Bases: Module

Local MPNN + full graph attention x-former layer. Adapted from https://github.com/rampasek/GraphGPS

Parameters:
  • dim_h (int) – Number of input features.

  • local_gnn_type (str) – Type of local GNN model. Options: ‘None’, ‘GCN’, ‘GIN’, ‘GENConv’, ‘GINE’, ‘GAT’, ‘PNA’, ‘CustomGatedGCN’.

  • global_model_type (str) – Type of global attention model. Options: ‘None’, ‘Transformer’, ‘BiasedTransformer’, ‘Performer’, ‘BigBird’.

  • num_heads (int) – Number of attention heads.

  • act (str) – Activation function. Default: ‘relu’.

  • pna_degrees (list) – Degrees for PNAConv. Default: None.

  • equivstable_pe (bool) – Whether to use EquivStableLapPE. Default: False.

  • dropout (float) – Dropout rate. Default: 0.0.

  • attn_dropout (float) – Attention dropout rate. Default: 0.0.

  • layer_norm (bool) – Whether to use layer normalization. Default: False.

  • batch_norm (bool) – Whether to use batch normalization. Default: True.

  • bigbird_cfg (object) – Configuration object for BigBird layer. Default: None.

  • log_attn_weights (bool) – Whether to log attention weights. Default: False.

Input:

  • batch.x (Tensor) – Input node features.

  • batch.edge_index (Tensor) – Edge indices of the graph.

  • batch.edge_attr (Tensor) – Edge attributes.

  • batch.pe_EquivStableLapPE (Tensor) – EquivStableLapPE features.

  • batch.attn_bias (Tensor) – Attention bias for BiasedTransformer.

Output:

batch.x (Tensor): Output node features after applying the GPS layer.

extra_repr()[source]

Set the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

class opengt.layer.graphormer_layer.GraphormerLayer(embed_dim: int, num_heads: int, dropout: float, attention_dropout: float, mlp_dropout: float)[source]

Bases: Module

class opengt.layer.grit_layer.GritTransformerLayer(in_dim, out_dim, num_heads, dropout=0.0, attn_dropout=0.0, layer_norm=False, batch_norm=True, residual=True, act='relu', norm_e=True, O_e=True, cfg={}, **kwargs)[source]

Bases: Module

Proposed Transformer Layer for GRIT. Adapted from https://github.com/LiamMa/GRIT

Parameters:
  • in_dim (int) – Number of input features.

  • out_dim (int) – Number of output features.

  • num_heads (int) – Number of attention heads.

  • dropout (float) – Dropout rate.

  • attn_dropout (float) – Attention dropout rate.

  • layer_norm (bool) – Whether to use layer normalization.

  • batch_norm (bool) – Whether to use batch normalization.

  • residual (bool) – Whether to use residual connections.

  • act (str) – Activation function (‘relu’, ‘gelu’, etc.).

  • norm_e (bool) – Whether to normalize edge features.

  • O_e (bool) – Whether to use edge features in the output.

Input:

  • batch.x (torch.Tensor) – Input node features.

  • batch.edge_index (torch.Tensor) – Edge indices of the graph.

  • batch.edge_attr (torch.Tensor) – Edge attributes.

Output:

  • batch.x (torch.Tensor) – Output node features after applying the GritTransformer layer.

  • batch.edge_attr (torch.Tensor) – Updated edge attributes.

class opengt.layer.grit_layer.MultiHeadAttentionLayerGritSparse(in_dim, out_dim, num_heads, use_bias, clamp=5.0, dropout=0.0, act=None, edge_enhance=True, sqrt_relu=False, signed_sqrt=True, cfg={}, **kwargs)[source]

Bases: Module

Proposed Attention Computation for GRIT

opengt.layer.grit_layer.pyg_softmax(src, index, num_nodes=None)[source]

Computes a sparsely evaluated softmax. Given a value tensor src, this function first groups the values along the first dimension based on the indices specified in index, and then proceeds to compute the softmax individually for each group.

Parameters:
  • src (Tensor) – The source tensor.

  • index (LongTensor) – The indices of elements for applying the softmax.

  • num_nodes (int, optional) – The number of nodes, i.e. max_val + 1 of index. (default: None)

Return type:

Tensor
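
The grouping behaviour described above can be sketched in plain Python. This is a minimal illustration of a softmax evaluated per index group, not the library's tensor implementation, and the name `grouped_softmax` is hypothetical.

```python
import math

def grouped_softmax(src, index, num_nodes=None):
    """Softmax over the entries of `src` that share the same value in `index`."""
    if num_nodes is None:
        num_nodes = max(index) + 1
    # Per-group maximum, subtracted for numerical stability.
    group_max = [-math.inf] * num_nodes
    for s, g in zip(src, index):
        group_max[g] = max(group_max[g], s)
    exps = [math.exp(s - group_max[g]) for s, g in zip(src, index)]
    # Per-group normalizer.
    group_sum = [0.0] * num_nodes
    for e, g in zip(exps, index):
        group_sum[g] += e
    return [e / group_sum[g] for e, g in zip(exps, index)]
```

In attention layers, `index` is typically the target node of each edge, so the attention scores of all edges arriving at the same node sum to one.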

class opengt.layer.mlp_mixer.MLPMixer(layers, dim_hidden, patches, with_final_norm=True, dropout=0)[source]

Bases: Module

GraphMLPMixer layer. Adapted from https://github.com/XiaoxinHe/Graph-ViT-MLPMixer

Parameters:
  • layers (int) – Number of Mixer blocks.

  • dim_hidden (int) – Number of input features.

  • patches (int) – Number of patches.

  • with_final_norm (bool) – Whether to apply final normalization. Default: True.

  • dropout (float) – Dropout rate. Default: 0.0.

Input:

batch.x (torch.Tensor): Input node features.

Output:

batch.x (torch.Tensor): Output node features after applying the Mixer blocks.

class opengt.layer.multi_model_layer.MultiLayer(dim_h, model_types, num_heads, pna_degrees=None, equivstable_pe=False, dropout=0.0, attn_dropout=0.0, layer_norm=False, batch_norm=True, bigbird_cfg=None, exp_edges_cfg=None)[source]

Bases: Module

This layer can be used to combine different types of layers. Adapted from https://github.com/hamed1375/Exphormer

Any combination of different models can be made here.

Each layer can combine several types of MPNN and attention models. Examples:

  1. GCN

  2. GCN + Exphormer

  3. GINE + CustomGatedGCN

  4. GAT + CustomGatedGCN + Exphormer + Transformer

Parameters:
  • dim_h (int) – Number of input features.

  • model_types (list) – List of model types to combine.

  • num_heads (int) – Number of attention heads.

  • pna_degrees (list) – List of degrees for PNAConv. Default: None.

  • equivstable_pe (bool) – Whether to use EquivStableLapPE. Default: False.

  • dropout (float) – Dropout rate. Default: 0.0.

  • attn_dropout (float) – Attention dropout rate. Default: 0.0.

  • layer_norm (bool) – Whether to use layer normalization. Default: False.

  • batch_norm (bool) – Whether to use batch normalization. Default: True.

  • bigbird_cfg (dict) – Configuration for BigBird layer. Default: None.

  • exp_edges_cfg (dict) – Configuration for expander edges. Default: None.

Input:

  • batch.x (torch.Tensor) – Input node features.

  • batch.edge_index (torch.Tensor) – Edge indices of the graph.

  • batch.edge_attr (torch.Tensor) – Edge attributes of the graph.

  • batch.expander_edge_index (torch.Tensor) – Expander edge indices.

  • batch.expander_edge_attr (torch.Tensor) – Expander edge attributes.

  • batch.pe_EquivStableLapPE (torch.Tensor) – EquivStableLapPE features.

Output:

batch.x (torch.Tensor): Output node features after applying the combined layers.

extra_repr()[source]

Set the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

class opengt.layer.multi_model_layer.SingleLayer(dim_h, model_type, num_heads, pna_degrees=None, equivstable_pe=False, dropout=0.0, attn_dropout=0.0, layer_norm=False, batch_norm=True, bigbird_cfg=None, exp_edges_cfg=None)[source]

Bases: Module

Uses a single layer type. Unlike the multi-model layer (MultiLayer), it neither combines representations nor applies a feed-forward network after each layer.

extra_repr()[source]

Set the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

class opengt.layer.nodeformer_layer.NodeFormerConv(dim_in, dim_out, config)[source]

Bases: Module

One layer of NodeFormer that attentively aggregates all nodes over a latent graph. Adapted from https://github.com/qitianwu/NodeFormer

Parameters:
  • dim_in (int) – Number of input features.

  • dim_out (int) – Number of output features.

  • config (object) – Configuration object containing hyperparameters:

      - rb_order (int): Order of relational bias.
      - rb_trans (str): Transformation for relational bias, either ‘sigmoid’ or ‘identity’.
      - kernel_trans (str): Type of kernel transformation, either ‘softmax’ or ‘relu’.
      - projection_matrix_type (str): Type of projection matrix, either ‘a’ or None.
      - nb_random_features (int): Number of random features.
      - use_gumbel (bool): Whether to use Gumbel sampling.
      - nb_gumbel_sample (int): Number of Gumbel samples.
      - use_edge_loss (bool): Whether to use edge loss.
      - use_bn (bool): Whether to use batch normalization.
      - use_residual (bool): Whether to use residual connection.
      - use_act (bool): Whether to use activation function.
      - dropout (float): Dropout rate.
      - tau (float): Temperature parameter for Gumbel softmax.
      - n_heads (int): Number of attention heads.
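
For orientation, the configuration fields listed above could be gathered into an object like the one sketched here. Only the field names come from the parameter list; every value is a hypothetical placeholder, and `SimpleNamespace` is a stand-in that assumes the layer reads its settings by attribute access. In a GraphGym-style project the object would more likely be a config node.

```python
from types import SimpleNamespace

# Hypothetical values; only the field names come from the parameter list.
nodeformer_config = SimpleNamespace(
    rb_order=1,                  # order of relational bias
    rb_trans="sigmoid",          # 'sigmoid' or 'identity'
    kernel_trans="softmax",      # 'softmax' or 'relu'
    projection_matrix_type="a",  # 'a' or None
    nb_random_features=30,
    use_gumbel=True,
    nb_gumbel_sample=10,
    use_edge_loss=False,
    use_bn=True,
    use_residual=True,
    use_act=True,
    dropout=0.0,
    tau=0.25,                    # temperature for Gumbel softmax
    n_heads=4,
)
```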

Input:

  • batch.x (torch.Tensor) – Input node features.

  • batch.adjs (list) – List of adjacency matrices for different orders of relational bias.

  • batch.extra_loss (torch.Tensor) – Aggregated extra loss for edge regularization in previous layers.

Output:

  • ret.x (torch.Tensor) – Output node features after applying the NodeFormer layer.

  • ret.extra_loss (torch.Tensor) – Aggregated extra loss for edge regularization.

Returns:

Node embeddings for the next layer and the edge loss at this layer.

opengt.layer.nodeformer_layer.add_conv_relational_bias(x, edge_index, b, trans='sigmoid')[source]

Computes the updated result using the relational bias of the input adjacency. The implementation is similar to a Graph Convolutional Network with a (shared) scalar weight for each edge.

opengt.layer.nodeformer_layer.kernelized_gumbel_softmax(query, key, value, kernel_transformation, projection_matrix=None, edge_index=None, K=10, tau=0.25, return_weight=True)[source]

Fast computation of all-pair attentive aggregation with linear complexity.

Input: query/key/value of shape [B, N, H, D].

Returns: updated node embeddings and attention weights (for computing the edge loss).

Notation: B = number of graphs (always 1 in node classification), N = number of nodes, H = number of heads, M = random feature dimension, D = hidden size, K = number of Gumbel samples.

opengt.layer.nodeformer_layer.kernelized_softmax(query, key, value, kernel_transformation, projection_matrix=None, edge_index=None, tau=0.25, return_weight=True)[source]

Fast computation of all-pair attentive aggregation with linear complexity.

Input: query/key/value of shape [B, N, H, D].

Returns: updated node embeddings and attention weights (for computing the edge loss).

Notation: B = number of graphs (always 1 in node classification), N = number of nodes, H = number of heads, M = random feature dimension, D = hidden size.
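
The linear complexity comes from reordering the attention product so that the keys and values are summarized once and then reused for every query. The sketch below shows that reordering for a single graph and a single head, with inputs of shape [N][D] as nested lists. The feature map `phi` is a simple stand-in chosen for illustration; the documented functions rely on kernel transformations with random features instead, and both function names here are hypothetical.

```python
import math

def phi(x):
    # Stand-in positive feature map (assumption for illustration only).
    return [math.exp(t) for t in x]

def linear_attention(query, key, value):
    """All-pair attention in O(N) by computing phi(K)^T V once (sketch)."""
    fq = [phi(q) for q in query]
    fk = [phi(k) for k in key]
    m, d, n = len(fk[0]), len(value[0]), len(key)
    # Summaries over all keys: kv[a][b] = sum_j fk[j][a] * value[j][b]
    kv = [[sum(fk[j][a] * value[j][b] for j in range(n)) for b in range(d)]
          for a in range(m)]
    k_sum = [sum(fk[j][a] for j in range(n)) for a in range(m)]
    out = []
    for f in fq:
        norm = sum(f[a] * k_sum[a] for a in range(m))
        out.append([sum(f[a] * kv[a][b] for a in range(m)) / norm
                    for b in range(d)])
    return out

def quadratic_attention(query, key, value):
    """Reference O(N^2) version with the same feature map, for comparison."""
    out = []
    for q in query:
        w = [sum(a * b for a, b in zip(phi(q), phi(k))) for k in key]
        total = sum(w)
        out.append([sum(wj * v[b] for wj, v in zip(w, value)) / total
                    for b in range(len(value[0]))])
    return out
```

Both functions return the same result up to floating-point error, but only the quadratic version forms an explicit weight for every query-key pair.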

class opengt.layer.other_attn_layer.MultiHeadAttentionLayerGraphormerSparse(in_dim, out_dim, num_heads, use_bias, clamp=None, dropout=0.0, act=None, edge_enhance=False, **kwargs)[source]

Bases: Module

Multi-Head Graph Attention Layer. Scaled Dot-product

class opengt.layer.other_attn_layer.MultiHeadAttentionLayerSANSparse(in_dim, out_dim, num_heads, use_bias, clamp=None, dropout=0.0, act=None, edge_enhance=False, **kwargs)[source]

Bases: Module

Multi-Head Graph Attention Layer. Scaled Dot-product

opengt.layer.other_attn_layer.pyg_softmax(src, index, num_nodes=None)[source]

Computes a sparsely evaluated softmax. Given a value tensor src, this function first groups the values along the first dimension based on the indices specified in index, and then proceeds to compute the softmax individually for each group.

Parameters:
  • src (Tensor) – The source tensor.

  • index (LongTensor) – The indices of elements for applying the softmax.

  • num_nodes (int, optional) – The number of nodes, i.e. max_val + 1 of index. (default: None)

Return type:

Tensor

class opengt.layer.patch_encoder.PatchEncoder(dim_in, dim_out)[source]

Bases: Module

Patch encoder for GraphMLPMixer. Adapted from https://github.com/XiaoxinHe/Graph-ViT-MLPMixer

Parameters:
  • dim_in (int) – Number of input features.

  • dim_out (int) – Number of output features.

Input:

  • batch.x (torch.Tensor) – Input node features.

  • batch.edge_index (torch.Tensor) – Edge indices of the graph.

  • batch.edge_attr (torch.Tensor) – Edge attributes of the graph.

  • batch.subgraphs_nodes_mapper (torch.Tensor) – Node mapping for subgraphs.

  • batch.subgraphs_edges_mapper (torch.Tensor) – Edge mapping for subgraphs.

  • batch.combined_subgraphs (torch.Tensor) – Combined subgraphs.

  • batch.subgraphs_batch (torch.Tensor) – Batch indices for subgraphs.

Output:

ret.x (torch.Tensor): Output node features after applying the patch encoder.

class opengt.layer.san_layer.MultiHeadAttentionLayer(gamma, in_dim, out_dim, num_heads, full_graph, fake_edge_emb, use_bias)[source]

Bases: Module

Multi-Head Graph Attention Layer.

Ported to PyG from original repo: https://github.com/DevinKreuzer/SAN/blob/main/layers/graph_transformer_layer.py

class opengt.layer.san_layer.SANLayer(gamma, in_dim, out_dim, num_heads, full_graph, fake_edge_emb, dropout=0.0, layer_norm=False, batch_norm=True, residual=True, use_bias=False)[source]

Bases: Module

GraphTransformerLayer from SAN.

Ported to PyG from original repo: https://github.com/DevinKreuzer/SAN/blob/main/layers/graph_transformer_layer.py

class opengt.layer.san2_layer.MultiHeadAttention2Layer(gamma, in_dim, out_dim, num_heads, full_graph, fake_edge_emb, use_bias)[source]

Bases: Module

Multi-Head Graph Attention Layer.

Ported to PyG and modified compared to the original repo: https://github.com/DevinKreuzer/SAN/blob/main/layers/graph_transformer_layer.py

class opengt.layer.san2_layer.SAN2Layer(gamma, in_dim, out_dim, num_heads, full_graph, fake_edge_emb, dropout=0.0, layer_norm=False, batch_norm=True, residual=True, use_bias=False)[source]

Bases: Module

Modified GraphTransformerLayer from SAN.

Ported to PyG from original repo: https://github.com/DevinKreuzer/SAN/blob/main/layers/graph_transformer_layer.py

opengt.layer.san2_layer.pyg_softmax(src, index, num_nodes=None)[source]

Computes a sparsely evaluated softmax. Given a value tensor src, this function first groups the values along the first dimension based on the indices specified in index, and then proceeds to compute the softmax individually for each group.

Parameters:
  • src (Tensor) – The source tensor.

  • index (LongTensor) – The indices of elements for applying the softmax.

  • num_nodes (int, optional) – The number of nodes, i.e. max_val + 1 of index. (default: None)

Return type:

Tensor

class opengt.layer.spec_layer.SpecLayer(dim_out, n_heads, dropout=0.0, norm='none')[source]

Bases: Module

SpecFormer Layer. Adapted from https://github.com/DSL-Lab/Specformer

Parameters:
  • dim_out (int) – Number of output features.

  • n_heads (int) – Number of attention heads.

  • dropout (float) – Dropout rate. Default: 0.0.

  • norm (str) – Normalization type. Options are ‘none’, ‘layer’, ‘batch’. Default: ‘none’.

Input:

  • batch.x (Tensor) – Input node features.

  • batch.EigVecs (Tensor) – Eigenvectors of the graph Laplacian.

  • batch.EigVals (Tensor) – Eigenvalues of the graph Laplacian.

Output:

ret.x (Tensor): Output node features after applying the SpecLayer.

class opengt.layer.trans_conv_layer.TransConvLayer(dim_in, dim_out, config)[source]

Bases: Module

Transformer with fast attention. Used in SGFormer. Adapted from https://github.com/qitianwu/SGFormer

Parameters:
  • dim_in (int) – Number of input features.

  • dim_out (int) – Number of output features.

  • config (object) – Configuration object containing hyperparameters:

      - n_heads (int): Number of attention heads.
      - use_weight (bool): Whether to use weight for value.
      - use_residual (bool): Whether to use residual connection.
      - use_act (bool): Whether to use activation function.
      - layer_norm (bool): Whether to use layer normalization.
      - batch_norm (bool): Whether to use batch normalization.
      - dropout (float): Dropout rate.

Input:

  • batch.x (torch.Tensor) – Input node features.

  • batch.edge_index (torch.Tensor) – Edge indices of the graph.

Output:

ret.x (torch.Tensor): Output node features after applying the TransConv layer.