| Parameter name | Description | Range | Final value (rat) | Final value (human) |
|---|---|---|---|---|
| Number of layers | Number of recurrent layers stacked on top of each other. | [1; 5] | 2 | 1 |
| Hidden size | Size of the hidden state vector. | [10; 500] | 290 | 88 |
| Loss function | Because the Pearson correlation coefficient (CC) was the final evaluation metric of the networks' performance, it could be used as the cost function instead of the mean squared error (MSE) loss. | [MSE; CC; MSE and CC] | CC | CC |
| Learning rate | The rate at which network weights were updated during training. | [10⁻⁵; 1] | 0.001 | 0.00121 |
| L2 regularization | Strength of the L2 weight regularization. | [0; 10] | 0.0003 | 0.0221 |
| Gradient clipping | Gradient clipping (Pascanu et al. 2013) limits the magnitude of the gradient to a specified value. | [yes; no] | no | no |
| Dropout | When a multi-layer RNN was used, dropout (Srivastava et al. 2014) could be applied between layers. | [0; 0.2] | 0.128 | — |
| Residual connection | Whether a residual connection was employed, i.e., the input was fed directly to the linear readout alongside the RNN's hidden state. | [yes; no] | yes | no |
| Batch size | The number of single-vessel time courses processed by the network before each weight update during training. | [3; 32] | 22 | 10 |
| Number of epochs | How many times the network processed the whole training dataset during training. | [1; 100] | 87 | 69 |
| Washout time | The number of input-signal time points used to drive the network into a state specific to a given input. These time points were not used for readout training or prediction. | Fixed | 250 | 250 |
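To make the table concrete, the sketch below shows one way the listed hyperparameters might be wired together in PyTorch, using the human-column values. This is a minimal illustration, not the authors' implementation: the recurrent-unit type (GRU), the single-channel input and output, and the use of Adam with `weight_decay` as the L2 term are assumptions not stated in the table. The negative-CC loss, the optional residual readout, and the washout handling follow the corresponding rows above.

```python
import torch
import torch.nn as nn

def pearson_cc_loss(pred, target):
    """Negative Pearson correlation coefficient, so that minimizing the loss
    maximizes the CC evaluation metric (see the 'Loss function' row)."""
    pred = pred - pred.mean(dim=-1, keepdim=True)
    target = target - target.mean(dim=-1, keepdim=True)
    cc = (pred * target).sum(dim=-1) / (
        pred.norm(dim=-1) * target.norm(dim=-1) + 1e-8
    )
    return -cc.mean()

class VesselRNN(nn.Module):
    """RNN with a linear readout. An optional residual connection feeds the
    input to the readout alongside the hidden state ('Residual connection' row).
    The GRU unit type is an assumption; the table only says 'recurrent units'."""
    def __init__(self, hidden_size=88, num_layers=1, dropout=0.0, residual=False):
        super().__init__()
        self.rnn = nn.GRU(1, hidden_size, num_layers=num_layers,
                          dropout=dropout if num_layers > 1 else 0.0,
                          batch_first=True)
        self.residual = residual
        self.readout = nn.Linear(hidden_size + (1 if residual else 0), 1)

    def forward(self, x, washout=250):
        h, _ = self.rnn(x)                 # x: (batch, time, 1)
        if self.residual:
            h = torch.cat([h, x], dim=-1)  # input bypasses the RNN
        y = self.readout(h).squeeze(-1)
        return y[:, washout:]              # washout points are not predicted

# Human-column settings: 1 layer, hidden size 88, no residual connection;
# learning rate and L2 strength from the table (Adam is an assumption).
model = VesselRNN(hidden_size=88, num_layers=1, residual=False)
optimizer = torch.optim.Adam(model.parameters(), lr=0.00121, weight_decay=0.0221)
```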
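A corresponding training loop, again under the human-column settings, might look as follows. The data shapes are hypothetical placeholders for single-vessel time courses; only the batch size (10), epoch count (69), washout (250), and the absence of gradient clipping are taken from the table.

```python
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical data: 100 single-vessel time courses of 1000 time points each.
inputs = torch.randn(100, 1000, 1)
targets = torch.randn(100, 1000)
loader = DataLoader(TensorDataset(inputs, targets), batch_size=10, shuffle=True)

for epoch in range(69):                        # 'Number of epochs' (human)
    for x, y in loader:                        # batches of 10 time courses
        optimizer.zero_grad()
        pred = model(x, washout=250)           # predictions start after washout
        loss = pearson_cc_loss(pred, y[:, 250:])
        loss.backward()                        # no gradient clipping (per table)
        optimizer.step()
```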