FluxArchitectures
FluxArchitectures.HiddenRecur
FluxArchitectures.Reg_LayerNorm
FluxArchitectures.Seq
FluxArchitectures.SeqSkip
FluxArchitectures.DARNN
FluxArchitectures.DSANet
FluxArchitectures.Global_SelfAttn
FluxArchitectures.LSTnet
FluxArchitectures.Local_SelfAttn
FluxArchitectures.ReluGRU
FluxArchitectures.Scaled_Dot_Product_Attention
FluxArchitectures.SelfAttn_Encoder
FluxArchitectures.SkipGRU
FluxArchitectures.StackedLSTM
FluxArchitectures.TPALSTM
FluxArchitectures.get_data
FluxArchitectures.load_data
FluxArchitectures.prepare_data
FluxArchitectures.HiddenRecur — Type

    HiddenRecur(cell)

HiddenRecur takes a recurrent cell and makes it stateful, managing the hidden state in the background. As opposed to Recur, it returns both the hidden state and the cell state.

See also: Flux.Recur
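
A minimal usage sketch (assuming an older Flux API in which LSTMCell(in, out) constructs a cell; newer Flux versions use LSTMCell(in => out)). The name is qualified with the module in case it is not exported:

    using Flux, FluxArchitectures

    cell  = Flux.LSTMCell(10, 5)                # 10 input features, 5 hidden units
    layer = FluxArchitectures.HiddenRecur(cell)

    x = rand(Float32, 10)                       # a single input vector
    state = layer(x)                            # contains both the hidden state and the cell state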
FluxArchitectures.Reg_LayerNorm — Type

    Reg_LayerNorm(h::Integer)

A normalisation layer designed to be used with recurrent hidden states of size h. Normalises the mean and standard deviation of each input before applying a per-neuron gain/bias. To avoid numeric overflow, the division by the standard deviation has been regularised by adding ϵ = 1E-5.
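
A minimal sketch of how such a layer would typically be applied (values hypothetical; the name is qualified with the module in case it is not exported):

    using FluxArchitectures

    ln = FluxArchitectures.Reg_LayerNorm(5)   # for hidden states of size 5
    h  = rand(Float32, 5)
    y  = ln(h)                                # normalised state with per-neuron gain/bias applied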
FluxArchitectures.Seq — Type

    Seq(RNN)

Seq takes a recurrent neural network and "sequentializes" it, i.e. when Seq(RNN) is called with a matrix of input features over a certain time interval, the recurrent neural net is fed with a sequence of inputs, and the results are transformed back to matrix form.
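
A minimal sketch, assuming an older Flux API where RNN(in, out) builds a stateful recurrent layer, and an input matrix laid out as features × timesteps:

    using Flux, FluxArchitectures

    rnn = FluxArchitectures.Seq(Flux.RNN(10, 5))   # wrap a recurrent layer
    x   = rand(Float32, 10, 20)                    # 10 features over 20 timesteps
    y   = rnn(x)                                   # output in matrix form, one column per timestep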
FluxArchitectures.SeqSkip — Type

    SeqSkip(RNNCell, skiplength::Integer)

SeqSkip takes a recurrent neural network cell and "sequentializes" it, i.e. when it is called with a matrix of input features over a certain time interval, the recurrent neural net is fed with a sequence of inputs, and the results are transformed back to matrix form. In addition, the hidden state from skiplength timesteps ago is used instead of the current hidden state. This structure combines the functionality of Recur, in that it makes a recurrent neural network cell stateful, with that of Seq, in that it feeds matrices of input features as elements of a time series.

See also: Seq, Flux.Recur
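
A minimal sketch (assuming an older Flux API in which RNNCell(in, out) constructs a cell):

    using Flux, FluxArchitectures

    layer = FluxArchitectures.SeqSkip(Flux.RNNCell(10, 5), 3)   # hidden state recalled from 3 steps back
    x     = rand(Float32, 10, 20)                               # 10 features over 20 timesteps
    y     = layer(x)                                            # output in matrix form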
FluxArchitectures.DARNN — Method

    DARNN(inp, encodersize, decodersize, poollength, orig_idx)

Create a DA-RNN layer based on the architecture described in Qin et al., as implemented for PyTorch here. inp specifies the number of input features. encodersize defines the number of LSTM encoder layers, and decodersize defines the number of LSTM decoder layers. poollength gives the length of the window for the pooled input data, and orig_idx defines the array index where the original time series is stored in the input data.

Data is expected as an array with dimensions features x poollength x 1 x data, i.e. for 1000 data points containing 31 features that have been windowed over 6 timesteps, DARNN expects an input size of (31, 6, 1, 1000).

Takes the keyword arguments init and bias for the initialization of the weight vector and bias of the linear layers.
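
A minimal construction sketch matching the input layout described above (the hyperparameter values are placeholders, not recommendations):

    using FluxArchitectures

    inp, poollength, datalength = 31, 6, 1000
    model = DARNN(inp, 10, 10, poollength, 1)    # encoder/decoder size 10, original series stored at index 1
    x = rand(Float32, inp, poollength, 1, datalength)
    ŷ = model(x)                                 # forecast for each of the 1000 data points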
FluxArchitectures.DSANet — Function

    DSANet(inp, window, local_length, n_kernels, d_model, d_hid, n_layers, n_head, out=1, drop_prob=0.1f0, σ=Flux.relu)

Create a DSANet network based on the architecture described in Siteng Huang et al. The code follows the PyTorch implementation. inp specifies the number of input features. window gives the length of the window for the pooled input data. local_length defines the length of the convolution window for the local self attention mechanism. n_kernels defines the number of convolution kernels for both the local and global self attention mechanism. d_hid defines the number of "hidden" convolution kernels in the self attention encoder structure. n_layers gives the number of self attention encoders used in the network, and n_head defines the number of attention heads. out gives the number of output time series, drop_prob is the dropout probability for the Dropout layers, and σ defines the network's activation function.

Data is expected as an array with dimensions features x window x 1 x data, i.e. for 1000 data points containing 31 features that have been windowed over 6 timesteps, DSANet expects an input size of (31, 6, 1, 1000).
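
A minimal construction sketch. The hyperparameter values are placeholders chosen only to illustrate the call; they are not tuned or checked against the implementation's internal constraints:

    using FluxArchitectures

    inp, window, datalength = 31, 6, 1000
    # local_length=3, n_kernels=32, d_model=64, d_hid=128, n_layers=1, n_head=4
    model = DSANet(inp, window, 3, 32, 64, 128, 1, 4)
    x = rand(Float32, inp, window, 1, datalength)
    ŷ = model(x)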
FluxArchitectures.Global_SelfAttn — Function

    Global_SelfAttn(inp, window, n_kernels, w_kernel, d_model, d_hid, n_layers, n_head)

Global self attention module for DSANet. For parameters see DSANet.
FluxArchitectures.LSTnet — Function

    LSTnet(in, convlayersize, recurlayersize, poolsize, skiplength)
    LSTnet(in, convlayersize, recurlayersize, poolsize, skiplength, Flux.relu)

Create an LSTnet layer based on the architecture described in Lai et al. in specifies the number of input features. convlayersize defines the number of convolutional layers, and recurlayersize defines the number of recurrent layers. poolsize gives the length of the window for the pooled input data, and skiplength defines the number of steps the hidden state of the recurrent layer is taken back in time.

Data is expected as an array with dimensions features x poolsize x 1 x data, i.e. for 1000 data points containing 31 features that have been windowed over 6 timesteps, LSTnet expects an input size of (31, 6, 1, 1000).

Takes the keyword arguments init for the initialization of the recurrent layers, and initW and bias for the initialization of the dense layer.
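
A minimal construction sketch (placeholder hyperparameters chosen only to illustrate the call):

    using FluxArchitectures

    inp, poolsize, datalength = 31, 6, 1000
    model = LSTnet(inp, 2, 3, poolsize, 3)     # 2 convolutional layers, 3 recurrent layers, skip length 3
    x = rand(Float32, inp, poolsize, 1, datalength)
    ŷ = model(x)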
FluxArchitectures.Local_SelfAttn — Function

    Local_SelfAttn(inp, window, local_length, n_kernels, w_kernel, d_model, d_hid, n_layers, n_head)

Local self attention module for DSANet. For parameters see DSANet.
FluxArchitectures.ReluGRU — Method

    ReluGRU(in::Integer, out::Integer)

Gated Recurrent Unit layer with relu as activation function.
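
A minimal sketch, assuming the layer behaves like Flux's stateful GRU layer when called on a single input vector:

    using FluxArchitectures

    rgru = FluxArchitectures.ReluGRU(10, 5)   # 10 input features, 5 hidden units
    h    = rgru(rand(Float32, 10))            # updated hidden state of length 5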
FluxArchitectures.Scaled_Dot_Product_Attention — Function

    Scaled_Dot_Product_Attention(q, k, v, temperature)

Scaled dot product attention function with query q, keys k and values v. Normalisation is given by temperature. Outputs $\mathrm{softmax}\left( \frac{q \cdot k^T}{\mathrm{temperature}} \right)\cdot v$.
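
A generic illustration of this formula, written directly in terms of Flux's softmax rather than the package's function, with hypothetical shapes where queries, keys and values are stored as rows (the package's internal layout may differ):

    using Flux

    q = rand(Float32, 4, 8)                    # 4 queries of dimension 8
    k = rand(Float32, 6, 8)                    # 6 keys of dimension 8
    v = rand(Float32, 6, 8)                    # 6 values of dimension 8
    temperature = sqrt(8f0)

    scores = q * k' ./ temperature             # (4, 6) scaled dot products
    attn   = Flux.softmax(scores; dims=2) * v  # softmax over the keys, then weighted sum of values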
FluxArchitectures.SelfAttn_Encoder — Function

    SelfAttn_Encoder(inp, d_model, n_head, d_hid, drop_prob = 0.1f0, σ = Flux.relu)

Encoder part for the self attention networks that comprise the DSANet. For parameters see DSANet.
FluxArchitectures.SkipGRU — Method

    SkipGRU(in::Integer, out::Integer, p::Integer)

Skip Gated Recurrent Unit layer with skip length p. The hidden state is recalled from p steps prior to the current calculation.
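
A minimal sketch, analogous to the ReluGRU example above (placeholder sizes):

    using FluxArchitectures

    sgru = FluxArchitectures.SkipGRU(10, 5, 3)   # hidden state recalled from 3 steps back
    h    = sgru(rand(Float32, 10))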
FluxArchitectures.StackedLSTM — Method

    StackedLSTM(in, out, hiddensize, layers)

Stacked LSTM network. Feeds the data through a chain of LSTM layers, where the hidden state of the previous layer gets fed to the next one. The first layer corresponds to LSTM(in, hiddensize), the hidden layers to LSTM(hiddensize, hiddensize), and the final layer to LSTM(hiddensize, out). Takes the keyword argument init for the initialization of the layers.
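
A minimal sketch, assuming the network can be called like a stateful Flux layer on a single feature vector:

    using FluxArchitectures

    net = FluxArchitectures.StackedLSTM(10, 1, 20, 3)   # 10 inputs, 1 output, hidden size 20, 3 layers
    y   = net(rand(Float32, 10))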
FluxArchitectures.TPALSTM — Function

    TPALSTM(in, hiddensize, poollength)
    TPALSTM(in, hiddensize, poollength, layers, filternum, filtersize)

Create a TPA-LSTM layer based on the architecture described in Shih et al., as implemented for PyTorch by Jing Wang. in specifies the number of input features. hiddensize defines the input and output size of the LSTM layer, and layers the number of LSTM layers (with default value 1). filternum and filtersize define the number and size of the filters in the attention layer; their default values are 32 and 1. poollength gives the length of the window for the pooled input data.

Data is expected as an array with dimensions features x poollength x 1 x data, i.e. for 1000 data points containing 31 features that have been windowed over 6 timesteps, TPALSTM expects an input size of (31, 6, 1, 1000).

Takes the keyword arguments initW and bias for the initialization of the Dense layers, and init for the initialization of the StackedLSTM network.
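
A minimal construction sketch (placeholder hyperparameters):

    using FluxArchitectures

    inp, poollength, datalength = 31, 6, 1000
    model = TPALSTM(inp, 10, poollength)     # hidden size 10; layers, filternum and filtersize use their defaults
    x = rand(Float32, inp, poollength, 1, datalength)
    ŷ = model(x)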
FluxArchitectures.get_data — Method

    get_data(dataset, poollength, datalength, horizon)

Return features and labels from one of the sample datasets in the repository. dataset can be one of :solar, :traffic, :exchange_rate or :electricity. poollength gives the number of timesteps to pool for the model, datalength determines the number of time steps included in the output, and horizon determines the number of time steps that should be forecasted by the model.

See also: prepare_data, load_data
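
A minimal sketch of loading one of the sample datasets. The commented size is indicative only: :exchange_rate contains eight series, and the output follows the features x poollength x 1 x datalength layout used by the models:

    using FluxArchitectures

    poollength, datalength, horizon = 20, 500, 6
    input, target = get_data(:exchange_rate, poollength, datalength, horizon)
    size(input)    # approximately (8, 20, 1, 500)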
FluxArchitectures.load_data — Method

    load_data(dataset)

Load the raw data from one of the available datasets. The following example data from https://github.com/laiguokun/multivariate-time-series-data is included:

:solar: The raw data comes from http://www.nrel.gov/grid/solar-power-data.html. It contains the solar power production records for the year 2006, sampled every 10 minutes from 137 PV plants in the state of Alabama.
:traffic: The raw data comes from http://pems.dot.ca.gov. The data in this repository is a collection of 48 months (2015-2016) of hourly data from the California Department of Transportation. It describes the road occupancy rates (between 0 and 1) measured by different sensors on San Francisco Bay Area freeways.
:electricity: The raw dataset is from https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014. It records the electricity consumption in kWh of 321 clients every 15 minutes from 2011 to 2014. The data has been cleaned and converted to hourly consumption.
:exchange_rate: A collection of daily exchange rates of eight foreign countries, including Australia, Great Britain, Canada, Switzerland, China, Japan, New Zealand and Singapore, ranging from 1990 to 2016.
FluxArchitectures.prepare_data — Method

    prepare_data(data, poollength, datalength, horizon)
    prepare_data(data, poollength, datalength, horizon; normalise=true)

Cast 2D time series data into the format used by FluxArchitectures. data is a matrix or a Tables.jl-compatible data source containing data in the form timesteps x features (i.e. each column contains the time series for one feature). poollength defines the number of timesteps to pool when preparing a single frame of data to be fed to the model. datalength determines the number of time steps included in the output, and horizon determines the number of time steps that should be forecasted by the model. The label data is assumed to be contained in the first column. Outputs features and labels.

Note that when horizon is smaller than or equal to poollength, the model has direct access to the value it is supposed to predict.
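
A minimal sketch with hypothetical raw data (the commented output size is indicative of the layout, not an exact guarantee):

    using FluxArchitectures

    raw = rand(Float32, 1000, 31)     # 1000 timesteps of 31 features, target series in column 1
    poollength, datalength, horizon = 6, 500, 6
    input, target = prepare_data(raw, poollength, datalength, horizon)
    size(input)                       # approximately (31, 6, 1, 500)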