FluxArchitectures: TPA-LSTM

The next model in the FluxArchitectures repository is the Temporal Pattern Attention LSTM network, based on the paper “Temporal Pattern Attention for Multivariate Time Series Forecasting” by Shih et al. It claims better performance than the previously implemented LSTNet, with the additional advantage that an attention mechanism automatically tries to determine important parts of the time series, instead of introducing parameters that need to be optimized by the user.
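As a rough illustration of that idea (not the actual TPA-LSTM implementation from the repository), attention amounts to scoring each hidden state of a recurrent network and forming a weighted summary, so that important time steps receive large weights automatically. The sketch below uses made-up dimensions and dummy data, and assumes Flux's stateful recurrent interface as it was at the time of writing:

```julia
using Flux

# Toy attention over LSTM hidden states (illustration only, not the TPA-LSTM code).
lstm = LSTM(8, 32)                       # 8 input features, 32 hidden units
x = [rand(Float32, 8) for _ in 1:24]     # 24 time steps of dummy data

hs = [lstm(xt) for xt in x]              # hidden state after every step
scores = [h' * hs[end] for h in hs]      # similarity of each step to the last one
α = softmax(scores)                      # attention weights over the time steps
context = sum(α .* hs)                   # weighted summary of the history
```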

Read More

FluxArchitectures: DA-RNN

The next model in the FluxArchitectures repository is the “Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction”, based on the paper by Qin et al. (2017). It claims better performance than the previously implemented LSTNet, with the additional advantage that an attention mechanism automatically tries to determine important parts of the time series, instead of introducing parameters that need to be optimized by the user.

Read More

Where to go from here? Announcing `FluxArchitectures`

It’s been a while since anything happened on this blog. So what happened in the meantime? Well, first of all, I abandoned TensorFlow.jl in favour of Flux.jl. It seems to be the package most people are using these days. It offers a clean way of setting up models and integrates well with other parts of the Julia ecosystem (for example, combining it with differential equations for scientific machine learning).
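To give a flavour of what that clean way of setting up models looks like, here is a minimal sketch with dummy data and arbitrary layer sizes, assuming the Flux API at the time of writing: layers are composed into a `Chain`, and training only needs a loss function and an optimiser.

```julia
using Flux

model = Chain(Dense(10, 32, relu), Dense(32, 1))   # two dense layers, composed
loss(x, y) = sum((model(x) .- y) .^ 2)             # simple squared-error loss

x = rand(Float32, 10, 100)                         # 100 dummy samples
y = rand(Float32, 1, 100)
Flux.train!(loss, Flux.params(model), [(x, y)], Descent(0.01))
```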

Read More

Moving from Julia 0.6 to 1.1

Finally, all files in the GitHub repository have been updated to run on Julia 1.1. Some notable changes are listed below. In order to run the files (at the time of writing), the development versions of the TensorFlow.jl and PyCall.jl packages need to be installed.
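One way to get the development versions with Julia's package manager might look as follows (the branch names are an assumption; check the respective repositories for the current state):

```julia
using Pkg

# Track the master branches instead of the registered releases (assumed branch names).
Pkg.add(PackageSpec(name="TensorFlow", rev="master"))
Pkg.add(PackageSpec(name="PyCall", rev="master"))
```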

Read More

Conversion of Movie-review data to one-hot encoding

In the last post, we obtained the files test_data.h5 and train_data.h5, containing text data from movie reviews (from the ACL 2011 IMDB dataset). In the next exercise, we need a one-hot encoded version of this data, based on a large vocabulary. The following code converts the data and stores it on disk for later use. It takes about two hours to run on my laptop and uses 13 GB of storage for the converted file.
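The core of the conversion is the encoding step itself. A stripped-down sketch is shown below, with a placeholder vocabulary and review instead of the real dataset, and a hypothetical output filename; the full conversion code is in the post.

```julia
using HDF5

# Map each word to its position in the vocabulary (placeholder vocabulary).
vocabulary = ["the", "movie", "was", "great", "terrible"]
word_index = Dict(w => i for (i, w) in enumerate(vocabulary))

# One column per word, with a single 1 at the row of that word; unknown words stay all-zero.
function onehot_encode(words, index)
    enc = zeros(Float32, length(index), length(words))
    for (j, w) in enumerate(words)
        haskey(index, w) && (enc[index[w], j] = 1f0)
    end
    return enc
end

encoded = onehot_encode(["the", "movie", "was", "great"], word_index)
h5write("train_data_onehot.h5", "reviews", encoded)   # hypothetical output file
```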

Read More

Feature Crosses

The next part of the Machine Learning Crash Course deals with constructing bucketized features and feature crosses. The Jupyter notebook can be downloaded here.
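In plain Julia terms (a language-agnostic sketch with made-up values, rather than TensorFlow's feature-column API), bucketizing maps a continuous value to the index of the interval it falls into, and a feature cross combines two bucket indices into one joint categorical feature:

```julia
# Bucket index for a value x, given sorted bucket boundaries (made-up boundaries below).
bucketize(x, edges) = searchsortedlast(edges, x) + 1

latitude  = [32.8, 34.1, 37.5, 38.9]
longitude = [-117.1, -118.3, -122.4, -121.0]
lat_edges = [33.0, 35.0, 37.0]
lon_edges = [-122.0, -120.0, -118.0]

lat_bucket = bucketize.(latitude, Ref(lat_edges))
lon_bucket = bucketize.(longitude, Ref(lon_edges))

# Feature cross: one categorical value per (latitude bucket, longitude bucket) pair.
n_lon = length(lon_edges) + 1
crossed = (lat_bucket .- 1) .* n_lon .+ lon_bucket
```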

Read More

Feature Sets

The fourth part of the Machine Learning Crash Course deals with finding a minimal set of features that still gives a reasonable model.
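One simple heuristic for such a selection (a sketch with synthetic data and hypothetical feature names, not the procedure from the notebook) is to rank the candidate features by their correlation with the target and keep only the strongest ones:

```julia
using Statistics

n = 100
features = Dict(
    "median_income" => rand(n),
    "total_rooms"   => rand(n),
    "population"    => rand(n),
)
target = 3 .* features["median_income"] .+ 0.1 .* randn(n)

# Rank features by absolute correlation with the target and keep the strong ones.
ranking = sort([(abs(cor(v, target)), name) for (name, v) in features], rev=true)
selected = [name for (score, name) in ranking if score > 0.3]
```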

Read More