尹志
(2025-05-31 21:23):
#paper https://doi.org/10.48550/arXiv.2012.07436 Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. This is a classic AAAI 2021 work on long-sequence time-series modeling. The paper improves on the vanilla Transformer and proposes a new model, Informer: through a redesigned self-attention mechanism, self-attention distilling, and a generative-style decoder, it addresses the time-complexity and memory problems of the vanilla Transformer. The work achieves strong performance on multiple datasets, and these ideas have been reused frequently in later time-series modeling work, making it highly instructive.
arXiv, 2020-12-14T11:43:09Z.
DOI: 10.48550/arXiv.2012.07436
Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting
Abstract:
Many real-world applications require the prediction of long sequence time-series, such as electricity consumption planning. Long sequence time-series forecasting (LSTF) demands a high prediction capacity of the model, which is the ability to capture precise long-range dependency coupling between output and input efficiently. Recent studies have shown the potential of Transformer to increase the prediction capacity. However, there are several severe issues with Transformer that prevent it from being directly applicable to LSTF, including quadratic time complexity, high memory usage, and inherent limitation of the encoder-decoder architecture. To address these issues, we design an efficient transformer-based model for LSTF, named Informer, with three distinctive characteristics: (i) a $ProbSparse$ self-attention mechanism, which achieves $O(L \log L)$ in time complexity and memory usage, and has comparable performance on sequences' dependency alignment. (ii) the self-attention distilling highlights dominating attention by halving cascading layer input, and efficiently handles extreme long input sequences. (iii) the generative style decoder, while conceptually simple, predicts the long time-series sequences at one forward operation rather than a step-by-step way, which drastically improves the inference speed of long-sequence predictions. Extensive experiments on four large-scale datasets demonstrate that Informer significantly outperforms existing methods and provides a new solution to the LSTF problem.
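To make the ProbSparse idea concrete, here is a minimal illustrative sketch (not the authors' implementation) of the core trick: estimate each query's "sparsity" with a max-minus-mean score over a sampled subset of keys, give full attention only to the top-u active queries, and let the remaining lazy queries fall back to the mean of V. The function name and the sample_factor parameter are made up for illustration; the paper's official code handles batching, heads, and masking differently.

```python
import numpy as np

def probsparse_attention(Q, K, V, sample_factor=5):
    """Illustrative ProbSparse-style attention sketch (self-attention case).

    Q: (L_Q, d), K: (L_K, d), V: (L_K, d_v).
    Only the top-u queries (by a sampled sparsity score) receive full
    attention; the rest are filled with the mean of V, giving roughly
    O(L log L) score computations instead of O(L^2).
    """
    L_Q, d = Q.shape
    L_K = K.shape[0]

    # 1. Sample ~c*ln(L_K) keys to cheaply estimate each query's sparsity.
    n_keys = min(L_K, int(sample_factor * np.ceil(np.log(L_K))))
    idx = np.random.choice(L_K, n_keys, replace=False)
    sampled_scores = Q @ K[idx].T / np.sqrt(d)          # (L_Q, n_keys)

    # 2. Sparsity measure: max score minus mean score per query.
    M = sampled_scores.max(axis=1) - sampled_scores.mean(axis=1)

    # 3. Keep only the top-u queries, u ~ O(log L_Q).
    u = min(L_Q, int(sample_factor * np.ceil(np.log(L_Q))))
    top = np.argsort(-M)[:u]

    # 4. Lazy queries get the mean of V; active queries get full attention.
    out = np.repeat(V.mean(axis=0, keepdims=True), L_Q, axis=0)
    s = Q[top] @ K.T / np.sqrt(d)                       # (u, L_K)
    w = np.exp(s - s.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    out[top] = w @ V
    return out

# Example usage on a toy 96-step sequence with feature dimension 16.
rng = np.random.default_rng(0)
x = rng.standard_normal((96, 16))
y = probsparse_attention(x, x, x)
print(y.shape)  # (96, 16)
```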