张浩彬
(2024-04-29 20:35):
#paper doi:
https://doi.org/10.48550/arXiv.2211.14730
A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
An ICLR 2023 paper that proposes PatchTST. Inspired by the Vision Transformer, it brings the patching technique to time series problems. It also responds to an earlier paper arguing that Transformers are actually no better than traditional linear models for time series ("Are Transformers Effective for Time Series Forecasting?", 2022) and reclaims SOTA. By late 2023, however, newer methods had appeared arguing that the key ingredient is not the Transformer itself but the patching technique.
arXiv, 2022.
Abstract:
We propose an efficient design of Transformer-based models for multivariate time series forecasting and self-supervised representation learning. It is based on two key components: (i) segmentation of time series into subseries-level patches which are served as input tokens to Transformer; (ii) channel-independence where each channel contains a single univariate time series that shares the same embedding and Transformer weights across all the series. Patching design naturally has three-fold benefit: local semantic information is retained in the embedding; computation and memory usage of the attention maps are quadratically reduced given the same look-back window; and the model can attend longer history. Our channel-independent patch time series Transformer (PatchTST) can improve the long-term forecasting accuracy significantly when compared with that of SOTA Transformer-based models. We also apply our model to self-supervised pre-training tasks and attain excellent fine-tuning performance, which outperforms supervised training on large datasets. Transferring of masked pre-trained representation on one dataset to others also produces SOTA forecasting accuracy. Code is available at: https://github.com/yuqinie98/PatchTST.
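A minimal PyTorch sketch of the two ideas the abstract names, not the authors' implementation: (i) each univariate channel's look-back window is unfolded into subseries-level patches that become Transformer tokens, and (ii) channel-independence, i.e. all channels share one embedding and one Transformer encoder. The class name and hyperparameters (patch_len, stride, d_model) are illustrative assumptions; the official code at the repo above differs (positional encoding, forecasting head, padding to reach 64 patches, etc.).

```python
# Sketch of patching + channel-independence, assuming PyTorch.
import torch
import torch.nn as nn


class PatchTSTSketch(nn.Module):
    def __init__(self, patch_len=16, stride=8, d_model=128, n_heads=8, n_layers=3):
        super().__init__()
        self.patch_len, self.stride = patch_len, stride
        # One shared linear embedding for every channel (channel-independence).
        self.embed = nn.Linear(patch_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, x):
        # x: (batch, n_channels, look_back_len)
        B, C, L = x.shape
        # (i) Patching: unfold the look-back window into subseries-level patches.
        # n_patches = (L - patch_len) // stride + 1, e.g. L=512, P=16, S=8 -> 63
        # (the paper pads the end of the window to reach 64, hence "64 words").
        patches = x.unfold(dimension=-1, size=self.patch_len, step=self.stride)
        # patches: (B, C, n_patches, patch_len)
        # (ii) Channel-independence: fold channels into the batch dimension so each
        # univariate series is encoded separately but with the same weights.
        tokens = self.embed(patches.reshape(B * C, -1, self.patch_len))
        z = self.encoder(tokens)                  # (B*C, n_patches, d_model)
        return z.reshape(B, C, -1, z.shape[-1])   # per-channel patch representations


# Usage: a 7-channel multivariate series with a 512-step look-back window.
model = PatchTSTSketch()
out = model(torch.randn(32, 7, 512))
print(out.shape)  # torch.Size([32, 7, 63, 128])
```

Because attention is applied over patches rather than individual time steps, the token count drops from L to roughly L/S, which is where the claimed quadratic reduction in attention cost for the same look-back window comes from.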