[1] BENGIO Y. Learning Deep Architectures for AI[J]. Foundations and Trends in Machine Learning, 2009,2(1):1-127. DOI: 10.1561/2200000006
[2] LECUN Y, BENGIO Y, HINTON G. Deep Learning[J]. Nature, 2015,521(7553):436-444. DOI: 10.1038/nature14539
[3] SCHMIDHUBER J. Deep Learning in Neural Networks: An Overview[J]. Neural Networks, 2015,61:85-117. DOI: 10.1016/j.neunet.2014.09.003
[4] HOFFMAN J, TZENG E, DONAHUE J, et al. One-Shot Adaptation of Supervised Deep Convolutional Models [EB/OL]. (2013-12-21)[2018-04-15]. http://arxiv.org/abs/1312.6204
[5] KINGMA D P, WELLING M. Auto-Encoding Variational Bayes [EB/OL]. (2013-12-20)[2018-04-15].
[6] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet Classification with Deep Convolutional Neural Networks[J]. Communications of the ACM, 2017,60(6):84-90. DOI: 10.1145/3065386
[7] MIKOLOV T, SUTSKEVER I, DEORAS A, et al. Subword Language Modelling with Neural Networks [EB/OL]. (2012)[2018-04-15]. http://www.fit.vutbr.cz/~imikolov/rnnlm/char.pdf
[8] GRAVES A. Generating Sequences With Recurrent Neural Networks [EB/OL]. (2013-08-04)[2018-04-15]. https://arxiv.org/abs/1308.0850
[9] CHO K, VAN MERRIENBOER B, GULCEHRE C, et al. Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation [C]//Conference on Empirical Methods in Natural Language Processing. Doha, Qatar, 2014. DOI: 10.3115/v1/D14-1179
[10] SUTSKEVER I, VINYALS O, LE Q V. Sequence to Sequence Learning with Neural Networks [M]//Advances in Neural Information Processing Systems. Cambridge, USA: The MIT Press, 2014: 3104-3112
[11] VINYALS O, TOSHEV A, BENGIO S, et al. Show and Tell: A Neural Image Caption Generator [C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Boston, USA, 2015: 3156-3164. DOI: 10.1109/CVPR.2015.7298935
[12] MOZER M C. Induction of Multiscale Temporal Structure [C]//Proc. 4th International Conference on Neural Information Processing Systems. San Francisco, USA: Morgan Kaufmann Publishers Inc., 1991: 275-282
[13] HIHI S E, BENGIO Y. Hierarchical Recurrent Neural Networks for Long-Term Dependencies [C]//Proc. 8th International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 1995: 493-499
[14] LIN T, HORNE B G, TINO P, et al. Learning Long-Term Dependencies in NARX Recurrent Neural Networks[J]. IEEE Transactions on Neural Networks, 1996,7(6):1329-1338. DOI: 10.1109/72.548162
[15] KOUTNÍK J, GREFF K, GOMEZ F, et al. A Clockwork RNN [C]//31st International Conference on Machine Learning. Beijing, China, 2014: 1863-1871
[16] AHMED N K, ATIYA A F, GAYAR N E, et al. An Empirical Comparison of Machine Learning Models for Time Series Forecasting[J]. Econometric Reviews, 2010,29(5/6):594-621. DOI: 10.1080/07474938.2010.481556
[17] BONTEMPI G, BEN TAIEB S, LE BORGNE Y A. Machine Learning Strategies for Time Series Forecasting[M]//BONTEMPI G, BEN TAIEB S, LE BORGNE Y A. eds. Business Intelligence. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013: 62-77. DOI: 10.1007/978-3-642-36318-4_3
[18] ROBINSON A J, FALLSIDE F. The Utility Driven Dynamic Error Propagation Network: CUED/F-INFENG/TR.1 [R]. Cambridge, UK: Cambridge University, Engineering Department, 1987
[19] WERBOS P J. Generalization of Backpropagation with Application to a Recurrent Gas Market Model[J]. Neural Networks, 1988,1(4):339-356. DOI: 10.1016/0893-6080(88)90007-x
[20] WILLIAMS R J. Complexity of Exact Gradient Computation Algorithms for Recurrent Neural Networks: NUCCS-89-27 [R]. Boston, USA: Northeastern University, College of Computer Science, 1989
[21] HOCHREITER S, BENGIO Y, FRASCONI P, et al. Gradient Flow in Recurrent Nets: The Difficulty of Learning Long-Term Dependencies [M]//KREMER S C, KOLEN J F, eds. A Field Guide to Dynamical Recurrent Networks. Hoboken, USA: IEEE Press, 2001
[22] HOCHREITER S, SCHMIDHUBER J. Long Short-Term Memory[J]. Neural Computation, 1997,9(8):1735-1780. DOI: 10.1162/neco.1997.9.8.1735
[23] MARTENS J, SUTSKEVER I. Learning Recurrent Neural Networks with Hessian-Free Optimization [C]//28th International Conference on Machine Learning. Bellevue, USA, 2011: 1033-1040
[24] SUTSKEVER I, MARTENS J, DAHL G E, et al. On the Importance of Initialization and Momentum in Deep Learning [C]//30th International Conference on Machine Learning. Atlanta, USA, 2013: 1139-1147
[25] GERS F A, SCHRAUDOLPH N N, SCHMIDHUBER J. Learning Precise Timing with LSTM Recurrent Networks[J]. Journal of Machine Learning Research, 2002,3:115-143
[26] KINGMA D, BA J. Adam: A Method for Stochastic Optimization [EB/OL]. (2014-12-22)[2018-04-15]