Data Charging Station

Insights

2017-06-29

LSTM

Original article link

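Imagine that whenever a person thinks or reads a passage, they do not start from scratch; they keep their memory of what came before. RNNs are designed to address exactly this.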

LSTM1

At each time step, the network keeps information from the past and passes it along to the next step. An LSTM is a special kind of RNN.
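To make the "pass the past along" idea concrete, here is a minimal sketch of a scalar vanilla RNN. The weights `w_x`, `w_h`, `b` and the helper names `rnn_step` and `run_rnn` are made up for illustration; they are not from the original article.

```python
import math

def rnn_step(x_t, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """One step of a scalar vanilla RNN: h_t = tanh(w_x*x_t + w_h*h_prev + b)."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(inputs, h0=0.0):
    """Feed a sequence through the cell, carrying the hidden state forward."""
    h = h0
    states = []
    for x_t in inputs:
        h = rnn_step(x_t, h)  # the new state depends on the previous one
        states.append(h)
    return states

# Only the first input is non-zero, yet later states stay non-zero:
# the hidden state still remembers it.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

The same hidden state `h` is reused at every step, which is what lets information from earlier inputs influence later outputs.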

The Problem of Long-Term Dependencies

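In many cases more context is needed, and the relevant information may sit very far back in the sequence. Propagating gradients back across that many steps leads to vanishing gradients or exploding gradients.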
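A rough way to see why distance matters: under a simplifying assumption of a linear scalar recurrence h_t = w_h * h_{t-1}, the gradient of the last state with respect to the first is just w_h multiplied by itself once per step. The function name `gradient_through_time` and the chosen weights are illustrative only.

```python
def gradient_through_time(w_h, steps):
    """d h_T / d h_0 for the linear recurrence h_t = w_h * h_{t-1}."""
    grad = 1.0
    for _ in range(steps):
        grad *= w_h  # chain rule: one factor of w_h per time step
    return grad

vanishing = gradient_through_time(0.5, 50)  # |w_h| < 1: shrinks toward zero
exploding = gradient_through_time(1.5, 50)  # |w_h| > 1: grows without bound
print(vanishing, exploding)
```

A real RNN has matrices and nonlinearities instead of a single scalar, but the same repeated multiplication is behind both failure modes.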

LSTM2

LSTM

LSTM, short for Long Short Term Memory networks, is a special kind of RNN.

LSTM4

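Unlike a plain RNN, whose cell contains only a single tanh layer, an LSTM adds an input gate, an output gate, and a forget gate, all of which control how the data is handled. Each gate uses a sigmoid, whose output can be read as how much information is remembered or read: 0 means nothing gets through, 1 means everything gets through.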
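The gating described above can be sketched as a single scalar LSTM step, using the standard LSTM equations. This is only an illustration: the parameter layout `p` (gate name mapped to `(w_x, w_h, b)`), the helper names, and the extreme bias values are assumptions chosen to push the gates close to 0 or 1.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps gate name -> (w_x, w_h, b)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x_t + w_h * h_prev + b

    f = sigmoid(pre("forget"))  # how much of the old cell state to keep
    i = sigmoid(pre("input"))   # how much of the candidate to write
    o = sigmoid(pre("output"))  # how much of the cell state to expose
    g = math.tanh(pre("cand"))  # candidate values to write
    c_t = f * c_prev + i * g
    h_t = o * math.tanh(c_t)
    return h_t, c_t

# Forget gate ~1 and input gate ~0: the old cell state passes through unchanged.
keep = {"forget": (0.0, 0.0, 20.0), "input": (0.0, 0.0, -20.0),
        "output": (0.0, 0.0, 20.0), "cand": (1.0, 1.0, 0.0)}
h_keep, c_keep = lstm_step(0.3, 0.0, 1.0, keep)

# Forget gate ~0 and input gate ~0: the old cell state is erased.
erase = dict(keep, forget=(0.0, 0.0, -20.0))
h_erase, c_erase = lstm_step(0.3, 0.0, 1.0, erase)
print(c_keep, c_erase)
```

In a trained network the gate values are learned and usually fall somewhere between 0 and 1, so the cell keeps part of its memory and writes part of the new input.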

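For the detailed derivation, see the original article, which also introduces the GRU, a simpler and more efficient variant of the LSTM.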

It is worth noting that nearly all of the good results we get from RNNs today are really achieved with LSTMs.

In Tensorflow, LSTM is available out of the box; you just call it.

The figure below shows the result of the red dashed line learning the black line (x*sin(x)).

figure_2
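The code behind the figure is not included in this post, so what follows is only a sketch of how such an experiment could be set up, not the author's actual script. It assumes the `tf.keras` API; the sampling grid, window size, layer width, and the names `make_windows` and `train_lstm` are all illustrative choices.

```python
import math

def make_windows(series, window):
    """Split a series into (input window, next value) training pairs."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets

# Sample the target curve x*sin(x) on an evenly spaced grid (assumed grid).
xs = [0.1 * k for k in range(200)]
series = [x * math.sin(x) for x in xs]
inputs, targets = make_windows(series, window=10)

def train_lstm(inputs, targets, window=10, epochs=20):
    """Fit a small Keras LSTM to predict the next value of the series.

    Imports are deferred so the data preparation above runs
    even without NumPy or TensorFlow installed.
    """
    import numpy as np
    import tensorflow as tf

    # Keras LSTM layers expect input shaped (samples, time steps, features).
    x = np.array(inputs, dtype="float32").reshape(-1, window, 1)
    y = np.array(targets, dtype="float32")
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(window, 1)),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(x, y, epochs=epochs, verbose=0)
    return model
```

Calling `train_lstm(inputs, targets)` and plotting `model.predict(...)` against `series` would give a comparison in the spirit of the figure; how close the fit is depends on the hyperparameters.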

Tensorflow

DL, Python, Tensorflow

Posted by:

kbwen


