Trying to understand reinforcement learning
Reinforcement learning is evaluative learning: the model is judged by a score rather than corrected against a known answer. How does that differ from deep learning?
My personal understanding is that deep learning (in the supervised sense used in this post) needs a label for every group of features. The model is then trained repeatedly until its output for a given set of features is as close as possible to the label.
Reinforcement learning starts from the same kind of features, but at the beginning the model does not know what the label is, so it simply outputs some value y. We then implement a reward function that scores this output. The higher the score, the more that output can be treated as a temporary label. In effect, the labels are set dynamically during training.
In other words, the core of reinforcement learning is that it needs a scoring system instead of labels set in advance.
At first, a deep learning model can be regarded as generating a random value, which is then compared with the label. The smaller the difference, the better the model.
At first, a reinforcement learning model can also be regarded as generating a random value, which is then scored. The higher the score, the better the model.
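The contrast above can be sketched with a toy one-parameter model. This is only an illustration under my own assumptions: `supervised_fit`, `reward_fit`, and `hidden_reward` are hypothetical names, and the reward-driven learner uses simple random hill climbing rather than any particular reinforcement learning algorithm.

```python
import random


def supervised_fit(x, label, steps=200, lr=0.1):
    """Fit w so that w * x approaches a known label (squared-error loss)."""
    w = 0.0
    for _ in range(steps):
        error = w * x - label        # difference between output and label
        w -= lr * 2 * error * x      # gradient step that shrinks the difference
    return w


def reward_fit(x, reward_fn, steps=2000, noise=0.1, seed=0):
    """Improve w using only a score for each output; no label is ever shown."""
    rng = random.Random(seed)
    w = 0.0
    best = reward_fn(w * x)
    for _ in range(steps):
        candidate = w + rng.gauss(0.0, noise)   # try a nearby parameter value
        score = reward_fn(candidate * x)
        if score > best:                        # keep it only if it scores higher
            w, best = candidate, score
    return w


def hidden_reward(y):
    # Hypothetical scoring function: it peaks at y = 3,
    # but the learner only ever sees the score, never the number 3.
    return -(y - 3.0) ** 2


if __name__ == "__main__":
    print(supervised_fit(1.0, 3.0))           # learns from the label directly
    print(reward_fit(1.0, hidden_reward))     # learns from scores alone
```

Both learners end up near the same output, but only the first one was ever told what the correct answer was.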
Application of deep learning in buying funds:
For example, use the daily gains of the previous 30 days as the features and today's gain as the label. After training, the model predicts each day's gain.
In practice, prediction accuracy depends heavily on the features. It is hard to get good predictions from the gain alone, because too many other factors affect how a fund moves.
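As a rough sketch of how the features and labels described above could be built, the function below slides a 30-day window over a list of daily gains. The name `make_windows` and the plain-list representation are my own assumptions; no model is trained here.

```python
def make_windows(daily_gains, window=30):
    """Turn a series of daily gains into (features, label) training pairs.

    Each sample uses the previous `window` days as features and the
    gain of the following day as the label.
    """
    features, labels = [], []
    for t in range(window, len(daily_gains)):
        features.append(daily_gains[t - window:t])   # the 30 days before day t
        labels.append(daily_gains[t])                # the gain on day t itself
    return features, labels


if __name__ == "__main__":
    # Placeholder data only: 2500 days of zero gain, to show the sample count.
    X, y = make_windows([0.0] * 2500)
    print(len(X), len(y))
```

A series of N daily gains yields N - 30 training samples, so the first 30 days never appear as labels.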
Application of reinforcement learning in buying funds:
First, we have to design a fund trading environment. The environment outputs the daily gains of the most recent 30 days, and its input is an action: buy, sell, or wait and see. Assuming a principal of 10,000, the scoring system is the rate of return.
Then the gains of the previous 30 days serve as the features, and the output value y lies in the range [-1, 1].
We can design the meaning of the output value y by hand:
y > 0 means buying; y = 0.2 means buying 2,000 (20% of the principal).
y = 0 means wait and see, neither buying nor selling.
y < 0 means selling; y = -0.5 means selling half of the shares held.
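The environment and the meaning of y described above could look roughly like the sketch below. Everything beyond the text is an assumption on my part: the class name `FundEnv`, the `reset`/`step` interface, capping a buy at the available cash, executing the trade before the day's gain is applied, and using the cumulative rate of return as the score after every step.

```python
class FundEnv:
    """Toy fund-trading environment (a sketch, not a market simulator)."""

    def __init__(self, daily_gains, window=30, principal=10_000.0):
        if len(daily_gains) <= window:
            raise ValueError("need more days of data than the window size")
        self.daily_gains = list(daily_gains)
        self.window = window
        self.principal = principal
        self.reset()

    def reset(self):
        self.t = self.window        # first day that has a full 30-day history
        self.cash = self.principal
        self.holding = 0.0          # market value of the fund shares held
        return self._observation()

    def _observation(self):
        # Output of the environment: gains of the most recent `window` days.
        return self.daily_gains[self.t - self.window:self.t]

    def step(self, y):
        y = max(-1.0, min(1.0, y))  # keep y inside [-1, 1]
        if y > 0:
            # Buy: y is a fraction of the principal, capped by available cash.
            amount = min(y * self.principal, self.cash)
            self.cash -= amount
            self.holding += amount
        elif y < 0:
            # Sell: -y is the fraction of the current holding to sell.
            amount = -y * self.holding
            self.holding -= amount
            self.cash += amount
        # y == 0 is wait and see: no trade.

        # Apply today's gain to whatever is held after the trade.
        self.holding *= 1.0 + self.daily_gains[self.t]
        self.t += 1
        done = self.t >= len(self.daily_gains)

        # Score: cumulative rate of return relative to the principal.
        reward = (self.cash + self.holding) / self.principal - 1.0
        observation = None if done else self._observation()
        return observation, reward, done
```

Because the holding is tracked by market value instead of share count, selling half of the value is the same as selling half of the shares held.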
When it comes to buying funds, reinforcement learning and deep learning are alike: neither is accurate. Their advantage is that they are more rational. A further disadvantage is that the training set is too small, because a fund has only about 2,500 data points in 10 years.
A very simple example: the outbreak of an epidemic benefits medical-related funds, but artificial intelligence cannot predict the outbreak of an epidemic.
But this does not mean reinforcement learning cannot be applied to buying funds, because it learns a strategy: when to take profits, when to buy, and when to add to a position. Such a strategy is more than simple regular fixed-amount investing.
Application of reinforcement learning in games:
Reinforcement learning is very well suited to games, because the game itself is the environment and the game screen is the output. Almost every game has a score or a win condition, which serves as the scoring system.
For example, take the game Eliminate the Stars (消灭星星).
The game itself is an environment. The input to this environment is the click position, and the output is the game screen. The score earned for eliminating stars is the scoring system.
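To make the input, output, and score concrete, here is a minimal sketch of such a game as an environment. The class name `StarGame`, the rule that a group needs at least two connected stars of the same color, and the score of 5 times the square of the group size are my assumptions. For brevity the sketch also skips shifting empty columns sideways.

```python
class StarGame:
    """Tiny star-elimination board: click a cell, get a new board and a score."""

    def __init__(self, board):
        # board[r][c] is a color label, or None for an empty cell; row 0 is the top.
        self.board = [row[:] for row in board]
        self.score = 0

    def _group(self, row, col):
        """Return all cells connected to (row, col) that share its color."""
        color = self.board[row][col]
        if color is None:
            return set()
        seen, stack = {(row, col)}, [(row, col)]
        while stack:
            r, c = stack.pop()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                inside = 0 <= nr < len(self.board) and 0 <= nc < len(self.board[0])
                if inside and (nr, nc) not in seen and self.board[nr][nc] == color:
                    seen.add((nr, nc))
                    stack.append((nr, nc))
        return seen

    def _apply_gravity(self):
        """Let the remaining stars in each column fall to the bottom."""
        rows = len(self.board)
        for c in range(len(self.board[0])):
            stars = [self.board[r][c] for r in range(rows)
                     if self.board[r][c] is not None]
            column = [None] * (rows - len(stars)) + stars
            for r in range(rows):
                self.board[r][c] = column[r]

    def step(self, row, col):
        """Input: click position. Output: the new board and the reward."""
        group = self._group(row, col)
        if len(group) < 2:
            return self.board, 0             # a lone star cannot be eliminated
        for r, c in group:
            self.board[r][c] = None
        self._apply_gravity()
        reward = 5 * len(group) ** 2         # assumed scoring rule
        self.score += reward
        return self.board, reward
```

An agent would call `step` with a click position, observe the returned board, and try to maximize the accumulated `score`.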
gym (the OpenAI Gym library) contains many games based on physics engines, which are very suitable for practice and learning.
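As a starting point for practice, the loop below plays one episode with random actions. It is a sketch that assumes the five-value `step` return and the `(observation, info)` return from `reset` used by recent gym releases and by the maintained fork gymnasium; older gym versions return fewer values. The helper name `run_random_episode` is hypothetical.

```python
def run_random_episode(env, max_steps=500):
    """Play one episode with random actions and return the total reward.

    `env` is assumed to follow the Gymnasium-style interface:
    reset() -> (observation, info)
    step(action) -> (observation, reward, terminated, truncated, info)
    """
    observation, info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = env.action_space.sample()   # random input to the environment
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward               # the environment's scoring system
        if terminated or truncated:
            break
    return total_reward


# Usage, assuming the gymnasium package is installed:
#   import gymnasium as gym
#   env = gym.make("CartPole-v1")
#   print(run_random_episode(env))
#   env.close()
```

A random agent gives a baseline score; a learning agent would replace `env.action_space.sample()` with a policy that picks actions from the observation.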