Reinforcement Learning and DQN
Achievements of Reinforcement Learning
Produced the world's best Backgammon player (Tesauro 1995)
Learned acrobatic helicopter autopilots (Ng, Abbeel, Coates et al. 2006+)
Widely used in the placement and selection of advertisements on the web (e.g. A/B tests)
Used to make strategic decisions in Jeopardy! (IBM's Watson 2011)
Achieved human-level performance on Atari games from pixel-level visual input, in conjunction with deep learning (Google DeepMind 2015)

In all these cases, performance was better than could be obtained by any other method, and was obtained without human instruction.
Reposted from: https://www.cnblogs.com/koocn/p/7757710.html