Physics.Math.Code (@physics_lib)

📕 Обучение с подкреплением для реальных задач. Инженерный подход (Reinforcement Learning for Real-World Tasks: An Engineering Approach) [2023] Phil Winder
This book covers the industry-oriented application of reinforcement learning (RL). It explains how to train industrial and scientific systems to solve any sequential, step-by-step task by trial and error, without preparing narrowly specialized training datasets and without the risk of overfitting or overcomplicating the algorithm. Topics include Markov decision processes, deep Q-networks, policy gradients and how to compute them, entropy methods, and more. It is the first book in Russian to present the theoretical foundations and algorithms of RL from an applied, industry perspective. For data analysts and artificial intelligence specialists.

📘 Reinforcement Learning: Industrial Applications of Intelligent Agents [2021] Phil Winder, Ph.D.
Reinforcement learning (RL) is a machine learning (ML) paradigm that is capable of optimizing sequential decisions. RL is interesting because it mimics how we, as humans, learn. We are instinctively capable of learning strategies that help us master complex tasks like riding a bike or taking a mathematics exam. RL attempts to copy this process by interacting with the environment to learn strategies. Recently, businesses have been applying ML algorithms to make one-shot decisions. These algorithms are trained on data to make the best decision at the time. But often, the right decision at the time is not the best decision in the long term. Yes, that full tub of ice cream will make you happy in the short term, but you'll have to do more exercise next week. Similarly, click-bait recommendations might have the highest click-through rates, but over time these articles feel like a scam and hurt engagement and retention. RL is exciting because it can learn long-term strategies and apply them to complex industrial problems.
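
The short-term versus long-term trade-off in the click-bait example can be illustrated with a discounted return, the quantity RL methods typically try to maximize. The sketch below is only an illustration and is not taken from the book: the reward numbers, the option names, the discount factor `gamma=0.9`, and the helper functions are all assumptions made up for the example.

```python
# Illustrative sketch only: all numbers and names here are hypothetical.

def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where the reward at step t is weighted by gamma**t."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


# Made-up per-step engagement rewards for two recommendation strategies.
# Click-bait pays off immediately, then engagement collapses.
# Quality content pays less at first but keeps paying.
OPTIONS = {
    "clickbait": [1.0, 0.2, 0.0, 0.0, 0.0],
    "quality": [0.6, 0.6, 0.6, 0.6, 0.6],
}


def best_one_shot(options):
    """Pick the option with the highest immediate (first-step) reward."""
    return max(options, key=lambda name: options[name][0])


def best_long_term(options, gamma=0.9):
    """Pick the option with the highest discounted return over all steps."""
    return max(options, key=lambda name: discounted_return(options[name], gamma))


if __name__ == "__main__":
    print("one-shot choice:", best_one_shot(OPTIONS))
    print("long-term choice:", best_long_term(OPTIONS))
```

With these assumed numbers, the one-shot rule prefers the click-bait option while the discounted-return rule prefers the quality option. A real RL agent would have to learn such returns from interaction rather than read them from a fixed table.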