>>> Complexity Seminar: Special Session <<<

Study Session Announcement

Carlos Dominogo (EC Post Graduate Fellow, Visiting Researcher at Tokyo
Institute of Technology) recently attended lectures on Markov decision
processes (reinforcement learning) at the Barbados workshop. We have
therefore planned a study session on that material, as outlined below.
If you are interested, please join us.

Osamu Watanabe

------

Title:   Study Session on Reinforcement Learning
Speaker: Carlos Dominogo (EC Post Graduate Fellow; Visiting Researcher
         at Tokyo Institute of Technology from April)
Date:    Tuesday, March 30, 1999, 3:00 to around 6:00
Place:   Tokyo Institute of Technology, West Building 8
         (11th-floor seminar room (tentative))
         * Please come to Watanabe's office on the 10th floor first.
Abstract: (Basic knowledge of PAC learning and related topics will be
         assumed.)

In recent years, core areas of classical artificial intelligence such as
knowledge representation, inference, learning, and planning have been
reformulated in a statistical or probabilistic framework, as in the
so-called computational learning theory for supervised learning. In this
talk, I will survey recent advances in this direction in an influential
topic in AI: reinforcement learning, also known as Markov decision
processes.

In the lecture, I will explain the basic Markov decision process (MDP)
model and the two central problems it poses: planning and learning.
Concerning planning in MDPs, I will describe the three central
approaches that have been proposed, namely policy iteration, value
iteration, and linear programming, and show what is formally known
about them. For learning in MDPs, I will describe Q-learning and
model-based methods, as well as the more recent $E^3$ algorithm.
Finally, I will move to a more realistic learning framework for MDPs
and review algorithms for learning in MDPs with large state spaces.

The lecture will be self-contained; no previous knowledge of
reinforcement learning will be assumed.