
Academic Events

Seminar

ICIM Research Exchange Seminar (Fri, Feb 3)

Registered: 2023-01-31

https://icim.nims.re.kr/post/event/959

  • Speaker: Prof. Yeoneung Kim (Gachon University)
  • Date/Time: 2023-02-03, 15:00-17:00
  • Venue: Innovation Center for Industrial Mathematics, National Institute for Mathematical Sciences (Gwanggyo)

1. Date/Time: Friday, February 3, 2023, 15:00-17:00

2. Venue: Seminar room, Innovation Center for Industrial Mathematics

  • National Institute for Mathematical Sciences, Room 231, Enterprise Support Hub, 815 Daewangpangyo-ro, Sujeong-gu, Seongnam-si, Gyeonggi-do
  • Free parking for up to two hours (registration required)

3. Speaker: Prof. Yeoneung Kim (Gachon University)

4. Topic: Discrete optimal control - learning LQR

The Linear Quadratic Regulator (LQR) is one of the most popular structures in optimal control problems. This talk introduces some basic properties of LQR problems and recent progress in learning LQR. There are two approaches to measuring the performance of an algorithm: Bayesian and frequentist regret. The advantages and disadvantages of each will be discussed, focusing in particular on Thompson sampling. Thompson sampling (TS) is known to effectively address the exploration-exploitation trade-off in online learning problems, including reinforcement learning for linear-quadratic regulators (LQR). However, in TS for learning LQR, theoretical analysis is often limited to the case of Gaussian noise. Sampling can be performed directly when we further assume that the unknown system parameters lie in a prespecified compact set, which is seemingly restrictive. We propose a new TS algorithm for LQR that exploits Langevin dynamics to handle a larger class of problems, including those with non-Gaussian noise. The notion of a preconditioner is introduced to generate samples from non-conjugate posterior distributions.

The seminar will be held in person only.
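As background for the abstract above, the following is a minimal sketch of the classical (known-parameters) discrete-time LQR problem: minimize the cumulative cost x'Qx + u'Ru subject to x_{t+1} = A x_t + B u_t. The gain is computed by fixed-point iteration on the discrete Riccati equation. This is illustrative background only, not material from the talk, and the toy system matrices below are hypothetical placeholders.

```python
import numpy as np

def lqr_gain(A, B, Q, R, iters=1000, tol=1e-10):
    """Iterate the discrete-time Riccati equation to a fixed point P,
    then return the optimal state-feedback gain K, i.e. u_t = -K x_t."""
    P = Q.copy()
    for _ in range(iters):
        # K = (R + B'PB)^{-1} B'PA
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati update: P <- Q + A'P(A - BK)
        P_next = Q + A.T @ P @ (A - B @ K)
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Toy double-integrator system (illustrative placeholder values)
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)          # state cost
R = np.array([[1.0]])  # control cost

K = lqr_gain(A, B, Q, R)
# The closed-loop matrix A - BK should be stable (spectral radius < 1)
rho = max(abs(np.linalg.eigvals(A - B @ K)))
print(rho < 1)
```

In the learning setting discussed in the talk, A and B are unknown and must be estimated online, which is where Thompson sampling and the regret analysis enter.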

