Improved Regret Bound for Safe Reinforcement Learning via Tighter Cost Pessimism and Reward Optimism
Oct 4, 2024 · Kihyun Yu, Duksang Lee, William Overman, Dabeen Lee
Type: Manuscript
Last updated on Oct 4, 2024