Baohe Zhang 张宝赫
Publications
Fitting Reinforcement Learning Model to Behavioral Data under Bandits
Investigates fitting RL models to behavioral data under bandit feedback, proposing a new method and demonstrating its effectiveness.
PDF
Constrained Reinforcement Learning with Smoothed Log Barrier Function
Proposes a Smoothed Log Barrier Function (CSAC-LB) for constrained RL, proving convergence to the optimal policy and demonstrating effectiveness on safety-critical tasks.
PDF
Code
Goal Achievement Guided Exploration: Mitigating Premature Convergence in Learning Robot Control
Proposes Goal Achievement Guided Exploration (GAGE) to mitigate premature convergence and improve sample efficiency in learning robot control with sparse rewards.
PDF
Revisiting Safe Exploration in Safe Reinforcement Learning
Introduces a new safety metric, expected maximum consecutive cost steps (EMCC), to address limitations in standard SafeRL cost metrics and improve safe exploration.
PDF
Constrained Reinforcement Learning for Safe Heat Pump Control
Applies Constrained Reinforcement Learning to heat pump control, minimizing energy consumption while ensuring comfort and safety.
PDF
Learning Continuous Control with Geometric Regularity from Robot Intrinsic Symmetry
Introduces novel network structures for single-agent robot control that explicitly capture inherent reflectional and rotational symmetries to improve learning.
PDF
Automated Reinforcement Learning (AutoRL): A Survey and Open Problems
A comprehensive survey of Automated Reinforcement Learning (AutoRL), unifying the field and discussing its challenges and potential.
PDF
On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning
Investigates the impact of hyperparameter optimization on the performance of model-based RL.
Cite
Code
Video