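Optimize PPO charging and obstacle-avoidance policy
Expand the observation features to 157 dimensions, adding charging-station, NPC, battery safety-margin, map-statistics, and per-step cleaning information. Add low-battery return-to-charge action filtering and NPC danger-zone filtering, and adjust the rewards and end-of-episode logs to highlight charging, obstacle avoidance, and the actual cleaning score.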
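The low-battery action filter mentioned in the message is not shown in this hunk. As a minimal sketch of what such a filter could look like, the function below masks out actions that do not bring the agent closer to the charger once the battery drops below a safety margin. The function name, its parameters, and the default `safety_margin` are all assumptions for illustration, not the code in this commit.

```python
def filter_low_battery_actions(battery, steps_to_charger,
                               next_steps_to_charger, safety_margin=5):
    """Return a 0/1 mask over actions (hypothetical helper).

    battery: remaining battery, measured in steps.
    steps_to_charger: current distance to the nearest charger, in steps.
    next_steps_to_charger: distance to the charger after taking each action.
    safety_margin: extra steps kept in reserve (assumed value).
    """
    n = len(next_steps_to_charger)

    # Enough battery left: every action stays legal.
    if battery > steps_to_charger + safety_margin:
        return [1] * n

    # Low battery: keep only actions that reduce the distance to the charger.
    mask = [1 if d < steps_to_charger else 0 for d in next_steps_to_charger]

    # If nothing gets closer (e.g. blocked by walls), fall back to the
    # least-bad actions so the mask is never all zeros.
    if not any(mask):
        best = min(next_steps_to_charger)
        mask = [1 if d == best else 0 for d in next_steps_to_charger]
    return mask
```

A mask like this would typically be applied to the policy logits before sampling, so that filtered actions get zero probability. The same pattern could be reused for the NPC danger-zone filter by swapping the distance test for a proximity test.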
@@ -13,12 +13,12 @@ Configuration for Robot Vacuum PPO agent.
 
 class Config:
 
-    # Feature dimensions (69D)
-    # 特征维度(69D)
+    # Feature dimensions (157D)
+    # 特征维度(157D)
     FEATURES = [
-        7 * 7,
-        12,
-        8,
+        11 * 11, # wider local map view / 更大的局部地图视野
+        28, # global, charger, NPC, and map-stat features / 全局、充电桩、NPC、地图统计特征
+        8, # last action one-hot / 上一步动作 one-hot
     ]
     FEATURE_SPLIT_SHAPE = FEATURES
     FEATURE_LEN = sum(FEATURES)
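As a rough illustration of how the new layout might be consumed, the sketch below splits a flat 157-dimensional observation into the three groups listed in `FEATURES` (121 + 28 + 8 = 157, up from 49 + 12 + 8 = 69). The helper `split_features` and the dummy observation are assumptions for illustration; the repository may split features differently.

```python
from itertools import accumulate

# Mirrors the new Config.FEATURES values from the diff.
FEATURES = [11 * 11, 28, 8]
FEATURE_LEN = sum(FEATURES)  # 121 + 28 + 8 = 157


def split_features(obs, sizes=FEATURES):
    """Split a flat observation into consecutive chunks of the given sizes.

    Hypothetical helper, not part of this commit.
    """
    if len(obs) != sum(sizes):
        raise ValueError(f"expected {sum(sizes)} values, got {len(obs)}")
    ends = list(accumulate(sizes))   # cumulative end offsets: 121, 149, 157
    starts = [0] + ends[:-1]         # matching start offsets: 0, 121, 149
    return [obs[s:e] for s, e in zip(starts, ends)]


obs = [0.0] * FEATURE_LEN  # dummy observation
local_map, global_stats, last_action = split_features(obs)
```

Checking the length up front means an observation built against the old 69D layout fails loudly instead of being silently mis-sliced.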