We present a comparative study of multi-agent reinforcement learning (MARL) algorithms for cooperative warehouse robotics. We evaluate QMIX and IPPO on the Robotic Warehouse (RWARE) environment and a custom Unity 3D simulation. Our experiments show that QMIX's value decomposition significantly outperforms independent learning (mean return 3.25 vs. 0.38 for advanced IPPO), but requires extensive hyperparameter tuning, particularly an extended epsilon-annealing schedule (5M+ steps) to discover sparse rewards. We demonstrate successful deployment in Unity ML-Agents, achieving consistent package delivery after 1M training steps. While MARL shows promise for small-scale deployments (2-4 robots), significant scaling challenges remain. Code and analyses: https://pallman14.github.io/MARL-QMIX-Warehouse-Robots/
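For reference, a minimal sketch of the kind of extended linear epsilon-annealing schedule the abstract refers to. The 5M-step horizon comes from the text; the start/end values (1.0 to 0.05) and the linear shape are assumptions for illustration, not the paper's exact configuration.

```python
def epsilon(step: int,
            anneal_steps: int = 5_000_000,  # 5M-step horizon, per the abstract
            eps_start: float = 1.0,         # assumed initial exploration rate
            eps_end: float = 0.05) -> float: # assumed final exploration floor
    """Linearly anneal epsilon from eps_start down to eps_end over anneal_steps,
    then hold it at eps_end for the rest of training."""
    frac = min(step / anneal_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)
```

Stretching the anneal this far keeps agents exploring long enough to stumble onto RWARE's sparse delivery rewards before the policy becomes greedy.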