Technique helps robots find the front door
Navigation method may speed up autonomous last-mile delivery.
Jennifer Chu | MIT News Office
November 4, 2019
In the not too distant future, robots may be dispatched as last-mile delivery vehicles to drop your takeout order, package, or meal-kit subscription at your doorstep — if they can find the door.
Standard approaches for robotic navigation involve mapping an area ahead of time, then using algorithms to guide a robot toward a specific goal or GPS coordinate on the map. While this approach might make sense for exploring specific environments, such as the layout of a particular building or planned obstacle course, it can become unwieldy in the context of last-mile delivery.
Imagine, for instance, having to map in advance every single neighborhood within a robot’s delivery zone, including the configuration of each house within that neighborhood along with the specific coordinates of each house’s front door. Such a task can be difficult to scale to an entire city, particularly as the exteriors of houses often change with the seasons. Mapping every single house could also run into issues of security and privacy.
Now MIT engineers have developed a navigation method that doesn’t require mapping an area in advance. Instead, their approach enables a robot to use clues in its environment to plan out a route to its destination, which can be described in general semantic terms, such as “front door” or “garage,” rather than as coordinates on a map. For example, if a robot is instructed to deliver a package to someone's front door, it might start on the road and see a driveway, which it has been trained to recognize as likely to lead toward a sidewalk, which in turn is likely to lead to the front door.
The new technique can greatly reduce the time a robot spends exploring a property before identifying its target, and it doesn’t rely on maps of specific residences.
“We wouldn’t want to have to make a map of every building that we’d need to visit,” says Michael Everett, a graduate student in MIT’s Department of Mechanical Engineering. “With this technique, we hope to drop a robot at the end of any driveway and have it find a door.”
Everett will present the group’s results this week at the International Conference on Intelligent Robots and Systems. The paper, which is co-authored by Jonathan How, professor of aeronautics and astronautics at MIT, and Justin Miller of the Ford Motor Company, is a finalist for “Best Paper for Cognitive Robots.”
“A sense of what things are”
In recent years, researchers have worked on introducing natural, semantic language to robotic systems, training robots to recognize objects by their semantic labels, so they can visually process a door as a door, for example, and not simply as a solid, rectangular obstacle.
“Now we have an ability to give robots a sense of what things are, in real-time,” Everett says.
Everett, How, and Miller are using similar semantic techniques as a springboard for their new navigation approach, which leverages pre-existing algorithms that extract features from visual data to generate a new map of the same scene, represented as semantic clues, or context.
In their case, the researchers used an algorithm to build up a map of the environment as the robot moved around, using the semantic labels of each object and a depth image. This algorithm is called semantic SLAM (Simultaneous Localization and Mapping).
While other semantic algorithms have enabled robots to recognize and map objects in their environment for what they are, they haven’t allowed a robot to decide, in the moment and while navigating a new environment, on the most efficient path to a semantic destination such as a “front door.”
“Before, exploring was just, plop a robot down and say ‘go,’ and it will move around and eventually get there, but it will be slow,” How says.
The cost to go
The researchers looked to speed up a robot’s path-planning through a semantic, context-colored world. They developed a new “cost-to-go estimator,” an algorithm that converts a semantic map created by preexisting SLAM algorithms into a second map, representing the likelihood of any given location being close to the goal.
“This was inspired by image-to-image translation, where you take a picture of a cat and make it look like a dog,” Everett says. “The same type of idea happens here where you take one image that looks like a map of the world, and turn it into this other image that looks like the map of the world but now is colored based on how close different points of the map are to the end goal.”
This cost-to-go map is rendered in grayscale, with darker regions representing locations far from the goal and lighter regions representing areas close to it. For instance, the sidewalk, coded in yellow in a semantic map, might be translated by the cost-to-go algorithm into a darker region in the new map, while a driveway grows progressively lighter as it approaches the front door, the lightest region in the new map.
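To make the grayscale encoding concrete, here is a minimal Python sketch that computes a cost-to-go map by hand from a fully known semantic grid. It is not the researchers' learned estimator, which predicts such a map from partial views. The grid layout, the label letters, and the function names `cost_to_go` and `to_grayscale` are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical semantic labels for a tiny front yard (not the authors' label set):
# R = road, S = sidewalk, D = driveway, G = grass or hedge (treated as blocked), F = front door
YARD = [
    "RRRRR",
    "SSSSS",
    "GGDGG",
    "GGDGG",
    "GGFGG",
]
BLOCKED = {"G"}


def cost_to_go(grid, goal_label="F", blocked=BLOCKED):
    """Breadth-first search outward from the goal.

    Returns the number of steps to the goal for each cell, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    cost = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == goal_label:
                cost[r][c] = 0
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and cost[nr][nc] is None and grid[nr][nc] not in blocked:
                cost[nr][nc] = cost[r][c] + 1
                queue.append((nr, nc))
    return cost


def to_grayscale(cost):
    """Map cost to brightness 0..255: the goal is lightest, the farthest cell darkest.

    Unreachable cells are also rendered as 0 (black).
    """
    reachable = [v for row in cost for v in row if v is not None]
    worst = max(reachable) or 1  # avoid dividing by zero on a one-cell map
    return [
        [0 if v is None else round(255 * (1 - v / worst)) for v in row]
        for row in cost
    ]


cost = cost_to_go(YARD)
gray = to_grayscale(cost)
```

In this toy layout the far end of the sidewalk comes out darker than the driveway cell next to the door, matching the pattern described above.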
The researchers trained this new algorithm on satellite images from Bing Maps containing 77 houses from one urban and three suburban neighborhoods. The system converted a semantic map into a cost-to-go map, and mapped out the most efficient path, following lighter regions in the map, to the end goal. For each satellite image, Everett assigned semantic labels and colors to context features in a typical front yard, such as grey for a front door, blue for a driveway, and green for a hedge.
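The idea of following lighter regions toward the goal can be sketched as a greedy walk over a brightness grid. This is only an illustration under stated assumptions: the brightness values, the 4-connected moves, and the function name `follow_lighter` are hypothetical, and the article does not say which planner the researchers actually used.

```python
# Hypothetical brightness values (0 = far from the goal, 255 = the goal itself).
BRIGHTNESS = [
    [0, 42, 85, 42, 0],
    [42, 85, 128, 85, 42],
    [0, 0, 170, 0, 0],
    [0, 0, 212, 0, 0],
    [0, 0, 255, 0, 0],
]


def follow_lighter(brightness, start, max_steps=100):
    """Greedy ascent: step to the brightest 4-connected neighbor until none is brighter."""
    rows, cols = len(brightness), len(brightness[0])
    r, c = start
    path = [start]
    for _ in range(max_steps):
        neighbors = [
            (r + dr, c + dc)
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= r + dr < rows and 0 <= c + dc < cols
        ]
        best = max(neighbors, key=lambda p: brightness[p[0]][p[1]])
        if brightness[best[0]][best[1]] <= brightness[r][c]:
            break  # local maximum; in a well-formed cost-to-go map this is the goal
        r, c = best
        path.append(best)
    return path


path = follow_lighter(BRIGHTNESS, start=(0, 0))
```

A greedy walk like this can stall at a local maximum if the predicted map is imperfect, which is one reason a real system would likely pair the map with a proper planner.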
During this training process, the team also applied masks to each image to mimic the partial view that a robot’s camera would likely have as it traverses a yard.
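One simple way to picture such masking is to hide every cell of a semantic grid that lies outside a window around the robot. The sketch below is an assumption-laden illustration: the article does not describe the shape of the masks, so the square window, the `?` placeholder, the grid, and the function name `mask_partial_view` are all hypothetical.

```python
UNKNOWN = "?"

# Hypothetical semantic grid: R = road, S = sidewalk, D = driveway, G = grass, F = front door
YARD = [
    "RRRRR",
    "SSSSS",
    "GGDGG",
    "GGDGG",
    "GGFGG",
]


def mask_partial_view(grid, robot, radius):
    """Keep labels within `radius` cells of the robot (square window); hide the rest."""
    rr, rc = robot
    return [
        "".join(
            label if max(abs(r - rr), abs(c - rc)) <= radius else UNKNOWN
            for c, label in enumerate(row)
        )
        for r, row in enumerate(grid)
    ]


# A robot standing on the road sees only the road and sidewalk around it.
masked = mask_partial_view(YARD, robot=(0, 2), radius=1)
```

Generating many such masked copies from one labeled image, with the robot placed at different cells, would give a training set of partial views in the spirit of what the passage describes.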
“Part of the trick to our approach was [giving the system] lots of partial images,” How explains. “So it really had to figure out how all this stuff was interrelated. That’s part of what makes this work robustly.”
The researchers then tested their approach in simulation on an image of an entirely new house, outside of the training dataset. They first used the preexisting SLAM algorithm to generate a semantic map, then applied their new cost-to-go estimator to generate a second map and a path to the goal, in this case the front door.
The group’s new cost-to-go technique found the front door 189 percent faster than classical navigation algorithms, which do not take context or semantics into account, and instead spend excessive steps exploring areas that are unlikely to be near their goal.
Everett says the results illustrate how robots can use context to efficiently locate a goal, even in unfamiliar, unmapped environments.
“Even if a robot is delivering a package to an environment it’s never been to, there might be clues that will be the same as other places it’s seen,” Everett says. “So the world may be laid out a little differently, but there’s probably some things in common.”
This research is supported, in part, by the Ford Motor Company.