符毓 Yu (2023-11-30 23:11):
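#paper doi.org/10.48550/arXiv.2311.05332, 2023, On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving. A recent paper from a team at WeRide (文远知行) that applies GPT to autonomous driving. The test results show that GPT achieves fairly high accuracy in image recognition, point cloud recognition, weather recognition, V2X images, simulated image recognition, and multi-view image recognition, but it is prone to errors in traffic light recognition and in telling left from right spatially.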
Abstract:
The pursuit of autonomous driving technology hinges on the sophisticated integration of perception, decision-making, and control systems. Traditional approaches, both data-driven and rule-based, have been hindered by their inability to grasp the nuance of complex driving environments and the intentions of other road users. This has been a significant bottleneck, particularly in the development of common sense reasoning and nuanced scene understanding necessary for safe and reliable autonomous driving. The advent of Visual Language Models (VLM) represents a novel frontier in realizing fully autonomous vehicle driving. This report provides an exhaustive evaluation of the latest state-of-the-art VLM, GPT-4V(ision), and its application in autonomous driving scenarios. We explore the model's abilities to understand and reason about driving scenes, make decisions, and ultimately act in the capacity of a driver. Our comprehensive tests span from basic scene recognition to complex causal reasoning and real-time decision-making under varying conditions. Our findings reveal that GPT-4V demonstrates superior performance in scene understanding and causal reasoning compared to existing autonomous systems. It showcases the potential to handle out-of-distribution scenarios, recognize intentions, and make informed decisions in real driving contexts. However, challenges remain, particularly in direction discernment, traffic light recognition, vision grounding, and spatial reasoning tasks. These limitations underscore the need for further research and development. Project is now available on GitHub for interested parties to access and utilize: https://github.com/PJLab-ADG/GPT4V-AD-Exploration