Papers shared by user 符毓 Yu.
27 shared papers found in total; this page shows items 1 - 20.
符毓 Yu (2025-02-28 23:00):
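#paper doi.org/10.48550/arXiv.2411.13677, 2024, Bimanual Dexterity for Complex Tasks. Teleoperation is an important way for robots to acquire data. The paper presents a portable, low-cost, and highly precise teleoperation method for a bimanual humanoid robot arm system (about 12k USD in total: 5k for the hands and 7k for the teleoperation system; a pair of robot arms can be added for 16k). It demonstrates the system in tabletop and mobile settings and shows that it is more efficient on bimanual dexterous tasks than other methods such as SteamVR and Vision Pro. However, because there is no haptic feedback, the operator has to rely on visual feedback alone and cannot sense what the robot arm feels.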
arXiv, 2024-11-20T19:53:35Z. DOI: 10.48550/arXiv.2411.13677
To train generalist robot policies, machine learning methods often require a substantial amount of expert human teleoperation data. An ideal robot for humans collecting data is one that closely mimics them: bimanual arms and dexterous hands. However, creating such a bimanual teleoperation system with over 50 DoF is a significant challenge. To address this, we introduce Bidex, an extremely dexterous, low-cost, low-latency and portable bimanual dexterous teleoperation system which relies on motion capture gloves and teacher arms. We compare Bidex to a Vision Pro teleoperation system and a SteamVR system and find Bidex to produce better quality data for more complex tasks at a faster rate. Additionally, we show Bidex operating a mobile bimanual robot for in the wild tasks. The robot hands (5k USD) and teleoperation system (7k USD) is readily reproducible and can be used on many robot arms including two xArms (16k USD). Website at https://bidex-teleop.github.io/
符毓 Yu (2025-01-31 11:25):
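#paper doi.org/10.48550/arXiv.2405.18730, 2024, Development of a Novel Impedance-Controlled Quasi-Direct-Drive Robotic Hand. Beyond the low cost and ease of control of quasi-direct-drive actuators, this paper argues that they offer distinct advantages in dexterous-hand applications, such as picking up small objects like coins from the edge of a table, or quickly and dynamically grasping small objects in unstructured environments.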
arXiv, 2024-05-29T03:20:46Z. DOI: 10.48550/arXiv.2405.18730
Most robotic hands and grippers rely on actuators with large gearboxes and force sensors for controlling gripping force. However, this might not be ideal for tasks that require the robot to interact with an unstructured and unknown environment. In this paper, we introduce a novel quasi-direct-drive two-fingered robotic hand with variable impedance control in the joint space and Cartesian space. The hand has a total of four degrees of freedom, backdrivable differential gear trains, and four brushless direct current (BLDC) motors. Motor torque is controlled through Field-Oriented Control (FOC) with current sensing. Variable impedance control enables the robotic hand to execute dexterous manipulation tasks safely during environment-robot and human-robot interactions. The quasi-direct-drive actuators eliminate the need for complex tactile/force sensors or precise motion planning when handling environmental contact. A majority-3D-printed assembly makes this a low-cost research platform built with affordable, readily available off-the-shelf components. Experimental validation demonstrates the robotic hand's capability for stable force-closure and form-closure grasps in the presence of disturbances, reliable in-hand manipulation, and safe dynamic manipulations despite contact with the environment.
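Joint-space impedance control, as mentioned in the abstract above, is commonly written as a virtual spring-damper law, tau = K (q_des - q) + D (dq_des - dq). The sketch below is a minimal illustration under that textbook assumption. The function name, the gains, and the two-joint example are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def impedance_torque(
    q_des: Sequence[float],
    q: Sequence[float],
    dq_des: Sequence[float],
    dq: Sequence[float],
    stiffness: Sequence[float],
    damping: Sequence[float],
) -> List[float]:
    """Per-joint torque from a virtual spring-damper (illustrative only)."""
    return [
        k * (qd - qi) + d * (dqd - dqi)
        for qd, qi, dqd, dqi, k, d in zip(q_des, q, dq_des, dq, stiffness, damping)
    ]


# Hypothetical two-joint finger: same position error, two stiffness settings.
q_target, q_now = [0.5, 0.0], [0.4, 0.1]
zero = [0.0, 0.0]
soft = impedance_torque(q_target, q_now, zero, zero, [2.0, 2.0], [0.1, 0.1])
stiff = impedance_torque(q_target, q_now, zero, zero, [20.0, 20.0], [0.1, 0.1])
print("soft:", soft)
print("stiff:", stiff)
```

Varying the stiffness gains at run time is what makes the impedance "variable": low gains let the finger yield on contact, high gains hold a grasp more firmly.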
符毓 Yu (2024-12-31 21:56):
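#paper doi:10.1109/CBS.2017.8266084 2017 IEEE International Conference on Cyborg and Bionic Systems (CBS), 2017, Electromagnetic design of a high torque density permanent magnet motor for biomimetic robot. The limiting factors in building faster biomimetic robots are the torque produced by the motors that drive them, together with the mass and power consumption of those motors. These limits call for a motor with low mass and high torque. To raise torque density, the paper outlines the electromagnetic design of a fractional-slot concentrated-winding permanent-magnet brushless DC motor that meets the robot's targets while minimizing total mass. It analyzes the basis for choosing the pole/slot combination of fractional-slot concentrated-winding motors, and calculates the magnetic fields produced by the permanent magnets and by the winding currents from the permanent-magnet field and the air-gap field. Finite element analysis is then used to study the magnetic field distribution, back-EMF, iron loss, steady-state torque, and cogging torque. The results demonstrate the feasibility of the design method: the motor reaches a torque density of 3.47 Nm/kg, higher than that of mainstream commercial motors worldwide.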
符毓 Yu (2024-11-30 20:46):
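#paper doi.org/10.48550/arXiv.2411.18454, 2024, Optimizing Coverage in Convex Quadrilateral Regions with a Single UAV. The paper studies the optimal hovering altitude for a single UAV providing coverage over any convex quadrilateral region on the ground. The UAV uses a directional antenna with a tiltable beam, which produces an elliptical coverage pattern. Two cases are considered: (1) inscribing the largest ellipse within the quadrilateral to cover its interior, and (2) circumscribing the smallest ellipse about the quadrilateral to ensure full coverage. For both cases, the authors derive the optimal UAV altitude and antenna tilt conditions under a simplified yet widely accepted path loss model, and present numerical results for coverage efficiency. The work contributes to the development of energy-efficient UAV communication systems.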
arXiv, 2024-11-27T15:45:31Z. DOI: 10.48550/arXiv.2411.18454
This letter investigates the optimal hovering altitude of a single UAV to provide coverage over any convex quadrilateral region on the ground. The UAV employs a directional antenna with a tiltable beam, producing an elliptical coverage pattern. Two scenarios are considered: (1) inscribing the largest ellipse within the quadrilateral to cover its interior, and (2) circumscribing the smallest ellipse about the quadrilateral to ensure full coverage. We derive the optimal UAV altitude and antenna tilt conditions in both scenarios for a simplified yet widely accepted path loss model and present numerical results for coverage efficiency. The work contributes to the development of energy-efficient UAV-based communication systems.
符毓 Yu (2024-10-30 21:44):
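#paper doi:10.3390/aerospace6030026 Aerospace, 2019, Electric VTOL Configurations Comparison. The paper presents the VTOL aircraft built since the 1950s and discusses their advantages, disadvantages, and problems. Three representative eVTOLs, one for each main configuration, are compared on five main parameters and three reference missions. The parameters are disk loading, total hover time, cruise speed, practical range, and flight time. Performance on urban, extra-urban, and long-range missions is evaluated by computing the time and energy required. The results show that the best configuration depends on the mission: the multirotor is more efficient in hover, the vectored-thrust jet is more efficient in cruise and has a longer range, and lift + cruise is a compromise.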
In the last ten years, different concepts of electric vertical take-off and landing aircrafts (eVTOLs) have been tested. This article addresses the problem of the choice of the best configuration. VTOLs built since the fifties are presented and their advantages, disadvantages, and problems are discussed. Three representative eVTOLs, one for each main configuration, are compared on five main parameters and three reference missions. The parameters are disk loading, total hover time, cruise speed, practical range, and flight time. The performance of the eVTOLs on the urban, extra-urban, and long-range mission is evaluated computing the time and energy required. The results show that the best configuration depends on the mission. The multirotor is more efficient in hover. The vectored thrust jet is more efficient in cruise and has a higher range. The lift + cruise is a compromise.
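One way to see why disk loading matters for hover efficiency is classical momentum theory, in which the ideal induced hover power is P = T * sqrt(T / (2 * rho * A)). The sketch below applies that textbook relation. It is not the paper's own calculation, and the mass and disk areas are made-up values chosen only to contrast a low and a high disk loading.

```python
import math

AIR_DENSITY = 1.225  # kg/m^3, sea-level standard air (assumed)
GRAVITY = 9.81       # m/s^2


def disk_loading(mass_kg: float, disk_area_m2: float) -> float:
    """Disk loading in N/m^2: hover thrust (weight) divided by total rotor disk area."""
    return mass_kg * GRAVITY / disk_area_m2


def ideal_hover_power(mass_kg: float, disk_area_m2: float, rho: float = AIR_DENSITY) -> float:
    """Ideal induced hover power in W from momentum theory (no profile or other losses)."""
    thrust = mass_kg * GRAVITY
    induced_velocity = math.sqrt(thrust / (2.0 * rho * disk_area_m2))
    return thrust * induced_velocity


# Hypothetical aircraft of equal mass with a large and a small total disk area.
low_dl_power = ideal_hover_power(1000.0, 50.0)   # multirotor-like, low disk loading
high_dl_power = ideal_hover_power(1000.0, 5.0)   # ducted-jet-like, high disk loading
print(f"disk loading: {disk_loading(1000.0, 50.0):.1f} vs {disk_loading(1000.0, 5.0):.1f} N/m^2")
print(f"ideal hover power ratio (high/low): {high_dl_power / low_dl_power:.2f}")
```

At equal weight the ideal hover power scales with the square root of disk loading, which is consistent with the paper's finding that the multirotor is the more efficient configuration in hover.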
符毓 Yu (2024-09-30 18:24):
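#paper doi.org/10.3390/en16041594 Euspen, 2023, A comprehensive review on the application of 3D-printed ferromagnetic parts in electric machines. Introducing additive manufacturing into electric machine design has markedly increased design flexibility. The paper aims to give a comprehensive review of how this extended design freedom is currently being used. Many of the designs have been printed successfully and validated experimentally; in other cases experimental validation is lacking or the results are limited. The second part of the paper briefly covers two drawbacks of printed ferromagnetic materials: inferior magnetic properties, and the difficulty of mitigating eddy-current losses with single-material printing. Although additive manufacturing of ferromagnetic materials and its applications are not yet fully mature, the technology is promising.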
This paper presents a prototype of a low-cost two-phase axial-gap transverse flux generator, in which the magnetic and electric circuits have been made of reused materials, and the stator housing has been manufactured by 3D printing of plastic. Therefore, this work presents as a novelty the combination of the novel transverse flux topology and two challenging trends in electrical machines manufacturing, such as reusing of components and additive manufacturing. Axial-gap transverse flux machines potentially enable the combination of two of the main advantages of axial flux machines and transverse flux machines, i.e., short axial length and a high number of poles. The two-phase arrangement with shared air gap is of great interest in order to reduce further the axial length while avoiding the use of magnetic materials in the rotor, such as iron or soft magnetic composites. However, the equivalent air gap might be large, with significant leakage and fringing effects as the magnetic flux closes through the air. Therefore, in this paper the accuracy of the analytical equations and the magnetic equivalent circuit is firstly investigated. The two-phase axial-gap transverse flux machine is prone to misalignment between phases and rotor imbalances that alter the air gap length, so these effects have been included in the simulations with the finite element method. Experimental tests have been conducted throughout the investigation, from the prototype characterization to the steady-state operation, both with no load and with resistive loads.
符毓 Yu (2024-08-31 23:18):
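#paper doi: 10.1038/s41598-022-06214-7, Scientific Reports, 2022, Low voltage optical fiber positioner robot based on minimum inductance hollow cup motors. The new generation of fiber positioner robot uses a 4 mm hollow cup motor with minimum phase inductance. Because the robot's load is constant and the motor's moment of inertia is very small, the paper proposes an open-loop positioning control method based on space vector pulse width modulation (SVPWM), with the specific open-loop parameters tuned directly through experiments. The key factors of the open-loop drive are discussed in detail from four aspects: subdivision (microstepping), fundamental frequency, wave generation mode, and peak current. A hardware drive and assessment platform is built around an actual fiber positioner robot. Positioning tests show that the method is practical and effective and meets the precision positioning requirements of the new-generation fiber positioner robot.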
With the further transformation of The Large Sky Area Multi-Object Fiber Spectroscopic Telescope, the new generation of fiber positioner robot chooses a 4 mm hollow cup motor with minimum phase inductance. Because the load of the fiber positioner robot is constant and the inertia of the motor is very small, an open loop positioning control method based on Space Vector Pulse Width Modulation is proposed, and the specific open loop parameters are directly tuned by relevant experimental strategies. The critical factors of the open loop driving mode are discussed in detail from four aspects: subdivision, fundamental frequency, wave generation mode and peak current. Based on the actual fiber positioner robot, the hardware driver and assessment platform are built. The positioning tests show that the method proposed is practical and effective, and meets the precision positioning demand of the new generation optical fiber positioner robot.
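For readers unfamiliar with Space Vector Pulse Width Modulation, the sketch below computes the textbook dwell times of the two active vectors and the zero vector for a commanded voltage vector, then steps the electrical angle open-loop in fixed subdivisions. It is a generic illustration, not the paper's driver: the function name, the 24 V bus, the 10 V reference, the 100 us period, and the 64-step subdivision are all assumptions.

```python
import math
from typing import Tuple


def svpwm_dwell_times(v_ref: float, angle_rad: float, v_dc: float, t_s: float) -> Tuple[int, float, float, float]:
    """Return (sector, t1, t2, t0) for one switching period t_s.

    Standard linear-range SVPWM: t1 and t2 are the on-times of the two active
    vectors bounding the sector, t0 is the remaining zero-vector time.
    """
    sector_width = math.pi / 3.0
    angle = angle_rad % (2.0 * math.pi)
    sector_index = min(int(angle // sector_width), 5)   # 0..5
    theta = angle - sector_index * sector_width          # angle inside the sector
    modulation = math.sqrt(3.0) * v_ref / v_dc
    if modulation > 1.0:
        raise ValueError("reference vector is outside the linear modulation range")
    t1 = t_s * modulation * math.sin(sector_width - theta)
    t2 = t_s * modulation * math.sin(theta)
    t0 = max(0.0, t_s - t1 - t2)
    return sector_index + 1, t1, t2, t0


# Open-loop stepping: advance the electrical angle by one subdivision per update.
SUBDIVISION = 64  # hypothetical microsteps per electrical revolution
for step in range(4):
    angle = 2.0 * math.pi * step / SUBDIVISION
    print(step, svpwm_dwell_times(10.0, angle, 24.0, 1e-4))
```

Because there is no position feedback, the rotor is assumed to follow the commanded vector; that assumption is only plausible under the constant load and low inertia the abstract describes.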
符毓 Yu (2024-07-31 21:51):
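#paper doi.org/10.48550/arXiv.2312.06512, 2024, Stoch BiRo: Design and Control of a low cost bipedal robot. The bipedal platform proposed here emphasizes proficient walking, low computational requirements, and a lightweight hardware design. The reinforcement-learning reward function is built around motion-imitation rewards (mirroring and following a reference animation) rather than prioritizing keeping the robot's IMU level, which cuts down much of the torque simulation data.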
arXiv, 2023-12-11T16:39:11Z. DOI: 10.48550/arXiv.2312.06512
This paper introduces the Stoch BiRo, a cost-effective bipedal robot designed with a modular mechanical structure having point feet to navigate uneven and unfamiliar terrains. The robot employs proprioceptive actuation in abduction, hips, and knees, leveraging a Raspberry Pi4 for control. Overcoming computational limitations, a Learning-based Linear Policy controller manages balance and locomotion with only 3 degrees of freedom (DoF) per leg, distinct from the typical 5DoF in bipedal systems. Integrated within a modular control architecture, these controllers enable autonomous handling of unforeseen terrain disturbances without external sensors or prior environment knowledge. The robot's policies are trained and simulated using MuJoCo, transferring learned behaviors to the Stoch BiRo hardware for initial walking validations. This work highlights the Stoch BiRo's adaptability and cost-effectiveness in mechanical design, control strategies, and autonomous navigation, promising diverse applications in real-world robotics scenarios.
符毓 Yu (2024-06-30 23:02):
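#paper doi.org/10.48550/arXiv.2404.17569, 2024, MaPa: Text-driven Photorealistic Material Painting for 3D Shapes. The paper presents an algorithm that paints high-quality material surfaces onto 3D models from text. It has four steps. First, the mesh is decomposed into segments, which are projected onto 2D images using segment-controlled image generation (specifically ControlNet). Second, the segments are grouped by similar material properties and appearance. Third, each material group goes through a selection process in which suitable material graphs are identified and optimized to represent its texture and properties accurately. Finally, the method iterates: it renders and optimizes the material graphs across multiple views, fills any gaps in the visual data, and repeats the grouping and optimization stages until every segment of the mesh is accurately represented by a corresponding material graph. This ensures detailed, realistic material textures tailored to the characteristics of each segment of the 3D mesh.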
This paper aims to generate materials for 3D meshes from text descriptions. Unlike existing methods that synthesize texture maps, we propose to generate segment-wise procedural material graphs as the appearance representation, which supports high-quality rendering and provides substantial flexibility in editing. Instead of relying on extensive paired data, i.e., 3D meshes with material graphs and corresponding text descriptions, to train a material graph generative model, we propose to leverage the pre-trained 2D diffusion model as a bridge to connect the text and material graphs. Specifically, our approach decomposes a shape into a set of segments and designs a segment-controlled diffusion model to synthesize 2D images that are aligned with mesh parts. Based on generated images, we initialize parameters of material graphs and fine-tune them through the differentiable rendering module to produce materials in accordance with the textual description. Extensive experiments demonstrate the superior performance of our framework in photorealism, resolution, and editability over existing methods. Project page: https://zju3dv.github.io/MaPa
符毓 Yu (2024-05-31 23:30):
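#paper doi:10.1109/TIA.2002.805572 IEEE Transactions on Industry Applications, 2002, A comparison between the axial flux and the radial flux structures for PM synchronous motors. The paper compares an axial-flux motor with two external stators and one internal rotor against a radial-flux motor with one external stator and one internal rotor. With the overall motor volume, the losses per unit of dissipating surface, and the air-gap flux density held constant, it compares the electromagnetic torque and torque density of the two structures, and shows that the axial-flux motor has the advantage when the axial length is short and the pole count is high.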
符毓 Yu (2024-04-30 21:53):
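#paper doi:10.1002/eem2.12734, Energy & Environmental Materials, 2024, Confluence of ZnO and PTFE Binder for Enhancing Performance of Thin-Film Lithium-Ion Batteries. Developing anode materials with high specific capacity and cycling stability is essential for improving thin-film lithium-ion batteries. Thin-film zinc oxide (ZnO) is promising because of its high specific capacity, but it suffers from volume changes and structural stress during cycling, which degrades battery performance. The paper combines polytetrafluoroethylene (PTFE) with ZnO by magnetron co-sputtering, ensuring a strong bond in the thin-film composite electrode. PTFE effectively reduces stress on the active material and mitigates the effects of volume change during Li⁺ intercalation and deintercalation. The ZnO/PTFE thin-film electrode retains 82% of its capacity from the 1st to the 100th cycle, compared with 50% for the bare ZnO thin film.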
Developing anode materials with high specific capacity and cycling stability is vital for improving thin-film lithium-ion batteries. Thin-film zinc oxide (ZnO) holds promise due to its high specific capacity, but it suffers from volume changes and structural stress during cycling, leading to poor battery performance. In this research, we ingeniously combined polytetrafluoroethylene (PTFE) with ZnO using a radio frequency (RF) magnetron co-sputtering method, ensuring a strong bond in the thin-film composite electrode. PTFE effectively reduced stress on the active material and mitigated volume change effects during Li⁺ ion intercalation and deintercalation. The composite thin films are thoroughly characterized using advanced techniques such as X-ray diffraction, scanning electron microscopy, and X-ray photoelectron spectroscopy for investigating correlations between material properties and electrochemical behaviors. Notably, the ZnO/PTFE thin-film electrode demonstrated an impressive specific capacity of 1305 mAh g⁻¹ (= 7116 mAh cm⁻³) at a 0.5C rate and a remarkable capacity retention of 82% from the 1st to the 100th cycle, surpassing the bare ZnO thin film (50%). This study provides valuable insights into using binders to stabilize active materials in thin-film batteries, enhancing battery performance.
符毓 Yu (2024-03-31 23:50):
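#paper doi.org/10.48550/arXiv.2403.16527, 2024, Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art. Intelligent control systems can be applied widely across the scenarios they were trained for, but perform poorly outside them. Foundation models promise the reasoning ability that existing training approaches lack, but they can "hallucinate", producing decisions that sound reasonable but are in fact poor. The paper attempts to define hallucination and provides a taxonomy of methods for detecting and mitigating hallucinations in planning, along with evaluation metrics and datasets.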
Autonomous systems are soon to be ubiquitous, from manufacturing autonomy to agricultural field robots, and from health care assistants to the entertainment industry. The majority of these systems are developed with modular sub-components for decision-making, planning, and control that may be hand-engineered or learning-based. While these existing approaches have been shown to perform well under the situations they were specifically designed for, they can perform especially poorly in rare, out-of-distribution scenarios that will undoubtedly arise at test-time. The rise of foundation models trained on multiple tasks with impressively large datasets from a variety of fields has led researchers to believe that these models may provide common sense reasoning that existing planners are missing. Researchers posit that this common sense reasoning will bridge the gap between algorithm development and deployment to out-of-distribution tasks, like how humans adapt to unexpected scenarios. Large language models have already penetrated the robotics and autonomous systems domains as researchers are scrambling to showcase their potential use cases in deployment. While this application direction is very promising empirically, foundation models are known to hallucinate and generate decisions that may sound reasonable, but are in fact poor. We argue there is a need to step back and simultaneously design systems that can quantify the certainty of a model's decision, and detect when it may be hallucinating. In this work, we discuss the current use cases of foundation models for decision-making tasks, provide a general definition for hallucinations with examples, discuss existing approaches to hallucination detection and mitigation with a focus on decision problems, and explore areas for further research in this exciting field.
符毓 Yu (2024-02-29 22:43):
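#paper doi.org/10.48550/arXiv.2304.09349 2023, LLM as A Robotic Brain: Unifying Egocentric Memory and Control. LLM agents draw on knowledge and reasoning ability acquired in pre-training to solve robotics and planning tasks. However, most effort so far has gone into teaching robots what to do. This paper focuses instead on conveying what the robot must not do, and on meeting safe-operation standards. For deploying LLM agents in collaborative environments, it proposes a way of enforcing constraints that addresses the inherently probabilistic nature of LLMs and their inability to handle complex conditions. Experiments in the VirtualHome environment and on a real robot show that the safety constraints can be satisfied without reducing the goal completion rate.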
符毓 Yu (2024-01-31 21:30):
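#paper doi:10.1049/elp2.12371 IET Electric Power Applications, 2023, AC losses calculation of parallel multi-strand flat wire windings for automotive drive motor. Flat-wire windings improve the slot fill factor and efficiency of automotive drive motors. However, as drive motors move toward higher frequencies and voltages, rising winding AC loss erodes these advantages. Accurate analysis of AC loss and the design of high-performance windings have become key issues and difficulties in drive motor design. Based on the mechanism by which winding AC loss arises, the authors propose an analysis method that effectively separates DC loss, eddy-current loss, and circulating-current loss, and apply it to the winding AC loss of a 60 kW flat-wire permanent magnet synchronous motor (PMSM). They also propose parallel multi-strand flat-wire windings to reduce winding AC loss and remove the limit that the motor's magnetic load places on the choice of winding layers. The AC loss of the parallel multi-strand windings is calculated at different frequencies, demonstrating that the proposed topology reduces winding AC loss. The simulation model is verified by experiment, and the loss calculation method by finite element analysis. In summary, the paper presents a method for accurately analyzing winding AC loss, uses it to calculate the AC loss of the flat-wire winding of a 60 kW PMSM, and proposes a high-performance winding structure that reduces it, with the following conclusions. 1) The proposed method effectively separates DC loss, eddy-current loss, and circulating-current loss. It solves the problem of separating the components of winding AC loss, can account for the effects of temperature, frequency, and number of layers, and provides a theoretical basis for studying winding AC loss. 2) As temperature rises, the DC loss of the winding increases while the eddy-current loss decreases. Raising the excitation frequency, on the other hand, increases both eddy-current loss and circulating-current loss. 3) Increasing the number of winding layers reduces the eddy-current loss of flat-wire windings. The eddy-current loss is always largest in the conductor closest to the slot opening, which accounts for about 60% of the total eddy-current loss. Eddy-current loss falls as the number of layers increases: an 8-layer winding has 48.77% less eddy-current loss than a 4-layer winding. 4) The more strands the winding is divided into, the more the eddy-current loss is reduced, but the more circulating-current loss is produced. End-winding transposition effectively suppresses circulating-current loss while leaving eddy-current loss essentially unchanged. The proposed parallel multi-strand flat-wire winding reduces AC loss by up to 26.28% at peak speed, showing that it can effectively reduce winding AC loss without changing the number of winding layers.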
Automobile drive motor with flat wire winding has improved the slot fill factor and efficiency. However, under the current environment of drive motors development to high frequency and voltage, the increase in winding AC loss has weakened those advantages. Accurate analysis of AC loss and the design of high-performance windings have become the key issues and difficulties in the design process of drive motors. Based on the winding AC loss generation mechanism, the authors propose a winding AC loss analysis method, which can effectively separate DC loss, eddy current loss, and circulating current loss. The method is used to analyse winding AC loss of 60 kW flat wire winding permanent magnet synchronous motor (PMSM). In addition, parallel multi-strand flat wire windings are proposed to reduce the winding AC loss and eliminate the limitation of the magnetic load of the motor on the selection of winding layers. The AC loss of the parallel multi-strand flat wire windings at different frequencies is calculated, which demonstrates the effectiveness of the proposed winding topology in reducing the winding AC loss. In addition, the correctness of the simulation model was verified through experiments, and the loss calculation method was verified through finite element analysis.
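The separation of winding AC loss into DC, eddy-current, and circulating-current parts can be illustrated with simple bookkeeping. The sketch below is only an assumption about how such a split might be tabulated, not the authors' method: the DC part uses I^2 R with the usual linear temperature correction for copper, and the circulating-current part is taken as whatever remains after subtracting the DC and eddy-current parts from a total. All numbers are invented.

```python
COPPER_TEMP_COEFF = 0.00393  # 1/K near 20 degC, textbook value for copper


def dc_resistance(r_ref_ohm: float, temp_c: float, ref_temp_c: float = 20.0,
                  alpha: float = COPPER_TEMP_COEFF) -> float:
    """Winding DC resistance at temp_c from its value at ref_temp_c."""
    return r_ref_ohm * (1.0 + alpha * (temp_c - ref_temp_c))


def dc_loss(i_rms_a: float, r_ref_ohm: float, temp_c: float) -> float:
    """DC (ohmic) loss in W; it grows with temperature because resistance grows."""
    return i_rms_a ** 2 * dc_resistance(r_ref_ohm, temp_c)


def split_ac_loss(total_w: float, dc_w: float, eddy_w: float) -> dict:
    """Attribute the remainder of the total loss to circulating currents."""
    circulating_w = total_w - dc_w - eddy_w
    if circulating_w < 0:
        raise ValueError("components exceed the total loss")
    return {"dc": dc_w, "eddy": eddy_w, "circulating": circulating_w}


# Hypothetical operating point: 100 A rms, 10 mOhm at 20 degC, winding at 120 degC.
p_dc = dc_loss(100.0, 0.010, 120.0)
print(split_ac_loss(total_w=200.0, dc_w=p_dc, eddy_w=40.0))
```

In practice the eddy-current and total losses would come from finite element simulation or measurement, as in the paper; here they are placeholders.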
符毓 Yu (2023-12-31 16:44):
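#paper doi.org/10.1038/s41586-023-06306-y Nature, 2023, Solid-body trajectoids shaped to roll along desired pathways. The paper introduces trajectoids, solid bodies shaped to roll along a desired path. The authors design these bodies algorithmically and verify the designs with 3D printing. The paper discusses the bodies' motion, path design, and shape, and gives application examples in several physical systems, such as quantum mechanics, classical optics, and robotics. The results matter for understanding the dynamics of moving bodies and for designing new optical devices.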
符毓 Yu (2023-11-30 23:11):
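#paper doi.org/10.48550/arXiv.2311.05332, 2023, On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving. A recent paper from the WeRide (文远知行) team that applies GPT to autonomous driving. The tests show that GPT achieves fairly high accuracy on image recognition, point cloud recognition, weather recognition, V2X images, simulated images, and multi-view images, but is error-prone on traffic light recognition and on telling left from right.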
The pursuit of autonomous driving technology hinges on the sophisticated integration of perception, decision-making, and control systems. Traditional approaches, both data-driven and rule-based, have been hindered by their inability to grasp the nuance of complex driving environments and the intentions of other road users. This has been a significant bottleneck, particularly in the development of common sense reasoning and nuanced scene understanding necessary for safe and reliable autonomous driving. The advent of Visual Language Models (VLM) represents a novel frontier in realizing fully autonomous vehicle driving. This report provides an exhaustive evaluation of the latest state-of-the-art VLM, GPT-4V(ision), and its application in autonomous driving scenarios. We explore the model's abilities to understand and reason about driving scenes, make decisions, and ultimately act in the capacity of a driver. Our comprehensive tests span from basic scene recognition to complex causal reasoning and real-time decision-making under varying conditions. Our findings reveal that GPT-4V demonstrates superior performance in scene understanding and causal reasoning compared to existing autonomous systems. It showcases the potential to handle out-of-distribution scenarios, recognize intentions, and make informed decisions in real driving contexts. However, challenges remain, particularly in direction discernment, traffic light recognition, vision grounding, and spatial reasoning tasks. These limitations underscore the need for further research and development. Project is now available on GitHub for interested parties to access and utilize: https://github.com/PJLab-ADG/GPT4V-AD-Exploration
符毓 Yu (2023-10-31 22:49):
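#paper doi:10.1080/03772063.2020.1830862 IETE Journal of Research, 2020, Electric Vehicle Control and Driving Safety Systems: A Review. Over the past 10 years, automotive electrical/electronic architectures have been upgraded rapidly, and safety testing of control aspects such as dynamic real-time requirements and stability correspondingly needs newer, more reliable test methods. The paper summarizes past and current solutions for different electric vehicle control designs.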
符毓 Yu (2023-09-30 10:07):
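#paper doi:10.1038/171737a0 International Conference on Industrial Engineering and Systems Management (IESM), 2019, AMHS Vehicle Management Policies in Semiconductor Manufacturing: A Short Review. Automated Material Handling Systems (AMHS) are an important equipment system in semiconductor wafer fabs, mainly addressing material waiting and transport efficiency between processing steps. I started following this area because a few domestic (Chinese) companies are working on home-grown replacements for this part of the fab. The paper is fairly introductory: it covers the background, characteristics, and composition of AMHS, and the strengths and weaknesses of the mainstream scheduling methods.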
符毓 Yu (2023-08-31 22:39):
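#paper doi.org/10.48550/arXiv.2303.09165 2023, A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation. To address the cost of large-scale manual annotation in machine vision, the team turns to synthetic data. After generating synthetic data according to set rules, the paper shows that pre-training on synthetic data may outperform pre-training on real data, and may also beat the results of several data augmentation approaches. There is a lot of room for future applications.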
Deep learning in computer vision has achieved great success with the price of large-scale labeled training data. However, exhaustive data annotation is impracticable for each task of all domains of interest, due to high labor costs and unguaranteed labeling accuracy. Besides, the uncontrollable data collection process produces non-IID training and test data, where undesired duplication may exist. All these nuisances may hinder the verification of typical theories and exposure to new findings. To circumvent them, an alternative is to generate synthetic data via 3D rendering with domain randomization. We in this work push forward along this line by doing profound and extensive research on bare supervised learning and downstream domain adaptation. Specifically, under the well-controlled, IID data setting enabled by 3D rendering, we systematically verify the typical, important learning insights, e.g., shortcut learning, and discover the new laws of various data regimes and network architectures in generalization. We further investigate the effect of image formation factors on generalization, e.g., object scale, material texture, illumination, camera viewpoint, and background in a 3D scene. Moreover, we use the simulation-to-reality adaptation as a downstream task for comparing the transferability between synthetic and real data when used for pre-training, which demonstrates that synthetic data pre-training is also promising to improve real test results. Lastly, to promote future research, we develop a new large-scale synthetic-to-real benchmark for image classification, termed S2RDA, which provides more significant challenges for transfer from simulation to reality. The code and datasets are available at this https URL.
符毓 Yu (2023-07-31 16:41):
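#paper doi: 10.48550/arXiv.2307.05973 2023, Composable 3D Value Maps for Robotic Manipulation with Language Models. The latest paper from Fei-Fei Li's team, combining language models with robot manipulation. Coupling with a large language model makes human-robot interaction more efficient and enables vision-based real-time trajectory planning. By eye, the arm moves at about one eighth of the working speed of typical robot arms, and reliability needs to improve further before real-world use (the error rate is over 25%).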
Large language models (LLMs) are shown to possess a wealth of actionable knowledge that can be extracted for robot manipulation in the form of reasoning and planning. Despite the progress, most still rely on pre-defined motion primitives to carry out the physical interactions with the environment, which remains a major bottleneck. In this work, we aim to synthesize robot trajectories, i.e., a dense sequence of 6-DoF end-effector waypoints, for a large variety of manipulation tasks given an open-set of instructions and an open-set of objects. We achieve this by first observing that LLMs excel at inferring affordances and constraints given a free-form language instruction. More importantly, by leveraging their code-writing capabilities, they can interact with a visual-language model (VLM) to compose 3D value maps to ground the knowledge into the observation space of the agent. The composed value maps are then used in a model-based planning framework to zero-shot synthesize closed-loop robot trajectories with robustness to dynamic perturbations. We further demonstrate how the proposed framework can benefit from online experiences by efficiently learning a dynamics model for scenes that involve contact-rich interactions. We present a large-scale study of the proposed method in both simulated and real-robot environments, showcasing the ability to perform a large variety of everyday manipulation tasks specified in free-form natural language. Project website: this https URL
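To make the idea of composing value maps more concrete, the toy sketch below builds a cost map on a small voxel grid by adding a "move toward the target" term and a "stay away from this object" term, then follows the map greedily. It is a loose illustration only: the paper's maps are produced by an LLM and a VLM and are used inside a model-based planner, whereas the grid size, weights, and greedy descent here are all assumptions made for the example.

```python
import math
from itertools import product
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]


def compose_cost_map(shape: Voxel, target: Voxel, obstacle: Voxel,
                     avoid_weight: float = 5.0, avoid_radius: float = 1.5) -> Dict[Voxel, float]:
    """Sum an attraction term (distance to target) and an avoidance term near the obstacle."""
    cost = {}
    for voxel in product(*(range(n) for n in shape)):
        attraction = math.dist(voxel, target)
        avoidance = avoid_weight * max(0.0, avoid_radius - math.dist(voxel, obstacle))
        cost[voxel] = attraction + avoidance
    return cost


def greedy_path(cost: Dict[Voxel, float], start: Voxel, max_steps: int = 100) -> List[Voxel]:
    """Step to the cheapest of the 26 neighbouring voxels until no neighbour improves."""
    path, current = [start], start
    deltas = [d for d in product((-1, 0, 1), repeat=3) if any(d)]
    for _ in range(max_steps):
        neighbours = [tuple(c + d for c, d in zip(current, delta)) for delta in deltas]
        neighbours = [v for v in neighbours if v in cost]
        best = min(neighbours, key=cost.get)
        if cost[best] >= cost[current]:
            break
        current = best
        path.append(current)
    return path


cost_map = compose_cost_map(shape=(6, 6, 6), target=(5, 5, 5), obstacle=(3, 3, 3))
waypoints = greedy_path(cost_map, start=(0, 0, 0))
print(waypoints)
```

Greedy descent can stall in a local minimum on less friendly maps, which is one reason a real system would use a proper planner and re-plan in closed loop.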