The Language-Guided Navigation module leverages an LLM (like ChatGPT) and the open-set O3D-SIM.

VLN: LLM and CLIP for Instance-Specific Navigation on 3D Maps

Abstract and 1. Introduction

  2. Related Works

    2.1. Vision-and-Language Navigation

    2.2. Semantic Scene Understanding and Instance Segmentation

    2.3. 3D Scene Reconstruction

  3. Methodology

    3.1. Data Collection

    3.2. Open-set Semantic Information from Images

    3.3. Creating the Open-set 3D Representation

    3.4. Language-Guided Navigation

  4. Experiments

    4.1. Quantitative Evaluation

    4.2. Qualitative Results

  5. Conclusion and Future Work, Disclosure statement, and References

3.4. Language-Guided Navigation

In this section, we leverage the LLM-based approach from [1], which uses ChatGPT [35] to interpret language commands and map them to pre-defined function primitives that the robot can execute. However, our current approach differs from the approach in [1] in two respects: the role the LLM plays and the implementation of the function primitives. The previous approach relied on the LLM to bring in open-set understanding by mapping general queries to the already-known closed-set class labels obtained via Mask2Former [7].

However, given the open-set nature of our new representation, O3D-SIM, the LLM no longer needs to perform that mapping. Figure 4 shows how the code output differs between the two approaches. The function primitives work similarly to those of the older approach, taking the desired object type and its instance as input. The desired object, however, is no longer drawn from a pre-defined set of classes; it is a short query describing the object, so the implementation that finds the desired location changes. We exploit the text-image alignment of CLIP embeddings to find the desired object: the input description is passed to the model, and the resulting embedding is used to find the object in O3D-SIM.
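
To make the primitive interface concrete, the following is a minimal sketch of how the LLM's code output could be parsed into primitive calls that carry a free-form query and an instance number. The primitive name `go_to_object`, the `PrimitiveCall` type, and the two-argument call shape are assumptions made for illustration; the exact output format is the one shown in Figure 4, which is not reproduced here.

```python
import ast
from dataclasses import dataclass


@dataclass
class PrimitiveCall:
    """One navigation primitive extracted from the LLM's code output."""
    name: str
    query: str     # free-form object description (open-set, not a class label)
    instance: int  # which matching instance the user asked for


# Hypothetical primitive name; the real set is defined by the robot's API.
ALLOWED_PRIMITIVES = {"go_to_object"}


def parse_llm_output(code: str) -> list:
    """Parse lines such as go_to_object("red chair", 2) without executing them."""
    calls = []
    for stmt in ast.parse(code).body:
        # Accept only bare function-call statements.
        if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)):
            raise ValueError("only bare primitive calls are allowed")
        call = stmt.value
        if not isinstance(call.func, ast.Name) or call.func.id not in ALLOWED_PRIMITIVES:
            raise ValueError("unknown primitive")
        if call.keywords:
            raise ValueError("keyword arguments are not supported in this sketch")
        # literal_eval accepts AST nodes and only evaluates plain literals.
        args = [ast.literal_eval(arg) for arg in call.args]
        if len(args) != 2 or not isinstance(args[0], str) or not isinstance(args[1], int):
            raise ValueError("expected (query: str, instance: int)")
        calls.append(PrimitiveCall(call.func.id, args[0], args[1]))
    return calls


if __name__ == "__main__":
    print(parse_llm_output('go_to_object("wooden chair next to the window", 2)'))
```

Parsing the output with `ast` instead of calling `exec` keeps the robot from running arbitrary generated code; only whitelisted primitives with literal arguments pass through.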

The cosine similarity is computed between the embedding of the description and every embedding stored in our representation. The results are ranked in decreasing order of similarity, and the desired instance is selected. Once the instance is finalized, a goal corresponding to it is generated and passed to the navigation stack for autonomous navigation of the robot, thereby achieving Language-Guided Navigation.
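
The ranking step described above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the toy three-dimensional vectors stand in for real CLIP embeddings, and the interpretation that the requested instance number picks the k-th best match is an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero-length embedding")
    return dot / (norm_a * norm_b)


def rank_instances(query_embedding, instance_embeddings):
    """Return (index, similarity) pairs sorted by decreasing similarity."""
    scored = [
        (cosine_similarity(query_embedding, emb), idx)
        for idx, emb in enumerate(instance_embeddings)
    ]
    # Python's sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored]


def select_instance(query_embedding, instance_embeddings, instance=1):
    """Pick the instance-th best match (1-based). Assumed selection rule."""
    ranked = rank_instances(query_embedding, instance_embeddings)
    if not 1 <= instance <= len(ranked):
        raise IndexError("requested instance is out of range")
    return ranked[instance - 1][0]


if __name__ == "__main__":
    # Toy stand-ins for the per-object embeddings stored in the 3D map.
    stored = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
    query = [1.0, 0.0, 0.0]  # would come from the CLIP text encoder
    print(rank_instances(query, stored))
    print(select_instance(query, stored, instance=2))
```

With real CLIP embeddings, which have hundreds of dimensions, the same computation is normally done as a single normalized matrix-vector product in a numerical library; the pure-Python loop here only keeps the sketch dependency-free.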


:::info Authors:

(1) Laksh Nanwani, International Institute of Information Technology, Hyderabad, India; this author contributed equally to this work;

(2) Kumaraditya Gupta, International Institute of Information Technology, Hyderabad, India;

(3) Aditya Mathur, International Institute of Information Technology, Hyderabad, India; this author contributed equally to this work;

(4) Swayam Agrawal, International Institute of Information Technology, Hyderabad, India;

(5) A.H. Abdul Hafez, Hasan Kalyoncu University, Sahinbey, Gaziantep, Turkey;

(6) K. Madhava Krishna, International Institute of Information Technology, Hyderabad, India.

:::


:::info This paper is available on arXiv under a CC BY-SA 4.0 Deed (Attribution-ShareAlike 4.0 International) license.

:::
