The 3DIML approach uses a three-step pipeline to convert erratic 2D instance masks into logical 3D instance segmentations. To ensure consistent object classification in spite of segmentation noise, InstanceMap uses a scalable mask association graph based on extended hLoc to associate masks across pictures in the first stage. After resolving ambiguities with a label NeRF, InstanceLift refines these pseudolabels. A quick post-processing step then effectively combines labels that are in disagreement.

Consistent 3D Mask Labeling Made Simple

2025/10/24 23:34

Abstract and I. Introduction

II. Background

III. Method

IV. Experiments

V. Conclusion and References


III. METHOD

Given a sequence of N posed RGB images (I_i, T_i), where I_i denotes an image and T_i its pose, we first extract view-inconsistent instance masks M_i using a generic instance segmentation model such as Mask2Former or SAM.

A. Mask Association

We first generate pseudolabel masks with InstanceMap. Formally, we define ϕ(M, r) as the function that maps a mask M and a region r to a label that stays consistent for the same 3D object across different masks and regions. We extend the popular hLoc [16] framework for scalable 3D reconstruction to mask association as follows:


Fig. 2: Overview of 3DIML. A sequence of color images is segmented into object instances by an image segmentation backbone. The resulting masks are fed into InstanceMap, which produces instance masks that are consistent over all frames. These pseudo instance masks and their respective camera poses are used to supervise an instance label NeRF, which further improves consistency and resolves ambiguity present in the InstanceMap outputs. The feature extraction and global data association blocks together form InstanceMap.

Because NetVLAD and LoFTR do not use 3D information, 3DIML only performs well if each image in the scan sequence contains enough context for these models. Empirically, a good rule of thumb is that any frame containing near-identical objects should also contain at least one other recognizable landmark.

Mask Association Graph: So far, our approach produces instance masks and dense pixel correspondences among images that share visual overlap. However, segmentation models such as SAM [2] suffer from two issues: (a) segmentations of the same object need not be consistent across images, owing to viewpoint and appearance variations; and (b) owing to over-segmentation of objects, there is usually no one-to-one correspondence among masks.
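The idea of turning dense pixel correspondences into consistent labels can be sketched as follows. This is a simplified illustration, not the construction used in the paper, whose details are not reproduced in this excerpt. It assumes that each correspondence votes for a pair of masks, that pairs with at least `min_votes` votes are linked, and that connected components of the resulting graph receive one label each. The function name `associate_masks`, the input layout, and the `min_votes` threshold are all hypothetical.

```python
from collections import Counter


def associate_masks(masks, correspondences, min_votes=5):
    """Group per-image masks into shared labels (illustrative sketch only).

    masks: dict mapping image id -> 2D grid (nested lists or array) of mask ids.
    correspondences: iterable of (img_a, (row_a, col_a), img_b, (row_b, col_b)).
    min_votes: hypothetical threshold on matched pixels needed to link two masks.
    Returns a dict mapping (image id, mask id) -> integer label.
    """
    # Each pixel correspondence votes for the pair of masks it connects.
    votes = Counter()
    for img_a, (ra, ca), img_b, (rb, cb) in correspondences:
        node_a = (img_a, int(masks[img_a][ra][ca]))
        node_b = (img_b, int(masks[img_b][rb][cb]))
        votes[(node_a, node_b)] += 1

    # Union-find over graph nodes; each node is one mask in one image.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), count in votes.items():
        root_a, root_b = find(a), find(b)
        if count >= min_votes and root_a != root_b:
            parent[root_a] = root_b  # link masks with enough support

    # One label per connected component.
    roots = {}
    return {node: roots.setdefault(find(node), len(roots)) for node in list(parent)}
```

Thresholding the votes is one simple way to keep a few spurious matches from merging distinct objects. It does not by itself address over-segmentation, where one mask in one view corresponds to several masks in another.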


B. Mask Refinement

ϕ(M, r) is inherently noisy because the segmentation hierarchy varies across instance masks, both with viewpoint and with the design specifics of the segmentation model. To address this, in InstanceLift we feed the pseudolabel masks to a label NeRF, which resolves some ambiguities. Still, NeRF cannot handle extreme cases of label ambiguity, so we devise a fast post-processing method that identifies and merges colliding labels based on random renders from the label NeRF. The few remaining ambiguities, if any, can be corrected via sparse human annotation.


Fig. 3: InstanceLoc enables 3D-consistent instance segmentation for novel views of the scene unobserved by the InstanceMap pipeline. We leverage off-the-shelf instance segmentation models to first produce 3D-inconsistent instance labels for a new input image. We then query the label field over a sparse set of points on the image and use this to localize each 2D instance mask, i.e., assign a 3D-consistent label to each mask.

After graph construction, we merge labels a, b if


Since we only need coarse information, i.e., instance mask noise, we render images downsampled by a factor of 2.

C. Fast Instance Localization and Rendering

Training a label field enables us to predict 3D-consistent instance labels for novel viewpoints without rerunning 3DIML. However, rendering every pixel is slow, and rendering from a novel viewpoint is often noisy. We propose a fast localization approach that instead precomputes instance masks for the input image using an instance segmentation model (here FastSAM [8]). Given these instance masks, for each instance region we sample the corresponding pixelwise 3D object labels from the label NeRF and take the majority label. Another benefit is that the input instance masks can be constructed using prompts and edited before localization.
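The per-region majority vote described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `localize_masks`, `query_labels`, and `samples_per_region` are hypothetical names, and `query_labels` stands in for whatever call renders labels from the trained label NeRF at the given pixels.

```python
import numpy as np


def localize_masks(instance_mask, query_labels, samples_per_region=32, seed=0):
    """Assign one label to each 2D instance region by majority vote.

    instance_mask: (H, W) integer array of region ids from a 2D segmenter;
        negative ids are treated as unsegmented pixels (an assumption).
    query_labels: hypothetical callable taking a (K, 2) array of (row, col)
        pixels and returning K integer labels from the label field.
    samples_per_region: hypothetical cap on pixels queried per region.
    Returns a dict mapping region id -> majority label.
    """
    rng = np.random.default_rng(seed)
    result = {}
    for region_id in np.unique(instance_mask):
        if region_id < 0:
            continue
        rows, cols = np.nonzero(instance_mask == region_id)
        # Query only a sparse subset of the region's pixels.
        k = min(samples_per_region, rows.size)
        picks = rng.choice(rows.size, size=k, replace=False)
        pixels = np.stack([rows[picks], cols[picks]], axis=1)
        labels = np.asarray(query_labels(pixels))
        # The most frequent sampled label wins.
        values, counts = np.unique(labels, return_counts=True)
        result[int(region_id)] = int(values[np.argmax(counts)])
    return result
```

Querying a fixed number of pixels per region keeps the cost proportional to the number of regions rather than the image resolution, and the vote tolerates a minority of noisy samples within a region.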


:::info Authors:

(1) George Tang, Massachusetts Institute of Technology;

(2) Krishna Murthy Jatavallabhula, Massachusetts Institute of Technology;

(3) Antonio Torralba, Massachusetts Institute of Technology.

:::


:::info This paper is available on arXiv under CC BY 4.0 Deed (Attribution 4.0 International) license.

:::
