Explores MaGGIe's architecture, featuring mask guidance embeddings, progressive refinement (PRM), and bidirectional matte fusion for consistent video results.

MaGGIe Architecture Deep Dive: Mask Guidance and Sparse Refinement

Abstract and 1. Introduction

  2. Related Works

  3. MaGGIe

    3.1. Efficient Masked Guided Instance Matting

    3.2. Feature-Matte Temporal Consistency

  4. Instance Matting Datasets

    4.1. Image Instance Matting and 4.2. Video Instance Matting

  5. Experiments

    5.1. Pre-training on image data

    5.2. Training on video data

  6. Discussion and References

Supplementary Material

  7. Architecture details

  8. Image matting

    8.1. Dataset generation and preparation

    8.2. Training details

    8.3. Quantitative details

    8.4. More qualitative results on natural images

  9. Video matting

    9.1. Dataset generation

    9.2. Training details

    9.3. Quantitative details

    9.4. More qualitative results

7. Architecture details

This section delves into the architectural nuances of our framework, providing a more detailed exposition of components briefly mentioned in the main paper. These insights are crucial for a comprehensive understanding of the underlying mechanisms of our approach.

7.1. Mask guidance identity embedding

7.2. Feature extractor

Figure 7. Converting Dense-Image to Sparse-Instance Features. We transform the dense image features into sparse, instance-specific features with the help of instance tokens.

7.3. Dense-image to sparse-instance features

7.4. Detail aggregation

This process, akin to a U-Net decoder, aggregates features from different scales, as detailed in Fig. 8. It upscales sparse features and merges them with the corresponding higher-scale features. The upscaling relies on downscale indices that are precomputed by running dummy sparse convolutions on the full input image.
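The role of the precomputed indices can be illustrated with a minimal sketch. It assumes sparse features are stored as a mapping from pixel coordinates to feature vectors, a stride of 2 between scales, and a plain copy-and-sum in place of the learned inverse sparse convolution. The names `downscale_indices` and `upscale_and_merge` are hypothetical and do not come from the paper.

```python
def downscale_indices(fine_coords, stride=2):
    """Record which fine-scale coordinates fall under each coarse coordinate.

    Stand-in for the bookkeeping that a 'dummy' downscale pass would produce.
    """
    index = {}
    for (y, x) in fine_coords:
        index.setdefault((y // stride, x // stride), []).append((y, x))
    return index


def upscale_and_merge(coarse_feats, fine_feats, index):
    """Copy each coarse feature back to its recorded fine coordinates and
    add it to the fine-scale feature already stored there.

    Assumption: a real inverse sparse convolution applies learned weights;
    the copy-and-sum here only shows how the indices are used.
    """
    merged = {coord: list(feat) for coord, feat in fine_feats.items()}
    for parent, children in index.items():
        up = coarse_feats.get(parent)
        if up is None:
            continue  # no coarse feature at this location, keep fine feature
        for child in children:
            merged[child] = [a + b for a, b in zip(up, fine_feats[child])]
    return merged


fine_feats = {(0, 0): [1.0, 1.0], (0, 1): [2.0, 0.0], (2, 3): [0.5, 0.5]}
index = downscale_indices(fine_feats)
coarse_feats = {(0, 0): [10.0, 20.0]}
merged = upscale_and_merge(coarse_feats, fine_feats, index)
```

Without the recorded index, the upscaling step would have no way of knowing which fine-scale locations a coarse feature should be restored to, which is why the indices have to be computed ahead of time.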

7.5. Sparse matte head

Our matte head design, inspired by MGM [56], comprises two sparse convolutions with intermediate normalization and activation (Leaky ReLU) layers. A sigmoid is applied to the output to produce the final prediction. Non-refined locations in the dense prediction are assigned a value of zero.
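The last two steps, the sigmoid and the zero fill, can be sketched as follows. This is a minimal illustration that assumes the sparse convolutions have already produced one logit per refined coordinate; the function name `scatter_to_dense` and its arguments are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def scatter_to_dense(height, width, coords, logits):
    """Write sigmoid(logit) at each refined coordinate of a dense map.

    Every location that was not refined stays at zero.
    """
    dense = [[0.0] * width for _ in range(height)]
    for (y, x), z in zip(coords, logits):
        dense[y][x] = sigmoid(z)
    return dense


# Two refined pixels in a 2x3 map; all other pixels remain 0.0.
dense = scatter_to_dense(2, 3, coords=[(0, 1), (1, 2)], logits=[0.0, 2.0])
```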

7.6. Sparse progressive refinement

The PRM module progressively refines A8 → A4 → A1 to obtain the final matte A. We assume that all predictions are rescaled to the largest size, and the refinement combines the intermediate predictions using the uncertainty indices U.
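The refinement equation is not reproduced in this text. As a rough sketch only, the code below assumes the rule used by the progressive refinement in MGM: a pixel counts as uncertain when the current prediction lies strictly between 0 and 1, and only uncertain pixels are overwritten by the next, finer prediction. The exact definition of U in MaGGIe may differ, and all function names are hypothetical.

```python
def uncertainty(matte):
    """Assumed definition of U: 1 where the matte is strictly between 0 and 1."""
    return [[1 if 0.0 < a < 1.0 else 0 for a in row] for row in matte]


def refine(coarse, fine):
    """Keep confident coarse values, take the finer value where uncertain."""
    u = uncertainty(coarse)
    return [
        [f if m else c for c, f, m in zip(c_row, f_row, u_row)]
        for c_row, f_row, u_row in zip(coarse, fine, u)
    ]


def progressive_refine(a8, a4, a1):
    """A8 -> A4 -> A1, with all inputs already rescaled to the same size."""
    return refine(refine(a8, a4), a1)


a8 = [[0.0, 0.5, 1.0]]
a4 = [[0.3, 0.6, 0.2]]
a1 = [[0.9, 0.7, 0.1]]
a = progressive_refine(a8, a4, a1)  # only the middle pixel is ever updated
```

Under this assumed rule, pixels that the coarsest prediction already marks as pure background or pure foreground are never revisited, so the finer predictions only need to be computed at the uncertain locations.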


7.7. Attention loss and loss weight

Figure 8. Detail Aggregation Module merges sparse features across scales. This module equalizes spatial scales of sparse features using inverse sparse convolution, facilitating their combination.

Figure 9. Temporal Sparsity Between Two Consecutive Frames. The top row displays a pair of successive frames. Below, the second row illustrates the predicted differences by two distinct frameworks, with areas of discrepancy emphasized in white. In contrast to SparseMat’s output, which appears cluttered and noisy, our module generates a more refined sparsity map. This map effectively accentuates the foreground regions that undergo notable changes between the frames, providing a clearer and more focused representation of temporal sparsity. (Best viewed in color).

7.8. Temporal sparsity prediction

A key aspect of our approach is the prediction of temporal sparsity to maintain consistency between frames. This module contrasts the feature maps of consecutive frames to predict their absolute differences. Comprising three convolution layers with batch normalization and ReLU activation, this module processes the concatenated feature maps from two adjacent frames and predicts the binary differences between them.

Unlike SparseMat [50], which relies on manual threshold selection for frame differences, our method offers a more robust and domain-independent approach to determining frame sparsity. This is particularly effective in handling variations in movement, resolution, and domain between frames, as demonstrated in Fig. 9.
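To see why a hand-picked threshold is domain dependent, consider a generic fixed-threshold frame difference. This is only a sketch of that kind of rule, not SparseMat's actual implementation; the function name and the threshold value are hypothetical.

```python
def threshold_sparsity(prev, curr, tau):
    """Mark a pixel as changed when the absolute frame difference exceeds tau."""
    return [
        [1 if abs(c - p) > tau else 0 for p, c in zip(p_row, c_row)]
        for p_row, c_row in zip(prev, curr)
    ]


# The same small change, stored in [0, 1] and in [0, 255], with one threshold.
mask_unit = threshold_sparsity([[0.0, 0.5, 1.0]], [[0.05, 0.9, 1.0]], tau=0.1)
mask_byte = threshold_sparsity([[0, 128, 255]], [[13, 230, 255]], tau=0.1)
```

With the same `tau`, the first pixel is treated as unchanged in the unit-range input but as changed in the byte-range input, so the threshold has to be retuned whenever the value range or noise level shifts. A learned predictor over feature maps avoids that manual step.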

7.9. Forward and backward matte fusion

This fusion enhances temporal consistency and minimizes error propagation.


:::info Authors:

(1) Chuong Huynh, University of Maryland, College Park (chuonghm@cs.umd.edu);

(2) Seoung Wug Oh, Adobe Research (seoh,jolee@adobe.com);

(3) Abhinav Shrivastava, University of Maryland, College Park (abhinav@cs.umd.edu);

(4) Joon-Young Lee, Adobe Research (jolee@adobe.com).

:::


:::info This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.

:::
