Details BSGAL's implementation on the LVIS dataset using CenterNet2 with ResNet-50/Swin-L backbones.

Technical Details: BSGAL Training, Swin-L Backbone, and Dynamic Threshold Strategy


Abstract and 1. Introduction

2. Related Work

  2.1. Generative Data Augmentation

  2.2. Active Learning and Data Analysis

3. Preliminary

4. Our Method

  4.1. Estimation of Contribution in the Ideal Scenario

  4.2. Batched Streaming Generative Active Learning

5. Experiments

  5.1. Offline Setting

  5.2. Online Setting

6. Conclusion, Broader Impact, and References

A. Implementation Details

B. More Ablations

C. Discussion

D. Visualization

A. Implementation Details

A.1. Dataset

We choose LVIS (Gupta et al., 2019) as the dataset for our experiments. LVIS is a large-scale instance segmentation dataset comprising approximately 160,000 images with over 2 million high-quality instance segmentation annotations across 1203 real-world categories. The categories are further divided into three groups based on how many images they appear in: 'rare' categories appear in 1-10 images, 'common' categories in 11-100 images, and 'frequent' categories in more than 100 images. The overall dataset exhibits a long-tail distribution that closely resembles real-world data, and it is widely used in multiple settings, including few-shot segmentation (Liu et al., 2023) and open-world segmentation (Wang et al., 2022; Zhu et al., 2023). We therefore believe that LVIS better reflects a model's performance in real-world scenarios. We use the official LVIS dataset splits, with about 100,000 images in the training set and 20,000 images in the validation set.
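The rare/common/frequent split can be reproduced directly from per-category image counts. A minimal sketch (the function name and example counts are illustrative; the thresholds follow the LVIS definition above):

```python
def lvis_frequency_bucket(image_count: int) -> str:
    """Map a category's image count to its LVIS frequency bucket.

    Thresholds follow the LVIS definition: rare = 1-10 images,
    common = 11-100 images, frequent = more than 100 images.
    """
    if image_count <= 0:
        raise ValueError("category must appear in at least one image")
    if image_count <= 10:
        return "rare"
    if image_count <= 100:
        return "common"
    return "frequent"

# Example: bucket a few hypothetical categories by their image counts.
counts = {"unicycle": 7, "teapot": 58, "person": 12_000}
buckets = {name: lvis_frequency_bucket(n) for n, name in
           ((v, k) for k, v in counts.items())}
```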

A.2. Data Generation

Our data generation and annotation process is consistent with Zhao et al. (2023), and we briefly introduce it here. We first use StableDiffusion V1.5 (Rombach et al., 2022a) (SD) as the generative model. For the 1203 categories in LVIS (Gupta et al., 2019), we generate 1000 images per category at a resolution of 512 × 512. The prompt template for generation is "a photo of a single {CATEGORY NAME}". We use U2Net (Qin et al., 2020), SelfReformer (Yun and Lin, 2022), UFO (Su et al., 2023), and CLIPseg (Lüddecke and Ecker, 2022) respectively to annotate the raw generated images, and select the mask with the highest CLIP score as the final annotation. To ensure data quality, images with CLIP scores below 0.21 are filtered out as low-quality images. During training, we also employ the instance paste strategy provided by Zhao et al. (2023) for data augmentation. For each instance, we randomly resize it to match the distribution of its category in the training set. The maximum number of pasted instances per image is set to 20.
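The mask-selection and filtering step above can be sketched as follows. Here `candidates` maps each segmenter's name to a (mask, CLIP score) pair; the scores and mask placeholders are illustrative, not values from the paper, and only the 0.21 threshold comes from the text:

```python
CLIP_THRESHOLD = 0.21  # images scoring below this are discarded

def select_annotation(candidates):
    """Pick the candidate mask with the highest CLIP score.

    candidates: dict mapping segmenter name -> (mask, clip_score).
    Returns (name, mask, score), or None if even the best candidate
    falls below the quality threshold (image is filtered out).
    """
    best_name = max(candidates, key=lambda k: candidates[k][1])
    mask, score = candidates[best_name]
    if score < CLIP_THRESHOLD:
        return None  # low-quality generation: drop the image entirely
    return best_name, mask, score

# Hypothetical scores for one generated image from the four segmenters.
candidates = {
    "U2Net": ("mask_a", 0.24),
    "SelfReformer": ("mask_b", 0.27),
    "UFO": ("mask_c", 0.19),
    "CLIPseg": ("mask_d", 0.22),
}
chosen = select_annotation(candidates)
```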

In addition, to further expand the diversity of the generated data and make our research more universal, we also use other generative models, including DeepFloyd-IF (Shonenkov et al., 2023) (IF) and Perfusion (Tewel et al., 2023) (PER), with 500 images per category per model. For IF, we use the pre-trained model provided by the authors, and the generated images are the output of Stage II, with a resolution of 256 × 256. For PER, the base model is StableDiffusion V1.5; for each category, we fine-tune the model for 400 steps on images cropped from the training set, and then use the fine-tuned model to generate images.

Table 7. Comparison of different generated data.

We also explore the effect of different generated data on model performance (see Table 7). Compared with using only StableDiffusion V1.5, adding other generative models brings some performance improvement, though the gain is modest. Broken down by frequency group, IF yields a more significant improvement on rare categories, while PER yields a more significant improvement on common categories. This is likely because IF data is more diverse, while PER data matches the training-set distribution more closely. Since the overall performance improves to a certain extent, we adopt the SD + IF + PER generated data for subsequent experiments.

A.3. Model Training

Following Zhao et al. (2023), we use CenterNet2 (Zhou et al., 2021) as our segmentation model, with ResNet-50 (He et al., 2016) or Swin-L (Liu et al., 2022) as the backbone. For ResNet-50, the maximum number of training iterations is set to 90,000, and the model is initialized with weights first pretrained on ImageNet-22k and then finetuned on LVIS (Gupta et al., 2019), as in Zhao et al. (2023); we train on 4 Nvidia 4090 GPUs with a batch size of 16. For Swin-L, the maximum number of training iterations is set to 180,000, and the model is initialized with weights pretrained on ImageNet-22k, since our early experiments show that this initialization brings a slight improvement over weights trained on LVIS; we train on 4 Nvidia A100 GPUs with a batch size of 16. In addition, because Swin-L has a large number of parameters and storing its gradients would occupy substantial extra memory, we actually use the variant in Algorithm 2.

Figure 5. Model performances when using different amounts of generated data.

The remaining unspecified parameters follow the same settings as X-Paste (Zhao et al., 2023), such as the AdamW (Loshchilov and Hutter, 2017) optimizer with an initial learning rate of 1e-4.
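For reference, a single decoupled-weight-decay AdamW update (Loshchilov and Hutter, 2017) can be written out in plain Python on a scalar parameter. Only the 1e-4 learning rate comes from the text; the betas, epsilon, and weight decay below are common defaults chosen for illustration, not values stated in the paper:

```python
import math

def adamw_step(w, g, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-4):
    """One AdamW update on a scalar parameter w with gradient g.

    m, v are the first/second moment estimates, t is the 1-based step.
    Weight decay is decoupled from the gradient-based update.
    """
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)   # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps) - lr * weight_decay * w
    return w, m, v

w, m, v = 0.5, 0.0, 0.0
w, m, v = adamw_step(w, g=0.2, m=m, v=v, t=1)
```

On the first step the bias-corrected update magnitude is close to the learning rate itself, since m_hat / sqrt(v_hat) ≈ sign(g).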

A.4. Data Amount

In this work, we generate over 2 million images. Figure 5 shows model performance when using different amounts of generated data (1%, 10%, 40%, 70%, 100%). Overall, performance improves as the amount of generated data increases, with some fluctuation. Our method consistently outperforms the baseline, which demonstrates its effectiveness and robustness.

A.5. Contribution Estimation

With the gradients normalized, we essentially calculate the cosine similarity. We then conduct an experimental comparison, shown in Table 8: normalizing the gradient brings a certain improvement to our method. In addition, since we need to maintain two different thresholds, it is difficult to keep the acceptance rate consistent. We therefore adopt a dynamic threshold strategy: we pre-set an acceptance rate, maintain a queue that stores the contributions from previous iterations, and dynamically adjust the threshold according to the queue so that the acceptance rate stays at the pre-set value.

Table 8. Comparison of using gradient normalization or not.

Figure 6. Illustration of noisy images exhibiting various noise scales and categories. Each row, from top to bottom, corresponds to a different noise level: 0, 40, 100, 200, and 400, respectively. All images are sourced from the CIFAR-10 dataset.
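The dynamic threshold strategy can be sketched with a fixed-length queue of recent contribution scores, setting the threshold to the quantile that matches the target acceptance rate. All names, the window size, and the quantile formulation are illustrative assumptions, not the paper's exact procedure:

```python
from collections import deque

class DynamicThreshold:
    """Keep the acceptance rate near a pre-set target by setting the
    threshold to the corresponding quantile of recent contributions."""

    def __init__(self, rate, window=1000):
        self.rate = rate                   # target acceptance rate, e.g. 0.5
        self.history = deque(maxlen=window)  # queue of recent contributions

    def accept(self, contribution):
        self.history.append(contribution)
        ranked = sorted(self.history, reverse=True)
        # Threshold = score of the k-th best recent sample, so that
        # roughly `rate` of recent samples would have been accepted.
        k = max(1, int(len(ranked) * self.rate))
        threshold = ranked[k - 1]
        return contribution >= threshold

selector = DynamicThreshold(rate=0.5, window=100)
decisions = [selector.accept(c) for c in (0.9, 0.1, 0.7, 0.3, 0.5)]
```

Because the queue has a fixed maximum length, the threshold tracks the recent contribution distribution rather than the whole stream, which is what keeps the realized acceptance rate near the pre-set one as the data distribution drifts.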

A.6. Toy Experiment

The following are the specific experimental settings on CIFAR-10. We employ a simple ResNet18 as the baseline model and train for 200 epochs; the accuracy after training on the original training set is 93.02%. The learning rate is set to 0.1, using the SGD optimizer with a momentum of 0.9 and a weight decay of 5e-4, together with a cosine annealing learning rate scheduler. The constructed noisy images are depicted in Figure 6: image quality declines as the noise level increases, and once the noise level reaches 200 the images become significantly difficult to identify. For Table 1, we use Split1 as R, while G consists of 'Split2 + Noise40', 'Split3 + Noise100', 'Split4 + Noise200',
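The cosine annealing schedule used above can be written out explicitly. With the stated base rate of 0.1 over 200 epochs, and assuming a minimum rate of 0 (a common default, not stated in the paper), the learning rate at epoch t is:

```python
import math

def cosine_annealing_lr(t, base_lr=0.1, total_epochs=200, min_lr=0.0):
    """Learning rate at epoch t under cosine annealing:
    lr(t) = min_lr + 0.5 * (base_lr - min_lr) * (1 + cos(pi * t / T)).
    """
    return min_lr + 0.5 * (base_lr - min_lr) * (
        1 + math.cos(math.pi * t / total_epochs))

# Full schedule for the 200-epoch run described above.
schedule = [cosine_annealing_lr(t) for t in range(201)]
```

The schedule starts at the base rate, passes through half of it at the midpoint, and decays smoothly to the minimum by the final epoch.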

A.7. A Simplification: Only Forward Once


:::info Authors:

(1) Muzhi Zhu, Zhejiang University, China (equal contribution);

(2) Chengxiang Fan, Zhejiang University, China (equal contribution);

(3) Hao Chen, Zhejiang University, China ([email protected]);

(4) Yang Liu, Zhejiang University, China;

(5) Weian Mao, Zhejiang University, China and The University of Adelaide, Australia;

(6) Xiaogang Xu, Zhejiang University, China;

(7) Chunhua Shen, Zhejiang University, China ([email protected]).

:::


:::info This paper is available on arxiv under CC BY-NC-ND 4.0 Deed (Attribution-Noncommercial-Noderivs 4.0 International) license.

:::

