Proposes BSGAL, a gradient-based method for effective screening and utilization of generated data to improve downstream perception tasks.

BSGAL: Gradient-Based Screening for Long-Tailed Perception Tasks

2025/12/06 21:00

Abstract and 1. Introduction

  2. Related work

    2.1. Generative Data Augmentation

    2.2. Active Learning and Data Analysis

  3. Preliminary

  4. Our method

    4.1. Estimation of Contribution in the Ideal Scenario

    4.2. Batched Streaming Generative Active Learning

  5. Experiments and 5.1. Offline Setting

    5.2. Online Setting

  6. Conclusion, Broader Impact, and References

A. Implementation Details

B. More ablations

C. Discussion

D. Visualization

6. Conclusion

In this paper, we pose a new problem: how to screen and utilize generated data effectively so as to further improve the performance of downstream perception tasks. To address it, we propose a gradient-based method for estimating the contribution of generated data and embed it into the actual training process. We design a complete pipeline that automatically generates data to improve downstream perception performance. Experiments show that our method outperforms both unfiltered and CLIP-filtered baselines on long-tailed segmentation tasks.
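To make the idea of gradient-based contribution estimation concrete, here is a minimal sketch. It is not the paper's exact formulation. It assumes a first-order approximation in which the contribution of a generated batch is the dot product between its loss gradient and the loss gradient on a batch of real data. The toy logistic model and the names `grad_logistic`, `contribution`, and `screen` are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grad_logistic(w, batch):
    """Mean gradient of the logistic loss over a batch of (features, label) pairs."""
    g = [0.0] * len(w)
    for x, y in batch:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        for i, xi in enumerate(x):
            g[i] += (p - y) * xi / len(batch)
    return g


def contribution(w, generated_batch, real_batch):
    """Assumed first-order estimate: dot product of the two batch gradients.

    A positive value suggests that a small gradient step on the generated
    batch would also lower the loss on the real batch.
    """
    g_gen = grad_logistic(w, generated_batch)
    g_real = grad_logistic(w, real_batch)
    return sum(a * b for a, b in zip(g_gen, g_real))


def screen(w, generated_batches, real_batch, threshold=0.0):
    """Keep only generated batches whose estimated contribution exceeds the threshold."""
    return [b for b in generated_batches
            if contribution(w, b, real_batch) > threshold]


# Toy data: the label is 1 when the first feature is positive.
w = [0.0, 0.0]
real_batch = [([1.0, 0.0], 1), ([-1.0, 0.0], 0)]
good_generated = [([2.0, 0.0], 1)]   # label agrees with the real data
bad_generated = [([2.0, 0.0], 0)]    # mislabelled generated sample

kept = screen(w, [good_generated, bad_generated], real_batch)
print(len(kept))
```

In this toy setting the mislabelled batch produces a gradient that points against the real-data gradient, so it is screened out. The zero threshold is an arbitrary choice for the sketch; a training loop would have to decide how to set it and how often to recompute the real-data gradient.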

Broader Impact

Our goal is to advance the field of machine learning. There are many potential societal consequences of our work, none of which we feel must be specifically highlighted here.


:::info Authors:

(1) Muzhi Zhu, Zhejiang University, China (equal contribution);

(2) Chengxiang Fan, Zhejiang University, China (equal contribution);

(3) Hao Chen, Zhejiang University, China ([email protected]);

(4) Yang Liu, Zhejiang University, China;

(5) Weian Mao, Zhejiang University, China and The University of Adelaide, Australia;

(6) Xiaogang Xu, Zhejiang University, China;

(7) Chunhua Shen, Zhejiang University, China ([email protected]).

:::


:::info This paper is available on arXiv under the CC BY-NC-ND 4.0 Deed (Attribution-NonCommercial-NoDerivs 4.0 International) license.

:::
