Search results

About 595,782 results

  1. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

    Ze Liu, Yutong Lin, Yue Cao, Han Hu - 2021 IEEE/CVF International Conference on Computer Vision (ICCV) - 2021
    This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision. Challenges in adapting Transformer from language to vision arise from differences between the two domains, such as large variations in the scale of visual entities and the high resolution of pixels in images compared to words in text. To address these differences, we propose a…
    Cited by: 27,437
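
    A minimal sketch of the shifted-window idea named in the title (my own illustration, not the paper's official code; the window size of 4 and channel count of 96 are arbitrary toy values): the feature map is cyclically shifted, then split into non-overlapping windows, and self-attention would run inside each window.

    ```python
    import torch

    def window_partition(x, window_size):
        # Split a (B, H, W, C) feature map into non-overlapping windows
        # of shape (num_windows * B, window_size, window_size, C).
        B, H, W, C = x.shape
        x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
        return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size, window_size, C)

    x = torch.randn(1, 8, 8, 96)                           # toy feature map
    shifted = torch.roll(x, shifts=(-4, -4), dims=(1, 2))  # cyclic shift between consecutive blocks
    windows = window_partition(shifted, window_size=4)     # attention runs within each window
    print(windows.shape)                                   # torch.Size([4, 4, 4, 96])
    ```
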
  2. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn - arXiv (Cornell University) - 2020
    While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place. We show that this reliance on CNNs is not necessary and a pure tra…
    Cited by: 21,100
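
    The title's "16x16 words" can be sketched in a few lines (an illustration assuming the ViT-Base dimensions from the paper, not the authors' code): a convolution whose stride equals its kernel size carves the image into non-overlapping 16x16 patches and linearly projects each patch into a token.

    ```python
    import torch
    import torch.nn as nn

    # Patch embedding: kernel_size == stride == 16 gives non-overlapping
    # patches; 768 is the ViT-Base embedding width.
    patch_embed = nn.Conv2d(in_channels=3, out_channels=768, kernel_size=16, stride=16)

    img = torch.randn(1, 3, 224, 224)
    tokens = patch_embed(img).flatten(2).transpose(1, 2)
    print(tokens.shape)  # torch.Size([1, 196, 768]); 224/16 = 14, so 14*14 = 196 tokens
    ```
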
  3. Transformers: State-of-the-Art Natural Language Processing

    Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond - 2020
    Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Remi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, Alexander Rush. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demon…
    Cited by: 7,594
  4. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting

    Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang - Proceedings of the AAAI Conference on Artificial Intelligence - 2021
    Many real-world applications require the prediction of long sequence time-series, such as electricity consumption planning. Long sequence time-series forecasting (LSTF) demands a high prediction capacity of the model, which is the ability to capture precise long-range dependency coupling between output and input efficiently. Recent studies have shown the potential of Transformer to increase the prediction capacity. H…
    Cited by: 5,203
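
    The quadratic cost motivating this line of work is easy to see numerically (a back-of-the-envelope sketch, not Informer's ProbSparse mechanism): vanilla self-attention materializes an L x L score matrix, so memory grows with the square of the sequence length.

    ```python
    import torch

    L, d = 4096, 64                      # long input sequence, one attention head
    q, k = torch.randn(L, d), torch.randn(L, d)
    scores = (q @ k.T) / d ** 0.5        # (4096, 4096) score matrix
    print(scores.numel() * 4 / 2**20)    # 64.0 -> ~64 MiB of float32 per head, per layer
    ```
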
  5. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

    Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee - arXiv (Cornell University) - 2019
    Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified frame…
    Cited by: 8,311
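
    The unified text-to-text framing means every task is string-in, string-out, with a task prefix selecting the behavior. A minimal usage sketch with the public "t5-small" checkpoint via the Hugging Face transformers library (real API; the generated text may vary across model and library versions):

    ```python
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    # The "translate English to German:" prefix casts translation as text-to-text.
    inputs = tokenizer("translate English to German: The house is wonderful.",
                       return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    ```
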
  6. Emerging Properties in Self-Supervised Vision Transformers

    Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou - 2021 IEEE/CVF International Conference on Computer Vision (ICCV) - 2021
    In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets). Beyond the fact that adapting self-supervised methods to this architecture works particularly well, we make the following observations: first, self-supervised ViT features contain explicit information about the semantic segmentation of an image, which…
    Cited by: 4,487
  7. SwinIR: Image Restoration Using Swin Transformer

    Jingyun Liang, Jiezhang Cao, Guolei Sun, Kai Zhang - 2021
    Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline…
    Cited by: 3,769
  8. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions

    Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan - 2021 IEEE/CVF International Conference on Computer Vision (ICCV) - 2021
    Although convolutional neural networks (CNNs) have achieved great success in computer vision, this work investigates a simpler, convolution-free backbone network useful for many dense prediction tasks. Unlike the recently proposed Vision Transformer (ViT) that was designed for image classification specifically, we introduce the Pyramid Vision Transformer (PVT), which overcomes the difficulties of porting Transformer…
    Cited by: 4,412
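
    The pyramid idea can be sketched independently of the paper's exact design (illustrative dimensions, not PVT's actual configuration): each stage reduces spatial resolution with a strided patch embedding, yielding the CNN-style multi-scale feature maps that dense prediction heads expect.

    ```python
    import torch
    import torch.nn as nn

    # Strided patch embeddings shrink the feature map stage by stage; in PVT a
    # Transformer encoder processes the tokens at each stage (omitted here).
    stages = nn.ModuleList([
        nn.Conv2d(3, 64, kernel_size=4, stride=4),
        nn.Conv2d(64, 128, kernel_size=2, stride=2),
        nn.Conv2d(128, 256, kernel_size=2, stride=2),
    ])

    x = torch.randn(1, 3, 224, 224)
    for stage in stages:
        x = stage(x)
        print(x.shape)  # (1,64,56,56) -> (1,128,28,28) -> (1,256,14,14)
    ```
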
  9. TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

    Jieneng Chen, Yongyi Lu, Qihang Yu, Xiangde Luo - arXiv (Cornell University) - 2021
    Medical image segmentation is an essential prerequisite for developing healthcare systems, especially for disease diagnosis and treatment planning. On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard and achieved tremendous success. However, due to the intrinsic locality of convolution operations, U-Net generally demonstrates limitations in exp…
    Cited by: 3,755
  10. HuggingFace's Transformers: State-of-the-art Natural Language Processing

    Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond - arXiv (Cornell University) - 2019
    Recent progress in natural language processing has been driven by advances in both model architecture and model pretraining. Transformer architectures have facilitated building higher-capacity models and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks. Transformers is an open-source library with the goal of opening up these advances to the wider machine learn…
    Cited by: 3,114
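
    The library's pipeline API is the shortest path to using a pretrained model; a one-line usage example (real API, though the default checkpoint it downloads can change between library versions):

    ```python
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")  # downloads a default pretrained model
    print(classifier("Transformers make state-of-the-art NLP broadly accessible."))
    # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
    ```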