Bridging Text and Image for Artist Style Transfer via Contrastive Learning
Title: | Bridging Text and Image for Artist Style Transfer via Contrastive Learning |
---|---|
Authors: | Liu, Zhi-Song; Wang, Li-Wen; Xiao, Jun; Kalogeiton, Vicky |
Publication Year: | 2024 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Computer Vision and Pattern Recognition; Computer Science - Human-Computer Interaction |
More Details: | Image style transfer has attracted widespread attention in the past few years. Despite its remarkable results, it requires additional style images as references, which makes it less flexible and less convenient. Text is the most natural way to describe a style. More importantly, text can describe implicit, abstract styles, such as those of specific artists or art movements. In this paper, we propose Contrastive Learning for Artistic Style Transfer (CLAST), which leverages advanced image-text encoders to control arbitrary style transfer. We introduce a supervised contrastive training strategy to effectively extract style descriptions from the image-text model (i.e., CLIP), which aligns stylization with the text description. To this end, we also propose a novel and efficient adaLN-based state space model that explores style-content fusion. Finally, we achieve text-driven image style transfer. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods in artistic style transfer. More importantly, it does not require online fine-tuning and can render a 512x512 image in 0.03s. Comment: 18 pages, 8 figures. arXiv admin note: substantial text overlap with arXiv:2202.13562 |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2410.09566 |
Accession Number: | edsarx.2410.09566 |
Database: | arXiv |
Publication Date: | 2024-10-12 |