Integrating Contrastive Learning and Transformer Technologies for Personalized Outfit Recommendations Using Generative AI
Keywords:
Generative AI, Contrastive Learning, Transformer, Fashion Recommendation, CLIP, BM25, Outfit Generation, Personalized Styling, Multimodal Embeddings, Real-Time AI System
Abstract
Demand for AI-driven fashion recommendation systems is growing, driven by the complexity of user preferences and the limitations of traditional filtering and keyword-based approaches. While existing models attempt to align visual and textual modalities, many fall short of delivering real-time, personalized, and context-aware outfit suggestions. This study designs a generative AI-based outfit recommendation system that integrates contrastive learning with transformer architectures to deliver prompt-to-outfit recommendations from natural language queries. The proposed framework uses a multi-stage pipeline combining CLIP for contrastive image-text embedding, BM25 for semantic text relevance, and a transformer-based generative model for sequential outfit creation. A unified dataset compiled from FashionIQ, Kaggle, social media scraping, and a custom composite dataset was used for model training and validation. The model was evaluated using Top-K accuracy, macro F1-score, BLEU score, and real-time inference latency. Results show a Top-1 accuracy of 83.7%, a macro F1-score of 0.862, and an average BLEU score of 0.77, outperforming baseline models such as Style2Vec [3], OutfitTransformer [4], and CP-TransMatch [13]. Moreover, the system reduced inference latency by 45.2%, achieving real-time responses under 400 ms. This study highlights the potential of multimodal generative modeling for interactive and inclusive fashion recommendations. The integration of a feedback loop enables adaptive learning, positioning the system as a robust, scalable solution for e-commerce, digital wardrobe assistants, and stylist AI applications.
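The retrieval stage described in the abstract pairs BM25 text relevance with contrastive image-text similarity. The following is a minimal sketch of one way such scores could be fused, not the authors' implementation. The function names (`bm25_scores`, `hybrid_rank`), the weight `alpha`, the weighted-sum fusion, and the three-dimensional toy vectors are all illustrative assumptions. In a real system the vectors would come from a CLIP encoder, and the abstract does not state how the two signals are combined.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs_tokens:
        df.update(set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def min_max(xs):
    """Rescale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(xs), max(xs)
    return [0.0] * len(xs) if hi == lo else [(x - lo) / (hi - lo) for x in xs]


def hybrid_rank(query_tokens, query_vec, docs_tokens, doc_vecs, alpha=0.5):
    """Rank items by a weighted sum of BM25 and embedding similarity.

    alpha is a hypothetical fusion weight (assumption, not from the paper).
    """
    lexical = min_max(bm25_scores(query_tokens, docs_tokens))
    visual = min_max([cosine(query_vec, v) for v in doc_vecs])
    fused = [alpha * l + (1 - alpha) * s for l, s in zip(lexical, visual)]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)


# Toy catalog: captions plus made-up stand-ins for CLIP image embeddings.
captions = [
    "red summer dress floral",
    "black leather jacket",
    "white linen summer shirt",
]
docs_tokens = [c.split() for c in captions]
doc_vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9], [0.6, 0.7, 0.1]]

query_tokens = "red summer dress".split()
query_vec = [1.0, 0.2, 0.0]  # stand-in for a CLIP text embedding

ranking = hybrid_rank(query_tokens, query_vec, docs_tokens, doc_vecs)
print([captions[i] for i in ranking])
```

Min-max normalization is used here only because raw BM25 scores and cosine similarities live on different scales. Other fusion schemes, such as reciprocal rank fusion, would fit the same slot.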
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. You are free to share and adapt the material, but only for non-commercial purposes. You must give appropriate credit to the author(s).

