SightSync

SightSync is an interactive Streamlit application that combines computer vision and natural language processing to generate captions for uploaded images and videos and recommend songs based on those captions using the Spotify API.

Deployed Application

The application is deployed on HuggingFace. You can access it using the following link: SightSync

Features

  • Image Captioning: Upload an image and get a caption for it using the nlpconnect/vit-gpt2-image-captioning HuggingFace model.
  • Song Recommendation: Get song recommendations based on the caption generated for the uploaded image using the Spotify API.
  • Custom Song Recommendation: Get song recommendations based on the keyword clicked by the user.
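
As a rough illustration of the Image Captioning feature, the sketch below loads the nlpconnect/vit-gpt2-image-captioning model with the transformers library and generates a caption for one image. The helper names `caption_image` and `decode_captions`, and the `max_length` and `num_beams` values, are assumptions made for this example and may not match what app.py actually does.

```python
from typing import List


def decode_captions(decoded: List[str]) -> List[str]:
    """Strip the stray whitespace that decoded model output often carries."""
    return [text.strip() for text in decoded]


def caption_image(image_path: str, max_length: int = 16, num_beams: int = 4) -> str:
    """Generate one caption for the image at image_path (hypothetical helper)."""
    # Heavy imports stay inside the function so importing this module is cheap.
    from PIL import Image
    from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

    model_name = "nlpconnect/vit-gpt2-image-captioning"
    model = VisionEncoderDecoderModel.from_pretrained(model_name)
    processor = ViTImageProcessor.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # The model expects RGB input, so convert grayscale or RGBA uploads first.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Beam search settings are illustrative defaults, not values from the project.
    output_ids = model.generate(pixel_values, max_length=max_length, num_beams=num_beams)
    decoded = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return decode_captions(decoded)[0]
```

Calling `caption_image("photo.jpg")` downloads the model weights on first use, so the first call is noticeably slower than later ones.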

Prerequisites

Before running the application, complete the following setup steps:

  1. Create Spotify API credentials by following the instructions here

  2. Create a .env file in the project directory (the SightSync folder created by the clone in step 4) and add the following credentials:

    SPOTIPY_CLIENT_ID=your_client_id
    SPOTIPY_CLIENT_SECRET=your_client_secret

  3. Create a virtual environment (optional) using the following commands:

    python -m venv sightsync
    source sightsync/bin/activate

  4. Clone the repository using the following command:

    git clone https://github.com/VanditGupta/SightSync.git

  5. Navigate to the project directory using the following command:

    cd SightSync

  6. Install the required libraries using the following command:

    pip install -r requirements.txt
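
Before launching the app, it can help to confirm that both Spotify credentials are actually set. The sketch below is a minimal check that uses only the standard library. The helpers `parse_env_text` and `missing_credentials` are hypothetical names introduced for this example, and the project itself may load the .env file differently (for instance with python-dotenv).

```python
import os
from typing import Dict, List, Mapping, Optional

# Spotipy reads these two variables from the environment.
REQUIRED_VARS = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_credentials(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are absent or empty."""
    source = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not source.get(name)]


if __name__ == "__main__":
    # Assumes a .env file in the current directory, as described in step 2.
    with open(".env", encoding="utf-8") as handle:
        missing = missing_credentials(parse_env_text(handle.read()))
    print("Missing credentials:", missing or "none")
```

An empty result means both values are present. It does not confirm that Spotify accepts them.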

Running the Application

To run the application, use the following command:

streamlit run app.py

Open the application in your browser using the local URL printed in the terminal (Streamlit uses port 8501 by default):

http://localhost:PORT_NUMBER/

Technologies Used

  • Streamlit - The web framework used
  • HuggingFace - The platform hosting the model used for image captioning
  • Spotify API - The API used for song recommendation
  • PyTorch - The deep learning library used for image captioning
  • Spotipy - The Python library used for the Spotify API

Working

The user uploads an image, and the Streamlit application uses the nlpconnect/vit-gpt2-image-captioning HuggingFace model to generate a caption for it. The application then preprocesses the caption using NLP techniques and queries the Spotify API to recommend songs based on the result. The user can also click on any of the generated keywords to get song recommendations based on that keyword.
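
The caption-to-songs step described above could look roughly like the sketch below. This README does not specify the preprocessing, so the stopword filter here is a stand-in, and `STOPWORDS`, `extract_keywords`, and `recommend_tracks` are hypothetical names rather than the project's own. The Spotify lookup uses Spotipy's client credentials flow and assumes SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET are set in the environment.

```python
import re
from typing import Dict, List

# A tiny illustrative stopword list; a real app might use a fuller one.
STOPWORDS = {
    "a", "an", "the", "on", "in", "of", "with", "and", "is", "are",
    "at", "to", "for", "by", "it", "its", "next",
}


def extract_keywords(caption: str) -> List[str]:
    """Lowercase the caption and keep unique non-stopword words in order."""
    keywords: List[str] = []
    seen = set()
    for word in re.findall(r"[a-z]+", caption.lower()):
        if word in STOPWORDS or word in seen:
            continue
        seen.add(word)
        keywords.append(word)
    return keywords


def recommend_tracks(keywords: List[str], limit: int = 5) -> List[Dict[str, str]]:
    """Search Spotify for tracks matching the keywords (needs network access)."""
    # Imported lazily so keyword extraction works without Spotipy installed.
    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials

    client = spotipy.Spotify(auth_manager=SpotifyClientCredentials())
    results = client.search(q=" ".join(keywords), type="track", limit=limit)
    return [
        {
            "name": item["name"],
            "artist": item["artists"][0]["name"],
            "url": item["external_urls"]["spotify"],
        }
        for item in results["tracks"]["items"]
    ]
```

Clicking a single keyword in the app would then correspond to a call such as `recommend_tracks(["beach"])`, while the default recommendation passes every keyword extracted from the caption.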

Screenshots

  1. Home Page

  2. Image Captioning

  3. Song Recommendation

  4. Custom Song Recommendation

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. Please read the CONTRIBUTING.md file for more details.

License

This project is licensed under the MIT License. Please read the LICENSE.md file for more details.

Contact

For any inquiries or contributions, please contact me at gupta.vandi@northeastern.edu

Project Status

This project is currently in active development. For the latest updates, please check our GitHub repository.

About

SightSync: Multimedia magic with Spotify API, Hugging Face models, and advanced Language Model (LLM). Seamlessly generate image captions and receive personalized song recommendations based on visual context. Explore captivating visuals and curated melodies in one platform.
