SightSync is an interactive Streamlit application that combines computer vision and natural language processing to generate captions for uploaded images and videos and recommend songs based on those captions using the Spotify API.
The application is deployed on HuggingFace. You can access it using the following link: SightSync
- Image Captioning: Upload an image and get a caption for it using the nlpconnect/vit-gpt2-image-captioning HuggingFace model.
- Song Recommendation: Get song recommendations based on the caption generated for the uploaded image using the Spotify API.
- Custom Song Recommendation: Get song recommendations based on the keyword clicked by the user.
Before running the application, make sure the required libraries are installed. You can set everything up using the following steps:

1. Create Spotify API credentials (a client ID and a client secret) through the Spotify for Developers dashboard.
2. Create a .env file in the project directory (once the repository has been cloned) and add the following credentials:
    SPOTIPY_CLIENT_ID=your_client_id
    SPOTIPY_CLIENT_SECRET=your_client_secret
3. Create a virtual environment (optional) using the following commands:
    python -m venv sightsync
    source sightsync/bin/activate
4. Clone the repository using the following command:
    git clone https://github.com/VanditGupta/SightSync.git
5. Navigate to the project directory using the following command:
    cd SightSync
6. Install the required libraries using the following command:
    pip install -r requirements.txt

To run the application, use the following command:
    streamlit run app.py

Open the application in your browser using the local URL provided in the terminal:
    http://localhost:PORT_NUMBER/

- Streamlit - The web framework used
- HuggingFace - The model used for image captioning
- Spotify API - The API used for song recommendation
- PyTorch - The deep learning library used for image captioning
- Spotipy - The Python library used for the Spotify API
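
Spotipy's client-credentials flow can read SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET from the environment, so the values in the .env file need to be loaded before the Spotify client is created. The following is a minimal standard-library sketch of that step, not the loader the app actually uses (it may well rely on a package such as python-dotenv). The function names `parse_env_file` and `load_credentials` are hypothetical.

```python
import os
from pathlib import Path

# The two variable names come from the setup instructions above.
REQUIRED_KEYS = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET")


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_credentials(path=".env", environ=None):
    """Copy .env values into the environment; return any required keys still missing."""
    environ = os.environ if environ is None else environ
    if Path(path).is_file():
        for key, value in parse_env_file(path).items():
            # Variables already set in the real environment take precedence.
            environ.setdefault(key, value)
    return [key for key in REQUIRED_KEYS if not environ.get(key)]
```

Calling `load_credentials()` at startup and stopping with a clear message when the returned list is non-empty makes a missing or misnamed credential easier to spot than a failed Spotify request later on.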
The user uploads an image, and the Streamlit application uses the nlpconnect/vit-gpt2-image-captioning HuggingFace model to generate a caption for it. The application then preprocesses the caption using NLP techniques and queries the Spotify API to recommend songs based on the result. The user can also click on any of the generated keywords to get song recommendations based on that keyword.
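
The caption-to-songs step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the app's actual code: the README does not say which NLP techniques are used, so the sketch assumes simple lowercasing, tokenization, and stop-word removal. The stop-word list and the function names `extract_keywords` and `recommend_tracks` are hypothetical. The `sp` argument is assumed to be a `spotipy.Spotify` client, whose `search` method returns tracks under `["tracks"]["items"]`.

```python
import re

# Small illustrative stop-word list; an assumption, not the app's actual list.
STOP_WORDS = {
    "a", "an", "the", "of", "in", "on", "at", "with",
    "and", "is", "are", "to", "by", "for", "near", "next",
}


def extract_keywords(caption):
    """Lowercase the caption, tokenize it, and drop stop words and repeats."""
    tokens = re.findall(r"[a-z]+", caption.lower())
    keywords = []
    for token in tokens:
        if token not in STOP_WORDS and token not in keywords:
            keywords.append(token)
    return keywords


def recommend_tracks(sp, keywords, limit=5):
    """Search Spotify for tracks matching the keywords.

    `sp` is assumed to be a spotipy.Spotify client (or any object that
    exposes a compatible `search` method).
    Returns a list of (track name, artist names) tuples.
    """
    if not keywords:
        return []
    results = sp.search(q=" ".join(keywords), type="track", limit=limit)
    return [
        (item["name"], ", ".join(artist["name"] for artist in item["artists"]))
        for item in results["tracks"]["items"]
    ]
```

With real credentials in place, a client could be created with `spotipy.Spotify(auth_manager=SpotifyClientCredentials())` and passed in as `sp`. Passing a single-element list such as `["beach"]` corresponds to the custom recommendation that follows a keyword click.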
- Home Page
- Image Captioning
- Song Recommendation
- Custom Song Recommendation
Contributions are welcome! Please feel free to submit a Pull Request. Please read the CONTRIBUTING.md file for more details.
This project is licensed under the MIT License. Please read the LICENSE.md file for more details.
For any inquiries or contributions, please contact me at gupta.vandi@northeastern.edu
This project is currently in active development. For the latest updates, please check our GitHub repository.




