Real-Time Speech-to-Speech Chatbot Based on Ollama & Kokoro - Install Locally
AI Summary
Overview
- Project: Real-time speech-to-speech chatbot
- Integrates:
  - Speech recognition
  - AI-driven reasoning
  - Web information access
  - Natural-sounding voice output
- Powered by Whisper, Llama 3.1, Kokoro, and the Agno agent framework
Installation Steps
- Ollama Installation
  - Visit ollama.com and download the installer.
  - For Linux, run the install command from the download page in a terminal.
  - For Mac/Windows, download the executable.
- Create Virtual Environment
  - Use Conda to create a virtual environment.
- Clone Project Repository
  - Clone the Vocal Agent repository.
- Install Dependencies
  - From the repo's root, run: pip install -r requirements.txt
- Text-to-Speech Synthesis
  - Use the Kokoro models available in the repository.
- Download Ollama Model
  - To check which models are already installed, run: ollama list
  - Download the Llama 3.1 8-billion-parameter model.
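
Before running the tool, it can help to confirm that the Ollama CLI is installed and that the model is already downloaded. The following is a minimal sketch, not part of the project: the helper names are hypothetical, the model tag llama3.1:8b is an assumption about how the model is named locally, and the parser assumes that the list output starts with a header row and puts the model name in the first column.

```python
import shutil
import subprocess


def model_is_listed(list_output: str, model: str) -> bool:
    """Return True if any row after the header starts with the model name."""
    for line in list_output.splitlines()[1:]:  # assumption: first line is a header
        fields = line.split()
        if fields and fields[0].startswith(model):
            return True
    return False


def check_ollama(model: str = "llama3.1:8b") -> str:
    """Report whether the ollama CLI is on PATH and whether the model is listed.

    The default model tag is an assumption; adjust it to match your setup.
    """
    if shutil.which("ollama") is None:
        return "ollama not found on PATH"
    result = subprocess.run(
        ["ollama", "list"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return "ollama list failed (is the Ollama service running?)"
    if model_is_listed(result.stdout, model):
        return f"{model} is installed"
    return f"{model} is missing; try: ollama pull {model}"
```

Calling print(check_ollama()) from the activated environment gives a one-line status. If the model was pulled under a different tag, the check reports it as missing even though a variant is present.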
Running the Tool
- From the project root, run: python3 main.py
- The first run may require additional model downloads.
- Handle input format issues (ensure the input is a valid list).
Observations
- The initial errors were caused by the input format; the fix was to convert the string input to a list.
- The tool works after these code adjustments, but the interface design still needs improvement.
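
The summary does not say which call needed the string-to-list fix, so the following is only a sketch of one way to make such a conversion defensive. The helper name ensure_list is hypothetical and is not taken from the repository.

```python
import ast
from typing import Any, List


def ensure_list(value: Any) -> List[Any]:
    """Coerce a value into a list, parsing list-like strings when possible."""
    if isinstance(value, list):
        return value
    if isinstance(value, (tuple, set)):
        return list(value)
    if isinstance(value, str):
        text = value.strip()
        # Only try to parse strings that look like a list literal.
        if text.startswith("[") and text.endswith("]"):
            try:
                parsed = ast.literal_eval(text)
            except (ValueError, SyntaxError):
                parsed = None
            if isinstance(parsed, list):
                return parsed
        # Plain strings are wrapped, not split into characters.
        return [value]
    return [value]
```

A plain string such as "hello" becomes ["hello"], while a string that already holds a list literal, such as '["a", "b"]', is parsed into the list it describes. ast.literal_eval only evaluates literals, so it is a safer choice than eval for untrusted input.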
Conclusion
- Project works but has bugs that need addressing.
- Suggestion: Develop a cleaner interface, possibly using Gradio.
- The repository link will be provided in the video description.
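
As a rough illustration of the Gradio suggestion, the sketch below wraps a placeholder function in a simple audio-in, text-out interface. It is not the project's code: respond is a hypothetical stand-in for the real speech-to-speech pipeline, and gradio is a third-party package that has to be installed separately.

```python
from typing import Optional


def respond(audio_path: Optional[str]) -> str:
    """Hypothetical placeholder for the project's speech-to-speech pipeline."""
    if not audio_path:
        return "No audio received."
    # A real version would transcribe the audio, query the agent,
    # and synthesize a spoken reply here.
    return f"Received audio file: {audio_path}"


def build_demo():
    """Build a Gradio interface around respond(); gradio is imported lazily."""
    import gradio as gr  # third-party: pip install gradio

    return gr.Interface(
        fn=respond,
        inputs=gr.Audio(type="filepath"),
        outputs=gr.Textbox(label="Response"),
        title="Vocal Agent (sketch)",
    )
```

Calling build_demo().launch() starts a local web page for the interface. Importing gradio inside build_demo keeps respond usable and testable even where gradio is not installed.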