Real-Time Speech-to-Speech Chatbot Based on Ollama & Kokoro - Install Locally



AI Summary

Overview

  • Project: Real-time speech-to-speech chatbot
  • Integration of:
    • Speech recognition
    • AI-driven reasoning
    • Web information access
    • Natural sounding voice output
  • Powered by Whisper, Llama 3.1, Kokoro, and the Agno agent framework
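
The way these components chain together in one conversational turn can be sketched as below. This is a minimal illustration, not the repository's code: `run_turn` and its three callable parameters are hypothetical names, and the stand-ins in the usage lines take the place of the real Whisper, Llama 3.1/agent, and Kokoro calls.

```python
from typing import Callable


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    reason: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run one speech-to-speech turn through three pluggable stages.

    All three callables are hypothetical placeholders:
      transcribe -> speech recognition (e.g. a Whisper wrapper)
      reason     -> LLM/agent reply (e.g. Llama 3.1 behind an agent)
      synthesize -> text-to-speech (e.g. a Kokoro wrapper)
    """
    user_text = transcribe(audio)    # speech -> text
    reply_text = reason(user_text)   # text -> reply text
    return synthesize(reply_text)    # reply text -> speech


# Usage with trivial stand-ins, just to show the data flow.
reply_audio = run_turn(
    b"hello",
    transcribe=lambda a: a.decode("utf-8"),
    reason=lambda t: t.upper(),
    synthesize=lambda t: t.encode("utf-8"),
)
```

Keeping each stage behind a plain callable makes it possible to swap one model for another without touching the rest of the loop.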

Installation Steps

  1. Ollama Installation
    • Visit ollama.com and download the installer.
    • For Linux, run the install command in a terminal.
    • For Mac/Windows, download the executable.
  2. Create Virtual Environment
    • Use Conda to create a virtual environment.
  3. Clone Project Repository
    • Clone the Vocal Agent repository.
  4. Install Dependencies
    • Run pip install -r requirements.txt from the repo’s root.
  5. Text-to-Speech Synthesis
    • Use the Kokoro models available in the repository.
  6. Download Ollama Model
    • Run ollama list to check for installed models.
    • Download the Llama 3.1 8B (8 billion parameter) model.
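
The model check in step 6 could be scripted roughly as follows. This is a sketch under assumptions: it assumes the output of ollama list is a header row followed by one row per model with the model name in the first column, and the helper names (`installed_models`, `has_model`, `read_listing`) are made up for illustration.

```python
import subprocess
from typing import List


def installed_models(listing: str) -> List[str]:
    """Return the first column of each data row in `ollama list` output.

    Assumes the first line is a header row (an assumption about the CLI's
    output layout, not something stated in the summary).
    """
    names = []
    for line in listing.splitlines()[1:]:  # skip the assumed header row
        parts = line.split()
        if parts:
            names.append(parts[0])
    return names


def has_model(listing: str, prefix: str = "llama3.1") -> bool:
    """True if any listed model name starts with `prefix`.

    Note: a bare "llama3.1" prefix also matches other sizes of that model.
    """
    return any(name.startswith(prefix) for name in installed_models(listing))


def read_listing() -> str:
    """Run `ollama list` and return its text output (requires Ollama installed)."""
    result = subprocess.run(
        ["ollama", "list"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

If `has_model(read_listing())` comes back false, the model can be fetched with ollama pull llama3.1:8b before starting the tool.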

Running the Tool

  • Execute the command python3 main.py from the project root.
  • First run may require additional model downloads.
  • Handle input format issues (ensure input is a valid list).

Observations

  • Initial errors were caused by the input format; the fix was to convert a string to a list.
  • Functionality confirmed to work after code adjustments; improvements needed in interface design.
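
The string-to-list fix could look something like the sketch below. The summary does not say which input was affected or how the code was changed, so `ensure_list` is a hypothetical helper showing one defensive way to normalize a value before passing it on.

```python
import ast


def ensure_list(value):
    """Return `value` as a list (hypothetical helper, not from the repository).

    - Lists are returned unchanged.
    - Strings that spell out a Python list literal are parsed into that list.
    - Anything else is wrapped in a single-element list.
    """
    if isinstance(value, list):
        return value
    if isinstance(value, str):
        try:
            parsed = ast.literal_eval(value)  # safe parse of literals only
        except (ValueError, SyntaxError, TypeError):
            return [value]  # ordinary text, not a literal
        return parsed if isinstance(parsed, list) else [value]
    return [value]
```

Using `ast.literal_eval` rather than `eval` keeps the parse restricted to literals, so arbitrary text from a transcript cannot execute code.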

Conclusion

  • Project works but has bugs that need addressing.
  • Suggestion: Develop a cleaner interface, possibly using Gradio.
  • Repository link to be provided in video description for access.
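
As a rough idea of what the suggested Gradio interface might involve, here is a minimal sketch. It assumes Gradio is installed separately (pip install gradio) and that the project exposes some callable taking an input audio file path and returning a reply audio file path; that `pipeline` callable, `make_handler`, and `build_demo` are hypothetical names, not part of the repository.

```python
def make_handler(pipeline):
    """Wrap a hypothetical audio-path -> audio-path pipeline for a UI callback."""

    def respond(audio_path):
        # The audio component passes None when nothing was recorded or uploaded.
        if audio_path is None:
            return None
        return pipeline(audio_path)

    return respond


def build_demo(pipeline):
    """Build a simple audio-in, audio-out Gradio interface."""
    import gradio as gr  # third-party dependency, imported lazily

    return gr.Interface(
        fn=make_handler(pipeline),
        inputs=gr.Audio(type="filepath", label="Your message"),
        outputs=gr.Audio(type="filepath", label="Assistant reply"),
        title="Vocal Agent",
    )
```

Calling `build_demo(my_pipeline).launch()` would start the local web UI; depending on the Gradio version, microphone recording may need to be enabled explicitly on the input component.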