Kani TTS Installation Guide
Prerequisites
Before installing Kani TTS, ensure you have Python 3.8 or higher installed on your system. The following dependencies are required for core functionality.
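As a quick sanity check before installing anything, the version requirement above can be verified from Python itself. This is a minimal sketch; the helper name python_is_supported is illustrative and not part of Kani TTS.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the prerequisites


def python_is_supported(version_info=None):
    """Return True when the given (or running) interpreter is 3.8 or newer."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor) against the minimum.
    return tuple(version_info[:2]) >= MIN_PYTHON


print("Python version supported:", python_is_supported())
```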
Core Dependencies
Important Note
The custom transformers build is required to load the "lfm2" model type. Make sure to install transformers from the GitHub repository listed under Core Dependencies.
Quick Start
Once you have installed the dependencies, you can start generating speech with Kani TTS using these simple commands.
Generate Audio with Default Sample Text
This command will load the TTS model and generate speech using the built-in sample text, saving the output as generated_audio_YYYYMMDD_HHMMSS.wav.
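The output filename pattern can be reproduced with the standard library. The sketch below only illustrates the naming scheme described above; the function name timestamped_filename is hypothetical, and the actual Kani TTS code may build the name differently.

```python
from datetime import datetime


def timestamped_filename(now=None):
    """Build a name of the form generated_audio_YYYYMMDD_HHMMSS.wav."""
    if now is None:
        now = datetime.now()
    return now.strftime("generated_audio_%Y%m%d_%H%M%S.wav")


print(timestamped_filename())
```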
Generate Audio with Custom Text
Use the --prompt flag to specify your own text for speech generation. The model will process your custom text and generate corresponding audio.
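For readers curious how a flag like --prompt is typically wired up, here is a minimal argparse sketch. It is an assumption about the general pattern, not the project's actual entry point; DEFAULT_TEXT is a placeholder and not the real built-in sample text.

```python
import argparse

# Placeholder only; the real built-in sample text is not shown in this guide.
DEFAULT_TEXT = "Hello from Kani TTS."


def parse_args(argv=None):
    """Parse command-line arguments, falling back to the sample text."""
    parser = argparse.ArgumentParser(description="Generate speech from text.")
    parser.add_argument(
        "--prompt",
        default=DEFAULT_TEXT,
        help="Text to synthesize; the sample text is used when omitted.",
    )
    return parser.parse_args(argv)


# Passing an explicit list makes the parser easy to try without a shell.
args = parse_args(["--prompt", "Custom text to speak."])
print(args.prompt)
```

Omitting the flag yields the default text, which mirrors the two usage modes described in this section.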
Web Interface
For a browser-based interface with real-time audio playback and interactive controls, you can use the included web interface.
Start the FastAPI Server
This starts the FastAPI server on http://localhost:8000. The server provides REST API endpoints for speech generation.
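Before opening the web client, it can help to confirm that something is listening on port 8000. The sketch below checks only TCP reachability with the standard library. It deliberately avoids calling any REST endpoint, since the endpoint paths are not listed in this guide. The helper name is_server_up is illustrative.

```python
import socket


def is_server_up(host="localhost", port=8000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


print("Server reachable:", is_server_up())
```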
Access the Web Interface
Open the client.html file in your web browser to access the interactive interface.
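If you prefer to open the page from a script, the standard webbrowser module can do it. This sketch assumes client.html sits in the directory you pass in (the current directory by default); the helper names are illustrative.

```python
import webbrowser
from pathlib import Path


def client_uri(directory="."):
    """Return a file:// URI for client.html inside the given directory."""
    return (Path(directory) / "client.html").resolve().as_uri()


def open_client(directory="."):
    """Ask the default browser to open client.html."""
    return webbrowser.open(client_uri(directory))


print(client_uri())
```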
Web Interface Features
- Interactive text input with example prompts
- Parameter adjustment (temperature, max tokens)
- Real-time audio generation and playback
- Download functionality for generated audio
- Server health monitoring
Configuration
Kani TTS comes with sensible default configurations that work well for most use cases. You can customize these settings based on your specific requirements.
Default Configuration
- Model: https://huggingface.co/nineninesix/kani-tts-450m-0.1-pt
- Sample Rate: 22,050 Hz
- Generation: 1200 max tokens, temperature 1.4
Model Variants
Choose different models for specific voice characteristics and performance requirements.
Base Model (Default)
nineninesix/kani-tts-450m-0.1-pt
Generates random voices with consistent quality.
Female Voice
nineninesix/kani-tts-450m-0.2-ft
Fine-tuned for female voice characteristics.
Male Voice
nineninesix/kani-tts-450m-0.1-ft
Fine-tuned for male voice characteristics.
Changing Models
To use a different model, modify the ModelConfig class in config.py:
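The real contents of config.py are not reproduced in this guide, so the following is a hypothetical stand-in showing what such a change might look like. The field names (model_name, sample_rate, max_new_tokens, temperature) and the MODEL_VARIANTS mapping are assumptions; only the model identifiers and default values come from this document. Check the actual ModelConfig class for the real attribute names.

```python
from dataclasses import dataclass

# Model identifiers as listed under "Model Variants".
MODEL_VARIANTS = {
    "base": "nineninesix/kani-tts-450m-0.1-pt",
    "female": "nineninesix/kani-tts-450m-0.2-ft",
    "male": "nineninesix/kani-tts-450m-0.1-ft",
}


@dataclass
class ModelConfig:
    # Hypothetical field names; defaults mirror "Default Configuration".
    model_name: str = MODEL_VARIANTS["base"]
    sample_rate: int = 22050
    max_new_tokens: int = 1200
    temperature: float = 1.4


# Switching voices then amounts to changing a single field.
female_config = ModelConfig(model_name=MODEL_VARIANTS["female"])
print(female_config.model_name)
```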
Troubleshooting
Common Installation Issues
Solution: Ensure you've installed the custom transformers build from GitHub
Solution: Reduce batch size or use CPU mode for testing
Solution: Check sample rate settings and ensure proper audio codec installation
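To check the sample rate of a generated file, the standard wave module can read it from the header. This sketch assumes the output is a PCM WAV file, which is the only kind the wave module reads; the helper names are illustrative.

```python
import wave

EXPECTED_SAMPLE_RATE = 22050  # default sample rate listed in this guide


def wav_sample_rate(path):
    """Return the sample rate (in Hz) stored in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def has_expected_rate(path, expected=EXPECTED_SAMPLE_RATE):
    """Return True when the file's sample rate matches the expected value."""
    return wav_sample_rate(path) == expected
```

A mismatch here usually means the player or a downstream tool is assuming a different rate than the one the file was written with.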
Performance Optimization
- Use GPU acceleration when available for faster processing
- Adjust temperature and max tokens based on your quality vs speed requirements
- Consider using smaller model variants for edge device deployment
- Monitor memory usage and adjust batch sizes accordingly
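The first tip above, using the GPU when one is available and falling back to CPU otherwise, can be sketched as follows. This assumes a PyTorch backend, which this guide does not state explicitly; the helper pick_device is illustrative and returns "cpu" when PyTorch is not installed.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # assumed backend; imported lazily so the sketch runs without it
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print("Selected device:", pick_device())
```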
Ready to Start?
Now that you have Kani TTS installed, explore the demo and start building your applications.