This project demonstrates how to use the Hugging Face Inference API for text-to-speech conversion with a simple web UI.
The original implementation called the Hugging Face API directly from the browser, which caused CORS errors. This has been fixed by:
- Creating a Node.js/Express server that acts as a proxy between the UI and the Hugging Face API
- Modifying the UI code to use the proxy endpoint instead of directly calling the API
- Adding proper CORS headers to the server
Install dependencies:
npm install
Set your Hugging Face API token as an environment variable:
export HF_TOKEN=your_hugging_face_token
Start the server:
npm run server
This will:
- Start the Express server on port 3000 (or the port specified in the PORT environment variable)
- Serve the static files from the ui directory
- Create an API endpoint at /api/text-to-speech that proxies requests to the Hugging Face API
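The server behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual server file: the function names, the JSON field names (text, model), the `inputs` payload key, and the upstream URL pattern (the classic Inference API form) are all assumptions and should be checked against the real code. The handler is written as a plain Express-style function so the wiring can stay in comments.

```javascript
// Hypothetical sketch of the proxy logic; names are not taken from the project.

// Assumed upstream URL pattern (classic Inference API). Verify before relying on it.
const HF_BASE_URL = 'https://api-inference.huggingface.co/models';

// Build the upstream request from the fields the UI sends.
function buildUpstreamRequest(text, model, token) {
  return {
    url: `${HF_BASE_URL}/${model}`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ inputs: text }),
    },
  };
}

// Express-style handler factory. fetchImpl defaults to the global fetch (Node 18+).
function createTextToSpeechHandler(token, fetchImpl = globalThis.fetch) {
  return async function textToSpeechHandler(req, res) {
    const { text, model } = req.body || {};
    if (!text || !model) {
      res.status(400).json({ error: 'text and model are required' });
      return;
    }
    try {
      const { url, init } = buildUpstreamRequest(text, model, token);
      const upstream = await fetchImpl(url, init);
      if (!upstream.ok) {
        res.status(upstream.status).json({ error: 'Hugging Face API request failed' });
        return;
      }
      // Pass the audio bytes through, keeping the upstream content type when present.
      const audio = Buffer.from(await upstream.arrayBuffer());
      res.set('Content-Type', upstream.headers.get('content-type') || 'application/octet-stream');
      res.send(audio);
    } catch (err) {
      res.status(502).json({ error: 'Could not reach the Hugging Face API' });
    }
  };
}

// Wiring, assuming Express is installed:
//   const express = require('express');
//   const app = express();
//   app.use(express.json());
//   app.use(express.static('ui'));
//   app.post('/api/text-to-speech', createTextToSpeechHandler(process.env.HF_TOKEN));
//   app.listen(process.env.PORT || 3000);
```

Keeping the token on the server side means it is never sent to the browser, which is a side benefit of the proxy design.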
Then open your browser and navigate to:
http://localhost:3000
- The UI makes a POST request to the local server endpoint /api/text-to-speech with the text and model parameters
- The server forwards the request to the Hugging Face API using your API token
- The server receives the audio data from the Hugging Face API and sends it back to the UI
- The UI creates a blob URL from the audio data and sets it as the source of the audio element
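On the browser side, the flow above might look something like the sketch below. The endpoint path comes from this README; the function names and the JSON field names (text, model) are assumptions for illustration and may differ from the actual UI code.

```javascript
// Hypothetical browser-side sketch; not the project's actual UI code.

// Build the fetch arguments for the local proxy endpoint.
function buildProxyRequest(text, model) {
  return {
    url: '/api/text-to-speech',
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text, model }),
    },
  };
}

// Request speech from the proxy and play it through an <audio> element.
async function speak(text, model, audioElement) {
  const { url, init } = buildProxyRequest(text, model);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Proxy request failed with status ${response.status}`);
  }
  const blob = await response.blob();
  // Release the previous blob URL, if any, so the old audio data can be freed.
  if (audioElement.src.startsWith('blob:')) {
    URL.revokeObjectURL(audioElement.src);
  }
  audioElement.src = URL.createObjectURL(blob);
  await audioElement.play();
}
```

A call would then be along the lines of `speak('Hello', modelId, document.querySelector('audio'))`, where `modelId` is whichever text-to-speech model the UI is configured to use.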
This approach avoids CORS issues because:
- The UI is served from the same origin as the API endpoint
- The server makes the request to the Hugging Face API, and CORS restrictions apply only to requests made by browsers
- The server adds CORS headers, so the UI can still reach the API if it is served from a different origin
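For reference, CORS headers of the kind mentioned above can be set with a small Express-style middleware. The sketch below is an assumption about how this could be done, not a copy of the project's server code, which may well use the `cors` package instead. The function name and the allowed origin passed to it are hypothetical.

```javascript
// Hypothetical CORS middleware sketch using the (req, res, next) signature.
function createCorsMiddleware(allowedOrigin) {
  return function corsMiddleware(req, res, next) {
    res.setHeader('Access-Control-Allow-Origin', allowedOrigin);
    res.setHeader('Access-Control-Allow-Methods', 'POST, OPTIONS');
    res.setHeader('Access-Control-Allow-Headers', 'Content-Type');
    // Answer preflight requests directly; there is nothing to proxy for them.
    if (req.method === 'OPTIONS') {
      res.statusCode = 204;
      res.end();
      return;
    }
    next();
  };
}

// Wiring, assuming an Express app and a UI served from another origin:
//   app.use(createCorsMiddleware('http://localhost:5173'));
```

Restricting the allowed origin to a known value is generally safer than using a wildcard, since the proxy spends your API token on every request it forwards.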