ML Backend Vocal Separation

FreelanceJobs

Canada

About

1. General objective
Build an automated ML backend that separates an audio file into:
voice only (vocal stem) = main output
instrumental = optional output (useful for debugging and future use)
The backend must be stable, automated, documented and scalable.
2. Scope
Included:
voice / instrumental separation via API
automated processing (no GUI)
job management (processing/done/error)
audio file outputs
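The three job states named above (processing/done/error) could be modelled as a small state machine. This is only a sketch: the `Job` class, its fields and its method names are hypothetical, and the rule that a finished job cannot change state again is an assumption, not something the posting specifies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class JobStatus(str, Enum):
    """The three states listed in the scope."""
    PROCESSING = "processing"
    DONE = "done"
    ERROR = "error"


@dataclass
class Job:
    """Hypothetical job record; field names are illustrative only."""
    job_id: str
    status: JobStatus = JobStatus.PROCESSING
    error: Optional[str] = None

    def finish(self) -> None:
        self._require_processing()
        self.status = JobStatus.DONE

    def fail(self, reason: str) -> None:
        self._require_processing()
        self.status = JobStatus.ERROR
        self.error = reason

    def _require_processing(self) -> None:
        # Assumption: "done" and "error" are terminal states.
        if self.status is not JobStatus.PROCESSING:
            raise ValueError(f"job {self.job_id} is already {self.status.value}")
```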
3. Technique
V1 choice: separation through an external API (for speed and simplicity).
Future option: local Demucs / MDX models if needed (not required for V1).
4. Required functionality
4.1 Voice / instrumental separation
Input:
1 audio file (wav)
Output:
vocal stem (mandatory)
instrumental stem (optional)
Pipeline:
upload audio → separation → stem generation → return download URLs
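The pipeline could be wired together roughly as follows. The separation service and the storage layer are not named in the posting, so both are passed in as callables; the `Separator` and `Uploader` signatures, the stem names and the `job_id/stem.wav` key layout are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical signatures: neither the separation API nor the storage
# layer is specified in the posting.
Separator = Callable[[bytes], Dict[str, bytes]]  # wav bytes -> {"vocals": ..., "instrumental": ...}
Uploader = Callable[[str, bytes], str]           # (storage key, data) -> download URL


def run_pipeline(job_id: str, wav_bytes: bytes,
                 separate: Separator, upload: Uploader) -> Dict[str, str]:
    """Separate one file, store every stem, and return one URL per stem."""
    stems = separate(wav_bytes)
    if "vocals" not in stems:
        # The vocal stem is the mandatory output; instrumental is optional.
        raise ValueError("separation did not return the mandatory vocal stem")
    urls: Dict[str, str] = {}
    for name, data in stems.items():
        urls[name] = upload(f"{job_id}/{name}.wav", data)
    return urls
```

Injecting the two callables keeps the V1 choice (an external API) swappable for a local model later without touching the pipeline itself.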
4.2 Multiple files processing (Batch Processing)
The backend must accept several audio files in a single request.
Example: 10 files sent → 10 distinct jobs created, each with its own job_id.
Jobs can be processed in parallel, depending on the capacity of the separation service or of the server.
The backend must manage these jobs automatically and return each result individually.
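A minimal sketch of the batch behaviour, assuming a thread pool is an acceptable way to run jobs in parallel (the posting does not prescribe one). `process_batch`, `process_one` and the result dictionary layout are hypothetical names; `max_workers` stands in for whatever limit the service or server capacity imposes.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def process_batch(files: List[bytes],
                  process_one: Callable[[bytes], dict],
                  max_workers: int = 4) -> Dict[str, dict]:
    """Create one job per file and return a result per job_id."""
    jobs = {uuid.uuid4().hex: data for data in files}
    results: Dict[str, dict] = {}

    def run(job_id: str, data: bytes) -> None:
        # A failure in one job must not affect the others.
        try:
            results[job_id] = {"status": "done", "result": process_one(data)}
        except Exception as exc:
            results[job_id] = {"status": "error", "error": str(exc)}

    # Leaving the "with" block waits for every submitted job to finish.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for job_id, data in jobs.items():
            pool.submit(run, job_id, data)
    return results
```

This version blocks until the whole batch is finished; a real service would more likely return the job IDs immediately and let the caller poll for status.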
4.3 Max file length and format
Maximum audio duration: 4 minutes.
Accepted format: wav
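Both limits can be checked before a job is created. The sketch below uses Python's standard `wave` module, which reads PCM WAV files; the function names and error messages are illustrative, and compressed or unusual WAV variants would need a different reader.

```python
import io
import wave

MAX_SECONDS = 4 * 60  # 4-minute limit from the requirement


def wav_duration_seconds(data: bytes) -> float:
    """Duration = number of frames / frames per second."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def validate_upload(data: bytes) -> float:
    """Return the duration in seconds, or raise ValueError if rejected."""
    try:
        duration = wav_duration_seconds(data)
    except (wave.Error, EOFError) as exc:
        raise ValueError("file is not a readable WAV") from exc
    if duration > MAX_SECONDS:
        raise ValueError(
            f"audio is {duration:.1f}s long; the limit is {MAX_SECONDS}s"
        )
    return duration
```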
4.4 Minimal API security
The API must be protected with a simple API key.
Otherwise anyone could spam the backend.
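A simple key check might look like the sketch below. The `X-API-Key` header name is an assumption (the posting does not say how the key is transmitted), and `is_authorized` is a hypothetical helper. `hmac.compare_digest` is used so that the comparison takes the same time whether or not the key matches.

```python
import hmac


def is_authorized(headers: dict, expected_key: str) -> bool:
    """Check the request's API key against the configured one."""
    # Assumption: the client sends the key in an "X-API-Key" header.
    supplied = headers.get("X-API-Key", "")
    if not expected_key:
        # Refuse everything if no key is configured, rather than
        # accepting an empty key by accident.
        return False
    return hmac.compare_digest(supplied.encode(), expected_key.encode())
```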
4.5 File storage
Resulting files must be stored on cloud storage (AWS S3 or equivalent).
Returned URLs must be accessible to the main backend.
5. Integration interface
The backend must provide a simple API that lets another developer send an audio file for separation and retrieve the results.
The API must allow a client to:
• send one or multiple audio files
• track processing status
• retrieve the separated files (voice and instrumental)
The exact API format is open, but it must be clear, documented and easy to integrate.
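Since the format is left open, the following is only one possible shape for a status response, shown to make the three capabilities concrete. Every field name here (`job_id`, `status`, `files`, `error`) and the `job_response` helper are hypothetical.

```python
import json
from typing import Optional


def job_response(job_id: str, status: str,
                 vocals_url: Optional[str] = None,
                 instrumental_url: Optional[str] = None,
                 error: Optional[str] = None) -> str:
    """Build a JSON body for a hypothetical job-status endpoint."""
    body = {"job_id": job_id, "status": status}
    if status == "done":
        # The vocal stem is always present; instrumental only if produced.
        body["files"] = {"vocals": vocals_url}
        if instrumental_url:
            body["files"]["instrumental"] = instrumental_url
    elif status == "error":
        body["error"] = error or "unknown error"
    return json.dumps(body)
```

A client would poll this response until `status` leaves `processing`, then download the files from the returned URLs.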
6. Integration
The backend must expose a clear HTTP API so another developer can integrate it into the main backend later.
No user interface required.
7. Technical constraints
clean code + README
clear logs
Docker recommended
easy to deploy
no blocking dependencies
8. Expected deliverables
complete code on a GitHub repository owned by the client
installation documentation
API documentation (Swagger/Postman or README)
test examples with audio files
Contract duration of less than 1 month.
Mandatory skills: PyTorch, TensorFlow, OpenCV, Python, Deep Learning, Machine Learning, Automatic Speech Recognition, AI Text-to-Speech

Languages

English