Multimodal Specialist

Remote

February 16, 2026

Application ends: May 17, 2026

Job Description

REQUIREMENTS

Bachelor’s degree in Computer Science, Information Technology, or equivalent professional experience
4+ years of experience working with vision, audio, or video datasets
Familiarity with annotation tools (e.g., labelling platforms for bounding boxes, segmentation, transcription)
Strong spatial and temporal reasoning skills
Close attention to detail and consistency in evaluation
Ability to analyze large-scale multimodal datasets

RESPONSIBILITIES

Review and validate image, video, and audio annotations
Assess bounding boxes, segmentation masks, and object labelling accuracy
Perform image segmentation QA and detect spatial inconsistencies
Validate video events, temporal sequences, and frame-level annotations
Conduct audio transcription QA and verify timestamp accuracy
Score multimodal model outputs for correctness and quality
Identify labelling inconsistencies, noise, and structural errors
Provide structured feedback to improve annotation standards
