Multimodal Specialist

February 16, 2026
Application ends: May 17, 2026

Job Description

REQUIREMENTS

  • Bachelor’s degree in Computer Science or Information Technology, or equivalent professional experience
  • 4+ years of experience working with vision, audio, or video datasets
  • Familiarity with annotation tools (e.g., labelling platforms for bounding boxes, segmentation, transcription)
  • Strong spatial and temporal reasoning skills
  • Close attention to detail and consistency in evaluation
  • Ability to analyze large-scale multimodal datasets

RESPONSIBILITIES

  • Review and validate image, video, and audio annotations
  • Assess bounding boxes, segmentation masks, and object labelling accuracy
  • Perform image segmentation QA and detect spatial inconsistencies
  • Validate video events, temporal sequences, and frame-level annotations
  • Conduct audio transcription QA and verify timestamp accuracy
  • Score multimodal model outputs for correctness and quality
  • Identify labelling inconsistencies, noise, and structural errors
  • Provide structured feedback to improve annotation standards

Are you interested in this position?


Apply by clicking on the “Apply Now” button below!
