Building a job title predictor
Our company handles millions of job postings per day. Because the quality of these postings varies widely, job title normalization is an important tool for us. The task of the normalizer is to extract the most important part of a job title: the actual position that the applicant will apply for.
The predictor normalizes exotic job titles like “Python Ninja” -> “Python Developer” and also gets rid of things like:
• Modifiers of the position, like intern/junior/senior
• Shift information: “Cashier – 3rd shift”
• Payment information: “Doorman for 16$ per hour”
• Other catchy phrases like “apply now”, “hiring immediately”
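To make these noise types concrete, here is a minimal rule-based sketch in Python. It is not the neural predictor described in this post; the patterns and the `strip_noise` helper are hypothetical and cover only the examples listed above.

```python
import re

# Hypothetical patterns, one per noise type from the list above.
# A hand-written illustration, not the predictor described in this post.
NOISE_PATTERNS = [
    r"\b(intern|junior|senior)\b",             # modifiers of the position
    r"\b\d+(st|nd|rd|th)\s+shift\b",           # shift information
    r"\bfor\s+\$?\d+\$?\s*(per|/)\s*hour\b",   # payment information
    r"\b(apply now|hiring immediately)\b",     # catchy phrases
]

def strip_noise(title: str) -> str:
    """Remove the known noise patterns and tidy up leftover separators."""
    cleaned = title
    for pattern in NOISE_PATTERNS:
        cleaned = re.sub(pattern, " ", cleaned, flags=re.IGNORECASE)
    # Collapse repeated whitespace, then trim dangling dashes and punctuation.
    cleaned = re.sub(r"\s+", " ", cleaned)
    return cleaned.strip(" -–,!|")

for raw in ["Cashier – 3rd shift", "Doorman for 16$ per hour", "Python Ninja"]:
    print(raw, "->", strip_noise(raw))
```

Rules like these handle the listed patterns but leave “Python Ninja” unchanged, since mapping it to “Python Developer” requires knowing which titles are equivalent. That gap is what motivates a learned predictor.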
We will present how we extracted a finite set of cleaned job titles and used neural networks to build a noisy title -> cleaned title predictor from a large amount of unlabeled data.