Using NLP on Job Descriptions
In the previous post, I explained the process of web scraping job descriptions from a certain website. After collecting and cleaning the data, several NLP models were developed to detect skills, knowledge, minimum experience and levels (degrees) in job descriptions.
Manual annotation model
After some research, my group and I decided to go with a NER (named entity recognition) model from the spaCy library. To train the model we had to annotate the data manually, using a free annotation tool called doccano. We manually labelled around 200 job descriptions.
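To make the annotation step more concrete, here is a minimal sketch of how an annotation export could be turned into the character-offset format commonly used for spaCy NER training data. It assumes a JSONL export where each record has a "text" field and a "label" list of [start, end, label] triples; the field names, the label names (MIN_EXPERIENCE, SKILL, LEVEL) and the sample sentence are illustrative assumptions, not taken from our actual project.

```python
import json

# Hypothetical sample of a JSONL annotation export: one JSON object per line,
# with entities given as [start_char, end_char, label]. The field names and
# label names are assumptions made for this sketch.
SAMPLE_EXPORT = [
    json.dumps({
        "text": "3 years of experience with Python and a BSc degree",
        "label": [[27, 33, "SKILL"], [0, 7, "MIN_EXPERIENCE"], [40, 43, "LEVEL"]],
    }),
]


def export_to_training_examples(jsonl_lines):
    """Convert JSONL records into (text, {"entities": [...]}) tuples."""
    examples = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        record = json.loads(line)
        text = record["text"]
        entities = []
        for start, end, label in record.get("label", []):
            # Keep only spans that fall inside the text.
            if 0 <= start < end <= len(text):
                entities.append((start, end, label))
        # Sort by start offset so the spans read left to right.
        examples.append((text, {"entities": sorted(entities)}))
    return examples


examples = export_to_training_examples(SAMPLE_EXPORT)
for text, annotations in examples:
    for start, end, label in annotations["entities"]:
        print(label, "->", text[start:end])
```

Printing each span next to its label is a quick sanity check that the offsets from the annotation tool line up with the text before any training happens.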
Automated labelling model
This model can't differentiate between the labels (skills, knowledge, minimum experience and levels), because it only takes the skills from a dictionary and matches them against the job descriptions.
This approach is very similar to the automated labelling model. The only difference is that nothing is trained on the data: it simply looks for the dictionary words in the job descriptions, without taking into account the position of each word in the sentence.
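The dictionary lookup described above can be sketched in a few lines. This is a plain-regex stand-in written for illustration, not the code from our project: the skill list, the function name and the sample description are all hypothetical. Note that every match gets the same SKILL label, which is exactly why this kind of approach can't tell the different labels apart.

```python
import re

# Hypothetical skill dictionary; a real one would be much larger.
SKILL_DICTIONARY = ["python", "sql", "machine learning"]


def find_dictionary_skills(text, skills):
    """Return (start, end, "SKILL") spans for every dictionary term in text."""
    matches = []
    for skill in skills:
        # Match the term as a whole word or phrase, ignoring case.
        pattern = r"(?<!\w)" + re.escape(skill) + r"(?!\w)"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            matches.append((match.start(), match.end(), "SKILL"))
    return sorted(matches)


description = "We need strong Python and SQL skills; machine learning is a plus."
spans = find_dictionary_skills(description, SKILL_DICTIONARY)
for start, end, label in spans:
    print(label, "->", description[start:end])
```

Because the lookup only checks whether a term appears, it has no notion of context: a word is tagged the same way wherever it occurs in the sentence.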
Let's focus on the manual labelling model, since it is the one that allows us to see the different labels.
More data annotation would definitely improve the model: we can clearly see that some entities are still missing in the examples above.
These models can be applied to several use cases, such as helping HR find the right candidate or helping job seekers find their perfect match. It's up to your imagination!
You can find the code on my GitHub: