Using NLP on Job Descriptions



In the previous post, I explained how to scrape job descriptions from a website. After collecting and cleaning the data, several NLP models were developed to detect skills, knowledge, minimum experience and levels (degrees) in job descriptions.


The models

Manual annotation model

After some research, my group and I decided to go with a NER (named entity recognition) model from the spaCy library. To feed the model, we had to annotate the data manually, using a free annotation tool called doccano. We manually labelled around 200 job descriptions.
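As a rough illustration of how annotations can be handed over to spaCy, here is a minimal sketch that converts doccano-style JSONL records into the `(text, {"entities": [...]})` tuples commonly used to build spaCy training data. It assumes each exported line has a `text` key and a `label` key holding `[start, end, label]` triples (the exact export format depends on the doccano version). The label names `LEVEL`, `EXPERIENCE` and `SKILL`, the sample sentence and the function `doccano_to_spacy` are hypothetical, not taken from our project.

```python
import json


def doccano_to_spacy(jsonl_lines):
    """Convert doccano-style JSONL lines into (text, {"entities": [...]}) tuples.

    Assumes each record looks like:
    {"text": "...", "label": [[start, end, "LABEL"], ...]}
    """
    examples = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines in the export
        record = json.loads(line)
        spans = [(int(start), int(end), str(label))
                 for start, end, label in record.get("label", [])]
        examples.append((record["text"], {"entities": sorted(spans)}))
    return examples


# Hypothetical annotated job description (character offsets are end-exclusive)
sample = [
    json.dumps({
        "text": "BSc required, 3 years of Python experience.",
        "label": [[0, 3, "LEVEL"], [14, 21, "EXPERIENCE"], [25, 31, "SKILL"]],
    }),
]

train_data = doccano_to_spacy(sample)
for text, annotations in train_data:
    for start, end, label in annotations["entities"]:
        print(label, "->", text[start:end])
```

Each tuple keeps the character offsets from the annotation tool, so the labelled spans can be checked by slicing the text before any training happens.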

Automated labelling model

This model cannot differentiate between the labels (skills, knowledge, minimum experience and levels), because it only takes skills from a dictionary and matches them against the job descriptions.
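The dictionary-matching idea can be sketched as below. This is not our actual pipeline, just a minimal standard-library example under the assumption that the dictionary is a plain list of terms. `SKILL_DICTIONARY`, `auto_label` and the sample sentence are hypothetical. Note that every match gets the same label, which is why this approach cannot tell the different labels apart.

```python
import re

# Hypothetical skills dictionary; a real one would be much larger
SKILL_DICTIONARY = ["python", "sql", "machine learning"]


def auto_label(text, terms, label="SKILL"):
    """Return sorted (start, end, label) spans for every dictionary term in text."""
    spans = []
    for term in terms:
        # Whole-word, case-insensitive match of the dictionary term
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


text = "We need Python, SQL and machine learning experience."
spans = auto_label(text, SKILL_DICTIONARY)
matched = [text[start:end] for start, end, _ in spans]
print(matched)
```

The resulting spans have the same shape as manual annotations, so they could be used as automatically generated training labels.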

Entity Ruler

Very similar to the automated labelling model. The only difference is that nothing is trained: it simply looks up the dictionary words in the job descriptions, without taking into account the position of the word in the sentence.
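For reference, a minimal Entity Ruler setup might look like the sketch below. It assumes spaCy v3 and a blank English pipeline, so no trained model is downloaded. The patterns, the `SKILL` label and the sample sentence are hypothetical and only illustrate the rule-based lookup.

```python
import spacy

# Blank English pipeline: tokenizer only, no trained components
nlp = spacy.blank("en")

# The entity ruler matches token patterns and marks them as entities
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "SKILL", "pattern": [{"LOWER": "python"}]},
    {"label": "SKILL", "pattern": [{"LOWER": "machine"}, {"LOWER": "learning"}]},
])

doc = nlp("Experience with Python and machine learning is a plus.")
found = [(ent.text, ent.label_) for ent in doc.ents]
print(found)
```

Because the patterns are applied as-is, a word is labelled wherever it appears, regardless of its context in the sentence.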


Let's focus on the manual annotation model, since it is the only one that lets us see the different labels.

Blockchain Developer


Web Developer


Data Scientist


Final Thoughts

More annotated data would definitely improve the model: some entities are clearly still missing in the examples above.
These models can be applied to several use cases, such as helping HR find the right candidate or helping job seekers find their perfect match. It's up to your imagination!

You can find the code on my GitHub:

