Experts in this TalentCloud should be able to work with unstructured data such as emails, text messages, notes, and voice data, as well as semi-structured data, in combination with structured data. This TalentCloud requires you to build NLP machine learning and deep learning models using open-source languages and frameworks (mainly Python, PySpark, and PyTorch).
Required Skills
- Extensive experience as a data scientist with a focus on NLP and NLU
- Proficient in Python
- Proficient in deep learning
- Expert in Natural Language Processing (NLP)
- Proficient in NLTK, PyTorch, spaCy, TensorFlow, Keras, Gensim
- Hands-on experience writing data processing code and data pipelines for model development on unstructured data, including gathering and building datasets to collect intents, cleaning noisy data, and designing feedback loops on data needs
- Experience building intent recognition and classification models
- Experience with phrase-level identification
- Hands-on experience with transformers and contextual embedding models such as BERT and ELMo
- Experience with active learning and reinforcement learning
- Experience with Named Entity Recognition models
- Experience with cloud platforms such as GCP, AWS, or Azure
Preferred Skills
- Experience with NoSQL databases
- Experience with Spark
- Experience with the Google Cloud Natural Language API and Stanford CoreNLP