SENIOR NLP DATA SCIENTIST

- Country: Bolivia
- Experience: 3+ years
Job Description
Key Responsibilities:
- Identify new data sources
- Lead annotation efforts
- Prepare data for ingestion by LLMs
- Perform prompt engineering with LLMs
- Leverage NLP tools like spaCy and Snorkel to analyze, annotate, and augment data, using both supervised and unsupervised approaches
- Identify relevant datasets and determine annotation needs
- Version datasets
- Set up data pipelines
- Monitor ML models in production for data drift
- Stay up to date with the latest technologies in Data Science
Qualifications:
- Minimum of a bachelor’s degree in computer science or another STEM field and three (3) years of relevant experience, or a combination of education, experience, and training.
- Exposure to modern NLP systems, including word embeddings, transformer architectures, and LLMs such as ChatGPT, as well as good software design principles.
- Experience with NLP toolkits such as NLTK, spaCy, and Gensim.
- Experience leading teams and mentoring junior data scientists.
- Fluency in Python.
- Experience setting up CI/CD pipelines.
- Strong analytical and problem-solving skills.
- Excellent communication and collaboration skills.
Preferred Qualifications:
- Experience with unsupervised techniques like clustering and topic modeling
- Experience with data versioning tools such as DVC
- Experience with weak supervision tools for programmatic data labeling, e.g., Snorkel
- Experience with big data technologies such as Hadoop, Spark, or Kafka