
NLP Data Scientist/Scientific Data Engineer

European Bioinformatics Institute | EMBL-EBI
  • GB
    London, England, United Kingdom

About

NLP Data Scientist / Scientific Data Engineer position open at EMBL-EBI, Cambridge, United Kingdom.
Contract length: 3 years (project based).
Salary: Grade 5–6 (£3,303–£3,695 per month after tax, excluding pension and insurances).
Closing date: 11/01/2026.
About the team
Safety and toxicology concerns remain among the most persistent challenges in drug discovery. This role joins a multi‑disciplinary team to develop a comprehensive open‑source side‑effect resource for the scientific and pharmaceutical community, and to provide structured and standardised training sets for AI/ML applications that improve early identification of safety liabilities.
Role overview
The position is embedded within the Chemical Biology Services team at EMBL‑EBI and the Open Targets Safety 2.0 project. You will work closely with safety scientists from Open Targets pharmaceutical partners (MSD, Genentech, GSK, Pfizer, Sanofi), ensuring delivery of work packages and seamless integration of pipelines into ChEMBL and the Open Targets Platform.
Key responsibilities
Develop machine learning pipelines for extracting drug side effects from drug labels, clinical trials, publications and other documents
Investigate modern NLP methodologies and propose ideas for the implementation of data extraction methods and pipelines
Apply language models to extract and map drug‑related information from unstructured text, e.g. from the scientific literature and ClinicalTrials.gov
Implement and/or fine‑tune different NLP models, e.g. NER models, transformer models, LLMs
Integrate project workflows with existing infrastructures in the EBI Chemical Biology Services and Open Targets teams
Prepare and evaluate benchmark datasets from the open domain as training sets for NLP models
Work with domain experts to develop new gold standards for NLP tasks where needed
Assist with and/or perform data curation to prepare clean and reliable training sets
Apply and/or adapt existing methods for mapping extracted entities (e.g. drugs, side effects/phenotypes and diseases) to biomedical ontologies
Work closely with Safety 2.0 project group members bridging the ChEMBL and Open Targets teams
Work closely with the Open Targets Core team to ensure seamless integration of data and workflows into the Open Targets Platform and long‑term sustainability
Collaborate with the Open Targets Partners to assess, prioritise, validate and refine the developed methods
Disseminate the outcomes of the project to the scientific community and stakeholders through presentations and publications
Required qualifications
PhD, Master's or equivalent experience in computational linguistics, computer science, bioinformatics, or cheminformatics
Experience with language models, e.g. transformer models, LLMs, AI agents for information extraction
Experience with document and text preprocessing, cleaning and transformation techniques including mapping to ontologies
Experience with data structures, data models and databases
Knowledge of cheminformatics resources and/or bioinformatics databases
Knowledge of data analysis and machine learning
Proficiency in Python
Knowledge of data frameworks, e.g. PySpark, pandas, Polars
Excellent attention to detail
Strong communication skills, both in presentations and verbally
Experience working and collaborating in a team‑oriented environment
Able to work independently, manage time and work to deadlines
Preferred experience
Experience with the application of NLP methods to cheminformatics and/or biomedical domains
Experience with version control
Experience in safety/toxicology in industry or research
Other helpful information
Hybrid working: At EMBL‑EBI we embrace a hybrid approach – team members are typically on site at least three days a week, with a desk always available.
Interviews: Introductory meetings will be held remotely starting in February 2026.
Salary: Grade 5–6 (£3,303–£3,695 per month after tax, excluding pension and insurances).
Why join us: EMBL‑EBI, part of the European Molecular Biology Laboratory, is a world‑leading research centre for large biological data. Enjoy a collaborative, inclusive culture, flexible working and a wide range of on‑site and remote facilities.
Benefits
Financial incentives: monthly family, child and non‑resident allowances, annual salary review, pension scheme, death benefit, long‑term care, accident‑at‑work and unemployment insurances
Flexible working arrangements – including hybrid patterns
Private medical insurance for you and your immediate family (including prescriptions, dental and optical cover)
Generous time off: 30 days annual leave per year plus public holidays
Relocation package including installation grant (if required)
Campus life: free shuttle bus, on‑site library, subsidised gym and cafeteria, casual dress code, sports and social club activities (on campus or remotely)
Family benefits: on‑site nursery, 10 days child sick leave, generous parental leave, holiday clubs on campus and monthly family and child allowances
Benefits for non‑UK residents: visa exemption, education grant for private schooling, financial support to travel back home every second year and a monthly non‑resident allowance
Additional information
International applicants: we recruit internationally and successful candidates are offered visa exemptions.
EMBL is a signatory of DORA – find out how we apply DORA principles to our recruitment and performance assessment processes.
Diversity and inclusion: we strongly believe that inclusive and diverse teams benefit from higher levels of innovation and creative thought. We encourage applications from women, LGBTQ+ individuals and people of all nationalities.
How to apply: submit a cover letter and CV through our online system. Applications will close at 23:59 CET on the date shown above (11/01/2026). We aim to respond within two weeks after the closing date.

Language skills

  • English
Note for users

This job posting comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their website.