- Paris, Île-de-France, France
About
Job Description:
The kind of data engineer we are looking for is a software engineer first, with a strong focus (at least initially) on Python programming and scalability.
Primary Responsibilities
- Write Python pipelines for semantic processing (NLP) and data augmentation in general:
- Vectorisation (embeddings) to/from MongoDB/Pinecone
- Named Entity Recognition (NER) to/from MongoDB
- Elasticsearch indexing
- Write Python transformation pipelines for derived data around companies:
- Compute insightful data points from raw company data
- Maintain Python framework for data point computation
- Level up the technical capabilities of your team, esp. junior teammates
- Contribute to task breakdown and phasing with Tech Lead
- Implement cloud-based data acquisition/ETL pipelines (esp. with Airflow, DBT, Snowflake)
- Expand web scraping capabilities (Pub/Sub, GCS, CloudRun)
- Move to a Tech Lead role as the tech organisation grows.
- 5+ years developing scalable industry-ready software applications in Python
- 3+ years implementing data processing pipelines/ETLs
- 2+ years of advanced work with MongoDB and SQL, ideally with Elasticsearch as well
- Solid understanding of computing scalability (multiprocessing/threading, distributed computing)
- Some hands-on experience with common/modern data frameworks (esp. GCP, DBT, Snowflake)
- Outstanding problem-solving skills for performance optimisation
- Product-driven: You might feel your professional interests revolve around building complex systems thanks to advanced design patterns.
- Socially open: We often build software making creative (and possibly weird) analogies with music, philosophy or football.
- Taking responsibility: This is about self-motivation and embracing accountability for pushing the Product in the right direction.
- Humble: We believe in the growth mindset, meaning everybody can learn pretty much anything.
- Curious: We like to share our passions (travel, food, books, sports, etc.) and interact beyond tech.
As a global organization, Datasite knows that diverse perspectives are essential to our success. We’re committed to maintaining a diverse workforce to serve our customers around the world. Datasite is an equal opportunity employer (EEO) and furthers the principles of EEO through Affirmative Action.
Nice-to-have skills
- Python
- MongoDB
- Elasticsearch
- SQL
- GCP
- ETL
- Distributed Computing
Work experience
- Data Engineer
- NLP
Languages
- English