Sr. AI/ML Ops Engineer
blaZop Inc.
- New York, New York, United States
About
- Python Expertise: Proficiency in object-oriented Python.
- Data Science (Jupyter Notebooks): Demonstrated expertise in data science, including analysis and modeling using Jupyter Notebooks.
- Deep Learning (PyTorch): Proven experience in deep learning, particularly with PyTorch, and familiarity with other frameworks.
- Good LLM Knowledge: Good understanding of Natural Language Processing (NLP) and Large Language Models (LLMs). Any successful implementation of GenAI (LLMs) on custom data is preferred.
- Education: A Bachelor's or Master's degree in Data Science is preferred.
Responsible For
- MLOps Implementation (Docker, Kubernetes, Azure DevOps, AWS SageMaker): Lead the implementation of MLOps practices, ensuring seamless integration of machine learning models into production systems. Leverage containerization with Docker and orchestration with Kubernetes. Implement MLOps technologies from both Azure and AWS, such as Azure DevOps and AWS SageMaker.
- Code Development (Python, NumPy, Pandas): Develop and maintain scalable and efficient Python code for machine learning applications. Utilize NumPy and Pandas for effective data manipulation and analysis.
- Collaboration (Git): Collaborate with cross-functional teams to understand business requirements and seamlessly integrate machine learning solutions into software applications. Utilize Git for version control and collaborative coding.
- DevOps Integration (Jenkins, GitLab): Work closely with DevOps teams to streamline deployment processes, ensuring reliability and scalability. Implement continuous integration and deployment (CI/CD) practices with tools like Jenkins or GitLab.
- Observability (Prometheus, Grafana, Azure Monitor, AWS CloudWatch): Focus on fine-tuning models and identifying data anomalies. Implement observability tools like Prometheus and Grafana for monitoring and troubleshooting. Leverage Azure Monitor and AWS CloudWatch for cloud-specific observability.
- Model Evaluation (TensorBoard): Implement model evaluation tools such as TensorBoard to ensure models are working as expected and meet performance criteria.
- Documentation (Confluence, Markdown): Create comprehensive documentation for code, models, and deployment processes using tools like Confluence and Markdown.
- Training and Knowledge Sharing: Provide training and knowledge-sharing sessions to team members on best practices in MLOps and Python coding.
Job Nature
Full Time
Job Location
Remote, USA
Job Level
Sr. Position
How to Apply
Interested candidates can send their resumes to contact@blazop.com, mentioning "Job Title" in the subject line.
blaZop is an AI-powered hyper-automation and observability platform that enables autonomous IT and cloud operations. It empowers teams to achieve more with less effort, consolidate tools, reduce operational costs, establish and maintain more secure and standardized environments, minimize outages, and gain an insightful view of all IT and cloud services from a single interface. The unified platform includes a range of integrated products for managing the entire lifecycle (design, build, operate, and optimize) of complex, multi-vendor IT and cloud environments.
Languages
- English