Data Scientist

NeoMax
  • US
    United States

About

Job Description
We are seeking a Data Scientist who can work in a fast-paced, dynamic, agile software development environment.
You will collaborate with a team on multiple projects that include automating the processing of large forensic images, extracting and enriching metadata, and presenting the resulting information in meaningful ways so that analysts can conduct assessments.
Required Skills
  • Demonstrated experience building production data pipelines and ETL/ELT workflows at scale
  • Demonstrated experience with Apache Spark and PySpark for distributed data processing
  • Demonstrated experience with advanced Python programming skills including data manipulation libraries (Pandas, NumPy) and data engineering best practices
  • Demonstrated experience understanding data security, privacy, governance, and compliance principles
  • Demonstrated experience with workflow orchestration tools such as Step Functions and Airflow
  • Demonstrated experience with containerization such as Docker or Podman, and deploying data applications in cloud environments
  • Demonstrated experience with AWS services (in particular S3, Lambda, and Step Functions)
  • Demonstrated experience with PostgreSQL and MySQL in production environments, including performance tuning and schema design
  • Demonstrated experience with SQL and query optimization for complex analytical workloads
  • Demonstrated experience with version control (Git) and CI/CD practices for data pipelines
  • Demonstrated experience working with stakeholders to understand data requirements, assess feasibility, and design appropriate solutions with minimal oversight
  • Demonstrated experience with strong problem-solving and debugging skills for data quality issues, pipeline failures, and performance bottlenecks

Desired Skills
  • Demonstrated experience with data lakehouse architectures using Apache Iceberg
  • Demonstrated experience configuring, deploying, and integrating data platform components: Apache Ranger (access control and data governance), Trino (distributed SQL query engine), data catalogs (Unity Catalog OSS, Apache Polaris, etc.), and Apache Superset (data visualization and dashboarding)
  • Demonstrated experience with Bash scripting for automation and data processing tasks
  • Demonstrated experience with Infrastructure as Code (Terraform or CloudFormation) for data infrastructure
  • Demonstrated experience with tracking data lineage and associated tooling such as OpenLineage
  • Demonstrated experience with Java
  • Demonstrated experience with data quality frameworks, testing methodologies, and validation strategies
  • Demonstrated experience or background with large-scale data migrations or platform modernization efforts
  • Demonstrated experience integrating AI/ML services and models (translation, OCR, speech-to-text, NLP, language detection, topic modeling), LLMs, and RAG (retrieval-augmented generation) pipelines
  • Demonstrated experience with geospatial data processing (H3, PostGIS, or similar)
  • Demonstrated experience contributing to data engineering documentation, best practices, or design patterns
  • Demonstrated experience with NoSQL databases (DynamoDB, etc.)
  • Demonstrated experience with excellent written and verbal communication skills with both technical and non-technical audiences
  • Demonstrated experience with Linux operating systems
  • Demonstrated experience with Agile/Scrum development methodologies in a fast-paced, collaborative team environment
  • Demonstrated experience working effectively in high-performing, cross-functional teams with multiple concurrent projects
  • Demonstrated experience working directly with stakeholders to gather requirements, understand needs, and translate them into technical solutions with minimal oversight
  • Demonstrated experience in self-directed work with a strong ownership mentality and commitment to code quality, testing, and documentation
  • Demonstrated experience context-switching between projects and systems as priorities demand

Language Skills

  • English
Note for Users

This job posting comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their website.