
Internship, Data Scientist at Everstream Analytics

Feedinkoo
  • US
    New York, New York, United States

About

Position Overview
Are you a curious, driven student who’s passionate about data and its potential to fuel smarter decision‑making? Join our Natural Language Processing (NLP) and Generative AI Data Science team as a Data Science Intern, where you’ll get hands‑on experience collecting and working with real‑world, publicly available data from online sources, including news outlets and company websites.

What You’ll Work On
  • Data Collection: Develop and maintain scripts to automate the collection of publicly available data from online sources, ensuring compliance with each website’s terms of service and robots.txt directives.
  • Data Cleaning & Preprocessing: Clean, validate, and organize collected data to ensure accuracy and usability for downstream tasks.
  • Data Storage: Store extracted data in structured formats such as CSV, JSON, or databases, ensuring efficient retrieval and analysis.
  • Collaboration: Work closely with data scientists and analysts to understand data requirements and ensure legal compliance.
  • Documentation: Document data collection processes, data dictionaries, and any challenges encountered to facilitate knowledge sharing and future maintenance.

What You Bring
  • Pursuing a degree in Computer Science, Data Science, Information Technology, or a related field.
  • Familiarity with Python and libraries such as BeautifulSoup, Scrapy, or Selenium for data collection tasks.
  • Understanding of HTML, CSS, and JavaScript to navigate and parse web content effectively.
  • Basic knowledge of data storage formats and databases (e.g., CSV, JSON, SQL).
  • Strong problem‑solving skills and attention to detail.
  • Excellent communication skills, both written and verbal.

Bonus Points For
  • Familiarity with AI‑powered data collection tools (e.g., Firecrawl).
  • Familiarity with web concepts such as sitemaps, robots.txt, and RSS feeds.
  • Experience with data visualization tools or libraries (e.g., Matplotlib, Seaborn).
  • Familiarity with version control systems like Git.
  • Understanding of ethical considerations and legal guidelines related to data collection.
  • Ability to work independently and manage time effectively in a remote or hybrid work environment.

Why This Internship Rocks
This isn’t just another internship: it’s a chance to work on real data projects that directly support our NLP and generative AI initiatives. You’ll gain hands‑on experience with modern tools and techniques used in industry, collaborate with a talented and supportive team of data professionals, and build a portfolio that goes well beyond classroom assignments. Whether you’re passionate about ethical data practices, fascinated by AI, or eager to level up your Python and web‑scraping skills, this role offers meaningful exposure and flexibility, all within a fully remote work environment designed with students in mind.

Language skills

  • English
Notice to users

This listing comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their site.