Data Scientist
Job Description
Role: Data Scientist
Location: Atlanta, GA (Onsite)
Duration: 12+ months
What We Expect from the Data Scientist
We are looking for a Data Scientist with strong analytical depth, statistical expertise, and hands-on experience working across the full data lifecycle—from raw data exploration to model evaluation and deployment.
1. Exploratory Data Analysis (EDA)
Perform in-depth exploratory data analysis to understand structure, distributions, correlations, and anomalies
Identify data quality issues (missing values, outliers, inconsistencies)
Generate meaningful insights using summary statistics and visualizations
Translate exploratory findings into actionable business hypotheses
Document assumptions and analytical findings clearly
2. Understanding of Data Sources
Strong understanding of structured and unstructured data sources
Experience working with relational databases (SQL), data warehouses, APIs, and third-party data providers
Ability to assess data reliability, completeness, and bias
Knowledge of data governance, data lineage, and metadata management
Collaborate with data engineering teams to ensure data availability and integrity
3. Data Pipelines & Data Engineering Awareness
Design and build scalable data pipelines for ingestion, transformation, and feature engineering
Experience with ETL/ELT processes
Work with batch and/or real-time data processing systems
Familiarity with big data and distributed processing frameworks
Ensure reproducibility and automation of data workflows
4. Statistical Expertise
Strong foundation in probability and statistical inference
Hypothesis testing, confidence intervals, p-values
Regression analysis (linear, logistic)
Time series analysis (if applicable)
Sampling techniques and bias mitigation
Understanding of statistical assumptions and their impact on models
Ability to compute and interpret derivatives in modeling contexts (rates of change, marginal effects, partial derivatives)
5. Model Development & Evaluation
Develop predictive and classification models using appropriate algorithms
Apply cross-validation and proper train-test splitting
Evaluate models using metrics suited to the task (Accuracy, Precision, Recall, F1-score, and ROC-AUC for classification; RMSE and MAE for regression)
Perform feature selection and feature importance analysis
Detect and mitigate overfitting and underfitting
Monitor model performance and define re-training strategies
6. Statistical & Technical Tools (Expected Proficiency)
Programming: Python and/or R
Libraries: Pandas, NumPy, Scikit-learn, Statsmodels
Visualization: Matplotlib, Seaborn, Tableau or Power BI
Databases: SQL
Big Data Tools: Spark (preferred)
Version control: Git
I look forward to hearing from you.
Best regards,
Ankit Kalia
Technical Recruiter - Metasys Technologies