Data Engineer

Cincinnati, Ohio

Employer: System One Holdings, LLC

Industry:

Salary: Competitive

Job type: Full-Time

Title: Data Engineer
Location: ALTA is supporting a 3-month contract opportunity with a client located in the Cincinnati, OH area.

No C2C available.

Hybrid position. Local Cincinnati candidates only (can work from home, but must be able to report to the office when asked).

Description:

Data Pipeline Development: Design, develop, and maintain robust data pipelines using Databricks to process and transform large volumes of data.
ETL Process Management: Implement ETL (Extract, Transform, Load) processes to integrate data from various sources into Databricks, ensuring data quality and integrity.
Data Integration: Integrate Databricks with other data storage solutions and data lakes, ensuring seamless data flow and accessibility.
Performance Optimization: Optimize data processing and query performance within Databricks to ensure efficient data retrieval and processing.
Data Analysis and Visualization: Utilize Databricks to perform complex data analysis and create visualizations to support data-driven decision-making.
Collaborate with Data Scientists and Analysts: Work closely with data scientists and analysts to understand their requirements and provide the necessary infrastructure and tools within Databricks.
Security and Compliance: Ensure that data processing within Databricks complies with organizational security policies and industry regulations, implementing necessary security measures. This includes setting up encryption, managing network security configurations, and performing regular security audits.
Monitoring and Troubleshooting: Monitor data pipelines and workflows for performance issues or errors, and troubleshoot any problems that arise to maintain smooth operations.
Cluster Management: Manage the creation, configuration, and scaling of Databricks clusters to ensure optimal performance and cost-efficiency. This includes monitoring cluster usage, resource allocation, and ensuring high availability.
User and Access Management: Implement and manage user access controls, ensuring that only authorized personnel have access to Databricks resources. This involves setting up role-based access controls (RBAC), managing permissions, and integrating with identity management systems.
Backup and Disaster Recovery: Develop and implement backup and disaster recovery plans for Databricks environments. Ensure that data and configurations are regularly backed up and that there are clear procedures in place for restoring services in the event of a failure.

Required Qualifications:

Experience with Databricks: Hands-on experience with Databricks, including familiarity with its architecture, features, and services.
Proficiency in Spark: Strong knowledge of Apache Spark, including Spark SQL, Spark Streaming, and Spark MLlib, as Databricks is built on Spark.
Programming Languages: Proficiency in programming languages commonly used in data engineering such as Python, Scala, SQL, and Java.
Data Warehousing and ETL: Experience with data warehousing concepts, ETL processes, and tools like Apache Airflow, Talend, or Informatica.
Database Management: Knowledge of relational and NoSQL databases, data modeling, and query optimization.
Big Data Technologies: Familiarity with big data technologies and ecosystems, including Hadoop, Hive, and Kafka.
Data Analysis: Ability to perform complex data analysis and create data visualizations to support business decisions.
Problem-Solving: Strong analytical and problem-solving skills to troubleshoot and resolve issues in data pipelines and workflows.
Communication Skills: Excellent verbal and written communication skills to collaborate with data scientists, analysts, and other stakeholders.
Team Collaboration: Ability to work effectively in a team environment and contribute to cross-functional projects.

Certifications (Optional but Beneficial):
• Databricks Certifications: Certifications such as Databricks Certified Associate Developer for Apache Spark or Databricks Certified Professional Data Scientist can demonstrate expertise and enhance job prospects.
• Cloud Certifications: Certifications from cloud providers (e.g., Azure Certified Solutions Architect, Azure Data Engineer) can be advantageous.

Work Experience:

Prior experience working in data engineering, data analytics, or a related field is often required. This includes experience in building and maintaining data pipelines, ETL processes, and data integration.

Job Responsibilities:

Design, build, refactor, and maintain data pipelines using Microsoft Azure, SQL, Azure Data Factory, Azure Synapse, Databricks, Python, and PySpark to meet business requirements for reporting, analysis, and data science
Teach, adhere to, and contribute to DataOps and MLOps standards and best practices to accelerate and continuously improve data system performance
Design and integrate fault tolerance and enhancements into data pipelines to improve quality and performance
Perform root cause analysis and solve problems using analytical and technical skills to optimize data delivery and reduce costs
Mentor more junior Data Engineers

Job Requirements:

Demonstrated experience with Microsoft Azure, SQL, Azure Data Factory, Azure Synapse, Databricks, Python, PySpark, SAP Datasphere, Power BI or other cloud-based data systems
Demonstrated experience with Azure DevOps, GitHub, CI/CD
Demonstrated experience with database storage systems such as cloud, relational, mainframe, data lake, and data warehouse
Demonstrated experience building cloud ETL pipelines using code or ETL platforms, utilizing database connections, APIs, or file-based sources
Demonstrated experience with data warehousing concepts and agile methodology
Demonstrated experience designing and coding data manipulations applying processing techniques to extract value from large, disconnected datasets
Demonstrated continuous learning to upskill in data engineering techniques and business acumen

Education and Experience:

Bachelor's or Master's degree in computer science, software engineering, information technology or equivalent combination of data engineering professional experience and education
3+ years of proven Data Engineering experience

Created: 2024-10-19

Reference: 327664

Country: United States

State: Ohio

City: Cincinnati

ZIP: 45219
