• Find preferred job with Jobstinger
  • ID: #15823913
  • Job type: Permanent
  • Salary: $80,000 - $140,000
  • Source: Acunor Infotech
  • Date: 2021-06-07
  • Deadline: 2021-08-06

Data Engineer

Pittsburgh, Pennsylvania 15201, USA

Permanent

Vacancy expired!

Note - Visa sponsorship provided.

Role and Responsibilities
  • Design, build, and deploy batch and streaming data pipelines in the data lake using the Hadoop technology stack and tools such as Python, PySpark, Spark Streaming, and Hive
  • Design, build, and deploy error handling, data reconciliation, audit-log monitoring, and job scheduling using PySpark, shell scripting, and cron jobs
  • Extract data from multiple data sources into the raw data layer
  • Decode data from flat files and text files; write Python code to parse raw files and convert them into business variables
  • Develop and implement coding best practices for Spark, Python, and PySpark
  • Collaborate with offshore and onshore teams and communicate status, issues, and risks daily
  • Develop the data model and structure for the data lake to ensure alignment with the data domain, integration needs, and efficient access to the data
  • Analyze existing and future data requirements, including data volumes, data growth, data types, latency requirements, data quality, the volatility of source systems, and analytic workload requirements
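The flat-file decoding task above can be sketched in plain Python. The pipe-delimited layout, field names, and derived "amount" variable here are hypothetical examples, not taken from the posting; a real pipeline would apply the same idea with the actual file specification.

```python
import csv
import io

# Hypothetical raw layout: account_id|txn_date|amount_cents|currency
RAW_FIELDS = ["account_id", "txn_date", "amount_cents", "currency"]

def decode_flat_file(raw_text):
    """Parse a pipe-delimited flat file and derive business variables.

    Returns one dict per valid record containing the raw fields plus a
    derived 'amount' value in whole currency units.
    """
    reader = csv.DictReader(io.StringIO(raw_text),
                            fieldnames=RAW_FIELDS, delimiter="|")
    records = []
    for row in reader:
        cents = row.get("amount_cents")
        # Skip malformed rows rather than failing the whole batch.
        if cents is None or not cents.strip().lstrip("-").isdigit():
            continue
        row["amount"] = int(cents) / 100  # derived business variable
        records.append(row)
    return records

sample = "A100|2021-06-07|12550|USD\nA101|2021-06-08|bad|USD\n"
decoded = decode_flat_file(sample)
```

In a production job the same parsing logic would typically run inside a PySpark transformation, with the rejected rows routed to a reconciliation or audit table instead of being silently dropped.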

Required Skill Set
  • Must-have skills: PySpark, Spark, HDFS, Hadoop, Oozie, shell scripting, Hive, Impala
  • Familiarity with MongoDB, JSON, XML, and other data sources / file formats
  • Hands-on experience with Hadoop ecosystems, including Hive, HDFS, MapReduce, Spark, and Oozie
  • Hands-on experience with scripting/programming languages such as PySpark, Python, and Scala to write batch and streaming jobs
  • Hands-on experience with big data platforms such as Cloudera, Hortonworks, or MapR
  • Experience with ETL and ELT approaches to data ingestion and integration in the Hadoop ecosystem
  • Experience delivering solutions using an Agile approach
  • Deep understanding of data warehouse and data lake design, standards, and best practices
  • Good to have: experience working in the financial services domain
  • Outstanding written and verbal communication skills
