Big Data Engineer

If you possess mastery in any of the roles or skills below, you can apply to this TalentCloud. Once you become an approved Experfy TalentCloud member, you will get exclusive access to jobs and project opportunities from our clients.

Popular Big Data Engineer Roles in this TalentCloud

  • Big Data Engineer / Developer
  • Cassandra Engineer / Developer
  • Redis Engineer / Developer
  • MongoDB Engineer / Developer
  • Graph Database Engineer / Developer
  • DynamoDB Engineer / Developer

TalentCloud Description

Big Data Engineers in this TalentCloud are required to have extensive experience in building end-to-end data platforms, data pipelines, and data flows. This experience should span data ingestion and integration, storage, transformation, processing, deployment, operations, and cataloging.

Engineers in this TalentCloud should be able to work closely with big data architects, big data developers, data scientists, and DevOps and DataOps engineers to design and develop a platform capable of executing operational, analytic, and predictive workloads that serves thousands of applications and supports machine learning deployment and inference.

Required Skills

  • Extensive experience as a data engineer or database developer, building data-driven applications
  • Good understanding of distributed systems and distributed databases
  • Experience with ETL/ELT development, batch processing, and stream processing
  • Familiarity with frameworks such as Spark and Kafka and the tooling around them (a minimal sketch follows this list)
  • Understanding of big data ingestion, integration, storage, and processing; transformation/ETL tools; and data formats for storage
  • Ability to debug, troubleshoot, and optimize data solutions across the big data ecosystem, including tools like Spark, Presto, Hive, and Kafka, along with NoSQL and relational databases and data warehouses
  • Experience working with SQL engines on large datasets – Presto, Impala, Dremio, SparkSQL, Hive, Drill, Druid, and others
  • Knowledge of and experience working with DevOps and DataOps teams, collaborating with them to develop processes and automate deployments
  • Programming experience in one or more of Java, Scala, or Python
  • Expertise in both intermediate- and advanced-level SQL query development (see the SQL sketch after this list)
  • Ability to understand and work with complex datasets and build solutions around them through data modeling
  • Collaboration with other team members – business analysts, data analysts, and data stewards – to understand requirements and build solutions
  • Ability, passion, and aptitude for learning new programming and query languages, and applying them to build data solutions
  • Good understanding of tools in the DevOps ecosystem, with a basic understanding of Docker and CI/CD processes
  • Good level of expertise working with Git
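
As a minimal sketch of the batch and stream processing experience described above: the PySpark job below ingests a raw CSV extract, cleans and partitions it, then consumes a Kafka topic with Structured Streaming. It assumes a Spark 3.x session with the Kafka connector on the classpath; the bucket paths, topic name, and column names are all hypothetical placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Batch ETL: ingest raw CSV (hypothetical path and columns), clean it,
# and write partitioned Parquet to a curated zone.
raw = spark.read.option("header", "true").csv("s3a://raw-bucket/orders/")
cleaned = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropDuplicates(["order_id"])
)
cleaned.write.mode("overwrite").partitionBy("order_date").parquet("s3a://curated-bucket/orders/")

# Stream processing: consume a Kafka topic and maintain per-minute event counts.
stream = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
         .option("subscribe", "events")                     # placeholder topic
         .load()
)
counts = (
    stream.groupBy(F.window(F.col("timestamp"), "1 minute"))  # Kafka source exposes `timestamp`
          .count()
)
counts.writeStream.outputMode("complete").format("console").start().awaitTermination()
```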
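
And for the intermediate-to-advanced SQL development listed above, a window-function query of the kind the SQL engines named here (Presto, SparkSQL, Hive, and others) all support, run through SparkSQL. The table and column names are again placeholders, reusing the curated output from the previous sketch.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-sketch").getOrCreate()

# Register the curated Parquet output (hypothetical path) as a temporary view.
spark.read.parquet("s3a://curated-bucket/orders/").createOrReplaceTempView("orders")

# Window function: rank each customer's orders by amount and keep the top three.
top_orders = spark.sql("""
    SELECT customer_id, order_id, amount
    FROM (
        SELECT customer_id,
               order_id,
               amount,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id
                   ORDER BY amount DESC
               ) AS rn
        FROM orders
    ) ranked
    WHERE rn <= 3
""")
top_orders.show()
```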

Preferred Skills

  • Experience with data warehousing, data modeling, data marts, data virtualization, and MPP-based engines like Redshift, Vertica, BigQuery, Snowflake, etc.
  • Experience with relational databases – Postgres, MySQL, MariaDB, Oracle, etc.
  • Working experience with one or more NoSQL databases, with the ability to develop a data model in at least one of the main types (see the document-model sketch after this list):
      • Key-value stores – Redis, DynamoDB, Riak
      • Document databases – MongoDB, CouchDB, Couchbase
      • Graph databases – Neo4j
      • Wide-column databases – Cassandra, HBase, Scylla
      • Time-series databases – InfluxDB, TimescaleDB
      • Search engines and databases – Elasticsearch, Solr
      • In-memory databases or in-memory grids – Apache Ignite, GridGain, etc.
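
To make the NoSQL data-modeling expectation concrete, here is a small document-model sketch for MongoDB (one of the roles this TalentCloud lists), using pymongo. The connection string, database, and field names are assumptions; the point is the embed-versus-reference decision that document-database modeling turns on.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder connection string
db = client["shop"]                                # placeholder database name

# Modeling choice: embed order line items inside the order document
# (always read together, bounded in size), but reference the customer
# by id (shared across many orders, updated independently).
order = {
    "_id": "order-1001",
    "customer_id": "cust-42",  # reference into a separate customers collection
    "status": "shipped",
    "items": [                 # embedded: retrieved with the order in one read
        {"sku": "SKU-1", "qty": 2, "unit_price": 9.99},
        {"sku": "SKU-7", "qty": 1, "unit_price": 24.50},
    ],
}
db.orders.insert_one(order)

# The query the embedded model makes cheap: one round trip, no join.
doc = db.orders.find_one({"_id": "order-1001"})
print(doc["items"])
```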