Careers

Latest Job Postings

Big Data Dev Engineer

Location
Wayne, NJ
Salary
Commensurate with experience
Job Type
Direct Hire
Date
Dec 03, 2018
Job ID
2643472

Spearhead Staffing has a permanent hire opportunity for a Big Data Engineer in Wayne, NJ 07470. Salary depends upon experience level. Please read full job description for required skills.
Local candidates within daily commuting distance preferred; no relocation assistance is available. Work is onsite.

This position is not open to third-party corp-to-corp (C2C) agencies and provides no visa sponsorship. All applicants must have permanent US work authorization and must not require visa sponsorship now or in the future.

To apply, please send a Word-format resume, referencing Job# JOS316, to resume@spearheadstaffing.com

The Big Data Development Engineer plays a critical role in realizing the enterprise data and advanced analytics strategy, and must be comfortable working across the full spectrum of data engineering disciplines: database architecture, database design, and implementation of analytic solutions on a Cloud-based Cloudera Big Data ecosystem.

Responsibilities include but are not limited to:

  • Build scalable databases for the consumption of structured and unstructured data; work with external data providers to maximize the value of competitor data across the company.
  • Import and export data between an external RDBMS and the Hadoop Cloudera cluster, including the ability to import specific subsets, change the delimiter and file format of imported data during ingest, and alter data access patterns or privileges.
  • Automate data pipelines from various systems to support advanced analytics and machine learning.
  • Ingest real-time and near-real-time (NRT) streaming data into HDFS, including the ability to distribute to multiple data sources and convert data from one format to another on ingest.
  • Convert data from one file format to another; evolve an Avro or Parquet schema for performance optimization.
  • Write output data with compression.
  • Denormalize data from multiple disparate data sets.
  • Develop RESTful web services.
  • Write queries to filter data or to join multiple data sets.
  • Read and/or create a Hive or an HCatalog table from existing data in HDFS.
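
To illustrate the denormalization and query responsibilities above, here is a minimal, toy-scale sketch in plain Python (one of the required languages). It shows the hash-join pattern — build a lookup on the join key, probe it to emit flat records, then filter — that a Spark or Hive job would apply at cluster scale. All table names and values below are hypothetical, not drawn from the posting.

```python
# Two disparate "data sets" (made-up sample records).
customers = [
    {"customer_id": 1, "name": "Acme Corp", "region": "NJ"},
    {"customer_id": 2, "name": "Globex", "region": "NY"},
]
orders = [
    {"order_id": 101, "customer_id": 1, "amount": 250.0},
    {"order_id": 102, "customer_id": 2, "amount": 99.5},
    {"order_id": 103, "customer_id": 1, "amount": 410.0},
]

# Build side of the hash join: index customers on the join key.
by_id = {c["customer_id"]: c for c in customers}

# Probe side: emit one flat, denormalized record per order.
denormalized = [
    {**o,
     "name": by_id[o["customer_id"]]["name"],
     "region": by_id[o["customer_id"]]["region"]}
    for o in orders
]

# Filter the joined result, analogous to a SQL WHERE clause.
nj_orders = [r for r in denormalized if r["region"] == "NJ"]
print(len(nj_orders))                        # 2
print(sum(r["amount"] for r in nj_orders))   # 660.0
```

At scale the same logic would typically be a Spark DataFrame `join` followed by a `filter`, with the engine handling partitioning and shuffles.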

Required Skills:

  • Ability to communicate, both verbally and in writing, with employees in the business units using non-technical language.
  • Very strong knowledge of Data warehousing best practices for optimal performance in an MPP environment.
  • Ability to develop in Python and/or Scala.
  • Strong experience utilizing Spark and Hadoop.
  • Strong experience utilizing SQL and NoSQL databases.
  • Strong programming skills in Spark, Java or similar.
  • Knowledge of Apache Avro and Apache Parquet.
  • Experience with Machine Learning and Computational Statistics.
  • Strong experience with the relational databases backing Cloudera Enterprise deployments: the Cloudera Manager databases, Hive Metastore, Hue database, and Oozie database.
  • Knowledge of Microsoft Azure.
  • Knowledge of Impala with Kudu (interactive SQL).
  • Familiarity with BI visualization tools such as Microsoft Power BI.
  • Experience creating a data-driven culture and impactful data strategies.
  • Knowledge of integration concepts as they relate to sourcing data from disparate sources.

Required Experience:

  • B.S. in Computer Science or a related IT degree, with a minimum of 5 years of IT experience designing and developing application programs using Python or Scala.
  • Minimum of 3 years working with Big Data projects.
  • Must have experience developing data engineering solutions.
  • Completion of Cloudera and Hadoop developer training courses, working toward certification.
  • Knowledge of Trifacta data-wrangling software and Cloudera Spark/Hadoop workflow experience preferred.
  • Cloudera CCP Data Engineer certification preferred.