Software Engineer - Data

Location: Northern Suburbs of Chicago, IL
Date Posted: 05-29-2018
Position Type: Contract-to-Hire / Direct Hire

Our client is looking to hire a software engineer with either Java/Apache/Big Data skills or Python/R/Text Analytics skills. A candidate who combines the two would be highly desirable.

The role involves working closely with others, frequently in a matrixed environment, with little supervision. It calls for a self-starter who is proficient in problem solving, stays well-informed of technological advancements, and puts new innovations into effective practice. You'll also be responsible for developing high-performance, distributed computing tasks using Big Data technologies such as Hadoop, NoSQL, and text mining.

Requirements

  • Bachelor’s degree in Computer Science or a related field of study, or equivalent work experience in a Big Data / Text Analytics environment
  • Understanding of best practices and standards for Hadoop Distributed File System (HDFS)
  • 2 years of hands-on experience with at least three of the following components:
    • Hadoop HDFS
    • Web development:
      Experience in REST API development
      Knowledge of standard web technologies (HTTP, HTTPS, HTML5)
    • Object Oriented Programming using Java
    • Scripted Programming (Python, R)
    • Experience with data serialization: JSON or Avro
    • Experience with at least one NoSQL technology: Cassandra, MongoDB, or Graph Models
    • Experience with messaging: Kafka, Flume, or Storm
  • Basic understanding of building machine learning applications, machine learning APIs, tools, and open source libraries
  • Experienced and comfortable with unstructured data extraction, preparation, and processing
  • Basic data modeling experience using Big Data Technologies
  • Basic knowledge of Data Science concepts such as Machine Learning (Clustering, Decision Trees), Deep Learning, Neural Networks, Natural Language Processing (NLP) and basic applied statistics (Mean/Standard Deviation/Correlation/GLM Regression)
  • A quick learner who is willing to research answers to questions independently and comfortable attempting solutions on your own

Responsibilities

  • Building and supporting a NoSQL and Hadoop-based ecosystem designed for enterprise-wide analysis of structured, semi-structured, and unstructured data
  • Writing code, completing programming and documentation, and testing and debugging applications using Big Data programming languages and technologies
  • Analyzing, designing, programming, debugging, and modifying software enhancements and/or new products used in distributed, large-scale analytics and visualization solutions
  • Interacting with data scientists and industry experts to understand how data needs to be converted, loaded, and presented
  • Working with data scientists to integrate machine learning and statistical models into products for clients (prior experience here is a plus)
  • Working with Hadoop/Spark clusters
  • Supporting regular requests to move data from one cluster to another
  • Bringing new data sources into HDFS, then transforming and loading them into databases
  • Working collaboratively with data scientists and business and IT leaders throughout the company to understand Big Data needs and use cases
  • Utilizing Machine Learning frameworks for the next generation of applications and platforms, using the latest Big Data technologies for large-scale enterprise applications
  • Developing and maintaining system documentation for new and existing applications
  • Developing large-scale RESTful web services
  • Collaborating with cross-functional teams - business stakeholders, engineers, program management, project management, etc. - to produce the best solutions possible
  • Striving for continuous improvement of code quality and development practices
  • Delivering results through collaboration
  • Translating functional and technical requirements into detailed designs
  • Scaling up machine learning models that are creating business value into highly automated products that act as supply chain decision support systems
  • Following best-in-class software development practices, such as agile workflow management, and leveraging platforms such as Jira to ensure quality and timeliness of product delivery
  • Developing code in Python, R, or other state-of-the-art languages to scale both supervised and unsupervised learning models, and building the associated data flow pipelines
  • Identifying creative ideas for integrating machine-learning-based solutions into business processes to optimize business performance metrics

Attitude and Aptitude for Learning

Our client is willing to consider individuals who have little to no experience with NoSQL data persistence, such as graph models (Neo4j), document data repositories (MongoDB), or columnar data repositories (HBase), as they will train the person they hire in the team's culture and their own approach. Paramount to their consideration will be your enthusiasm to learn and your willingness to be a hands-on self-starter.