DATA ENGINEER
ABOUT THE JOB
-
As a data engineer for OceanSMART, you will build and maintain data pipelines that acquire both real-time and static data from our external suppliers. You will make this data available to our development teams so that it can be exposed through our Software as a Service solution.
-
Our real-time and batch processing pipelines process a daily load of over 200 million messages and derive insights from billions of historical records.
WE ARE LOOKING FOR A PERSON WHO WILL
-
Design and maintain reliable, scalable, and efficient data processing pipelines that process real-time data using event-driven approaches.
-
Design, develop, and maintain highly scalable Kafka stream-processing microservices using Java and the Quarkus framework, and deploy them on Kubernetes.
-
Use Apache Spark with Scala/Python to process historical data (billions of records, terabytes in size) and derive optimizations and insights.
-
Take responsibility for data modelling in RDBMS, in NoSQL data stores such as HBase and MongoDB, and in archive data stores on Azure Blob Storage or ADLS Gen2.
-
Take care of monitoring and alerting.
-
Identify opportunities and explore ways to enhance the data infrastructure to improve data acquisition, data processing, and data visualization.
-
Collaborate with DevOps on data pipeline setup and configuration.
Personal Traits
-
Being open-minded and curious
-
Speaking up when solutions are not ideal and communicating suggested solutions to cross-functional peers.
-
Ability to work independently: as the first brick of the team, you will find many opportunities but also many challenges.
-
A true team player with great communication and interpersonal skills.
A POTENTIAL CANDIDATE WILL HAVE
Required Experience:
-
Bachelor's or master's degree in Computer Science, Software Engineering, Information Technology, or a related technical field.
-
3+ years' experience as a Data Engineer or in a similar role.
-
Understand the big picture
-
Experience with big-data technologies such as Apache Kafka and Apache Spark.
-
Experience with JVM languages (Java or Scala) and Python.
-
Good at multi-threading and atomic operations.
-
Computation framework: Spark (Azure Batch & Databricks)
-
Databases:
-
SQL: PostgreSQL (PostGIS), SQL Server, TimescaleDB
-
NoSQL: HBase, Cassandra, MongoDB
-
Object data stores: Azure Blob Storage, including its different storage tiers
-
Mapping: GeoServer
-
Understanding of how to design for resilience, fault tolerance, high availability, and high scalability.
Desirable
-
Hands-on experience with popular data platforms such as Azure, AWS, or GCP, using event-driven microservice architectures for loosely coupled, highly scalable designs.
-
Experience in performance tuning/optimizing Big Data programs.
-
Familiarity with the Scrum framework and Scrum with Kanban.
SOME OF OUR BENEFITS
-
Great salary package. Annual performance review.
-
13th-salary bonus for all staff.
-
Premium healthcare insurance (2 sponsored packages): one for you and one for your spouse, child, or parent.
-
Annual health check-up for all staff.
-
Annual Loyalty Award packages (3mil - 5mil - 10mil), 5-year Award, and 10-year Award.
-
Good career advancement opportunities.
-
Product-oriented. Agile project management style. Dynamic and English-speaking working environment.
-
Opportunity to acquire technical knowledge and experience in the latest technologies.
-
Up to 18 days of annual leave a year, plus 5 sponsored extra days.
-
Working hours: 8 hours x 5 days/week (Monday to Friday), with a 30-minute break at 4 PM every day.
-
And so much more!
If this interests you, please contact us for a coffee so we can tell you more and learn more about you.