Worldwide | Feb. 8, 2024
Salary: $77,940 - $86,601
The Data Engineer will participate in building large-scale image data processing systems and APIs, and should be able to leverage common image and geodata formats to work with the latest open-source technologies.
A Data Engineer embraces the challenge of handling terabytes, or even petabytes, of data daily in a high-throughput API/microservice ecosystem, and understands how to apply technologies to solve complex problems that bring value to new and existing data processing pipelines. The Data Engineer implements complex projects focused on collecting, parsing, managing, analyzing, and making available large sets of imagery and image data, turning information into insights across multiple platforms, and should be able to develop prototypes and proofs of concept for selected solutions. This role will drive the engineering and building of geospatial data assets to support the image processing pipeline.
Key responsibilities include:
• Designing, building, and supporting cloud and open-source systems that process geospatial data assets via an API-based platform
• Partnering with other internal development communities to bring needed data sets into the asset and make that data available to the enterprise and internal development communities
• Building highly scalable APIs and associated architecture to support thousands of requests per second
• Working across multiple internal and external teams to gather requirements and ensure project development stays aligned with them
• Improving the performance of existing services and identifying the scope for enhancements
• Parsing, managing, analyzing, and making available large sets of data to turn information into insights across multiple platforms
• Working at all stages of the software life cycle: proof of concept, MVP, production, and deprecation
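As a rough illustration of the kind of API-based asset platform the responsibilities above describe, the sketch below models a toy in-memory catalog lookup that returns JSON-serializable asset metadata. All names, IDs, and paths here are hypothetical, not part of the role's actual stack.

```python
# Illustrative sketch only: a toy "asset catalog" lookup of the kind an
# API-based geospatial platform might expose behind an HTTP endpoint.
from dataclasses import dataclass


@dataclass
class ImageAsset:
    asset_id: str
    bbox: tuple  # (min_lon, min_lat, max_lon, max_lat)
    href: str    # hypothetical object-storage location


# Hypothetical catalog contents for illustration.
CATALOG = {
    "scene-001": ImageAsset("scene-001", (-93.7, 41.5, -93.5, 41.7),
                            "s3://example-bucket/scene-001.tif"),
}


def get_asset(asset_id: str) -> dict:
    """Return asset metadata as a JSON-serializable dict, or a 404-style error."""
    asset = CATALOG.get(asset_id)
    if asset is None:
        return {"status": 404, "error": f"unknown asset {asset_id!r}"}
    return {"status": 200, "id": asset.asset_id,
            "bbox": list(asset.bbox), "href": asset.href}
```

In a real service this lookup would sit behind an HTTP framework and a spatial database rather than an in-memory dict.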
Minimum qualifications include:
• BSc degree in a geoscience, Image Science, Computer Science, Engineering, or Physics, or relevant job experience
• At least 3 years of Python development experience
• Proven experience with basic image processing workflows, e.g. classification, vegetation indices
• Experience developing HTTP APIs (Open API, REST, and/or GraphQL) which serve up data in an open-source technology, preferably in a cloud environment
• Ability to build and maintain modern cloud architecture, e.g. AWS, Google Cloud, Azure, etc.
• Experience working with PostgreSQL/PostGIS
• Experience working with cloud object storage, e.g. AWS S3, Google Cloud Storage, etc.
• Experience with code versioning and dependency management systems such as GitHub, SVN, or Maven; Git experience is strongly preferred
• Proven success utilizing Docker or other container tool sets to build, test, and deploy within a CI/CD Environment, preferably using Kubernetes and ArgoCD
• Highly proficient in Python
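The qualifications above mention basic image processing workflows such as vegetation indices. As a minimal sketch, the snippet below computes NDVI, (NIR - Red) / (NIR + Red), over hand-made band values; a real pipeline would read bands from GeoTIFFs (e.g. with a raster library) rather than plain lists.

```python
# Illustrative sketch of a basic vegetation index:
# NDVI = (NIR - Red) / (NIR + Red), computed per pixel.

def ndvi(nir, red):
    """Per-pixel NDVI for two equal-length band sequences; 0.0 where both bands are 0."""
    out = []
    for n, r in zip(nir, red):
        denom = n + r
        out.append((n - r) / denom if denom else 0.0)
    return out


# Hand-made reflectance values for illustration: healthy vegetation
# reflects strongly in NIR, so its NDVI approaches 1.
nir_band = [0.5, 0.8, 0.1]
red_band = [0.1, 0.1, 0.1]
print(ndvi(nir_band, red_band))
```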
Preferred qualifications include:
• MSc in Computer Science or a related field
• 5 or more years of Python development experience
• Experience with stream processing, e.g. Kafka
• Solid understanding of, and experience implementing, the SpatioTemporal Asset Catalog (STAC) specification
• Experience working with STAC catalogs, items, and assets
• Proven experience with Elasticsearch, Amazon OpenSearch, or similar distributed search and analytics solutions
• Proven experience (2 years) with distributed systems, e.g. Argo, Kubernetes, Spark, distributed databases, grid computing
• Proficient (3+ years) working in a command-line environment, e.g. Docker, Argo, K8s, the AWS CLI, gcloud, psql, SSH
• Experience with advanced image processing workflows with LiDAR, multi/hyper-spectral imagery
• Experience with photogrammetry techniques and/or geolocation and mensuration assessment experience
• Experience with raw data handling from Unmanned Aerial Systems (UAS) and/or Satellite image collection platforms
• Proficient with QGIS or similar desktop GIS environments with emphasis in imagery analytics
• Experience with developing HTTP APIs using common Python frameworks (Flask, Django, FastAPI)
• Knowledge of Geoserver and other OGC standard technologies
• Familiarity with agriculture and/or precision agriculture-oriented businesses
• Experience implementing complex data projects focused on collecting, parsing, managing, and delivering large sets of data to turn information into insights across multiple platforms
• Demonstrated experience adapting to new technologies
• Experience with object-oriented design, coding, and testing patterns, as well as with engineering (commercial or open-source) software platforms and large-scale data infrastructures
• Experience creating cloud computing solutions and web applications that can synthesize data from public and private APIs
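Since the preferred qualifications call for experience with the STAC specification, here is a hedged sketch of a minimal STAC Item: per the spec, a GeoJSON Feature carrying an id, geometry, bbox, a properties object with a datetime, links, and assets. The scene ID, coordinates, and storage href below are invented for illustration only.

```python
# Sketch of a minimal STAC Item (a GeoJSON Feature per the SpatioTemporal
# Asset Catalog spec). All identifiers and hrefs are hypothetical.
import json

item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "scene-20240208-001",  # hypothetical scene id
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[-93.7, 41.5], [-93.5, 41.5], [-93.5, 41.7],
                         [-93.7, 41.7], [-93.7, 41.5]]],
    },
    "bbox": [-93.7, 41.5, -93.5, 41.7],
    "properties": {"datetime": "2024-02-08T00:00:00Z"},
    "links": [],
    "assets": {
        "visual": {
            "href": "s3://example-bucket/scene-20240208-001.tif",  # hypothetical
            "type": "image/tiff; application=geotiff; profile=cloud-optimized",
            "roles": ["visual"],
        }
    },
}

# STAC Items serialize directly to GeoJSON.
print(json.dumps(item, indent=2))
```

In practice a library such as pystac is typically used to build and validate Items rather than assembling dicts by hand.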