Hands-on data engineering experience, including building scalable data pipelines, with a strong focus on writing clean, efficient, and maintainable code
Proficiency with relational and non-relational databases (e.g., document/JSON, key-value, and table stores)
Knowledge of normalization and denormalization techniques (e.g., embedding related records in JSON documents)
Understanding of structured, semi-structured, and unstructured data
Experience with distributed, graph, and vector databases (e.g., Azure AI Search, Amazon OpenSearch Service)
Understanding of ACID properties and the CAP theorem
Experience with data lakes and data warehousing
Proficiency in gathering, integrating, and processing data from multiple heterogeneous sources
Familiarity with both ETL and ELT processes
Experience with data migration and replication techniques
Ability to implement data solutions on-premises and in private clouds
Experience with cloud-native pipeline and workflow orchestration on dedicated processing clusters (e.g., Apache Airflow, Apache Spark)
Experience with batch data processing
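The denormalization technique mentioned above can be illustrated with a short sketch. The example below is a minimal, hypothetical illustration (the `customers` and `orders` data and the `denormalize` helper are invented for this sketch): it takes two normalized, key-linked collections and embeds each customer's orders inside the customer document, trading redundancy for read efficiency, as a document store would.

```python
import json

# Normalized form: two separate collections linked by customer_id,
# analogous to two relational tables joined by a foreign key.
customers = {1: {"name": "Ada"}}
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 1, "total": 40.0},
]

def denormalize(customers, orders):
    """Embed each customer's orders inside the customer document."""
    docs = []
    for cid, cust in customers.items():
        doc = dict(cust)
        # Embedding duplicates the link data into one self-contained
        # document, so reads need no join.
        doc["orders"] = [o for o in orders if o["customer_id"] == cid]
        docs.append(doc)
    return docs

docs = denormalize(customers, orders)
print(json.dumps(docs, indent=2))
```

The resulting document answers "show a customer with their orders" in a single read, at the cost of updating embedded copies when order data changes.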
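The ETL and batch-processing requirements above can be sketched as a toy pipeline. This is a minimal illustration under invented assumptions (the `raw_csv` payload, the `sales` table schema, and the extract/transform/load function names are all hypothetical), using only the standard library with an in-memory SQLite database standing in for a warehouse target.

```python
import csv
import io
import sqlite3

# Hypothetical raw batch input; one record is deliberately malformed.
raw_csv = "id,amount\n1,10.5\n2,not_a_number\n3,7.25\n"

def extract(text):
    """Extract: parse the raw CSV batch into dict records."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cast types and drop records that fail validation."""
    clean = []
    for r in rows:
        try:
            clean.append((int(r["id"]), float(r["amount"])))
        except ValueError:
            continue  # skip malformed rows rather than failing the batch
    return clean

def load(rows, conn):
    """Load: bulk-insert the cleaned batch into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(raw_csv)), conn)
print(conn.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone())
# → (2, 17.75)
```

An ELT variant would swap the last two stages: load the raw records first, then run the type-casting and validation as SQL inside the target system.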