Job Overview
A typical day in the life of a Junior Database Engineer/Developer will involve designing, developing, and maintaining a robust and secure database infrastructure to efficiently manage company data. They collaborate with cross-functional teams to understand data requirements and migrate data from spreadsheets or other sources to relational databases or cloud-based solutions like Google BigQuery and AWS. They develop import workflows and scripts to automate data import processes, optimize database performance, ensure data integrity, and implement data security measures. Their creativity in problem-solving and continuous learning mindset contribute to improving data engineering processes. Proficiency in SQL, database design principles, and familiarity with Python programming are key qualifications for this role.
Job Description
-
Design, develop, and maintain the database infrastructure to store and manage company data efficiently and securely.
-
Work with databases of varying scales, from small-scale databases to those involving big data processing.
-
Work on data security and compliance by implementing access controls, encryption, and compliance standards (e.g., GDPR).
-
Collaborate with cross-functional teams to understand data requirements and support the design of the database architecture.
-
Migrate data from spreadsheets or other sources to a relational database system (e.g., PostgreSQL, MySQL) or cloud-based solutions like Google BigQuery.
-
Develop import workflows and scripts to automate the data import process and ensure data accuracy and consistency.
-
Optimize database performance by analyzing query execution plans, implementing indexing strategies, and improving data retrieval and storage mechanisms.
-
Work with the team to ensure data integrity and enforce data quality standards, including data validation rules, constraints, and referential integrity.
-
Monitor database health and identify and resolve issues.
-
Collaborate with the full-stack web developer in the team to support the implementation of efficient data access and retrieval mechanisms.
-
Implement data security measures to protect sensitive information and comply with relevant regulations.
-
Demonstrate creativity in problem-solving and contribute ideas for improving data engineering processes and workflows.
-
Embrace a learning mindset, staying updated with emerging database technologies, tools, and best practices.
-
Explore third-party technologies as alternatives to legacy approaches for efficient data pipelines.
-
Familiarize yourself with tools and technologies used in the team’s workflow, such as Knime for data integration and analysis.
-
Use Python for tasks such as data manipulation, automation, and scripting.
-
Collaborate with the Data Research Engineer to estimate development efforts and meet project deadlines.
-
Assume accountability for achieving development milestones.
-
Prioritize tasks to ensure timely delivery, in a fast-paced environment with rapidly changing priorities.
-
Collaborate with and assist fellow members of the Data Research Engineering Team as required.
-
Perform tasks with precision and build reliable systems.
-
Make effective use of online resources such as Stack Overflow, ChatGPT, and Bard, while considering their capabilities and limitations.
Qualifications
-
Bachelor’s degree in Computer Science, Information Systems, or a related field is desirable but not essential.
-
Knowledge of database development and administration concepts, especially with relational databases like PostgreSQL or MySQL.
-
Knowledge of SQL and understanding of database design principles, normalization, and indexing.
-
Knowledge of data migration, ETL (Extract, Transform, Load) processes, or integrating data from various sources.
-
Knowledge of cloud-based databases, such as Google BigQuery and AWS RDS.
-
Eagerness to develop import workflows and scripts to automate data import processes.
-
Ideally, familiarity with Knime or similar tools for data integration and analysis.
-
Knowledge of data security best practices, including access controls, encryption, and compliance standards (e.g., GDPR, HIPAA).
-
Strong problem-solving and analytical skills with attention to detail.
-
Creative and critical thinking.
-
Strong willingness to learn and expand knowledge in data engineering.
-
Knowledge of Python programming, including data manipulation and automation, and of modules such as Pandas, SQLAlchemy, gspread, PyDrive, and PySpark.
-
Familiarity with Agile development methodologies is a plus.
-
Familiarity with Docker containers or similar technologies is a plus.
-
Experience with version control systems, such as Git, for collaborative development.
-
Ability to thrive in a fast-paced environment with rapidly changing priorities.
-
Ability to work collaboratively in a team environment.
-
Effective communication skills.
-
Comfortable with autonomy and able to work independently.