Data Engineer Interview Questions

A data engineer is an IT professional found in almost every industry. They track data developments and trends to guide the company's future strategies. A key part of the job is turning raw data into usable data by building data pipelines and data systems.

Common data engineer interview questions and how to answer them

Question 1: Describe in detail your level of expertise with programming languages.

How to answer: Before the interview, review your résumé and list the programs you have mastered. If you find that you do not know a piece of software the company relies on heavily, emphasize your motivation and your willingness to learn it.

Question 2: Explain what data engineering means to you.

How to answer: Highlight your role within the company and relative to other roles such as data scientist, so that your contribution is clearly defined. Explain the difference between an engineer focused on databases and an engineer focused on data pipelines.

Question 3: What is your experience with cloud data management and with Apache Hadoop?

How to answer: Research the cloud data management software the company uses (in particular Apache Hadoop). A data engineer must be proficient in the programming languages and data management systems commonly used in the industry, including Apache Hadoop.

20,186 data engineer interview questions shared by candidates

I want you to write me a simple spell checking engine. The query language is a very simple regular expression-like language with one special character: . (the dot character), which means EXACTLY ONE character (it can be any character). So, for example, 'c.t' would match 'cat', as the dot matches any character. There may be any number of dot characters in the query (or none). Your spell checker will have to be optimized for speed, so you will have to write it in the required way. There will be a one-time setUp() function that does any pre-processing you require, and then there will be an isMatch() function that should run as fast as possible, utilizing that pre-processing. There are some examples below; feel free to ask for clarification.

Word List: [cat, bat, rat, drat, dart, drab]

Queries:
cat -> true
c.t -> true
.at -> true
..t -> true
d..t -> true
dr.. -> true
... -> true
.... -> true
..... -> false
h.t -> false
c. -> false

// write a function
// Struct setup(List<String> list_of_words)
// Do whatever processing you want here
// with reasonable efficiency.
// Return whatever data structures you want.
// This function will only run once

// write a function
// bool isMatch(Struct struct, String query)
// Returns whether the query is a match in the
// dictionary (True/False)
// Should be optimized for speed

Data Engineer

Interviewed at Meta

3.6
May 22, 2020

Question 2: Fill in the blanks

Given an array containing None values, fill in the None values with the most recent non-None value in the array.

For example:
- input array: [1,None,2,3,None,None,5,None]
- output array: [1,1,2,3,3,3,5,5]

Data Engineer

Interviewed at Meta

3.6
Jun 8, 2020

Python questions:
1. Replace each None value in a list with the previous value present in the list.
2. Given a dictionary, print the key for the nth highest value present in the dict. If more than one record has the nth highest value, sort the keys and print the first one.
3. Given two sentences, print the words that appear in only one of the two sentences. (If a word is present twice in the 1st sentence but not present in the 2nd sentence, you have to print that word too.)
4. I forgot the other question.
You have to pass all the test cases, especially the edge cases.

SQL questions:
1. Mostly percentage calculations; also refer to the questions available here on Glassdoor.

Next will be my onsite interview of 3.30hr. If anyone can help me with that, please do; otherwise I will update later on that interview as well.
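
A rough Python sketch of the first three Python questions follows. The function names are made up for this example, and several details the post leaves open are assumptions: a leading None stays None, "nth highest" ranks distinct values, and sentences are split on whitespace with case-sensitive comparison and each word reported once.

```python
from typing import Dict, List, Optional


def fill_blanks(values: List[Optional[int]]) -> List[Optional[int]]:
    """Replace each None with the most recent non-None value.

    Assumption: a None before any real value is left as None.
    """
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def key_for_nth_highest(scores: Dict[str, int], n: int) -> Optional[str]:
    """Return the smallest key whose value is the nth highest distinct value.

    Assumption: ties share a rank, and None is returned when n is out of range.
    """
    distinct = sorted(set(scores.values()), reverse=True)
    if n < 1 or n > len(distinct):
        return None
    target = distinct[n - 1]
    return min(key for key, value in scores.items() if value == target)


def words_in_only_one(first: str, second: str) -> List[str]:
    """Return words found in exactly one sentence, in order of first appearance."""
    words_a, words_b = first.split(), second.split()
    set_a, set_b = set(words_a), set(words_b)
    seen = set()
    result = []
    for word in words_a + words_b:
        if (word in set_a) != (word in set_b) and word not in seen:
            seen.add(word)
            result.append(word)
    return result


if __name__ == "__main__":
    print(fill_blanks([1, None, 2, 3, None, None, 5, None]))
    print(key_for_nth_highest({"a": 5, "b": 7, "c": 5, "d": 1}, 2))
    print(words_in_only_one("red red blue", "blue green"))
```

The edge cases the post warns about are worth checking by hand: an empty list, a list that starts with None, an n larger than the number of distinct values, and empty sentences.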

Data Engineer

Interviewed at Meta

3.6
Mar 9, 2021

You have a 2-D array of friends like [[A,B],[A,C],[B,D],[B,C],[R,M],[S],[P],[A]]. Write a function that creates a dictionary of how many friends each person has. People can have zero to many friends. However, there won't be repeated relationships like [A,B] and [B,A], and there won't be more than 2 people in a relationship.
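
A minimal Python sketch of one way to count friends, assuming each inner list holds either one name (a person listed alone) or two names (a friendship). The function name `count_friends` is hypothetical.

```python
from typing import Dict, List


def count_friends(relationships: List[List[str]]) -> Dict[str, int]:
    """Map each person to their number of friends.

    Single-element entries register a person without adding a friend,
    so people with no friends still appear with a count of 0.
    """
    counts: Dict[str, int] = {}
    for relation in relationships:
        # Make sure everyone mentioned has an entry, even with 0 friends.
        for person in relation:
            counts.setdefault(person, 0)
        # A pair is one friendship, counted once for each side.
        if len(relation) == 2:
            first, second = relation
            counts[first] += 1
            counts[second] += 1
    return counts


if __name__ == "__main__":
    data = [["A", "B"], ["A", "C"], ["B", "D"], ["B", "C"],
            ["R", "M"], ["S"], ["P"], ["A"]]
    print(count_friends(data))
```

Because the prompt rules out repeated relationships, a plain counter is enough here; if duplicates were possible, storing a set of friends per person would be one way to avoid double counting.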

Data Engineer

Interviewed at Meta

3.6
Oct 31, 2018

SQL questions: a table schema with tables like employee, department, employee_to_projects, projects.
1) Select employees from departments where the max salary of the department is 40k
2) Select employees assigned to projects
3) Select the employees who have the max salary in a given department
4) Select the employee with the second highest salary
5) A table has two data entries every day, for the # of apples and oranges sold. Write a query to get the difference between the apples and oranges sold on a given day
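
As a hedged illustration of questions 4 and 5, the sketch below runs two candidate queries against an in-memory SQLite database from Python. The post does not give the actual schema, so the column names (`name`, `department`, `salary`, `sale_date`, `fruit`, `sold`), the `sales` table, and all sample rows are invented for this example.

```python
import sqlite3
from typing import List, Optional

# Employees whose salary equals the second highest distinct salary.
SECOND_HIGHEST_SQL = """
    SELECT name
    FROM employee
    WHERE salary = (
        SELECT MAX(salary) FROM employee
        WHERE salary < (SELECT MAX(salary) FROM employee)
    )
    ORDER BY name
"""

# Apples sold minus oranges sold on one day.
FRUIT_DIFF_SQL = """
    SELECT SUM(CASE WHEN fruit = 'apples' THEN sold ELSE 0 END)
         - SUM(CASE WHEN fruit = 'oranges' THEN sold ELSE 0 END)
    FROM sales
    WHERE sale_date = ?
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER);
        INSERT INTO employee VALUES
            ('Ann', 'eng', 50000), ('Bob', 'eng', 40000),
            ('Cid', 'ops', 40000), ('Dee', 'ops', 30000);
        CREATE TABLE sales (sale_date TEXT, fruit TEXT, sold INTEGER);
        INSERT INTO sales VALUES
            ('2016-05-01', 'apples', 10), ('2016-05-01', 'oranges', 4),
            ('2016-05-02', 'apples', 3), ('2016-05-02', 'oranges', 8);
    """)
    return conn


def second_highest_paid(conn: sqlite3.Connection) -> List[str]:
    return [row[0] for row in conn.execute(SECOND_HIGHEST_SQL)]


def fruit_difference(conn: sqlite3.Connection, day: str) -> Optional[int]:
    # SUM over zero rows yields NULL, so an unknown day returns None.
    return conn.execute(FRUIT_DIFF_SQL, (day,)).fetchone()[0]


if __name__ == "__main__":
    db = build_demo_db()
    print(second_highest_paid(db))
    print(fruit_difference(db, "2016-05-01"))
```

The nested MAX subquery handles ties at the top salary, which a plain ORDER BY with LIMIT and OFFSET would not. A window function such as DENSE_RANK is another common way to answer question 4 where the database supports it.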

Data Engineer

Interviewed at Meta

3.6
May 24, 2016
