Have you ever had any of your models run in real time?
Senior Data Engineer Interview Questions
2,380 senior data engineer interview questions shared by candidates
Strengths, weaknesses, current project, and why you want to join us.
Tell us about yourself.
Describe a project you worked on: talk about the technologies used, the challenge, and the goals to be achieved.
Tell me about yourself?
Data flow and architectures of the projects I worked on
Finding Top N within nested Categories
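One way to approach a "top N within nested categories" question is to group rows by the full category path and keep the N largest per group. The following is a minimal sketch in plain Python, not the interviewer's expected answer (which may well have been SQL window functions or Spark). The function name `top_n_per_group`, the two-level (category, subcategory) nesting, and the sample rows are all hypothetical.

```python
import heapq
from collections import defaultdict


def top_n_per_group(rows, n):
    """Return the n highest-scoring items for each (category, subcategory) pair."""
    groups = defaultdict(list)
    for category, subcategory, item, score in rows:
        groups[(category, subcategory)].append((score, item))
    # heapq.nlargest returns each group's entries ordered from largest to smallest.
    return {
        key: [item for _score, item in heapq.nlargest(n, scored)]
        for key, scored in groups.items()
    }


# Hypothetical sample data: (category, subcategory, item, score)
rows = [
    ("electronics", "phones", "p1", 90),
    ("electronics", "phones", "p2", 75),
    ("electronics", "phones", "p3", 82),
    ("electronics", "laptops", "l1", 60),
    ("books", "fiction", "b1", 40),
    ("books", "fiction", "b2", 55),
]

top2 = top_n_per_group(rows, 2)
```

Groups with fewer than N rows simply return everything they have. In SQL the same idea is usually expressed with a ranking window function partitioned by the category columns.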
How can this query be optimized? SELECT * FROM table
Question 1) Given input.csv, find the output. The file and the desired output are given below.

input.csv:
username, mobile
user1,999999991:888888882
user3,777777771
user2,777777234:823232351
user5,734452343:943433434:834323434
user1,999999991:9994433777

output:
user1:3
user2:2
user3:1

Question 2) How can we read a CSV file into a DataFrame?
Question 3) Option to modify the encoding while reading a file in Scala
Question 4) Option to modify the timestamp while reading a file
Question 5) How to introduce separators like "," while reading a file
Question 6) How to infer the schema
===============================
Question 7) We have the two tables below; find the users who visited a bank but didn't make any transactions.

-- Visits table:
-- +---------+------------+
-- | user_id | visit_date |
-- +---------+------------+
-- | 1       | 2020-01-01 |
-- | 2       | 2020-01-02 |
-- | 12      | 2020-01-01 |
-- | 19      | 2020-01-03 |
-- | 1       | 2020-01-02 |
-- | 2       | 2020-01-03 |
-- | 1       | 2020-01-04 |
-- | 7       | 2020-01-11 |
-- | 9       | 2020-01-25 |
-- | 8       | 2020-01-28 |
-- +---------+------------+

-- Transactions table:
-- +---------+------------------+--------+
-- | user_id | transaction_date | amount |
-- +---------+------------------+--------+
-- | 1       | 2020-01-02       | 120    |
-- | 2       | 2020-01-03       | 22     |
-- | 7       | 2020-01-11       | 232    |
-- | 1       | 2020-01-04       | 7      |
-- | 9       | 2020-01-25       | 33     |
-- | 9       | 2020-01-25       | 66     |
-- | 8       | 2020-01-28       | 1      |
-- | 9       | 2020-01-25       | 99     |
-- +---------+------------------+--------+
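Question 1 and Question 7 above can both be sketched in a few lines. The interview most likely expected Spark or SQL; the version below deliberately swaps in plain Python so that it runs without a cluster. Two assumptions are made: Question 1 is read as "count the distinct mobile numbers per user" (this matches user1:3, since 999999991 appears in both of user1's rows), and Question 7 is read at the user level, meaning users with at least one visit and no transactions at all. The function names are hypothetical.

```python
from collections import defaultdict


def distinct_mobiles_per_user(lines):
    """Count distinct mobile numbers per user from 'username,m1:m2:...' lines."""
    mobiles = defaultdict(set)
    for line in lines:
        username, _, numbers = line.strip().partition(",")
        mobiles[username].update(m for m in numbers.split(":") if m)
    return {user: len(nums) for user, nums in sorted(mobiles.items())}


def visited_without_transactions(visits, transactions):
    """Return user_ids present in visits but absent from transactions (an anti-join)."""
    transacting_users = {user_id for user_id, _date, _amount in transactions}
    return sorted({user_id for user_id, _date in visits} - transacting_users)


# Data rows copied from Question 1 (header line dropped).
csv_lines = [
    "user1,999999991:888888882",
    "user3,777777771",
    "user2,777777234:823232351",
    "user5,734452343:943433434:834323434",
    "user1,999999991:9994433777",
]
mobile_counts = distinct_mobiles_per_user(csv_lines)

# Rows copied from the Visits and Transactions tables in Question 7.
visits = [
    (1, "2020-01-01"), (2, "2020-01-02"), (12, "2020-01-01"), (19, "2020-01-03"),
    (1, "2020-01-02"), (2, "2020-01-03"), (1, "2020-01-04"), (7, "2020-01-11"),
    (9, "2020-01-25"), (8, "2020-01-28"),
]
transactions = [
    (1, "2020-01-02", 120), (2, "2020-01-03", 22), (7, "2020-01-11", 232),
    (1, "2020-01-04", 7), (9, "2020-01-25", 33), (9, "2020-01-25", 66),
    (8, "2020-01-28", 1), (9, "2020-01-25", 99),
]
inactive_visitors = visited_without_transactions(visits, transactions)
```

Under these assumptions the sketch reports user1, user2, and user3 with the counts given in the question, and additionally reports user5 with 3, which the listed output omits. In SQL, Question 7 would typically be a LEFT JOIN filtered on a NULL transaction row, or a NOT EXISTS subquery.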
Python and SQL in the telephonic round. Hadoop input file formats and when to use which, designing a streaming system, Hive optimization, Spark implementation, and Python multi-environment related questions. Agile and behavioural rounds.