Sample Greenplum DBA Interview Questions

1. What were some of the main responsibilities in your previous assignments as a Greenplum DBA?

2. List the important steps and strategies for migrating data from an Oracle database to a Greenplum database.

3. How is data stored in Greenplum? Specifically, explain data distribution.

4. Can we create indexes in Greenplum, like we do in an Oracle database?

5. Briefly explain the architecture of a Greenplum database.

6. Explain the query plan and the concept of motion. Compare gather and redistribute motions.

7. Do we have to vacuum a table after an insert? If yes, why? How would you implement vacuuming for a large table that has frequent inserts/loads?

8. How do you transform a query that uses "CONNECT BY PRIOR" in Oracle into an equivalent query for a PostgreSQL/Greenplum database?

9. What are the different utilities/protocols used for data loads into Greenplum?

10. What happens if one of the segment nodes goes down for any reason? How does Greenplum manage queries, and will they be impacted? How do you recover a failed segment and synchronize its data?

11. If you have the task of architecting a database in Greenplum, how would you do it?

12. Did you tune the internal parameters within Greenplum? How do you tune memory in Greenplum?

13. When a client connects, does it connect to the Master or to a segment node?

14. Can you explain the process of data migration from Oracle to Greenplum?

15. Which command would you use to back up a database?

16. When you restore from a backup taken with gp_dump, can you import a table?

17. Can you open and view a dump file?

18. Which option would you use to export the DDL of a database or table?

19. When a user submits a query, where does it run: on the Master or on the segment nodes?

20. If you configure your system with Master and segment nodes, where would the data reside?

21. How would you go about query tuning?

22. What would you do when a user or users are complaining that a particular query is running slow?

23. What would you do to gather statistics in the database, as well as to reclaim space?

24. How would you implement compression? Explain the possible compression types.

25. What are the major differences between Oracle and Greenplum?

26. What is good and bad about Greenplum compared to Oracle?

27. How would you troubleshoot an issue/error/problem when no one is available to help you and you are all by yourself?

28. Do you know PL/pgSQL?

29. Can you write stored procedures in Greenplum/PostgreSQL?

30. Can you create stored functions and use them in Greenplum?

31. Can you do partitioning in Greenplum tables?

32. Which parameters can you use to manage workload in a Greenplum database?

33. Tell me some of the aspects/implementations/configurations you have done in Greenplum?

34. What is your perspective on very large database backups?

35. Performance tuning

36. How do you enforce referential integrity in GP?

37. Like Oracle, does GPDB have a control file and a system tablespace? What is the minimum requirement to start or create a GPDB instance?

38. How do you export data to flat files without headers?

39. Can you export table data to pipes and compress the output?

40. If the master goes down, does the standby master automatically come online?

41. How do we automatically activate the standby master?

42. How do you clone a database from prod to development?

43. Can we change the number of segments while restoring to a new environment such as development? Suppose production has 100 segments on 20 servers. Can we restore to 5 servers with fewer segments?

44. Columnar data storage

45. How do you enforce referential integrity constraints?

45. Does Greenplum support database triggers?

46. Greenplum Hadoop integration

47. Partitioning in Greenplum

48. What's the difference between DELETE, TRUNCATE, and DROP?

49. What's hardening in Oracle?

50. How is data imported into GP?

51. What are denormalization and normalization?

52. Security in GP

53. How is modeling done?

54. How much data is ported into GP, and how often?

55. How long did it take for 2 TB of data?

56. Explain the process for importing data into GP.

57. Indexing in GP

58. What's the difference between OLTP and OLAP?

59. Experience in GP and duties

60. How do you tackle GPDB performance issues?

61. What operational issues did you experience on GPDB?

62. What causes GPDB to pick a redistribute motion for a query?

63. What versions of DCA and Greenplum have you worked on?

64. What is the best practice for implementing an ETL process to load data into GPDB?

65. How can you find out which queries are using disk spill space?