Data Science Interview Preparation

PySpark: Learn & Test Your Knowledge (PySpark Interview Preparation) - Level: Intermediate

PySpark MCQs - Level 2 (Intermediate)

Data Structures and Operations - Level 2 (Intermediate)

1. Which method is used to return a new DataFrame by replacing null values in specified columns in PySpark?

replaceNull()
fillna()
nullReplacement()
nullify()

2. In PySpark, what does the repartition method do?

It redistributes data across a target number of partitions, optionally based on specified columns.
It sorts the DataFrame based on specified columns.
It joins two DataFrames.
It filters rows based on a condition.

3. Which of the following is NOT a valid way to create a DataFrame in PySpark?

From an existing RDD.
Passing a single Python dictionary directly to createDataFrame().
Reading from a CSV file.
Reading from a relational database.

4. What does the cache method do in PySpark?

It sorts the DataFrame based on specified columns.
It filters rows based on a condition.
It persists the DataFrame (in memory, spilling to disk by default) for reuse across operations.
It redistributes data across partitions.

5. Which operation is used to drop rows with null values in PySpark?

dropna()
nullifyRows()
removeNulls()
clearNulls()

6. Which method computes only the basic summary statistics (count, mean, stddev, min, max) of columns in a DataFrame in PySpark?

stats()
describe()
summary()
metrics()

7. Which method is used to drop a column from a DataFrame in PySpark?

removeColumn()
deleteColumn()
excludeColumn()
drop()

8. In PySpark, which operation is used to append the rows of one DataFrame to another DataFrame with the same schema?

join()
merge()
union()
concat()

9. Which of the following is a valid way to select distinct rows from a DataFrame in PySpark?

uniqueRows()
dropDuplicates()
distinctRows()
filterDuplicates()

10. Which function is used to aggregate the values of a DataFrame column into a list in PySpark?

toList()
convertToList()
listify()
collect_list()