PySpark: Learn & Test Your Knowledge (PySpark Interview Preparation) - Level: Beginner
Data Structures and Operations - Level 1 (Beginner)
1. Which data structure in PySpark is immutable and distributed across multiple nodes?
List
DataFrame
Array
Set
2. Which operation is used to display the first few rows of a DataFrame?
show()
head()
print()
display()
3. Which data structure is mutable and cannot be distributed across multiple nodes in PySpark?
DataFrame
Array
List
RDD
4. Which operation is used to select specific columns from a DataFrame in PySpark?
select()
filter()
groupBy()
distinct()
5. Which operation is used to count the number of rows in a DataFrame in PySpark?
count()
length()
size()
rows()
6. Which operation is used to filter rows based on a condition in PySpark?
select()
filter()
groupBy()
distinct()
7. Which operation is used to drop duplicate rows from a DataFrame in PySpark?
dropDuplicates()
removeDuplicates()
dropDuplicateRows()
removeDuplicateRows()
8. Which operation is used to sort rows in a DataFrame based on one or more columns in PySpark?
sort()
orderBy()
sortRows()
sortData()
9. Which operation is used to rename a column in a DataFrame in PySpark?
renameColumn()
changeColumn()
withColumnRenamed()
updateColumn()
10. Which operation is used to drop a column from a DataFrame in PySpark?
deleteColumn()
dropColumn()
removeColumn()
drop()