Spark SQL and Machine Learning - Advanced
1. Which optimizer in Spark SQL is rule-based and designed to improve query performance?
Catalyst Optimizer
Tungsten Optimizer
Pluto Optimizer
Athena Optimizer
2. Which algorithm in Spark MLlib is used for collaborative filtering?
K-Means
Alternating Least Squares (ALS)
Decision Trees
Naive Bayes
3. Which function in Spark SQL is used to remove all cached tables from memory?
clearCache()
removeCache()
deleteCache()
unpersistTable()
4. In Spark MLlib, which method is used to evaluate the performance of a regression model?
score()
predict()
evaluate()
assess()
5. Which algorithm in Spark MLlib is used for frequent pattern mining?
FP-Growth
Apriori
K-Means
DBSCAN
6. Which feature in Spark MLlib is used for feature selection and dimensionality reduction?
StringIndexer
VectorAssembler
PCA
OneHotEncoder
7. Which function in Spark SQL is used to specify the partitioning columns when writing a DataFrame to a table?
repartition()
distributeBy()
sortBy()
partitionBy()
8. Which class in Spark MLlib is used to perform hyperparameter tuning for machine learning models?
CrossValidator
GridSearchCV
RandomizedSearchCV
HyperparameterOptimizer
9. Which algorithm in Spark MLlib is used for outlier detection?
Local Outlier Factor (LOF)
One-Class SVM
Isolation Forest
DBSCAN
10. Which Spark MLlib feature transformer is used for converting categorical features into numerical features?
StringIndexer
VectorAssembler
PCA
OneHotEncoder