
Spark SQL and Machine Learning - Intermediate

1. Which method is used to load a dataset stored in the LIBSVM file format in Spark MLlib?

loadSVM()
loadSVMLib()
loadLibsvm()
loadLibSVMFile()
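
For context on the format this question refers to: each line of a LIBSVM file holds a label followed by sparse `index:value` pairs, with one-based indices. The sketch below parses one such line by hand in plain Python. It is only an illustration of the text format, not Spark's loader, and the helper name `parse_libsvm_line` and the sample line are hypothetical.

```python
def parse_libsvm_line(line, num_features):
    """Parse one LIBSVM-format line into (label, dense feature list)."""
    parts = line.split()
    label = float(parts[0])
    features = [0.0] * num_features
    for item in parts[1:]:
        index, value = item.split(":")
        # LIBSVM indices are one-based, so shift to zero-based positions.
        features[int(index) - 1] = float(value)
    return label, features


# Hypothetical sample line: label 1, feature 1 = 0.5, feature 3 = 2.0.
label, features = parse_libsvm_line("1 1:0.5 3:2.0", num_features=4)
print(label, features)
```

Features that do not appear on a line are treated as zero, which is why the format suits sparse data.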

2. Which algorithm is NOT available for classification tasks in Spark MLlib?

Linear Regression
Decision Trees
Random Forest
Logistic Regression

3. What is the purpose of the VectorAssembler transformer in Spark MLlib?

To split a feature vector into multiple columns.
To perform dimensionality reduction on feature vectors.
To combine a given list of feature columns into a single feature vector column.
To perform normalization on feature vectors.

4. Which class is used to perform cross-validation for tuning hyperparameters in Spark MLlib?

GridSearchCV()
CrossValidator()
RandomizedSearchCV()
HyperparameterTuner()
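
As background for this question, k-fold cross-validation splits the rows into k folds and, for each fold, trains on the other k-1 folds and validates on the held-out one. The sketch below shows only that splitting step in plain Python. The function name `k_fold_indices` is hypothetical, and the strided fold assignment is an assumption chosen for determinism; a real implementation would typically assign rows to folds at random.

```python
def k_fold_indices(num_rows, num_folds):
    """Yield (train_indices, validation_indices) for each fold."""
    indices = list(range(num_rows))
    for fold in range(num_folds):
        # Every num_folds-th row, starting at `fold`, is held out.
        validation = indices[fold::num_folds]
        held_out = set(validation)
        train = [i for i in indices if i not in held_out]
        yield train, validation


folds = list(k_fold_indices(num_rows=6, num_folds=3))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```

Each candidate hyperparameter setting would be scored on every validation fold, and the setting with the best average score kept.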

5. Which evaluator class is used to evaluate the performance of a classification model in Spark MLlib?

RegressionEvaluator()
BinaryClassificationEvaluator()
MulticlassClassificationEvaluator()
ClusteringEvaluator()

6. Which algorithm is used for collaborative filtering in Spark MLlib?

K-Means
Alternating Least Squares (ALS)
Decision Trees
Logistic Regression

7. Which method is used to train a classification model in the DataFrame-based API (spark.ml) of Spark MLlib?

fitModel()
train()
trainClassifier()
fit()

8. Which method is used to save a trained model in Spark MLlib?

storeModel()
save()
saveModel()
exportModel()

9. Which evaluation metric is commonly used for regression tasks in Spark MLlib?

AUC (Area Under the Curve)
RMSE (Root Mean Squared Error)
F1-Score
Precision
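
To make one of the listed metrics concrete, the sketch below computes root mean squared error by hand in plain Python: the square root of the mean of the squared differences between labels and predictions. The function name `rmse` and the sample values are hypothetical, and this is not Spark's evaluator.

```python
import math


def rmse(labels, predictions):
    """Return the root mean squared error between two equal-length lists."""
    if len(labels) != len(predictions) or not labels:
        raise ValueError("labels and predictions must be non-empty and equal in length")
    squared_errors = [(y - p) ** 2 for y, p in zip(labels, predictions)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values: every prediction is off by exactly 2.
print(rmse([1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]))
```

Because the errors are squared before averaging, large individual errors weigh more heavily than they would in a mean absolute error.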

10. Which method is used to load a model from disk in Spark MLlib?

loadModel()
importModel()
load()
retrieveModel()
