Machine Learning Data Analyst Free Quiz
61. What is K-means clustering?
A clustering algorithm that partitions data into K clusters
A regression algorithm
A classification algorithm
A dimensionality reduction technique
62. What is the main objective of clustering in data analysis?
To group similar data points together
To predict continuous values
To identify outliers
To perform data visualization
63. What is the elbow method in K-means clustering?
A method to determine the optimal number of clusters
A way to measure clustering accuracy
A method to handle missing data
A technique for scaling features
64. What is an API in the context of data analysis?
A set of functions for interacting with a system
A visualization tool
A method for handling missing data
A machine learning model
65. What is feature engineering in machine learning?
The process of creating new features from raw data
The process of splitting data
A method to clean missing data
A way to scale data
66. In Python, what does the Seaborn library specialize in?
Data manipulation
Machine learning
Statistical data visualization
Text processing
67. What is the F1 score in classification problems?
The harmonic mean of precision and recall
The ratio of true positives to total positives
The difference between actual and predicted values
A metric for clustering quality
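Illustration for question 67: a minimal sketch of computing an F1 score from raw counts, assuming the standard definitions of precision and recall. The function name `f1_score` and the example counts are made up for this illustration and are not part of the quiz.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 from true-positive, false-positive and false-negative counts.

    Returns 0.0 when precision and recall are both zero or undefined.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives
score = f1_score(tp=8, fp=2, fn=4)
print(round(score, 4))
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high F1 by maximizing only precision or only recall.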
68. What is time series data?
Data that is grouped into clusters
Data collected or recorded at regular time intervals
Data that includes missing values
Data used in classification problems
69. What is a confusion matrix used for?
To calculate regression errors
To evaluate the performance of a classification model
To identify clusters
To visualize data
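Illustration for question 69: a small sketch of how the four cells of a binary confusion matrix can be tallied from true and predicted labels. The helper `confusion_counts` and the label lists are hypothetical examples, not taken from any library.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count tp, fp, fn and tn for a binary classification problem."""
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["tp" if actual == positive else "fp"] += 1
        else:
            counts["fn" if actual == positive else "tn"] += 1
    return counts


# Hypothetical labels for eight samples
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
cm = confusion_counts(y_true, y_pred)
print(dict(cm))
```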
70. What is the purpose of cross-validation?
To split data into training and testing sets
To evaluate a model's performance on different subsets of the data
To identify outliers
To increase model accuracy
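Illustration for question 70: a rough sketch of how k-fold cross-validation can divide sample indices so that each fold serves once as the held-out set. The generator `k_fold_indices` is a hypothetical helper written for this example. It uses contiguous folds without shuffling, which is only one of several possible schemes.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

In practice a model would be fitted on each `train` part and scored on the matching `test` part, and the scores would then be averaged.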
71. In SQL, what does the LIMIT clause do?
Limits the number of records returned
Sorts the records
Groups the records
Joins multiple tables
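Illustration for question 71: a short sketch that runs a query with a LIMIT clause against an in-memory SQLite database through Python's built-in `sqlite3` module. The `orders` table and its rows are invented for this example. LIMIT is supported by SQLite, MySQL and PostgreSQL, but some other SQL dialects use a different syntax.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(10.0,), (25.5,), (7.25,), (40.0,)],
)

# Sort by amount, then keep only the first two rows of the result
rows = conn.execute(
    "SELECT id, amount FROM orders ORDER BY amount DESC LIMIT 2"
).fetchall()
conn.close()
print(rows)
```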
72. What is the recall metric in classification?
The ratio of true negatives to total predictions
The ratio of true positives to actual positives
The ratio of true positives to predicted positives
The harmonic mean of precision and recall
73. What is overfitting in machine learning?
A model performs well on training data but poorly on new data
A model that generalizes well
A model with high bias
A model that uses too few features
74. In Python, which function is used to remove missing data from a DataFrame?
drop_columns()
dropna()
drop_duplicates()
fillna()
75. What is a ROC curve in machine learning?
A plot of the true positive rate against the false positive rate
A method to evaluate regression models
A plot of precision against recall
A method to calculate accuracy
76. What is the difference between recall and precision?
Precision is the share of relevant instances that are retrieved; recall is the share of retrieved instances that are relevant
Recall is the share of relevant instances that are retrieved; precision is the share of retrieved instances that are relevant
Recall measures true positives; precision measures true negatives
Recall measures accuracy; precision measures errors
77. What is a bagging technique in machine learning?
An ensemble method that combines multiple models to improve accuracy
A method for reducing the number of features
A way to scale data
A method for clustering
78. What is gradient boosting in machine learning?
A way to reduce overfitting
An ensemble technique that builds models sequentially to minimize errors
A method for scaling features
A clustering algorithm
79. What is the purpose of one-hot encoding in machine learning?
To convert categorical variables into a binary format
To scale numerical features
To handle missing values
To remove duplicate data
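Illustration for question 79: a minimal sketch of one-hot encoding in plain Python, assuming the categories are taken in sorted order. The helper `one_hot_encode` and the colour values are hypothetical. Libraries such as pandas and scikit-learn provide their own encoders.

```python
def one_hot_encode(values):
    """Return (categories, rows), where each row has a single 1 marking its category."""
    categories = sorted(set(values))
    encoded = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, encoded


# Hypothetical categorical column
categories, encoded = one_hot_encode(["red", "green", "blue", "green"])
print(categories)
for row in encoded:
    print(row)
```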
80. What is the silhouette score in clustering?
A measure of how similar an object is to its own cluster compared to others
A method to calculate clustering accuracy
A method for scaling data
A clustering algorithm
81. What is regularization in machine learning?
A technique to reduce overfitting by penalizing large coefficients
A method to increase model complexity
A way to impute missing data
A method to scale features
82. What is the purpose of a validation set in machine learning?
To fine-tune the model and assess performance before testing
To train the model
To evaluate the final performance of the model
To visualize the data
83. What is bias in a machine learning model?
The error introduced by approximating a complex problem by a simplified model
The error due to noise in the data
The error that occurs during training
The error due to insufficient data
84. What is variance in a machine learning model?
The model's sensitivity to fluctuations in the training data
The error introduced by the model being too simple
The average of all errors
The spread of data points
85. What is a residual in linear regression?
The difference between the observed value and the predicted value
The coefficient of determination
The mean of the squared prediction errors
The slope of the regression line
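Illustration for question 85: a small sketch that computes residuals for a given regression line, assuming the common convention of observed minus predicted. The data points, slope and intercept are hypothetical values chosen for the example and are not fitted to the data.

```python
def residuals(xs, ys, slope, intercept):
    """Return observed y minus predicted y for each point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical observations and a hypothetical line y = 2x
xs = [1.0, 2.0, 3.0]
ys = [2.5, 3.5, 7.0]
res = residuals(xs, ys, slope=2.0, intercept=0.0)
print(res)
```

A positive residual means the line underestimates that observation, and a negative one means it overestimates it.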
86. What is a support vector machine (SVM)?
A supervised learning model used for classification and regression
A clustering algorithm
A method for scaling features
A type of neural network
87. What is data leakage in machine learning?
When information from outside the training dataset is used to create the model
When the model overfits the data
When the dataset contains missing values
When the model is underfitting
88. What is logistic regression used for?
Predicting binary outcomes
Predicting continuous values
Clustering data points
Reducing the dimensionality of data
89. What is the goal of unsupervised learning?
To find patterns and relationships in data without labeled outcomes
To predict a specific label
To optimize a neural network
To split data into training and testing sets
90. What is the purpose of stratified sampling?
To ensure that each subgroup is proportionately represented
To select random samples from the population
To increase the sample size
To reduce bias in sampling
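Illustration for question 90: a rough sketch of stratified sampling that draws the same fraction from every label group. The helper `stratified_sample`, the labels and the 10% fraction are assumptions made for this example. Rounding and the minimum of one pick per group are simplifications.

```python
import random
from collections import defaultdict


def stratified_sample(labels, fraction, seed=0):
    """Return sorted indices sampled so each label keeps roughly its share."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)
    sample = []
    for indices in by_label.values():
        # Take the same fraction from every group, but at least one item
        n_pick = max(1, round(len(indices) * fraction))
        sample.extend(rng.sample(indices, n_pick))
    return sorted(sample)


# Hypothetical population: 80 items labelled "a" and 20 labelled "b"
labels = ["a"] * 80 + ["b"] * 20
picked = stratified_sample(labels, fraction=0.1)
print(len(picked), [labels[i] for i in picked].count("b"))
```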
91. What is the purpose of the “train-test split” in machine learning?
To evaluate the performance of a model on unseen data
To reduce the number of features
To clean missing data
To perform feature scaling
92. What is ensemble learning in machine learning?
Combining multiple models to improve performance
A method for reducing the number of features
A clustering algorithm
A technique to handle missing data
93. What is the goal of dimensionality reduction?
To reduce the number of features in a dataset while preserving its information
To increase model complexity
To increase the number of data points
To scale data
94. What does the term “hyperparameter” refer to in machine learning?
Parameters that are set before the learning process begins
Parameters learned during training
Features in the dataset
Weights of the model
95. What is a decision boundary in classification?
A line or surface that separates different classes in a classification problem
The maximum depth of a decision tree
The threshold for decision-making in regression
The final layer in a neural network
96. What is a learning curve in machine learning?
A graph showing a model's performance as training progresses or as the training set grows
A method for scaling data
A visualization of model predictions
A plot of precision and recall
97. What does it mean if a machine learning model has high variance?
The model performs well on training data but poorly on test data
The model performs consistently across all data
The model generalizes well
The model has too few features
98. What is a neural network in machine learning?
A model inspired by the structure of the human brain, used for classification and regression tasks
A clustering algorithm
A method for handling missing data
A method for scaling features
99. What is feature selection in machine learning?
The process of selecting the most relevant features for a model
A technique for scaling features
A method to add new features
A process for splitting data into training and testing sets
100. What is the bias-variance tradeoff in machine learning?
The balance between error from an overly simple model (bias) and error from sensitivity to the training data (variance)
The difference between training and testing errors
The process of adjusting hyperparameters
The comparison between classification and regression