- [1086] Describe how random forests work. What is an ensemble method?
- [2956] Describe a situation that would use batch processing and another that would need real-time predictions.
- [1995] Which of the following is most likely to cause a model to overfit: more training data, fewer nodes in a neural network's hidden layers, eliminating sparse or infrequent features, or switching from a linear kernel to a Gaussian (RBF) kernel in an SVM?
- [1993] Tell me about a forecasting model you built that didn't match the real-world results.
- [1397] Tell me about a time you deployed a machine learning model into production. How did your offline model fit into the production environment? What tools did you use?
- [1388] Describe a time you worked with a difficult team member in a research setting.
- [1277] Tell me about a machine learning project you worked on. What was the outcome?
- [1088] What is the difference between batch and real-time predictions?
- [1087] Explain linear regression to a non-technical stakeholder.
- [1085] Describe how the split in a decision tree works.
- [1084] What is entropy and how is it used in machine learning?
- [964] Design an ML monitoring system (drift, performance, outliers, and quality) for a fantasy sports app.
- [930] Design a system that filters waste by detecting paper and putting it in the correct bin.
- [908] Explain linear regression to a technical stakeholder.
- [907] How do you make sure a model you have deployed stays up to date?
- [906] Explain gradient descent.
- [815] Tell me about a machine learning research topic you find interesting.
- [798] How are tree combinations from boosting methods different from those used in random forests?
- [711] Describe the model you would build to predict which credit card transactions are fraud.
- [710] What are the assumptions of linear regression? What happens if any of the assumptions are violated?
- [690] Tell me about a machine learning project you worked on.
- [649] What are overfitting and underfitting? Which models are most likely to overfit or underfit? Why?
- [610] You have a dataset comprising 1,000 avatar images and 100,000 user descriptions with associated avatar images. Create a model that recommends an image from a *new set* of 100,000 images (which does not contain any of the original 1,000 images) for a user description.
- [542] Consider an interval from 0 to 1 containing ten uniformly distributed points. Describe the distribution of the distance between the 5th and 6th points (in sorted order).
- [415] What metrics would you consider to assess how well your model is doing?
- [340] How do you stay up to date with advancements in machine learning?
- [335] Design a monitoring system for TikTok.
- [329] Is the Mean Squared Error (MSE) a suitable cost function for a logistic regression model?
- [225] What is the significance of an area under the curve (AUC) equal to 0.5?
- [209] Design a personalized news ranking system.
- [130] Design a product recommendation system.
- [79] Design an evaluation framework for ads ranking.
- [69] In K-Nearest Neighbors (KNN), does setting k=1 lead to higher variance or higher bias?