Machine Learning Terminology
Brief definitions of common machine learning terms.
1. Machine Learning Terminology
- Model : A function, fitted to data during training, that maps input features to predictions.
- Feature : An input to the model derived from data.
- Feature engineering : The process of finding the most suitable set of inputs for the model. Deriving and adding new features from existing ones is part of feature engineering.
- Parameter : An internal value of the model (for example, a weight or bias) that is learned from the data during training.
- Hyperparameter : External settings of the model chosen by the developer, not learned from the data.
- Data acquisition : Collecting data for training.
- Target variable : The quantity the model is trained to predict.
- Error function : A function that measures the difference between predictions and actual targets; also called a loss function or cost function. Training adjusts the parameters to reduce its value.
- Training data : Data used to fit the model's parameters; often about 60% of the full dataset, though the exact share varies.
- Validation data : Data held out from training and used during development to tune hyperparameters and compare candidate models; often about 20% of the full dataset.
- Test data : Data held out from both training and tuning, used once to assess the final model after validation; often about 20% of the full dataset.
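- Linear regression : Predicting a continuous target value as a weighted sum of the input features plus a bias term.
- Logistic regression : Applying a sigmoid to a weighted sum of the features (the same linear form used in linear regression) to obtain a value between 0 and 1, usually interpreted as a class probability; used for classification despite its name.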
- Neural network : A network inspired by biological neurons, usually with input and output layers and hidden layers in between; nodes in each layer are often called neurons.
- Deep learning : Machine learning that uses neural networks (typically with many layers).
- Decision tree : A tree where nodes represent conditions and leaves represent decisions; commonly used for supervised classification and regression, with structure learned from training data.
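A minimal sketch of several of the terms above, using only the Python standard library. The function names (`split_dataset`, `linear_output`, `sigmoid`, `mean_squared_error`), the fixed random seed, and the example numbers are illustrative assumptions, and mean squared error is just one common choice of error function.

```python
import math
import random


def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle rows and split them into training, validation, and test sets.

    The 60/20/20 default mirrors the rough proportions in the list above;
    the remainder after training and validation becomes the test set.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed: reproducible split
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def linear_output(features, weights, bias):
    """Linear regression prediction: weighted sum of features plus bias.

    The weights and bias are the model's parameters.
    """
    return sum(w * x for w, x in zip(weights, features)) + bias


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_output(features, weights, bias):
    """Logistic regression prediction: sigmoid of the linear output."""
    return sigmoid(linear_output(features, weights, bias))


def mean_squared_error(predictions, targets):
    """One possible error function: average squared difference."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Example usage with made-up values.
train, val, test = split_dataset(range(100))
example_linear = linear_output([1.0, 2.0], weights=[0.5, -0.25], bias=0.1)
example_probability = logistic_output([1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
example_error = mean_squared_error([1.0, 2.0], [1.0, 4.0])
```

Here `train_frac` and `val_frac` play the role of settings chosen by the developer, while `weights` and `bias` stand in for parameters that training would normally set.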
2. References
- Feature engineering, hyperparameter tuning : https://stats.stackexchange.com/questions/448757/difference-between-feature-engineering-and-hyperparameter-optimizations
- Parameter, hyperparameter : https://machinelearningmastery.com/difference-between-a-parameter-and-a-hyperparameter/
- Model validation, model testing : https://stats.stackexchange.com/questions/19048/what-is-the-difference-between-test-set-and-validation-set