Machine Learning Terminology

Brief definitions of common machine learning terms.

1. Machine Learning Terminology

  • Model : A function that produces predictions after training.
  • Feature : An input to the model derived from data.
  • Feature engineering : The process of finding the most suitable set of inputs for the model. Deriving and adding new features from existing ones is part of feature engineering.
  • Parameter : An internal value of the model, such as a weight or bias, that is set by training.
  • Hyperparameter : External settings of the model chosen by the developer, not learned from the data.
  • Data acquisition : Collecting data for training.
  • Target variable : The quantity the model is trained to predict.
  • Error function : A function that measures the difference between predictions and actual targets; also called a loss function or cost function.
  • Training data : Data used for training; often about 60% of the full dataset.
  • Validation data : Data held out from training and used during development to compare models and tune hyperparameters; often about 20% of the full dataset.
  • Test data : Data held out from both training and tuning, used once to assess the final model after validation; often about 20% of the full dataset.
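  • Linear regression : Modeling a continuous target as a weighted sum of the features plus a bias, used to predict a single numeric value.
  • Logistic regression : Applying a sigmoid to a linear combination of the features to obtain a value between 0 and 1, usually interpreted as the probability of the positive class in binary classification.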
  • Neural network : A network inspired by biological neurons, usually with input and output layers and hidden layers in between; nodes in each layer are often called neurons.
  • Deep learning : Machine learning that uses neural networks (typically with many layers).
  • Decision tree : A tree where nodes represent conditions and leaves represent decisions; commonly used for supervised classification and regression, with structure learned from training data.
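
The training, validation, and test definitions above can be sketched as a simple split. This is a minimal illustration using only the Python standard library; the helper name `split_dataset`, its parameters, and the fixed seed are assumptions made for the example, and the 60/20/20 ratio is just the common default mentioned in the list.

```python
import random


def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle rows and split them into train, validation, and test lists.

    Hypothetical helper for illustration; whatever is left after the
    train and validation portions becomes the test set.
    """
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```

Shuffling before splitting matters when the data is stored in some order (for example by date or by class), since an unshuffled split could leave the test set unrepresentative.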

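Several of the terms above (parameter, hyperparameter, error function, linear regression, and the sigmoid used in logistic regression) can be seen together in one small sketch. It assumes a single feature, mean squared error as the error function, and plain gradient descent as the training method; all function names, the toy data, and the chosen learning rate and epoch count are illustrative assumptions.

```python
import math


def predict_linear(w, b, x):
    """Linear model: a weighted feature plus a bias."""
    return w * x + b


def sigmoid(z):
    """Squash a real-valued score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def mean_squared_error(preds, targets):
    """Error function: average squared difference from the targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def fit_linear(xs, ys, learning_rate=0.05, epochs=2000):
    """Learn the parameters w and b by gradient descent on the MSE.

    learning_rate and epochs are hyperparameters: chosen by the
    developer, not learned from the data.
    """
    w, b = 0.0, 0.0  # parameters, updated by training
    n = len(xs)
    for _ in range(epochs):
        errors = [predict_linear(w, b, x) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (feature x, target variable y).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_linear(xs, ys)
loss = mean_squared_error([predict_linear(w, b, x) for x in xs], ys)
print(round(w, 3), round(b, 3))  # approximately 2.0 and 1.0

# Logistic-style output: a sigmoid applied to a linear score.
# The weights 1.5 and -3.0 are hypothetical, not trained here.
probability = sigmoid(predict_linear(1.5, -3.0, 2.0))
print(probability)  # 0.5, because the linear score is exactly 0
```

In practice a logistic regression model is usually trained with a log-loss error function rather than mean squared error; the last lines only show how the sigmoid maps a linear score to a value between 0 and 1.
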
2. References