Machine Learning Algorithms and Use Cases
Algorithm | Channels (Train / Validation / Test) | Use Cases | Supervised | Pipe Mode? | File Type | CPU or GPU |
AutoGluon-Tabular | training and (optionally) validation | AutoGluon-Tabular succeeds by ensembling multiple models and stacking them in multiple layers. | Y | N | CSV | CPU or GPU (single instance only) M5 |
BlazingText | train | Text Classification can be used to solve various use-cases like sentiment analysis, spam detection, hashtag prediction | Y | Y | Text file (one sentence per line with space-separated tokens) | CPU or GPU (single instance only) M5 |
CatBoost | training and (optionally) validation | Gradient boosting on decision trees for regression and classification. Most useful when the data has few dimensions and a simple linear model performs very badly. | Y | N | CSV | CPU (single instance only) |
DeepAR Forecasting | train and (optionally) test | Time series data. Good for cold-start problems where historical data might not be available. | Y | N | JSON Lines or Parquet | CPU or GPU |
Factorization Machines | train and (optionally) test | Supervised algorithm for sparse datasets. Evaluated with RMSE for regression and log loss for binary classification. | Y | Y | recordIO-protobuf float32 | CPU (GPU for dense data) M5 |
Image Classification - MXNet | train and validation, (optionally) train_lst, validation_lst, and model | Image classification. MXNet offers faster calculation speeds and better resource utilisation on GPU. | Y | Y | recordIO or image files (.jpg or .png) | GPU |
Image Classification - TensorFlow | training and validation | Image classification. Compared with MXNet, TensorFlow is slower on GPU but performs better on CPU. | Y | File | image files (.jpg, .jpeg, or .png) | CPU or GPU |
IP Insights | train and (optionally) validation | Flagging suspicious IP addresses by learning the usual associations between entities and IPv4 addresses | N | File | CSV | CPU or GPU |
K-Means | train and (optionally) test | Clustering | N | Y | recordIO-protobuf or CSV | CPU or GPU (single GPU device on one or more instances) |
K-Nearest-Neighbors (k-NN) | train and (optionally) test | Text mining, facial recognition. Good for small datasets. Requires feature scaling. | Y | Y | recordIO-protobuf or CSV | CPU or GPU (single GPU device on one or more instances) |
LDA | train and (optionally) test | Topic modeling for text (groups documents by topic without labels). Uses a statistical model. | N | Y | recordIO-protobuf or CSV | CPU (single instance only) |
LightGBM | training and (optionally) validation | Gradient boosting framework | Y | File | CSV | CPU (single instance only) |
Linear Learner | train and (optionally) validation, test, or both | regression or classification | Y | Y | recordIO-protobuf or CSV | CPU or GPU |
Neural Topic Model | train and (optionally) validation, test, or both | Topic modeling for text (groups documents by topic without labels). Uses neural networks; regarded here as the better option compared with LDA. | N | Y | recordIO-protobuf or CSV | CPU or GPU |
Object2Vec | train and (optionally) validation, test, or both | Embeds objects such as sentences or paragraphs and learns the relationships between pairs of them | Y | File | JSON Lines | CPU or GPU (single instance only) |
Object Detection | train and validation, (optionally) train_annotation, validation_annotation, and model | Detecting and classifying objects in images with bounding boxes | Y | Y | recordIO or image files (.jpg or .png) | GPU |
PCA | train and (optionally) test | Dimensionality Reduction | N | Y | recordIO-protobuf or CSV | CPU or GPU |
Random Cut Forest | train and (optionally) test | Outlier (anomaly) detection, including in time series data | N | Y | recordIO-protobuf or CSV | CPU |
Semantic Segmentation | train and validation, train_annotation, validation_annotation, and (optionally) label_map and model | Pixel-level image classification (no bounding boxes). Autonomous vehicles. | Y | Y | Image files | GPU (single instance only) |
Seq2Seq Modeling | train, validation, and vocab | Solves complex language problems such as machine translation, question answering, chatbots, and text summarization | Y | File | recordIO-protobuf (integer tokens, not floats) | GPU (single instance only) |
TabTransformer | training and (optionally) validation | Transforms categorical features into embeddings to achieve higher accuracy on tabular data | Y | File | CSV | CPU or GPU (single instance only) |
XGBoost (0.90-1, 0.90-2, 1.0-1, 1.2-1, 1.2-21) | train and (optionally) validation | Provides parallel tree boosting and is a widely used machine learning library for regression, classification, and ranking problems. | Y | Y | CSV, LibSVM, or Parquet | CPU (or GPU for 1.2-1) |
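
The table above can be used as a lookup when shortlisting an algorithm. The following is a minimal Python sketch of that idea, not an official API: the `Algorithm` class and the `shortlist` function are hypothetical names introduced here, only a handful of rows are copied in, and a "File" entry in the Pipe column is assumed to mean that Pipe mode is not supported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Algorithm:
    """One row of the comparison table (hypothetical helper type)."""
    name: str
    supervised: bool
    pipe_mode: bool      # assumption: "File" in the table is read as False
    file_types: tuple
    hardware: str


# A few rows copied from the table above; this is not the full list.
ALGORITHMS = [
    Algorithm("K-Means", False, True, ("recordIO-protobuf", "CSV"), "CPU or GPU"),
    Algorithm("PCA", False, True, ("recordIO-protobuf", "CSV"), "CPU or GPU"),
    Algorithm("Random Cut Forest", False, True, ("recordIO-protobuf", "CSV"), "CPU"),
    Algorithm("IP Insights", False, False, ("CSV",), "CPU or GPU"),
    Algorithm("Linear Learner", True, True, ("recordIO-protobuf", "CSV"), "CPU or GPU"),
    Algorithm("CatBoost", True, False, ("CSV",), "CPU"),
]


def shortlist(algorithms, *, supervised=None, pipe_mode=None, file_type=None):
    """Return the names of algorithms matching every filter that is not None."""
    result = []
    for algo in algorithms:
        if supervised is not None and algo.supervised != supervised:
            continue
        if pipe_mode is not None and algo.pipe_mode != pipe_mode:
            continue
        if file_type is not None and file_type not in algo.file_types:
            continue
        result.append(algo.name)
    return result


if __name__ == "__main__":
    # Unsupervised algorithms that can stream training data in Pipe mode.
    print(shortlist(ALGORITHMS, supervised=False, pipe_mode=True))
    # Supervised algorithms that accept CSV input.
    print(shortlist(ALGORITHMS, supervised=True, file_type="CSV"))
```

Keeping the rows as plain data makes it easy to add the remaining algorithms or further columns (such as the supported channels) without touching the filter logic.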