Course Outline
spark.mllib: data types, algorithms, and utilities
- Data types
- Basic statistics
  - summary statistics
  - correlations
  - stratified sampling
  - hypothesis testing
  - streaming significance testing
  - random data generation
- Classification and regression
  - linear models (SVMs, logistic regression, linear regression)
  - naive Bayes
  - decision trees
  - ensembles of trees (Random Forests and Gradient-Boosted Trees)
  - isotonic regression
- Collaborative filtering
  - alternating least squares (ALS)
- Clustering
  - k-means
  - Gaussian mixture
  - power iteration clustering (PIC)
  - latent Dirichlet allocation (LDA)
  - bisecting k-means
  - streaming k-means
- Dimensionality reduction
  - singular value decomposition (SVD)
  - principal component analysis (PCA)
- Feature extraction and transformation
- Frequent pattern mining
  - FP-growth
  - association rules
  - PrefixSpan
- Evaluation metrics
- PMML model export
- Optimization (developer)
  - stochastic gradient descent
  - limited-memory BFGS (L-BFGS)
spark.ml: high-level APIs for ML pipelines
- Overview: estimators, transformers and pipelines
- Extracting, transforming and selecting features
- Classification and regression
- Clustering
- Advanced topics
Requirements
Knowledge of one of the following:
- Java
- Scala
- Python
- R (SparkR)
35 hours
Testimonials (1)
Many practical examples, different ways of approaching the same problem, and lesser-known tricks for improving current solutions.
Rafal - Nordea
Course - Apache Spark MLlib