Gallatin K, Albon C. Machine Learning with Python Cookbook: Practical Solutions from Preprocessing to Deep Learning

PDF file
Size: 9.24 MB

Added by user CoronaSUN 24.09.2023 14:09
Description edited 15.03.2024 20:12

2nd Edition. — O’Reilly, 2023. — 413 p. — ISBN 978-1-098-13572-0.

This practical guide provides more than 200 self-contained recipes to help you solve machine learning challenges you may encounter in your work. If you're comfortable with Python and its libraries, including pandas and scikit-learn, you'll be able to address specific problems, from loading data to training models and leveraging neural networks.

Each recipe in this updated edition includes code that you can copy, paste, and run with a toy dataset to ensure that it works. From there, you can adapt these recipes according to your use case or application. Recipes include a discussion that explains the solution and provides meaningful context.

Go beyond theory and concepts by learning the nuts and bolts you need to construct working machine learning applications. You'll find recipes for:

Vectors, matrices, and arrays
Working with data from CSV, JSON, SQL, databases, cloud storage, and other sources
Handling numerical and categorical data, text, images, and dates and times
Dimensionality reduction using feature extraction or feature selection
Model evaluation and selection
Linear and logistic regression, trees and forests, and k-nearest neighbors
Support vector machines (SVM), naive Bayes, clustering, and tree-based models
Saving, loading, and serving trained models from multiple frameworks

True PDF

Preface

Working with Vectors, Matrices, and Arrays in NumPy
Introduction
Creating a Vector
Creating a Matrix
Creating a Sparse Matrix
Preallocating NumPy Arrays
Selecting Elements
Describing a Matrix
Applying Functions over Each Element
Finding the Maximum and Minimum Values
Calculating the Average, Variance, and Standard Deviation
Reshaping Arrays
Transposing a Vector or Matrix
Flattening a Matrix
Finding the Rank of a Matrix
Getting the Diagonal of a Matrix
Calculating the Trace of a Matrix
Calculating Dot Products
Adding and Subtracting Matrices
Multiplying Matrices
Inverting a Matrix
Generating Random Values

Loading Data
Introduction
Loading a Sample Dataset
Creating a Simulated Dataset
Loading a CSV File
Loading an Excel File
Loading a JSON File
Loading a Parquet File
Loading an Avro File
Querying a SQLite Database
Querying a Remote SQL Database
Loading Data from a Google Sheet
Loading Data from an S3 Bucket
Loading Unstructured Data

Data Wrangling
Introduction
Creating a DataFrame
Getting Information about the Data
Slicing DataFrames
Selecting Rows Based on Conditionals
Sorting Values
Replacing Values
Renaming Columns
Finding the Minimum, Maximum, Sum, Average, and Count
Finding Unique Values
Handling Missing Values
Deleting a Column
Deleting a Row
Dropping Duplicate Rows
Grouping Rows by Values
Grouping Rows by Time
Aggregating Operations and Statistics
Looping over a Column
Applying a Function over All Elements in a Column
Applying a Function to Groups
Concatenating DataFrames
Merging DataFrames

Handling Numerical Data
Introduction
Rescaling a Feature
Standardizing a Feature
Normalizing Observations
Generating Polynomial and Interaction Features
Transforming Features
Detecting Outliers
Handling Outliers
Discretizing Features
Grouping Observations Using Clustering
Deleting Observations with Missing Values
Imputing Missing Values

Handling Categorical Data
Introduction
Encoding Nominal Categorical Features
Encoding Ordinal Categorical Features
Encoding Dictionaries of Features
Imputing Missing Class Values
Handling Imbalanced Classes

Handling Text
Introduction
Cleaning Text
Parsing and Cleaning HTML
Removing Punctuation
Tokenizing Text
Removing Stop Words
Stemming Words
Tagging Parts of Speech
Performing Named-Entity Recognition
Encoding Text as a Bag of Words
Weighting Word Importance
Using Text Vectors to Calculate Text Similarity in a Search Query
Using a Sentiment Analysis Classifier

Handling Dates and Times
Introduction
Converting Strings to Dates
Handling Time Zones
Selecting Dates and Times
Breaking Up Date Data into Multiple Features
Calculating the Difference Between Dates
Encoding Days of the Week
Creating a Lagged Feature
Using Rolling Time Windows
Handling Missing Data in Time Series

Handling Images
Introduction
Loading Images
Saving Images
Resizing Images
Cropping Images
Blurring Images
Sharpening Images
Enhancing Contrast
Isolating Colors
Binarizing Images
Removing Backgrounds
Detecting Edges
Detecting Corners
Creating Features for Machine Learning
Encoding Color Histograms as Features
Using Pretrained Embeddings as Features
Detecting Objects with OpenCV
Classifying Images with PyTorch

Dimensionality Reduction Using Feature Extraction
Introduction
Reducing Features Using Principal Components
Reducing Features When Data Is Linearly Inseparable
Reducing Features by Maximizing Class Separability
Reducing Features Using Matrix Factorization
Reducing Features on Sparse Data

Dimensionality Reduction Using Feature Selection
Introduction
Thresholding Numerical Feature Variance
Thresholding Binary Feature Variance
Handling Highly Correlated Features
Removing Irrelevant Features for Classification
Recursively Eliminating Features

Model Evaluation
Introduction
Cross-Validating Models
Creating a Baseline Regression Model
Creating a Baseline Classification Model
Evaluating Binary Classifier Predictions
Evaluating Binary Classifier Thresholds
Evaluating Multiclass Classifier Predictions
Visualizing a Classifier’s Performance
Evaluating Regression Models
Evaluating Clustering Models
Creating a Custom Evaluation Metric
Visualizing the Effect of Training Set Size
Creating a Text Report of Evaluation Metrics
Visualizing the Effect of Hyperparameter Values

Model Selection
Introduction
Selecting the Best Models Using Exhaustive Search
Selecting the Best Models Using Randomized Search
Selecting the Best Models from Multiple Learning Algorithms
Selecting the Best Models When Preprocessing
Speeding Up Model Selection with Parallelization
Speeding Up Model Selection Using Algorithm-Specific Methods
Evaluating Performance After Model Selection

Linear Regression
Introduction
Fitting a Line
Handling Interactive Effects
Fitting a Nonlinear Relationship
Reducing Variance with Regularization
Reducing Features with Lasso Regression

Trees and Forests
Introduction
Training a Decision Tree Classifier
Training a Decision Tree Regressor
Visualizing a Decision Tree Model
Training a Random Forest Classifier
Training a Random Forest Regressor
Evaluating Random Forests with Out-of-Bag Errors
Identifying Important Features in Random Forests
Selecting Important Features in Random Forests
Handling Imbalanced Classes
Controlling Tree Size
Improving Performance Through Boosting
Training an XGBoost Model
Improving Real-Time Performance with LightGBM

K-Nearest Neighbors
Introduction
Finding an Observation’s Nearest Neighbors
Creating a K-Nearest Neighbors Classifier
Identifying the Best Neighborhood Size
Creating a Radius-Based Nearest Neighbors Classifier
Finding Approximate Nearest Neighbors
Evaluating Approximate Nearest Neighbors

Logistic Regression
Introduction
Training a Binary Classifier
Training a Multiclass Classifier
Reducing Variance Through Regularization
Training a Classifier on Very Large Data
Handling Imbalanced Classes

Support Vector Machines
Introduction
Training a Linear Classifier
Handling Linearly Inseparable Classes Using Kernels
Creating Predicted Probabilities
Identifying Support Vectors
Handling Imbalanced Classes

Naive Bayes
Introduction
Training a Classifier for Continuous Features
Training a Classifier for Discrete and Count Features
Training a Naive Bayes Classifier for Binary Features
Calibrating Predicted Probabilities

Clustering
Introduction
Clustering Using K-Means
Speeding Up K-Means Clustering
Clustering Using Mean Shift
Clustering Using DBSCAN
Clustering Using Hierarchical Merging

Tensors with PyTorch
Introduction
Creating a Tensor
Creating a Tensor from NumPy
Creating a Sparse Tensor
Selecting Elements in a Tensor
Describing a Tensor
Applying Operations to Elements
Finding the Maximum and Minimum Values
Reshaping Tensors
Transposing a Tensor
Flattening a Tensor
Calculating Dot Products
Multiplying Tensors

Neural Networks
Introduction
Using Autograd with PyTorch
Preprocessing Data for Neural Networks
Designing a Neural Network
Training a Binary Classifier
Training a Multiclass Classifier
Training a Regressor
Making Predictions
Visualize Training History
Reducing Overfitting with Weight Regularization
Reducing Overfitting with Early Stopping
Reducing Overfitting with Dropout
Saving Model Training Progress
Tuning Neural Networks
Visualizing Neural Networks

Neural Networks for Unstructured Data
Introduction
Training a Neural Network for Image Classification
Training a Neural Network for Text Classification
Fine-Tuning a Pretrained Model for Image Classification
Fine-Tuning a Pretrained Model for Text Classification

Saving, Loading, and Serving Trained Models
Introduction
Saving and Loading a scikit-learn Model
Saving and Loading a TensorFlow Model
Saving and Loading a PyTorch Model
Serving scikit-learn Models
Serving TensorFlow Models
Serving PyTorch Models in Seldon

Index

See also

Bruce Peter, Bruce Andrew, Gedeck Peter. Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python

Section: Computer Literature → Data Science

2nd Edition. — O’Reilly Media, 2020. — 361 p. — ISBN: 978-1-492-07294-2. Statistical methods are a key part of data science, yet few data scientists have formal statistical training. Courses and books on basic statistics rarely cover the topic from a data science perspective. The second edition of this popular guide adds comprehensive examples in Python, provides practical...

14.70 MB
added 21.05.2020 04:16
description edited 04.08.2022 19:08

Geron Aurelien. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

Section: Artificial Intelligence → Machine Learning

2nd Edition. — O’Reilly, 2019. — 856 p. - ISBN: 1492032646. Final Edition. Through a series of recent breakthroughs, deep learning has boosted the entire field of machine learning. Now, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. The updated edition of this best-selling book...

54.00 MB
added 22.01.2020 13:32
description edited 15.03.2024 20:12

Howard Jeremy, Gugger Sylvain. Deep Learning for Coders with fastai and PyTorch: AI Applications Without a PhD

Section: Python → PyTorch

O’Reilly Media, Inc., 2020. — 624 p. — ISBN: 978-1492045526. Deep learning is often viewed as the exclusive domain of math PhDs and big tech companies. But as this hands-on guide demonstrates, programmers comfortable with Python can achieve impressive results in deep learning with little math background, small amounts of data, and minimal code. How? With fastai, the first...

32.82 MB
added 06.09.2020 15:00
description edited 27.09.2021 02:25

Nield T. Essential Math for Data Science. Take Control of Your Data with Fundamental Linear Algebra, Probability, and Statistics

Section: Artificial Intelligence → Data Mining

O’Reilly Media, 2022. — 350 p. To succeed in data science you need some math proficiency. But not just any math. This common-sense guide provides a clear, plain English survey of the math you'll need in data science, including probability, statistics, hypothesis testing, linear algebra, machine learning, and calculus. Practical examples with Python code will help you see how...

11.67 MB
added 29.06.2022 13:33
description edited 29.06.2022 21:48

Raschka Sebastian, Liu Yuxi (Hayden), Mirjalili Vahid. Machine Learning with PyTorch and Scikit-Learn

Section: Artificial Intelligence → Machine Learning

Packt Publishing, 2022. — 741 p. — ISBN 9781801819312. PyTorch book of the bestselling and widely acclaimed Python Machine Learning series, expanded to include transformers, XGBoost, and graph neural networks. Key Features: Learn applied machine learning with a solid foundation in theory. Clear, intuitive explanations take you deep into the theory and practice of Python machine...

29.32 MB
added 07.03.2022 00:23
description edited 15.03.2024 20:12

Smolyakov Vadim. Machine Learning Algorithms in Depth

Section: Artificial Intelligence → Machine Learning

Manning Publications, 2024. — 328 p. — ISBN-13: 978-1633439214. Learn how machine learning algorithms work from the ground up so you can effectively troubleshoot your models and improve their performance. Fully understanding how machine learning algorithms function is essential for any serious ML engineer. In Machine Learning Algorithms in Depth you’ll explore practical...

25.30 MB
added 01.08.2024 01:59
description edited 01.08.2024 23:16