Vermeulen Andreas Francois. Practical Data Science

  • File format: PDF
  • Size: 7.57 MB
Apress, 2018. — 821 p. — ISBN 1484230531.
Learn how to build a data science technology stack and perform good data science with repeatable methods. You will learn how to turn data lakes into business assets.
The data science technology stack demonstrated in Practical Data Science is built from components in general use in the industry. Data scientist Andreas Vermeulen demonstrates in detail how to build and provision a technology stack to yield repeatable results. He shows you how to apply practical methods to extract actionable business knowledge from data lakes consisting of data from a polyglot of data types and dimensions.
Data Science Technology Stack
Rapid Information Factory Ecosystem
Data Science Storage Tools
Data Lake
Data Vault
Data Warehouse Bus Matrix
Data Science Processing Tools
Spark
Mesos
Akka
Cassandra
Kafka
Elastic Search
R
Scala
Python
MQTT (MQ Telemetry Transport)
What’s Next?
Vermeulen-Krennwallner-Hillman-Clark
Windows
Linux
It’s Now Time to Meet Your Customer
Processing Ecosystem
Example Ecosystem
Sample Data
Summary
Layered Framework
Definition of Data Science Framework
Cross-Industry Standard Process for Data Mining (CRISP-DM)
Homogeneous Ontology for Recursive Uniform Schema
The Top Layers of a Layered Framework
Layered Framework for High-Level Data Science and Engineering
Summary
Business Layer
Business Layer
Engineering a Practical Business Layer
Summary
Utility Layer
Basic Utility Design
Engineering a Practical Utility Layer
Summary
Three Management Layers
Operational Management Layer
Audit, Balance, and Control Layer
Balance
Control
Yoke Solution
Cause-and-Effect Analysis System
Functional Layer
Data Science Process
Summary
Retrieve Superstep
Data Lakes
Data Swamps
Training the Trainer Model
Understanding the Business Dynamics of the Data Lake
Actionable Business Knowledge from Data Lakes
Engineering a Practical Retrieve Superstep
Connecting to Other Data Sources
Summary
Assess Superstep
Assess Superstep
Errors
Analysis of Data
Practical Actions
Engineering a Practical Assess Superstep
Summary
Process Superstep
Data Vault
Time-Person-Object-Location-Event Data Vault
Data Science Process
Data Science
Summary
Transform Superstep
Transform Superstep
Building a Data Warehouse
Transforming with Data Science
Hypothesis Testing
Overfitting and Underfitting
Precision-Recall
Cross-Validation Test
Univariate Analysis
Bivariate Analysis
Multivariate Analysis
Linear Regression
Logistic Regression
Clustering Techniques
ANOVA
Principal Component Analysis (PCA)
Decision Trees
Support Vector Machines, Networks, Clusters, and Grids
Data Mining
Pattern Recognition
Machine Learning
Bagging Data
Random Forests
Computer Vision (CV)
Natural Language Processing (NLP)
Neural Networks
TensorFlow
Summary
Organize and Report Supersteps
Organize Superstep
Report Superstep
Graphics
Pictures
Showing the Difference
Summary
Closing Words
Index