Data Science For Dummies

Genre: Databases. Publisher: John Wiley & Sons Limited. ISBN: 9781119811619. Format: e-book. Price: 2824 RUB ($26.59). Age restriction: 0+.


Book description

Make smart business decisions with your data by design!

Take a deep dive to understand how developing your data science dogma can drive your business, ya dig? Every phone, tablet, computer, watch, and camera generates data, and we’re overwhelmed with the stuff. That’s why it’s become increasingly important that you know how to derive useful insights from the data you have, to understand which piece of data in the sea of data is important and which isn’t (trust us: not as scary as it sounds!), and to rely on said data to make critical business decisions. Enter the world of data science: the practice of using scientific methods, processes, and algorithms to gain knowledge and insights from any type of data.

Data Science For Dummies provides a comprehensive introduction in that friendly and approachable way you’ve come to know from Dummies. Your new go-to guide breaks down this vast topic into three smaller parts (big data, data science, and data engineering) and then shows you how to combine those areas to produce value and make informed decisions to drive business growth. It’s also filled with real-world examples and applications that you can apply to your situation.

Data Science For Dummies demonstrates:

How natural language processing works

Strategies around data science

How to make decisions using probabilities

Ways to display your data using a visualization model

How to incorporate various programming languages into your strategy

Whether you’re a professional or a student, Data Science For Dummies will get you caught up on all the latest data trends. Find out how to ask the pressing questions you need your data to answer by picking up your copy today.

Table of Contents

Lillian Pierson. Data Science For Dummies

Data Science For Dummies®

To view this book's Cheat Sheet, simply go to www.dummies.com and search for “Data Science For Dummies Cheat Sheet” in the Search box.

Table of Contents

List of Tables

List of Illustrations

Guide

Pages

Introduction

About This Book

Foolish Assumptions

Icons Used in This Book

Beyond the Book

Where to Go from Here

Getting Started with Data Science

Wrapping Your Head Around Data Science

Seeing Who Can Make Use of Data Science

Inspecting the Pieces of the Data Science Puzzle

Collecting, querying, and consuming data

Applying mathematical modeling to data science tasks

Deriving insights from statistical methods

Coding, coding, coding — it’s just part of the game

Applying data science to a subject area

Communicating data insights

Exploring Career Alternatives That Involve Data Science

The data implementer

The data leader

The data entrepreneur

Tapping into Critical Aspects of Data Engineering

Defining Big Data and the Three Vs

Grappling with data volume

Handling data velocity

Dealing with data variety

Identifying Important Data Sources

Grasping the Differences among Data Approaches

Defining data science

Defining machine learning engineering

Defining data engineering

Comparing machine learning engineers, data scientists, and data engineers

Storing and Processing Data for Data Science

Storing data and doing data science directly in the cloud

Using serverless computing to execute data science

Containerizing predictive applications within Kubernetes

Sizing up popular cloud-warehouse solutions

Introducing NoSQL databases

Storing big data on-premise

Reminiscing about Hadoop

Incorporating MapReduce, the HDFS, and YARN

Storing data on the Hadoop distributed file system (HDFS)

Putting it all together on the Hadoop platform

Introducing massively parallel processing (MPP) platforms

Processing big data in real-time

Using Data Science to Extract Meaning from Your Data

Machine Learning Means … Using a Machine to Learn from Data

Defining Machine Learning and Its Processes

Walking through the steps of the machine learning process

Becoming familiar with machine learning terms

Considering Learning Styles

Learning with supervised algorithms

Learning with unsupervised algorithms

Learning with reinforcement

Seeing What You Can Do

Selecting algorithms based on function

Using Spark to generate real-time big data analytics

Math, Probability, and Statistical Modeling

Exploring Probability and Inferential Statistics

Probability distributions

Conditional probability with Naïve Bayes

Quantifying Correlation

Calculating correlation with Pearson’s r

Ranking variable-pairs using Spearman’s rank correlation

Reducing Data Dimensionality with Linear Algebra

Decomposing data to reduce dimensionality

Reducing dimensionality with factor analysis

Decreasing dimensionality and removing outliers with PCA

Modeling Decisions with Multiple Criteria Decision-Making

Turning to traditional MCDM

Focusing on fuzzy MCDM

Introducing Regression Methods

Linear regression

Logistic regression

Ordinary least squares (OLS) regression methods

Detecting Outliers

Analyzing extreme values

Detecting outliers with univariate analysis

Detecting outliers with multivariate analysis

Introducing Time Series Analysis

Identifying patterns in time series

Modeling univariate time series data

Grouping Your Way into Accurate Predictions

Starting with Clustering Basics

Getting to know clustering algorithms

Examining clustering similarity metrics

Identifying Clusters in Your Data

Clustering with the k-means algorithm

Estimating clusters with kernel density estimation (KDE)

Clustering with hierarchical algorithms

Dabbling in the DBScan neighborhood

Categorizing Data with Decision Tree and Random Forest Algorithms

Drawing a Line between Clustering and Classification

Introducing instance-based learning classifiers

Getting to know classification algorithms

Making Sense of Data with Nearest Neighbor Analysis

Classifying Data with Average Nearest Neighbor Algorithms

Classifying with K-Nearest Neighbor Algorithms

Understanding how the k-nearest neighbor algorithm works

Knowing when to use the k-nearest neighbor algorithm

Exploring common applications of k-nearest neighbor algorithms

Solving Real-World Problems with Nearest Neighbor Algorithms

Seeing k-nearest neighbor algorithms in action

Seeing average nearest neighbor algorithms in action

Coding Up Data Insights and Decision Engines

Seeing Where Python and R Fit into Your Data Science Strategy

Using Python for Data Science

Sorting out the various Python data types

Numbers in Python

Strings in Python

Lists in Python

Tuples in Python

Sets in Python

Dictionaries in Python

Putting loops to good use in Python

Having fun with functions

Keeping cool with classes

Checking out some useful Python libraries

Saying hello to the NumPy library

Getting up close and personal with the SciPy library

Peeking into the Pandas offering

Bonding with MatPlotLib for data visualization

Learning from data with Scikit-learn

Using Open Source R for Data Science

Comprehending R’s basic vocabulary

Delving into functions and operators

Iterating in R

Observing how objects work

Sorting out R's popular statistical analysis packages

Examining packages for visualizing, mapping, and graphing in R

Visualizing R statistics with ggplot2

Analyzing networks with statnet and igraph

Mapping and analyzing spatial point patterns with spatstat

Generating Insights with Software Applications

Choosing the Best Tools for Your Data Science Strategy

Getting a Handle on SQL and Relational Databases

Investing Some Effort into Database Design

Defining data types

Designing constraints properly

Normalizing your database

Narrowing the Focus with SQL Functions

MINING TEXT WITH SQL

Making Life Easier with Excel

Using Excel to quickly get to know your data

Filtering in Excel

Using conditional formatting

Excel charting to visually identify outliers and trends

Reformatting and summarizing with PivotTables

Automating Excel tasks with macros

Telling Powerful Stories with Data

Data Visualizations: The Big Three

Data storytelling for decision makers

Data showcasing for analysts

Designing data art for activists

Designing to Meet the Needs of Your Target Audience

Step 1: Brainstorm (All about Eve)

Step 2: Define the purpose

Step 3: Choose the most functional visualization type for your purpose

Picking the Most Appropriate Design Style

Inducing a calculating, exacting response

Eliciting a strong emotional response

Selecting the Appropriate Data Graphic Type

Standard chart graphics

Comparative graphics

Statistical plots

Topology structures

Spatial plots and maps

Testing Data Graphics

Adding Context

Creating context with data

Creating context with annotations

Creating context with graphical elements

KNOWING WHEN TO GET PERSUASIVE

Taking Stock of Your Data Science Capabilities

Developing Your Business Acumen

Bridging the Business Gap

Contrasting business acumen with subject matter expertise

Defining business acumen

Traversing the Business Landscape

Seeing how data roles support the business in making money

Leveling up your business acumen

Fortifying your leadership skills

Surveying Use Cases and Case Studies

Documentation for data leaders

Documentation for data implementers

Improving Operations

Establishing Essential Context for Operational Improvements Use Cases

Exploring Ways That Data Science Is Used to Improve Operations

Making major improvements to traditional manufacturing operations

Optimizing business operations with data science

An AI case study: Automated, personalized, and effective debt collection processes

The solution

The result

Gaining logistical efficiencies with better use of real-time data

Another AI case study: Real-time optimized logistics routing

The solution

The result

Modernizing media and the press with data science and AI

Generating content with the click of a button

A SAMPLE OF GPT-3 GENERATED CONTENT

Yet another case study: Increasing content generation rates

The problem

The solution

The result

Making Marketing Improvements

Exploring Popular Use Cases for Data Science in Marketing

Turning Web Analytics into Dollars and Sense

Getting acquainted with omnichannel analytics

Mapping your channels

Building analytics around channel performance

Scoring your company’s channels

HEEDING THE DEMAND FOR DATA PRIVACY

Building Data Products That Increase Sales-and-Marketing ROI

Increasing Profit Margins with Marketing Mix Modeling

Collecting data on the four Ps

Inspecting important product features

Playing with the price aspect

Placing your product

Promoting your offer

Implementing marketing mix modeling

Increasing profitability with MMM

Enabling Improved Decision-Making

Improving Decision-Making

Barking Up the Business Intelligence Tree

Using Data Analytics to Support Decision-Making

Types of analytics

Common challenges in analytics

Data wrangling

Increasing Profit Margins with Data Science

Seeing which kinds of data are useful when using data science for decision support

Directing improved decision-making for call center agents

Case study: Improving call center operations

THE NEED

THE ACTION

THE OUTCOME

Discovering the tipping point where the old way stops working

Decreasing Lending Risk and Fighting Financial Crimes

Decreasing Lending Risk with Clustering and Classification

Preventing Fraud Via Natural Language Processing (NLP)

Monetizing Data and Data Science Expertise

Setting the Tone for Data Monetization

Monetizing Data Science Skills as a Service

Data preparation services

Model building services

Selling Data Products

Direct Monetization of Data Resources

Coupling data resources with a service and selling it

Making money with data partnerships

MONETIZING A PRODUCT THAT’S BUILT SOLELY FROM PARTNERS’ DATA RESOURCES

Pricing Out Data Privacy

Assessing Your Data Science Options

Gathering Important Information about Your Company

Unifying Your Data Science Team Under a Single Business Vision

Framing Data Science around the Company’s Vision, Mission, and Values

Taking Stock of Data Technologies

Inventorying Your Company’s Data Resources

Requesting your data dictionary and inventory

Confirming what’s officially on file

Unearthing data silos and data quality issues

People-Mapping

Requesting organizational charts

Surveying the skillsets of relevant personnel

Avoiding Classic Data Science Project Pitfalls

Staying focused on the business, not on the tech

Drafting best practices to protect your data science project

Tuning In to Your Company’s Data Ethos

Collecting the official data privacy policy

Taking AI ethics into account

Making Information-Gathering Efficient

Narrowing In on the Optimal Data Science Use Case

Reviewing the Documentation

Selecting Your Quick-Win Data Science Use Cases

Zeroing in on the quick win

Producing a POTI model

Picking between Plug-and-Play Assessments

Carrying out a data skill gap analysis for your company

Assessing the ethics of your company’s AI projects and products

Illustrating the need for ethical AI

Proving accountability for AI solutions

Vouching for your company’s AI

Unbiasing AI

Assessing data governance and data privacy policies

Planning for Future Data Science Project Success

Preparing an Implementation Plan

Supporting Your Data Science Project Plan

Analyzing your alternatives

Interviewing intended users and designing accordingly

POTI modeling the future state

Executing On Your Data Science Project Plan

Blazing a Path to Data Science Career Success

Navigating the Data Science Career Matrix

Landing Your Data Scientist Dream Job

Leaning into data science implementation

Acing your accreditations

Making the grade with coding bootcamps and data science career accelerators

Networking and building authentic relationships

Developing your own thought leadership in data science

Building a public data science project portfolio

Showcasing your data science skills

Deciding which data science activities to publish

Taking inspiration from the data science greats

Leading with Data Science

BECOMING YOUR COMPANY’S DATA SCIENCE LEADER: A TRUE STORY

Starting Up in Data Science

Choosing a business model for your data science business

Selecting a data science start-up revenue model

Taking inspiration from Kam Lee’s success story

Following in the footsteps of the data science entrepreneurs

The Part of Tens

Ten Phenomenal Resources for Open Data

Digging Through data.gov

Checking Out Canada Open Data

Diving into data.gov.uk

Checking Out US Census Bureau Data

Accessing NASA Data

Wrangling World Bank Data

Getting to Know Knoema Data

Queuing Up with Quandl Data

Exploring Exversion Data

Mapping OpenStreetMap Spatial Data

Ten Free or Low-Cost Data Science Tools and Applications

Scraping, Collecting, and Handling Data Tools

Sourcing and aggregating image data with ImageQuilts

Wrangling data with DataWrangler

Data-Exploration Tools

Getting up to speed in Gephi

Machine learning with the WEKA suite

Designing Data Visualizations

Getting Shiny by RStudio

Mapmaking and spatial data analytics with CARTO

Talking about Tableau Public

Using RAWGraphs for web-based data visualization

Communicating with Infographics

Making cool infographics with Infogram

Making cool infographics with Piktochart

Index. Symbols and Numerics

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

T

U

V

W

Y

Z

About the Author

Dedication

Author’s Acknowledgments

WILEY END USER LICENSE AGREEMENT

Excerpt from the book

This book was written as much for expert data scientists as it was for aspiring ones. Its content represents a new approach to doing data science, one that puts business vision and profitability at the heart of our work as data scientists.

Data science and artificial intelligence (AI, for short) have disrupted the business world so radically that it's nearly unrecognizable compared to what things were like just 10 or 15 years ago. The good news is that most of these changes have made everyone’s lives and businesses more efficient, more fun, and dramatically more interesting. The bad news is that if you don’t yet have at least a modicum of data science competence, your business and employment prospects are growing dimmer by the moment.

.....

You have a number of products to choose from when it comes to cloud-warehouse solutions. The following list looks at the most popular options:

A traditional RDBMS isn’t equipped to handle big data demands. That’s because it’s designed to handle only relational datasets constructed of data that’s stored in clean rows and columns and thus is capable of being queried via SQL. RDBMSs are incapable of handling unstructured and semistructured data. Moreover, RDBMSs simply lack the processing and handling capabilities that are needed for meeting big data volume-and-velocity requirements.
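The contrast between clean rows and columns and semistructured records can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `orders` table, its column names, and the sample JSON records are hypothetical and do not come from the book, and SQLite stands in for a relational database in general. The sketch shows that fixed-schema data can be queried directly with SQL, while records whose fields vary contain fields that have no matching column.

```python
import json
import sqlite3

# Hypothetical structured data: every row has the same three columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (id, customer, total_cents) VALUES (?, ?, ?)",
    [(1, "Ana", 1999), (2, "Ben", 500), (3, "Ana", 4250)],
)

# Clean rows and columns can be queried directly with SQL.
totals = conn.execute(
    "SELECT customer, SUM(total_cents) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)

# Hypothetical semistructured records: the fields differ from record to record.
events = [
    json.loads('{"id": 4, "customer": "Cy", "items": [{"sku": "A1", "qty": 2}]}'),
    json.loads('{"id": 5, "note": "call back", "tags": ["vip", "late"]}'),
]

# Column names defined by the table's fixed schema.
table_columns = {row[1] for row in conn.execute("PRAGMA table_info(orders)")}

# For each record, list the fields that have no matching column.
unmapped = [sorted(set(event) - table_columns) for event in events]
print(unmapped)
```

The nested `items` list and the free-form `note` and `tags` fields would have to be reshaped or dropped before these records could be loaded into the `orders` table as defined.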

.....
