Smarter Data Science
Table of Contents
Cole Stryker. Smarter Data Science
Table of Contents
List of Illustrations
Praise For This Book
Smarter Data Science. Succeeding with Enterprise-Grade Data and AI Projects
About the Authors
Acknowledgments
Foreword for Smarter Data Science
Epigraph
Preamble
Why You Need This Book
What You'll Learn
CHAPTER 1 Climbing the AI Ladder
Readying Data for AI
Technology Focus Areas
Taking the Ladder Rung by Rung
THE BIG PICTURE
Constantly Adapt to Retain Organizational Relevance
ECONOMICALLY VIABLE
Data-Based Reasoning Is Part and Parcel in the Modern Business
LEARNING
Toward the AI-Centric Organization
SCALE
Summary
CHAPTER 2 Framing Part I: Considerations for Organizations Using AI
Data-Driven Decision-Making
Using Interrogatives to Gain Insight
NOTE
The Trust Matrix
The Importance of Metrics and Human Insight
THE ZACHMAN FRAMEWORK
Democratizing Data and Data Science
DEMOCRATIZATION
Aye, a Prerequisite: Organizing Data Must Be a Forethought
NOTE
NOTE
NOTE
NOTE
Preventing Design Pitfalls
ARCHITECTURE AND DESIGN
Facilitating the Winds of Change: How Organized Data Facilitates Reaction Time
NOTE
MUTABLE
Quae Quaestio (Question Everything)
QUESTIONING
Summary
CHAPTER 3 Framing Part II: Considerations for Working with Data and AI
Personalizing the Data Experience for Every User
NOTE
NOTE
WATER
Context Counts: Choosing the Right Way to Display Data
CONTEXT
Ethnography: Improving Understanding Through Specialized Data
DRILLING DOWN
Data Governance and Data Quality
The Value of Decomposing Data
Providing Structure Through Data Governance
Curating Data for Training
Additional Considerations for Creating Value
STANDARDIZATION
Ontologies: A Means for Encapsulating Knowledge
SEMANTICALLY DISAMBIGUATE
Fairness, Trust, and Transparency in AI Outcomes
NOTE
NOTE
ETHICS
Accessible, Accurate, Curated, and Organized
CURATED
Summary
CHAPTER 4 A Look Back on Analytics: More Than One Hammer
Been Here Before: Reviewing the Enterprise Data Warehouse
NOTE
NOTE
NOTE
NOTE
NOTE
NOTE
NOTE
A RELATIONSHIP IS NOT JUST A LINE BETWEEN OBJECTS
Drawbacks of the Traditional Data Warehouse
NOTE
NOTE
NOTE
NOTE
NOTE
THE LOGIC BEHIND A BEST PRACTICE
Paradigm Shift
NOTE
ANY VOLUME IN ZERO SECONDS
Modern Analytical Environments: The Data Lake
By Contrast
NOTE
NOTE
NOTE
Indigenous Data
Attributes of Difference
RAW DATA
Elements of the Data Lake
BIG DATA QUALITY
The New Normal: Big Data Is Now Normal Data
Liberation from the Rigidity of a Single Data Model
Streaming Data
Suitable Tools for the Task
Easier Accessibility
Reducing Costs
Scalability
Data Management and Data Governance for AI
FACTORS
Schema-on-Read vs. Schema-on-Write
AN UNDERLYING METAMODEL
Summary
NOTE
NOTE
CHAPTER 5 A Look Forward on Analytics: Not Everything Can Be a Nail
A Need for Organization
NOTE
NOTE
NOTE
The Staging Zone
The Raw Zone
The Discovery and Exploration Zone
The Aligned Zone
The Harmonized Zone
The Curated Zone
DATA RICH, INFORMATION POOR
Data Topologies
NOTE
Zone Map
Data Pipelines
NOTE
Data Topography
MISGUIDED TENETS
Expanding, Adding, Moving, and Removing Zones
LEAF ZONES
Enabling the Zones
Ingestion
Data Governance
Data Storage and Retention
NOTE
Data Processing
Data Access
Management and Monitoring
Metadata
WHITE BOX, GRAY BOX, BLACK BOX
Summary
CHAPTER 6 Addressing Operational Disciplines on the AI Ladder
A Passage of Time
NOTE
NOTE
ADAPTIVE OVER AGILE
Create
Stability
NOTE
Barriers
Complexity
REDUCTION
Execute
Ingestion
NOTE
Visibility
Compliance
MVP
Operate
NOTE
Quality
NOTE
Reliance
Reusability
ADAPTIVE
The xOps Trifecta: DevOps/MLOps, DataOps, and AIOps
DevOps/MLOps
DataOps
AIOps
ADAPTIVE
Summary
CHAPTER 7 Maximizing the Use of Your Data: Being Value Driven
Toward a Value Chain
NOTE
NOTE
Chaining Through Correlation
NOTE
Enabling Action
Expanding the Means to Act
IT'S ALL JUST METADATA
Curation
NOTE
FIT FOR PURPOSE
Data Governance
WAIVERS
Integrated Data Management
Onboarding
Organizing
NOTE
Cataloging
Metadata
Preparing
Provisioning
Multi-Tenancy
FEATURE ENGINEERING
NOTE
Summary
CHAPTER 8 Valuing Data with Statistical Analysis and Enabling Meaningful Access
Deriving Value: Managing Data as an Asset
NOTE
NOTE
ATTENTION TO DETAIL
NOTE
NOTE
NOTE
An Inexact Science
NOTE
FROM THE BEGINNING
Accessibility to Data: Not All Users Are Equal
HIDDEN BY NECESSITY
Providing Self-Service to Data
AVOIDING VAGUE OR AMBIGUOUS METADATA
Access: The Importance of Adding Controls
LYING
Ranking Datasets Using a Bottom-Up Approach for Data Governance
DATA QUALITY APPLIES
How Various Industries Use Data and AI
BOUNDARIES
Benefiting from Statistics
NOTE
MISAPPLIED
Summary
CHAPTER 9 Constructing for the Long-Term
The Need to Change Habits: Avoiding Hard-Coding
NOTE
NOTE
NOTE
Overloading
NOTE
Locked In
SIMPLE ISSUES MAY NOT ALWAYS BE SIMPLE TO MITIGATE
Ownership and Decomposition
Design to Avoid Change
OSAPI
Extending the Value of Data Through AI
NOTE
TIME IS AN INTANGIBLE ASSET
Polyglot Persistence
NOTE
MODELS DEPLOYED AS MICROSERVICES
Benefiting from Data Literacy
Understanding a Topic
Skillsets
NOTE
It's All Metadata
NOTE
NOTE
The Right Data, in the Right Context, with the Right Interface
NOTE
ASCERTAINING CONTEXT
Summary
CHAPTER 10 A Journey's End: An IA for AI
Development Efforts for AI
RETRAINING
Essential Elements: Cloud-Based Computing, Data, and Analytics
RESILIENCY
Intersections: Compute Capacity and Storage Capacity
SUSTAIN
Analytic Intensity
xPU ACCELERATION
Interoperability Across the Elements
NOTE
NOTE
A USE CASE
Data Pipeline Flight Paths: Preflight, Inflight, Postflight
A USE CASE (CONTINUED)
Data Management for the Data Puddle, Data Pond, and Data Lake
NOTE
A POTENTIAL BABEL FISH
Driving Action: Context, Content, and Decision-Makers
EXPLAINABLE AI
Keep It Simple
ALTERNATIVES TO COMPLEX DATA SECURITY PROFILES
The Silo Is Dead; Long Live the Silo
NOTE
THE BODY AS A MYRIAD OF SILOS
Taxonomy: Organizing Data Zones
THE BODY AS A MYRIAD OF SILOS: SPECIALIZATION
Capabilities for an Open Platform
Summary
Appendix Glossary of Terms
Index
WILEY END USER LICENSE AGREEMENT
Excerpt from the book
The authors have obviously explored the paths toward an efficient information architecture. There is value in learning from their experience. If you have responsibility for or influence over how your organization uses artificial intelligence you will find Smarter Data Science an invaluable read. It is noteworthy that the book is written with a sense of scope that lends to its credibility. So much written about AI technologies today seems to assume a technical vacuum. We are not all working in startups! We have legacy technology that needs to be considered. The authors have created an excellent resource that acknowledges that enterprise context is a nuanced and important problem. The ideas are presented in a logical and clear format that is suitable to the technologist as well as the businessperson.
Christopher Smith, Chief Knowledge Management and Innovation Officer, Sullivan & Cromwell, LLC
.....
Advanced analytics, including AI, can provide a basis for establishing reasoning by using inductive and deductive techniques. Being able to interpret user interactions as a series of signals can allow a system to offer content that is appropriate for the user's context in real time.
To maximize the usefulness of the content, the data should be of an appropriate level of quality, appropriately structured or tagged, and, as appropriate, correlated with information from disparate systems and processes. Ascertaining a user's context is also an analytical task and involves the system trying to understand the relationship between the user and the user's specific work task.
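As a rough illustration of the idea above, the following sketch treats user interactions as a list of signals, infers the most likely work task from them, and then offers only content that is tagged for that task and meets a quality threshold. Every name here (`infer_context`, `recommend`, the `task`, `tags`, and `quality` fields, and the 0.8 threshold) is a hypothetical assumption made for this example and does not come from the book. A real system would likely use far richer analytics than a frequency count.

```python
from collections import Counter


def infer_context(signals):
    """Return the most frequent task tag among the signals, or None if there is none.

    Assumption: each signal is a dict that may carry a hypothetical "task" label.
    """
    counts = Counter(s["task"] for s in signals if s.get("task"))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def recommend(content_items, context, min_quality=0.8):
    """Keep items tagged for the context whose quality meets the threshold, best first.

    Assumption: "tags" and "quality" are illustrative fields, not a real schema.
    """
    eligible = [
        item
        for item in content_items
        if context in item["tags"] and item["quality"] >= min_quality
    ]
    return sorted(eligible, key=lambda item: item["quality"], reverse=True)


# Hypothetical interaction signals captured for one user session.
signals = [
    {"action": "search", "task": "invoice-review"},
    {"action": "open", "task": "invoice-review"},
    {"action": "open", "task": "onboarding"},
]

# Hypothetical content catalog with tags and a quality score per item.
content = [
    {"title": "Invoice approval checklist", "tags": {"invoice-review"}, "quality": 0.95},
    {"title": "Old invoice memo", "tags": {"invoice-review"}, "quality": 0.40},
    {"title": "New hire guide", "tags": {"onboarding"}, "quality": 0.90},
]

context = infer_context(signals)
titles = [item["title"] for item in recommend(content, context)]
print(context, titles)
```

The low-quality memo is filtered out even though it matches the inferred task, which mirrors the point that content must be both appropriately tagged and of appropriate quality to be useful.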
.....