Data Cleaning - Ihab F. Ilyas

Contents

Preface

Figure and Table Credits

Chapter 1 Introduction

1.1 Data Cleaning Workflow

1.2 Book Scope

Chapter 2 Outlier Detection

2.1 A Taxonomy of Outlier Detection Methods

2.2 Statistics-Based Outlier Detection

2.3 Distance-Based Outlier Detection

2.4 Model-Based Outlier Detection

2.5 Outlier Detection in High-Dimensional Data

2.6 Conclusion

Chapter 3 Data Deduplication

3.1 Similarity Metrics

3.2 Predicting Duplicate Pairs

3.3 Clustering

3.4 Blocking for Deduplication

3.5 Distributed Data Deduplication

3.6 Record Fusion and Entity Consolidation

3.7 Human-Involved Data Deduplication

3.8 Data Deduplication Tools

3.9 Conclusion

Chapter 4 Data Transformation

4.1 Syntactic Data Transformations

4.2 Semantic Data Transformations

4.3 ETL Tools

4.4 Conclusion

Chapter 5 Data Quality Rule Definition and Discovery

5.1 Functional Dependencies

5.2 Conditional Functional Dependencies

5.3 Denial Constraints

5.4 Other Types of Constraints

5.5 Conclusion

Chapter 6 Rule-Based Data Cleaning

6.1 Violation Detection

6.2 Error Repair

6.3 Conclusion

Chapter 7 Machine Learning and Probabilistic Data Cleaning

7.1 Machine Learning for Data Deduplication

7.2 Machine Learning for Data Repair

7.3 Data Cleaning for Analytics and Machine Learning

Chapter 8 Conclusion and Future Thoughts

References

Index

Author Biographies

Data Cleaning