Data Quality?
selfstudy 2020. 5. 19. 01:23
https://neilpatel.com/blog/data-quality/
What is Data Quality and How Do You Measure It for Best Results?
Learn how to qualify the data you gather and measure it for better ROI on your business and advertising campaigns.
From the above link:
Why do we need data 'quality'?
- So much data is flooding in, and we need to make decisions quickly but efficiently. Perfectly complete data with 100% accuracy is out of reach (it is expensive and time-consuming to get), so balancing completeness against accuracy is the most critical task.
4 options for handling an error when determining data quality
- Accept the error
- Reject the error (e.g. incorrect data)
- Correct the error (e.g. a misspelling)
- Create a default value
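The four options above can be sketched as a small decision function. This is only an illustration, not something from the article: the country field, `VALID_COUNTRIES`, `KNOWN_FIXES`, `DEFAULT_COUNTRY` and the `accept_unknown` flag are all hypothetical names chosen for the example.

```python
from enum import Enum


class Action(Enum):
    """The four ways of handling an error listed above."""
    ACCEPT = "accept"
    REJECT = "reject"
    CORRECT = "correct"
    DEFAULT = "default"


# Hypothetical reference data, assumed only for this sketch.
VALID_COUNTRIES = {"United States", "Korea", "Germany"}
KNOWN_FIXES = {"Untied States": "United States"}  # known misspellings
DEFAULT_COUNTRY = "Unknown"


def handle_country(value, accept_unknown=False):
    """Return (action, cleaned_value) for one country field.

    action is None when the value has no error at all.
    """
    if value is None or value.strip() == "":
        # Missing value: create a default value.
        return Action.DEFAULT, DEFAULT_COUNTRY
    value = value.strip()
    if value in VALID_COUNTRIES:
        return None, value
    if value in KNOWN_FIXES:
        # Recognised misspelling: correct the error.
        return Action.CORRECT, KNOWN_FIXES[value]
    if accept_unknown:
        # The error is tolerable for this use case: accept it as is.
        return Action.ACCEPT, value
    # Otherwise the value is treated as incorrect data: reject it.
    return Action.REJECT, None


if __name__ == "__main__":
    for raw in ["Korea", "Untied States", "", "Atlantis"]:
        print(repr(raw), handle_country(raw))
```

Which branch is right depends on the field: accepting an unknown country may be fine for a marketing list but not for a shipping address.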
Data quality dimensions from DAMA UK
- Completeness
- Uniqueness
- Timeliness
- Validity
- Accuracy
- Consistency
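Some of these dimensions can be turned into simple ratios. The sketch below is my own illustration, not DAMA UK's definitions: the sample records, the email pattern and the 30-day freshness limit are assumptions made up for the example. Accuracy and consistency are left out because they need a trusted reference or a second data set to compare against.

```python
import re
from datetime import date

# Made-up sample records, assumed only for this sketch.
records = [
    {"id": 1, "email": "a@example.com", "updated": date(2020, 5, 1)},
    {"id": 2, "email": None, "updated": date(2020, 5, 10)},
    {"id": 2, "email": "b@example", "updated": date(2019, 1, 1)},
    {"id": 4, "email": "c@example.com", "updated": date(2020, 5, 18)},
]

# Deliberately simple email pattern; real validation rules may differ.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Share of rows where the field is filled in."""
    return sum(r[field] is not None for r in rows) / len(rows)


def uniqueness(rows, field):
    """Share of distinct values among all rows (1.0 means no duplicates)."""
    return len({r[field] for r in rows}) / len(rows)


def validity(rows, field, pattern):
    """Share of filled-in values that match the expected pattern."""
    present = [r[field] for r in rows if r[field] is not None]
    if not present:
        return 0.0
    return sum(bool(pattern.match(v)) for v in present) / len(present)


def timeliness(rows, field, today, max_age_days):
    """Share of rows updated within the last max_age_days days."""
    return sum((today - r[field]).days <= max_age_days for r in rows) / len(rows)


if __name__ == "__main__":
    today = date(2020, 5, 19)
    print("completeness:", completeness(records, "email"))
    print("uniqueness:  ", uniqueness(records, "id"))
    print("validity:    ", validity(records, "email", EMAIL_RE))
    print("timeliness:  ", timeliness(records, "updated", today, 30))
```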
https://www.scnsoft.com/blog/guide-to-data-quality-management
Guide to Data Quality Management: Metrics, Process and Best Practices
Your Guide to Data Quality Management Data Analytics Researcher, ScienceSoft Published: Dec 13, 2018 Updated: Mar 23, 2020 Editor's note: In the article, Irene reveals some tips on how a company can measure and improve the quality of their data. If you w
From the above link:
Definition of data quality = the state of data, closely tied to its ability to solve business tasks. It has the following attributes: consistency, accuracy, completeness, *auditability, **orderliness, uniqueness, timeliness.
* Auditability: Data is accessible and it's possible to trace the changes introduced to it.
** Orderliness: Data must follow a required format and structure. For example, the US and Europe write dates differently, so a single date format has to be agreed on.
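The date example shows why orderliness matters: the same text means two different days depending on the convention. A minimal sketch, assuming ISO 8601 (YYYY-MM-DD) as the agreed target format (my choice for the example, not something the article prescribes) and with `to_iso` and `is_orderly` as hypothetical helper names:

```python
from datetime import datetime

US_FORMAT = "%m/%d/%Y"  # month first
EU_FORMAT = "%d/%m/%Y"  # day first


def to_iso(text, source_format):
    """Parse a date written in source_format and return it as YYYY-MM-DD."""
    return datetime.strptime(text, source_format).date().isoformat()


def is_orderly(text):
    """Check whether a value already follows the agreed YYYY-MM-DD format."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    raw = "05/06/2020"
    print(to_iso(raw, US_FORMAT))  # read as May 6
    print(to_iso(raw, EU_FORMAT))  # read as June 5
    print(is_orderly(raw), is_orderly("2020-05-06"))
```

The source convention cannot be guessed from the text alone, so it has to be recorded alongside the data or fixed at entry time.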
Problems caused by low data quality
- Unreliable information
- Incomplete data
- Ambiguous data interpretation
- Duplicated data
- Outdated information
- Late data entry/update
Data Quality Management
- Define data quality thresholds and rules: different fields may need different thresholds.
- Assess the quality of data: does it follow the rules? (e.g. a misspelling check, an orderliness check)
- Resolve data quality issues: eliminate their root causes.
- Monitor and control data
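The first two steps (define rules and thresholds per field, then assess) could look roughly like this. It is a sketch under my own assumptions: the `RULES` table, the field names and the threshold values are invented for illustration and do not come from the ScienceSoft article.

```python
def is_filled(value):
    """True if the value is present and not just whitespace."""
    return value is not None and str(value).strip() != ""


# field -> (rule, minimum share of rows that must pass the rule).
# Thresholds differ per field: email is critical, phone is optional.
RULES = {
    "email": (lambda v: is_filled(v) and "@" in v, 0.95),
    "phone": (is_filled, 0.50),
}


def assess(rows, rules):
    """Return, per field, the pass rate and whether it meets its threshold."""
    report = {}
    for field, (rule, threshold) in rules.items():
        passed = sum(1 for r in rows if rule(r.get(field)))
        rate = passed / len(rows) if rows else 0.0
        report[field] = {
            "pass_rate": rate,
            "threshold": threshold,
            "ok": rate >= threshold,
        }
    return report


if __name__ == "__main__":
    rows = [
        {"email": "a@example.com", "phone": "010-0000-0000"},
        {"email": "b@example.com", "phone": ""},
        {"email": "not-an-email", "phone": None},
        {"email": "c@example.com", "phone": "010-1111-1111"},
    ]
    for field, result in assess(rows, RULES).items():
        print(field, result)
```

Running the same assessment on a schedule and watching the pass rates over time is one simple way to cover the monitoring step; resolving root causes still has to happen where the data is entered.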
Data Quality: Concepts, Methodologies and Techniques (authors: Carlo Batini, Monica Scannapieco)
- You should classify data by how often it changes: stable, long-term-changing, and frequently changing. This may increase the complexity of your model.