The Open Source Feature Store for Machine Learning
Updated Jun 3, 2024 - Python
Source-available data quality tool
Test data management tool for any data source, batch or real-time
Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observability. Configure data quality checks from the UI or in YAML files, and let DQOps run them daily to detect data quality issues.
CSV Lint plug-in for Notepad++ for syntax highlighting, CSV validation, automatic column and datatype detection, fixed-width datasets, changing datetime formats and decimal separators, sorting data, counting unique values, and converting to XML, JSON, SQL, etc. A plugin for data cleaning and working with messy data files.
OpenMetadata is a unified platform for discovery, observability, and governance powered by a central metadata repository, in-depth lineage, and seamless team collaboration.
lakeFS - Data version control for your data lake | Git for data
Example API implementation for Data Caterer
Data quality checking tool for diagnosing problems in data
On this site I share personal thoughts about data, data governance, data quality, metadata, and side projects.
Tool for automatic determination of data quality (accuracy and precision) of wearable eye tracker recordings
The open-source tool for building high-quality datasets and computer vision models
Client interface for all things Cleanlab Studio
Always know what to expect from your data.
Home of the Open Data Contract Standard (ODCS).
One line of code for data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
DataOps TestGen is part of DataKitchen's Open Source Data Observability. It delivers simple, fast data quality test generation and execution through data profiling, new dataset screening and hygiene review, algorithmic generation of data quality validation tests, ongoing testing of new data refreshes, and continuous data anomaly monitoring.
The first open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
This automated anomaly detection preprocessing pipeline can be used to automatically preprocess tabular data for anomaly detection methods.