Here are 96 public repositories matching this topic...
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.
Replace Splunk in your small company with this one weird trick!
Updated
Sep 29, 2023
Python
Data Prepper is a component of the OpenSearch project that accepts, filters, transforms, enriches, and routes data at scale.
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Updated
May 9, 2024
Scala
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake, BigQuery, ClickHouse, Postgres, MySQL)
Use local files or public GitHub repository as a source and ask questions through ChatGPT about it
Updated
Oct 9, 2023
TypeScript
Apache Spark examples exclusively in Java
Updated
Apr 21, 2023
Java
A free, open-source, web-based self-service BI tool tailor-made for ClickHouse, Google BigQuery, MySQL, PostgreSQL, and Vertica
Updated
Nov 30, 2023
Scala
IBIS is a workflow creation-engine that abstracts the Hadoop internals of ingesting RDBMS data.
Updated
Apr 13, 2022
Python
Extensible streaming ingestion pipeline on top of Apache Spark
Updated
Mar 21, 2024
Scala
Media Management System: ingestion, processing, encoding, delivery, ...
Updated
Aug 24, 2020
Haskell
💰 A bot for maximizing the borrow subreddit
Updated
Feb 13, 2017
JavaScript
A simple demo application for uploading, ingesting, and embedding videos and converting them to MP4s. From api.video (https://api.video)
Updated
Dec 20, 2022
JavaScript
Spark in Action, 2e - chapter 9 - Advanced ingestion: finding data sources and building your own
Updated
Apr 21, 2023
Java
Periodically ingest incremental updates (inserts / deletes) into BigQuery using Cloud Composer / Airflow orchestration workflow
Updated
Dec 12, 2019
Python
Parallel Streaming Transformation Loader
Updated
Apr 23, 2019
Java
tagbase-server is a data management web service for working with eTUFF and nc-eTAG files.
Updated
Jun 4, 2024
Python
Updated
Jan 23, 2023
Python
Biocaddie Data Processing Pipeline. A data ingestion pipeline that collects and transforms original metadata information to a unified metadata model, called DatA Tag Suite (DATS).