Show HN: Artie – Real-time data replication to your warehouse, now self-serve

artie.com

12 points by tang8330 16 hours ago

Hey HN, cofounder of Artie here. We’ve built a real-time data replication tool that captures every row-level change in your source database and streams it to your warehouse in under 60 seconds.

The last time I posted here, people had to book a call with us in order to access Artie. Today, that’s no longer the case. You can now connect your source and destination and start streaming immediately.

I spent years of my career building large-scale data pipelines and experienced firsthand how difficult it was to get real-time data. I believed there had to be a better way to stream data into a warehouse, and that is how Artie was born. Now, with AI agents, reducing data latency has become even more important, since agents need to make decisions based on fresh data.

When I first started building Artie, I quickly learned that the components meant to keep CDC running smoothly are largely bolted on and come with tons of edge cases. In practice, they were not built to work together. We ended up dealing with schema drift, backfill race conditions, Kafka offset commits, and TOAST columns. I’d love to know if others have hit these same issues while building in-house.

artie.com, would love feedback!

anoop_kumar 10 hours ago

What does Artie do differently from Debezium for TOAST columns and schema drift, or is it Debezium under the hood?

  • tang8330 4 hours ago

    Great question - there's no Debezium under the hood. Artie has its own Reader and Transfer layers, built from scratch.

    TOAST columns: Artie has automatic detection built in. If a TOAST column hasn't changed, its value won't appear in the WAL. Artie detects this and skips the update for that column in the destination. This works without needing to set REPLICA IDENTITY FULL on your tables.
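
    To make the TOAST behavior concrete, here is a minimal sketch of the idea, not Artie's actual code. It assumes the reader marks a TOAST value that was absent from the WAL with a placeholder; the names UNCHANGED_TOAST, columns_to_write, and build_update_sql are hypothetical and exist only for illustration.

```python
# Hypothetical sentinel: stands in for a TOAST value that the WAL did not
# include because the column was not modified by the UPDATE.
UNCHANGED_TOAST = object()


def columns_to_write(after_image: dict) -> dict:
    """Keep only the columns whose values were present in the change event."""
    return {
        col: val
        for col, val in after_image.items()
        if val is not UNCHANGED_TOAST
    }


def build_update_sql(table: str, primary_key: str, after_image: dict):
    """Build a parameterized UPDATE that leaves unchanged TOAST columns alone.

    Returns (sql, params), or None when there is nothing to update.
    """
    cols = columns_to_write(after_image)
    set_cols = [c for c in cols if c != primary_key]
    if not set_cols:
        return None  # only the key (or nothing) was present in the event

    assignments = ", ".join(f"{c} = %s" for c in set_cols)
    sql = f"UPDATE {table} SET {assignments} WHERE {primary_key} = %s"
    params = [cols[c] for c in set_cols] + [cols[primary_key]]
    return sql, params


# Example event: "body" is a large TOASTed column that was not modified.
event = {"id": 1, "title": "new title", "body": UNCHANGED_TOAST}
result = build_update_sql("posts", "id", event)
```

    In this sketch the destination keeps its existing value for body, because that column never appears in the SET clause.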

    Schema drift: Artie never requires a schema registry. For relational sources like Postgres, Artie reads the source schema directly and syncs new columns immediately. For DDL changes, Artie uses lazy schema evaluation. On the next DML event for the table, it compares source vs. destination schema and applies any outstanding changes before writing the row.
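
    The lazy schema evaluation described above can be sketched roughly as follows. This is a simplified in-memory model under stated assumptions, not Artie's implementation: the Destination class, pending_schema_changes, and handle_dml are hypothetical names, and only added columns are handled.

```python
def pending_schema_changes(source_cols: dict, dest_cols: dict) -> list:
    """Return (name, type) pairs that exist in the source but not the destination."""
    return [
        (name, col_type)
        for name, col_type in source_cols.items()
        if name not in dest_cols
    ]


class Destination:
    """Hypothetical in-memory stand-in for a warehouse table."""

    def __init__(self, columns: dict):
        self.columns = dict(columns)
        self.rows = []
        self.ddl_log = []

    def add_column(self, name: str, col_type: str) -> None:
        # A real destination would run an ALTER TABLE statement here.
        self.ddl_log.append(f"ADD COLUMN {name} {col_type}")
        self.columns[name] = col_type

    def write(self, row: dict) -> None:
        self.rows.append(row)


def handle_dml(row: dict, source_cols: dict, dest: Destination) -> None:
    """Apply outstanding schema changes lazily, then write the row."""
    for name, col_type in pending_schema_changes(source_cols, dest.columns):
        dest.add_column(name, col_type)
    dest.write(row)


# The source gained an "email" column; the destination has not seen it yet.
source_schema = {"id": "int", "email": "text"}
dest = Destination({"id": "int"})

handle_dml({"id": 1, "email": "a@example.com"}, source_schema, dest)
handle_dml({"id": 2, "email": "b@example.com"}, source_schema, dest)
```

    The first event triggers the column addition and the second finds nothing outstanding, so the schema comparison only results in DDL when the two sides actually differ.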

    Let me know if you have any other questions!

arsalanb an hour ago

Congrats on the launch! Very impressive product!

  • j-cheong 32 minutes ago

    Thank you! Appreciate it :)