TLDR Data 2025-11-10

Reliable data retrieval for production-ready AI (Sponsor)

Scaling RAG workloads can be complex without the right data infrastructure. Pinecone Vector Database integrates with Amazon Bedrock and Amazon S3 to deliver accurate, low-latency responses for search and generative AI applications.

Pinecone and AWS work together to accelerate RAG workflows and help teams scale AI development without compromising reliability, performance or cost. Explore step-by-step guidance and tutorials to build faster, more dependable apps in your AWS environment.

Strengthen AI dependability with Pinecone on AWS ›

📱

Deep Dives

The Transactional Graph-Enhanced LLM: A Definitive Guide to Read/Write Chatbots for Relational Data (9 minute read)

A robust architectural framework enables LLM-based conversational agents to both read from and write to enterprise relational databases by leveraging a Knowledge Graph (KG) intermediary. Three architecture patterns can be employed to this aim: KG as cache, KG as source of truth, and a hybrid Command Query Responsibility Segregation (CQRS)-inspired approach. The CQRS architecture uses a Knowledge Graph as a semantic layer for query translation and validation. The KG handles reads while the RDBMS ensures transactional integrity for writes.

The Write Last, Read First Rule (6 minute read)

When systems span multiple components without a shared transaction boundary, correctness hinges on a simple but powerful ordering: write to the system of record last, and always read from it first. In practice, this means you update the golden source of truth only after auxiliary stores, and you query it to validate existence before touching the references. Following this order preserves safety (no orphan records) and traceability even in failure-prone, distributed environments.

How Would You Like Your Iceberg Sir? Stream or Batch Ordered? (9 minute read)

Stream-order preserves ingest locality for sequential processing, while batch-order optimizes query locality. Attempting to support both in a single Iceberg table leads to performance penalties, such as costly sorting and shuffling when bootstrapping stream jobs from batch-ordered data. Forcing a single layout hides compute costs that exceed storage savings. Instead, Confluent Tableflow materializes stream data into Iceberg and offers flexibility while doubling storage.

Building Blobd: Single-Machine Object Store with Sub-millisecond Reads and 15 GB/s Uploads (17 minute read)

Blobd is an open source, single-node object store written in Rust. It is built for maximum NVMe SSD performance, achieving sub-millisecond random reads and about 15 GB per second upload throughput, faster than MinIO and RocksDB. It focuses on low-latency content serving and purposely skips S3-style features like listing or distribution to keep performance high. Blobd uses io_uring, direct I/O, atomic writes, and async Rust. On startup, it rebuilds all state in memory from atomic tuple blocks, removing the need for disk indexes and avoiding write amplification.

🚀

Opinions & Advice

The Art of Lean Governance: The Cybernetics of Data Quality (5 minute read)

Modern data quality management demands a cybernetic approach: treating the data ecosystem as an adaptive, self-regulating system powered by real-time feedback, control, and learning loops. Key elements include dynamic reconciliation engines, embedded business glossaries for semantic consistency, and comprehensive data lineage to enable traceable causality and robust AI governance.

The Search API Reset: Incumbents Retreat, Innovators Step Up (3 minute read)

Microsoft is retiring the Bing Search API, and Google is limiting its search API to 10 results per query, signaling a strategic shift toward tightly controlled, AI-driven retrieval within their own ecosystems. This move restricts bulk access to web data, pushing enterprises and developers toward AI-mediated services and driving up the value of performant, flexible retrieval layers in RAG and agentic workflows. Meanwhile, newcomers like Perplexity and Parallel are raising the bar with superior relevance and transparency.

Eroding the Edges: (AI-Generated) Build vs. Buy and the Future of Software (5 minute read)

Coding tools are transforming the classic "build vs. buy" dilemma into "AI-generated vs. buy," enabling rapid creation of customized software and eroding vendor moats by fulfilling specific needs without full replication. Vendors face existential threats, as AI can build simple business tools or automations in minutes. Defensible moats now require proprietary data, network effects, regulatory barriers, or deep relationships, not just features.

💻

Launches & Tools

TOON (GitHub Repo)

TOON is a compact, human-readable alternative to JSON that reduces token usage and improves LLM accuracy, especially for uniform arrays of objects. It keeps the same data as JSON but uses a tabular, indentation-based format that LLMs parse more reliably, often using 30 to 60 percent fewer tokens.

ShadowTraffic's Postgres Connector (Tool)

ShadowTraffic's Postgres connector lets you stream generated data directly into Postgres, with optional automatic table creation and batching controls. You can choose whether ShadowTraffic manages tables (auto-create, drop and recreate, or manual), and you can control inserts, updates, and deletes using simple generator settings. Column types, batching frequency, and schema hints are fully customizable, making it easy to rapidly simulate or evolve data while maintaining control over Postgres behavior.

Perplexity's Open-Source Tool to Run Trillion-Parameter Models Without Costly Upgrades (4 minute read)

Perplexity AI's open-source TransferEngine enables efficient GPU-to-GPU communication across heterogeneous AWS and Nvidia hardware, eliminating the need for costly next-gen GPUs to run trillion-parameter models like DeepSeek V3 and Kimi K2. Achieving 400 Gbps throughput over both ConnectX-7 and AWS EFA, TransferEngine overcomes vendor lock-in by providing RDMA-based, portable, high-speed LLM inference and Mixture-of-Experts routing.

🎁

Miscellaneous

State of Containers and Serverless (8 minute read)

Datadog's survey of thousands of cloud-native environments shows five major trends: GPU adoption is climbing rapidly (now ~6% of orgs, with 3x the instance-hours of two years ago), AI workloads (≈7% of containerized workloads) are emerging amid databases and web services, most containers use under 50% memory and under 25% CPU (revealing wide over-provisioning), over 64% of Kubernetes clusters use Horizontal Pod Autoscaler (HPA) yet only 20% use custom application metrics, and Arm-based platforms have grown from ~9% to ~15-19% in two years.

LLM-As-Judge: 7 Best Practices & Evaluation Templates (9 minute read)

LLM-as-judge involves using an AI to assess another AI's responses on criteria like relevance or helpfulness, leveraging different prompting strategies for generation versus critical evaluation. To implement LLM-as-judge effectively, use few-shot examples, decompose complex evaluations into sequential steps and individual criteria, and apply structured grading rubrics with chain-of-thought.

⚡

Quick Links

Large Scale Distributed LLM Inference with Kubernetes (3 minute read)

Benchmarks show how tailored batching strategies on K8s eliminate GPU underutilization for multimodal LLM serving.

SQL Arena (Website)

Database leaderboard ranking the quality of planners using TPC-H queries using an open source collection tool built for this purpose.

Love TLDR? Tell your friends and get rewards!

Share your referral link below with friends to get free TLDR swag!

https://refer.tldr.tech/9a7c3e77/11

Track your referrals here.

Want to advertise in TLDR? 📰

If your company is interested in reaching an audience of data engineering professionals and decision makers, you may want to advertise with us.

Want to work at TLDR? 💼

Apply here or send a friend's resume to [email protected] and get $1k if we hire them!

If you have any comments or feedback, just respond to this email!

Thanks for reading,
Joel Van Veluwen, Tzu-Ruey Ching & Remi Turpaud

Manage your subscriptions to our other newsletters on tech, startups, and programming. Or if TLDR Data isn't for you, please unsubscribe.