Indexes and denormalization: keys to scaling sites with massive content

Submitted by David Strauss on Wed, 07/30/2008 - 22:35.

Session recording

Watch or download video

Attached files

Indexes and DNA.pdf (843.63 KB)

Placement

Session time:

08/27/2008 - 16:00 - 08/27/2008 - 16:45

Overview

Relational databases store, index, and retrieve data using using predictable patterns. Indexing data well -- but not excessively -- requires understanding indexing overhead and usage by query execution planners.

But, even the perfect indexing plan layered on top of existing tables cannot satisfy all needs: relational databases have significant limitations for indexing data. Particularly, they cannot index data across multiple tables. While Oracle and Microsoft SQL Server provide some in-built tools (materialized views and indexed views) for alleviating this limitation, users of open-source databases like MySQL and PostgreSQL must consolidate and preprocess data to scale popular services.

The Denormalization API streamlines the process of consolidating node data into tables structured for radical scalability.

Agenda

* How is Drupal data stored?
* Overview of indexing tables
* Index data structures
* Query execution plans and indexes
* What is denormalization, and why is it necessary?
* Typical hurdles for implementing denormalization
* Using the Denormalization API

Goals

Attendees should leave with an understanding of the benefits and caveats of indexing and denormalizing data. And, if they choose to denormalize, how the Denormalization API can streamline the work.

Resources

Attendees should be familiar with Drupal's node system, SQL, and basic database administration.

Login to post comments
25 points

Welcome!

Attendees

Weather in Szeged

Latest on the whiteboard

Indexes and denormalization: keys to scaling sites with massive content

Session recording

Attached files