Skip to content

Data Model

This page documents the raw data model across all four source domains: Ads, News, Podcasts, and Streaming.

Entity Relationship Diagram

Want to zoom in? Visit the full ERD here.

erDiagram
    ads.campaigns {
        string campaign_id PK
        string advertiser_id
        string campaign_name
        string campaign_type
        date   start_date
        date   end_date
        int    budget_cents
    }

    ads.impressions {
        string impression_id PK
        string campaign_id   FK
        string content_id    FK
        date   impression_date
        int    impressions_count
        int    clicks
    }

    ads.spend {
        string spend_id    PK
        string campaign_id FK
        date   spend_date
        int    spend_cents
        int    platform_fee_cents
    }

    news.authors {
        string author_id PK
        string name
        string email
        date   joined_at
    }

    news.articles {
        string article_id  PK
        string author_id   FK
        string title
        string category
        date   published_at
        date   updated_at
        string status
        int    word_count
    }

    news.page_views {
        string view_id    PK
        string article_id FK
        string user_id
        date   viewed_at
        string referrer_source
    }

    podcasts.shows {
        string show_id    PK
        string show_name
        string host_name
        string category
        date   launched_at
    }

    podcasts.episodes {
        string episode_id      PK
        string show_id         FK
        string title
        date   published_at
        int    duration_seconds
        int    season
        int    episode_number
    }

    podcasts.listens {
        string listen_id              PK
        string episode_id             FK
        string user_id
        date   listened_at
        int    listen_duration_seconds
        string platform
    }

    streaming.content_catalog {
        string content_id      PK
        string title
        string genre
        string content_type
        date   release_date
        int    runtime_minutes
    }

    streaming.subscriptions {
        string subscription_id  PK
        string user_id
        string plan_type
        string status
        date   started_at
        date   ended_at
        int    monthly_fee_cents
    }

    streaming.watch_events {
        string event_id              PK
        string user_id
        string content_id            FK
        date   watched_at
        int    watch_duration_seconds
        string device_type
    }

    streaming.content_catalog ||--o{ streaming.watch_events : "watched as"
    ads.campaigns        ||--o{ ads.impressions           : "generates"
    ads.campaigns        ||--o{ ads.spend                 : "incurs"
    streaming.content_catalog ||--o{ ads.impressions      : "appears in"
    news.authors         ||--o{ news.articles             : "writes"
    news.articles        ||--o{ news.page_views           : "receives"
    podcasts.shows       ||--o{ podcasts.episodes         : "publishes"
    podcasts.episodes    ||--o{ podcasts.listens          : "recorded as"

Domains

Ads

Table Description
ads.campaigns Advertiser campaigns with budget and date range
ads.impressions Impressions and clicks per campaign and content item
ads.spend Daily spend and platform fees per campaign

News

Table Description
news.authors Author profiles
news.articles Articles with author, category, and status
news.page_views Page view events per article and user

Podcasts

Table Description
podcasts.shows Podcast show metadata
podcasts.episodes Episodes with season, number, and duration
podcasts.listens Listen events per episode and user

Streaming

Table Description
streaming.content_catalog Video content with genre and runtime
streaming.subscriptions User subscription plans and status
streaming.watch_events Watch events per content item and user

Cross-domain Relationships

ads.impressions links to streaming.content_catalog via content_id, meaning ad impressions are served against streaming content items.

Users

user_id appears in news.page_views, podcasts.listens, streaming.subscriptions, and streaming.watch_events but there is no users table in the raw data. A unified users table may exist upstream.