SQL

Correlación física de las columnas: candidatos a BRIN y CLUSTER

Lectura de pg_stats.correlation para las columnas indexadas: cerca de 1, basta con un BRIN diminuto; cerca de 0 en una columna de ordenación frecuente, un CLUSTER reordena la tabla y divide las I/O.

Requisitos

PostgreSQL (pg_stats, ANALYZE récent)

SQL
SELECT
    s.tablename,
    s.attname AS colonne,
    ROUND(s.correlation::numeric, 3) AS correlation,
    c.reltuples::bigint AS lignes,
    pg_size_pretty(pg_relation_size(c.oid)) AS taille_table,
    CASE
        WHEN abs(s.correlation) > 0.95 THEN 'BRIN candidat idéal'
        WHEN abs(s.correlation) < 0.10 THEN 'CLUSTER utile si tris fréquents'
        ELSE '-'
    END AS recommandation
FROM pg_stats s
JOIN pg_class c ON c.relname = s.tablename
JOIN pg_namespace n ON n.oid = c.relnamespace AND n.nspname = s.schemaname
WHERE s.schemaname = 'public'
  AND c.reltuples > 1000000
  AND s.attname IN ('created_at', 'recorded_at', 'order_date', 'customer_id')
ORDER BY abs(s.correlation) DESC;

Resultado

  tablename  |   colonne   | correlation |  lignes  | taille_table |         recommandation
-------------+-------------+-------------+----------+--------------+---------------------------------
 events_raw  | recorded_at |       0.998 | 48211570 | 11 GB        | BRIN candidat idéal
 order_lines | order_date  |       0.971 | 14201930 | 2.1 GB       | BRIN candidat idéal
 orders      | created_at  |       0.964 |  5522010 | 1.2 GB       | BRIN candidat idéal
 order_lines | customer_id |       0.042 | 14201930 | 2.1 GB       | CLUSTER utile si tris fréquents
(4 rows)

3 colonnes quasi parfaitement corrélées : BRIN = ~1000x plus petit qu'un B-tree
SQLBRINCLUSTERStatistiquesPerformance

Snippets relacionados

Volver al Data Lab