From Raw Data to Storytelling Visuals

Created by Senanur Kök

Trends

(Show change over time or ordered sequences)

Purpose: Visualize how a numeric measure evolves along an ordered axis (time, steps, sequences).
When to use: time series, progression of metrics, cumulative metrics.

🧠 Tip: Aggregate your data (e.g., daily → weekly) and show confidence-interval bands to make trends clearer.

What story does your data tell?

Welcome to the basic module of data visualization with Seaborn!

This OrgPad document is structured as a hierarchical visual guide for data visualization using Seaborn (a Python library built on Matplotlib).

sns.lineplot()

basic line chart for trends

import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style('whitegrid')
sns.lineplot(x="year", y="passengers", data=df)
plt.show()

sns.relplot(kind='line')

figure-level function supporting faceting (col, row) and hue.

sns.relplot(data=fmri, x="timepoint", y="signal", col="region", hue="event", style="event", kind="line")

Relationships

(Explore relationships between variables)

Purpose: Understand how two or more variables relate to each other and whether patterns or correlations exist.
When to use: correlation analysis, feature engineering insights, regression exploration.

🧠 Practical tips: if the dataset is dense, use transparency (alpha) or downsample; for a categorical hue with many levels, consider grouping the levels.
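
Both options for dense data can be sketched side by side. This assumes a synthetic frame `dense` with hypothetical columns `x` and `y`; the alpha value and the sample size are arbitrary starting points to tune by eye.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Synthetic dense data (illustrative assumption).
rng = np.random.default_rng(1)
n = 20_000
dense = pd.DataFrame({"x": rng.normal(size=n)})
dense["y"] = 0.6 * dense["x"] + rng.normal(scale=0.8, size=n)

fig, (ax_alpha, ax_sample) = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)

# Option 1: keep every point but make the markers translucent.
sns.scatterplot(data=dense, x="x", y="y", alpha=0.1, s=10, linewidth=0, ax=ax_alpha)
ax_alpha.set_title("All points, alpha=0.1")

# Option 2: plot a reproducible random subset.
subset = dense.sample(n=2_000, random_state=0)
sns.scatterplot(data=subset, x="x", y="y", s=10, ax=ax_sample)
ax_sample.set_title("Random sample of 2,000 points")
```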

sns.relplot(kind='scatter')

high-level scatter function with faceting.

sns.relplot(x='total_bill', y='tip', hue='day', col='time', kind='scatter', data=df)

sns.jointplot()

scatter + marginal distributions.

sns.jointplot(data=df, x="bill_length_mm", y="bill_depth_mm", hue="species", kind="kde")

sns.heatmap()

visualize matrices (e.g., correlation matrix or confusion matrix).

sns.heatmap(attention_weights,
            xticklabels=tokens,
            yticklabels=tokens,
            cmap="Reds",
            annot=True,
            fmt=".2f",
            cbar_kws={'label': 'Attention Weight'})
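
For the correlation-matrix case mentioned above, a minimal sketch might look like this. The frame `features` and its columns are synthetic and hypothetical; the diverging palette centered at zero is one reasonable choice for signed correlations, not a requirement.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic numeric features (illustrative assumption).
rng = np.random.default_rng(2)
n = 200
features = pd.DataFrame({"height": rng.normal(170, 10, n)})
features["weight"] = 0.9 * features["height"] + rng.normal(0, 8, n)
features["age"] = rng.integers(18, 80, n)

# Pairwise Pearson correlations between all numeric columns.
corr = features.corr()

# Fix the color range to [-1, 1] so that 0 maps to the neutral midpoint.
ax = sns.heatmap(corr, vmin=-1, vmax=1, center=0, cmap="coolwarm",
                 annot=True, fmt=".2f", square=True,
                 cbar_kws={"label": "Pearson correlation"})
```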

Distributions

(Show distribution and spread of a single variable)

Purpose: Understand shape, skew, modality, outliers and spread of variables.
When to use: quality checks, distribution assumptions, outlier detection.

🧠 Practical tips: overlay KDE+hist for clarity; use log_scale=True if data has heavy tails; boxplot is great for comparing spreads across categories.

sns.violinplot()

combines boxplot with density (useful for multimodal distributions).

sns.violinplot(data=df, x="age", y="class")

sns.displot()

high-level distribution plot (supports kind='hist', 'kde', 'ecdf').

g = sns.displot(data=penguins, x="flipper_length_mm", y="bill_length_mm", kind="kde", rug=True)

sns.kdeplot()

smooth density estimate.

sns.kdeplot(data=df, x="total_bill", hue="time", multiple="fill")

grouped.plot(pandas)

Seaborn's categorical functions (barplot, countplot) have no stacked-bar option; use pandas or Matplotlib for stacked charts:

grouped = df.groupby(['month','category']).size().unstack(fill_value=0)
grouped.plot(kind='bar', stacked=True)

Compositions

(Show parts of a whole)

Purpose: Display how components make up a whole (ratios/percentages).
When to use: market share, product mix, budget breakdown.

🧠 Practical tips: annotate percentages directly on bars; consider a pie chart only for a few (<5) categories.
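
One way to annotate percentages on a stacked bar chart drawn with pandas, as a sketch under stated assumptions: the frame `orders` and its values are made up for illustration, and Axes.bar_label requires Matplotlib 3.4 or newer.

```python
import pandas as pd

# Tiny made-up dataset (illustrative assumption).
orders = pd.DataFrame({
    "month": ["Jan", "Jan", "Jan", "Jan", "Feb", "Feb", "Feb", "Feb"],
    "category": ["A", "A", "B", "C", "A", "B", "B", "B"],
})

# Count rows per month and category, then convert each row to percentages.
counts = orders.groupby(["month", "category"]).size().unstack(fill_value=0)
shares = counts.div(counts.sum(axis=1), axis=0) * 100

# Stacked bars: each bar sums to 100%.
ax = shares.plot(kind="bar", stacked=True)
ax.set_ylabel("Share of orders (%)")

# Write the percentage in the middle of every segment.
for container in ax.containers:
    ax.bar_label(container, fmt="%.0f%%", label_type="center")
```

Empty segments are labelled "0%" here; filtering them out is left to taste.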

Comparisons

(Compare groups or categories)

Purpose: Compare central tendency or counts across categorical groups.
When to use: A/B test results, category performance comparisons.

🧠 Practical tips: for skewed data, use the median (via estimator=np.median); rotate x-tick labels when there are many categories.
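
A minimal sketch of both tips, assuming synthetic skewed data (the frame `sessions`, the group names, and the column `duration` are hypothetical):

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic right-skewed durations for three groups (illustrative assumption).
rng = np.random.default_rng(4)
groups = ["control", "variant_a", "variant_b"]
sessions = pd.DataFrame({
    "group": np.repeat(groups, 200),
    "duration": rng.lognormal(mean=3, sigma=1, size=600),
})

# Bar height is the per-group median instead of the default mean.
ax = sns.barplot(data=sessions, x="group", y="duration", estimator=np.median)

# Rotate the category labels so long names do not overlap.
ax.tick_params(axis="x", labelrotation=45)
ax.set(ylabel="Median duration")
```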

sns.countplot()

simple counts for categories.

sns.countplot(titanic, x="class", hue="survived", stat="percent")

sns.ecdfplot()

empirical cumulative distribution.

sns.ecdfplot(data=df, x="bill_length_mm", hue="species")

sns.barplot()

shows estimate (by default mean) with CI.

sns.barplot(penguins, x="island", y="body_mass_g")

sns.catplot(kind='bar')

can be used to show grouped bars (not stacked) for composition-like views.

sns.catplot(
    data=df, x="class", y="survived", col="sex",
    kind="bar", height=4, aspect=.6,
)

Multivariate

(Many variables together)

Purpose: Explore multi-dimensional interactions simultaneously.
When to use: feature selection, exploratory data analysis for many variables.

🧠 Practical tips: limit the plot to 4–6 variables for readability; consider using corner=True to show only the lower triangle.

sns.pairplot()

pairwise relationships and marginal distributions.

sns.pairplot(penguins, hue="species", markers=["o", "s", "D"])

sns.scatterplot()

scatter plot for two numeric variables.

sns.scatterplot(
    x="total_bill",
    y="tip",
    hue="time",
    style="time",
    data=df,
)

sns.regplot()

scatter + regression line (simple use).

sns.regplot(x="weight", y="acceleration", data=df)

sns.lmplot()

combines regplot with faceting and hue.

sns.lmplot(
    data=df, x="bill_length_mm", y="bill_depth_mm",
    hue="species", col="sex", height=4,
)

sns.catplot(kind='bar' / 'box' / 'violin')

general categorical grid function.

sns.catplot(data=df, x="age", y="class", hue="sex", kind="boxen")

sns.boxplot()

compact summary (median, quartiles, whiskers, outliers).

sns.boxplot(data=titanic, x="class", y="age", hue="alive")

sns.histplot()

histogram with optional KDE overlay.

sns.histplot(data=penguins, x="flipper_length_mm", bins=30)

References

Image source: Seaborn Documentation (https://seaborn.pydata.org)
Kaggle Data Visualization Course (https://www.kaggle.com/learn/data-visualization)