From ef4b86e552702b7809316cbbecf141782629b939 Mon Sep 17 00:00:00 2001
From: "Weng, Chia-Ling" <75072960+ChiaLingWeng@users.noreply.github.com>
Date: Sun, 8 Oct 2023 01:01:39 +0800
Subject: [PATCH] [Doc] Add integers to four digit year format example (#3218)

* [Doc] Add integers to four digit year format example

* [Doc] Add integers to four digit year format example

* Horizontally concatenate color charts for consistency

* Update language and charts for better flow

---------

Co-authored-by: Joel Ostblom
---
 doc/user_guide/encodings/index.rst | 46 +++++++++++++++++++-----------
 doc/user_guide/times_and_dates.rst |  7 +++--
 2 files changed, 35 insertions(+), 18 deletions(-)

diff --git a/doc/user_guide/encodings/index.rst b/doc/user_guide/encodings/index.rst
index 4de8616fc..87fa7c3e7 100644
--- a/doc/user_guide/encodings/index.rst
+++ b/doc/user_guide/encodings/index.rst
@@ -170,7 +170,7 @@ Effect of Data Type on Color Scales
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 As an example of this, here we will represent the same data three different ways,
 with the color encoded as a *quantitative*, *ordinal*, and *nominal* type,
-using three vertically-concatenated charts (see :ref:`vconcat-chart`):
+using three horizontally-concatenated charts (see :ref:`hconcat-chart`):
 
 .. altair-plot::
 
@@ -178,11 +178,11 @@ using three vertically-concatenated charts (see :ref:`vconcat-chart`):
         x='Horsepower:Q',
         y='Miles_per_Gallon:Q',
     ).properties(
-        width=150,
-        height=150
+        width=140,
+        height=140
     )
 
-    alt.vconcat(
+    alt.hconcat(
         base.encode(color='Cylinders:Q').properties(title='quantitative'),
         base.encode(color='Cylinders:O').properties(title='ordinal'),
         base.encode(color='Cylinders:N').properties(title='nominal'),
@@ -198,35 +198,49 @@ Effect of Data Type on Axis Scales
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Similarly, for x and y axis encodings, the type used for the data will affect
 the scales used and the characteristics of the mark. For example, here is the
-difference between a ``quantitative`` and ``ordinal`` scale for an column
+difference between an ``ordinal``, ``quantitative``, and ``temporal`` scale for a column
 that contains integers specifying a year:
 
 .. altair-plot::
 
-    pop = data.population.url
+    pop = data.population()
 
     base = alt.Chart(pop).mark_bar().encode(
-        alt.Y('mean(people):Q').title('total population')
+        alt.Y('mean(people):Q').title('Total population')
     ).properties(
-        width=200,
-        height=200
+        width=140,
+        height=140
     )
 
     alt.hconcat(
-        base.encode(x='year:Q').properties(title='year=quantitative'),
-        base.encode(x='year:O').properties(title='year=ordinal')
+        base.encode(x='year:O').properties(title='ordinal'),
+        base.encode(x='year:Q').properties(title='quantitative'),
+        base.encode(x='year:T').properties(title='temporal')
     )
 
-Because quantitative values do not have an inherent width, the bars do not
+Because values on quantitative and temporal scales do not have an inherent width, the bars do not
 fill the entire space between the values.
-This view also makes clear the missing year of data that was not immediately
-apparent when we treated the years as categories.
+These scales clearly show the missing year of data that was not immediately
+apparent when we treated the years as ordinal data,
+but the axis formatting is undesirable in both cases.
+
+To plot four digit integers as years with proper axis formatting,
+i.e. without thousands separator,
+we recommend converting the integers to strings first,
+and then specifying a temporal data type in Altair.
+While it is also possible to change the axis format with ``.axis(format='i')``,
+it is preferred to specify the appropriate data type to Altair.
+
+.. altair-plot::
+
+    pop['year'] = pop['year'].astype(str)
+
+    base.mark_bar().encode(x='year:T').properties(title='temporal')
 
 This kind of behavior is sometimes surprising to new users, but it emphasizes
 the importance of thinking carefully about your data types when visualizing
 data: a visual encoding that is suitable for categorical data may not be
-suitable for quantitative data, and vice versa.
-
+suitable for quantitative data or temporal data, and vice versa.
 
 .. _shorthand-description:
 
diff --git a/doc/user_guide/times_and_dates.rst b/doc/user_guide/times_and_dates.rst
index 8f62b061c..066e4d032 100644
--- a/doc/user_guide/times_and_dates.rst
+++ b/doc/user_guide/times_and_dates.rst
@@ -50,9 +50,12 @@ example, we'll limit ourselves to the first two weeks of data:
         y='temp:Q'
     )
 
-(notice that for date/time values we use the ``T`` to indicate a temporal
+Notice that for date/time values we use the ``T`` to indicate a temporal
 encoding: while this is optional for pandas datetime input, it is good practice
-to specify a type explicitly; see :ref:`encoding-data-types` for more discussion).
+to specify a type explicitly; see :ref:`encoding-data-types` for more discussion.
+If you want Altair to plot four digit integers as years,
+you need to cast them as strings before changing the data type to temporal;
+please see the :ref:`type-axis-scale` for details.
 
 For date-time inputs like these, it can sometimes be useful to extract
 particular time units (e.g. hours of the day, dates of the month, etc.).