[Doc] Add integers to four digit year format example (#3218)
* [Doc] Add integers to four digit year format example

* [Doc] Add integers to four digit year format example

* Horizontally concatenate color charts for consistency

* Update language and charts for better flow

---------

Co-authored-by: Joel Ostblom <joel.ostblom@gmail.com>
ChiaLingWeng and joelostblom authored Oct 7, 2023
1 parent cf7bdbd commit ef4b86e
Showing 2 changed files with 35 additions and 18 deletions.
46 changes: 30 additions & 16 deletions doc/user_guide/encodings/index.rst
@@ -170,19 +170,19 @@ Effect of Data Type on Color Scales
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
As an example of this, here we will represent the same data three different ways,
with the color encoded as a *quantitative*, *ordinal*, and *nominal* type,
using three vertically-concatenated charts (see :ref:`vconcat-chart`):
using three horizontally-concatenated charts (see :ref:`hconcat-chart`):

.. altair-plot::

base = alt.Chart(cars).mark_point().encode(
x='Horsepower:Q',
y='Miles_per_Gallon:Q',
).properties(
width=150,
height=150
width=140,
height=140
)

alt.vconcat(
alt.hconcat(
base.encode(color='Cylinders:Q').properties(title='quantitative'),
base.encode(color='Cylinders:O').properties(title='ordinal'),
base.encode(color='Cylinders:N').properties(title='nominal'),
@@ -198,35 +198,49 @@ Effect of Data Type on Axis Scales
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Similarly, for x and y axis encodings, the type used for the data will affect
the scales used and the characteristics of the mark. For example, here is the
difference between a ``quantitative`` and ``ordinal`` scale for an column
difference between an ``ordinal``, ``quantitative``, and ``temporal`` scale for a column
that contains integers specifying a year:

.. altair-plot::

pop = data.population.url
pop = data.population()

base = alt.Chart(pop).mark_bar().encode(
alt.Y('mean(people):Q').title('total population')
alt.Y('mean(people):Q').title('Total population')
).properties(
width=200,
height=200
width=140,
height=140
)

alt.hconcat(
base.encode(x='year:Q').properties(title='year=quantitative'),
base.encode(x='year:O').properties(title='year=ordinal')
base.encode(x='year:O').properties(title='ordinal'),
base.encode(x='year:Q').properties(title='quantitative'),
base.encode(x='year:T').properties(title='temporal')
)

Because quantitative values do not have an inherent width, the bars do not
Because values on quantitative and temporal scales do not have an inherent width, the bars do not
fill the entire space between the values.
This view also makes clear the missing year of data that was not immediately
apparent when we treated the years as categories.
These scales clearly show the missing year of data that was not immediately
apparent when we treated the years as ordinal data,
but the axis formatting is undesirable in both cases.
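
The spacing argument can be sketched in plain Python, independent of Altair.
The year values below are hypothetical, and the two position rules are
simplified stand-ins for ordinal and quantitative scales, not Altair's actual
scale implementation.

```python
# Hypothetical decade values with one decade (1890) left out on purpose.
years = [1870, 1880, 1900, 1910]


def steps(positions):
    """Return the distance between each pair of neighbouring positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Ordinal stand-in: each distinct value takes the next slot, so spacing is uniform.
ordinal_positions = list(range(len(years)))

# Quantitative stand-in: each value is placed according to its magnitude.
quantitative_positions = years

ordinal_steps = steps(ordinal_positions)
quantitative_steps = steps(quantitative_positions)

print(ordinal_steps)       # [1, 1, 1]
print(quantitative_steps)  # [10, 20, 10]
```

The uniform ordinal steps hide the missing decade, while the larger
quantitative step between 1880 and 1900 exposes it.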

To plot four digit integers as years with proper axis formatting,
i.e. without thousands separator,
we recommend converting the integers to strings first,
and then specifying a temporal data type in Altair.
While it is also possible to change the axis format with ``.axis(format='i')``,
it is preferred to specify the appropriate data type to Altair.

.. altair-plot::

pop['year'] = pop['year'].astype(str)

base.mark_bar().encode(x='year:T').properties(title='temporal')
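
Why the string conversion matters can be illustrated with the standard library
alone. This is a minimal sketch under the assumption that a bare integer in a
temporal field is read as a millisecond Unix timestamp (as JavaScript's
``Date`` does for numbers), while a four-digit string is parsed as a calendar
year. It mimics that behavior and does not call Altair.

```python
from datetime import datetime, timedelta, timezone

# Start of the Unix epoch, used to mimic timestamp interpretation.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

year_as_int = 1850

# Assumption: an integer is treated as milliseconds since the Unix epoch.
as_timestamp = EPOCH + timedelta(milliseconds=year_as_int)

# A four-digit string is parsed as a calendar year instead.
as_year = datetime.strptime(str(year_as_int), "%Y")

print(as_timestamp.isoformat())    # 1970-01-01T00:00:01.850000+00:00
print(as_year.date().isoformat())  # 1850-01-01
```

Read as a timestamp, the integer lands less than two seconds after the start
of 1970, whereas the string form yields the intended year.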

This kind of behavior is sometimes surprising to new users, but it emphasizes
the importance of thinking carefully about your data types when visualizing
data: a visual encoding that is suitable for categorical data may not be
suitable for quantitative data, and vice versa.

suitable for quantitative data or temporal data, and vice versa.

.. _shorthand-description:

7 changes: 5 additions & 2 deletions doc/user_guide/times_and_dates.rst
@@ -50,9 +50,12 @@ example, we'll limit ourselves to the first two weeks of data:
y='temp:Q'
)

(notice that for date/time values we use the ``T`` to indicate a temporal
Notice that for date/time values we use the ``T`` to indicate a temporal
encoding: while this is optional for pandas datetime input, it is good practice
to specify a type explicitly; see :ref:`encoding-data-types` for more discussion).
to specify a type explicitly; see :ref:`encoding-data-types` for more discussion.
If you want Altair to plot four digit integers as years,
you need to cast them as strings before changing the data type to temporal;
see :ref:`type-axis-scale` for details.

For date-time inputs like these, it can sometimes be useful to extract particular
time units (e.g. hours of the day, dates of the month, etc.).