Visualization principles

A quick intro to data visualization

Background story

For my master’s thesis, I decided to take on a completely new topic: I had to learn more about JavaScript and had yet to discover nearly everything about data visualization. I wrote the thesis at the Research Group Software Construction at the Faculty of Mathematics, Computer Science, and Natural Sciences of RWTH Aachen, in the context of the ARAMIS project. It investigated how the visualization of reconstructed software architecture (SA) data can be improved, which is necessary for understanding a running system at different levels of abstraction. The thesis was exploratory: the goal was to survey existing visualizations in SA as well as in other domains, and I also built several software architecture visualization prototypes. What follows are my findings and lessons learned, limited to visualization design in general.

You sense far more than you are conscious of. Whether you want to or not.

Tor Norretranders

What is it all about?

“Bandwidth of our senses” is a diagram by the Danish physicist Tor Norretranders that translates human senses into computer terms. It compares the amount of information each of our senses takes in per second, and thereby shows the power of vision compared to our other senses. It demonstrates why visualizations are so effective at conveying huge amounts of information in a split second. Oh, and by the way, the white spot in the lower right corner is the share we become conscious of. As a fun fact, the bandwidth of our sense of taste is on par with that of a calculator.

The bandwidth of our senses by Tor Norretranders.

To design effective visual graphics, it is important to understand how visual perception works and how the human brain processes what it perceives. Memory is the function of the brain in charge of storing, processing, and retrieving information: the brain perceives visual representations, processes them, and stores them in memory. According to cognitive psychology, there are three types of memory: sensory, short-term, and long-term.

Right after we sense information, it is automatically stored in sensory memory. It is more of an impression of the sensed data, an image in the brain. This happens on its own and does not require our attention [3]. We sense information well before we start to process it consciously. This stage is called preattentive processing.

After preattentive processing, part of the data is transferred to short-term memory for further processing. As the name suggests, short-term memory is limited in capacity, and information remains there only briefly. From short-term memory, information can be transferred to long-term memory, where it can stay for a lifetime through periodic rehearsal or meaningful associations.

The characteristics of these memory types are crucial for designing information visualization graphics. Preattentive processing is particularly important because preattentive attributes are perceived unconsciously. It is also important to remember that short-term memory is limited. Thus, it is not advisable to split a visualization across separate screens, for example, because the user would have to remember one part while looking at another.

Preattentive attributes

Visual characteristics of shapes that are perceived during preattentive processing are called preattentive attributes. Ware C. [4] sorts these attributes into four groups: color, form, spatial position, and movement. The position of a data point in a scatter plot encodes its value and is a good example of a spatial position attribute. The image below illustrates the most commonly used of these attributes.

Comparison of common preattentive attributes taken from [4]

It is also worth mentioning that each attribute suits some kinds of data better than others. In statistics, data is commonly divided into quantitative and categorical types. Quantitative data carries numeric values; categorical data takes one of a limited number of categories. Color hue, for example, is not suitable for quantitative data, because hues have no natural order. Brightness should be used in this case.
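
The difference between the two encodings can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names, the default hue, and the lightness range are made up for this example and do not come from Ware [4]. Quantitative values share one hue and vary in lightness, while each category gets its own hue.

```python
import colorsys


def lightness_scale(values, hue=0.6, light_min=0.25, light_max=0.85):
    """Map numeric values to one hue with varying lightness (larger = darker).

    The hue and lightness bounds are arbitrary choices for this sketch.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    colors = []
    for v in values:
        t = (v - lo) / span  # normalize the value to the range 0..1
        lightness = light_max - t * (light_max - light_min)
        colors.append(colorsys.hls_to_rgb(hue, lightness, 0.6))
    return colors


def hue_scale(categories):
    """Give each distinct category its own hue at a fixed lightness."""
    distinct = sorted(set(categories))
    step = 1.0 / len(distinct)
    return {c: colorsys.hls_to_rgb(i * step, 0.5, 0.6)
            for i, c in enumerate(distinct)}
```

Calling lightness_scale([10, 20, 30]) returns three RGB tuples that get progressively darker, so the order of the values stays visible. hue_scale(["web", "core", "web"]) returns one color per category with no implied order.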

Gestalt laws/principles

Gestalt principles are rules that describe how we perceive certain configurations of shapes. They were formulated as a result of studies carried out by Wertheimer M. [6], Kohler W. [7], and Koffka K. [8]. The basic principle is that humans tend to perceive similar visual entities as a whole.

The perception of a group also happens when entities are located close to each other, enclosed by a boundary or connected. Moreover, the closure principle describes our perception of incomplete shapes as being complete.

Closure principle in the World Wide Fund for Nature (WWF) logo.

A good example of this is the well-known logo of the World Wide Fund for Nature. Although the panda in the image is incomplete, it is just enough for us to recognize a panda. This is why two lines for the x- and y-axes are enough in a line plot for us to perceive the chart area.

Continuity principle

The principle of continuity, on the other hand, describes our tendency to perceive a sequence of shapes as a whole. In the image on the left, even though the color distinction suggests grouping by color, the continuity principle prevails: we tend to perceive the circles as two crossing lines.

Gestalt laws are a valuable aid in visualization design because they characterize how we perceive groups of shapes.

Visualization principles

Now that we have some insight into the laws of perception and how our visual sense works, we can look at some basic guidelines that build on this knowledge and aim to help the design process.

Data-ink principle

This principle concerns the share of ink that is spent on data [9]. Data-ink is the ink used to draw the data itself. Tufte proposes that the ratio of data-ink to the total ink used should be maximized. Every element of a visualization (colors, shapes, etc.) draws the user’s attention. Therefore, it is important to keep the visualization as simple as possible and remove elements that do not represent any data. In the first pie chart below, the third dimension does not represent any property of the data. It distracts and can convey the wrong information to the user. Therefore, it should be removed, as shown in the second chart.

Data-ink principle: violated (first chart) and followed (second chart)
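
To make the ratio concrete, here is a minimal Python sketch. The list of chart elements and their ink amounts are invented for illustration (arbitrary units, not measured from the charts above); only the ratio itself, data-ink divided by total ink, follows Tufte [9].

```python
def data_ink_ratio(elements):
    """Return data-ink / total ink for a chart.

    elements: list of (name, ink_amount, encodes_data) tuples.
    """
    total = sum(ink for _, ink, _ in elements)
    if total == 0:
        raise ValueError("chart has no ink")
    data = sum(ink for _, ink, encodes_data in elements if encodes_data)
    return data / total


# Hypothetical ink amounts for a 3D pie chart (arbitrary units).
pie_3d = [
    ("slices", 70, True),      # the slices encode the data
    ("3d_depth", 25, False),   # the third dimension encodes nothing
    ("shadow", 5, False),      # decoration only
]

# The flat version keeps only the elements that encode data.
pie_flat = [element for element in pie_3d if element[2]]
```

With these made-up numbers, data_ink_ratio(pie_3d) is 0.7 and data_ink_ratio(pie_flat) is 1.0. Removing the elements that carry no data is exactly what moves the ratio toward its maximum.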

Maintain consistency

This principle derives from the Gestalt principle of continuity. If a visual change, such as a change in color or size, does not convey any information, it confuses the user, who looks for a meaning behind the difference. If a change is the result of a data update, it should be displayed as one. Animation in interactive visualizations is a good way to achieve a clear transition. According to an experiment on the effectiveness of animated transitions between statistical graphics (bar charts, scatter plots, etc.) [10], animations often provide a clear perception of consistency. The authors also discuss principles for animation and introduce a taxonomy of transitions. Although they focus on statistical graphics, most of their findings can also be applied to other kinds of visualization.
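
At its simplest, an animated transition after a data update computes intermediate states between the old and the new values. The sketch below uses plain linear interpolation, which is my own simplification and not the method studied in [10]. The function name and the steps parameter are assumptions made for this example.

```python
def transition_frames(old, new, steps=5):
    """Return steps + 1 frames that move each value linearly from old to new."""
    if len(old) != len(new):
        raise ValueError("old and new must have the same number of values")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    frames = []
    for i in range(steps + 1):
        t = i / steps  # progress of the transition, from 0.0 to 1.0
        frames.append([a + (b - a) * t for a, b in zip(old, new)])
    return frames
```

For two bars whose heights change from [0, 10] to [10, 0], transition_frames([0, 10], [10, 0], steps=2) yields the start state, a midpoint where both bars have height 5, and the end state. Drawing these frames in sequence lets the user follow each bar instead of seeing an abrupt jump.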

Make the visualization aesthetically pleasing

We enjoy beauty and beautiful design. Different colors affect people differently, and by choosing the right colors, shapes, etc., we can achieve visually pleasing results. Moreover, colorblind people cannot distinguish certain colors. Brewer C. provides guidelines for choosing the right colors for different types of data (ordinal, categorical, etc.) [11]. It is the designer’s responsibility to consider all of these attributes when creating a visualization. In the first bar chart below, the chosen color combination is unpleasant to look at. The chart also violates the data-ink principle: it uses too much redundant color (or ink). The second chart visualizes the same data in a more appealing way. Here, I removed the unnecessary background and applied more pleasing colors. This color palette is also colorblind safe. I generated the colors with ColorBrewer, a web-based tool that is also part of Brewer’s research.

“Make the visualization aesthetically pleasing”: violated (first chart) and followed (second chart)

Context is important

Isolated from its context, data may not tell the whole truth and may convey the wrong message. An article by the visualization journalist McCandless D. [12] gives a good example. It visualizes the army budgets of different countries. First, he shows the budgets without any context, and the USA stands in first place. Then the author argues that because the USA’s total budget is also large, its army budget is not that large relative to the total. In the second version, the army budget is given as a percentage of the total budget, and the USA is no longer in first place. This version better supports the message. In this scenario, the context is the total budget.
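
The effect of adding context is easy to reproduce with a few numbers. The countries and budgets below are placeholders invented for this sketch and are not taken from [12]. The point is only that the ranking changes once each army budget is divided by the total budget.

```python
# Invented example figures (arbitrary units); NOT real budget data.
BUDGETS = {
    "Country A": {"army": 600, "total": 6000},  # 10% of the total budget
    "Country B": {"army": 100, "total": 500},   # 20% of the total budget
    "Country C": {"army": 50, "total": 1000},   # 5% of the total budget
}


def rank_by_absolute(budgets):
    """Rank countries by army budget alone (no context)."""
    return sorted(budgets, key=lambda c: budgets[c]["army"], reverse=True)


def rank_by_share(budgets):
    """Rank countries by army budget as a share of the total budget."""
    return sorted(budgets,
                  key=lambda c: budgets[c]["army"] / budgets[c]["total"],
                  reverse=True)
```

Without context, Country A leads the ranking. With the total budget as context, Country B moves to first place, even though its army budget is six times smaller in absolute terms.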

Context may also refer to the neighborhood of the visual element of interest (e.g., nearby data points). In this case, context should still be provided when the view is zoomed in and not all data points are visible. This matters because short-term memory is limited. When the user analyzes a specific data point, the surrounding elements should be shown as well. This helps the user see where the data point stands in the big picture without going back and forth.
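
One simple way to keep the neighborhood visible is to always return the focused element together with a few elements on either side. The following sketch assumes the data points are stored in an ordered list. The function name and the radius parameter are hypothetical.

```python
def focus_with_context(points, focus_index, radius=2):
    """Return the focused point plus up to `radius` neighbors on each side."""
    if not 0 <= focus_index < len(points):
        raise IndexError("focus_index is outside the data")
    start = max(0, focus_index - radius)               # clamp at the left edge
    end = min(len(points), focus_index + radius + 1)   # clamp at the right edge
    return points[start:end]
```

For ten points numbered 0 to 9 and a focus on point 5, the view contains points 3 to 7. At the edges the window simply shrinks, so a focus on point 0 shows points 0 to 2.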

Interactive exploration

Interactive visualizations should first give the user an overview, then allow zooming and filtering, and provide details on demand. Shneiderman called this principle the visual information seeking mantra [13].
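
The three parts of this principle can be read as three operations on the same data set. The sketch below is one possible reading and not an implementation from [13]. The component records and all function names are invented for illustration.

```python
from collections import Counter

# Hypothetical records describing software components (invented data).
COMPONENTS = [
    {"name": "parser", "layer": "core", "calls": 120},
    {"name": "scheduler", "layer": "core", "calls": 45},
    {"name": "rest_api", "layer": "web", "calls": 300},
    {"name": "dashboard", "layer": "web", "calls": 15},
]


def overview(records):
    """Overview: summarize everything, here as a component count per layer."""
    return dict(Counter(r["layer"] for r in records))


def zoom_and_filter(records, layer=None, min_calls=0):
    """Zoom and filter: keep only the records the user is interested in."""
    return [r for r in records
            if (layer is None or r["layer"] == layer)
            and r["calls"] >= min_calls]


def details(records, name):
    """Details on demand: return the full record for one selected item."""
    for r in records:
        if r["name"] == name:
            return r
    return None
```

A user would start with overview(COMPONENTS), narrow the view with zoom_and_filter(COMPONENTS, layer="web", min_calls=100), and finally call details(COMPONENTS, "rest_api") for the one element of interest.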

Consider your audience

A visualization that professionals will use daily has different requirements than one in a blog post. The latter is most likely to be looked at just once and should be simpler and easier to learn. We therefore need to consider the target audience before the actual design process starts.

Further reading material

  1. Riccardo Mazza. Introduction to Information Visualization. Springer Publishing Company, Incorporated, 1st edition, 2009.
  2. Stephen Few. Information Dashboard Design: The Effective Visual Communication of Data. O’Reilly Media, Inc., 2006.
  3. Anne Treisman. Preattentive processing in vision. Computer Vision, Graphics, and Image Processing, 31(2):156–177, 1985.
  4. Colin Ware. Information Visualization: Perception for Design. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2004.
  5. Christian Behrens. The Form of Facts and Figures. Ph.D. thesis, University of Applied Sciences Potsdam, February 2008.
  6. Max Wertheimer. Untersuchungen zur Lehre von der Gestalt. II. Psychologische Forschung, 4(1):301–350, 1923.
  7. Wolfgang Kohler. Gestalt Psychology. H. Liveright, New York, 1929.
  8. Kurt Koffka. Principles of Gestalt Psychology. Harcourt, New York, 1935.
  9. Edward R. Tufte. The Visual Display of Quantitative Information. Graphics Press, Cheshire, CT, USA, 1986.
  10. Jeffrey Heer and George Robertson. Animated transitions in statistical data graphics. IEEE Trans. Visualization & Comp. Graphics (Proc. InfoVis), 13:1240–1247, 2007.
  11. Cynthia A. Brewer. Guidelines for use of the perceptual dimensions of color for mapping and visualization. Proc. SPIE, 2171:54–63, 1994.
  12. David McCandless. Information is Beautiful: War games. Who really spends the most on their armed forces?
  13. B. Shneiderman. The eyes have it: a task by data type taxonomy for information visualizations. Proceedings 1996 IEEE Symposium on Visual Languages, pages 336–343, 1996.
