Vega: A New Grammar-Based Specification for Visualizations

I’m a big fan of using languages for visualization rather than canned chart types. I’ve been working with the Grammar of Graphics approach for a number of years within SPSS and now IBM, and my book “Visualizing Time” is composed 95% of Grammar-based visualizations. It’s pretty safe to say it’s my preferred approach.

Protovis (the forerunner of D3, to a great extent) was built on the Grammar approach; Bostock and Heer’s 2009 article (on Heer’s site at http://hci.stanford.edu/jheer/files/2009-Protovis-InfoVis.pdf) gives a very good statement of the benefits of the Grammar-based approach as opposed to the “Chart Type” approach:

The main drawback of [the chart type] approach is that it requires a small, closed system. If the desired chart type is not supported, or the desired visual parameter is not exposed in the interface, no recourse is available to the user and either the visualization design must be compromised or another tool adopted. Given the high cost of switching tools, and the iterative nature of visualization design, frequent compromise is likely.

Within the Grammar-based systems (ggplot by Hadley Wickham is another popular one) most of the teams went with a programming model: you write programs in some computer language using the grammar components as building blocks. The SPSS approach of defining a specification language as a way to access the underlying programming language was unique. In fact, SPSS designed two languages: a “user friendly” one called GPL and a more detail-oriented one called ViZml. You can get an overview of these grammar-based specifications in SPSS’s documentation; here’s a direct link to the ViZml one:

http://publib.boulder.ibm.com/infocenter/spssstat/v20r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.help%2Fvizml_basics.htm

So I was quite excited to take a look at Vega, the new layer on top of D3. It’s exactly what we had done in SPSS (and now IBM): “a higher-level visualization specification language”. The goals of the two systems are a little different. The IBM system is designed to keep attention on the high level without requiring low-level programming, whereas D3 and Vega focus on letting programmers work with low-level visualization entities better and faster. They still serve roughly the same tasks, though, so it’s informative to look at how the various specification languages describe things. I’ll take the Vega “try this first” tutorial bar chart as an example.

In GPL (2000) a bar chart was specified like this:

SOURCE: s = userSource(id("Employeedata"))
DATA: jobcat=col(source(s), name("jobcat"), unit.category())
DATA: salary=col(source(s), name("salary"))
SCALE: linear(dim(2), include(0))
GUIDE: axis(dim(2))
GUIDE: axis(dim(1))
ELEMENT: interval(position(jobcat*salary))

Pretty compact, but it did make a lot of decisions for you, such as default colors, bar appearance and so on. I’m going to skip the ViZml (2004) version (if you are interested, you can see examples in the ViZml docs linked above) and go directly to a comparison with the specification language we are currently working on at IBM and using as a core visualization capability. The engine is called RAVE, and the language VizJSON.

I took the Vega example and recreated it as a VizJSON spec. Here are the two charts generated by the systems:

Sample Bar Charts (Vega first, VizJSON second)

There are a few minor differences, most notably different default fonts and different default ticks on the vertical axis, but the charts are pretty much the same. Unfortunately, for legal reasons (that I do not 100% understand), I cannot show the exact VizJSON specification, so I have modified it a little: the specification below, while not VizJSON, has the same structure and is almost identical in length. Mostly names have been changed.

Vega/VizJSON Comparison (pdf)

The Vega version is a bit longer, partly because in Vega each mark defines its own coordinate system piecemeal by naming a scale for each position (“x”, “y”, etc.). VizJSON groups scales into coordinate systems and shares the coordinate systems among multiple elements, which is a little more compact.

Another reason is based on the difference in philosophy between the core engines; D3 assumes you are designing for a specific data set and so you have to tell it to allocate space for the axes, whereas RAVE adapts to the data unless you override it with a specific preference. This is why Vega has the padding element at the top; the data has y values in the range [0,100] and, for the default font, this requires a left padding of 30 pixels to make space for the ticks. RAVE works this out for you, so the padding is not needed in the VizJSON specification.
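To give a feel for what "working it out for you" might involve, here is a rough Python sketch of estimating left padding from the data range. It is not RAVE's algorithm (which I am not describing here) and it is not how D3 or Vega do anything; the function names, the tick-picking rule, and the character width, tick length and gap are all assumptions. The constants in particular are made up so that the result lines up with the 30 pixels in the tutorial, and a real engine would measure the labels with actual font metrics.

```python
import math


def nice_ticks(lo, hi, target=10):
    """Pick evenly spaced ticks on a 1/2/5 step covering [lo, hi] (hypothetical rule)."""
    span = hi - lo
    if span <= 0:
        return [lo]
    magnitude = 10 ** math.floor(math.log10(span / target))
    for multiplier in (1, 2, 5, 10):
        step = multiplier * magnitude
        if span / step <= target:
            break
    start = math.floor(lo / step) * step
    count = math.floor((hi - start) / step + 1e-9)
    return [start + i * step for i in range(count + 1)]


def left_padding(lo, hi, char_width=7, tick_length=6, gap=3):
    """Estimate the pixels needed left of the plot for y-axis tick labels.

    char_width, tick_length and gap are illustrative guesses, not real font metrics.
    """
    labels = [format(tick, "g") for tick in nice_ticks(lo, hi)]
    widest = max(len(label) for label in labels)
    return widest * char_width + tick_length + gap


print(left_padding(0, 100))      # widest label is "100"
print(left_padding(0, 100000))   # longer labels need more room
```

The point is only that the padding is a function of the data: change the range and the space needed for the labels changes with it, which is why a fixed number in the specification only suits one data set.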

Vega also has finer control over the hover behavior; VizJSON currently does not allow such control and you have to use the programmatic interface to set the hover style.

These are minor details though; the main take-away here is that, despite a difference in the underlying engines and a different set of goals, the languages are very similar. It would be a simple job to write a translator from one to the other, for example. To a great extent this demonstrates that the “Grammar of Graphics” approach is a very robust and powerful solution. It’s a language that works.
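As a hedged illustration of how small such a translator could be, the Python sketch below takes a heavily trimmed, Vega-like structure (loosely modeled on the tutorial bar chart, not a complete or validated Vega spec) and regroups the per-mark scale references into one shared coordinate system. The output format is invented for this example; it is not VizJSON, and the names `group_scales`, `coordinates`, `dimensions` and `elements` are all hypothetical.

```python
def group_scales(vega_like):
    """Collect the scales each mark references into one shared coordinate system."""
    scales = {scale["name"]: scale for scale in vega_like["scales"]}
    used = []       # scale names in the order marks first reference them
    elements = []
    for mark in vega_like["marks"]:
        enter = mark["properties"]["enter"]
        for channel in ("x", "y"):
            name = enter[channel]["scale"]
            if name not in used:
                used.append(name)
        elements.append({
            # "interval" borrows the GPL term for bars; the mapping is an assumption
            "type": "interval" if mark["type"] == "rect" else mark["type"],
            "position": [enter["x"]["field"], enter["y"]["field"]],
        })
    dimensions = [{"scale": scales[name].get("type", "linear")} for name in used]
    return {"coordinates": {"dimensions": dimensions}, "elements": elements}


# Trimmed, Vega-like input: two named scales and one rect mark that uses both.
vega_like = {
    "scales": [
        {"name": "x", "type": "ordinal"},
        {"name": "y", "type": "linear"},
    ],
    "marks": [{
        "type": "rect",
        "properties": {"enter": {
            "x": {"scale": "x", "field": "data.x"},
            "y": {"scale": "y", "field": "data.y"},
        }},
    }],
}

print(group_scales(vega_like))
```

A real translator would also have to carry over data sources, axes, styling and interaction, but the core move is this kind of regrouping rather than any change in concepts.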

Which, considering I have been working on it for over a decade, is good news!
