Transcript from the "Chart Types & Vega-Lite" Lesson

[00:00:00]

>> Shirley Wu: And I think for today, we probably only have the time to explore one or two of these questions. But keeping these in mind, I want to then start talking about how do we explore that data quickly to come to some sort of answers? Or to either validate or debunk some of these questions and hypotheses.

[00:00:22] And so what I want to talk about is some basic chart types to help you with the data exploration. And so I think I have just four in here that I've asked around and seeing what charts help with the basic data exploration. So a bar chart is one of those really common, really easy ones that are quick.

[00:00:48] So bar charts are really good for categorical comparisons where the domain is categorical and the range is quantitative. So in this particular example, it's the movie genres, and then the number of movies that fall into those genres. Histograms are really great for if you want to look up a distribution of something.

[00:01:15] So if the domain is, for where the domain is quantitative, and you want to bucket those records into these bins. So the example I like to give is if you have movies and their meta-scores that we were talking about earlier. So if you want to put them into every ten points or every 10% or something, and then the range is the frequency, the number of movies that fall into those bins.

[00:01:48]

>> Shirley Wu: Scatter plots are really great for correlations. So all of the versus that we were talking about like this meta-score versus this IMDB rating versus etc, that's really great to test out with correlation. So correlations is great for when you have two attributes, and you want to see the relationship between their quantitative values.

[00:02:15]

>> Shirley Wu: And then for the temporal, it's line charts, line charts are really great for looking for temporal trends. And so the domain for that is temporal and the range is quantitative.

>> Shirley Wu: And so those are kind of the four that I hear people talk about a lot in terms of just kind of understanding the data, really quickly understanding the data.

[00:02:40] And for us today to explore the data, I want to use something called Vega-Lite, which is out of the, I think UW, I'm gonna get this wrong, the UW,

>> Shirley Wu: I call it the Jeff Heer lab. [LAUGH]

>> Shirley Wu: So he runs the Interactive Data Lab. And Jeff Heer is actually the professor who was overseeing Mike Bostock when he was working on an earlier version with D3.

[00:03:15] And then now Jeff Heer leads this Interactive Data Lab, and one of the things that they do is Vega and Vega-Lite. And they call it a visualization of grammar, I think, a grammar of interactive graphics. And so this is just a really quick way of being able to compose visualizations.

[00:03:37] And the way that you can do it is you just give it a data source, you tell it what each mark is supposed to be. So it is a tick, is it a bar, is it a point, is it a line, how is the data represented? And then you tell it what encodings each of those marks have, so that could be x, y, color, etc.

[00:04:02] And for Vega-Lite, for each encoding you need to give it a type. So that data type we were talking about earlier, is it qualitative, is it temporal is it, instead of categorical they call it nominal. And so that's why I talked a lot about data types earlier and the field.

[00:04:22] And the field is basically the attribute, the name of that attribute in your data object. So those are the things that we need to draw graphs with Vega-Lite.