How to Create a Multi-Dimensional Visualisation in R
The aim of any visualisation is to gain insight into the data, and that insight should by no means be limited to two factors at a time, because in real life any process involves multiple factors. The challenge is that a traditional scatter plot can be scaled to at most three dimensions; beyond that, it becomes impossible to add more axes. But I don’t agree that the inability to add more axes restricts the number of dimensions you can show in a scatter plot. Contrary to general belief, visualisation on a 2D plane is not restricted to just two dimensions.
Let me give you a simple example using the “mtcars” data in R. Those who are not really interested in programming can ignore the code. The mtcars data set contains the mileage of 32 cars along with various factors that may affect it. Here is a quick look at the data.
|car|mpg|cyl|disp|hp|drat|wt|qsec|vs|am|gear|carb|
|---|---|---|---|---|---|---|---|---|---|---|---|
|Mazda RX4 Wag|21.0|6|160|110|3.90|2.875|17.02|0|1|4|4|
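For those following along in R, a preview like this can be produced with a couple of base functions. This is only a minimal sketch: mtcars ships with R in the datasets package, and per its help page (`?mtcars`) the columns used in this post are `mpg` (miles per gallon), `wt` (weight in 1000 lbs), `am` (0 = automatic, 1 = manual), `gear` and `cyl`.

```r
# mtcars is bundled with base R (datasets package), so nothing to download
data(mtcars)

# First six rows, one row per car model
print(head(mtcars))

# Column types and a compact overview of all 11 variables
str(mtcars)
```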
We’ll start with a simple 2D plot depicting how mileage varies with the weight of the vehicle.
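The original plotting code is not shown here, so the following is a sketch of how such a plot could be drawn with ggplot2. The axis labels are my own wording.

```r
library(ggplot2)

# Two dimensions: weight on the x axis, mileage on the y axis
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Mileage (miles per gallon)")

print(p)
```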
As is apparent from the plot, mileage goes down as weight increases. Now let’s add another dimension and see how the type of transmission affects mileage.
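One way to add transmission as a third dimension is to map it to the colour of the points. Treat this as a sketch under assumptions: the choice of colour for this variable, the copy named `cars` and the `transmission` column are mine, not necessarily what the original plot used. The labels follow the coding documented in `?mtcars`, where `am` is 0 for automatic and 1 for manual.

```r
library(ggplot2)

# Work on a copy so the built-in data set stays untouched
cars <- mtcars

# am is stored as 0/1; turn it into a labelled factor for a readable legend
cars$transmission <- factor(cars$am, levels = c(0, 1),
                            labels = c("automatic", "manual"))

# Third dimension: colour encodes the transmission type
p <- ggplot(cars, aes(x = wt, y = mpg, colour = transmission)) +
  geom_point(size = 3) +
  labs(x = "Weight (1000 lbs)", y = "Mileage (miles per gallon)",
       colour = "Transmission")

print(p)
```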
We can see that most of the cars with manual transmission tend to have higher mileage (in mtcars, am = 1 means manual and am = 0 means automatic). One thing to note here is that most of the heavy cars have automatic transmission, so weight might be the real underlying reason for the manual cars having higher mileage.
OK, now let’s add one more dimension to find out how the number of gears changes across these vehicles.
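A fourth dimension can ride on the shape of the points. Again a sketch rather than the original code: mapping gears to shape is my assumption, and `cars`, `transmission` and `gears` are illustrative names.

```r
library(ggplot2)

cars <- mtcars
cars$transmission <- factor(cars$am, levels = c(0, 1),
                            labels = c("automatic", "manual"))

# Shape needs a discrete variable, so convert the gear count to a factor
cars$gears <- factor(cars$gear)

# Fourth dimension: shape encodes the number of forward gears
p <- ggplot(cars, aes(x = wt, y = mpg,
                      colour = transmission, shape = gears)) +
  geom_point(size = 3) +
  labs(x = "Weight (1000 lbs)", y = "Mileage (miles per gallon)",
       colour = "Transmission", shape = "Gears")

print(p)
```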
You can see that the number of gears doesn’t show a simple effect on mileage, as the gear counts are spread across much of the mileage range; the same goes for weight. But a curious thing to observe here is that cars with manual transmission have a higher number of gears than cars with automatic transmission. In fact, there seems to be a limit on the number of gears in the automatic cars: none of them has more than four.
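A pattern read off a plot is worth checking against the raw counts. The sketch below cross-tabulates transmission against gears using base R only. The name `gear_by_transmission` is mine, and the labels again follow the `am` coding in `?mtcars` (0 = automatic, 1 = manual).

```r
# Count the cars in every transmission/gear combination
gear_by_transmission <- table(
  transmission = factor(mtcars$am, levels = c(0, 1),
                        labels = c("automatic", "manual")),
  gears = mtcars$gear
)

print(gear_by_transmission)
```

Any zero cells in the printed table mark combinations that do not occur in the data.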
Let’s add one more dimension depicting the number of cylinders in the engine.
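The fifth dimension can be carried by the size of the points. As before, this is a sketch: which variable went to shape, size or colour in the original plot is not stated, so the assignment here is an assumption, and the size range and transparency are arbitrary choices of mine.

```r
library(ggplot2)

cars <- mtcars
cars$transmission <- factor(cars$am, levels = c(0, 1),
                            labels = c("automatic", "manual"))
cars$gears <- factor(cars$gear)

# Five dimensions on a 2D plane:
#   x = weight, y = mileage, colour = transmission,
#   shape = gears, size = cylinders
p <- ggplot(cars, aes(x = wt, y = mpg, colour = transmission,
                      shape = gears, size = cyl)) +
  geom_point(alpha = 0.7) +   # slight transparency helps with overlapping points
  scale_size_continuous(breaks = c(4, 6, 8), range = c(2, 6)) +
  labs(x = "Weight (1000 lbs)", y = "Mileage (miles per gallon)",
       colour = "Transmission", shape = "Gears", size = "Cylinders")

print(p)
```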
You can see that the number of cylinders certainly seems to have an effect on mileage. Low-mileage, heavy cars tend to have 8 cylinders in the engine, whereas high-mileage, light cars tend to have 4 cylinders.
If you have noticed, by now we have 5 dimensions in a 2D plot. The lesson here is that visualising multiple factors (dimensions) is not really about making an n-dimensional plot, which is impossible: it’s not feasible to add more axes to your plot. What we can do, however, is give more features to our “points”, which is exactly what we have done here. We introduced the three additional dimensions by adding features like shape, size and colour to our points.
That’s what I wanted to convey: let your imagination (and Mr Hadley Wickham, the author of ggplot2) take you out of those dimensionality constraints! Happy plotting in R!