Car Auction

Professor: Hannah Bailey

Data Visualization - Summer 2024

MEMORANDUM

TO: Auction Garage (Classic Car Auctions)

FROM: Veronique Nedeau

DATE: July 8, 2024

SUBJECT: Car Auction Interactive Dashboard, Initial Data Observations

Objective

The objective of this project is to develop an interactive dashboard for prospective buyers to understand cars and their worth through previous auction sales. This project is currently in the development stage.

Data Investigation

The first step in preparing the data is understanding the included variables: what each one is, what it means, and whether it has any missing values. The dataset contains 159 Corvette sales, 69 Camaro sales, and 92 Mustang sales. Table 1 shows the number of missing values per variable for each car model.

Table 1: Missing Values per Variable

The variable with the most missing values is ‘Miles’. For the variables ‘Interior’, ‘Engine’, ‘Transmission’, and ‘Convertible’, the best course of action was to remove the rows with missing values from the dataset. For Corvettes, this reduced the number of observations from 159 to 152, of which 95 (62.5%) have no mileage data. The same was done for Camaros and Mustangs. For Camaros, removing the rows with missing ‘Interior’, ‘Engine’, ‘Transmission’, or ‘Convertible’ information reduced the observations from 69 to 61, of which 48 (78.7%) have no mileage data. For Mustangs, removing the rows with missing information reduced the observations from 92 to 80, of which 60 (75%) have no mileage data.
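The row removal and the missing-mileage percentages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the exact steps used to prepare the workbook: the helper names (`is_missing`, `drop_incomplete`, `missing_share`) are hypothetical, blank cells are assumed to arrive as `None` or empty strings, and the three sample rows are made up.

```python
# Columns that must be present for a row to be kept (names from the memo).
REQUIRED = ("Interior", "Engine", "Transmission", "Convertible")


def is_missing(value):
    """Treat None and blank strings as missing (an assumption about the export)."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def drop_incomplete(rows, required=REQUIRED):
    """Keep only rows that have a value for every required variable."""
    return [r for r in rows if not any(is_missing(r.get(c)) for c in required)]


def missing_share(rows, column="Miles"):
    """Return (count, percent) of rows with no value in `column`."""
    count = sum(1 for r in rows if is_missing(r.get(column)))
    percent = 100 * count / len(rows) if rows else 0.0
    return count, round(percent, 1)


# Made-up sample rows, not the auction data.
sample = [
    {"Interior": "Red", "Engine": "V8", "Transmission": "Manual",
     "Convertible": "No", "Miles": 42000},
    {"Interior": "", "Engine": "V8", "Transmission": "Manual",
     "Convertible": "No", "Miles": None},
    {"Interior": "Black", "Engine": "V8", "Transmission": "Automatic",
     "Convertible": "Yes", "Miles": None},
]
cleaned = drop_incomplete(sample)

# Percentages recomputed from the counts reported in this memo.
reported = {
    "Corvette": round(100 * 95 / 152, 1),
    "Camaro": round(100 * 48 / 61, 1),
    "Mustang": round(100 * 60 / 80, 1),
}
```

Recomputing the shares from the reported counts is a quick consistency check on the figures quoted above.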

Missing ‘Miles’ values are more difficult to handle. Mileage is taken from the car’s odometer, and falsifying it is a crime, so the missing values cannot simply be filled in. Replacing missing values with the most common value would not help customers either, because details such as the engine or transmission type are very important to potential car buyers and should not be misrepresented. To keep the dashboard accurate for customers, cars with missing mileage values were removed from the dataset, although this greatly reduced the number of observed cars.

The Mustang data includes a redundant variable, ‘Sale Age’, which is the difference between the model year and the auction year. This variable was removed from the dataset.
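A check like the following could confirm that ‘Sale Age’ carries no extra information before it is dropped. It is a minimal sketch: the column names ‘Model Year’ and ‘Auction Year’ are assumptions about how the Mustang sheet is labeled, the helper names are hypothetical, and the two rows are made up.

```python
def sale_age_is_redundant(rows):
    """True if every 'Sale Age' equals auction year minus model year."""
    return all(
        r["Sale Age"] == r["Auction Year"] - r["Model Year"] for r in rows
    )


def drop_column(rows, column):
    """Return copies of the rows without the given column."""
    return [{k: v for k, v in r.items() if k != column} for r in rows]


# Made-up Mustang rows, not the auction data.
mustangs = [
    {"Model Year": 1967, "Auction Year": 2019, "Sale Age": 52},
    {"Model Year": 1970, "Auction Year": 2021, "Sale Age": 51},
]

# Only drop the column once the redundancy has been confirmed.
if sale_age_is_redundant(mustangs):
    mustangs_clean = drop_column(mustangs, "Sale Age")
else:
    mustangs_clean = mustangs
```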

The data was combined into one sheet, ‘All Car Data’, with an added variable, ‘Car Make’, that takes the labels ‘Corvette’, ‘Camaro’, and ‘Mustang’.
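The combining step can be illustrated as follows. This is a sketch that assumes each model's data is held as a list of row dictionaries; the function name `combine_sheets` and the sample prices are hypothetical.

```python
def combine_sheets(sheets):
    """Stack per-model rows into one list, tagging each row with 'Car Make'."""
    combined = []
    for label, rows in sheets.items():
        for row in rows:
            # Copy the row so the original sheets are left unchanged.
            combined.append({**row, "Car Make": label})
    return combined


# Made-up rows, not the auction data.
all_car_data = combine_sheets({
    "Corvette": [{"Price": 50000}, {"Price": 61000}],
    "Camaro": [{"Price": 38000}],
    "Mustang": [{"Price": 45000}],
})
```

Keeping the label as its own column lets the dashboard filter or color by model without needing three separate data sources.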

Questions and Additional Data

My main request for additional data after the initial investigation is the missing mileage data for the Corvettes, Camaros, and Mustangs sold. For car buyers, how much a car has been driven is some of the most important information. More mileage means more wear and tear on the car’s components, especially key parts like the transmission and engine. Different transmissions and engines tolerate different levels of use, so the same mileage may not affect one engine the way it affects another.

When dealing with cars for sale, their condition is another key variable that would be useful for potential customers to know up front. These conditions could include ‘Mint’, where all original materials are in pristine condition; ‘Original’, where all parts of the car are original from the manufacturer; ‘As Is’, which could possibly mean that the car is not in great condition; and ‘Restored’, where the car is now in good condition but is not original from the manufacturer. This information may or may not be available, but it would be useful to have.

A possible solution to the missing mileage values would be to impute them from similar cars (same model, engine, and transmission) that also sold for a similar price. Any imputed values would have to be clearly labeled so as not to falsely advertise to customers.
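One way this imputation could work is sketched below. It is an assumption-laden illustration, not a method that has been applied to the dashboard data: "similar price" is taken to mean within 20% of the car's sale price, the fill value is the median mileage of matching cars, and the ‘Price’ and ‘Miles Imputed’ column names, the function name, and the sample rows are all hypothetical.

```python
from statistics import median


def impute_miles(rows, keys=("Car Make", "Engine", "Transmission"),
                 price_tolerance=0.2):
    """Fill missing 'Miles' with the median of similar cars and flag the row."""
    known = [r for r in rows if r.get("Miles") is not None]
    result = []
    for r in rows:
        row = dict(r)
        row["Miles Imputed"] = False
        if row.get("Miles") is None:
            # Peers share model, engine, and transmission and sold
            # within the price tolerance of this car.
            peers = [
                k["Miles"] for k in known
                if all(k.get(c) == row.get(c) for c in keys)
                and abs(k["Price"] - row["Price"]) <= price_tolerance * row["Price"]
            ]
            if peers:
                row["Miles"] = median(peers)
                row["Miles Imputed"] = True  # must be shown to customers
        result.append(row)
    return result


# Made-up rows, not the auction data.
cars = [
    {"Car Make": "Corvette", "Engine": "V8", "Transmission": "Manual",
     "Price": 50000, "Miles": 30000},
    {"Car Make": "Corvette", "Engine": "V8", "Transmission": "Manual",
     "Price": 55000, "Miles": 40000},
    {"Car Make": "Corvette", "Engine": "V8", "Transmission": "Manual",
     "Price": 52000, "Miles": None},
    {"Car Make": "Mustang", "Engine": "V8", "Transmission": "Manual",
     "Price": 52000, "Miles": None},
]
imputed = impute_miles(cars)
```

Cars with no comparable peers are left without a mileage value rather than guessed, and the ‘Miles Imputed’ flag gives the dashboard a field to mark estimated values.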

Tableau Dashboard Captures

View Dashboard Online: Tableau Public

pdf format