Predict a Magic Card’s Color Identity Based on Artwork

Problem


Can we predict the color identity of a Magic card by analyzing the colors represented in the artwork?

Collecting Data


Merging MTGJSON, Scryfall Data – Jupyter Notebook – Python

For this analysis, we used two datasets: MTGJSON and Scryfall.

Before joining the datasets, we first had to pull some nested data into their own columns. The Scryfall ID is the column we use to join the tables, and the Cropped Art links point to the image files we analyze.
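A minimal sketch of the flatten-and-join step described above, using pandas on two toy records. The field names (`identifiers`, `scryfallId`, `image_uris`, `art_crop`) are assumptions about how the two sources nest these values, and the URLs and IDs are placeholders, not real data.

```python
import pandas as pd

# Toy records standing in for the two sources. Field names are assumptions
# about the nesting; IDs and URLs are placeholders.
mtgjson_cards = [
    {"name": "Lightning Bolt", "colorIdentity": ["R"],
     "identifiers": {"scryfallId": "id-1"}},
    {"name": "Giant Growth", "colorIdentity": ["G"],
     "identifiers": {"scryfallId": "id-2"}},
]
scryfall_cards = [
    {"id": "id-1", "image_uris": {"art_crop": "https://example.com/art/id-1.jpg"}},
    {"id": "id-2", "image_uris": {"art_crop": "https://example.com/art/id-2.jpg"}},
]

mtgjson = pd.DataFrame(mtgjson_cards)
scryfall = pd.DataFrame(scryfall_cards)

# Pull the nested values out into their own columns.
mtgjson["scryfall_id"] = mtgjson["identifiers"].apply(lambda d: d.get("scryfallId"))
scryfall["art_crop"] = scryfall["image_uris"].apply(lambda d: d.get("art_crop"))

# Join the two tables on the Scryfall ID.
cards = mtgjson.merge(
    scryfall[["id", "art_crop"]],
    left_on="scryfall_id",
    right_on="id",
    how="inner",
)
```

An inner join keeps only cards that appear in both sources, which is what we want when every row needs an artwork link.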

Lastly, we filtered the data down to Alpha edition cards only and dropped the columns we did not need.

Joined and Filtered dataset.
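The filtering step might look something like the sketch below. The column names and the `unused_column` field are hypothetical; we assume the set is identified by a `setCode` column where `LEA` marks Alpha (Limited Edition Alpha).

```python
import pandas as pd

# Toy joined table; column names and URLs are placeholders.
cards = pd.DataFrame({
    "name": ["Lightning Bolt", "Lightning Bolt", "Giant Growth"],
    "setCode": ["LEA", "M10", "LEA"],
    "colorIdentity": [["R"], ["R"], ["G"]],
    "art_crop": [
        "https://example.com/art/a.jpg",
        "https://example.com/art/b.jpg",
        "https://example.com/art/c.jpg",
    ],
    "unused_column": ["x", "y", "z"],  # hypothetical column we do not need
})

# Keep Alpha rows only, and only the columns needed for the analysis.
keep = ["name", "colorIdentity", "art_crop"]
alpha = cards.loc[cards["setCode"] == "LEA", keep].reset_index(drop=True)
```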

Building a Solution


Analyzing MTG Art – Most Prominent Colors, Part 1 – Jupyter Notebook – Python

Analyzing MTG Art – Most Prominent Colors, Part 2 – Jupyter Notebook – Python

To analyze the color distribution, we take each pixel in a card’s artwork and classify it as a color, then return a list of the colors that are featured the most along with the proportion at which they are featured.

To help with the classification of colors, we first convert each image to a numpy array, with each element holding the RGB representation of a single pixel.

Importing card art, then converting the image to a numpy array.
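As a rough sketch of that conversion, assuming Pillow is used to load the image (the original notebook may load it differently). The helper name `image_to_pixels` is our own, and a small synthetic image stands in for a downloaded art crop.

```python
import numpy as np
from PIL import Image

def image_to_pixels(img):
    """Return an (n_pixels, 3) array with one RGB row per pixel."""
    rgb = np.asarray(img.convert("RGB"))  # shape: (height, width, 3)
    return rgb.reshape(-1, 3)

# Synthetic 4x3 image standing in for a card's cropped art.
# With a real file this would be Image.open(path) instead.
art = Image.new("RGB", (4, 3), color=(200, 30, 30))
pixels = image_to_pixels(art)
```

Flattening to one row per pixel drops the spatial layout, which is fine here because we only care about which colors appear and how often.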

With the image in numerical form, we can perform a K Means clustering analysis with n = 5 clusters to assign each of the image’s pixels to one of five groups. The center RGB values of these five clusters represent the five most prominent colors in the artwork we are analyzing. We can also compare the size of each cluster to calculate the proportion of pixels that belong to each cluster.

Grouping pixels using K Means clustering analysis.
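The clustering step can be sketched as follows, assuming scikit-learn's `KMeans` (the original notebook may use a different implementation). The helper name `prominent_colors` and the synthetic pixel data are ours, for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans

def prominent_colors(pixels, n_clusters=5, seed=0):
    """Cluster RGB pixels; return cluster centers and their pixel shares,
    sorted from most to least prominent."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(pixels)
    counts = np.bincount(model.labels_, minlength=n_clusters)
    proportions = counts / counts.sum()
    centers = np.rint(model.cluster_centers_).astype(int)
    order = np.argsort(proportions)[::-1]
    return centers[order], proportions[order]

# Synthetic pixels: five flat colors in known proportions.
palette = np.array([
    [255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 255], [0, 0, 0],
])
sizes = [50, 25, 15, 7, 3]
pixels = np.repeat(palette, sizes, axis=0)

colors, shares = prominent_colors(pixels)
```

On real artwork the cluster centers are averages of many similar pixels, so they rarely match any single pixel exactly; fixing `random_state` keeps the result reproducible between runs.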

We convert the output of the clustering analysis into a dataframe and concatenate the new color and color proportion features to our original dataset.

Dataframe containing the most prominent colors and their proportion by pixel density.
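One way this step could look, under a few assumptions: each color is stored as a hex string, the columns are named `color_1`, `proportion_1`, and so on, and the per-card clustering output is a list of (RGB, proportion) pairs. None of these names come from the original notebook, and the values are made up for illustration.

```python
import pandas as pd

def rgb_to_hex(rgb):
    """Format an (r, g, b) triple as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*(int(v) for v in rgb))

cards = pd.DataFrame({"name": ["Card A", "Card B"]})

# Hypothetical clustering output, one list of (rgb, proportion) per card.
clusters = [
    [((200, 30, 30), 0.6), ((20, 20, 20), 0.4)],
    [((30, 120, 40), 0.7), ((240, 240, 240), 0.3)],
]

rows = []
for card_clusters in clusters:
    row = {}
    for rank, (rgb, share) in enumerate(card_clusters, start=1):
        row[f"color_{rank}"] = rgb_to_hex(rgb)
        row[f"proportion_{rank}"] = share
    rows.append(row)

color_df = pd.DataFrame(rows)

# Concatenate column-wise; both frames must share the same row order.
combined = pd.concat([cards.reset_index(drop=True), color_df], axis=1)
```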

The last step is to turn each unique color into its own feature/column in the dataset. Then we zero out the values for artworks where that color is not prominent.

Final dataset.
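A minimal sketch of this reshaping, assuming the colors have already been mapped to names and the data is in long form (one row per card and color). The column names and values here are illustrative, not taken from the original notebook.

```python
import pandas as pd

# Long-form table: one row per (card, prominent color).
long_form = pd.DataFrame({
    "name": ["Card A", "Card A", "Card B", "Card B"],
    "color": ["red", "black", "green", "black"],
    "proportion": [0.6, 0.4, 0.7, 0.3],
})

# One column per unique color; cards where a color is not prominent get 0.
features = long_form.pivot_table(
    index="name",
    columns="color",
    values="proportion",
    aggfunc="sum",
    fill_value=0,
)
```

Using `aggfunc="sum"` means that if two clusters for the same card map to the same color name, their proportions are added together rather than overwritten.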

Data Visualizations Built with Dataset


Analyzing MTG Art – Distribution of Colors – Jupyter Notebook – R

Color Distribution in MTG Art – Plotly – Jupyter Notebook – Python