How does OpenAI Playground respond to a request to characterize fonts by gender?

The Preparation

  • We repeated the prompt and copied the results into a spreadsheet until we had 100 font pairings for “feminine,” 100 font pairings for “masculine,” and 100 font pairings for “non-gendered”.
  • We cleared the history after every prompt, just to avoid influencing future data sets.

The Settings

  • Mode: Complete
  • Model: text-davinci-003
  • Temperature: 1
  • Maximum Length: 256
  • Stop sequences: None
  • Top P: 1
  • Frequency Penalty: 0
  • Presence Penalty: 0
  • Best of: 1
  • Inject Start Text: Yes
  • Inject Restart Text: Yes
  • Show Probabilities: Off
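
For anyone who wants to approximate this run outside the Playground interface, the settings above map roughly onto the parameters of a text-completion request. The sketch below only builds that request as a plain dictionary and does not call any service. The function name `build_completion_payload` and the sample prompt wording are illustrative assumptions, and the options that concern the Playground interface (Inject Start Text, Inject Restart Text, Show Probabilities) are left out.

```python
def build_completion_payload(prompt: str) -> dict:
    """Mirror the Playground settings listed above as request parameters."""
    return {
        "model": "text-davinci-003",
        "prompt": prompt,
        "temperature": 1,
        "max_tokens": 256,        # "Maximum Length" in the Playground
        "stop": None,             # no stop sequences
        "top_p": 1,
        "frequency_penalty": 0,
        "presence_penalty": 0,
        "best_of": 1,
    }


# Hypothetical prompt wording; the exact prompt is not reproduced here.
payload = build_completion_payload("Suggest twenty font pairings ...")
print(payload["model"], payload["max_tokens"])
```

With a temperature of 1, repeated requests will not return the same pairings, which is why the prompt was repeated until each category reached 100 pairings.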

The Cleanup

  • Due to the default setting for Maximum Length, we usually didn’t get a full 20 font pairings returned for each prompt. As we got close to 100, we modified the prompt to fill in the remainder (e.g. “six font pairings” instead of “twenty font pairings”).
  • For both the Feminine and Non-Gendered prompts, one font pairing was returned with only two descriptors. Rather than going back, removing the pairing, and generating an additional pairing, we opted to process it the same way as the rest of the data.
  • For the Non-Gendered prompt, one font pairing was returned with a truncated third descriptor. We opted to process this one by removing the truncated descriptor and processing the pairing as having only two descriptors, similar to the point above.
  • For both the Feminine and Masculine prompts, one data set was returned where the word “feminine” or “masculine” was used as a descriptor for every single font. In both instances, we removed that data set to avoid creating a statistical outlier in the descriptors.
  • For both the Feminine and Masculine prompts, one data set was returned where each font pairing got two sets of three descriptors (six total). In both instances, we removed the second set of descriptors from each datum to standardize the data set with the others.
  • For the Feminine prompt, one data set was returned where a sentence of description was provided for each font pairing, in addition to the keywords. We removed the sentence from each datum to standardize the data set with the others.
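
The cleanup was done by hand, but the rules above can be expressed as a short sketch. It assumes each pairing has already been turned into a dictionary with a `fonts` tuple and a `descriptors` list; that structure and the name `clean_data_set` are hypothetical. Removing a truncated descriptor or a descriptive sentence still needs a human eye and is not covered here.

```python
def clean_data_set(pairings, gender_word):
    """Apply the cleanup rules to one data set (the pairings from one prompt).

    Each pairing is assumed to look like
    {"fonts": ("Lato", "Lora"), "descriptors": ["clean", "modern", "warm"]}.
    Returns None when the whole data set should be discarded.
    """
    # Discard a data set in which every pairing uses the gender word itself
    # ("feminine" or "masculine") as a descriptor.
    if pairings and all(
        gender_word in (d.lower() for d in p["descriptors"]) for p in pairings
    ):
        return None

    cleaned = []
    for p in pairings:
        # Keep only the first set of three descriptors. Pairings that came
        # back with only two descriptors are kept as they are.
        cleaned.append({"fonts": p["fonts"], "descriptors": p["descriptors"][:3]})
    return cleaned
```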

The Processing

  • We divided each datum out into its corresponding elements: Font 1, Font 2, Descriptor 1, Descriptor 2, and Descriptor 3.
  • We then processed the data sets to return the following lists:
    • All fonts in each specific category and how often they each occurred.
    • All descriptors in each specific category and how often they each occurred.
    • All fonts across all categories, and how often they each occurred in each gender category and in total.
    • All descriptors across all categories, and how often they each occurred in each gender category and in total.
    • Which fonts were paired with which fonts, and how often.
  • We next assigned additional contextual data for each font, including:
    • Its source (although we attempted to specify only Google Fonts, Playground still returned fonts from Microsoft, Adobe Fonts, and independent publishers).
    • Its type classification (serif, sans-serif, display, handwriting, or monospace).
    • Whether the font was a hallucination, i.e., Playground had made up a font to suggest that did not actually exist.
  • Since Playground uses a language model and not a visual model, it relies on written descriptions, categories, and keywords that people across the internet have assigned to these fonts. To explore this, we also added tags to fonts if their names featured:
    • An identifiable name in any language, and what gender category that name was most commonly associated with (such as “Arial,” “Oswald,” “Kalam,” “Sofia,” “Reenie,” “Josefin”).
    • An identifiable gendered term in any language (such as “girls,” “thambi” (Tamil for “younger brother”), “guy,” “daughter,” “king”).
    • An activity that is commonly associated with a specific gender category (such as “homemade,” “dancing,” “playball”).
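
The counting steps above can be sketched with Python's `collections.Counter`. The sample rows below are made up for illustration and are not taken from the real data; the variable names are likewise assumptions.

```python
from collections import Counter

# Hypothetical sample rows: (category, font 1, font 2, descriptors).
rows = [
    ("Feminine", "Playfair Display", "Lato", ["elegant", "classic", "soft"]),
    ("Feminine", "Pacifico", "Lato", ["playful", "soft", "warm"]),
    ("Masculine", "Oswald", "Lato", ["bold", "strong", "classic"]),
]

font_counts = {}         # category -> Counter of fonts
descriptor_counts = {}   # category -> Counter of descriptors
pair_counts = Counter()  # unordered font pair -> number of times paired

for category, font1, font2, descriptors in rows:
    font_counts.setdefault(category, Counter()).update([font1, font2])
    descriptor_counts.setdefault(category, Counter()).update(descriptors)
    pair_counts[frozenset((font1, font2))] += 1

# Totals across all categories.
all_fonts = sum(font_counts.values(), Counter())
all_descriptors = sum(descriptor_counts.values(), Counter())

print(font_counts["Feminine"].most_common(3))
print(all_fonts.most_common(3))
```

Storing each pair as a `frozenset` treats "Lato with Oswald" and "Oswald with Lato" as the same pairing, which is one possible reading of "which fonts were paired with which fonts."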

BY THE NUMBERS

How does AI’s gender bias reflect in typefaces?

Top 4 Feminine Fonts

  • Playfair Display
  • Roboto
  • Raleway
  • Merriweather

Top 4 Masculine Fonts

  • Open Sans
  • Oswald
  • Raleway
  • Roboto

Top 4 Non-Gendered Fonts

  • Montserrat
  • Merriweather
  • Raleway
  • Roboto

Fonts by category overlap (F = Feminine, M = Masculine, N = Non-Gendered)

  • Only Feminine: 49 (32%)
  • F/M: 3 (2%)
  • Only Masculine: 28 (18%)
  • M/N: 16 (10%)
  • Only Non-Gendered: 22 (14%)
  • F/N: 8 (5%)
  • F/M/N: 29 (19%)
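
As a rough check on the overlap figures above, the sketch below shows how the seven regions can be derived from three sets of font names, and how the published counts turn into percentages. The function name `venn_regions` is an assumption; the counts are the ones reported above.

```python
def venn_regions(feminine, masculine, non_gendered):
    """Split three collections of font names into the seven overlap regions."""
    f, m, n = set(feminine), set(masculine), set(non_gendered)
    return {
        "Only Feminine": f - m - n,
        "Only Masculine": m - f - n,
        "Only Non-Gendered": n - f - m,
        "F/M": (f & m) - n,
        "F/N": (f & n) - m,
        "M/N": (m & n) - f,
        "F/M/N": f & m & n,
    }


# Region sizes as reported above.
counts = {
    "Only Feminine": 49, "F/M": 3, "Only Masculine": 28, "M/N": 16,
    "Only Non-Gendered": 22, "F/N": 8, "F/M/N": 29,
}
total = sum(counts.values())  # number of distinct fonts
shares = {region: round(100 * c / total) for region, c in counts.items()}

# Fonts that appeared in the Feminine category at all, including overlap.
feminine_with_overlap = sum(counts[r] for r in ("Only Feminine", "F/M", "F/N", "F/M/N"))

print(total, shares, feminine_with_overlap)
```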

Font Number of Occurrences
Abel
Masculine (2)
Non-Gendered (1)
Abhaya Libre
Non-Gendered (1)
Abril Fatface
Feminine (1)
Acme
Feminine (1)
Aladin
Non-Gendered (1)
Aldrich
Masculine (1)
Alegreya
Masculine (2)
Aleo
Feminine (2)
Allura
Feminine (1)
Amaranth
Non-Gendered (1)
Amatic SC
Feminine (5)
Masculine (2)
Non-Gendered (2)
Anton
Feminine (1)
Masculine (1)
Architects Daughter
Feminine (2)
Archivo Narrow
Non-Gendered (1)
Arial
Non-Gendered (1)
Arimo
Non-Gendered (1)
Arvo
Feminine (2)
Masculine (2)
Non-Gendered (4)
Audiowide
Feminine (1)
Average Sans
Feminine (1)
Baloo Chettan 2
Feminine (1)
Baloo Paaji 2
Masculine (1)
Baloo Thambi 2
Feminine (1)
Bangers
Feminine (1)
Barlow
Masculine (1)
Bebas Neue
Masculine (2)
Berlin Sans
Feminine (1)
Bevan
Feminine (1)
Bihar
Non-Gendered (1)
Bitter
Feminine (3)
Masculine (2)
Non-Gendered (2)
Bungee
Masculine (1)
Cabin
Feminine (3)
Masculine (2)
Non-Gendered (1)
Cardo
Non-Gendered (1)
Caveat
Feminine (1)
Cinema Insta
Feminine (1)
Cinzel
Feminine (2)
Comfortaa
Feminine (1)
Corben
Masculine (1)
Cormorant
Feminine (1)
Non-Gendered (1)
Cormorant Garamond
Masculine (2)
Non-Gendered (1)
Courgette
Feminine (1)
Crafty Girls
Feminine (1)
Craw Modern
Masculine (1)
Crimson Text
Feminine (2)
Masculine (3)
Non-Gendered (2)
Dancing Script
Feminine (7)
Masculine (1)
Dante
Masculine (1)
Decked Out
Non-Gendered (1)
DM Sans
Masculine (1)
Domine
Masculine (1)
Droid Sans
Non-Gendered (2)
Droid Serif
Masculine (1)
Non-Gendered (3)
EB Garamond
Feminine (1)
Masculine (2)
Non-Gendered (1)
Exo
Feminine (2)
Non-Gendered (1)
Fira Sans
Masculine (1)
Non-Gendered (3)
Fjalla One
Masculine (1)
Non-Gendered (1)
Gentium Basic
Non-Gendered (1)
Gentium Book Basic
Masculine (1)
Grand Hotel
Feminine (1)
Graublau Web
Feminine (1)
Great Vibes
Feminine (2)
Non-Gendered (1)
Haste
Feminine (1)
Homemade Apple
Feminine (1)
Ib Leeanna
Masculine (1)
Ibam DM Serif
Masculine (1)
IBM Plex Serif
Masculine (1)
Impact
Masculine (1)
Inconsolata
Non-Gendered (1)
Indie Flower
Masculine (1)
Non-Gendered (1)
Inter
Non-Gendered (1)
Jim Nightshade
Feminine (1)
Josefin Sans
Feminine (2)
Non-Gendered (2)
Kalam
Feminine (2)
Karla
Feminine (1)
Masculine (2)
Non-Gendered (2)
Kaushan Script
Feminine (2)
Kreon
Masculine (1)
Lato
Feminine (7)
Masculine (9)
Non-Gendered (7)
Lexend Deca
Feminine (1)
Libre Baskerville
Feminine (1)
Non-Gendered (1)
Libre Franklin
Non-Gendered (2)
Lobster
Feminine (3)
Masculine (1)
Non-Gendered (2)
Lolita
Feminine (1)
Lora
Feminine (4)
Masculine (3)
Non-Gendered (7)
Loved by the King
Feminine (1)
Luckiest Guy
Feminine (1)
Magneto
Feminine (1)
Marcellus
Masculine (1)
Mari
Feminine (1)
Maven Pro
Masculine (1)
Non-Gendered (1)
Merriweather
Feminine (9)
Masculine (7)
Non-Gendered (8)
Merriweather Sans
Feminine (1)
Masculine (2)
Non-Gendered (1)
Monda
Masculine (1)
Monoton
Feminine (1)
Montserrat
Feminine (5)
Masculine (9)
Non-Gendered (8)
Montserrat Alternates
Masculine (1)
Non-Gendered (2)
Montserrat Subrayada
Feminine (1)
Muli
Masculine (2)
Non-Gendered (4)
Museo Sans
Masculine (2)
Neuton
Masculine (1)
Noto Sans
Feminine (2)
Masculine (3)
Non-Gendered (7)
Noto Serif
Feminine (3)
Masculine (3)
Non-Gendered (3)
Nunito
Masculine (2)
Non-Gendered (5)
Nunito Sans
Masculine (1)
Non-Gendered (1)
Old Standard TT
Feminine (1)
Masculine (1)
Open Sans
Feminine (8)
Masculine (10)
Non-Gendered (8)
Open Sans Condensed
Feminine (1)
Masculine (2)
Non-Gendered (4)
Orbitron
Feminine (1)
Non-Gendered (1)
Oswald
Feminine (8)
Masculine (10)
Non-Gendered (6)
Ovo
Feminine (1)
Non-Gendered (1)
Oxygen
Non-Gendered (2)
Pacifico
Feminine (7)
Non-Gendered (1)
Patua One
Feminine (1)
Paytone One
Feminine (1)
Pinyon Script
Feminine (1)
Playball
Feminine (2)
Playfair Display
Feminine (10)
Masculine (9)
Non-Gendered (7)
Playfair Display SC
Non-Gendered (1)
Plus Ultra
Masculine (1)
Poppins
Feminine (5)
Masculine (5)
Non-Gendered (5)
Prata
Feminine (1)
PT Sans
Feminine (3)
Masculine (7)
Non-Gendered (6)
PT Sans Narrow
Masculine (1)
Non-Gendered (1)
PT Serif
Feminine (1)
Masculine (7)
Non-Gendered (2)
Quattrocento
Feminine (1)
Quicksand
Feminine (5)
Masculine (2)
Non-Gendered (2)
Raahar
Masculine (1)
Raleway
Feminine (5)
Masculine (10)
Non-Gendered (8)
Reenie Beanie
Feminine (2)
Roboto
Feminine (9)
Masculine (10)
Non-Gendered (8)
Roboto Condensed
Feminine (1)
Masculine (4)
Non-Gendered (3)
Roboto Mono
Non-Gendered (2)
Roboto Slab
Feminine (1)
Masculine (5)
Non-Gendered (6)
Rock Salt
Feminine (1)
Rokkitt
Non-Gendered (1)
Rozha One
Masculine (1)
Rubik
Masculine (3)
Non-Gendered (2)
Sacramento
Feminine (1)
Satisfy
Feminine (4)
Shadows Into Light Two
Feminine (1)
Slabo 13px
Masculine (1)
Slabo 27px
Feminine (1)
Snapdragon
Feminine (1)
Sofia
Masculine (1)
Source Sans Pro
Feminine (4)
Masculine (8)
Non-Gendered (7)
Source Serif Pro
Masculine (1)
Non-Gendered (1)
Spectral
Masculine (1)
Star Jedi
Non-Gendered (1)
System
Non-Gendered (1)
Tangerine
Feminine (3)
Ubuntu
Masculine (3)
Non-Gendered (4)
Ubuntu Condensed
Non-Gendered (2)
Vidaloka
Non-Gendered (1)
Vollkorn
Feminine (1)
Masculine (1)
Non-Gendered (1)
Work Sans
Masculine (1)
Non-Gendered (3)
Yanone Kaffeesatz
Feminine (1)
Yellowtail
Feminine (1)
Zilla Slab
Masculine (1)

Here’s what we found

As a language model (not a visual model), Playground relies on the written content people across the internet have produced and attached to these fonts. Captions describing a font as “elegant and classic, a great fit for your wedding invitations,” or tags like “cute” or “dramatic” or “imposing,” or a blog post stating “We used the font Lato for our branding because it has a clean, masculine look to it” — examples like these would have cropped up across the enormous data set Playground was trained on.

Just like predictive text, Playground tries to give you the most likely answer to follow your question, and for an AI, “most likely” means “most statistically frequent, given these parameters”. Playground, and other generative AIs trained on the content of the internet, will reproduce the biases that show up in that content.
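
To make the predictive-text comparison concrete, here is a toy sketch that picks the next word purely by how often it followed the previous word in a tiny, made-up corpus. Real language models are far more sophisticated than this, and the corpus and function name are invented for illustration, but the sketch shows how a skew in the source text becomes a skew in the output.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus standing in for captions and blog posts about fonts.
corpus = [
    "lato has a clean masculine look",
    "roboto has a clean masculine look",
    "pacifico has a soft feminine look",
    "lato has a clean modern look",
]

# Count which word follows each word.
following = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1


def most_likely_next(word):
    """Return the continuation seen most often after `word` in the corpus."""
    return following[word].most_common(1)[0][0]


print(most_likely_next("clean"))
print(most_likely_next("soft"))
```

Because "clean" is followed by "masculine" more often than by anything else in this corpus, the toy model repeats that association every time.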

(Jump to our footnotes to read more about the risks and strategies for dealing with this issue.)

With that in mind, the results aren’t surprising. Let’s take a look at our five bullet points from above, and see if any of them match up to human biases we should beware of when using generative text:

“Feminine” fonts had more unique fonts per data set.

This doesn’t seem particularly conclusive. Does this point to more fonts across the internet being marketed or recommended to women? Does it point to female designers or female creatives being more common than other genders in those industries, or rather to more designs being aimed at women? This is vague enough that it doesn’t really seem to represent a bias.

“Non-gendered” fonts were described in the widest variety of terms.

Like the above bullet point, this may not mean much, but it could point to some uncertainty in the design world about how to market to an explicitly gender-neutral audience. Some of the descriptors were also fairly unusual to see applied to a font: “adaptable,” “compatible,” “non-assuming,” etc. It’s possible that Playground, like many humans, may struggle to accurately portray the concept of being gender-neutral.

“Masculine” fonts were the most likely to be hallucinations.

But given the small percentage of fonts that were hallucinations at all (4.5%), and the even smaller percentage that occurred specifically in the Masculine category (2.6%), this doesn’t seem like a bias either.

Handwriting fonts were significantly overrepresented in the Feminine category.

This one does potentially point to a bias, although a much larger data set would be needed to demonstrate anything truly conclusive. With at least five times as many handwriting fonts in the Feminine category (25) as in either of the other categories (4 in Masculine, 5 in Non-Gendered), it’s safe to say that Playground may have a little trouble thinking outside the artsy, elegant Handwriting box for women.

The one time a monospace font occurred, it was considered “gender-neutral”.

One data point isn’t a trend. But there is a long-standing narrative trend of nonbinary representation being limited to characters who are robots, aliens, or otherwise non-human, and seeing the lone instance of a monospace, code-like font fall into the Non-Gendered category may raise some eyebrows. As we said before, Playground may occasionally reproduce some of the internet’s clumsy characterizations of a gender-neutral demographic.

Font names containing gendered terms didn’t always line up with the matching gender category.

This was a very odd scatter of data points. While activities and terms gendered female did align with the Feminine category a little more frequently, terms gendered male also more frequently occurred in the Feminine category, and activities gendered male were evenly split between all three categories. When it came to names, female names most frequently ended up in the Masculine category along with male names (although gender-neutral names did most frequently occur in the Non-Gendered category).

If anything, the emerging trends were that: font names containing a gendered activity or term, regardless of that activity or term’s specific gender, were most likely to occur in the Feminine category; font names containing either a male or female name were most likely to occur in the Masculine category. If this demonstrates a bias, we’re not really sure what it is.

  • 89 fonts in the Feminine category, including overlap (57% of suggested fonts)
  • 76 fonts in the Masculine category, including overlap (49% of suggested fonts)
  • 75 fonts in the Non-Gendered category, including overlap (48% of suggested fonts)

Other things we noticed

  • 0.46: ratio of unique fonts to total fonts in the Feminine category (91/200 unique fonts)
  • 0.38: ratio of unique fonts to total fonts in the Masculine category (76/200 unique fonts)
  • 0.38: ratio of unique fonts to total fonts in the Non-Gendered category (75/200 unique fonts)
  • 0.33: ratio of unique descriptors to total descriptors in the Feminine category (98/300 unique descriptors)
  • 0.35: ratio of unique descriptors to total descriptors in the Masculine category (105/300 unique descriptors)
  • 0.42: ratio of unique descriptors to total descriptors in the Non-Gendered category (127/300 unique descriptors)
  • 14% of hallucinations occurred in the Feminine category
  • 57% of hallucinations occurred in the Masculine category
  • 29% of hallucinations occurred in the Non-Gendered category

* Most fonts suggested did exist! The ratio of hallucinations to all suggested fonts was 0.045.
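
The descriptor ratios and hallucination shares above are simple divisions, sketched below using the counts reported on this page. The helper name `unique_ratio` is an assumption, and the total of 155 distinct fonts is taken from the sum of the overlap counts reported earlier.

```python
def unique_ratio(unique, total):
    """Ratio of unique items to total items, rounded to two decimals."""
    return round(unique / total, 2)


# 100 pairings per category, three descriptors each = 300 descriptors.
descriptor_ratios = {
    "Feminine": unique_ratio(98, 300),
    "Masculine": unique_ratio(105, 300),
    "Non-Gendered": unique_ratio(127, 300),
}

# Hallucinated fonts per category, as reported in the classification data.
hallucinations = {"Feminine": 1, "Masculine": 4, "Non-Gendered": 2}
total_hallucinations = sum(hallucinations.values())
hallucination_shares = {
    category: round(100 * count / total_hallucinations)
    for category, count in hallucinations.items()
}

TOTAL_DISTINCT_FONTS = 155  # sum of the overlap counts reported earlier
overall_hallucination_ratio = round(total_hallucinations / TOTAL_DISTINCT_FONTS, 3)

print(descriptor_ratios, hallucination_shares, overall_hallucination_ratio)
```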

Classification: Number of Occurrences

  • Display: Feminine (14), Masculine (5), Non-Gendered (3)
  • Hallucination: Feminine (1), Masculine (4), Non-Gendered (2)
  • Handwriting: Feminine (25), Masculine (4), Non-Gendered (5)
  • Monospace: Non-Gendered (1)
  • Sans: Feminine (28), Masculine (35), Non-Gendered (41)
  • Serif: Feminine (21), Masculine (28), Non-Gendered (23)

Feminine

  • Sans (28)
  • Handwriting (25)
  • Serif (21)
  • Display (14)
  • Hallucination (1)
  • Monospace (0)

Masculine

  • Sans (35)
  • Serif (28)
  • Display (5)
  • Hallucination (4)
  • Handwriting (4)
  • Monospace (0)

Non-Gendered

  • Sans (41)
  • Serif (23)
  • Handwriting (5)
  • Display (3)
  • Hallucination (2)
  • Monospace (1)
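
The ranked lists above can be regenerated from the classification counts, and the same data gives each classification's share within a category. This is a minimal sketch using the counts reported on this page; the function names `ranked` and `share` are assumptions.

```python
# Classification counts per category, as reported above.
classification_counts = {
    "Display": {"Feminine": 14, "Masculine": 5, "Non-Gendered": 3},
    "Hallucination": {"Feminine": 1, "Masculine": 4, "Non-Gendered": 2},
    "Handwriting": {"Feminine": 25, "Masculine": 4, "Non-Gendered": 5},
    "Monospace": {"Non-Gendered": 1},
    "Sans": {"Feminine": 28, "Masculine": 35, "Non-Gendered": 41},
    "Serif": {"Feminine": 21, "Masculine": 28, "Non-Gendered": 23},
}


def ranked(category):
    """Classifications for one category, most frequent first (ties by name)."""
    pairs = [
        (name, counts.get(category, 0))
        for name, counts in classification_counts.items()
    ]
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


def share(classification, category):
    """Percentage of a category's fonts that fall into one classification."""
    total = sum(c.get(category, 0) for c in classification_counts.values())
    return round(100 * classification_counts[classification].get(category, 0) / total)


print(ranked("Feminine"))
for category in ("Feminine", "Masculine", "Non-Gendered"):
    print(category, share("Handwriting", category))
```

Looking at shares rather than raw counts is a design choice: the categories contain slightly different numbers of distinct fonts, so percentages make the handwriting skew easier to compare.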

Gendered Term: Number of Occurrences

  • gendered activity - female: Feminine (4), Masculine (1)
  • gendered activity - male: Feminine (1), Masculine (2), Non-Gendered (1)
  • gendered term - female: Feminine (2), Masculine (1), Non-Gendered (1)
  • gendered term - male: Feminine (3), Masculine (1)
  • name - female: Feminine (13), Masculine (14), Non-Gendered (12)
  • name - gender-neutral: Feminine (11), Masculine (6), Non-Gendered (10)
  • name - male: Feminine (18), Masculine (21), Non-Gendered (20)
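
The aggregate pattern noted earlier (gendered activities and terms leaning Feminine, male and female names leaning Masculine) can be re-derived by summing the rows above. This is a small sketch over the reported counts; the helper name `totals` is an assumption.

```python
from collections import Counter

# Tag counts per category, as reported above.
tag_counts = {
    "gendered activity - female": {"Feminine": 4, "Masculine": 1},
    "gendered activity - male": {"Feminine": 1, "Masculine": 2, "Non-Gendered": 1},
    "gendered term - female": {"Feminine": 2, "Masculine": 1, "Non-Gendered": 1},
    "gendered term - male": {"Feminine": 3, "Masculine": 1},
    "name - female": {"Feminine": 13, "Masculine": 14, "Non-Gendered": 12},
    "name - gender-neutral": {"Feminine": 11, "Masculine": 6, "Non-Gendered": 10},
    "name - male": {"Feminine": 18, "Masculine": 21, "Non-Gendered": 20},
}


def totals(tags):
    """Sum the per-category counts for a group of tags."""
    result = Counter()
    for tag in tags:
        result.update(tag_counts[tag])
    return result


activity_or_term = totals(t for t in tag_counts if t.startswith("gendered"))
gendered_names = totals(["name - female", "name - male"])

print(activity_or_term.most_common())
print(gendered_names.most_common())
```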

Why does it matter?

Asking Playground for a ton of fonts to which it has assigned some type of gendered meaning is a fairly silly request. It doesn’t really matter if an AI thinks a handwriting font looks feminine or not.

But it does matter when an AI learns to automatically associate gendered content in ways that cost someone a job or a credit card application. It matters when an AI identifies skin tones in ways that could get someone hit by a car or automatically flag them as a criminal or animal. These are actual, ongoing issues that AI engineers are trying to solve and prevent even after their software has exploded into widespread use in government, healthcare, finance, and other vital industries.

This site is just for fun

Tools like Playground are a lot of fun, and powerful brainstorming tools as well. But they’re also moving incredibly fast, and picking up everything people have been putting on the internet for the 40 years that the internet has existed — ugliness as well as creativity. And until AI engineers are able to effectively filter and refine bias out of the datasets, the rest of us need to be vigilant, review what we produce, and ensure that our use of these tools doesn’t perpetuate any further harm.

Further reading

The strategies:
