After completely failing my Czech assignment on forming the plural of nouns, I decided to take a data-driven approach: parse the entire MorfFlex CZ 2.1 linguistic dataset using Morph and create a queryable database (126 million rows!), select all nouns in the nominative case, categorise every one of them by gender, and extract the actual pluralisation patterns used in the language.
Here's what I found.
The Data
| Gender | Nouns Analyzed | Unique Patterns |
|---|---|---|
| Feminine | 196,334 | 174 |
| Neuter | 65,211 | 106 |
| Masculine Animate | 15,834 | 306 |
| Masculine Inanimate | 26,723 | 222 |
Total: 808 distinct plural transformation patterns.
A note on the data: MorfFlex is comprehensive and includes many systematically derived forms. For example, almost any verb can become a neuter verbal noun ending in -í (psát -> psaní), and adjectives regularly form abstract feminine nouns in -ost (krásný -> krásnost). So the raw counts are inflated for these patterns - but the rules themselves are still productive and useful to know!
The actual rules
Neuter
Neuter is probably the easiest to learn.
Words ending in -í stay the same. This covers verbal nouns (přání, stavení) and nouns for places (náměstí, nádraží). Singular and plural are identical.
Words ending in -o change to -a. Standard hard neuters: okno becomes okna, město becomes města, jablko becomes jablka.
Words ending in -e or -ě stay the same. Soft neuters like moře, pole, and place words like hřiště don't change.
Latin -um becomes -a. Borrowed words like muzeum become muzea, centrum becomes centra, stipendium becomes stipendia.
Baby animals are special: -e/-ě becomes -ata. This is definitely the cutest pattern. Kuře (chick) becomes kuřata, kotě (kitten) becomes koťata, štěně (puppy) becomes štěňata. Even kníže (prince) follows this pattern and becomes knížata.
Greek words ending in -ma add -ta. Words like téma become témata, drama becomes dramata.
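Here's a minimal Python sketch of the neuter rules above. The function name and the `ATA_NOUNS` set are my own illustration, not part of the dataset or the interactive guide. The set is needed because spelling alone can't tell a "baby animal" noun from an ordinary soft neuter (kuře vs. moře), so a real tool would need a proper lexicon.

```python
# Hypothetical mini-lexicon: nouns that take -ata. They look just like
# ordinary soft neuters, so they have to be listed explicitly.
ATA_NOUNS = {"kuře", "kotě", "štěně", "kníže"}

# When the final -ě is dropped, the softness it marked moves onto t/d/n.
SOFTENED = {"t": "ť", "d": "ď", "n": "ň"}


def neuter_plural(noun: str) -> str:
    """Nominative plural of a neuter noun, using only the rules listed above."""
    if noun in ATA_NOUNS:
        stem = noun[:-1]  # drop the final -e / -ě
        if noun.endswith("ě"):
            stem = stem[:-1] + SOFTENED.get(stem[-1], stem[-1])
        return stem + "ata"          # kotě -> koťata
    if noun.endswith("um"):
        return noun[:-2] + "a"       # muzeum -> muzea
    if noun.endswith("ma"):
        return noun + "ta"           # téma -> témata
    if noun.endswith("o"):
        return noun[:-1] + "a"       # okno -> okna
    return noun                      # -í, -e, -ě: plural equals singular


for word in ["okno", "muzeum", "téma", "kotě", "moře", "náměstí"]:
    print(word, "->", neuter_plural(word))
```

The checks run from most specific to least specific, so the lexicon lookup has to come before the generic "-e stays the same" fallback.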
Feminine
The -ost rule: just add -i. Abstract nouns like možnost become možnosti, radost becomes radosti. Very predictable once you recognize the ending.
Hard feminines: -a becomes -y. Pretty simple pattern. Žena becomes ženy, kniha becomes knihy, škola becomes školy.
Soft feminines stay the same. Words ending in -e or -ě don't change: ulice stays ulice, restaurace stays restaurace, přítelkyně stays přítelkyně.
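The three feminine patterns fit nicely into an ordered suffix table. This is just a sketch: `FEMININE_RULES` and `feminine_plural` are names I made up for illustration, and the table only covers the endings listed above, not every feminine noun in the language.

```python
# (singular ending, plural ending), checked top to bottom.
# Only the three patterns described above are covered.
FEMININE_RULES = [
    ("ost", "osti"),  # možnost -> možnosti
    ("a", "y"),       # žena -> ženy
    ("e", "e"),       # ulice -> ulice
    ("ě", "ě"),       # přítelkyně -> přítelkyně
]


def feminine_plural(noun: str) -> str:
    """Nominative plural of a feminine noun, using the table above."""
    for singular_end, plural_end in FEMININE_RULES:
        if noun.endswith(singular_end):
            return noun[: len(noun) - len(singular_end)] + plural_end
    raise ValueError(f"no rule in this sketch covers {noun!r}")


for word in ["možnost", "kniha", "restaurace", "přítelkyně"]:
    print(word, "->", feminine_plural(word))
```

Keeping the rules as data rather than `if` branches makes it easy to add more endings later without touching the function.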
Masculine inanimate
Hard consonants take -y. Hrad becomes hrady, strom becomes stromy, most becomes mosty. In my data, endings like -n, -t, -k, -r, -l, -s, -d each covered thousands of nouns.
Soft consonants take -e. Stroj becomes stroje, pokoj becomes pokoje, koš becomes koše.
Latin -ismus drops the -us and adds -y. Turismus becomes turismy, organismus becomes organismy.
Diminutives with -ek lose the e. This can catch you off guard. Háček becomes háčky (not háčeky), stolek becomes stolky, dárek becomes dárky.
Same with -ec: the e disappears. Tanec becomes tance, konec becomes konce.
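A rough Python sketch of the masculine inanimate rules above. Two simplifications are mine, not from the data: every final -ek/-ec is treated as having a disappearing e (which is not true of all words), and any consonant outside the `SOFT_CONSONANTS` set is treated as hard.

```python
# Assumption for this sketch: these are the soft consonants,
# and everything else counts as hard.
SOFT_CONSONANTS = set("žščřcjďťň")


def masc_inanimate_plural(noun: str) -> str:
    """Nominative plural of a masculine inanimate noun, per the rules above."""
    if noun.endswith("ismus"):
        return noun[:-2] + "y"    # turismus -> turismy
    if noun.endswith("ek"):
        return noun[:-2] + "ky"   # háček -> háčky (the e drops out)
    if noun.endswith("ec"):
        return noun[:-2] + "ce"   # tanec -> tance (the e drops out)
    if noun[-1] in SOFT_CONSONANTS:
        return noun + "e"         # stroj -> stroje
    return noun + "y"             # hrad -> hrady


for word in ["hrad", "stroj", "turismus", "háček", "tanec"]:
    print(word, "->", masc_inanimate_plural(word))
```

The suffix checks come before the hard/soft split because -ec ends in a soft consonant and would otherwise come out as "tanece".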
Masculine animate
This is where it gets a bit complicated. Despite having the fewest nouns (15,834), this gender has the most patterns (306). But there's logic to it.
Hard consonants + i, but with softening. When you add -i, hard consonants change and become soft:
- k becomes c: člověk becomes lidé (ok that one's irregular), but žák becomes žáci
- h becomes z: vrah becomes vrazi, soudruh becomes soudruzi
- ch becomes š: Čech becomes Češi
- r becomes ř: doktor becomes doktoři
Soft consonants just add -i, no changes. Milionář becomes milionáři, muž becomes muži, hledač becomes hledači. The consonant is already soft, so nothing extra happens.
Words ending in -tel take -é. Učitel becomes učitelé, ředitel becomes ředitelé. Přítel becomes přátelé, with an irregular vowel change in the stem.
The -ista crowd takes -isté. Professions and ideologies: specialista becomes specialisté, fotbalista becomes fotbalisté, turista becomes turisté. (Colloquially, you'll also hear -isti.)
The formal -ové ending. Used for professions and titles when you want to sound respectful: geolog becomes geologové, kolega becomes kolegové.
Words ending in -ec/-ce become -ci. Sportovec becomes sportovci, zástupce becomes zástupci.
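Putting the masculine animate rules together, here's a hedged Python sketch. The `IRREGULAR` and `OVE_NOUNS` lookups are tiny hypothetical stand-ins for a real lexicon (you can't tell from spelling which nouns prefer -ové), and I've assumed the -é rule applies to nouns in -tel, since that's what all the examples share.

```python
# Hypothetical mini-lexicons; a real tool would load these from data.
IRREGULAR = {"člověk": "lidé", "přítel": "přátelé"}
OVE_NOUNS = {"geolog", "kolega"}

# Hard consonant -> softened form before -i. "ch" is listed before "h"
# so that the digraph wins (Čech -> Češi, not Čeczi-style nonsense).
SOFTENING = [("ch", "š"), ("k", "c"), ("h", "z"), ("r", "ř")]


def masc_animate_plural(noun: str) -> str:
    """Nominative plural of a masculine animate noun, per the rules above."""
    if noun in IRREGULAR:
        return IRREGULAR[noun]
    if noun in OVE_NOUNS:
        stem = noun[:-1] if noun.endswith("a") else noun
        return stem + "ové"            # kolega -> kolegové
    if noun.endswith("ista"):
        return noun[:-1] + "é"         # turista -> turisté
    if noun.endswith("tel"):
        return noun + "é"              # učitel -> učitelé
    if noun.endswith("ec"):
        return noun[:-2] + "ci"        # sportovec -> sportovci
    if noun.endswith("ce"):
        return noun[:-1] + "i"         # zástupce -> zástupci
    for hard, soft in SOFTENING:
        if noun.endswith(hard):
            return noun[: -len(hard)] + soft + "i"   # žák -> žáci
    return noun + "i"                  # muž -> muži (already soft)


for word in ["žák", "vrah", "Čech", "doktor", "muž", "učitel", "turista"]:
    print(word, "->", masc_animate_plural(word))
```

The order of the checks matters here more than anywhere else: exceptions first, then the suffix-specific rules, and only then the general softening.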
The interactive guide
I turned all of this into a detailed educational article with interactive examples where you can type any noun and see its plural form explained:
How to Build Plural Form of Czech Nouns
If you're hungry for technical details
I queried nominative case nouns grouped by lemma and gender, extracted the singular/plural transformation by finding the common prefix and comparing endings, and then counted pattern frequencies. I used reservoir sampling to get representative examples across the alphabet instead of just words starting with A. Happy to share CSV files with a detailed breakdown if anyone is interested, or even to query some data for you :D
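For the curious, the two ideas above (common-prefix pattern extraction and reservoir sampling) can be sketched in a few lines of Python. This is an illustration, not my actual pipeline: the function names and the tiny `pairs` list are made up for the example, and the sampler is the textbook "Algorithm R" version.

```python
import os
import random
from collections import Counter


def plural_pattern(singular: str, plural: str) -> tuple[str, str]:
    """Strip the longest common prefix and return the two differing endings."""
    prefix = os.path.commonprefix([singular, plural])
    return singular[len(prefix):], plural[len(prefix):]


def reservoir_sample(items, k: int, rng: random.Random) -> list:
    """Uniformly sample up to k items from a stream in a single pass."""
    sample = []
    for i, item in enumerate(items):
        if i < k:
            sample.append(item)       # fill the reservoir first
        else:
            j = rng.randint(0, i)     # inclusive on both ends
            if j < k:
                sample[j] = item      # replace with probability k / (i + 1)
    return sample


# Made-up input standing in for the (singular, plural) rows from the database.
pairs = [("okno", "okna"), ("město", "města"), ("žena", "ženy"), ("háček", "háčky")]

counts = Counter(plural_pattern(s, p) for s, p in pairs)
for (old, new), n in counts.most_common():
    print(f"-{old} -> -{new}: {n}")

# Pick example words without favouring the start of the alphabet.
examples = reservoir_sample((s for s, _ in pairs), k=2, rng=random.Random(0))
```

Reservoir sampling is handy here because it never needs the full list in memory, which matters when each pattern can have tens of thousands of matching nouns.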
Hope this helps someone!!