Langua

A suite of language tools

LanguaGen Help

LanguaGen is a tool for automatically building a set of words based on arbitrary rules of phonotactics. This can be used to create a dummy vocabulary for linguistic experimentation, to generate words or names for a naming language in a work of fantasy or science fiction, or as the basis for building the vocabulary of a constructed language.

Using LanguaGen

The most important part of the tool is the Pattern field, which defines how words are formed using the syntax described below. It can contain specific letters, references to Subpatterns, or both.

Up to 26 Subpatterns are available for use. These subpatterns are defined in exactly the same way as the Pattern. Their power lies in the fact that each Subpattern can be referenced by the Pattern, allowing significantly more variability in word generation.

Most commonly, a Subpattern represents a class of phonemes, while the Pattern represents the possible combinations of those phoneme classes. For example, one might use a Subpattern named V to represent vowels, C to represent consonants, and N to represent nasals. Alternatively, one might use a Subpattern named O to represent the syllable onset, N to represent the nucleus, and C to represent the coda. Subpatterns can be used in many other ways as well.

Syntax

The syntax is identical for all Pattern and Subpattern fields. Uppercase versions of the standard English letters (e.g. C, V, or N) are variables referring to Subpatterns while any other letter (e.g. a, s, or n) represents that specific glyph.

Options – /

Multiple options can be separated using forward slashes (/). For each word, the tool will randomly select one of the options. By default, each option has the same chance of being chosen. This can be changed by assigning weights. For example, with the Subpatterns V: a/i, C: t/s, and N: m/n and the Pattern CVN, the tool can output any of the eight words tam tan tim tin sam san sim sin.
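The CVN example can be checked with a short Python sketch. This is only an illustration, not LanguaGen's actual code: the function name expand and the dictionary layout are assumptions, and the sketch handles only slashes and uppercase Subpattern references.

```python
from itertools import product

def expand(pattern, subpatterns):
    """Return every word a pattern can produce (options and Subpattern references only)."""
    words = []
    for option in pattern.split("/"):
        # Each character is either a reference to a defined Subpattern or a literal glyph.
        parts = [
            expand(subpatterns[ch], subpatterns) if ch in subpatterns else [ch]
            for ch in option
        ]
        # Combine one choice from every position to build the words for this option.
        words.extend("".join(combo) for combo in product(*parts))
    return words

subpatterns = {"V": "a/i", "C": "t/s", "N": "m/n"}
words = expand("CVN", subpatterns)
print(" ".join(words))
```

The sketch lists all possibilities, whereas the tool picks one option at random for each word it generates.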

Single Units – [ ]

This functionality has not yet been implemented.

Everything contained within brackets ([ ]) is treated as a single unit. For example, the Pattern as[tu/top/kan] can produce astu, astop, or askan. Brackets and parentheses can be nested to any depth.

Optional Units – ( )

This functionality has not yet been implemented.

Everything contained within parentheses ( ) is treated as a single unit that is optional. For example, the Pattern as(tu/top/kan) can produce as, astu, astop, or askan. Parentheses and brackets can be nested to any depth.
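One way to read the bracket and parenthesis rules is as a small recursive grammar. The Python sketch below lists every word such a pattern could produce. The names parse_options and expand_units are hypothetical, Subpattern references are left out for brevity, and the sketch assumes the brackets are balanced.

```python
def parse_options(s, i=0):
    """Parse options starting at s[i]; return (possible words, index where parsing stopped)."""
    words = []
    seq = [""]  # words the current option can produce so far
    while i < len(s) and s[i] not in "])":
        ch = s[i]
        if ch == "/":
            # End of one option: keep its words and start the next option.
            words.extend(seq)
            seq = [""]
            i += 1
        elif ch in "[(":
            inner, i = parse_options(s, i + 1)
            i += 1  # skip the closing bracket or parenthesis
            if ch == "(":
                inner = [""] + inner  # an optional unit may also produce nothing
            seq = [a + b for a in seq for b in inner]
        else:
            seq = [a + ch for a in seq]
            i += 1
    words.extend(seq)
    return words, i

def expand_units(pattern):
    """Return every word a pattern with [ ] and ( ) units can produce."""
    return parse_options(pattern)[0]

print(expand_units("as[tu/top/kan]"))
print(expand_units("as(tu/top/kan)"))
```

Because each unit is parsed by a recursive call, nesting such as a[b(c)] needs no special handling.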

Weights – *

This functionality has not yet been implemented.

Weights can be added to individual options using an asterisk (*) followed by a number, which increases the likelihood of that option being chosen. For example, with the Pattern a/e*3/i*2/o/u*5, the tool would output a or o with equal probability. Compared with either of these, it would be twice as likely to output i, three times as likely to output e, and five times as likely to output u. A weight must be an integer between 1 and 128.
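The weights in this example add up to 12, so a and o each have a 1 in 12 chance, i has 2 in 12, e has 3 in 12, and u has 5 in 12. The Python sketch below works this out and draws weighted samples. It assumes that an option without an asterisk has a weight of 1, and the helper name parse_weighted is hypothetical.

```python
import random
from collections import Counter
from fractions import Fraction

def parse_weighted(pattern):
    """Split 'a/e*3/...' into (choices, weights); an option with no weight counts as 1."""
    choices, weights = [], []
    for option in pattern.split("/"):
        text, _, weight = option.partition("*")
        w = int(weight) if weight else 1
        if not 1 <= w <= 128:
            raise ValueError(f"weight {w} is outside the range 1 to 128")
        choices.append(text)
        weights.append(w)
    return choices, weights

choices, weights = parse_weighted("a/e*3/i*2/o/u*5")
total = sum(weights)
probabilities = {c: Fraction(w, total) for c, w in zip(choices, weights)}
print(probabilities)

# Draw many samples; the counts should roughly follow the weights.
rng = random.Random(0)
sample = rng.choices(choices, weights=weights, k=12000)
print(Counter(sample))
```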

Filtering – ^

This functionality has not yet been implemented.

Filtering can be added to units using a caret (^) followed by a potential output, which removes that output from the possible results. Multiple filters can be added to the same unit to filter out multiple options. For example, with the Subpattern V: a/i/u and the Pattern t[VV]^aa^ii^uu, the tool will output tai tau tia tiu tua tui, filtering out taa tii tuu.
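The filtering example can be pictured as expanding the unit first and then discarding the listed outputs. The Python sketch below hard-codes the unit [VV] as two vowels instead of parsing it. The variable names are illustrative only, and the tool may apply filters differently.

```python
from itertools import product

vowels = "a/i/u".split("/")

# Split the filtered unit "[VV]^aa^ii^uu" into the unit and its filters.
unit_source, *filters = "[VV]^aa^ii^uu".split("^")

# Expand [VV]: every ordered pair of vowels.
unit_outputs = ["".join(pair) for pair in product(vowels, repeat=2)]

# Drop any output of the unit that matches one of the filters.
allowed = [u for u in unit_outputs if u not in filters]

# The literal t precedes the filtered unit in the Pattern.
words = ["t" + u for u in allowed]
print(" ".join(words))
```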

Escaping Characters – " "

This functionality has not yet been implemented.

Special characters otherwise used for the tool’s syntax can be escaped using double quotes " ". For example, with the Subpattern V: a/i/u and the Pattern Vt"V[n/m]", the tool will produce the output atV[n/m] itV[n/m] utV[n/m].
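One way to picture escaping is as a tokenizing step that sets quoted text aside before any other syntax is interpreted. The Python sketch below does this for the example above. The function names tokenize and expand_tokens are hypothetical, and outside quotes the sketch interprets only Subpattern references.

```python
def tokenize(pattern):
    """Split a pattern into ('literal', text) and ('syntax', char) tokens."""
    tokens = []
    i = 0
    while i < len(pattern):
        if pattern[i] == '"':
            end = pattern.find('"', i + 1)
            if end == -1:
                raise ValueError("unterminated double quote")
            # Everything between the quotes is kept exactly as written.
            tokens.append(("literal", pattern[i + 1:end]))
            i = end + 1
        else:
            tokens.append(("syntax", pattern[i]))
            i += 1
    return tokens

def expand_tokens(tokens, subpatterns):
    """Expand Subpattern references in syntax tokens; literal tokens pass through unchanged."""
    words = [""]
    for kind, text in tokens:
        if kind == "syntax" and text in subpatterns:
            choices = subpatterns[text].split("/")
        else:
            choices = [text]
        words = [w + c for w in words for c in choices]
    return words

tokens = tokenize('Vt"V[n/m]"')
words = expand_tokens(tokens, {"V": "a/i/u"})
print(" ".join(words))
```

The V inside the quotes is never looked up as a Subpattern because it arrives as part of a literal token.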

Generating Words

Once the Pattern and all Subpatterns have been set up, you can click the Generate button to generate words. There are also several adjustments you can make, including the total number of words that should be generated, whether each word should be written on a new line, and whether the tool should filter out duplicate words.

After the words have been generated, some statistics are shown below, including how many words were printed to the output, how many duplicates were filtered out of the results (if any), and how many words are possible based on the given Pattern and Subpatterns. (Note that the number of unique words possible may actually be lower if there are multiple ways to obtain the same word.)
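The gap between the number of possible words and the number of unique words shows up whenever two different combinations spell the same word. The Python sketch below uses two made-up Subpatterns, X: a/ab and Y: bc/c, with the Pattern XY. The variable names and the duplicate-filtering step are assumptions about how such statistics could be computed, not the tool's own code.

```python
import random
from itertools import product

x = "a/ab".split("/")
y = "bc/c".split("/")

# Every combination of one option from X and one from Y.
combos = [a + b for a, b in product(x, y)]
possible = len(x) * len(y)   # combinations counted from the Pattern
unique = len(set(combos))    # distinct words: a+bc and ab+c both spell "abc"
print(possible, unique)

# Generate a requested number of words, then filter out duplicates.
rng = random.Random(1)
requested = 10
drawn = [rng.choice(x) + rng.choice(y) for _ in range(requested)]
printed = list(dict.fromkeys(drawn))  # keeps the first occurrence of each word
duplicates_filtered = len(drawn) - len(printed)
print(len(printed), duplicates_filtered)
```

Here the Pattern allows four combinations but only three distinct words, so a request for ten words with duplicate filtering on can print at most three.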

Saving and Loading Settings

Clicking the Save button will save the current settings to the browser’s local storage and generate a small .lngg text file containing the current settings that can be saved to your disk. This .lngg file can be loaded using the Open button to reload saved settings.

Acknowledgments

Many thanks to Petr Mejzlík and Awkwords. LanguaGen was built mainly as a modernized and updated version of Awkwords.