[accepted, implemented] Context-free Grammar for unit names
- skeptical_troll
- Posts: 503
- Joined: August 31st, 2015, 11:06 pm
Re: How about using Context-free Grammar to generate unit na
Just out of curiosity, is the current algorithm based on MCMC on letters, not syllables or blocks? If the length is the problem, isn't it just possible to include it in the likelihood by hand so that the probability of long names goes down?
Re: How about using Context-free Grammar to generate unit na
You can find the current implementation in /src/race.cpp.
skeptical_troll wrote: Just out of curiosity, is the current algorithm based on MCMC on letters, not syllables or blocks? If the length is the problem, isn't it just possible to include it in the likelihood by hand so that the probability of long names goes down?
I wonder if there is already an existing library which does exactly the opposite of what GNU flex does.
Re: How about using Context-free Grammar to generate unit na
If you are too lazy to look it up: it takes a pair of letters to pick what will follow. It can be configured to use the last triplet or more, but that is not used anywhere as far as I know.
skeptical_troll wrote: Just out of curiosity, is the current algorithm based on MCMC on letters, not syllables or blocks? If the length is the problem, isn't it just possible to include it in the likelihood by hand so that the probability of long names goes down?
I have tried the algorithm with the next letter determined from the previous single letter, but it sucked.
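For readers who want to see the idea, here is a minimal sketch of a letter-level Markov generator in which the last `order` letters pick the next one. It is not the code from race.cpp; the function names, the `^`/`$` padding markers and the `max_len` cap are assumptions made for illustration. The cap is one crude way to keep names from running long.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # hypothetical padding markers, not taken from Wesnoth

def build_model(names, order=2):
    """Map each `order`-letter context to the letters seen right after it."""
    model = defaultdict(list)
    for name in names:
        padded = START * order + name.lower() + END
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate(model, order=2, rng=random, max_len=12):
    """Walk the chain from the start context until END or max_len letters."""
    context, letters = START * order, []
    while len(letters) < max_len:
        nxt = rng.choice(model[context])
        if nxt == END:
            break
        letters.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(letters).capitalize()

model = build_model(["Lila", "Anne", "Alena"], order=2)
print(generate(model, order=2))
```

With `order=1` the same functions give the single-letter variant mentioned above, so the two can be compared on the same name list.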
EDIT:
Some improved grammars now need a significantly higher average number of recruits before a pair of namesakes appears than before (no experiment was run, because this can be estimated mathematically; I am adding rough estimates):
Male elves (around 70)
Female elves (around 70)
Male humans (around 80)
Female humans (around 80)
Orcs (around 90)
Other grammars were already better than the current Markov generator.
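The "recruits until a pair of namesakes" figure is a birthday-problem estimate. A rough sketch of how such a number can be computed, assuming every name a grammar can produce is equally likely (an assumption; a real grammar with repeated or weighted alternatives would produce namesakes sooner). The function names are made up for this example.

```python
import math

def expected_recruits_until_namesake(n_names):
    """Exact expected number of recruits until the first repeated name,
    assuming n_names equally likely names."""
    expected, p_all_distinct = 0.0, 1.0
    for k in range(n_names + 1):
        expected += p_all_distinct        # P(first k recruits all differ)
        p_all_distinct *= 1 - k / n_names
    return expected

def names_needed(target_recruits):
    """Invert the approximation sqrt(pi * N / 2) to estimate N."""
    return round(2 * target_recruits ** 2 / math.pi)

n = names_needed(70)
print(n, expected_recruits_until_namesake(n))
```

Under that assumption, an average of around 70 recruits corresponds to a grammar with roughly 3,100 distinct names.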
Some further improvements were made, and the pull request was accepted. It will be part of Wesnoth 1.13.5 and later.
Re: [accepted, implemented] Context-free Grammar for unit na
I have finally been playing with translating the name generation. I am running into a few difficulties with the town names:
- I have prefixed the base names with "XXX", but I get names with "XX", "XXXX" or "XXXXXX" in them. This should not happen - I should see "XXX" only here.
- I have prefixed the rule-generated base names with "NOCOM". I don't see any of those at all.
- Segmentation of base names is broken. Seems like blank space is used in addition to "," for parsing the names into a list, resulting in nonsense names.
- There are town names that consist of base names only. Since my base names need to be in the genitive case for the composition rules, I need to get rid of pure base town names without any prefixes. I haven't found a way to do that.
wesnoth textdomain, gd locale.
- Attachments
-
- wesnoth.zip
- (173.6 KiB) Downloaded 566 times
Re: [accepted, implemented] Context-free Grammar for unit na
Hello,
All mentions that I have contributed something at all were removed from my forum profile, so this is not my responsibility any more.
However, because it's you, I will give you some advice. The implementation tries to find the name generator. If it fails to find it, it falls back to the old Markov chains. Markov-chain-generated names work like this: if you add XXX before all base names, the result may have a random number of X letters in it. You have prefixed the context-free-grammar-generated village names with NOCOM, but the code is messed up (the second line is missing a newline), so the parsing fails and it falls back to the old method. I do not know how the old name system is implemented, so I have no idea what can be broken in it and cause the segmentation issue.
HTH,
-Dugi
Re: [accepted, implemented] Context-free Grammar for unit na
Thanks, Dugi. Seems like that's not the only thing that's broken in my code though, so I need to find a way to really debug this thing.
Re: [accepted, implemented] Context-free Grammar for unit na
You can use my website to debug it (most of those links I have posted link to that). I have expanded its functionality since then, but you probably won't come across any of the new syntactic features.
You may need to add that \n at the end of each line. The code needs the newlines to be there, but I am not sure how the translation files deal with newlines.
Re: [accepted, implemented] Context-free Grammar for unit na
Thanks, that is a very helpful tool! It's working via the website now, but not in Wesnoth.
I compiled Wesnoth on my Linux box and added some debug output. I spent a few hours digging into the code and it seems like calling generate() for "main" always returns an empty string, even for English. This means that the $base variable is then filled by the Markov generator.
So, this is definitely a bug in the context free generator in Wesnoth, but I have no idea what's wrong with it yet.
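The behaviour described in this thread (try the grammar first, quietly use the Markov generator when the grammar fails to parse or produces nothing) can be sketched as follows. This is not Wesnoth's code: `pick_base_name`, the two callables and the `ValueError` are placeholders invented for the example.

```python
def pick_base_name(cfg_generate, markov_generate):
    """Prefer the grammar result; fall back to Markov if it is unusable."""
    try:
        name = cfg_generate("main")
    except ValueError:      # hypothetical signal for "grammar failed to parse"
        name = ""
    # An empty string is treated like a failure, so the fallback is silent.
    return name or markov_generate()

print(pick_base_name(lambda rule: "", lambda: "markov-name"))
```

A silent fallback like this explains why a broken grammar shows up only as odd-looking names rather than as an error.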
Re: [accepted, implemented] Context-free Grammar for unit na
I found the bug
https://github.com/wesnoth/wesnoth/pull/921
I am still getting pure base names though.
Re: How about using Context-free Grammar to generate unit na
That would be exactly how it should be. I remember using this for pen-and-paper RPGs in the late '80s: making lists in Excel and using them for NPCs. How many more years will it take for Wesnoth to catch up?
Spixi wrote: ↑April 8th, 2016, 6:37 pm
The problem with Markov chains is that there may be loops or dead ends which can cause very long or very short names.
This small example shows what I mean:
Given are the following names:
LILA
ANNE
ALENA
This produces the following Markov chain:
<start> -> { A, A, L }
A -> { <end>, <end>, L, N }
E -> { <end>, N }
I -> { L }
L -> { A, E, I }
N -> { A, N, E }
The probability of generating the name "A" is 1/3: 2/3 of all names start with A, and 1/2 of the transitions out of A lead to <end> (2/3 × 1/2 = 1/3).
Once a name reaches an N, the probability that it continues with at least two more Ns is (1/3)^2 = 1/9, which makes names like "ANNNA" quite common.
If a name contains an I, it will contain at least four characters, because it has to contain the path L -> I -> L -> {A, E, I}
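The chain can be rebuilt mechanically from the three names, which makes it easy to check the transition counts and the probability of any particular name. A small sketch; `transition_table` and `probability` are names chosen for this example only.

```python
from collections import Counter, defaultdict
from fractions import Fraction

def transition_table(names):
    """Count letter-to-letter transitions, including <start> and <end>."""
    table = defaultdict(Counter)
    for name in names:
        symbols = ["<start>", *name, "<end>"]
        for cur, nxt in zip(symbols, symbols[1:]):
            table[cur][nxt] += 1
    return table

def probability(table, name):
    """Probability that the first-order chain emits exactly `name`."""
    symbols = ["<start>", *name, "<end>"]
    p = Fraction(1)
    for cur, nxt in zip(symbols, symbols[1:]):
        total = sum(table[cur].values())
        if total == 0:          # state never seen in the training names
            return Fraction(0)
        p *= Fraction(table[cur][nxt], total)
    return p

table = transition_table(["LILA", "ANNE", "ALENA"])
for state in sorted(table):
    print(state, "->", dict(table[state]))
print("P('A') =", probability(table, "A"))
print("P('ANNNA') =", probability(table, "ANNNA"))
```

Using `Fraction` keeps the results exact, so they can be compared directly with hand calculations.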
We conclude that names usually do not follow Markov chains. Many names are based on context-free grammars, however. This example shows a simple grammar for old German names:
NAME = {PREFIX} + {SUFFIX}
PREFIX = "A", "Al", "Bal", "Ed", "Eg", "Frie", "Gott", "Hein", "Hin", "Rein", "Sig", "Ul", "Wil", "Win", "Wal", "Wol"
SUFFIX = "bert", "dolf", "drich", "dulin", "dur", "fried", "helm", "hold", "lieb", "ram", "rich", "win"
Example names are: Edwin, Reinhold, Friedrich and Winfried.
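A minimal sketch of how such a grammar can be expanded into names. The dictionary layout and the `expand` helper are assumptions for illustration and are not the rule syntax Wesnoth uses; only the prefix and suffix lists come from the example above.

```python
import random
import re

GRAMMAR = {
    "NAME": ["{PREFIX}{SUFFIX}"],
    "PREFIX": ["A", "Al", "Bal", "Ed", "Eg", "Frie", "Gott", "Hein", "Hin",
               "Rein", "Sig", "Ul", "Wil", "Win", "Wal", "Wol"],
    "SUFFIX": ["bert", "dolf", "drich", "dulin", "dur", "fried", "helm",
               "hold", "lieb", "ram", "rich", "win"],
}

def expand(symbol, grammar, rng=random):
    """Pick one alternative for `symbol` and expand every {RULE} inside it."""
    production = rng.choice(grammar[symbol])
    return re.sub(r"\{(\w+)\}",
                  lambda m: expand(m.group(1), grammar, rng),
                  production)

for _ in range(5):
    print(expand("NAME", GRAMMAR))
```

With 16 prefixes and 12 suffixes this grammar has 16 × 12 = 192 prefix/suffix combinations, and every generated name has a bounded length, unlike the Markov chain.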
As you see, this would generate names with a better quality than the current implementation.
-
- Inactive Developer
- Posts: 503
- Joined: April 24th, 2016, 4:18 pm
Re: [accepted, implemented] Context-free Grammar for unit names
alalalalalalalalalalalalalalalalalalalalalalalal
Good recognizer, lousy generator. That probably answers why we're not "with it" .. we like junk to work and not cause infinite loops or other silliness.
I forked real life and now I'm getting merge conflicts.
Re: [accepted, implemented] Context-free Grammar for unit names
Ok, how do I use this? Is it used now as standard for all names? Is it part of mainline?
Re: [accepted, implemented] Context-free Grammar for unit names
It's in mainline, in wesnoth.po.
You can test your rules at Dugi's site: https://www.physics.muni.cz/~dugi/index.fcgi/cfggen
Re: [accepted, implemented] Context-free Grammar for unit names
https://wiki.wesnoth.org/Context-free_grammar
Found it. Now trying to make random banter in messages like the examples. Is that possible? Please help with the WML syntax. I managed the name generator on my own by copying mainline.