Forum Replies Created

Viewing 3 posts - 1 through 3 (of 3 total)

Author

Posts
December 6, 2023 at 11:43 in reply to: Jurafsky & Martin – Chapter 9 #17330
Louise
Student
Thank you, but I still can’t resolve the apparent contradiction between the two references:
in the first reference, the LMSF lowers the LM probability (which means a larger penalty);
in the second reference, the LMSF has the side-effect of decreasing the penalty.

So how can LMSF both increase and decrease the penalty?

Additionally, I understand why the decoder prefers fewer words when LM probability is low, but I can’t see why it prefers longer words.
December 6, 2023 at 11:41 in reply to: Jurafsky & Martin – Chapter 9 #17329
Louise
Student
“The word insertion penalty is a fixed value added to each token”

For “fixed”, does this mean that “-p” is equal to “logWIP” in formula 9.50?

For “each token”, is the number of tokens equal to N (the N in formula 9.50, i.e. the number of words in the sentence)? Is that why formula 9.50 uses N×logWIP: the penalty for each token (word) is summed over the whole word sequence?
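To make the question concrete, here is a minimal sketch of the reading being asked about. It assumes formula 9.50 has the log-domain form log P(O|W) + LMSF × log P(W) + N × logWIP, and it assumes the fixed per-token value plays the role of logWIP; neither assumption is confirmed in this thread. The function names and all numbers are hypothetical, chosen only for illustration.

```python
def per_word_accumulation(n_words, log_wip):
    """Add the same fixed value once per word, as in the 'added to each token' reading."""
    total = 0.0
    for _ in range(n_words):
        total += log_wip  # one fixed penalty per word
    return total


def combined_log_score(acoustic_logprob, lm_logprob, n_words, lmsf, log_wip):
    """Assumed form of formula 9.50: log P(O|W) + LMSF * log P(W) + N * logWIP."""
    return acoustic_logprob + lmsf * lm_logprob + n_words * log_wip


# Illustrative values only (not taken from J&M or the HTK manual).
n_words = 4
log_wip = -2.0
print(per_word_accumulation(n_words, log_wip))  # same quantity as n_words * log_wip
print(combined_log_score(-100.0, -10.0, n_words, lmsf=5.0, log_wip=log_wip))
```

Under these assumptions, adding a fixed value at every word and multiplying that value by N are the same thing, which is the equivalence the question is about.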
December 5, 2023 at 21:32 in reply to: Jurafsky & Martin – Chapter 9 #17325
Louise
Student
Hi,
I have two questions regarding the language model scaling factor (LMSF) and the word insertion penalty (WIP).

1
In J&M 9.6, page 315, the authors write:

“This factor(LMSF) is an exponent on the language model probability P(W).
Because P(W) is less than one and the LMSF is greater than one (between 5 and 15, in many systems), this has the effect of decreasing the value of the LM probability”

So my understanding is that common values for the LMSF (5-15) decrease the LM probability. However, on the same page, the authors also write:

“Thus if (on average) the language model probability decreases (causing a larger penalty), the decoder will prefer fewer, longer words…Thus our use of a LMSF to balance the acoustic model has the side-effect of decreasing the word insertion penalty.”

I’m confused by these two references. In the first, the LMSF is used to decrease the LM probability. The second first says that a decreased LM probability causes a larger penalty, but then says that the LMSF has the side-effect of decreasing the penalty. Isn’t this a contradiction?

Also, for the second reference, I understand why the decoder prefers fewer words, but I can’t see why it prefers longer words.
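One way to look at the first quoted passage numerically is the sketch below. It assumes the LMSF is applied as an exponent on P(W), which becomes a multiplier on log P(W) in the log domain. The word probabilities and function names are made up for illustration and do not come from J&M.

```python
import math


def scaled_lm_logprob(word_probs, lmsf):
    """LMSF * log P(W), with P(W) taken as a product of per-word probabilities."""
    return lmsf * sum(math.log(p) for p in word_probs)


def extra_cost_of_more_words(few, many, lmsf):
    """How much worse (in log score) the hypothesis with more words does on the LM term."""
    return scaled_lm_logprob(few, lmsf) - scaled_lm_logprob(many, lmsf)


# Hypothetical hypotheses for the same audio: 2 words versus 4 words,
# every word given the same illustrative LM probability of 0.1.
few_words = [0.1, 0.1]
many_words = [0.1, 0.1, 0.1, 0.1]

for lmsf in (1.0, 10.0):
    print(lmsf, extra_cost_of_more_words(few_words, many_words, lmsf))
```

With these made-up numbers, the gap between the two hypotheses grows in proportion to the LMSF, so each extra word costs more as the LM term is scaled up. For a fixed stretch of audio, a hypothesis with fewer words must have each word cover more of that audio, which may be what “fewer, longer words” refers to.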

2
In the HTK manual (version 3.4), page 43, it says of HVite: “The options -p and -s set the word insertion penalty and the grammar scale factor, respectively.”

Does -s (grammar scale factor) refer to LMSF in J&M 9.6?
Does -p exactly refer to “WIP” in J&M 9.6, page 316, formula (9.50), or does it actually mean “N×logWIP”?

Thank you for your help.