Talk:Formant

Synthesis

The term "Formant synthesis" redirects here (and there's a synthesis template present), but there is no actual discussion about synthesis in this article. —Preceding unsigned comment added by 131.107.0.98 (talk) 22:01, 21 April 2010 (UTC)[reply]

Throat Singing

I added the paragraph on throat singing, but didn't realize I was logged out.--Theodore Kloba 22:37, Dec 16, 2004 (UTC)

The article should have spectrogram images. - Omegatron 17:45, Jan 26, 2005 (UTC)

Vowel charts

The list of frequencies for various vowels has two major problems that concern me. The first is that the anon who first added them didn't cite any source. More importantly, though, it uses orthography rather than IPA values when showing the vowels. It doesn't even say what language those vowels occur in (the only language I've found that uses all of <ä>, <ö>, <ü>, and <å> is Finnish, but it doesn't use <å> in any native words, and I don't know how it's pronounced in the words borrowed from Swedish anyway; these symbols have significantly different pronunciations in the different languages in which they occur). Does anyone know what the phonetic values of these vowels are intended to be? --Whimemsz 01:12, 23 January 2006 (UTC)[reply]

Finnish actually doesn't use ü (it uses 'y' for the close front rounded vowel, the same as in IPA). I guessed that ü was y, so if the person who posted that intended something other than y, please correct it. Also, I changed the vowels from capital to lower case because there is a difference in IPA between i and I, for example. (Technically, it's i and ɪ, I know.) In Finnish, ä is IPA æ and ö is ø. I've always heard Swedish å realized as [o]. Based on the formant frequencies, I guessed which IPA vowel 'å' was supposed to be and changed it to ɑ. The changes, however, may not be right, because å, ä, ö and ü vary from language to language (hence IPA).
Also, if there are sources for these formant frequencies, I'd like to see them. I have Peter Ladefoged's A Course in Phonetics (4e, p. 172) in front of me, and the numbers listed for F1 and F2 of some of the vowels are quite different (i.e., the number listed here for F2 is closer to Ladefoged's F3 number for [i]). Sources/suggestions? JordeeBec 19:32, 9 March 2006 (UTC)

I am also concerned about the vowel charts. First, the two charts seem rather redundant; it would be better (imho) to go with a single chart that incorporates what both of these charts are trying to do: list approximate frequencies of the first two formants and indicate the range of variation. Second, the frequency of the second formant given for [i] in both charts seems too high, and is probably actually the third formant frequency (the other formant frequencies are in the right ballpark, but of course there is the issue already mentioned about variation from language to language, getting the IPA symbols right, and variation from utterance to utterance for a given speaker). Third, the second chart's "main formant regions" (only one for back vowels, but two for front vowels) are related to auditory processing of vowel acoustics, and not to the acoustics themselves. Since this page is specifically about formants, I suggest sticking with the actual formants rather than main formant regions. Lulich 02:07, 26 May 2006 (UTC)

The fact that the charts are unsourced is a serious problem, as a couple of the previous contributors have pointed out, yet in the last three and a half years nothing has been done about them. I've seen them cited in an academic context, where it is unclear what language they refer to. I suggest removing them and replacing them with the charts from Deterding (1997) or some other reliable source. MarcusCole12 (talk) 05:35, 20 November 2009 (UTC)

There are various things editors can do to warrant attention to sourcing. In order of increasing effectiveness, they are: bringing it up in the talk page, marking it (and/or the page) as uncited, and taking it out. If Deterding (1997) has more authoritative formants, I'd be sure to cite them inline with the proper page number. — Ƶ§œš¹ [aɪm ˈfɻɛ̃ⁿdˡi] 05:02, 21 November 2009 (UTC)[reply]
Does anyone have access to Catford (1988), p. 161, and could share it to clarify his data on the vowel formants? The citation was originally added in March 2013 by Matty1487, whose account no longer exists. 41.5.94.123 (talk) 14:04, 1 May 2017 (UTC)

Introduction

I find the first line confusing. Please could someone make this clearer.

This whole article is in need of editing with rigour and with careful attention to English expression. Yergnaws (talk) 13:42, 17 August 2009 (UTC)

Okay, I'll have a try at the first line of the intro. The old first line was indeed confusing: a formant is not the 'shaping' (which might mean maxima or minima or both); it's the spectral maxima that are called formants. Further, it's not in much contrast to the second definition: in reference [2], a broad consensus of voice acoustics researchers (including the author of [1]) support the definition given as that for acoustics, rather than the 'shaping' definition. Whether or not the strongest harmonic is taken as the formant frequency could depend on whether the fundamental frequency is low (allowing one to interpolate the envelope of the spectrum) or high (where the lack of resolution in frequency does not allow accurate estimation of resonance frequencies). Inala (talk) 11:10, 13 July 2020 (UTC)[reply]

@Inala: Thank you for your edits! I personally like the fact that your definition, by mentioning a "broad peak", covers the two definitions that are generally found in the literature. However, I think that mentioning the existence of the two definitions (a peak in the filter function vs. a peak in the spectrum) may help readers who came to this page after encountering one or the other. But perhaps it indeed doesn't need to be the first thing the reader sees. So how about starting with your general definition, and then, maybe as part of History, mentioning that the term is used to mean different things? – egaudrain (talk) 15:29, 15 July 2020 (UTC)

Formants in sonorants vs. obstruents

The article makes some incorrect statements about what kinds of sounds have formants. For instance, "Not all sounds used in human language are composed of formants. Formants are restricted to sonorants [...]". Since formants are basically resonances of the vocal tract, they are always present regardless of the type of sound being produced. The distinction between sonorants and obstruents (and various gradations between) concerns the means of exciting the resonances: whether by periodic voicing (in vowels) or by noise (in fricatives), for instance. "Note that fricatives always lack formant structure and are distinguished by the frequency range with the most noise, as well as overall strength of frication." Fricatives actually do have formant structure. For instance, [ʃ] and [s] are largely differentiated by the strength of the 3rd formant. The fact that certain frequency ranges have more noise than others (depending on the fricative) is derived from the location of the noise source within the mouth. For [s] it is near the teeth, so that low formants (resonances) belonging to the back of the mouth (behind the source) are not excited much. This is in contrast to vowels, in which the periodic voicing source is at the larynx, so that all of the formants are excited strongly. Check out Ken Stevens' book Acoustic Phonetics for confirmation. Lulich 02:07, 26 May 2006 (UTC)

Spectrogram

I have a problem with the spectrogram. It represents three sounds that don't change over time, and yet the time resolution is quite high, definitely overkill. Because of that, the frequency resolution, which is really all that matters here, is very poor (so much so that we could content ourselves with the magnitude of the DFT of these sounds). It would be good if someone could redo it with a much lower time resolution.

Also, due to the nature of speech, it wouldn't be bad if the frequency scale were logarithmic (base 2, of course).
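
To put rough numbers on the trade-off described above, here is a minimal sketch, assuming a plain DFT frame with no zero padding. The 16 kHz sampling rate and the function name `stft_resolution` are illustrative assumptions, not values taken from the article's spectrogram. Bin spacing is only a proxy for frequency resolution, since the window shape widens each peak further.

```python
def stft_resolution(sample_rate_hz, window_samples):
    """Return (window duration in seconds, DFT bin spacing in Hz) for one frame."""
    duration_s = window_samples / sample_rate_hz
    bin_spacing_hz = sample_rate_hz / window_samples
    return duration_s, bin_spacing_hz


if __name__ == "__main__":
    sample_rate_hz = 16000  # assumed sampling rate, for illustration only
    for window_samples in (128, 512, 2048):
        duration_s, bin_spacing_hz = stft_resolution(sample_rate_hz, window_samples)
        # Longer windows blur time but separate frequencies more finely.
        print(f"{window_samples:5d} samples: "
              f"{duration_s * 1000:6.1f} ms window, "
              f"{bin_spacing_hz:6.2f} Hz per bin")
```

For steady sounds like the three in the figure, a long window costs nothing in the time dimension and makes the formant bands much narrower.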

Opening paragraph

The reference to Alvin Lucier's "I Am Sitting in a Room" clearly doesn't belong in the opening paragraph. Apparently, it was included only to illustrate that the word "formant" can refer to a peak in the acoustic resonance of a room. -- Preceding unsigned comment added by Unfree (talk • contribs) 12:43, 15 April 2008 (UTC)

Fant?

What's Fant? Is it a company? A person? I'm confused. —Preceding unsigned comment added by 64.122.56.143 (talk) 02:27, 6 July 2009 (UTC)[reply]

Gunnar Fant, Swedish phonetician. See the first reference. MarcusCole12 (talk) 09:27, 24 November 2009 (UTC)

Look at: Peterson, G.E., Barney, H.L., 1952. Control methods used in the study of the vowels. J. Acoust. Soc. Amer. 24, 175–184. They use the term "formants" in the current meaning before Fant's publication. Is there anyone from Wikipedia who can correct this? -- Preceding unsigned comment added by 178.42.58.70 (talk) 12:47, 3 November 2010 (UTC) 178.42.58.70 (talk) 12:51, 3 November 2010 (UTC)

What does the first sentence "Formants are defined by Gunnar Fant as..." mean? Was Fant the one who first defined formants? Or are we simply using his definition as a starting place? Could someone clarify this? Goochelaar (talk) 15:49, 1 June 2011 (UTC)
Actually, the term was coined by Erich Schumann in 1929 (de:Formant); the article should definitely be updated. 79.49.120.35 (talk) 12:17, 22 March 2013 (UTC)
Rather, Ludimar Hermann (1838-1914), according to the article about him. Bn (talk) 14:17, 5 June 2018 (UTC)

Vowels

By definition, the information that humans require to distinguish between vowels can be represented purely quantitatively by the frequency content of the vowel sounds.

By definition? Whose definition of vowel says this? 68.239.116.212 (talk) 03:34, 15 November 2009 (UTC)

What is an "antiformant"?

I reached this article from a link on the word "antiformant" in the article Rhinoglottophilia. Neither article explains what an "antiformant" is. Collin237 166.147.104.149 (talk) 22:48, 1 February 2012 (UTC)

Paraphrasing Glottopedia, an antiformant lessens the amplitude at given frequencies. The example is the effect of the nasal cavity on nasal vowels and consonants. The antiformants of nasal consonants are more pronounced; the affected frequencies show greater "silencing".

66.81.29.144 (talk) 03:06, 28 November 2013 (UTC)

Needs a basic explanation

Nowhere does this article clearly explain, in a way that a novice can understand, what a formant is. The only actual definitions of formant that appear in the article are: (1) "the spectral peaks of the sound spectrum |P(f)|' of the voice", (2) "an acoustic resonance of the human vocal tract", (3) "the distinguishing or meaningful frequency components of human speech and of singing". Of these, (1) uses undefined technical terms, and (2) and (3) are too general, in that they include various things besides formants.

The article does contain a lot of information about what the formant frequencies of various vowels are, how they're affected by consonants, how they're labeled (f1 etc.), how they're produced by the vocal tract, etc. -- Preceding unsigned comment added by Linguistatlunch (talk • contribs) 21:32, 22 August 2012 (UTC)

I definitely agree. I came here to find out what "F3" means. Well, I found out, not surprisingly, that it is the "third formant," but I don't know what a formant is after struggling through this contorted article. Why are so many Wikipedia articles written as if they are trying to hide the essence of what they are about? If I lived somewhere where there were a decent library, I'd just go to the library and get a decent description I'm sure. Alas, I'm stuck with only the Internet. 202.179.19.8 (talk) 07:14, 13 February 2013 (UTC)[reply]
In human speech, not all frequencies are of equal amplitude (loudness). This becomes visually apparent when a spectrogram is used. In the clearest layman's terms I can use, formants are the dark bands you see in a spectrogram, or the hill peaks in a sound spectrum. Formants are most pronounced in vowels. If you have the formants of a vowel, you can generate the vowel synthetically with free computer programs such as Praat or VocalTractLab.
In anything that isn't white noise, not all frequencies are of equal amplitude. It's not anything special to the human voice. All frequencies of equal amplitude = white noise. Cyisfor (talk) 22:17, 19 September 2014 (UTC)
If I were to create a dictionary definition, I would call a formant a frequency of locally maximum amplitude in a frequency spectrum. In less technical language, formants are the "notes" (frequencies in Hz) that are most prominent in a sound.
A frequency is something that repeats, not a characteristic of a sound. That's like if for example someone were saying "a" over and over again, once a second. The frequency spectrum would show their local maximum amplitudes repeating with a frequency of 1Hz. So the formant as "a frequency of locally maximum amplitude in a frequency spectrum" would be in this case 1Hz. And if they said "b" repeatedly, once a second, it would have the exact same formant of 1Hz. But if they said "a" twice a second, it would have a different formant of 2Hz. That just doesn't seem right. Cyisfor (talk) 22:17, 19 September 2014 (UTC)[reply]
What do you call the pattern evident in the distance between peak harmonics of a characteristic sound at any given pitch? That sound at a higher pitch will have different harmonics, but they will fit into the same pattern. That's what I thought a formant was, that pattern. It's not quite a proportion, because some harmonics stay the same, hisses and pops and strikes and such, but the bigger, lower frequency harmonics will change proportionally with pitch. Cyisfor (talk) 22:17, 19 September 2014 (UTC)[reply]

66.81.29.144 (talk) 02:43, 28 November 2013 (UTC)
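
The "frequency of locally maximum amplitude in a frequency spectrum" definition proposed in this thread can be illustrated with a small peak-picking sketch. Everything in it is an assumption made for illustration: the envelope is synthetic, the three resonance centres (700, 1200 and 2600 Hz) are hypothetical rather than measured, and real formant trackers (for example the LPC-based analysis in Praat) work differently.

```python
def envelope(freq_hz, resonances):
    """Synthetic spectral envelope: a sum of bell-shaped resonance curves."""
    total = 0.0
    for center_hz, half_width_hz in resonances:
        x = (freq_hz - center_hz) / half_width_hz
        total += 1.0 / (1.0 + x * x)
    return total


def local_maxima(freqs_hz, amplitudes):
    """Return the frequencies whose amplitude exceeds both neighbours."""
    peaks = []
    for i in range(1, len(amplitudes) - 1):
        if amplitudes[i - 1] < amplitudes[i] > amplitudes[i + 1]:
            peaks.append(freqs_hz[i])
    return peaks


# Hypothetical (centre, half-width) pairs in Hz, chosen only for illustration.
HYPOTHETICAL_RESONANCES = [(700, 80), (1200, 80), (2600, 80)]

freqs_hz = list(range(0, 4001, 50))  # 50 Hz grid from 0 to 4 kHz
amplitudes = [envelope(f, HYPOTHETICAL_RESONANCES) for f in freqs_hz]
print(local_maxima(freqs_hz, amplitudes))
```

Note that the peaks are positions along the frequency axis of the spectrum, not repetition rates of the utterance, which is the distinction the 1 Hz objection above runs into.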

Gender bias

The study of speech acoustics has a long and inglorious history of treating the adult male voice as the basis for generalizations about speakers in general. Female voices are usually relegated to footnote status if mentioned at all. This article does nothing to redress the balance, but I think it should. RoachPeter (talk) 20:57, 17 March 2014 (UTC)

A complicated matter, and somewhat ironic: Women’s vocal tracts are more similar to children’s and infants’, apart from being larger. It is men’s that are different from the others. So all that male-subject research focused on the anatomically NON-standard case. Jmacwiki (talk) 18:44, 13 November 2021 (UTC)

Is a formant related to pitch?

As the formants are described in Hertz, it sounds like the article is saying that different vowels are made at different pitches. Are the two actually related? I am only confused. 74.102.216.186 (talk) 02:04, 28 November 2016 (UTC)

It seems that there is a difference. I found out that pitch is defined by the fundamental frequency. Formant frequencies, however, depend on the length and shape of the vocal tract when making the sound. I found this article that can explain the difference: [1] (f.k.a. 74.102.216.186) LakeKayak (talk) 15:17, 21 January 2017 (UTC)

A human voice will produce no single frequency, even when speaking a "simple" vowel, but a frequency spectrum. The formants are essentially the frequencies of the overtones. You can approximate the sound of a particular human voice with speech synthesis by successively adding its formants, starting with the most significant; in each iteration the synthesized voice will sound more natural and more similar to the original speaker. -- Theoprakt (talk) 07:55, 31 January 2020 (UTC)[reply]

Not of the overtones (which arise solely from the larynx), but of the resonances of the mouth, which filter those overtones to emphasize some and suppress others. Jmacwiki (talk) 18:39, 13 November 2021 (UTC)
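
The difference between pitch and formant discussed in this thread can be made concrete with a toy source-filter sketch. It rests on loud assumptions: the harmonics all have equal source amplitude, there is a single resonance, and the 700 Hz centre and 100 Hz bandwidth are hypothetical values, not measurements. The function names are likewise made up for this illustration.

```python
def resonance_gain(freq_hz, center_hz, bandwidth_hz):
    """Gain of one idealized vocal-tract resonance at a given frequency."""
    x = (freq_hz - center_hz) / (bandwidth_hz / 2.0)
    return 1.0 / (1.0 + x * x)


def strongest_harmonic(f0_hz, formant_hz, bandwidth_hz=100.0, max_hz=4000.0):
    """Return the harmonic of f0 that the resonance emphasizes most.

    Assumes equal-amplitude source harmonics (a simplification).
    """
    count = int(max_hz // f0_hz)
    harmonics = [f0_hz * k for k in range(1, count + 1)]
    return max(harmonics, key=lambda f: resonance_gain(f, formant_hz, bandwidth_hz))


if __name__ == "__main__":
    # Same hypothetical resonance at 700 Hz, three different pitches.
    for f0_hz in (100, 150, 220):
        print(f0_hz, "Hz pitch ->", strongest_harmonic(f0_hz, 700), "Hz strongest harmonic")
```

Changing the pitch moves every harmonic, but the emphasized harmonic stays close to the resonance, which is why the same vowel can be sung on different notes.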

lead section has both style and definition issues

The lead section does not conform to MOS:INTRO, as it is unnecessarily complicated by immediately diving into technical details of how different fields of research use slightly different definitions that are, as the lead itself states, in most cases irrelevant. Furthermore, this edit, said to "clarify" things, completely swept the formerly used definition from acoustics under the rug (i.e. a definition that is in the domain of sound without referring to human anatomy) and folded it in under the speech science definition of formants that refers to the human vocal tract. -- Theoprakt (talk) 07:41, 31 January 2020 (UTC)

incomprehensible intro

A vowel, according to him, is a special acoustic phenomenon, depending on the intermittent production of a special partial, or “formant”, or “characteristique”. The frequency of the “formant” may vary a little without altering the character of the vowel. For a, for example, the “formant” may vary from 350 to 440 Hz even in the same person.

This talks about only one formant whereas the rest of the article talks about vowels consisting of two. In addition, the frequency values given for the formants of vowel a are completely different, 850 and 1610. --Espoo (talk) 14:20, 27 April 2020 (UTC)

It also uses “partial” as a noun, which is nonstandard usage, stilted at best (except sometimes in multivariate calculus). I’ll take a crack at it, but only in small steps. This talk page shows that lots of people have been very poorly informed by this page, both over the years and even in recent months. Jmacwiki (talk) 18:37, 13 November 2021 (UTC)[reply]

Vowel space diagram

I have seen many, many vowel-space figures. Like most scientific figures where two variables have a conventional order (“x” before “y”, or F1 before F2), they plot the first variable horizontally, the second vertically.

Where did we get this diagram, which plots the transpose? Does anyone have a more standard replacement? Jmacwiki (talk) 19:36, 13 November 2021 (UTC)

Width inconsistency

We start by saying that formants (apparently by definition) are spectrally broad. But in the Phonetics section, we say that singers create sharp resonances (formants) to select high vocal harmonics.

If we believe the second, then apparently we do not accept our own definition. Which to edit? Jmacwiki (talk) 19:58, 13 November 2021 (UTC)

Redundant diagrams

The last 3 diagrams are all describing the same thing (actually the first two as well, just in a different form), so I've removed two of them. I really don't see the point of the "Average vowel formants for a male voice" table but I'll leave it for now. Also, this article seems extremely limited to only the perspective of phonetics when formants are supposedly integral to so many other fields as well. 30103db (talk) 16:54, 22 November 2021 (UTC)[reply]

Long e

The History paragraph gives F1 value for high vowels, but it lists a, which is low (adult F1 about 700 Hz, not 400) as the example. Now corrected. However, this example doesn’t fit very well in a History section anyway. Does anyone know the editor’s intention? Should the sentence be moved, or the section renamed?

Also, do we have agreed notation for vowels? IPA may be unkind to our readers, so I used a DARPAbet code, but we should have a better solution. Jmacwiki (talk) 06:20, 25 November 2021 (UTC)[reply]

Ellipses

I've seen a lot of charts based on empirical data (or statistical models) displaying ellipses rather than point estimates to give some idea of the distribution. (I'm writing this from the LSA annual meeting but I'm not a phoneticist.) It might be good to display an example here, both for its own sake and also to give readers a visual sense of the range of variation. 207.237.164.107 (talk) 04:10, 7 January 2024 (UTC)[reply]

I have an as-yet unpublished paper from William Kretzschmar and Joseph Stanley in which they describe the history of this practice (and also explain the unusual 3rd quadrant axes) as a lead-up to criticizing it. Historically, once phoneticians began to be able to measure F1 and F2, they started plotting vowel sounds in F1/F2 space — using the 3rd quadrant axes once they realized that this would align the points with the IPA vowel chart quadrilateral. Originally, the ellipses were drawn by hand, encircling all of the data points (or all of the points that, by the researcher's impressionistic judgment, represent a particular vowel), but today, with much larger speech corpora and the ability to do automatic forced alignment with high accuracy, it is common to either present point estimates or to draw ellipses automatically to reflect a 95% confidence interval. Kretzschmar and Stanley note that F1/F2 values in large speech corpora are not normally distributed and thus using statistics that assume a normal distribution, such as mean and variance, is invalid. They further note that the empirical distribution is scale-invariant. This material was presented by Kretzschmar at the 2024 ADS annual meeting but the paper is as yet unpublished so it isn't WP:V and can't (yet) be used in this article. However, all of the historical claims are themselves cited to earlier sources; I just don't have the time to verify them in the library. 121a0012 (talk) 04:00, 15 January 2024 (UTC)[reply]
For a WP:V reference on the non-normality of linguistic data distributions, I did find William A. Kretzschmar, Jr.; Brendan A. Kretzschmar; Irene M. Brockman (April 2013). "Scaled measurement of geographic and social speech data". Literary and Linguistic Computing. 28 (1). Oxford University Press: 173–187. doi:10.1093/llc/fqs058.. 121a0012 (talk) 04:20, 15 January 2024 (UTC)[reply]
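
For readers wondering how the automatically drawn ellipses mentioned above are typically obtained, here is a minimal sketch of one common construction: a 95% coverage ellipse computed from the sample mean and covariance of (F1, F2) points. It assumes the points are bivariate normal, which is exactly the assumption the comment above reports Kretzschmar and Stanley criticizing, so it illustrates the practice rather than endorsing it. The function name `coverage_ellipse` is made up for this example.

```python
import math


def coverage_ellipse(points, coverage=0.95):
    """Return (center, semi_major, semi_minor, angle_rad) of a data ellipse.

    Assumes the (x, y) points are bivariate normal; needs at least 2 points.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    var_major = half_trace + spread
    var_minor = max(half_trace - spread, 0.0)
    # Chi-square quantile with 2 degrees of freedom: CDF is 1 - exp(-q / 2).
    q = -2.0 * math.log(1.0 - coverage)
    # Orientation of the major axis relative to the x axis.
    angle_rad = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mean_x, mean_y), math.sqrt(q * var_major), math.sqrt(q * var_minor), angle_rad
```

Passing a list of (F1, F2) pairs in Hz for one vowel category gives the centre, the two semi-axis lengths and the rotation needed to draw the ellipse. Strictly speaking this is a coverage (data) ellipse for individual tokens rather than a confidence region for the mean.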