Category talk:Konkani language

From Wiktionary, the free dictionary

Translit

@AryamanA Hi, I know I was the one who was opposing auto-translit for Konkani based on how च, ज, अ, ए and ओ were realized. That was because back then I was confusing transliteration and transcription. It was stupid of me to oppose auto-translit because it's a real PITA whenever a Konkani cognate/etymon/descendant is to be mentioned. I'm gonna work on Konkani a whole lot more now and will be adding pronunciation too so the ts vs c or the dz vs j ambiguity won't really be a problem. Could you make MOD:mr-translit apply to Konkani too? -- Bhagadatta (talk) 04:58, 22 September 2020 (UTC)

@Bhagadatta: Great to hear your support for Konkani auto-translit and willingness to work on Konkani a whole lot more. The lack of a standardised transliteration system really is a ʻPITAʼ when comparing Marathi and Konkani. Your choice to use MOD:mr-translit instead of making a separate MOD:kok-translit is very interesting. When working on MOD:mr-translit, I thought the differences from MOD:hi-translit were only applicable to Marathi. However, your choice to use MOD:mr-translit for Konkani, together with a look at related lects that use Devanagari, suggests that MOD:mr-translit may represent a regional way of using Devanagari. If MOD:mr-translit is to be used for Konkani and Marathi, then feel free to edit it so that it can be used for both. ʻWeʼ tried to figure out an automated way of handling the ʻts vs c or the dz vs j ambiguityʼ, but the conclusion was that they should be indicated with manual transliteration and/or the Pronunciation section as you propose to do for Konkani.
(Modern Marathi can be written in the Modi script (as indicated at CAT:Marathi language), and ignoring the legibility issue it actually makes more sense to do so. However, using the Modi script for Marathi is obviously not mainstream anymore and now it's only used by hobbyists and historians.)
Although I'm no expert on CAT:Ahirani language or the Varhadi dialect, any writing in those primarily spoken lects suggests that they should also use MOD:mr-translit. I've been thinking of adding coverage of those lects based on theses from Shodhganga for a more regional perspective, but I've been holding off. However, it seems that all entries for CAT:Vaghri language and CAT:Kachchi language are from Shodhganga theses so it might be okay to use Shodhganga as a source. Kutchkutch (talk) 10:03, 22 September 2020 (UTC)
@Bhagadatta, Kutchkutch: I've added it. I think it's a decent fallback when we're lacking manual translit. Looking forward to entries in more Southern IA languages! BTW, do you think a MOD:kok-IPA is possible? IPA for multiple dialects? —AryamanA (मुझसे बात करें, योगदान) 04:08, 23 September 2020 (UTC)
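The fallback described here (manual transliteration when an entry supplies one, otherwise the shared automatic routine) can be pictured with a small Python sketch. It is not the Lua code of MOD:mr-translit: the names `TRANSLIT_FOR_LANG`, `auto_translit_stub` and `display_translit` are hypothetical, and the stub only knows two words quoted in this thread.

```python
def auto_translit_stub(text):
    """Stand-in for the shared Marathi/Konkani transliterator.

    A real transliterator works letter by letter; this stub only looks up
    two words quoted in the discussion and returns None for anything else.
    """
    known = {"चोर": "cor", "उंदिर": "undir"}
    return known.get(text)


# Hypothetical table: Konkani ("kok") points at the same routine as Marathi ("mr")
# instead of getting a duplicate module of its own.
TRANSLIT_FOR_LANG = {"mr": auto_translit_stub, "kok": auto_translit_stub}


def display_translit(lang, text, manual=None):
    """Prefer a manual transliteration; otherwise fall back to the automatic one."""
    if manual:
        return manual
    routine = TRANSLIT_FOR_LANG.get(lang)
    return routine(text) if routine else None
```

With this shape, an entry that needs to flag something the script does not show keeps its manual value, and every other entry gets the automatic one for free.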
@Kutchkutch: Well, there's no letter/combination of letters in Konkani that the Marathi transliteration module cannot accurately deal with so I thought let's tie Konkani to the Marathi module instead of duplicating the content with a new Konkani module. Konkani's understanding of Devanagari is influenced by Marathi because Marathi was the language of the educated here and Marathi was the language that was considered to be worthy of putting pen to paper for. So it's no surprise that when Konkani was written down in Devanagari, it used Devanagari the way Marathi used it. Similarly the Latin script for Konkani is modelled on the Portuguese language's use of the Latin script. I'm sure that can be said of Ahirani too as they may speak Ahirani at home but Marathi outside so they'll use the script the way Marathi uses it. I think we could apply MOD:mr-translit to Ahirani too if that is really the case.
@AryamanA: Thanks for the changes! kok-IPA can happen one day, I'm supposing it will be modelled on mr-IPA. I'm already providing pronunciation manually to many Konkani entries but am limiting it to those entries where it's not obvious. Also, in entries like खेळ्टा (kheḷṭā, plays), मेळ्टा (meḷṭā, meets), मेळ्चे (meḷce, to meet), I've given the dialect-specific pronunciations as they vary substantially. kok-IPA will be a bit more challenging because ओ randomly alternates between open mid ɔ and mid rounded o̞ (we can just write that as o), sometimes even within the inflection paradigm. E.g. चोर (cor, thief, nominative) is t͡soːɾ but चोरान (corān, ergative) is t͡sɔːɾɑːn and चोरार (corār, on the thief, superessive) is t͡soːɾɑːɾ. The parameter 2 of that template has an ō there with a macron anyway because the nominative has it. I'll change that real soon. ए does that too with e and ɛ. If in the future we make a kok-IPA, there should be a parameter for re-spelling to deal with this kind of ambiguity.
Anyway, should the transliteration describe the "openness" of ओ and ए or not? If yes then we can keep the manual transliteration on चोर (cor) and the like. -- Bhagadatta (talk) 10:29, 23 September 2020 (UTC)
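To make the re-spelling idea concrete, here is a toy Python sketch. It is only an illustration: no kok-IPA module exists in this thread, the function name `toy_kok_ipa` and the `open_o` flag are invented for the example, and the symbol map covers just the letters in the three forms of चोर quoted above. It works on the romanised form and lets the editor say whether ओ is open in that particular form, since the spelling alone does not tell.

```python
import unicodedata

# Toy romanisation-to-IPA map, limited to the letters of cor / corān / corār.
BASE = {
    "c": "t͡s",
    "r": "ɾ",
    "n": "n",
    "\u0101": "ɑː",  # ā (precomposed a with macron)
}


def toy_kok_ipa(translit, open_o=False):
    """Convert a romanised form to IPA; open_o selects ɔː instead of oː for o."""
    translit = unicodedata.normalize("NFC", translit)
    out = []
    for ch in translit:
        if ch == "o":
            out.append("ɔː" if open_o else "oː")
        else:
            # Unknown letters pass through unchanged in this sketch.
            out.append(BASE.get(ch, ch))
    return "".join(out)
```

Here `toy_kok_ipa("corān", open_o=True)` is the ergative case from the comment above; ए could be handled by an analogous flag.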
@Bhagadatta: Thanks for that perspective. Just as in Konkani, the anusvara was used to indicate nasalisation in Marathi (see तेव्हां at ओळ (oḷ)). However, at some point the nasalisation either became optional or was dropped in speech, and the nasalisation anusvara was gradually omitted in writing. The nasalisation may still occur before र, श, ष, स, य, ल, व, which is the reason why there is an additional vowel in संशय (sauśay), संवाद (sauvād), etc. even in the non-nasalised pronunciation. This subsequent development in Marathi might be one way in which Konkani and Marathi transliteration may differ. So, are you going to use manual transliteration for words like हांव (hāuva), ऊंस (ūusa), सोंपें (sompẽ), लेंव्चे (leuvce), मांय (māiya), वोंट (voṇṭa) etc.? Is ज्ञ (jña) used in Konkani, and would it also be 'dny'? The word-final anusvara seems to be working well (it works for {{m|kok|सोंपें}} and {{m|kok|पयलें}}, see {{word-final anusvara form of}}).
@AryamanA: When the Kannada script is used in descendants trees, {{desc|kok|ಪಿವ್ಚೆ}} duplicates the word in the transliteration. Wouldn't using MOD:translit-redirect/data be necessary to fix this issue? Kutchkutch (talk) 11:19, 23 September 2020 (UTC)
@Kutchkutch: Right, so this is an awkward cross between transliteration and transcription. The way I see it, mr-translit is very phonetic in nature because it inserts a u in places where nasalization occurred, it writes ऋ as ru, etc. I used to do the same for Konkani which is why I had opposed auto translit but now I decided that a more standardized way of doing things would be helpful.
I) In Konkani, there is a nasalization-induced u in tatsama words (which, 1] are rare and 2] come indirectly from Skt via Marathi). So we say sauskrut for संस्कृत (sauskŕt), Kaus for कंस (kausa) etc. But then in inherited words there IS nasalization, although in some North Goan dialects हांव is indeed hāuv.
II) What about nasal + consonant?
If it is a voiceless stop, there is nasalization. So आंत is ā̃t and सोंपे is sỗpễ.
If it's a voiced stop then you get the corresponding nasal: ङ for ग and घ, म for ब and भ and so on. Following this, the voiced stop is then dropped (see the pronunciation of उंदिर (undir), भांगर (bhāṅgar)). With affricates and sibilants the outcome appears to be random as सोंसो (souso) is sỗsô (i.e. not sônsô) but सांज (sāñja) is sānz (i.e. not sā̃z).
But the above "rule" of nasal consonant + voiced stop clearly has to apply in pronunciation and not translit.
ज्ञ is dny just like in Marathi and appears only in tatsama words. It wouldn't pose an issue anyway as tatsama words are very rare in Konkani.
My solution is to use manual translit in cases like these because otherwise the module seems to work fine for Konkani. If we made a kok-translit, Konkani being pretty irregular would have a counter-example waiting out there to whatever rule we may try to make.
(BTW, interestingly, even in Kannada, the anusvara before , ಶ, ಷ, ಸ, ಯ, ಲ, ವ (, śa, ṣa, sa, ya, la, va) has the exact same effect so ಹಂಸ (haṃsa) is "hausa", ಸಂಸ್ಕೃತ (saṃskṛta) is "sauskruta", etc. I had even changed () to "ru" but then changed it back as it doesn't seem to be done anywhere else). -- Bhagadatta (talk) 12:27, 23 September 2020 (UTC)
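The nasal + consonant cases listed in this comment can be summarised as a small classifier. This Python sketch describes pronunciation only, as the comment says the rule should; the function name and the returned labels are made up for the example, and the rows for ड/ढ and द/ध are filled in from the "and so on" using the usual homorganic nasals, which is an assumption beyond the two pairs named above.

```python
# Consonant classes used by the rule of thumb described in the thread.
VOICELESS_STOPS = set("कखटठतथपफ")
HOMORGANIC_NASAL = {
    "ग": "ङ", "घ": "ङ",   # named in the thread
    "ब": "म", "भ": "म",   # named in the thread
    "ड": "ण", "ढ": "ण",   # assumed from "and so on"
    "द": "न", "ध": "न",   # assumed from "and so on"
}
AFFRICATES_AND_SIBILANTS = set("चछजझशषस")


def anusvara_outcome(next_consonant):
    """Classify how an anusvara is pronounced before the given consonant."""
    if next_consonant in VOICELESS_STOPS:
        return "nasalised vowel"
    if next_consonant in HOMORGANIC_NASAL:
        return "nasal " + HOMORGANIC_NASAL[next_consonant]
    if next_consonant in AFFRICATES_AND_SIBILANTS:
        return "unpredictable"
    return "not covered"
```

For example, `anusvara_outcome("त")` corresponds to आंत and `anusvara_outcome("ग")` to भांगर. The "unpredictable" branch is the reason a rule-based kok-translit would keep running into counter-examples.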
@Bhagadatta: Yes, MOD:mr-translit is an awkward cross between transliteration and transcription. The reason for doing this awkward cross was that I thought MOD:mr-IPA was relying on MOD:mr-translit to show these processes using local correspondences = {[transliteration] = [IPA]}, and MOD:mr-IPA wasn't being used on the mainspace until recently. However, if MOD:mr-IPA can show these processes without relying on MOD:mr-translit, then such processes could possibly be removed from MOD:mr-translit. Using ċ and j̈ with manual transcriptions is another instance of this awkward cross. {{R:mr:Berntsen}} doesn't show any of these processes in its transliteration. If it is useful for MOD:mr-translit to show these processes then they could remain. For comparison, MOD:gu-translit has ['ઋ'] = 'ru' but also ['ૃ'] = 'ṛ'. MOD:hi-translit has ['ऋ'] = 'ŕ', and ['ृ'] = 'ŕ'. So the question that arises is: If MOD:mr-IPA can independently show these processes, then should these processes be removed from MOD:mr-translit? MOD:mr-IPA already attempts to show /t͡s/ and /d͡z/ independently of MOD:mr-translit. Kutchkutch (talk) 13:39, 23 September 2020 (UTC)
@Kutchkutch: I don't mind showing ċ and j̈ because Turner differentiates between the two. But if transliteration is taken to be solely the process of rendering one script into another without any regard to phonetics whatsoever then even ċ and j̈ may be dropped and we can let the pronunciation section show the difference. -- Bhagadatta (talk) 14:33, 23 September 2020 (UTC)
@Bhagadatta: I think Turner differentiates between the two because it attempts to show the /broad transcription/. As a comparative dictionary it wouldn't make sense to consider the way scripts render words. However, for a unidirectional bilingual dictionary in which the two languages have different scripts, the way scripts render words seems to be important. Perhaps the two pronunciations of चमचा (camcā) and मध्ये (madhye) illustrate why the transliteration should be free of these phonetics. For Konkani, perhaps खेळ्टा (kheḷṭā) with four pronunciations shows why transliteration should only rely on the script. Kutchkutch (talk) 06:58, 24 September 2020 (UTC)
@Kutchkutch: Agreed. So will you be replacing ċ and its voiced counterpart with c and j respectively? I'll do that with Konkani too. Also will mr-translit be edited to show ऋ as ṛ instead of ru? And what happens to anusvara + स etc? -- Bhagadatta (talk) 02:17, 27 September 2020 (UTC)
@Bhagadatta: Yes, all those changes should be made eventually unless someone (such as User:AryamanA) has a different opinion.
However, if they cause errors or ʻbreakʼ MOD:mr-IPA and/or MOD:mr-translit, then the changes would have to be put on hold. Changing the last two lines in local nasal_assim = {...} in MOD:mr-IPA for anusvara + etc. causes errors.
Removing ċ and j̈ from manual transliterations doesn't involve modules so this can be done for entries that have {{mr-IPA}}. For Konkani, removing ċ and j̈ from transliterations without indicating them in the Pronunciation sections might be a loss of information. If there are manual transliterations to override other processes such as schwa-dropping, then those could remain. How do you find all instances of manual transliterations? Template:tracking/headword/has-manual-translit/mr is only for headwords.
Since there's already significant coverage for Sanskrit words in Hindi and Sanskrit entries, making separate entries for Sanskrit borrowings isn't interesting unless they differ in some aspect that's not immediately obvious (such as in meaning or pronunciation). Replacing ['ृ'] = 'ru', with ['ृ'] = 'ŕ', seems to work fine for कर्तृत्व (kartŕtva), संस्कृत (sauskŕt), नेतृत्व (netŕtva), वृक्ष (vŕkṣa). (It would be helpful to have an automated CAT:Hindi terms spelled with ृ). What is the correct symbol to transliterate ['ऋ'] and ['ृ']? Is it ṛ, r̥ or ŕ? Various entries and languages have different standards. Hindi has ŕ and IAST has ṛ. Kutchkutch (talk) 08:51, 27 September 2020 (UTC)

@Kutchkutch: I found the entries at CAT:Terms with redundant transliterations/kok. It doesn't show manual translit in the headword though. So there must be like 300 odd Konkani entries not included here. There is no pressing need to empty the category or to remove manual translit at the earliest as I believe they will be removed as and when I work on the entries (providing pronunciation, moving, clean up etc). For now I just removed manual translit from some of the entries there to test how well the auto translit works.
All Konkani entries which need pronunciation have them or will have them very shortly so once that is done, I'll remove ċ and j̈ (In the earlier entries, I used ts for the former and z for the latter).
Hindi has ŕ for ऋ as the r with a dot below is already used by ड़. Using ṛ for ड़ is actually more consistent as a dot below is supposed to indicate retroflexion in t, d, n, s, l etc even in IAST. Only ṃ and ṛ have a dot below and do not indicate retroflexion. As for the former, m with a dot above is just as common. My personal preference would be to use r with the ring to show that it is a syllabic r, for Sanskrit. But IAST is well established and we can't change it. Since Marathi does not have ड़, there doesn't seem to be any harm in using ṛ unless you prefer ŕ for consistency with Hindi. -- Bhagadatta (talk) 10:13, 27 September 2020 (UTC)

@Bhagadatta: Yes, there's no pressing need to remove manual transliterations.
Manual transliterations ≠ Automated
CAT:Terms with manual transliterations different from the automated ones/kok
CAT:Terms with manual transliterations different from the automated ones/mr
Manual transliterations = Automated
CAT:Terms with redundant transliterations/kok
CAT:Terms with redundant transliterations/mr
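The split between the two kinds of category can be sketched in Python as a simple comparison. The category names are the ones listed above; the function name, the use of plain string equality after Unicode normalisation, and returning `None` when no manual value is given are assumptions for illustration, since the thread does not show how the site's modules actually decide.

```python
import unicodedata


def translit_category(lang, manual, automatic):
    """Return the tracking category for an entry's manual transliteration, if any."""
    if manual is None:
        return None  # nothing to track: the entry relies on auto-translit
    same = unicodedata.normalize("NFC", manual) == unicodedata.normalize("NFC", automatic)
    if same:
        return f"Terms with redundant transliterations/{lang}"
    return f"Terms with manual transliterations different from the automated ones/{lang}"
```

Entries landing in the "redundant" category are the ones whose manual value can be removed without changing what is displayed.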
I don't have a preference for how to transliterate ['ऋ'] and ['ृ']. MOD:mr-IPA has
["ṛ"] = "ɽ" and translit = gsub(translit, "ŕ", "ru") so changing the transliteration and keeping the same IPA would require more than just the replacement of two values in MOD:mr-translit. Kutchkutch (talk) 11:41, 27 September 2020 (UTC)
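Why the change is more than a two-value swap can be seen in a toy Python model of the dependency. Only the two rules quoted above are modelled (ŕ rewritten to ru, and ṛ mapped to ɽ); the function name is hypothetical, the order of the two steps is an assumption, and the result is not full IPA, just the effect of those two substitutions.

```python
import unicodedata


def toy_ipa_from_translit(translit):
    """Apply only the two substitutions quoted from the IPA module."""
    translit = unicodedata.normalize("NFC", translit)
    translit = translit.replace("\u0155", "ru")  # mirrors gsub(translit, "ŕ", "ru")
    return translit.replace("\u1e5b", "\u027d")  # mirrors ["ṛ"] = "ɽ"
```

If the transliteration of ृ were switched from ŕ to ṛ without touching the IPA side, a syllabic r would fall into the second rule and come out as the flap ɽ instead of ru, so both places would need to change together.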

Konkani Wikipedia

@Bhagadatta The Wikipedia links on Konkani entries are helpful, but clicking on one produces the link https://en.wikipedia.org/wiki/kok:Entry, which leads to a nonexistent English Wikipedia page since Konkani Wikipedia uses the code gom rather than kok. A possible workaround for now is {{wikipedia|gom:Moddganv|Moddganv}}, but the box would still say "English Wikipedia has an article on:". तियात्र (tiyātra) has entries in Devanagari and Latin scripts so it could possibly have both {{wikipedia|gom:तियात्र|तियात्र}} and {{wikipedia|gom:Tiatr|Tiatr}}. If the issue with the code is later fixed, then it would be helpful to have an automated list of Konkani entries with Wikipedia links to update. Kutchkutch (talk) 07:08, 28 September 2020 (UTC)

@Kutchkutch: Good catch, thanks for pointing it out! I also tried {{wikipedia|gom:Moddganv|Moddganv|lang=kok}} but it doesn't work. It's best to remove the link for now. -- Bhagadatta (talk) 07:41, 28 September 2020 (UTC)
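The code mismatch behind the broken link can be illustrated with a short Python sketch. The override table and the function name are hypothetical and do not describe how {{wikipedia}} is implemented; the sketch only assumes the usual `<code>.wikipedia.org/wiki/<title>` URL pattern and the fact stated above that Konkani Wikipedia uses gom.

```python
from urllib.parse import quote

# Hypothetical override table: dictionary language code -> Wikipedia subdomain code.
WIKIPEDIA_CODE_OVERRIDES = {"kok": "gom"}


def wikipedia_url(lang_code, title):
    """Build a Wikipedia URL, swapping in the subdomain code where it differs."""
    code = WIKIPEDIA_CODE_OVERRIDES.get(lang_code, lang_code)
    # Wiki titles use underscores for spaces; other characters are percent-encoded.
    return f"https://{code}.wikipedia.org/wiki/{quote(title.replace(' ', '_'))}"
```

With such a lookup, a Konkani entry would resolve to the gom subdomain instead of an English Wikipedia page prefixed with kok:.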

Konkani References

@Bhagadatta If you're not already aware of this list, perhaps it might be of some use to you:

https://gom.wiktionary.org/wiki/विक्शनरी:Strot Kutchkutch (talk) 13:26, 26 November 2020 (UTC)
@Kutchkutch: Thanks a lot!! 👍 I had come across Dantas's dictionary on Google books some time ago but did not know about the others. This is very useful, especially the one from archive.org !! -- Bhagadatta (talk) 14:19, 26 November 2020 (UTC)