A GRAMMAR OF MODERN INDO-EUROPEAN

1.7.2. Southern Indo-European Dialects

I. Greek

Ancient Greek dialects by 400 BC after R.D. Woodard (2008).

Greek is an Indo-European branch with a documented history of 3,500 years. Today, Modern Greek is spoken by 15 million people in Greece, Cyprus, the former Yugoslavia (especially in the FYROM), Bulgaria, Albania and Turkey.

The major dialect groups of the Ancient Greek period can be assumed to have developed not later than 1120 BC, at the time of the Dorian invasions, and their first appearances as precise alphabetic writing began in the 8th century BC. The ancient Greeks themselves considered there to be three major divisions of the Greek people, into Dorians, Aeolians, and Ionians (including Athenians), each with their own defining and distinctive dialects. Allowing for their oversight of Arcadian, an obscure mountain dialect, and Cyprian, far from the center of Greek scholarship, this division of people and language is quite similar to the results of modern archaeological and linguistic investigation.

Greek has been spoken in the Balkan Peninsula since 2000 BC. The earliest evidence of this is found in the Linear B tablets dating from 1500 BC. The later Greek alphabet is unrelated to Linear B, and was derived from the Phoenician alphabet; with minor modifications, it is still used today.

Linear B has roughly 200 signs, divided into syllabic signs with phonetic values and logograms with semantic values.

Mycenaean is the most ancient attested form of the Greek branch, spoken on mainland Greece and on Crete in the 16th to 11th centuries BC, before the Dorian invasion. It is preserved in inscriptions in Linear B, a script invented on Crete before the 14th century BC. Most instances of these inscriptions are on clay tablets found in Knossos and in Pylos. The language is named after Mycenae, the first of the palaces to be excavated.

The tablets long remained undeciphered, and every conceivable language was suggested for them, until Michael Ventris deciphered the script in 1952 and proved the language to be an early form of Greek. The texts on the tablets are mostly lists and inventories. No prose narrative survives, much less myth or poetry. Still, much may be glimpsed from these records about the people who produced them, and about the Mycenaean period on the eve of the so-called Greek Dark Ages.

Unlike later varieties of Greek, Mycenaean probably had seven grammatical cases: the nominative, the genitive, the accusative, the dative, the instrumental, the locative, and the vocative. The instrumental and the locative, however, gradually fell out of use.

NOTE. For the Locative in *-ei, compare di-da-ka-re, ‘didaskalei’, e-pi-ko-e, ‘Epikóhei’, etc. (in Greek there are syntactic compounds like puloi-genēs, ‘born in Pylos’); also, for remains of an Ablative case in *-ōd, compare (months’ names) ka-ra-e-ri-jo-me-no, wo-de-wi-jo-me-no, etc.

Proto-Greek, a southern PIE dialect, was spoken in the late 3rd millennium BC, roughly at the same time as North-West Indo-European and Proto-Indo-Iranian, most probably in the Balkans. It was probably the ancestor of Phrygian too, and possibly that of Ancient Macedonian, Dacian, Thracian, and arguably Armenian. The unity of Proto-Greek probably ended as Hellenic migrants, speaking the predecessor of the Mycenaean language, entered the Greek peninsula around the 21st century BC. They were then separated from the Dorian Greeks, who entered the peninsula roughly one millennium later, speaking a dialect that in some respects had remained more archaic.

NOTE. For Pelasgian and other Greek substrates as IE, some have cited different phonological developments in words like τύμβος (tumbos < PIE *d^hmb^hos) or πύργος (purgos < PIE *b^hrg^hos).

Proto-Greek was affected by a late Satemization trend, evidenced by the (post-Mycenaean) change of labiovelars into dentals before e (e.g. k^we → te “and”).

The primary sound changes from (laryngeal) PIE to Proto-Greek include:

· Aspiration of PIE intervocalic *s → PGk h.

NOTE. The loss of PIE prevocalic *s- was not completed entirely, famously evidenced by sus “sow”, dasus “dense”; sun “with”, sometimes considered contaminated with PIE *kom (cf. Latin cum) to Homeric / Old Attic ksun, is possibly a consequence of Gk. psi-substrate (See Villar).

· De-voicing of voiced aspirates: *b^h→p^h, *d^h→t^h, *g^h→k^h, *g^wh→k^wh.

· Dissimilation of aspirates (Grassmann’s law), possibly post-Mycenaean.

· PIE word-initial *j- (not *Hj-) is strengthened to PGk dj- (later Gk. ζ-).

· Vocalization of laryngeals between consonants and initially before consonants, i.e. *h₁→e, *h₂→a, *h₃→o.

NOTE. The evolution of Proto-Greek should be considered with the background of an early Palaeo-Balkan Sprachbund that makes it difficult to delineate exact boundaries between individual languages. The characteristically Greek representation of word-initial laryngeals by prosthetic vowels is shared by the Armenian language, which also shares other phonological and morphological peculiarities of Greek, vide infra.

· The sequence CRHC (where C = consonant, R = resonant, H = laryngeal) develops according to the laryngeal: PIE CRh₁C → PGk CRēC; PIE CRh₂C → PGk CRāC; PIE CRh₃C → PGk CRōC.

· The sequence PIE CRHV (where V = vowel) becomes PGk CaRV.

NOTE. It has also been proposed by Sihler (2000) that Vk^w→uk^w; cf. PIE *nok^wts, “night” → PGk nuk^wts → Gk. nuks/nukt-; cf. also *k^wek^wlos, “circle” → PGk k^wuk^wlos → Gk. kuklos; etc.
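The rules listed above can be tried out mechanically. The following is only a minimal sketch, assuming a simplified ASCII-style transcription in the notation of this section (b^h, g^wh, h₁, etc.), in which syllabic resonants are not specially marked. The function name `to_proto_greek` and the rule tables are hypothetical illustrations, not part of any established tool. Only the de-voicing of aspirates, the CRHC and CRHV treatments, and the prosthetic vowel from a word-initial laryngeal are modelled; *s-aspiration, Grassmann’s law and *j- strengthening are left out.

```python
import re

VOWELS = "aeiouāēīōū"
# Long vowel produced by each laryngeal in the CRHC rule.
LONG = {"h₁": "ē", "h₂": "ā", "h₃": "ō"}
# Prosthetic vowel produced by a word-initial laryngeal before a consonant.
SHORT = {"h₁": "e", "h₂": "a", "h₃": "o"}


def to_proto_greek(form: str) -> str:
    """Apply a few of the listed PIE -> Proto-Greek rules to one form."""
    # De-voicing of voiced aspirates: b^h, d^h, g^h, g^wh.
    for voiced, voiceless in (("g^wh", "k^wh"), ("b^h", "p^h"),
                              ("d^h", "t^h"), ("g^h", "k^h")):
        form = form.replace(voiced, voiceless)
    # CRHC -> CRēC / CRāC / CRōC, depending on the laryngeal.
    form = re.sub(rf"(?<=[^{VOWELS}])([rlmn])(h[₁₂₃])(?=[^{VOWELS}])",
                  lambda m: m.group(1) + LONG[m.group(2)], form)
    # CRHV -> CaRV.
    form = re.sub(rf"(?<=[^{VOWELS}])([rlmn])h[₁₂₃](?=[{VOWELS}])",
                  r"a\1", form)
    # Word-initial laryngeal before a consonant -> prosthetic vowel.
    form = re.sub(rf"^(h[₁₂₃])(?=[^{VOWELS}])",
                  lambda m: SHORT[m.group(1)], form)
    return form


for pie in ("b^herō", "gnh₁tos", "strh₃tos", "g^wrh₂us", "h₂nēr"):
    print(pie, "→", to_proto_greek(pie))
```

The sample inputs are schematic transcriptions chosen to trigger one rule each; the order of the substitutions matters, since the CRHC and CRHV rules must see the laryngeal before any other rule removes it.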

Later sound changes between Proto-Greek and the attested Mycenaean include:

o Loss of final stop consonants; final m→n.

o Syllabic m̥→am, and n̥→an, before resonants; otherwise both were nasalized m̥/n̥→ã→a.

o Loss of s in consonant clusters, with supplementary lengthening, e.g. esmi→ēmi.

o Creation of secondary s from clusters, ntja→nsa. Assibilation ti→si only in southern dialects.

o Mycenaean i-vocalism and replacement of the double consonant -kw- by -k^wk^w-.

NOTE. On the problematic case of common Greek ἵππος (^hippos), horse, derived from PIE and PGk ekwos, Meier-Brügger (2003): “the i-vocalism of which is best understood as an inheritance from the Mycenaean period. At that time, e in a particular phonetic situation must have been pronounced in a more closed manner, cf. di-pa i.e. dipas neuter ‘lidded container for drinking’ vs. the later δέπας (since Homer): Risch (1981), O. Panagl (1989). That the i-form extended to the entire Greek region may be explained in that the word, very central during Mycenaean rule of the entire region (2nd millennium BC), spread and suppressed the e-form that had certainly been present at one time. On the -pp-: The original double-consonance -ku̯- was likely replaced by -k^wk^w- in the pre-Mycenaean period, and again, in turn by -pp- after the disappearance of the labiovelars. Suggestions of an ancient -k^wk^w- are already given by the Mycenaean form as i-qo (a possible *i-ko-wo does not appear) and the noted double-consonance in alphabetic Greek. The aspiration of the word at the beginning remains a riddle”.
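Some of the later changes in the list above are simple enough to express as string rewrites. The sketch below is illustrative only: the function name `to_mycenaean_stage` is hypothetical, the input forms are schematic transcriptions, and the syllabic nasals, the assibilation ti→si and the i-vocalism are not modelled.

```python
import re

# Long counterparts used for the supplementary lengthening.
LENGTHEN = {"a": "ā", "e": "ē", "i": "ī", "o": "ō", "u": "ū"}
FINAL_STOPS = "ptkbdg"


def to_mycenaean_stage(form: str) -> str:
    """Apply a few of the listed Proto-Greek -> Mycenaean-stage rules."""
    # Loss of final stop consonants (a whole final stop cluster is dropped).
    form = form.rstrip(FINAL_STOPS)
    # Final m -> n.
    if form.endswith("m"):
        form = form[:-1] + "n"
    # Loss of s before a resonant, lengthening the preceding short vowel.
    form = re.sub(r"([aeiou])s(?=[mnlr])",
                  lambda m: LENGTHEN[m.group(1)], form)
    # Secondary s from the cluster ntj.
    return form.replace("ntj", "ns")


for pgk in ("galakt", "tod", "lukom", "esmi", "pantja"):
    print(pgk, "→", to_mycenaean_stage(pgk))
```

Here esmi and the cluster ntja are taken from the list itself; galakt, tod, lukom and pantja are schematic inputs assumed for illustration.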

Other features common to the earliest Greek dialects include:

· The PIE dative, instrumental and locative cases were syncretized into a single dative.

· Dialectal nominative plural in -oi, -ai fully replaces Late PIE common *-ōs, *-ās.

· The superlative in -tatos (<PIE *-tṃ-to-s) becomes productive.

· The peculiar oblique stem gunaik- “woman”, attested from the Thebes tablets, is probably Proto-Greek; it also appears, at least as gunai-, in Armenian.

· The pronouns houtos, ekeinos and autos are created. Use of ho, hā, ton as articles is post-Mycenaean.

· The first person middle verbal desinences -mai, -mān replace -ai, -a. The third singular pherei is an analogical innovation, replacing the expected PIE *b^héreti, i.e. Dor. *phereti, Ion. *pheresi.

· The future tense is created, including a future passive, as well as an aorist passive.

· The suffix -ka- is attached to some perfects and aorists.

· Infinitives in -ehen, -enai and -men are also common to Greek dialects.

II. Armenian

Armenian is an Indo-European language spoken in the Armenian Republic, as well as in the region of Nagorno-Karabakh, and also used by ethnic Armenians in the Diaspora.

Distribution of ethnic Armenians in the 20th c.

Armenian has traditionally been regarded as a close relative of Phrygian and as apparently closely related to Greek, with which it shares major isoglosses. The Graeco-Armenian hypothesis proposes a close relationship to the Greek language, placing both in the larger context of the Palaeo-Balkan languages (notably including Phrygian, which is widely accepted as an Indo-European language particularly close to Greek, and sometimes Ancient Macedonian), consistent with Herodotus’ description of the Armenians as descendants of Phrygian colonists.

NOTE. That traditional linguistic theory, proposed by Pedersen (1924), establishes a close relationship between the two original communities, Greek and Armenian, both departing from a common subdialect of IE IIIa (Southern Dialect of Late PIE). That view, accepted for a long time, was rejected by Clackson (1994) in The linguistic relationship between Armenian and Greek, which, supporting the Graeco-Aryan linguistic hypothesis, denies that the coincidences between Armenian and Greek amount to more than those found in the comparison of any other pair of IE languages. Those findings are supported by Kortlandt in Armeniaca (2003), in which he proposes an old Central IE continuum Daco-Albanian / Graeco-Phrygian / Thraco-Armenian. Adrados (1998) considers an older Southern continuum Graeco-[Daco-]Thraco-Phrygian / Armenian / Indo-Iranian. Olteanu (2009) proposes a Graeco-Daco-Thracian language.

The earliest testimony of the Armenian language dates to the 5th century AD, the Bible translation of Mesrob Mashtots. The earlier history of the language is unclear and the subject of much speculation. It is clear that Armenian is an Indo-European language, but its development is opaque.

NOTE. Proto-Armenian sound-laws are varied and eccentric, such as IE *dw- yielding Arm. k-, and in many cases still uncertain. In fact, that phonetic development is usually seen as *dw- to erk-, based on the PIE numeral *dwo-, “two”, a reconstruction that Kortlandt (ibidem) dismisses, proposing alternative etymologies for the usual examples.

Armenian manuscript, ca. 5th-6th c.

PIE voiceless stops are aspirated in Proto-Armenian, a circumstance that gave rise to the Glottalic theory, which postulates that this aspiration may have been sub-phonematic already in Proto-Indo-European. In certain contexts, these aspirated stops are further reduced to w, h or zero in Armenian – so e.g. PIE *p’ots, into Arm. otn, Gk. pous, “foot”; PIE *t’reis, Arm. erek’, Gk. treis, “three”.

The reconstruction of Proto-Armenian being very uncertain, there is no general consensus on the date range when it might have been alive. If Herodotus is correct in deriving Armenians from Phrygian stock, the Armenian-Phrygian split would probably date to between roughly the 12th and 7th centuries BC, but the individual sound-laws leading to Proto-Armenian may have occurred at any time preceding the 5th century AD. The various layers of Persian and Greek loanwords were likely acquired over the course of centuries, during Urartian (pre-6th century BC), Achaemenid (6th to 4th c. BC; Old Persian), Hellenistic (4th to 2nd c. BC; Koine Greek) and Parthian (2nd c. BC to 3rd c. AD; Middle Persian) times.

Grammatically, early forms of Armenian had much in common with classical Greek and Latin, but the modern language (like Modern Greek) has undergone many transformations. Interestingly enough, it shares with the Italic dialects the secondary IE suffix *-tjōn, extended from *-ti-, cf. Arm. թյուն (t’youn).

III. Indo-Iranian

The Indo-Iranian or Aryan language group constitutes the easternmost extant branch of the Indo-European family of languages. It consists of two main language groups, Indo-Aryan and Iranian, and probably Nuristani; Dardic is usually classified within the Indic subgroup.

Map of the Sintashta-Petrovka culture (red), its expansion into the Andronovo culture during the 2nd millennium BC, showing the overlap with the BMAC in the south. The location of the earliest chariots is shown in purple.

The contemporary Indo-Iranian languages form the second largest sub-branch of Indo-European (after North-West Indo-European), with more than one billion speakers in total, stretching from Europe (Romani) and the Caucasus (Ossetian) to East India (Bengali and Assamese). The largest in terms of native speakers are Hindustani (Hindi and Urdu, ca. 540 million), Bengali (ca. 200 million), Punjabi (ca. 100 million), Marathi and Persian (ca. 70 million each), Gujarati (ca. 45 million), Pashto (ca. 40 million), Oriya (ca. 30 million), Kurdish and Sindhi (ca. 20 million each).

Proto-Indo-Iranians are commonly identified with the bearers of the Andronovo culture, and their homeland with an area of the Eurasian steppe that borders the Ural River on the west, the Tian Shan on the east (where the Indo-Iranians took over the area occupied by the earlier Afanasevo culture), and Transoxiana and the Hindu Kush on the south. Historical linguists broadly estimate that a continuum of Indo-Iranian languages probably began to diverge by 2000 BC, preceding both the Vedic and Iranian cultures. A two-wave model of Indo-Iranian expansion has been proposed (see Burrow 1973 and Parpola 1999), strongly associated with the chariot.

Aryans spread into the Caucasus, the Iranian plateau, and South Asia, as well as into Mesopotamia and Syria, introducing the horse and chariot culture to this part of the world. Sumerian texts from EDIIIb Ngirsu (2500-2350 BC) already mention the ‘chariot’ (gigir) and Ur III texts (2150-2000 BC) mention the horse (anshe-zi-zi). They left linguistic remains in a Hittite horse-training manual written by one “Kikkuli the Mitannian”. Other evidence is found in references to the names of Mitanni rulers and the gods they swore by in treaties; these remains are found in the archives of the Mitanni’s neighbors, and the time period for this is about 1500 BC.

The standard model for the entry of the Indo-European languages into South Asia is that the First Wave went over the Hindu Kush into the headwaters of the Indus and later the Ganges. The earliest stratum of Vedic Sanskrit, preserved only in the Rigveda, is assigned to roughly 1500 BC. From the Indus, the Indo-Aryan languages spread from ca. 1500 BC to ca. 500 BC, over the northern and central parts of the subcontinent, sparing the extreme south. The Indo-Aryans in these areas established several powerful kingdoms and principalities in the region, from eastern Afghanistan to the doorstep of Bengal.

The Second Wave is interpreted as the Iranian wave. The Iranians would take over all of Central Asia, Iran, and for a considerable period, dominate the European steppe (the modern Ukraine) and intrude north into Russia and west into central and eastern Europe well into historic times and as late as the Common Era. The first Iranians to reach the Black Sea may have been the Cimmerians in the 8th century BC, although their linguistic affiliation is uncertain. They were followed by the Scythians, who are considered a western branch of the Central Asian Sakas, and the Sarmatian tribes.

The Medes, Parthians and Persians begin to appear on the Persian plateau from ca. 800 BC, and the Achaemenids replaced Elamite rule from 559 BC. Around the first millennium of the Common Era, the Iranian Pashtuns and Baloch began to settle on the eastern edge of the Iranian plateau, on the mountainous frontier of northwestern Pakistan in what is now the North-West Frontier Province and Balochistan, displacing the earlier Indo-Aryans from the area.

The main changes separating Proto-Indo-Iranian from Late PIE include:

· Early Satemization trend:

o Loss of PIE labiovelars into PII plain velars: *k^w→k, *g^w→g, *g^wh→g^h.

o Palatalization of PII velars in certain phonetic environments: *k→ķ, *g→ģ, *g^h→ģ^h.

· Loss of laryngeals: *HV→a, *VH→ā. Interconsonantal *H → i, cf. *ph₂tḗr → PII pitr.

NOTE. A common exception is Brugmann’s law. For those linguists who consider the laryngeal loss to have occurred already in Late PIE, Aryan vocalism is described as a collapse of PIE ablauting vowels into a single PII vowel; i.e. *e,*o→a; *ē,*ō→ā.

· Grassmann’s law, Bartholomae’s law, and the Ruki sound law were complete in PII.

NOTE. For a detailed description of those Indo-Iranian sound laws and the “satemization” process, see Appendix II. For Ruki sound law, v.s. Baltic in §1.7.1.

· Sonorants are generally stable in PII, but for the confusion *l/*r, which in the oldest Rigveda and in Avestan gives a general PIE *l̥ → PII r̥, as well as l→r.

Among the sound changes from Proto-Indo-Iranian to Indo-Aryan is the loss of the voiced sibilant *z; among those to Iranian is the de-aspiration of PIE voiced aspirates.
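Two of the Proto-Indo-Iranian changes listed above, the merger of labiovelars with plain velars and the collapse *e,*o→a; *ē,*ō→ā, lend themselves to a short illustration. The sketch below assumes schematic post-laryngeal input forms in the notation of this section; the name `to_proto_indo_iranian` is hypothetical. Palatalization, Brugmann’s law, the Ruki law and the *l/*r confusion are deliberately not modelled, so forms with a velar before an original front vowel would come out wrong.

```python
# Labiovelars merge with plain velars; g^wh is listed first so that it is
# rewritten before the shorter g^w rule could break it up.
DELABIALIZE = (("g^wh", "g^h"), ("k^w", "k"), ("g^w", "g"))
# Collapse of the ablauting vowels: e, o -> a and ē, ō -> ā.
VOWEL_MERGER = str.maketrans({"e": "a", "o": "a", "ē": "ā", "ō": "ā"})


def to_proto_indo_iranian(form: str) -> str:
    """Apply labiovelar merger and vowel collapse to one schematic form."""
    for labiovelar, velar in DELABIALIZE:
        form = form.replace(labiovelar, velar)
    return form.translate(VOWEL_MERGER)


for pie in ("esti", "b^hereti", "g^whormos", "g^wous"):
    print(pie, "→", to_proto_indo_iranian(pie))
```

The ordering of the labiovelar rules is the only design point worth noting: applying g^w→g before g^wh→g^h would leave a stray h behind.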

Current distribution of Iranian dialects.

A. Iranian

The Iranian languages are a branch of the Indo-Iranian subfamily, with an estimated 150-200 million native speakers today, the largest being Persian (ca. 60 million), Kurdish (ca. 25 million), Pashto (ca. 25 million) and Balochi (ca. 7 million).

Proto-Iranian dates to some time after the Proto-Indo-Iranian breakup, or the early second millennium BC, as the Old Iranian languages began to break off and evolve separately as the various Iranian tribes migrated and settled in vast areas of southeastern Europe, the Iranian plateau, and Central Asia. The oldest Iranian language known, Avestan, is mainly attested through the Avesta, a collection of sacred texts connected to the Zoroastrian religion.

Linguistically, the Old Iranian languages are divided into two major groups, Eastern and Western, with several subclasses. The so-called Eastern group includes Scythian, even though the Scyths lived in a region extending further west than the Western group. Within the Western group, the northwestern branch included Median and Parthian, while the southwestern branch included Old Persian.

B. Indo-Aryan

The Indo-Aryan or Indic languages are a branch of the Indo-Iranian subfamily with a total number of native speakers of more than 900 million. The largest languages in terms of native speakers are Hindustani (about 540 million), Bengali (about 200 million), Punjabi (about 100 million), Marathi (about 90 million), Gujarati (about 45 million), Nepali (about 40 million), Oriya (about 30 million), Sindhi (about 20 million) and Assamese (about 14 million).

The earliest evidence of the group is from Vedic Sanskrit, the language used in the ancient preserved texts of the Indian subcontinent, the foundational canon of Hinduism known as the Vedas. The Indo-Aryan superstrate in Mitanni is of similar age to the Rigveda, but the only evidence is a number of loanwords.

In the 4th c. BC, the Sanskrit language was codified and standardised by the grammarian Panini; the resulting standard is called “Classical Sanskrit” by convention. Outside the learned sphere of Sanskrit, vernacular dialects (Prakrits) continued to evolve and, in medieval times, diversified into various Middle Indic dialects.

C. Nuristani

The recent view is to classify Nuristani as an independent branch of the Indo-Iranian language family, instead of as part of the Indic or Iranian group. In any event, it would seem that its speakers arrived in their present homeland at a very early date, and never entered the western Punjab of Pakistan.