The 12 most spoken languages - 1999 Loglan/Lojban Project baseline values

Following is data derived from the 1999 Encyclopedia Britannica Book of the Year regarding language populations for the top 12 languages, which are the baseline set for the Loglan/Lojban project (only the top 6 are used for Lojban gismu making). For comparison, summary numbers from 1995 are also shown, along with the amount of change, as well as the numbers used in the original 1987 Lojban word-making. I think that these numbers serve as a fairly authoritative estimate of the number of speakers of the 12 languages. Unlike other published estimates, my methodology in generating the numbers is open to inspection, along with the source data I used for individual countries.

The number of 2nd language speakers is determined by taking actual counts of such 2nd language (or creole) speakers generated by official sources and reported in the Britannica. An increment is added to reflect 2nd language literacy in the official language of a country, presuming that all official languages of the country are taught in the schools, based on official-source literacy figures. Finally, for officially Arabic/Muslim countries, the status of Arabic as a religious language that is taught to believers is used to generate an additional increment. This is most significant for Iran, where the religion is heavily state supported even though the official language is not Arabic and there are few native speakers of that language.

Having determined these numbers, the Lojban gismu-making weights are determined by summing the number of native speakers and 1/2 the total from all 3 methods of estimating 2nd language speakers (since these 3 methods include an elimination of overlap in the calculation). The total of 1st and all 2nd language speakers is not used in the Lojban algorithm.
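As a minimal sketch of the weighting step just described, the following Python recomputes the six weights from the 1999 summary figures given in this document. The function and variable names are illustrative assumptions, not part of any Lojban project tooling.

```python
# 1999 summary figures in millions, copied from this document:
# language -> (native speakers, total of the 2nd-language estimates)
SPEAKERS_1999 = {
    "Chinese": (837.161, 349.964 + 21.529),
    "Hindi": (455.352, 221.315 + 61.362),
    "English": (346.126, 381.337 + 69.936),
    "Spanish": (341.977, 15.686 + 13.891),
    "Arabic": (230.533, 0 + 25.387 + 50.801),
    "Russian": (204.517, 0 + 86.956),
}


def weighted_speakers(native, second):
    """Native speakers plus half of the 2nd-language total."""
    return native + 0.5 * second


def gismu_weights(speakers):
    """Normalize the weighted counts so that the weights sum to 1.0."""
    scores = {lang: weighted_speakers(n, s) for lang, (n, s) in speakers.items()}
    total = sum(scores.values())
    return {lang: score / total for lang, score in scores.items()}


if __name__ == "__main__":
    for lang, weight in gismu_weights(SPEAKERS_1999).items():
        print(f"{lang:8s} {weight:.3f}")
```

Rounded to three places, these values should agree with the weight column of the 1999 summary; because each weight is rounded separately, the printed weights need not sum to exactly 1.000.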

The 1999 numbers are summarized as follows (in millions). Note that Arabic has now passed Russian and moved into 5th place among the languages used. This, in addition to Hindi passing English several years ago, suggests that the gismu list would look somewhat different if remade from scratch today, since when languages are close together in population, a change in order will significantly affect tie-breaking results in the scoring of words.

Given that the Lojban gismu list is baselined, these numbers are primarily of academic interest. However, they can be used in making fu'ivla (borrowings) where it is not clear that a particular language root is appropriate. Using an algorithm like the gismu algorithm (though not necessarily with the same constraints on word form) would give an "international" or "Lojbanic" root to use as the basis of a fu'ivla.

weight = normalized weight for 6 languages based on 1.0 total.

              native    2nd/creole+literacy+religion  Total speakers  native+1/2*2nd  weight  (change from 1995)  (change from 1987)
Chinese       837.161   349.964+21.529                1208.654        1022.908        .334    (-.013)             (-.026)
Hindi         455.352   221.315+61.362                738.029         596.690         .195    (-.001)             (+.039)
English       346.126   381.337+69.936                797.399         571.763         .187    (+.027)             (-.021)
Spanish       341.977   15.686+13.891                 371.553         356.765         .116    (-.007)             (+.000)
Arabic        230.533   0+25.387+50.801               306.721         268.627         .088    (+.003)             (+.015)
Russian       204.517   0+86.956                      291.473         247.995         .081    (-.008)             (-.006)
Bengali       205.66    0+1.104                       206.764         206.212
Portuguese    169.154   12.048+7.070                  188.272         178.713
Japanese      126.407   0+1.118                       127.525         126.966
French        80.077    60.391+24.105                 164.573         122.325
Malay-Indon.  40.439    0+157.457                     197.896         119.167
German        92.247    2.168+10.019                  104.434         98.340

The 1995 numbers are summarized as follows (in millions):

weight = normalized weight for 6 languages based on 1.0 total.

              native    2nd/creole+literacy+religion  Total speakers  native+1/2*2nd  weight
Chinese       801.552   314.039+25.225                1140.816        971.184         .347
Hindi         413.231   66.39+206.000                 685.621         549.426         .196
English       334.786   187.907+59.895                582.588         448.343         .160
Spanish       330.999   12.644+11.531                 355.174         343.086         .123
Russian       210.948   0+77.965                      288.913         249.930         .089
Arabic        205.272   0+19.705+46.991               271.968         238.620         .085
Bengali       183.860   0+.927                        184.787         184.323
Portuguese    166.662   6.294+10.028                  182.984         174.823
Japanese      125.086   0                             125.086         125.086
French        74.529    41.198+29.477                 145.204         109.866
Malay-Indon.  37.752    137.526                       175.278         106.515
German        94.768    1.714+8.511                   104.993         99.880

For comparison, here are the total speakers from the 1987 World Almanac, the comparable figures from the 1997 World Almanac, and the numbers used in the original 1987 Lojban gismu-remaking effort, which were based on the 1985 Britannica BotY. Note that Hindi passed English in about 1989 due to rapidly increasing numbers of native speakers along with a major increase in literacy, which is continuing. A significant part of the drop in native English, French, German, and Indonesian speakers is due to the switching of creole speakers and some estimates of non-native official language speakers (especially in Africa) from native to 2nd language totals.

               1987           1997           1987 gismu-remaking                1999
               World Almanac  World Almanac  native   2nd     n+1/2s   weight   weight
Chinese        788            853/999        752.1    319.1   911.7    .360     .334
English        420            330/487        366.5    322.4   527.7    .208     .187
Hindi          382            348/457        294      200.3   394.2    .156     .195
Spanish        296            346/401        264.7    58.2    293.8    .116     .116
Russian        285            168/280        164.3    109.7   219.12   .087     .081
Arabic         177            195/230        155.9    57.7    184.8    .073     .088
Bengali        171            197/204        87       80.8    127.4
Portuguese     164            173/188        110.4    45.5    133.2
Malay/Indon.   128            54/164         121.1    39.5    140.9
Japanese       122            125/126        120.1    0.6     120.4
German         118            98/124         105.4    18.3    114.6
French         114            74/126         81.1     75.5    118.9

Following are the 6 columns of 1999 raw data, by language, by country.

In the raw data, Column 1 is native speakers of the language from the Britannica BotY. Column 2 is non-native speakers, speakers of the language as a lingua franca, and speakers of creoles and other significantly non-standard dialects (e.g. Catalan and Galician for Spanish, Luxembourgish for German, and non-Mandarin Chinese). These numbers also come straight from the BotY. Ukrainian and Belarusian speakers are considered native Russian speakers, since the differences are more political than linguistic (though in the longer term, Ukrainian speakers probably should be switched into the 2nd language column). Urdu is considered a native dialect of Hindi.

What is rarely carried in the BotY are speakers of the official language of a country as a second language. For example, how many non-native-English speakers in the UK speak English as a second language? The answer is something less than 100%, so I used the literacy percentage multiplied by the number of non-native-or-creole speakers of an official language. For European countries, literacy is close to 100%, but for 3rd world countries, the number is far less. For countries with 2 official languages, I further reduced the result of the above calculation by the ratio of the speakers of the official language divided by the total speakers of all official languages. The result of this calculation is considered as an increment to any number of 2nd language speakers given in column 2. That increment is shown in column 3, and the data used in the calculation is shown in column 4.
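The column 3 calculation can be sketched in Python as follows. This follows the shape of the column 4 formulas in the raw data (for example, China and Canada); the function name, the parameter names, and the clamping of negative results to zero (my reading of the rows marked "---") are assumptions made for illustration.

```python
def literacy_increment(population, native, literacy,
                       counted_second=0.0, official_share=1.0):
    """Estimate the column 3 increment, in millions.

    population     -- total population of the country
    native         -- native speakers of the language (column 1)
    literacy       -- literacy rate as a fraction between 0 and 1
    counted_second -- 2nd-language/creole speakers already counted (column 2)
    official_share -- this language's share of all official-language
                      speakers (1.0 when there is one official language)
    """
    estimate = (population - native) * official_share * literacy - counted_second
    # Assumption: a negative result means column 2 already covers everyone,
    # so no increment is added.
    return max(estimate, 0.0)


if __name__ == "__main__":
    # China: (1242.980-817.0) * .815 - 325.99
    print(round(literacy_increment(1242.980, 817.0, 0.815,
                                   counted_second=325.99), 3))
    # Canada, English: (30.677-19.328) * 19.328/(19.328+7.693) * .966
    share = 19.328 / (19.328 + 7.693)
    print(round(literacy_increment(30.677, 19.328, 0.966,
                                   official_share=share), 3))
```

With the China and Canada (English) inputs taken from the raw data, this sketch gives values matching the column 3 entries of 21.184 and 7.842.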

(In previous iterations of these statistics, I have used variations on this method to estimate 2nd language speakers. Creole speakers were originally treated as native speakers, though I have since learned that the creoles are sufficiently different from the standard language that a native speaker level of knowledge of the standard language is improbable.)

The former Soviet states are a special case, in that Russian (or a dialect) is an official language in only 3 of the current countries, but the educational system up to a couple of years ago was built around Russian as the official language. Because of this, I calculated 2nd language Russian speakers, as if it *were* the official language, but then subtracted the number of native Russian speakers in the country from this total to determine the column 3 number. In future years, this number may need to be slowly prorated downward as a new education system supplants the Russian one, but this should not have significant effect for at least a decade, as the older 2nd language Russian speakers will probably retain their educated knowledge of the language for as long as Russia is the dominant economic power of the region.
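For the former Soviet states where Russian is not an official language, the column 3 formulas in the raw data take the form population * literacy - native. A small Python sketch of that special case follows; the function name is a hypothetical one chosen for illustration.

```python
def former_soviet_increment(population, native_russian, literacy):
    """2nd-language Russian speakers, in millions, treating Russian as if
    it were the official language and then removing native speakers."""
    return population * literacy - native_russian


if __name__ == "__main__":
    # Inputs copied from the Russian raw data: (population, native, literacy)
    rows = {
        "Azerbaijan": (8.070, 0.23, 0.973),
        "Estonia": (1.447, 0.47, 0.997),
        "Kazakhstan": (15.797, 6.43, 0.975),
    }
    for country, (pop, native, lit) in rows.items():
        print(f"{country:12s} {former_soviet_increment(pop, native, lit):.3f}")
```

A future proration factor, as suggested above, could be applied as one more multiplier on the result; no such factor is used in the 1999 figures.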

Columns 5 and 6 exist for Arabic only, and are an increment based on countries in which Arabic is the official language or the Muslim religion is militantly supported by the government (Iran being the major example). In this case, I determined if there was an excess of followers of the Muslim religion above the total number of 1st and 2nd language speakers of Arabic determined in columns 1-4. This excess was then multiplied by the literacy rate to get a guesstimate of non-Arabic native speakers who might still have considerable knowledge of the Arabic language through religious training. I did not calculate a religion-based number for countries that are Muslim, but which are unlikely to have government-sponsored teaching of the language (e.g. Indonesia).
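The religion-based increment for Arabic can be sketched the same way. The names below are illustrative assumptions; the arithmetic follows the column 6 formulas in the Arabic raw data, with a zero result wherever there is no excess of Muslims over the speakers already counted.

```python
def religion_increment(muslims, native, second_total, literacy):
    """Column 5 estimate, in millions, of non-native speakers with
    religious training in Arabic.

    muslims      -- followers of the Muslim religion in the country
    native       -- native Arabic speakers (column 1)
    second_total -- 2nd-language speakers already counted (columns 2 and 3)
    literacy     -- literacy rate as a fraction between 0 and 1
    """
    excess = muslims - native - second_total
    return max(excess, 0.0) * literacy


if __name__ == "__main__":
    # Algeria: (30.02 - 25.84 - 2.590) * .616
    print(round(religion_increment(30.02, 25.84, 2.590, 0.616), 3))
    # Iran: (60.97 - 1.33) * .721
    print(round(religion_increment(60.97, 1.33, 0.0, 0.721), 3))
    # Egypt: no excess of Muslims over counted speakers
    print(religion_increment(56.30, 62.5, 0.391, 0.514))
```

With these inputs the sketch gives values matching the column 5 entries for Algeria (.979), Iran (43.000), and Egypt (.0).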

Chinese (Mandarin) Cantonese/undiff. other

Australia .098 .214

Brunei .051

Cambodia .330

Canada .322

China 817.000 325.990 21.184 (1242.980-817.0) * .815 - 325.99

Costa Rica .007

Fr. Polynesia .013

Guam .002

Hong Kong .074 6.454 --- (6.66-0.074) * 7.59/(7.59+2.1) * .922 - 6.454

Japan .240

N.Korea .030

S.Korea .050

Macau .005 .410 (.426-.005) * .415/.425 * .895 - .410

Malaysia 2.000

Mauritius .004

Nauru .0009

N. Marianas I. .0047

Palau .0003

Panama .008

Philippines .070

Reunion .020

Singapore 1.371 1.070 .345 (3.164-1.371) * 3.812/(1.183+.446+2.441+.235) * .891 -1.070

Taiwan 4.390 16.970 --- (21.843-4.390) * .940 - 16.97

Thailand 7.420

USA 1.520

Vietnam 1.070

837.161 349.964 21.529

1022.908 185.747

Hindi/Urdu (Nepali Pahari/Bhojpuri/Maithili in Mauritius/Nepal/Bhutan)

Bhutan .220

Fiji .347

India 442.620 206.78 11.798 (984.004-442.62) * 649.4/(649.4+187.0) *.520 - 206.78

Jamaica .050

Mauritius .021 .245

Nepal .880 14.29

Pakistan 10.780 49.563 (141.900-10.78) * .378

Trinidad .044

USA .390

455.352 221.315 61.362

596.690 141.338

English

Amer. Samoa .002 .061

Antigua .066 .003

Aruba .008

Australia 15.204 2.896 .607 (18.725-15.204) * .995 - 2.896

Bahamas .260 .032 (.293-.260) * .982

Bangladesh 3.300

Barbados .252 creole .006 .265 * .974 -.252

Belize .119 .061 creole .021 (.235-.119) * .703 -.061

Bermuda .062

Botswana .580 .431 1.448 * .698 - .580

Brunei .120

Cameroon 7.500 2.028 15.029 * .634 - 7.500

Canada 19.328 7.842 (30.677-19.328) * 19.328/(19.328+7.693) * .966

Colombia .050 creole

Costa Rica .071 creole

Denmark .024

Dominica .076 creole

Fiji .160 .566 .793 * .916 - .160

France .080

Gambia .499 1.292 * .386

Ghana 1.290 10.641 18.497 * .645 - 1.29

Gibraltar .024 .003 (.0271-.024) * .99

Grenada .100

Guam .055 .092 --- (.148-.055) * .99 - .092

Guernsey .062

Guyana .746 .035 (.782-.746) * .981

Honduras .011 creole

Hong Kong .147 1.953

India .210 186.790 --- (984.004-.210) * 187/(187+649.4) * .520 -186.79

Ireland 3.590 .043 (3.647-3.590) * 3.590/(3.590+1.190) * 1.000

Isle of Man .073

Jamaica 2.400

Japan .080

Jersey .086

Kenya 2.600 .193 28.337 * 2.6/20.6 * .781 - 2.6

Kiribati .021 .055 .084 * .900 - .021

Lesotho .500 --- 2.09 * .713 * .5/2.28 - .5

Liberia .55+2.222 creole

Luxembourg .004

Macau .002

Malawi .510 5.040 9.84 * .564 - .51

Malaysia .360 6.340

Malta .008 .008 (.377-.008) * .008/.369 * .96

Marshall Isl .0628

Mauritius .002

Micronesia .0005

Monaco .002

Namibia .013 .297

Nauru .0008 .0096

Nepal 6.500

Nether Antill .017

New Zealand 3.457 .329 (3.801-3.457) * 3.457/(3.457+.161) * 1.0

Nicaragua .027 creole

Nigeria 50.0 creole 13.114 110.532 *.571 - 50.0

N Mariana Isl .0032 .0571 .004 (.0666-.0032) * .963 - .0571

Norway .024

Pakistan 16.000

Palau .0006 .0174

Panama .387 creole

Papua New Guin .07+2.990 creole .261 4.60 * .722 - 3.060

Philippines 38.000 6.243 73.131 * .946 * 38.0/(38.0+21.42) - 38.0

Puerto Rico 1.794

St Kitts Nevis .042

St Lucia .157

St. Vincent .112 .001 (.113-.112) * .960

Samoa .090

Seychelles .003 .025

Sierra Leone 4.400 --- 4.577 * .314 - 4.4

Singapore 1.183 --- 3.164 * 1.183/(1.183+.446+2.441+.235) * .891 - 1.183

Solomon Isl .158 .072 .426 * .541 - .158

South Africa 3.990 3.013 (42.835-3.99) * 3.99/(3.99+6.47+.64+1.11+7.5+9.6+4.2+2.96+3.08+1.8+.73) * .818

Sri Lanka 1.930

Surinam .400

Swaziland .040 .701 .966 * .767

Sweden .032

Tanzania 3.300 30.609 * 3.3/(2.2+28.0) * .678 -3.3

Tonga .029 .062 .098 * .928 - .029

Trinidad 1.235 creole .013 1.275 * .979 - 1.235

Tunisia .300

Tuvalu .010 .0104 * .950

Uganda 2.400

Unit Kingdom 57.520 1.606 (59.126-57.52) * 1.0

USA 232.910 29.090 6.581 (270.262-232.91) * .955 - 29.09

Vanuatu .060 .120

Virgin Isl .096 .020 (.118-.096) * .897

Zambia .100 1.700 5.620 (9.461 - .1) * .782 - 1.7

Zimbabwe .250 4.950 4.236 (11.044 -.25) * .851 - 4.95

346.126 381.337 69.936

571.763 225.636

Spanish (2nd includes Catalan/Galician)

Andorra .030 .020

Argentina 34.980 1.101 (36.125-34.98) * .962

Aruba .007

Australia .098

Belgium .050

Belize .074 .056

Bolivia 6.980 .492 (7.957-6.98) * 6.98/(6.98+1.82+2.71) * .831

Canada .101

Chile 13.290 1.458 (14.822-13.29) * .952

Colombia 37.320 .333 (37.685-37.32) * .913

Costa Rica 3.445 .083 (3.533-3.445) * .948

Cuba 11.116

Dominican Rep 7.730 .126 (7.883-7.73) * .821

Ecuador 11.320 .770 (12.175-11.32) * .901

El Salvador 5.752

Equat. Guinea .178 .454 * .785 * 1/2

France .220 .260

Guatemala 6.990 2.119 (10.802-6.990) * .556

Honduras 5.752 .121 (5.919-5.752) * .727

Italy .030

Mexico 88.270 6.150 .624 (95.830-88.27) * .896 - 6.15

Nicaragua 4.648 .076 (4.763-4.648) * .657

Panama 2.125 .583 (2.767-2.125) * .908

Paraguay 2.879 .827 (5.223-2.879) * 2.879/(2.879+4.636) * .921

Peru 19.790 3.599 (24.801-19.79) * 19.79/(19.79+4.65) * .887

Puerto Rico 3.718 .041 (3.786-3.718) * .897 * 3.718/(3.718+1.794)

Spain 29.290 9.170 .558 (39.371-29.29) * .965 - 9.17

Sweden .056

USA 20.340

Uruguay 3.080 .132 (3.216-3.08) * .973

Venezuela 22.510 .667 (23.242-22.51) * .911

Virgin Islands .016

341.977 15.686 13.891

356.765 14.788

Russian/Ukrainian/Belarusian

Armenia 3.754 3.800 * .988

Australia .034

Azerbaijan .230 7.622 8.070 * .973 - .23

Belarus 10.120 .113 (10.235-10.12) * .979

Canada .283

Czech .013

Estonia .470 .973 1.447 *.997 - .47

Finland .018

Georgia .480 4.924 5.431 *.995 - .480

Israel .520

Kazakhstan 6.430 8.972 15.797 *.975 - 6.43

Kyrgyzstan .840 3.710 4.691 *.970 - 0.84

Latvia .970 1.463 2.445 *.995 - 0.97

Lithuania .390 3.295 3.704 *.995 - .39

Moldova .458 3.645 4.243 *.967 - .458

Poland .420

Romania .094

Russia 129.480 17.033 (146.861-129.48) * .98

Slovakia .034

Tajikistan .590 5.381 6.112 *.977 - .590

Turkmenistan .343 4.279 4.731 *.977 - .343

Ukraine 49.200 1.084 (50.302-49.20) * .984

USA .390

Uzbekistan 2.710 20.706 24.091 *.972 - 2.71

204.517 86.956

247.995 43.478

Arabic

Algeria 25.840 2.590 (30.045-25.84) * .616 .979 30.02 religion-25.84-2.590 * .616

Australia .182

Bahrain .430 .173 (.633-.43) * .852 .0 .520 religion-.430-.173 * .852

Belgium .160

Cameroon .150

Canada .049

Chad 1.920 1.219 (7.360-1.92) * 1.92/(1.92+2.2) * .481 .395 3.96 religion-1.92-1.219 * .481

Comoros .009 .004 (.546-.009) * .009/(.543+.009+.110) *.573 .303 .542 religion-.004-.009 * .573

Denmark .024

Djibouti .070 .111 (.652-.070) * .070/.170 *.462 .209 .634 religion-.111-.070 * .462

Egypt 62.500 .391 (63.261-62.50) * .514 .0 56.30 religion-62.5-.391 *.514

Eritrea .010 .530 2.66 religion -.010 * .200

France 1.490

Gaza 1.076 .006 (1.082-1.076) * 1.076/1.082 * .956 .0 1.068 religion

Gibraltar .002

Iran 1.330 43.000 60.97 religion-1.33 * .721

Iraq 16.750 2.884 (21.722-16.750) * .580 .833 21.07 religion-16.75-2.884 * .580

Israel 1.030 .997 (5.740-1.03) * 1.03/(1.03+3.62) * .956

Jordan 4.590 .080 (4.682-4.59) * .866 4.52 religion-4.59-.080

Kenya .070

Kuwait 1.460 .319 (1.866-1.46) * .786 1.59 religion-1.46-.319

Lebanon 3.260 .227 (3.506-3.26) * .924

Libya 5.460 .176 (5.691-5.46) * .762 .0 5.52 religion-5.46-.176

Mali .160

Mauritania 2.050 .174 (2.511-2.05) * .377 .104 2.50 religion-2.05-.174 * .377

Mayotte .119 .130 religion *.919

Morocco 18.050 4.249 (27.772-18.05) * .437 2.374 27.73 religion-18.05-4.249 * .437

Netherlands .140

Niger .030

Nigeria .300

Oman 1.810 .326 (2.364-1.81) * .588 .0 2.08 religion-1.81-.326 *.588

Panama .015

Qatar .230 .277 (.579-.230) * .794 .034 .550 religion-.230-.277 * .794

Saudi Arabia 19.750 .651 (20.786-19.750) * .628 .0 20.09 religion-19.75-.651 * .628

Somalia .027 (6.842-6.730 Somali) * .240 1.633 6.83 religion -.027 * .240

Sudan 16.560 7.833 (33.551-16.56) * .461 .045 24.49 religion-16.56-7.833 * .461

Sweden .068

Syria 13.800 1.087 (15.335-13.80) * .708 .0 13.19 religion-13.8-1.087 *.708

Tunisia 9.330 .033 (9.380-9.33) * .667 9.33 religion-9.33-.033 *.667

Turkey .880

UAE 1.150 1.262 (2.744-1.15) * .792 .156 2.61 religion-1.15-1.262 * .792

USA .420

West Bank 1.740 .124 (1.881-1.74) * 1.74/(1.74+.15) * .956 1.54 religion-1.74-.124

Western Sahara .288 .288 religion

Yemen 16.000 .168 (16.388-16.00) * .432 .087 16.37 religion-16.0-.168 * .432

230.533 25.387 50.801

268.627 38.094

Bengali

Bangladesh 124.670 1.104 127.567-124.67 *.381

India 80.920

Nepal .030

USA .040

205.660 1.104

206.212 .552

Portuguese

Andorra .007

Angola 3.800 .731 10.865 * .417 - 3.8

Australia .027

Brazil 157.800 3.304 (161.766-157.80) *.833

Canada .187

Cape Verde .400

France .680

Guinea-Bissau .124 .411 creole .318 (1.206-.535) * .549

Luxembourg .054

Macau .010

Mozambique .230 4.800 2.583 (18.641 -.230) * .401 - 4.8

Paraguay .165

Portugal 9.870 .084 (9.964-9.87) * .896

Sao Tome .117 .0 .136 * .542 - .117

Spain 2.520 (Galician)

USA .500

169.154 12.048 7.070

178.713 9.559

Japanese

Brazil .610

Guam .003

Hong Kong .013

Japan 125.280 1.118 126.398-125.28 * 1.0

N.Marianas I. .0013

USA .500

126.407 1.118

126.966 .559

French

Algeria 6.000

Andorra .004

Australia .043

Bahamas .030 creole

Belgium 3.340 2.420 10.208-3.340 * 3.34/(3.34+6.05+.09) *1.0

Benin .600 1.657 6.101 *.370 -.600

Burkina Faso .030 4.570 --- 11.266-.03 *.192 - 4.57

Burundi .520 1.435 5.537 *.353 -.520

Cameroon 4.500 2.316 10.751 *.634 -4.50

Canada 7.693 6.307 30.677-7.693 *7.693/(7.683+19.388) *.966

Cent Afr Rep .800 .076 3.376 *.600 *.8/3.8 -.35

Chad 2.200 --- 7.360 *.481 * 2.2/4.12 - 2.2

Comoros .091 .019 .043 .546-.091 * .110/(.543+.110+.009) *.573 -.019

Congo Rep 1.400 .591 2.658 *.749 - 1.4

Congo (Zaire) 3.800 --- 49.001 *.773 * 3.8/77.8 (other lingua franca) - 3.8

Ivory Coast 7.700 --- 15.446 *.401 -7.7

Djibouti .100 .077 .652 *.462 * .10/.17 -.10

Dominica .069 creole

Dominican Rep .160 creole

Egypt .260

Equ. Guinea .178 .454 * 1/2 * .785

France 55.100 3.251 58.390 -55.1 *.988

French Guiana .159 creole .008 .169-.159 *.830

Fr Polynesia .184 .042 .228-.184 *.950

Gabon 1.000 --- 1.208 *.632 -1.000

Guadeloupe .413 .019 .434-.413 *.901

Guinea .700 1.984 7.477 *.359 -.7

Guinea-Bissau .120

Haiti 6.180

Italy .310

Jersey .006 .080 .0856*1.0 -.006

Lebanon .840

Luxembourg .016

Madagascar 2.200 --- 14.463 * 2.2/16.51 * .802 -2.2

Mali 1.000 2.134 10.109 * .310 - 1.0

Martinique .385 .012 .398-.385 * .925

Mauritania .250

Mauritius .040 .817 creole

Mayotte .056 .067 .134 * .919 - .056

Monaco .013 .019 .032 -.013 * 1.0

Morocco 11.100

New Caledonia .070 .078 .204-.070 * .579

Niger 1.500 --- 9.672 * .136 - 1.5

Reunion .630 creole .048 .692-.63 * .782

Rwanda .600

St. Lucia .121 creole

Sao Tome & Pr .001

Senegal 3.400 --- 9.723 *.331 - 3.4

Seychelles .001 .074 creole .004 .0794-.075 * .842

Switzerland 1.370 1.223 7.118-1.37 * 1.37/(1.37+4.53+.54) * 1.0

Togo 2.500 .036 4.906 * .517 - 2.5

Tunisia 2.760

USA 2.000 .220 creole

Vanuatu .030

Vietnam .370

Virgin Islands .003

80.077 60.391 24.105

122.325 42.248

Malay-Indonesian

Australia .029

Brunei .249 .058 .315-.249 * .878

Indonesia 24.580 149.480 202.957-24.58 * .838

Malaysia 12.900 7.668 22.083-12.90 * .835

New Caledonia .005

Singapore .446 .251 3.164-.446 * .446/(1.183+.446+2.441+.235) *.891

Thailand 2.230

40.439 157.457

119.167 78.728

German

Australia .109

Austria 7.424 .646 8.070-7.424* 1.000

Belgium .090 .096 10.208-.09 * .09/9.48 * 1.000

Belize .003

Brazil .890

Canada .531 .028 (Yiddish)

Czech .048

Denmark .027

France 1.510

Germany 74.830 7.318 82.148-74.83 * 1.000

Hungary .040

Italy .310

Kazakhstan .480

Kyrgyzstan .030

Liechtenstein .028 .003 .0314-.028 *1.0

Luxembourg .010 .280 (Lux'ish) .135 .425-.010 *1.0 -.280

Namibia .015

Paraguay .045

Poland .500

Romania .097

Russia .350

Slovakia .005

Sweden .045

Switzerland 4.530 1.820 7.118-4.530 * 4.530/6.440 *1.0

USA 1.810 .350 (Yiddish, PA Dutch)

92.247 2.168 10.019

98.340 6.093