This module checks the validity and internal consistency of the language, language family, and script data used on Wiktionary: the modules in Category:Language data modules as well as Module:scripts/data.
Output
Discrepancies detected:
- Old Indo-Aryan languages (inc-old) has no child families or languages.
- Middle Iranian languages (ira-mid) has no child families or languages.
- Old Iranian languages (ira-old) has no child families or languages.
- Papuan languages (paa) has no child families or languages.
- creole languages (qfa-cre) has no child families or languages.
- pidgin languages (qfa-pid) has no child families or languages.
- Norwegian Bokmål (nb) has Middle Norwegian (gmq-mno) set as an ancestor, but is not in the West Scandinavian languages (gmq-wes).
- Norwegian Bokmål (nb) has Danish (da) set as an ancestor, but is not in the East Scandinavian languages (gmq-eas).
- Caribbean Hindustani (hns) has Bhojpuri (bho) set as an ancestor, but is not in the Bihari languages (inc-bih).
- Caribbean Hindustani (hns) has Awadhi (awa) set as an ancestor, but is not in the Eastern Hindi languages (inc-hie).
- Khmu (kjg) has Proto-Khmuic (mkh-khm-pro) set as an ancestor, but is not in the Khmuic languages (mkh-khm).
- Proto-language with no family: Proto-Amuesha-Chamicuro (awd-amc-pro) should be the proto-language of "awd-amc", which doesn't exist.
- Proto-language with no family: Proto-Kampa (awd-kmp-pro) should be the proto-language of "awd-kmp", which doesn't exist.
- Proto-language with no family: Proto-Paresi-Waura (awd-prw-pro) should be the proto-language of "awd-prw", which doesn't exist.
- Proto-language with no family: Proto-Rukai (dru-pro) should be the proto-language of "dru", but Rukai (dru) is not a family.
- Proto-language with no family: Proto-Puroik (sit-khp-pro) should be the proto-language of "sit-khp", which doesn't exist.
- apc is set as an ISO 639-3 code on multiple items: Q56593 and Q22809485.
- kjv is set as an ISO 639-3 code on multiple items: Q838165 and Q31199873.
- msn is set as an ISO 639-3 code on multiple items: Q3331111 and Q3563857.
- ttt is set as an ISO 639-3 code on multiple items: Q56489 and Q123964178.
- Blissymbolic script (Blis) is not used by any language and has no characters listed for auto-detection.
- Cypro-Minoan script (Cpmn) is not used by any language.
- Hiragana script (Hira) is not used by any language.
- Kana script (Hrkt) is not used by any language.
- Image-rendered script (Image) is not used by any language and has no characters listed for auto-detection.
- International Phonetic Alphabet (Ipach) is not used by any language and has no characters listed for auto-detection.
- Moon script (Moon) is not used by any language and has no characters listed for auto-detection.
- Morse code (Morse) is not used by any language and has no characters listed for auto-detection.
- musical notation (Music) is not used by any language.
- Proto-Cuneiform script (Pcun) is not used by any language and has no characters listed for auto-detection.
- Proto-Elamite script (Pelm) is not used by any language and has no characters listed for auto-detection.
- Proto-Sinaitic script (Psin) is not used by any language and has no characters listed for auto-detection.
- Rongorongo script (Roro) is not used by any language and has no characters listed for auto-detection.
- Rumi numerals (Rumin) is not used by any language.
- flag semaphore (Semap) is not used by any language and has no characters listed for auto-detection.
- Visible Speech script (Visp) is not used by any language and has no characters listed for auto-detection.
- mathematical notation (Zmth) is not used by any language.
- symbolic script (Zsym) is not used by any language.
- undetermined script (Zyyy) is not used by any language and has no characters listed for auto-detection.
- uncoded script (Zzzz) is not used by any language and has no characters listed for auto-detection.
- The codes fa-Arab, ug-Arab, ks-Arab, ps-Arab, ur-Arab, ku-Arab, tt-Arab, ota-Arab, mzn-Arab and sd-Arab are currently alias codes. Only one code should be used in the data.
- The codes ms-Arab and kk-Arab are currently alias codes. Only one code should be used in the data.
Checks performed
For multiple data modules:
- Codes for languages, families and etymology-only languages must be unique and cannot clash with one another.
- Canonical names for languages, families, and etymology-only languages must not be found in the list of other names.
- Each name in the list of other names must appear only once.
- otherNames, if present, must be an array.
- Wikidata item IDs must be a positive integer or a string starting with Q and ending with decimal digits.
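The Wikidata ID rule above can be sketched in plain Lua as follows. This is only an illustration: the helper name is_valid_wikidata_id is hypothetical, and it assumes that "starting with Q and ending with decimal digits" means a Q followed by nothing but digits.

```lua
-- Hypothetical helper for illustration; not a function from the module.
-- Assumption: a string ID is "Q" followed only by decimal digits.
local function is_valid_wikidata_id(id)
	if type(id) == "number" then
		-- Positive integer: greater than zero with no fractional part.
		return id > 0 and id % 1 == 0
	elseif type(id) == "string" then
		return id:match("^Q%d+$") ~= nil
	end
	return false
end

print(is_valid_wikidata_id(56593))    -- numeric form
print(is_valid_wikidata_id("Q56593")) -- string form
print(is_valid_wikidata_id("56593"))  -- missing the Q prefix
```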
The following must be true of the data used by Module:languages:
- Each code must be defined in the correct submodule according to whether it is two-letter, three-letter or exceptional.
- The canonical name (field 1) must be present and must not be the same as the canonical name of another language.
- If field 2 is not nil, it must be a valid Wikidata item ID.
- If field 3 or family is given and not nil, it must be a valid family code.
- If field 4 or scripts is given and not nil, it must be an array, and each string in the array must be a valid script code.
- If ancestors is given, it must be an array, and each string in the array must be a valid language or etymology language code.
- If family is given, it must be a valid family code.
- If type is given, it must be one of the recognised values (regular, reconstructed, appendix-constructed).
- If entry_name is given, it must be a table that contains either two arrays (from and to) or a string (remove_diacritics) or both.
- If sort_key is given, it may either be a string or a table that in turn contains either two arrays (from and to) or a string (remove_diacritics).
- If entry_name or sort_key is given, the from array must be longer than or equal in length to the to array.
- If standard_chars is given, it must form a valid Lua string pattern when placed between square brackets with ^ before it ("[^...]"). (It should match all characters regularly used in the language, but that cannot be tested.)
- If override_translit is set, translit must also be set, because there must be a transliteration module that can override manual transliteration.
- If link_tr is present, it must be true.
- Have no data keys besides these: 1, 2, 3, "entry_name", "sort_key", "display", "otherNames", "aliases", "varieties", "type", "scripts", "ancestors", "wikimedia_codes", "wikipedia_article", "standard_chars", "translit", "override_translit", "link_tr".
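Two of the rules in this list lend themselves to a short sketch in plain Lua: the from/to length rule and the standard_chars pattern rule. The helper names below are hypothetical and the code is a simplified illustration under those assumptions, not the module's own implementation.

```lua
-- Hypothetical helpers for illustration; not functions from the module.

-- Returns true when the from/to arrays are consistent: both present or
-- both absent, and `to` no longer than `from`.
local function from_to_is_consistent(replacements)
	local from, to = replacements.from, replacements.to
	if (from ~= nil) ~= (to ~= nil) then
		return false
	end
	return from == nil or #to <= #from
end

-- Returns true when `chars` forms a valid Lua pattern inside a negated set.
-- pcall catches the error that string.match raises on a malformed pattern.
local function is_valid_negated_set(chars)
	if type(chars) ~= "string" then
		return false
	end
	local ok = pcall(string.match, "", "[^" .. chars .. "]")
	return ok
end

print(from_to_is_consistent({from = {"a", "b"}, to = {"x"}}))
print(is_valid_negated_set("a-z"))
print(is_valid_negated_set("%")) -- "[^%]" never closes its set
```

Matching against the empty string is enough here, because Lua parses the set before it tries to match anything.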
Checks not performed:
- If translit is present, it should be the name of a module, and this module should contain a tr function that takes a pagename (and optionally a language code and script code) as arguments.
- If sort_key is a string, it should be the name of a module, and this module should contain a makeSortKey function that takes a pagename (and optionally a language code and script code) as arguments.
- If entry_name or sort_key is a table and contains a field remove_diacritics, the value of the field should be a string that forms a valid Lua pattern when it is placed inside negated set notation ([^...]).
These are not checked here, because module errors will quickly crop up in entries if these conditions are not met, assuming that Module:utilities attempts to generate a sortkey for a category pertaining to the language in question, or full_link attempts to use the transliteration module.
Module:languages/code to canonical name and Module:languages/canonical names must contain all the codes and canonical names found in the data submodules of Module:languages, and no more.
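The two-way requirement in the previous paragraph can be sketched in plain Lua with toy tables. The function name check_maps, the table shapes, and the message wording are assumptions made for this illustration; the real lookup modules are far larger.

```lua
-- Hypothetical sketch; `data` maps code -> {canonical name, ...}.
local function check_maps(data, code_to_name, name_to_code)
	local problems, valid_names = {}, {}
	-- Every code and name in the data must appear in both lookup tables.
	for code, entry in pairs(data) do
		local name = entry[1]
		valid_names[name] = true
		if code_to_name[code] ~= name then
			problems[#problems + 1] = ("code %s should map to %s"):format(code, name)
		end
		if name_to_code[name] ~= code then
			problems[#problems + 1] = ("name %s should map to %s"):format(name, code)
		end
	end
	-- The lookup tables must contain nothing that is absent from the data.
	for code in pairs(code_to_name) do
		if data[code] == nil then
			problems[#problems + 1] = ("code %s should be removed"):format(code)
		end
	end
	for name in pairs(name_to_code) do
		if not valid_names[name] then
			problems[#problems + 1] = ("name %s should be removed"):format(name)
		end
	end
	table.sort(problems)
	return problems
end

local data = {da = {"Danish"}, nb = {"Norwegian Bokmål"}}
local stale = check_maps(
	data,
	{da = "Danish", nb = "Norwegian Bokmål", xx = "Ghost"},
	{Danish = "da"}
)
for _, problem in ipairs(stale) do
	print(problem)
end
```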
The following must be true of the data used by Module:etymology languages:
- canonicalName must be given.
- parent must be given and must be a valid language, family or etymology-only language code.
- If ancestors is given, it must be an array, and each string in the array must be a valid language or etymology language code. The etymology language should also be listed as the ancestor of a regular language.
- Have no data keys besides these: "canonicalName", "otherNames", "parent", "ancestors", "wikipedia_article", "wikidata_item".
Codes in Module:families data must:
- Have canonicalName, which must not be the same as the canonical name of another family.
- If family is given, it must be a valid family code.
- Have at least one language or subfamily belonging to it.
- Have no data keys besides these: "canonicalName", "otherNames", "family", "protoLanguage", "wikidata_item".
Codes in Module:scripts data must:
- Have canonicalName.
- Have at least one language that lists it as one of its scripts.
- Have a characters pattern for script autodetection, and this must form a valid Lua string pattern when placed between square brackets ("[...]"). (It should match all characters in the script, but that cannot be tested.)
- Have no data keys besides these: "canonicalName", "otherNames", "parent", "systems", "wikipedia_article", "characters", "direction".
-- TODO:
-- ietf_subtag field used with a 2/3-letter language/family code except qaa-qtz, or a 4-letter script code.
-- Check against files containing up-to-date ISO data, to cross-check validity.

local export = {}

local mw = mw
local require = require
local string = string

local Array = require("Module:array")
local m_en_utilities = require("Module:en-utilities")
local m_etym_languages_canonical_names = require("Module:etymology languages/canonical names")
local m_etym_languages_codes = require("Module:etymology languages/code to canonical name")
local m_etym_languages_data = require("Module:etymology languages/data")
local m_families = require("Module:families")
local m_families_canonical_names = require("Module:families/canonical names")
local m_families_codes = require("Module:families/code to canonical name")
local m_families_data = require("Module:families/data")
local m_languages = require("Module:languages")
local m_languages_canonical_names = require("Module:languages/canonical names")
local m_languages_codes = require("Module:languages/code to canonical name")
local m_languages_data_all = require("Module:languages/data/all")
local m_load = require("Module:load")
local m_scripts = require("Module:scripts")
local m_scripts_canonical_names = require("Module:scripts/canonical names")
local m_scripts_codes = require("Module:scripts/code to canonical name")
local m_scripts_data = require("Module:scripts/data")
local m_str_utils = require("Module:string utilities")
local m_table = require("Module:table")

local add_indefinite_article = m_en_utilities.add_indefinite_article
local codepoint = m_str_utils.codepoint
local concat = table.concat
local dump = mw.dumpObject
local format = string.format
local gcodepoint = m_str_utils.gcodepoint
local get_data_module_name = m_languages.getDataModuleName
local get_family_by_code = m_families.getByCode
local get_family_by_canonical_name = m_families.getByCanonicalName
local get_indefinite_article = m_en_utilities.get_indefinite_article
local get_language_by_code = m_languages.getByCode
local get_language_by_canonical_name = m_languages.getByCanonicalName
local get_script_by_code = m_scripts.getByCode
local get_script_by_canonical_name = m_scripts.getByCanonicalName
local gmatch = string.gmatch
local gsub = string.gsub
local insert = table.insert
local ipairs = ipairs
local is_callable = require("Module:fun").is_callable
local is_positive_integer = require("Module:math").is_positive_integer
local is_known_language_tag = mw.language.isKnownLanguageTag
local isutf8 = mw.ustring.isutf8
local json_decode = mw.text.jsonDecode
local language_link = require("Module:links").language_link
local list_to_set = m_table.listToSet
local list_to_text = mw.text.listToText
local load_data = m_load.load_data
local log = mw.log
local main_loader = package.loaders[2]
local make_family = m_families.makeObject
local make_lang = m_languages.makeObject
local make_script = m_scripts.makeObject
local match = string.match
local new_title = mw.title.new
local next = next
local pairs = pairs
local pcall = pcall
local remove_comments = require("Module:string/removeComments")
local safe_require = m_load.safe_require
local sorted_pairs = m_table.sortedPairs
local split = m_str_utils.split
local sub = string.sub
local table_len = m_table.length
local tag_text = require("Module:script utilities").tag_text
local type = type
local umatch = m_str_utils.match
local unpack = unpack or table.unpack -- Lua 5.2 compatibility

local aliases = require("Module:languages/data").aliases

local messages

local function discrepancy(modname, ...)
	local success, result = pcall(function(...)
		messages[modname]:insert(format(...))
	end, ...)
	if not success then
		log(result, ...)
	end
end

local messages_mt = {}

function messages_mt:__index(k)
	local val = Array()
	self[k] = val
	return val
end

local all_codes = {}
local language_names = {}
local etym_language_names = {}
local family_names = {}
local script_names = {}

local nonempty_families = {}
local allowed_empty_families = {tbq = true}
local nonempty_scripts = {}

local function link(obj, code_first)
	return type(obj) == "string" and obj or
		code_first and format("<code>%s</code> (%s)", obj:getCode(), obj:makeCategoryLink()) or
		format("%s (<code>%s</code>)", obj:makeCategoryLink(), obj:getCode())
end

local function check_data_keys(...)
	local valid_keys = Array(...):toSet()
	return function(modname, obj, data)
		local invalid_keys
		for k in pairs(data) do
			if not valid_keys[k] then
				if not invalid_keys then
					invalid_keys = Array(k)
				else
					invalid_keys:insert(k)
				end
			end
		end
		if invalid_keys == nil then
			return
		end
		local plural = #invalid_keys ~= 1
		discrepancy(modname,
			"The data key%s %s for %s %s invalid.",
			plural and "s" or "",
			invalid_keys:map(function(key)
				return "<code>" .. key .. "</code>"
			end):concat(", "),
			link(obj),
			plural and "are" or "is"
		)
	end
end

-- Modification of isArray in [[Module:table]].
-- This assumes all keys are either integers or non-numbers.
-- If there are fractional numbers, the results might be incorrect.
-- For instance, find_gap{"a", "b", [0.5] = true} evaluates to 3, but there
-- isn't a gap at 3 in the sense of there being an integer key greater than 3.
local function find_gap(t, can_contain_non_number_keys)
	local i = 0
	for k in pairs(t) do
		if not (can_contain_non_number_keys and type(k) ~= "number") then
			i = i + 1
			if t[i] == nil then
				return i
			end
		end
	end
end

local function check_true_or_string_or_nil(modname, obj, data, key)
	local field = data[key]
	if not (field == nil or field == true or type(field) == "string") then
		discrepancy(modname,
			"%s has %s <code>%s</code> value that is not <code>nil</code>, <code>true</code> or a string: <code>%s</code>",
			link(obj), get_indefinite_article(key), key, dump(data[key])
		)
	end
end

local function check_array(modname, obj, data, array_name, parent_array_name, can_contain_non_number_keys)
	local parent_table = data
	if parent_array_name then
		parent_table = assert(data[parent_array_name], parent_array_name)
		parent_array_name = "the <code>" .. parent_array_name .. "</code> field in "
	else
		parent_array_name = ""
	end
	local array_type = type(parent_table[array_name])
	if array_type == "table" then
		local gap = find_gap(parent_table[array_name], can_contain_non_number_keys)
		if gap then
			discrepancy(modname,
				"The <code>%s</code> array in %sthe data table for %s has a gap at index %d.",
				array_name, parent_array_name, link(obj), gap
			)
		else
			return true
		end
	else
		discrepancy(modname,
			"The <code>%s</code> field in %sthe data table for %s should be an array (table) but is %s.",
			array_name, parent_array_name, link(obj),
			array_type == "nil" and "nil" or "a " .. array_type
		)
	end
end

local function check_no_alias_codes(modname, mod_data)
	local lookup, discrepancies = {}, {}
	for k, v in pairs(mod_data) do
		local check = lookup[v]
		if check then
			discrepancies[check] = discrepancies[check] or {"<code>" .. check .. "</code>"}
			insert(discrepancies[check], "<code>" .. k .. "</code>")
		else
			lookup[v] = k
		end
	end
	for _, v in pairs(discrepancies) do
		discrepancy(modname,
			"The codes %s are currently alias codes. Only one code should be used in the data.",
			list_to_text(v, ", ", " and ")
		)
	end
end

local function check_wikidata_item(modname, obj, data, key)
	local data_item = data[key]
	if data_item == nil or is_positive_integer(data_item) then
		return
	end
	discrepancy(modname,
		"%s has a Wikidata item ID that is not a positive integer: <code>%s</code>",
		link(obj), dump(data_item)
	)
end

local function check_name_field(modname, obj, data, canonical_name, data_key, allow_nested, allow_canonical_name_in_table)
	local array = data[data_key]
	if not array then
		return
	end
	check_array(modname, obj, data, data_key, nil, true)
	local names = {}
	local function check_other_name(other_name)
		if not allow_canonical_name_in_table and other_name == canonical_name then
			discrepancy(modname,
				"%s has its canonical name (<code>%s</code>) repeated in the table of <code>%s</code>.",
				link(obj), dump(canonical_name), data_key
			)
		end
		if names[other_name] then
			discrepancy(modname,
				"The name %s is found twice or more in the list of <code>%s</code> for %s.",
				other_name, data_key, link(obj)
			)
		end
		names[other_name] = true
	end
	for _, other_name in ipairs(array) do
		if type(other_name) == "table" then
			if not allow_nested then
				discrepancy(modname,
					"A nested table is found in the list of <code>%s</code> for %s, but isn't allowed.",
					data_key, link(obj)
				)
			else
				for _, on in ipairs(other_name) do
					check_other_name(on)
				end
			end
		else
			check_other_name(other_name)
		end
	end
end

local function check_other_names_aliases_varieties(modname, obj, data, canonical_name)
	if data.other_names then
		check_name_field(modname, obj, data, canonical_name, "other_names")
	end
	if data.aliases then
		check_name_field(modname, obj, data, canonical_name, "aliases")
	end
	if data.varieties then
		-- Sometimes a variety legitimately has the same name as the language as a whole, so allow that.
		check_name_field(modname, obj, data, canonical_name, "varieties", "allow_nested", "allow_canonical_name_in_table")
	end
end

local function validate_pattern(pattern, modname, obj, standard_chars)
	if type(pattern) ~= "string" then
		return discrepancy(modname,
			"\"%s\", the %spattern for %s, is not a string.",
			pattern, standard_chars and "standard character " or "", link(obj)
		)
	elseif not isutf8(pattern) then
		return discrepancy(modname,
			"%s specifies a pattern for %scharacter detection which is not valid UTF-8: <code>%s</code>",
			link(obj), standard_chars and "standard " or "", dump(pattern)
		)
	end
	local ranges
	for lower, higher in gmatch(pattern, "(.[\128-\191]*)%-%%?(.[\128-\191]*)") do
		if codepoint(lower) >= codepoint(higher) then
			ranges = ranges or Array()
			insert(ranges, {lower, higher})
		end
	end
	if ranges and ranges[1] then
		local plural = #ranges ~= 1 and "s" or ""
		discrepancy(modname,
			"%s specifies an invalid pattern " ..
			"for %scharacter detection: <code>%s</code>. The first codepoint%s " ..
			"in the range%s %s %s must be less than or equal to the second.",
			link(obj), standard_chars and "standard " or "", dump(pattern), plural, plural,
			ranges:map(function(range)
				return format(range[1] .. "-" .. range[2] .. " (U+%X, U+%X)", codepoint(range[1]), codepoint(range[2]))
			end):concat(", "),
			#ranges ~= 1 and "are" or "is"
		)
	end
	local success, result = pcall(umatch, "", "[" .. pattern .. "]")
	if not success then
		discrepancy(modname,
			"%s specifies an invalid pattern for %scharacter detection: <code>%s</code> (%s)",
			link(obj), standard_chars and "standard " or "", dump(pattern), result
		)
	end
end

local remove_exceptions_addition = 0xF0000
local maximum_code_point = 0x10FFFF
local remove_exceptions_maximum_code_point = maximum_code_point - remove_exceptions_addition

-- TODO: check modules exist.
-- TODO: validate script codes and check inner tables.
local function check_replacement_data(modname, obj, data, key, func_name)
	local replacements = data[key]
	if replacements == nil then
		return
	end
	local replacements_type = type(replacements)
	if replacements_type == "string" then
		local mod = main_loader("Module:" .. replacements)
		if not mod then
			discrepancy(modname,
				"The <code>%s</code> field in the data table for %s specifies the module [[Module:%s]], which does not exist.",
				key, link(obj), replacements
			)
		else
			mod = mod()
			if not (type(mod) == "table" and is_callable(mod[func_name])) then
				discrepancy(modname,
					"The <code>%s</code> field in the data table for %s specifies the module [[Module:%s]], which exists, but does not contain the expected function <code>%s()</code>.",
					key, link(obj), replacements, func_name
				)
			end
		end
		return
	elseif replacements_type ~= "table" then
		discrepancy(modname,
			"The <code>%s</code> field in the data table for %s must be a string or table, not a %s.",
			key, link(obj), replacements_type
		)
		return
	end
	local from, to = replacements.from, replacements.to
	if (from ~= nil) ~= (to ~= nil) then
		discrepancy(modname,
			"The <code>from</code> and <code>to</code> arrays in the <code>%s</code> table for %s are not both defined or both undefined.",
			key, link(obj)
		)
	elseif from then
		for _, k in ipairs{"from", "to"} do
			check_array(modname, obj, data, k, key)
		end
	end
	local remove_diacritics = replacements.remove_diacritics
	if not (remove_diacritics == nil or type(remove_diacritics) == "string") then
		discrepancy(modname,
			"The <code>remove_diacritics</code> field in the <code>%s</code> table for %s must be a string.",
			key, link(obj)
		)
	end
	local remove_exceptions = replacements.remove_exceptions
	if remove_exceptions then
		if check_array(modname, obj, data, "remove_exceptions", key) then
			for sequence_i, sequence in ipairs(remove_exceptions) do
				local code_point_i = 0
				for code_point in gcodepoint(sequence) do
					code_point_i = code_point_i + 1
					if code_point > remove_exceptions_maximum_code_point then
						discrepancy(modname,
							"Code point #%d (0x%04X) in field #%d of the <code>remove_exceptions</code> array for %s is over U+%04X.",
							code_point_i, code_point, sequence_i, link(obj), remove_exceptions_maximum_code_point
						)
					end
				end
			end
		end
	end
	-- The condition flags a `to` array that is longer than `from`,
	-- so `from` must be at least as long as `to`.
	if from and to and table_len(to) > table_len(from) then
		discrepancy(modname,
			"The <code>from</code> array in the <code>%s</code> table for %s must be longer than or the same length as the <code>to</code> array.",
			key, link(obj)
		)
	end
end

local function check_replacements_data(modname, obj, data)
	for _, replacement_spec in ipairs{
		{"translit", "tr"},
		{"display_text", "makeDisplayText"},
		{"strip_diacritics", "stripDiacritics"},
		{"sort_key", "makeSortKey"},
	} do
		check_replacement_data(modname, obj, data, unpack(replacement_spec))
	end
end

local function has_ancestor(lang, code)
	for _, anc in ipairs(lang:getAncestors()) do
		if code == anc:getCode() or has_ancestor(anc, code) then
			return true
		end
	end
end

local function get_default_ancestors(lang)
	if lang:hasType("language", "etymology-only") then
		local parent = lang:getParent()
		if not has_ancestor(parent, lang:getCode()) then
			return parent:getAncestorCodes()
		end
	end
	local fam_code, def_anc = lang:getFamilyCode()
	while fam_code and fam_code ~= "qfa-not" do
		local fam = m_families_data[fam_code]
		def_anc = fam.protoLanguage or
			m_languages_data_all[fam_code .. "-pro"] and fam_code .. "-pro" or
			m_etym_languages_data[fam_code .. "-pro"] and fam_code .. "-pro"
		if def_anc and def_anc ~= lang:getCode() then
			return {def_anc}
		end
		fam_code = fam[3]
	end
end

local function iterate_ancestor(obj, modname, anc_code)
	local anc = get_language_by_code(anc_code, nil, true)
	if not anc then
		discrepancy(modname,
			"%s lists the invalid language code <code>%s</code> as its ancestor.",
			link(obj), dump(anc_code)
		)
		return
	end
	local anc_fam = anc:getFamily()
	if not anc_fam then
		discrepancy(modname,
			"%s has no family.",
			link(anc)
		)
		return
	end
	local anc_fam_code = anc_fam:getCode()
	local def_ancs = get_default_ancestors(obj)
	if def_ancs then
		for _, def_anc in ipairs(def_ancs) do
			def_anc = get_language_by_code(def_anc, nil, true)
			if def_anc and (
				anc_code == def_anc:getCode() or
				has_ancestor(def_anc, anc_code) or
				def_anc:hasParent(anc_code) and not has_ancestor(anc, def_anc:getCode())
			) then
				discrepancy(modname,
					"%s has the ancestor %s listed in its ancestor field, which is redundant, since it is determined to be ancestral automatically.",
					link(obj), link(anc)
				)
			end
		end
	end
	if not obj:inFamily(anc_fam_code) then
		discrepancy(modname,
			"%s has %s set as an ancestor, but is not in the %s.",
			link(obj), link(anc), link(anc_fam)
		)
	end
	local fam, proto = obj
	repeat
		fam = fam:getFamily()
		proto = fam and fam:getProtoLanguage()
	until proto or not fam or fam:getCode() == "qfa-not"
	if proto and not (
		proto:getCode() == anc:getCode() or
		proto:hasAncestor(anc:getCode()) or
		anc:hasAncestor(proto:getCode())
	) then
		local fam = obj:getFamily()
		discrepancy(modname,
			"%s is in the %s and has %s set as an ancestor, but it is not possible to form an ancestral chain between them.",
			link(obj), link(fam), link(anc)
		)
	end
end

local function check_ancestors(modname, obj, data)
	local ancestors = data.ancestors
	if ancestors == nil then
		return
	end
	local ancestors_type = type(ancestors)
	if ancestors_type == "string" then
		ancestors = split(ancestors, ",", true, true)
	elseif ancestors_type ~= "table" then
		discrepancy(modname,
			"The <code>ancestors</code> field in the data table for %s must be a string or table, not a %s.",
			link(obj), ancestors_type
		)
		-- Stop here: ipairs() below would throw on a non-table value.
		return
	end
	for _, anc in ipairs(ancestors) do
		iterate_ancestor(obj, modname, anc)
	end
end

local function check_wikimedia_codes(modname, obj, data)
	local wikimedia_codes = data.wikimedia_codes
	if wikimedia_codes == nil then
		return
	end
	local wikimedia_codes_type = type(wikimedia_codes)
	if wikimedia_codes_type == "string" then
		wikimedia_codes = split(wikimedia_codes, ",", true, true)
	elseif wikimedia_codes_type ~= "table" then
		discrepancy(modname,
			"The <code>wikimedia_codes</code> field in the data table for %s must be a string or table, not a %s.",
			link(obj), wikimedia_codes_type
		)
		-- Stop here: ipairs() below would throw on a non-table value.
		return
	end
	for _, code in ipairs(wikimedia_codes) do
		if not is_known_language_tag(code) then
			discrepancy(modname,
				"%s lists the invalid Wikimedia code <code>%s</code> in the <code>wikimedia_codes</code> field.",
				link(obj), dump(code)
			)
		end
	end
end

local function check_code_to_name_and_name_to_code_maps(
		source_module_type,
		source_module_description,
		code_to_module_map, name_to_code_map,
		code_to_name_modname, code_to_name_module,
		name_to_code_modname, name_to_code_module
	)
	local function check_code_and_name(modname, code, canonical_name)
		-- Check the code is in code_to_module_map and that it didn't originate from the wrong data module.
		local check_mod = code_to_module_map[code] or code_to_module_map[aliases[code]]
		if not (check_mod and match(check_mod, "^" .. source_module_type .. "/data")) then
			if not name_to_code_map[canonical_name] then
				discrepancy(modname,
					"The code <code>%s</code> and the canonical name %s should be removed; they are not found in %s.",
					code, canonical_name, source_module_description
				)
			else
				discrepancy(modname,
					"<code>%s</code>, the code for the canonical name %s, is wrong; it should be <code>%s</code>.",
					code, canonical_name, name_to_code_map[canonical_name]
				)
			end
		elseif not name_to_code_map[canonical_name] then
			local data_table = require("Module:" .. code_to_module_map[code])[code]
			discrepancy(modname,
				"%s, the canonical name for the code <code>%s</code>, is wrong; it should be %s.",
				canonical_name, code, data_table[1]
			)
		end
	end
	for code, canonical_name in pairs(code_to_name_module) do
		check_code_and_name(code_to_name_modname, code, canonical_name)
	end
	for canonical_name, code in pairs(name_to_code_module) do
		check_code_and_name(name_to_code_modname, code, canonical_name)
	end
end

local function check_extraneous_extra_data(
		data_modname, data_module, extra_data_modname, extra_data_module)
	for code, _ in pairs(extra_data_module) do
		if not data_module[code] then
			discrepancy(extra_data_modname,
				"The code <code>%s</code> is not found in [[Module:%s]], and should be removed from [[Module:%s]].",
				code, data_modname, extra_data_modname
			)
		end
	end
end

-- TODO: add collision check between the canonical names "X" and "X [Ll]anguage".
local function check_languages(frame)
	local check_language_data_keys = check_data_keys(
		1, 2, 3, 4, -- canonical name, Wikidata item, family, scripts
		"display_text", "generate_forms", "strip_diacritics", "sort_key",
		"other_names", "aliases", "varieties", "ietf_subtag",
		"type", "ancestors", "pseudo_families",
		"wikimedia_codes", "wikipedia_article", "standard_chars",
		"translit", "override_translit", "link_tr",
		"dotted_dotless_i"
	)

	local function check_language(modname, code, data, extra_modname, extra_data)
		local obj, code_modname, canonical_name = make_lang(code, data, true), get_data_module_name(code), data[1]
		-- FIXME: this module should use the prefixed module name throughout.
		code_modname = code_modname:gsub("^Module:", "")
		if code_modname ~= modname then
			if code_modname == "languages/data/2" then
				discrepancy(modname,
					"%s is a two-letter code, so should be moved to [[Module:%s]].",
					link(obj), code_modname
				)
			elseif code_modname == "languages/data/exceptional" then
				discrepancy(modname,
					"%s is an exceptional code, as it does not consist of two or three lowercase letters, so should be moved to [[Module:%s]].",
					link(obj), code_modname
				)
			else
				discrepancy(modname,
					"%s is a three-letter code beginning with '%s', so should be moved to [[Module:%s]].",
					link(obj), sub(code, 1, 1), code_modname
				)
			end
		end

		check_language_data_keys(modname, obj, data)

		if all_codes[code] then
			discrepancy(modname,
				"The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].",
				code, all_codes[code]
			)
		else
			if not m_languages_codes[code] then
				discrepancy("languages/code to canonical name",
					"The code %s is missing.",
					link(obj, true)
				)
			end
			all_codes[code] = modname
		end

		-- TODO: these checks should be consolidated with the proto-language checks in the family data,
		-- since bad settings there affect the warnings here (e.g. xxx-pro assigned to yyy when xxx also
		-- doesn't exist - a warning that xxx has "no family" would be misleading).
ifsub(code,-4)=="-pro"then localfam_code=sub(code,1,-5) localfam=get_language_by_code(fam_code,nil,true,true) ifnotfamthen discrepancy(modname, "'''Proto-language with no family''': %s should be the proto-language of <code>%s</code>, which doesn't exist.", link(obj),dump(fam_code) ) elseifnotfam:hasType("family")then discrepancy(modname, "'''Proto-language with no family''': %s should be the proto-language of <code>%s</code>, but %s is not a family.", link(obj),dump(fam_code),link(fam) ) else -- Reinstate this as low-priority once message priorities have been implemented. -- local expected_name = "Proto-" .. fam:getCanonicalName() -- if canonical_name ~= expected_name then -- discrepancy(modname, -- "%s does not have the expected name \"%s\", even though it is the proto-language of the %s.", -- link(obj), expected_name, link(fam) -- ) -- end end end ifnotcanonical_namethen discrepancy(modname, "The code <code>%s</code> has no canonical name specified.", code ) elseiflanguage_names[canonical_name]then localcanonical_lang=get_language_by_canonical_name(canonical_name) ifnotcanonical_langthen discrepancy(modname, "%s has a canonical name that cannot be looked up.", link(obj) ) elseifdata.main_code~=canonical_lang:getCode()then discrepancy(modname, "%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.", link(obj),language_names[canonical_name] ) end else ifnotm_languages_canonical_names[canonical_name]then discrepancy("languages/canonical names", "The canonical name %s is missing.", link(obj) ) end language_names[canonical_name]=code end check_wikidata_item(modname,obj,data,2) ifextra_datathen check_other_names_aliases_varieties(modname,obj,extra_data,canonical_name) end locallang_type=data.type iflang_typeandnot(lang_type=="regular"orlang_type=="reconstructed"orlang_type=="appendix-constructed")then discrepancy(modname, "%s is of the invalid type <code>%s</code>.", link(obj),lang_type ) end ifdata.aliasesthen discrepancy(modname, 
"%s has an <code>aliases</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].", link(obj),modname,extra_modname ) end ifdata.varietiesthen discrepancy(modname, "%s has the <code>varieties</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].", link(obj),modname,extra_modname ) end ifdata.other_namesthen discrepancy(modname, "%s has the <code>other_names</code> key in [[Module:%s]]. This must be moved to [[Module:%s]].", link(obj),modname,extra_modname ) end ifnotextra_datathen discrepancy(extra_modname, "%s has data in [[Module:%s]], but does not have corresponding data in [[Module:%s]].", link(obj),modname,extra_modname ) --[[elseif extra_data.other_names then discrepancy(extra_modname, "%s has <code>other_names</code> key, but these should be changed to either <code>aliases</code> or <code>varieties</code>.", link(obj) )]] end localsc=data[4] ifscthen iftype(sc)=="string"then sc=split(sc,"%s*,%s*",true) end iftype(sc)=="table"then ifnotsc[1]then discrepancy(modname, "%s has no scripts listed.", link(obj) ) else for_,sccodeinipairs(sc)do localcur_sc=m_scripts_data[sccode] ifnot(cur_scorsccode=="All"orsccode=="Hants")then discrepancy(modname, "%s lists the invalid script code <code>%s</code>.", link(obj),dump(sccode) ) --[[elseif not cur_sc.characters then discrepancy(modname, "%s lists the %s, which does not have any characters.", link(obj), link(get_script_by_code(sccode)) )]] end nonempty_scripts[sccode]=true end end else discrepancy(modname, "The %s field for %s must be a table or string.", 4,link(obj) ) end end ifdata.ancestorsthen check_ancestors(modname,obj,data) end ifdata.wikimedia_codesthen check_wikimedia_codes(modname,obj,data) end ifdata[3]then localfamily=data[3] ifnotm_families_data[family]then discrepancy(modname, "%s has the invalid family code <code>%s</code>.", link(obj),dump(family) ) end nonempty_families[family]=true end check_replacements_data(modname,obj,data) ifdata.standard_charsthen 
			if type(data.standard_chars) == "table" then
				local sccodes = {}
				for _, sccode in ipairs(sc) do
					sccodes[sccode] = true
				end
				for sccode in pairs(data.standard_chars) do
					if not (sccodes[sccode] or sccode == 1) then
						discrepancy(modname,
							"The field %s in the <code>standard_chars</code> table for %s does not match any script for that language.",
							sccode, link(obj)
						)
					end
				end
			elseif data.standard_chars and type(data.standard_chars) ~= "string" then
				discrepancy(modname,
					"The <code>standard_chars</code> field in the data table for %s must be a string or table.",
					link(obj)
				)
			end
		end

		check_true_or_string_or_nil(modname, obj, data, "override_translit")
		check_true_or_string_or_nil(modname, obj, data, "link_tr")

		-- This doesn't apply any more since scripts can be script-wide translit methods.
		-- if data.override_translit and not data.translit then
		-- 	discrepancy(modname,
		-- 		"%s has the <code>override_translit</code> field set, but no transliteration module",
		-- 		link(obj)
		-- 	)
		-- end
	end

	local function check_module(modname)
		local mod_data = load_data("Module:" .. modname)
		local extra_modname = modname .. "/extra"
		local extra_mod_data = load_data("Module:" .. extra_modname)
		for code, data in pairs(mod_data) do
			check_language(modname, code, data, extra_modname, extra_mod_data[code])
		end
		check_no_alias_codes(modname, mod_data)
		check_no_alias_codes(extra_modname, extra_mod_data)
		check_extraneous_extra_data(modname, mod_data, extra_modname, extra_mod_data)
	end

	-- Check two-letter codes
	check_module(
		"languages/data/2"
	)

	-- Check three-letter codes
	for i = 0x61, 0x7A do -- a to z
		check_module(
			format("languages/data/3/%c", i)
		)
	end

	-- Check exceptional codes
	check_module(
		"languages/data/exceptional"
	)

	-- These checks must be done while all_codes only contains language codes:
	-- that is, after language data modules have been processed, but before
	-- etymology languages, families, and scripts have.
	check_code_to_name_and_name_to_code_maps(
		"languages",
		"a submodule of [[Module:languages]]",
		all_codes, language_names,
		"languages/code to canonical name", m_languages_codes,
		"languages/canonical names", m_languages_canonical_names
	)

	-- Check [[Template:langname-lite]]
	local modname = "Template:langname-lite"
	for code, name in gmatch(remove_comments(new_title(modname):getContent()), "\n\t*|#*([^\n]+)=([^\n]*)") do
		if #code > 1 and code ~= "default" then
			for _, code in pairs(split(code, "|", true)) do
				local lang = get_language_by_code(code, nil, true, true)
				if match(name, "etymcode") then
					local nonEtym_name = frame:preprocess(name)
					local nonEtym_real_name = lang:getFullName()
					if nonEtym_name ~= nonEtym_real_name then
						discrepancy(modname,
							"Code: <code>%s</code>. Saw name: %s. Expected name: %s.",
							code, nonEtym_name, nonEtym_real_name
						)
					end
					name = frame:preprocess(gsub(name, "{{{allow etym|}}}", "1"))
				elseif match(name, "familycode") then
					name = match(name, "familycode|(.-)|")
				else
					name = name
				end
				if not lang then
					discrepancy(modname,
						"Code: <code>%s</code>. Saw name: %s. Language not present in data.",
						code, name
					)
				else
					local real_name = lang:getCanonicalName()
					if name ~= real_name then
						discrepancy(modname,
							"Code: <code>%s</code>. Saw name: %s. Expected name: %s.",
							code, name, real_name
						)
					end
				end
			end
		end
	end
end

local function check_etym_languages()
	local modname = "etymology languages/data"

	local check_etymology_language_data_keys = check_data_keys(
		1, 2, 3, 4, -- canonical name, Wikidata item, family, scripts
		"parent", "display_text", "generate_forms", "strip_diacritics", "sort_key",
		"other_names", "aliases", "varieties", "ietf_subtag",
		"type", "main_code", "ancestors", "pseudo_families",
		"wikimedia_codes", "wikipedia_article", "standard_chars",
		"translit", "override_translit", "link_tr",
		"dotted_dotless_i"
	)

	local checked = {}
	for code, data in pairs(m_etym_languages_data) do
		local obj, canonical_name, parent = make_lang(code, data, true), data[1], data.parent
		check_etymology_language_data_keys(modname, obj, data)

		if all_codes[code] then
			discrepancy(modname,
				"The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].",
				code, all_codes[code]
			)
		else
			if not m_etym_languages_codes[code] then
				discrepancy("etymology languages/code to canonical name",
					"The code %s is missing.",
					link(obj, true)
				)
			end
			all_codes[code] = modname
		end

		if not canonical_name then
			discrepancy(modname,
				"The code <code>%s</code> has no canonical name specified.",
				code
			)
		elseif language_names[canonical_name] then
			local canonical_lang = get_language_by_canonical_name(canonical_name, nil, true)
			if not canonical_lang then
				discrepancy(modname,
					"%s has a canonical name that cannot be looked up.",
					link(obj)
				)
			elseif data.main_code ~= canonical_lang:getCode() then
				discrepancy(modname,
					"%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.",
					link(obj), language_names[canonical_name]
				)
			end
		else
			if not m_etym_languages_canonical_names[canonical_name] then
				discrepancy("etymology languages/canonical names",
					"The canonical name %s is missing.",
					link(obj)
				)
			end
			etym_language_names[canonical_name] = code
		end

		check_other_names_aliases_varieties(modname, obj, data, canonical_name)

		if parent then
			if type(parent) ~= "string" then
				discrepancy(modname,
					"%s has a parent code that is %s rather than a string.",
					link(obj), parent == nil and "nil" or "a " .. type(parent)
				)
			elseif not (m_languages_data_all[parent] or m_etym_languages_data[parent]) then
				discrepancy(modname,
					"%s has the invalid parent code <code>%s</code>%s.",
					link(obj), dump(parent), m_families_data[parent] and " (a family code)" or ""
				)
			end
			nonempty_families[parent] = true
		else
			discrepancy(modname,
				"%s has no parent code.",
				link(obj)
			)
		end

		if data.ancestors then
			check_ancestors(modname, obj, data)
		end

		if data.wikimedia_codes then
			check_wikimedia_codes(modname, obj, data)
		end

		if data[3] then
			local family = data[3]
			if not m_families_data[family] then
				discrepancy(modname,
					"%s has the invalid family code <code>%s</code>.",
					link(obj), dump(family))
			end
			nonempty_families[family] = true
		end

		check_replacements_data(modname, obj, data)

		check_wikidata_item(modname, obj, data, 2)

		local stack = {}

		while data do
			if checked[code] then
				break
			elseif stack[code] then
				local parent = data.parent
				discrepancy(modname,
					"%s has a cyclic parental relationship to %s",
					link(make_lang(code, data, true)),
					link(get_language_by_code(parent, nil, true))
				)
				break
			end
			stack[code] = true
			code = data.parent
			data = m_etym_languages_data[code]
		end

		for code in pairs(stack) do
			checked[code] = true
		end
	end

	check_no_alias_codes(modname, m_etym_languages_data)
	check_code_to_name_and_name_to_code_maps(
		"etymology languages",
		"[[Module:etymology languages/data]]",
		all_codes, etym_language_names,
		"etymology languages/code to canonical name", m_etym_languages_codes,
		"etymology languages/canonical names", m_etym_languages_canonical_names)
end

-- TODO: add collision check between the canonical names "X" and "X [Ll]anguages".
local function check_families()
	local modname = "families/data"

	local check_family_data_keys = check_data_keys(
		1, 2, 3, -- canonical name, Wikidata item, (parent) family
		"type", "ietf_subtag",
		"protoLanguage", "other_names", "aliases", "varieties", "pseudo_families"
	)

	local checked, double_check_if_empty = {["qfa-not"] = true}, {}
	for code, data in pairs(m_families_data) do
		local obj, canonical_name, family, protolang = make_family(code, data), data[1], data[3], data.protoLanguage
		check_family_data_keys(modname, obj, data)

		if all_codes[code] then
			discrepancy(modname,
				"The code <code>%s</code> is not unique; it is also defined in [[Module:%s]].",
				code, all_codes[code]
			)
		else
			if not m_families_codes[code] then
				discrepancy("families/code to canonical name",
					"The code %s is missing.",
					link(obj, true)
				)
			end
			all_codes[code] = modname
		end

		if not canonical_name then
			discrepancy(modname,
				"The code <code>%s</code> has no canonical name specified.",
				code
			)
		elseif family_names[canonical_name] then
			local canonical_family = get_family_by_canonical_name(canonical_name)
			if not canonical_family then
				discrepancy(modname,
					"%s has a canonical name that cannot be looked up.",
					link(obj)
				)
			elseif data.main_code ~= canonical_family:getCode() then
				discrepancy(modname,
					"%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.",
					link(obj), family_names[canonical_name]
				)
			end
		else
			if not m_families_canonical_names[canonical_name] then
				discrepancy("families/canonical names",
					"The canonical name %s is missing.",
					link(obj)
				)
			end
			family_names[canonical_name] = code
		end

		check_other_names_aliases_varieties(modname, obj, data, canonical_name)

		if family then
			if family == code and code ~= "qfa-not" then
				discrepancy(modname,
					"%s has itself as its family.",
					link(obj)
				)
			elseif not m_families_data[family] then
				discrepancy(modname,
					"%s has the invalid parent family code <code>%s</code>.",
					link(obj), dump(family)
				)
			end
			nonempty_families[family] = true
		end

		if protolang then
			local protolang_obj = get_language_by_code(protolang, nil, true)
			if not protolang_obj then
				discrepancy(modname,
					"%s has the invalid proto-language code <code>%s</code>.",
					link(obj), dump(protolang)
				)
			elseif protolang == code .. "-pro" then
				discrepancy(modname,
					"%s has %s listed as its proto-language, which is redundant, since it is determined to be the proto-language automatically.",
					link(obj), link(protolang_obj)
				)
			elseif sub(protolang, -4) == "-pro" then
				discrepancy(modname,
					"%s has %s listed as its proto-language, which is supposed to be the proto-language for the family <code>%s</code>.",
					link(obj), link(protolang_obj), sub(protolang, 1, -5)
				)
			end
		end

		check_wikidata_item(modname, obj, data, 2)

		-- Could be a false-positive if a child family occurs on a later
		-- iteration, so set aside any that fail for a second check. This avoids
		-- having to iterate through the whole list of families once
		-- nonempty_families has been fully populated.
		if not (nonempty_families[code] or allowed_empty_families[code]) then
			double_check_if_empty[code] = obj
		end

		local stack = {}

		while data do
			if checked[code] then
				break
			elseif stack[code] then
				local parent = data[3]
				discrepancy(modname,
					"%s has a cyclic familial relationship to %s",
					link(make_family(code, data)),
					link(get_family_by_code(parent))
				)
				break
			end
			stack[code] = true
			code = data[3]
			data = m_families_data[code]
		end

		for code in pairs(stack) do
			checked[code] = true
		end
	end

	-- Any languages set aside as candidates for having no children are checked
	-- again, now that nonempty_families is definitely complete.
	for code, obj in next, double_check_if_empty do
		if not (nonempty_families[code] or allowed_empty_families[code]) then
			discrepancy(modname,
				"%s has no child families or languages.",
				link(obj)
			)
		end
	end

	check_no_alias_codes(modname, m_families_data)
	check_code_to_name_and_name_to_code_maps(
		"families",
		"[[Module:families/data]]",
		all_codes, family_names,
		"families/code to canonical name", m_families_codes,
		"families/canonical names", m_families_canonical_names)
end

-- TODO: add collision check between the canonical names "X" and "X [Ss]cript".
local function check_scripts()
	local modname = "scripts/data"

	local check_script_data_keys = check_data_keys(
		1, 2, 3, -- canonical name, Wikidata item, writing systems
		"other_names", "aliases", "varieties", "parent", "ietf_subtag", "type",
		"wikipedia_article", "ranges", "characters", "spaces", "capitalized", "translit", "direction",
		"character_category", "normalizationFixes", "sort_by_scraping",
		"display_text", "sort_key", "strip_diacritics"
	)

	-- Just to satisfy requirements of check_code_to_name_and_name_to_code_maps.
	local script_code_to_module_map = {}

	for code, data in pairs(m_scripts_data) do
		local obj, canonical_name = make_script(code, data), data[1]
		if not m_scripts_codes[code] and #code == 4 then
			discrepancy("scripts/code to canonical name",
				"The code %s is missing",
				link(obj, true)
			)
		end

		check_script_data_keys(modname, obj, data)

		if not canonical_name then
			discrepancy(modname,
				"The code <code>%s</code> has no canonical name specified.",
				code
			)
		elseif script_names[canonical_name] then
			local canonical_script = get_script_by_canonical_name(canonical_name)
			if not canonical_script then
				discrepancy(modname,
					"%s has a canonical name that cannot be looked up.",
					link(obj)
				)
			--[[elseif data.main_code ~= canonical_script:getCode() then
				discrepancy(modname,
					"%s has a canonical name that is not unique; it is also used by the code <code>%s</code>.",
					link(obj), script_names[canonical_name]
				)]]
			end
		else
			if not m_scripts_canonical_names[canonical_name] and #code == 4 then
				discrepancy("scripts/canonical names",
					"The canonical name %s is missing.",
					link(obj)
				)
			end
			script_names[canonical_name] = code
		end

		check_other_names_aliases_varieties(modname, obj, data, canonical_name)

		if not nonempty_scripts[code] then
			discrepancy(modname,
				"%s is not used by any language%s.",
				link(obj),
				data.characters and "" or " and has no characters listed for auto-detection")
		--[[elseif not data.characters then
			discrepancy(modname,
				"%s has no characters listed for auto-detection.",
				link(obj)
			)--]]
		end

		if data.characters then
			validate_pattern(data.characters, modname, obj, false)
		end

		check_wikidata_item(modname, obj, data, 2)

		script_code_to_module_map[code] = modname
	end

	check_no_alias_codes(modname, m_scripts_data)
	check_code_to_name_and_name_to_code_maps(
		"scripts",
		"a submodule of [[Module:scripts]]",
		script_code_to_module_map, script_names,
		"scripts/code to canonical name", m_scripts_codes,
		"scripts/canonical names", m_scripts_canonical_names)
end

-- FIXME: this is quite messy.
local function check_wikidata_languages()
	local data = json_decode(new_title("Module:languages/data/wikidata.json"):getContent())
	local seen = {{}, {}, {}, [5] = {}}
	for _, item in ipairs(data) do
		local id = item.id
		for k, v in pairs(item) do
			if k ~= "id" then
				local _seen = seen[k]
				for _, code in ipairs(v) do
					local _code = code[1]
					local _type = type(_seen[_code])
					if _type == "table" then
						insert(_seen[_code], id)
					elseif _type == "string" then
						_seen[_code] = {_seen[_code], id}
					else
						_seen[_code] = id
					end
				end
			end
		end
	end
	local modname = "languages/data/wikidata.json"
	for k, v in pairs(seen) do
		for code, ids in pairs(v) do
			if type(ids) == "table" then
				local t = {}
				for i, id in ipairs(ids) do
					t[i] = format("<code>[[d:%s|%s]]</code>", id, id)
				end
				discrepancy(modname,
					"<code>%s</code> is set as an ISO 639-%d code on multiple items: %s.",
					code, k, list_to_text(t)
				)
			end
		end
	end
end

local function check_labels()
	local check_label_data_keys = check_data_keys(
		"display", "Wikipedia", "glossary",
		"plain_categories", "topical_categories", "pos_categories", "regional_categories", "sense_categories",
		"omit_preComma", "omit_postComma", "omit_preSpace",
		"deprecated", "track"
	)

	local function check_label(modname, code, data)
		local _type = type(data)
		if _type == "table" then
			check_label_data_keys(modname, code, data)
		elseif _type ~= "string" then
			discrepancy(modname,
				"The data for the label <code>%s</code> is %s %s; only tables and strings are allowed.",
				code, add_indefinite_article(_type)
			)
		end
	end

	for _, module in ipairs{"", "/regional", "/topical"} do
		local modname = "Module:labels/data" .. module
		module = require(modname)
		for label, data in pairs(module) do
			check_label(modname, label, data)
		end
	end
	for code in pairs(m_languages_codes) do
		local modname = "Module:labels/data/lang/" .. code
		local module = safe_require(modname)
		if module then
			for label, data in pairs(module) do
				check_label(modname, label, data)
			end
		end
	end
end

local function check_zh_trad_simp()
	local m_ts = require("Module:zh/data/ts")
	local m_st = require("Module:zh/data/st")
	local ruby = require("Module:ja-ruby").ruby_auto
	local lang = get_language_by_code("zh")
	local Hant = get_script_by_code("Hant")
	local Hans = get_script_by_code("Hans")

	local data = {[0] = m_st, m_ts}
	local mod = {[0] = "st", "ts"}
	local var = {[0] = "Simp.", "Trad."}
	local sc = {[0] = Hans, Hant}

	local function find_stable_loop(chars, other, j)
		local display = ruby({["markup"] = "[" .. other .. "](" .. var[(j+1)%2] .. ")"})
		display = language_link{term = other, alt = display, lang = lang, sc = sc[(j+1)%2], tr = "-"}
		insert(chars, display)
		if data[(j+1)%2][other] == other then
			insert(chars, other)
			return chars, 1
		elseif not data[(j+1)%2][other] then
			insert(chars, "not found")
			return chars, 2
		elseif data[j%2][data[(j+1)%2][other]] ~= other then
			return find_stable_loop(chars, data[(j+1)%2][other], j + 1)
		else
			local display = ruby({["markup"] = "[" .. data[(j+1)%2][other] .. "](" .. var[j%2] .. ")"})
			display = language_link{term = data[(j+1)%2][other], alt = display, lang = lang, sc = sc[j%2], tr = "-"}
			insert(chars, display .. " (")
			display = ruby({["markup"] = "[" .. data[j%2][data[(j+1)%2][other]] .. "](" .. var[(j+1)%2] .. ")"})
			display = language_link{term = data[j%2][data[(j+1)%2][other]], alt = display, lang = lang, sc = sc[(j+1)%2], tr = "-"}
			insert(chars, display .. " etc.)")
			return chars, 3
		end
		return chars
	end

	for i = 0, 1, 1 do
		for ch, other_ch in pairs(data[i]) do
			if data[(i+1)%2][other_ch] ~= ch then
				local chars, issue = {}
				local display = ruby({["markup"] = "[" .. ch .. "](" .. var[i] .. ")"})
				display = language_link{term = ch, alt = display, lang = lang, sc = sc[i], tr = "-"}
				insert(chars, display)
				chars, issue = find_stable_loop(chars, other_ch, i)
				if issue == 1 or issue == 2 then
					local sc_this, mod_this, j = {}
					if match(chars[#chars-1], var[(i+1)%2]) then
						j = 1
					else
						j = 0
					end
					mod_this = mod[(i+j)%2]
					sc_this = {[0] = sc[(i+j)%2], sc[(i+j+1)%2]}
					for k, ch in ipairs(chars) do
						chars[k] = tag_text(ch, lang, sc_this[k%2], "term")
					end
					local modname = "zh/data/" .. mod_this
					if issue == 1 then
						discrepancy(modname,
							"character references itself: %s",
							concat(chars, " → ")
						)
					elseif issue == 2 then
						discrepancy(modname,
							"missing character: %s",
							concat(chars, " → ")
						)
					end
				elseif issue == 3 then
					for j, ch in ipairs(chars) do
						chars[j] = tag_text(ch, lang, sc[(i+j)%2], "term")
					end
					discrepancy("zh/data/" .. mod[i],
						"possible mismatched character: %s",
						concat(chars, " → ")
					)
				end
			end
		end
	end
end

local function check_serialization(modname)
	local serializers = {
		["Hani-sortkey/data/serialized"] = "Hani-sortkey/serializer",
	}

	if not serializers[modname] then
		return nil
	end

	local serializer = serializers[modname]
	local current_data = require("Module:" .. serializer).main(true)
	local stored_data = require("Module:" .. modname)
	if current_data ~= stored_data then
		discrepancy(modname,
			"<strong><u>Important!</u> Serialized data is out of sync. Use [[Module:%s]] to update it. If you have made any changes to the underlying data, the serialized data <u>must</u> be updated before these changes will take effect.</strong>",
			serializer
		)
	end
end

local find_code = require("Module:memoize")(function(message)
	return match(message, "<code>([^<]+)</code>")
end)

local function compare_messages(message1, message2)
	local code1, code2 = find_code(message1), find_code(message2)
	if code1 and code2 then
		return code1 < code2
	else
		return message1 < message2
	end
end

-- Warning: cannot be called twice in the same module invocation because
-- some module-global variables are not reset between calls.
local function do_checks(frame, modules)
	messages = setmetatable({}, messages_mt)

	if modules["zh/data/ts"] or modules["zh/data/st"] then
		check_zh_trad_simp()
	end
	check_languages(frame)
	check_etym_languages()

	-- families and scripts must be checked AFTER languages; languages checks fill out
	-- the nonempty_families and nonempty_scripts tables, used for testing if a family/script
	-- is ever used in the data
	check_families()
	check_scripts()
	check_wikidata_languages()
	if modules["labels/data"] then
		check_labels()
	end

	for module in pairs(modules) do
		check_serialization(module)
	end

	setmetatable(messages, nil)

	for _, msglist in pairs(messages) do
		msglist:sort(compare_messages)
	end

	local ret = messages
	messages = nil
	return ret
end

local function format_message(modname, msglist)
	local header
	if match(modname, "^Module:") or match(modname, "^Template:") then
		header = "===[[" .. modname .. "]]==="
	else
		header = "===[[Module:" .. modname .. "]]==="
	end
	return header .. msglist:map(function(msg) return "\n* " .. msg end):concat()
end

function export.check_modules_t(frame)
	local args = frame.args
	local modules = list_to_set(args)
	local ret = Array()
	local messages = do_checks(frame, modules)
	for _, module in ipairs(args) do
		local msglist = messages[module]
		if msglist then
			ret:insert(format_message(module, msglist))
		end
	end
	return ret:concat("\n")
end

function export.perform(frame)
	local messages = do_checks(frame, {})

	-- Format the messages
	local ret = Array()
	for modname, msglist in sorted_pairs(messages) do
		ret:insert(format_message(modname, msglist))
	end

	-- Are there any messages?
	-- TODO: check how many messages there are.
	if false then --if i == 1 then
		return "<b class=\"success\">Glory to Arstotzka.</b>"
	else
		ret:insert(1, "<b class=\"warning\">Discrepancies detected:</b>")
		return ret:concat("\n")
	end
end

return export
