Tag: Undo |
No edit summary |
||
| Line 57: | Line 57: | ||
===About different types of affixes ("template", "display", "link", "lookup" and "category"):=== | ===About different types of affixes ("template", "display", "link", "lookup" and "category"):=== | ||
* A "template affix" is an affix in its source form as it appears in a template call. Generally, a template affix has | * A "template affix" is an affix in its source form as it appears in a template call. Generally, a template affix has an | ||
attached template hyphen (see above) to indicate that it is an affix and indicate what type of affix it is (prefix, | |||
suffix, interfix or circumfix), but some of the older-style templates such as {{tl|suffix}}, {{tl|prefix}}, | |||
{{tl|confix}}, etc. have "positional" affixes where the presence of the affix in a certain position (e.g. the second | |||
or third parameter) indicates that it is a certain type of affix, whether or not it has an attached template hyphen. | |||
* A "display affix" is the corresponding affix as it is actually displayed to the user. The display affix may differ | * A "display affix" is the corresponding affix as it is actually displayed to the user. The display affix may differ | ||
from the template affix for various reasons: | from the template affix for various reasons: | ||
| Line 72: | Line 71: | ||
languages have differences between the "template hyphen" specified in the template (which always needs to be | languages have differences between the "template hyphen" specified in the template (which always needs to be | ||
specified somehow or other in templates like {{tl|affix}}, to indicate that the term is an affix and what type of | specified somehow or other in templates like {{tl|affix}}, to indicate that the term is an affix and what type of | ||
affix it is) and the display hyphen (see above), with corresponding differences between template and display affixes. | affix it is) and the display hyphen (see above), with corresponding differences between template and display | ||
affixes. | |||
* A (regular) "link affix" is the affix that is linked to when the affix is shown to the user. The link affix is usually | * A (regular) "link affix" is the affix that is linked to when the affix is shown to the user. The link affix is usually | ||
the same as the display affix, but will differ in one of three circumstances: | the same as the display affix, but will differ in one of three circumstances: | ||
| Line 612: | Line 612: | ||
end | end | ||
if | if affix_type == "non-affix" then | ||
return term | return term | ||
elseif affix_type == "circumfix" then | elseif affix_type == "circumfix" then | ||
| Line 728: | Line 728: | ||
--[==[ | --[==[ | ||
For a given template term in a given language (see the definition of "template affix" near the top of the file), | For a given template term in a given language (see the definition of "template affix" near the top of the file), | ||
possibly in an explicitly specified script `sc` (but usually nil), return the term's affix type ({"prefix"}, {" | possibly in an explicitly specified script `sc` (but usually nil), return the term's affix type ({"prefix"}, | ||
{"suffix"}, {"circumfix"} or { | {"interfix"}, {"suffix"}, {"circumfix"} or {"non-affix"}) along with the corresponding link and display affixes | ||
near the top of the file); also the corresponding lookup affix (if `return_lookup_affix` is specified). The term passed | (see definitions near the top of the file); also the corresponding lookup affix (if `return_lookup_affix` is specified). | ||
in should already have any fragment (after the # sign) parsed off of it. Four values are returned: `affix_type`, | The term passed in should already have any fragment (after the # sign) parsed off of it. Four values are returned: | ||
`link_term`, `display_term` and `lookup_term`. The affix type can be passed in instead of autodetected | `affix_type`, `link_term`, `display_term` and `lookup_term`. The affix type can be passed in instead of autodetected; in | ||
this case, the template term need not have any attached hyphens, and the appropriate hyphens will be added in the | |||
hyphens will be added in the appropriate places. If `do_affix_mapping` is specified, look up the affix in the | appropriate places. If `do_affix_mapping` is specified, look up the affix in the lang-specific affix mappings, as | ||
lang-specific affix mappings, as described in the comment at the top of the file; otherwise, the link and display terms | described in the comment at the top of the file; otherwise, the link and display terms will always be the same. (They | ||
will always be the same. (They will be the same in any case if the template term has a bracketed link in it or is not | will be the same in any case if the template term has a bracketed link in it or is not an affix.) If | ||
an affix.) If `return_lookup_affix` is given, the fourth return value contains the term with appropriate lookup hyphens | `return_lookup_affix` is given, the fourth return value contains the term with appropriate lookup hyphens in the | ||
in the appropriate places; otherwise, it is the same as the display term. (This functionality is used in | appropriate places; otherwise, it is the same as the display term. (This functionality is used in | ||
[[Module:category tree/poscatboiler/data/affixes and compounds]] to convert link affixes into lookup affixes so that | [[Module:category tree/poscatboiler/data/affixes and compounds]] to convert link affixes into lookup affixes so that | ||
they can be looked up in the affix mapping tables.) | they can be looked up in the affix mapping tables.) | ||
| Line 744: | Line 744: | ||
local function parse_term_for_affixes(term, lang, sc, affix_type, do_affix_mapping, return_lookup_affix, affix_id) | local function parse_term_for_affixes(term, lang, sc, affix_type, do_affix_mapping, return_lookup_affix, affix_id) | ||
if not term then | if not term then | ||
return | return "non-affix", nil, nil, nil | ||
end | end | ||
if term == "^" then | |||
-- Indicates a null term to emulate the behavior of {{suffix|foo||bar}}. | |||
term = "" | |||
return "non-affix", term, term, term | |||
end | |||
if term:find("^%^") then | if term:find("^%^") then | ||
-- | -- HACK! A ^ at the beginning of a term in Korean languages has a special meaning, triggering capitalization of the | |
-- transliteration. Don't interpret it as "force non-affix" for those languages. | |||
local langcode = lang:getCode() | |||
if langcode ~= "ko" and langcode ~= "okm" and langcode ~= "jje" then | |||
-- Formerly we allowed ^ to force non-affix type; this is now handled using an inline modifier | |||
-- <naf>, <root>, etc. Throw an error for the moment when the old way is encountered. | |||
error("Use of ^ to force non-affix status is no longer supported; use an inline modifier <naf> or <root> " .. | |||
"after the component") | |||
end | |||
end | end | ||
| Line 763: | Line 775: | ||
thyph = "([" .. thyph .. "])" | thyph = "([" .. thyph .. "])" | ||
if affix_type | if not affix_type then | ||
if rfind(term, thyph .. " " .. thyph) then | if rfind(term, thyph .. " " .. thyph) then | ||
affix_type = "circumfix" | affix_type = "circumfix" | ||
| Line 770: | Line 782: | ||
local has_ending_hyphen = rfind(term, thyph .. "$") | local has_ending_hyphen = rfind(term, thyph .. "$") | ||
if has_beginning_hyphen and has_ending_hyphen then | if has_beginning_hyphen and has_ending_hyphen then | ||
affix_type = " | affix_type = "interfix" | ||
elseif has_ending_hyphen then | elseif has_ending_hyphen then | ||
affix_type = "prefix" | affix_type = "prefix" | ||
elseif has_beginning_hyphen then | elseif has_beginning_hyphen then | ||
affix_type = "suffix" | affix_type = "suffix" | ||
else | |||
affix_type = "non-affix" | |||
end | end | ||
end | end | ||
| Line 780: | Line 794: | ||
local link_term, display_term, lookup_term | local link_term, display_term, lookup_term | ||
if affix_type then | if affix_type == "non-affix" then | ||
link_term = term | |||
display_term = term | |||
lookup_term = term | |||
else | |||
display_term = reconstruct_term_per_hyphens(term, affix_type, scode, thyph, dhyph) | display_term = reconstruct_term_per_hyphens(term, affix_type, scode, thyph, dhyph) | ||
if do_affix_mapping then | if do_affix_mapping then | ||
| Line 800: | Line 818: | ||
lookup_term = display_term | lookup_term = display_term | ||
end | end | ||
end | end | ||
| Line 824: | Line 838: | ||
function export.make_affix(term, lang, sc, affix_type, do_affix_mapping, return_lookup_affix, affix_id) | function export.make_affix(term, lang, sc, affix_type, do_affix_mapping, return_lookup_affix, affix_id) | ||
if not (affix_type == "prefix" or affix_type == "suffix" or affix_type == "circumfix" or affix_type == "infix" or | if not (affix_type == "prefix" or affix_type == "suffix" or affix_type == "circumfix" or affix_type == "infix" or | ||
affix_type == "interfix") then | affix_type == "interfix" or affix_type == "non-affix") then | ||
error("Internal error: Invalid affix type " .. (affix_type or "(nil)")) | error("Internal error: Invalid affix type " .. (affix_type or "(nil)")) | ||
end | end | ||
| Line 887: | Line 901: | ||
-- can be used in the loop below when categorizing. | -- can be used in the loop below when categorizing. | ||
part.affix_type, part.affix_link_term, part.affix_display_term = parse_term_for_affixes(part.term, | part.affix_type, part.affix_link_term, part.affix_display_term = parse_term_for_affixes(part.term, | ||
part.lang, part.sc, | part.lang, part.sc, part.type, not part.alt, nil, part.id) | ||
-- If link_term is an empty string, either a bare ^ was specified or an empty term was used along with inline | -- If link_term is an empty string, either a bare ^ was specified or an empty term was used along with inline | ||
| Line 903: | Line 917: | ||
for i, part in ipairs_with_gaps(data.parts) do | for i, part in ipairs_with_gaps(data.parts) do | ||
local affix_type = part.affix_type | local affix_type = part.affix_type | ||
if affix_type then | if affix_type ~= "non-affix" then | ||
is_affix_or_compound = true | is_affix_or_compound = true | ||
-- Make a sort key. For the first part, use the second part as the sort key; the intention is that if the | -- Make a sort key. For the first part, use the second part as the sort key; the intention is that if the | ||
| Line 951: | Line 962: | ||
end | end | ||
-- Make sure there was either an affix or a compound (two or more | -- Make sure there was either an affix or a compound (two or more non-affix terms). | ||
if not is_affix_or_compound then | if not is_affix_or_compound then | ||
error("The parameters did not include any affixes, and the term is not a compound. Please provide at least one affix.") | error("The parameters did not include any affixes, and the term is not a compound. Please provide at least one affix.") | ||
| Line 999: | Line 1,010: | ||
-- Determine affix type and get link and display terms (see text at top of file). | -- Determine affix type and get link and display terms (see text at top of file). | ||
local affix_type, link_term, display_term = parse_term_for_affixes(part.term, part.lang, part.sc, | local affix_type, link_term, display_term = parse_term_for_affixes(part.term, part.lang, part.sc, | ||
part.type, not part.alt, nil, part.id) | |||
-- If the term is an | -- If the term is an interfix or the type was explicitly given, recognize it as such (which means e.g. that we | ||
-- will display the term without hyphens for East Asian languages). Otherwise, ignore the fact that it looks | |||
-- like an affix and display as specified in the template (but pay attention to the detected affix type for | |||
if affix_type == " | -- certain tracking purposes). | ||
if affix_type == "interfix" or (part.type and part.type ~= "non-affix") then | |||
-- If link_term is an empty string, either a bare ^ was specified or an empty term was used along with | -- If link_term is an empty string, either a bare ^ was specified or an empty term was used along with | ||
-- inline modifiers. The intention in either case is not to link the term. Don't add a '*fixed with' | -- inline modifiers. The intention in either case is not to link the term. Don't add a '*fixed with' | ||
| Line 1,011: | Line 1,023: | ||
-- redundant alt text. | -- redundant alt text. | ||
if link_term and link_term ~= "" and not part.part_lang then | if link_term and link_term ~= "" and not part.part_lang then | ||
table.insert(categories, {cat = data.pos .. " | table.insert(categories, {cat = data.pos .. " " .. affix_type .. "ed with " .. | ||
make_entry_name_no_links(part.lang, link_term), sort_key = part.sort or data.sort_key}) | |||
end | end | ||
part.term = link_term ~= "" and link_term or nil | part.term = link_term ~= "" and link_term or nil | ||
part.alt = part.alt or (display_term ~= link_term and display_term) or nil | part.alt = part.alt or (display_term ~= link_term and display_term) or nil | ||
else | else | ||
if affix_type then | if affix_type ~= "non-affix" then | ||
local langcode = data.lang:getCode() | local langcode = data.lang:getCode() | ||
-- If `data.lang` is an etymology-only language, track both using its code and its full parent's code. | |||
local full_langcode = data.lang:getFullCode() | local full_langcode = data.lang:getFullCode() | ||
else | else | ||
| Line 1,098: | Line 1,111: | ||
part.ts = export.make_affix(part.ts, part.lang, Latn, affix_type) | part.ts = export.make_affix(part.ts, part.lang, Latn, affix_type) | ||
end | end | ||
local function insert_affix_category(categories, pos, affix_type, part, sort_key, sort_base) | local function insert_affix_category(categories, pos, affix_type, part, sort_key, sort_base) | ||
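
The hyphen-based autodetection changed in the hunks at lines 775 to 794 can be sketched in isolation as below. This is a simplified illustration, not the module's code. It assumes the template hyphen is a single ASCII `-`, uses the standard `string.find` in place of the module's `rfind`, and the name `detect_affix_type` is hypothetical. It only mirrors the new behavior where a term with no template hyphens gets the type `"non-affix"` rather than nil.

```lua
-- Hypothetical, simplified stand-in for the detection step in parse_term_for_affixes.
-- Assumption: the template hyphen is a plain ASCII "-" only (the real module builds
-- a per-script character class of hyphens).
local function detect_affix_type(term)
	if term == nil then
		-- Mirrors the new early return for a missing term.
		return "non-affix"
	end
	-- Two hyphens separated by a space mark a circumfix, e.g. "ge- -t".
	if term:find("%- %-") then
		return "circumfix"
	end
	local has_beginning_hyphen = term:find("^%-") ~= nil
	local has_ending_hyphen = term:find("%-$") ~= nil
	if has_beginning_hyphen and has_ending_hyphen then
		return "interfix"
	elseif has_ending_hyphen then
		return "prefix"
	elseif has_beginning_hyphen then
		return "suffix"
	end
	-- No template hyphen found: an explicit string instead of nil.
	return "non-affix"
end
```

Returning an explicit `"non-affix"` string means callers compare with `affix_type ~= "non-affix"` instead of testing for nil, which is the pattern the later hunks in this diff switch to.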