User:Belove/AOM Sample Rename Patterns

From AniDB

Latest revision as of 11:50, 23 January 2016

AOM Sample Rename Patterns

Tested with AOM 0.5.18.276


replacerepeat_()

Recursive replacement for replacerepeat()

replacerepeat_(text, find, replace)

// replacerepeat_(text, find, replace)
//     -- Repeatedly replaces 'find' in 'text' with 'replace', until 'text'
//     stops changing.  (Substitutes for built-in replacerepeat(), which is not
//     working.)
function('replacerepeat_', _
	set('replacerepeat_new_text', _
		replace(param1, param2, param3) _
	) + _
	if(replacerepeat_new_text<>param1, _
		replacerepeat_(replacerepeat_new_text, param2, param3), _
		replacerepeat_new_text _
	) _
)
Example usage | Output
replacerepeat_('1a2aa5aaaaa8aAaAaAaA', 'aa', 'a') | 1a2a5a8aAaAaAaA
replacerepeat_('[1 2  5     8        ]', '  ', ' ') | [1 2 5 8 ]
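
For readers who want to try the repeat-until-stable idea outside AOM, here is a minimal Python sketch. It is not the AOM implementation: the name replace_repeat is hypothetical, it loops instead of recursing, and the length guard is an assumption borrowed from the length check in replacegreedy_internal() (the AOM replacerepeat_() above has no such guard yet).

```python
def replace_repeat(text: str, find: str, replace: str) -> str:
    """Repeat str.replace() until the text stops changing."""
    # An empty 'find' has nothing sensible to match, so return the input as is.
    if find == "":
        return text
    # Assumed guard (not in the AOM code above): a replacement longer than
    # 'find' could make the text grow forever, so it gets one ordinary pass.
    if len(replace) > len(find):
        return text.replace(find, replace)
    while True:
        new_text = text.replace(find, replace)
        if new_text == text:
            return new_text
        text = new_text


print(replace_repeat("1a2aa5aaaaa8aAaAaAaA", "aa", "a"))    # 1a2a5a8aAaAaAaA
print(replace_repeat("[1 2  5     8        ]", "  ", " "))  # [1 2 5 8 ]
```

With the guard in place the loop always ends: a shorter replacement shrinks the text on every changing pass, and an equal-length one can only move the text in one direction lexicographically.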

replacegreedy() / replacegreedyi()

Non-recursive, single-pass replacements for replacerepeat().

replacegreedy(text, find, replace)

replacegreedyi(text, find, replace) (case-insensitive version)

// replacegreedy(text, find, replace)
//     -- Replaces all occurrences of 'find' in 'text' with 'replace' using a
//     rolling window approach that results in the replacement of occurrences
//     that existed in the original text (like the built-in replace() function)
//     as well as (when length of find >= length of replace) any new occurrences
//     that emerge as a result of the replacements made (the length limitation
//     prevents an infinite number of replacements being called for).
//     (Substitutes for built-in replacerepeat(), which is not working.)
function('replacegreedy', _
	set('replacegreedy_casesensitive', True) + _
	replacegreedy_internal(param1, param2, param3) _
)

// replacegreedyi(text, find, replace)
//     -- Case-insensitive version of replacegreedy()
function('replacegreedyi', _
	set('replacegreedy_casesensitive', False) + _
	replacegreedy_internal(param1, param2, param3) _
)

// replacegreedy_internal(text, find, replace)
//     -- Do not call directly; called from replacegreedy() / replacegreedyi()
function('replacegreedy_internal', _
	set('replacegreedy_find_len', length(param2)) + _
	if(replacegreedy_find_len = 0, _
		param1, _
		if(replacegreedy_find_len < length(param3), _
			// optimize cases where standard replace()/replacei() will give the same result.
			if(replacegreedy_casesensitive, _
				replace(param1, param2, param3), _
				replacei(param1, param2, param3) _
			), _
			// begin greedy algorithm
			set('replacegreedy_text_len', length(param1)) + _
			if((replacegreedy_text_len < replacegreedy_find_len) OR (replacegreedy_text_len = 0), _
				param1, _
				set('replacegreedy_new_text', _
					if(replacegreedy_find_len = 1, '', limit(param1, replacegreedy_find_len - 1)) _
				) + _
				for('replacegreedy_text_pos', replacegreedy_find_len, replacegreedy_text_len, _
					set('replacegreedy_new_text', _
						replacegreedy_new_text + _
						copy(param1, replacegreedy_text_pos, 1) _
					) + _
					set('replacegreedy_new_text_len', length(replacegreedy_new_text)) + _
					set('replacegreedy_text_found', _
						copy(replacegreedy_new_text, _
							(replacegreedy_new_text_len - replacegreedy_find_len) + 1, _
							replacegreedy_find_len _
						) _
					) + _
					if(if(replacegreedy_casesensitive, _
							start(replacegreedy_text_found, param2), _
							starti(replacegreedy_text_found, param2) _
						), _
						set('replacegreedy_new_text', _
							if(replacegreedy_new_text_len <> replacegreedy_find_len, _
								limit(replacegreedy_new_text, _
									replacegreedy_new_text_len - replacegreedy_find_len _
								) + param3, _
								// the whole buffer is the match; limit(text, 0) would return the whole text, so use the replacement directly
								param3 _
							) _
						) _
					) _
				) + _
				replacegreedy_new_text _
			) _
		) _
	) _
)
Example usage | Output
replacegreedy('1a2aa5aaaaa8aAaAaAaA', 'aa', 'a') | 1a2a5a8aAaAaAaA
replacegreedyi('1a2aa5aaaaa8aAaAaAaA', 'aa', 'a') | 1a2a5a8a
replacegreedy('[1 2  5     8        ]', '  ', ' ') | [1 2 5 8 ]
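
The rolling-window approach can also be sketched in Python. This is a rough illustration of the idea rather than a line-for-line port of replacegreedy_internal(): the name replace_greedy and the case_sensitive parameter are hypothetical, and the sketch always substitutes the replacement when the window matches, including when the match sits at the very start of the text.

```python
import re


def replace_greedy(text: str, find: str, replace: str,
                   case_sensitive: bool = True) -> str:
    """Single pass over 'text', re-checking the tail of the output buffer."""
    if find == "":
        return text
    if len(find) < len(replace):
        # Longer replacement: fall back to one ordinary replace pass.
        if case_sensitive:
            return text.replace(find, replace)
        # Callable replacement so backslashes in 'replace' stay literal.
        return re.sub(re.escape(find), lambda m: replace, text,
                      flags=re.IGNORECASE)
    key = find if case_sensitive else find.lower()
    n = len(find)
    buf = ""
    for ch in text:
        buf += ch
        tail = buf[-n:]  # the last n characters written so far
        probe = tail if case_sensitive else tail.lower()
        if len(tail) == n and probe == key:
            # Swap the matched window for the replacement, then keep going.
            buf = buf[:-n] + replace
    return buf


print(replace_greedy("1a2aa5aaaaa8aAaAaAaA", "aa", "a"))         # 1a2a5a8aAaAaAaA
print(replace_greedy("1a2aa5aaaaa8aAaAaAaA", "aa", "a", False))  # 1a2a5a8a
print(replace_greedy("[1 2  5     8        ]", "  ", " "))       # [1 2 5 8 ]
```

Because the window is checked against the output buffer rather than the original text, matches created by earlier replacements are caught in the same pass: replace_greedy('abb', 'ab', 'a') gives 'a', where a plain replace would give 'ab'.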

To-do and ideas

  • bitand(uint1, uint2) As Boolean
  • bitor(uint1, uint2) As Boolean
  • bitxor(uint1, uint2) As Boolean
  • fileversion(F) As Number - supporting above version 6. May depend on bitand().
  • v(F) As String ('', 'v2', 'v3',...) - Depends on fileversion().
  • iscensored(F) As Boolean - May depend on bitand().
  • isuncensored(F) As Boolean - May depend on bitand().
  • centag(F) ['[UNCEN]' or '[CEN]' or ''] As String
  • Specific-purpose functions to handle extra spaces in a non-generalized way, illegal filename characters, general filename cleanup.
  • Rounding functions.
  • left(string, characters As Number) As String - Like limit(string, number), but supports number < 1 by returning '' (limit() returns the whole text).
  • right(string, characters) As String - Like left(), but right()!
  • replacemid(string, start, length, insert text) As String (modified string) - length can be 0 to just insert text, or longer to replace it.
  • findnext(find, start, text) As Number (position) | Boolean (False if not found)
  • findlast(find, start, text) As Number (position) | Boolean (False if not found) - searches right-to-left for find in text, but start and position are still relative to start of text.
  • count(separator, text) - Count separated values in text.
  • counte(separator, text) - Count separated values in text, ignoring escaped separators.
  • escape(character, text)
  • unescape(character, text)
  • splite(separator, text, part number) - Like split(), but ignores escaped (doubled) separators, and returns the part unescaped. Depends on unescape().
  • joine(separator, text1, text2) - Like join(), but escapes any separator found in the texts and is limited to two text parameters.
  • cbool(string | number) & etc.
  • coladd(collection, item name | False | '', value) As String (collection) - Adds or replaces a named value in the collection, creating the collection if necessary, while storing the data type. If item name is boolean False or '', the item can only be referenced by index. Limitations: can't support objects; floats are coerced to decimals. The collection is stored in a String beginning with a marker indicating it's a collection, followed by the item count, the item names, and finally the item data type/value pairs. Optionally sort the index by item name for efficiency (if it is worth it and possible). Collections could be used to implement scope and call stacks: for example, at the beginning of a function, add a collection of the local variables' existing values to a stack collection, and at the end of the function, restore all values. The collection structure could be complicated further to implement shared scope, if needed.
  • colremove(collection, item name | item number) As String (collection)
  • colget(collection, item name | item number) As String (value)
  • colcount(collection) As Number (item count)
  • colitem(collection, item name | item number) As {Number (item number) | String (item name) | Boolean (False)} - Returns the item number if passed a valid item name, the item name if passed a valid item number and the item is named, '' if passed an unnamed item number, and False if passed an invalid name or invalid number.
  • colsort(collection) As String (collection) - e.g. bubble sort (if sorting is even possible to implement).
  • Possibly add functions to manipulate one, two, and three-dimensional arrays respectively. Could be wrappers for (nested) collections without item names.
  • replacerepeat_(): Add anti-infinity test (find_len <=(<?) replace_len)
  • replacerepeati_()
  • replacegreedy(): Replace matches that begin _before_ the match string by building a larger starting buffer up to offset_max = (find_len * 2) - 1, with new_text_len = min(offset_max - 1, text_len - 1), and replacing with for(text_pos, new_text_len + 1, text_len), then replacing {without appending to new_text} with for(new_text_pos, new_text_pos + 1, new_text_pos + find_len) {roughly}. Consider calling a new function replacegreedy_internal_replace() that handles the instructions in the loop other than appending text, if it would reduce redundancy. Consider only returning output at the end (always storing in new_text), rather than returning text from several if() clauses as now, to make the function's behavior easier to understand.
  • Structure my AOM wiki page to group general-purpose and applied functions separately.
  • Move this wiki page.
  • Add a table to compare outputs of test suite of {text, find, replace} with each internal and custom replace function, as results will differ.
  • Add most useful and tested applications to main documentation (possibly eliminating or reducing dependencies and use of function() for simplicity)
  • Document _ line continuation, + concatenation operator, Mod modulo operator, ^ exponent operator (if indeed supported), whether < and > (etc.) work as string comparison operators, strtoint(), replacerepeat() misbehavior, False as default value for unset variables -- including objects, and other built-in features still undocumented, in main documentation.
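
As a rough illustration of the escape(), unescape(), splite() and joine() ideas listed above, here is a Python sketch of doubled-separator escaping. Nothing here exists in AOM: the signatures follow the to-do entries, while the 1-based part number and the '' result for a missing part are assumptions made for the example, and the separator is assumed to be a single character.

```python
def escape(ch: str, text: str) -> str:
    """Double every occurrence of the separator character."""
    return text.replace(ch, ch + ch)


def unescape(ch: str, text: str) -> str:
    """Collapse doubled separator characters back to single ones."""
    return text.replace(ch + ch, ch)


def joine(sep: str, text1: str, text2: str) -> str:
    """Join two texts, escaping any separator found inside them."""
    return escape(sep, text1) + sep + escape(sep, text2)


def splite(sep: str, text: str, part_number: int) -> str:
    """Return the unescaped part at the 1-based position (assumed), or ''."""
    parts = []
    current = ""
    i = 0
    while i < len(text):
        if text.startswith(sep + sep, i):
            current += sep          # doubled separator: literal character
            i += 2 * len(sep)
        elif text.startswith(sep, i):
            parts.append(current)   # single separator: end of this part
            current = ""
            i += len(sep)
        else:
            current += text[i]
            i += 1
    parts.append(current)
    if 1 <= part_number <= len(parts):
        return parts[part_number - 1]
    return ""  # assumed result for a missing part


joined = joine(",", "a,b", "c")
print(joined)                  # a,,b,c
print(splite(",", joined, 1))  # a,b
print(splite(",", joined, 2))  # c
```

One caveat with doubling as the escape scheme: a part that begins with the separator (or an empty part next to one) produces a run of three separators that cannot be split unambiguously, so a real implementation might need a distinct escape character.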