User:Dvdkhl/Ideas: Difference between revisions

* Image Hashing
** http://www.phash.org/docs/pubs/thesis_zauner.pdf
** Resizing based (not optimal?)
*** http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html
*** http://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html
** Detect Text Bubbles
** Simple OCR
*** Recognising which character set is used (Latin, Asian)
**** Infer which language is used
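For reference, a minimal sketch of one common resizing-based scheme, the average hash: shrink the grayscale image to a tiny grid, then emit one bit per cell depending on whether it is brighter than the grid mean. The function names, the 8x8 grid size and the box-filter downscale are assumptions made for this illustration; the input is taken to be a list of rows of gray values.

```python
def shrink(gray, size):
    """Box-filter downscale of a 2D list of gray values to size x size cells."""
    height, width = len(gray), len(gray[0])
    out = []
    for ty in range(size):
        y0 = ty * height // size
        y1 = max(y0 + 1, (ty + 1) * height // size)
        for tx in range(size):
            x0 = tx * width // size
            x1 = max(x0 + 1, (tx + 1) * width // size)
            cells = [gray[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            out.append(sum(cells) / len(cells))
    return out


def average_hash(gray, size=8):
    """One bit per cell: 1 if the cell is brighter than the mean of all cells."""
    small = shrink(gray, size)
    mean = sum(small) / len(small)
    bits = 0
    for value in small:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")
```

Two images are then compared by the Hamming distance of their hashes. A uniform brightness shift leaves the hash unchanged because the mean shifts with it, which is also why this kind of hash may be too coarse to tell apart pages that differ only in small text.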
=== Image Hash Algorithm ===
* Convert to grayscale (Black and white with thresholding?)
* Rotate the image so the height is always greater than or equal to the width
** This is done to reduce aspect-ratio distortion
* Resize to a fixed dimension
** Must keep the written text readable
** Proposed dimension: 1024x2048 (the common ratios 1:3 and 2:3 average to 1.5:3, i.e. 1:2)
* Tile the image into 64x64 blocks
** This yields 16 x 32 = 512 blocks for every image
* Apply a 2D DCT to each block
* Take only a specific range of coefficients from each 2D DCT block
* Produce hash from the selected ranges
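The steps above could be sketched roughly as follows. This is only an illustration under several assumptions: the image is already grayscale and already resized so that both dimensions are multiples of the block size, the DCT is an unnormalized DCT-II, and the coefficient range (`low`, `high`) and the median-threshold bit rule in `block_bits` are hypothetical placeholders, since the notes do not fix them yet.

```python
import math


def normalize_orientation(gray):
    """Rotate by 90 degrees clockwise if needed so that height >= width."""
    height, width = len(gray), len(gray[0])
    if height >= width:
        return gray
    return [[gray[height - 1 - y][x] for y in range(height)] for x in range(width)]


def tile(gray, block):
    """Split into block x block tiles in row-major order."""
    height, width = len(gray), len(gray[0])
    return [[row[x:x + block] for row in gray[y:y + block]]
            for y in range(0, height, block)
            for x in range(0, width, block)]


def dct_1d(values):
    """Unnormalized DCT-II of a sequence."""
    n = len(values)
    return [sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
            for k in range(n)]


def dct_2d(block):
    """Separable 2D DCT-II: transform the rows, then the columns."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def block_bits(coeffs, low=1, high=4):
    """Hypothetical selection: low frequencies low..high-1, skipping the DC term.
    One bit per coefficient: 1 if it lies above the median of the selection."""
    selected = [coeffs[v][u] for v in range(low, high) for u in range(low, high)]
    median = sorted(selected)[len(selected) // 2]
    return [1 if c > median else 0 for c in selected]


def image_hash(gray, block=64):
    """Concatenate the bits of every tile into one list of bits."""
    bits = []
    for t in tile(normalize_orientation(gray), block):
        bits.extend(block_bits(dct_2d(t)))
    return bits
```

With the proposed 1024x2048 size and 64x64 tiles, `tile` returns 512 blocks, so this placeholder selection would give 512 x 9 bits per image. The pure-Python DCT is slow at that size; it is meant to show the data flow, not to be used as is.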
==== Grayscale Conversion ====
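<pre>// Grayscale conversion: neutral pixels keep their value, colored ones are remapped.
static byte ToGrayScale(byte r, byte g, byte b) {
  // The darkest channel is the base value.
  var min = Math.Min(r, Math.Min(g, b));
  // Each factor is 1 for equal channels and grows to 3 at a difference of 255.
  var rgDiff = 1 + Math.Log(1 + Math.Abs(r - g), 256) * 2;
  var rbDiff = 1 + Math.Log(1 + Math.Abs(r - b), 256) * 2;
  var bgDiff = 1 + Math.Log(1 + Math.Abs(g - b), 256) * 2;
  // Scale by the mean factor and clamp to the byte range.
  var scaled = min * (rgDiff + rbDiff + bgDiff) / 3;
  return (byte)Math.Max(0, Math.Min(255, scaled));
}</pre>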
* Grayscale images are unaffected
* Pixels that show color are suppressed
** Works as long as there are (black) contour borders
** Might need border detection for contourless images
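To check the two claims above against concrete values, here is a small Python port of the conversion. The name `to_gray` and the sample pixels are chosen for this illustration only.

```python
import math


def to_gray(r, g, b):
    """Python port of the ToGrayScale pseudocode above (channels in 0..255)."""
    def factor(x, y):
        # 1.0 for equal channels, rising to 3.0 at a difference of 255.
        return 1 + math.log(1 + abs(x - y), 256) * 2

    base = min(r, g, b)
    scaled = base * (factor(r, g) + factor(r, b) + factor(g, b)) / 3
    # Clamp to the byte range and truncate like the (byte) cast.
    return int(max(0, min(255, scaled)))
```

A neutral pixel such as (128, 128, 128) comes back as 128, since all three factors are 1. A saturated pixel such as (255, 0, 0) maps to 0 because its minimum channel is 0, while a pale tint such as (255, 200, 200) is scaled past the clamp and maps to 255. So colored areas are pushed toward black or white, which fits the note that contour borders are needed to keep shapes recognisable.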


== Remove/Change "hostile" judgemental anime/character tags? (Abandoned) ==
* Meh, just me overreacting a bit
** Approval rating seems to take care of it