One surprising use of our image recognition technology is finding “duplicate” images. Used effectively, this technique lets any image-heavy operation limit the amount of space its images take up by keeping only one copy of each image. Here is how we do it.
Let’s say you run a database of products that also contains images of those products, such as a catalogue, a store, or a classifieds site. You also allow your customers to upload images to accompany their descriptions. This is a crucial feature, but it is also the point where you don’t really control what your customers upload. On an e-commerce site you’ll end up with plenty of similar images, especially when something is in season (and something always is): everybody downloads the same image from Google search results and re-uploads it to your website. Since there’s no effective way to catalogue huge numbers of images (how would you do it: by file names? by tags? and you have thousands of them…), the only way to optimise is to check, “is there such an image in our database already?”
Our technology, which is based on pattern matching, can do exactly that. Since every image is a sort of “pattern”, we can build a database of patterns (which we call “reference images”). You can then query that database to check whether it contains an image like the one you point to. Because our engine transforms each “reference image” into a set of “features”, we can store those in the database far more easily and search them efficiently (a set of features is much smaller than a bitmap). What’s more, this “feature extraction” is tolerant of some modifications (the image may be upside-down or rotated) and even of damage (up to 30% of the image can be hidden and the pattern will still match).
Recognize.im Business Development Director