There could be several variants. In one mode, the CAPTCHA could generate a set of avatars and a set of descriptions and ask the user to match each description to its avatar. Another variant could show either an avatar or a description, then generate a set of counterpart descriptions or avatars, depending on the case, and ask the user to choose the one that matches the presented avatar or description.
This wouldn’t work. I can easily create a bot that matches those images with the texts. The correspondence between both is public, static and deterministic.
Each pair of avatar and description would be generated from the same string, so I was working on the premise that there is no way to go from a given avatar or description back to the string that was used to generate it.
To be honest, I’d have to implement this myself to see what you mean about the determinism and the correspondence between the two sets of avatars and descriptions. But perhaps the avatar could be rasterized with a different scale and offset inside the hexagon, and a watermark could be added, so that image recognition algorithms can’t figure out the content of the image. Something similar could be done for the description.
If I’m not mistaken, you go from string to avatar to description. So if I have the avatar I can easily calculate the description and match it to the correct one.
The scale, offset and watermark are good ideas though. That would make it harder.
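To make the determinism point concrete, here is a minimal sketch. The generators in it (`avatar_from_seed`, `description_from_avatar`) are hypothetical toy stand-ins, not the real avatar or description code, which isn't shown in this thread. The only assumption that matters is that the avatar-to-description mapping is public and deterministic. Under that assumption the bot never needs to recover the original string:

```python
import hashlib

# Hypothetical vocab for a toy description generator (assumption, not the real one).
ADJECTIVES = ["red", "blue", "green", "gold"]
SHAPES = ["triangle", "square", "circle", "star"]


def avatar_from_seed(seed):
    """Toy 'avatar': four small integers derived one-way from the seed string."""
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    return tuple(b % 4 for b in digest[:4])


def description_from_avatar(avatar):
    """Public, deterministic mapping from avatar features to text."""
    parts = [f"{ADJECTIVES[avatar[i]]} {SHAPES[avatar[i + 1]]}" for i in (0, 2)]
    return " and ".join(parts)


def bot_solve(avatar, candidates):
    """The bot never sees the seed; it just recomputes the description."""
    return candidates.index(description_from_avatar(avatar))


# The server builds a challenge from secret seeds...
seeds = ["alice", "bob", "carol"]
avatars = [avatar_from_seed(s) for s in seeds]
candidates = [description_from_avatar(a) for a in avatars]

# ...and the bot matches every avatar without knowing any seed.
for avatar in avatars:
    choice = bot_solve(avatar, candidates)
    print(avatar, "->", candidates[choice])
```

The hash being one-way doesn't help here, because the attack runs forward from the avatar, not backward to the string. Distorting the rendered image (scale, offset, watermark) only raises the cost of extracting the avatar features in the first place.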
Thanks for the info. It’s very much appreciated.
An alternative approach could be to create a sliding puzzle from the user’s address identicon. The difficulty of the puzzle (the number of pieces to slide) could change based on heuristics and techniques such as browser fingerprinting.
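A rough sketch of how the difficulty scaling might look. Everything here is an assumption for illustration: the `risk` score in [0, 1] is a hypothetical output of whatever fingerprinting heuristics are used, and `moves_for_risk` and `scramble` are made-up names. The puzzle is scrambled by applying legal moves to the solved board, so it is always solvable:

```python
import random


def moves_for_risk(risk, min_moves=3, max_moves=30):
    """Map a hypothetical risk score in [0, 1] to a number of scramble moves."""
    risk = min(max(risk, 0.0), 1.0)  # clamp out-of-range scores
    return min_moves + round(risk * (max_moves - min_moves))


def scramble(size, moves, rng):
    """Scramble a size x size sliding puzzle; 0 marks the blank tile.

    Starts from the solved board and slides tiles into the blank,
    so the result can always be solved.
    """
    board = list(range(1, size * size)) + [0]
    blank = size * size - 1
    prev = None  # previous blank position, to avoid undoing the last move
    for _ in range(moves):
        row, col = divmod(blank, size)
        neighbors = []
        if row > 0:
            neighbors.append(blank - size)
        if row < size - 1:
            neighbors.append(blank + size)
        if col > 0:
            neighbors.append(blank - 1)
        if col < size - 1:
            neighbors.append(blank + 1)
        target = rng.choice([n for n in neighbors if n != prev])
        board[blank], board[target] = board[target], board[blank]
        prev, blank = blank, target
    return board


# A suspicious-looking client gets a harder puzzle than a normal one.
rng = random.SystemRandom()  # OS-backed randomness, not seedable by a client
easy = scramble(3, moves_for_risk(0.1), rng)
hard = scramble(3, moves_for_risk(0.9), rng)
print(easy)
print(hard)
```

Each tile index would map to one slice of the identicon image. The number of scramble moves is only an upper bound on how many slides the user needs, since random moves can partly cancel each other out.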