Uses CLIP (Contrastive Language–Image Pretraining) to match your text query against every image in this gallery by meaning, not keywords.
Each image has a precomputed visual embedding, a vector that captures its visual content. Your query is embedded into the same space, and images are ranked by how closely their embeddings match it.
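As a rough illustration of the ranking step, here is a minimal sketch that assumes the match score is cosine similarity, the usual choice for CLIP embeddings; the metric this gallery actually uses is not stated here. The function names, image ids, and 3-dimensional toy vectors are all hypothetical stand-ins for real CLIP embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_images(query_embedding, image_embeddings):
    """Return image ids sorted from best match to worst match."""
    scores = {
        image_id: cosine_similarity(query_embedding, embedding)
        for image_id, embedding in image_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy vectors standing in for precomputed image embeddings (hypothetical data).
image_embeddings = {
    "img_001": [0.9, 0.1, 0.0],
    "img_002": [0.1, 0.9, 0.1],
    "img_003": [0.6, 0.6, 0.2],
}

# Toy vector standing in for the embedded text query (hypothetical data).
query_embedding = [1.0, 0.2, 0.0]

ranking = rank_images(query_embedding, image_embeddings)
print(ranking)
```

Because the image embeddings are computed ahead of time, only the query needs to be embedded at search time; the rest is a similarity comparison and a sort.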
Try descriptive phrases like "pink outfit", "heart pose", or "holding a plushie".
These are only examples. You aren't limited to predefined keywords: you can search for anything you can describe, including outfits, poses, expressions, accessories, locations, objects, colors, text, and concepts.
You can also combine multiple concepts in one search, like "blonde hair black hoodie airport", but more descriptive queries tend to give better results. For example, "a photo of a woman with blonde hair wearing a black hoodie at the airport" will usually return more accurate results.
Member count
Member Filter
Filter images by how many members appear in them, then optionally narrow further by selecting specific co-members.
The available co-member chips update dynamically based on your count selection and show only members that can further refine the results. Members that appear in every matching image are hidden, since selecting them would not change the results. Co-member chips with no matching images for the current selection are greyed out.
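The chip rules can be sketched as follows. This is only an illustration under assumed data: each image is modeled as a set of member names, "G" stands for the gallery member, and the function name and the state labels "enabled", "hidden", and "greyed" are hypothetical rather than taken from the gallery's code.

```python
def chip_states(images, gallery_member, count, selected=frozenset()):
    """Classify each candidate co-member chip for the current selection.

    images: list of sets of member names, one set per image (assumed model).
    count: the selected member count.
    selected: co-members already chosen.
    """
    selected = set(selected)
    # Images with the chosen member count.
    by_count = [members for members in images if len(members) == count]
    # Images that also contain every co-member selected so far.
    matching = [members for members in by_count if selected <= members]
    # Candidate chips: everyone seen at this count, minus the gallery
    # member and anyone already selected.
    candidates = set().union(*by_count) - {gallery_member} - selected

    states = {}
    for member in sorted(candidates):
        hits = sum(1 for members in matching if member in members)
        if hits == 0:
            states[member] = "greyed"   # no image matches with this member
        elif hits == len(matching):
            states[member] = "hidden"   # in every match, so it refines nothing
        else:
            states[member] = "enabled"  # selecting it narrows the results
    return states

# Hypothetical gallery: "G" is the gallery member.
images = [
    {"G", "A", "B"},
    {"G", "A", "C"},
    {"G", "B", "D"},
    {"G", "A"},
]

print(chip_states(images, "G", 3, {"A"}))
print(chip_states(images, "G", 2))
```

With 3 Members and Member A selected, D is greyed out because no three-member image contains both A and D. With 2 Members selected, A is hidden because A appears in every two-member image.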
For example, selecting 2 Members allows you to choose one co-member, while selecting 4 Members allows up to three co-members. The maximum number of selectable co-members is always count - 1, since one slot is always taken by the gallery member.
You do not need to fill every available co-member slot. Unselected slots act as wildcards: selecting 3 Members and Member A returns images with the gallery member, Member A, and any third member.
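The slot rules can be expressed as a short filter. This is a sketch under assumptions, not the gallery's actual implementation: the image ids, the member names, and the filter_images function are hypothetical, and each image is modeled as a set of member names that includes the gallery member "G".

```python
def filter_images(images, gallery_member, count, co_members=()):
    """Return ids of images with exactly `count` members that include the
    gallery member and every selected co-member. Unfilled slots are
    wildcards, so any other members may occupy them."""
    co_members = set(co_members)
    # One slot always belongs to the gallery member.
    if len(co_members) > count - 1:
        raise ValueError("at most count - 1 co-members can be selected")
    required = co_members | {gallery_member}
    return [
        image_id
        for image_id, members in images.items()
        if len(members) == count and required <= members
    ]

# Hypothetical gallery data: image id -> set of members in that image.
images = {
    "img_001": {"G", "A", "B"},
    "img_002": {"G", "A", "C"},
    "img_003": {"G", "B", "C"},
    "img_004": {"G", "A"},
}

# 3 Members + Member A: the third slot is a wildcard.
print(filter_images(images, "G", 3, ["A"]))
# 2 Members + Member A: exactly the gallery member and A.
print(filter_images(images, "G", 2, ["A"]))
```

Selecting two co-members with a count of 2 raises ValueError in this sketch, mirroring the count - 1 limit.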