Dish-Embed: Food Embedding Benchmark Results

A domain-specialized food embedding model compared against general-purpose alternatives. All models are evaluated at 384 dimensions on identical benchmark data.

Dish-Embed
OpenAI TE3-Large
BAAI BGE-M3
Qwen3-Embedding-0.6B (#1 MTEB Multilingual)
Microsoft E5-Large-v2
BGE-Reranker-v2-M3 (best public reranker)
Benchmark Glossary
Indian Cuisine Matching: Matching "Aloo Gobi" to "Potato Cauliflower Curry", "Dal Makhani" to "Black Lentil Curry" across restaurants.
Cross-Language Matching: Matching "ラーメン" to "Ramen", "خبز نان" to "Naan Bread" across languages and scripts.
Bakery & Dessert Matching: Matching "Pain au Chocolat" to "Chocolate Croissant", "Crème Brûlée" to "Caramelized Custard".
Beverage Matching: Matching "Iced Americano" to "Cold Black Coffee", "Masala Chai" to "Spiced Tea Latte" across naming conventions.
Synonym Recognition: Retrieving "Pad Kra Pao" from a query for "Thai Basil Stir-Fry", or "Gyoza" from "Pot Stickers".
Cuisine Classification: Classifying "Tom Yum Goong" as Thai, "Cacio e Pepe" as Italian from the dish name alone. 19 cuisine categories.
Category Search: Searching "Thai soups" or "grilled appetizers" and ranking relevant menu items.
Typo-Tolerant Search: Returning "Margherita Pizza" when a customer types "margarita piza".
Food Search: General menu search ranking across diverse food queries and item catalogs.
Global Search: Search across multilingual menus spanning 15+ cuisines worldwide.
Portion Size Sensitivity: Ignoring portion labels like "Regular", "Family Pack", "Serves 2", "250ml" when matching the same dish. Generic models treat size text as meaningful content.
Noisy Menu Matching: Matching "***BEST SELLER*** Paneer Tikka - Chef's Special!!" to "Paneer Tikka" on another menu.
Bilingual Menu Matching: Matching "Falafel Wrap فلافل راب" to "Falafel Wrap" on menus that mix scripts.
Embedding Stability: Producing identical embeddings for "Fried Rice", "炒飯", and "フライドライス". 1.0 = perfectly consistent across scripts.
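
The glossary does not say how the Embedding Stability score is computed. One plausible reading, shown here as a minimal sketch, is the mean pairwise cosine similarity between the embeddings of the same dish written in different scripts, which gives 1.0 only when all the vectors point in the same direction. The function names `cosine` and `stability_score` and the three-dimensional toy vectors are illustrative assumptions, not outputs of any model in this benchmark.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def stability_score(vectors):
    """Mean pairwise cosine similarity (assumed definition, not from the source)."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy embeddings for one dish in three scripts.
# Real embeddings in this benchmark would have 384 dimensions.
variants = {
    "Fried Rice": [0.6, 0.8, 0.0],
    "炒飯": [0.6, 0.8, 0.0],
    "フライドライス": [0.8, 0.6, 0.0],
}

score = stability_score(list(variants.values()))
print(f"stability: {score:.4f}")
```

Under this assumed definition, a model that maps all three spellings to the same vector would score 1.0, and any drift between scripts lowers the average.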