Patent application number | Description | Published |
20090063500 | EXTRACTING DATA CONTENT ITEMS USING TEMPLATE MATCHING - Systems and methods for extracting data content items from a web page are provided. A template is created by labeling data content items of interest associated with a web page and generating a template Document Object Model (DOM) tree based on the labeled web page. DOM trees are also generated for additional web pages that contain data content items for which extraction may be desired. These DOM trees are compared to the template DOM tree to determine the alignment between them. The aligned data content items may then be extracted from the additional web pages and indexed, as desired. Labeling the data content items of interest prior to generating a template DOM tree allows for the desired data content items to be specified and more accurately extracted from related and/or similarly structured web pages. | 03-05-2009 |
20110103699 | IMAGE METADATA PROPAGATION - Methods and computer-readable media for propagating content category information to images stored in a database are described. A seed image that is associated with a known content category is received. A content-based image retrieval is conducted using the seed image as a search query image. A number of search result images are identified. The content category is propagated to the search result images. Metadata associated with the search result images is aggregated and analyzed to identify domains that should also be associated with the content category. Additional images that are associated with the domain are identified and the content category propagated thereto. The process is iterated using the additional images as search query images for the content-based image retrieval. | 05-05-2011 |
20110106782 | CONTENT-BASED IMAGE SEARCH - Image descriptor identifiers are used for content-based search. A plurality of descriptors is determined for an image. The descriptors represent the content of the image at respective interest points identified in the image. The descriptors are mapped to respective descriptor identifiers. The image can thus be represented as a set of descriptor identifiers. A search is performed on an index using the descriptor identifiers as search elements. A method for efficiently searching the inverted index is also provided. Candidate images that include at least a predetermined number of descriptor identifiers that match those of the image are identified. The candidate images are ranked and at least a portion of them is presented as content-based search results. | 05-05-2011 |
20110106798 | Search Result Enhancement Through Image Duplicate Detection - Systems, methods, and computer media for enhancing user search query results are provided. Upon receiving a user search query, relevant images are identified. Duplicate image information for the relevant images is accessed in an index. The index includes information extracted from individual images or duplicates and information aggregated according to groups comprised of images and duplicates of the images. The images identified as relevant to the user query are ranked based at least in part on the information accessed in the index. | 05-05-2011 |
20110295775 | ASSOCIATING MEDIA WITH METADATA OF NEAR-DUPLICATES - Techniques for identifying near-duplicates of a media object and associating metadata of the near-duplicates with the media object are described herein. One or more devices implementing the techniques are configured to identify the near-duplicates based at least on similarity attributes included in the media object. Metadata is then extracted from the near-duplicates and is associated with the media object as descriptors of the media object to enable discovery of the media object based on the descriptors. | 12-01-2011 |
20110299743 | SCALABLE FACE IMAGE RETRIEVAL - A system for identifying individuals in digital images and for providing matching digital images is provided. A set of images that include faces of known individuals is received. Faces are detected in the images and facial components are identified in each face. Visual words corresponding to the facial components are generated, stored, and associated with identifiers of the individuals. At a later time, a user may provide an image that includes the face of one of the known individuals. Visual words are determined from the face of the individual in the provided image and matched against the stored visual words. Images associated with matching visual words are ranked and presented to the user. | 12-08-2011 |
20120078936 | VISUAL-CUE REFINEMENT OF USER QUERY RESULTS - Methods and computer-storage media having computer-executable instructions embodied thereon that facilitate refining query results using visual cues are provided. Query results are determined in response to an indication of a user query. One or more groups of query results are generated from the query results based on categories of query results that share similar features. Visual cues are associated with each of the query result groups. Visual cues, in association with query result groups, are presented to a user. Query results associated with a selected visual cue may be presented to a user. A refined user query may be generated based on a selected visual cue. | 03-29-2012 |
20120117051 | MULTI-MODAL APPROACH TO SEARCH QUERY INPUT - Search queries containing multiple modes of query input are used to identify responsive results. The search queries can be composed of combinations of keyword or text input, image input, video input, audio input, or other modes of input. The multiple modes of query input can be present in an initial search request, or an initial request containing a single type of query input can be supplemented with a second type of input. In addition to providing responsive results, in some embodiments additional query refinements or suggestions can be made based on the content of the query or the initially responsive results. | 05-10-2012 |
20130173604 | KNOWLEDGE-BASED ENTITY DETECTION AND DISAMBIGUATION - An entity-based search system is described herein that detects and recognizes entities in Internet-based content and uses this recognition to organize search results. The system associates one or more entity identifiers with a web page and stores this information as metadata of the page in a search engine index. This metadata will enable entity-based queries as well as rich data presentations in a search engine result page (SERP), including grouping results by entities, filtering results by one or more particular entities, or re-ranking search results based on user preference of entities. Thus, the entity-based search system allows users to identify a particular entity the user is interested in finding, and to receive search results directly related to that entity. | 07-04-2013 |
20130246435 | FRAMEWORK FOR DOCUMENT KNOWLEDGE EXTRACTION - A knowledge extraction framework may iteratively enrich an ontology that is used to classify structured knowledge obtained from web pages based on structured knowledge previously acquired from other web pages. The framework may enable a user to define the ontology for extracting structured knowledge from a plurality of web pages. The framework applies the ontology using a supervised extraction algorithm to extract seed information from a set of web pages. The framework further applies an unsupervised extraction algorithm to extract the structured knowledge from an additional set of web pages. The framework subsequently maps the structured knowledge to the ontology based on the seed information to enrich the ontology. | 09-19-2013 |
20130332438 | Disambiguating Intents Within Search Engine Result Pages - Systems, computer-readable media, and methods for generating search engine results pages are provided. A user provides a search engine with one or more query terms. The query terms may be associated with an intent, e.g., a product entity, store entity, or person entity. The search engine classifies the query and identifies search results that correspond to the query. The search results are grouped based on intents associated with the query. A graphical user interface displays the grouped search results and allows the user to modify the groupings. The graphical user interface is also updated with entities or entity attributes corresponding to the intents used to group the search results. | 12-12-2013 |
20140358887 | APPLICATION CONTENT SEARCH MANAGEMENT - A search service accesses application content accessible via one or more enumerated applications. The search service ranks the accessed application content in combination with non-application content to produce a combined ranking. Responsive to a search query, the search service provides one or more search results based on the combined ranking. | 12-04-2014 |
20140372419 | TILE-CENTRIC USER INTERFACE FOR QUERY-BASED REPRESENTATIVE CONTENT OF SEARCH RESULT DOCUMENTS - Architecture that represents search results as tiles in a tile-based user interface. The tiles can be images or icons selected to represent a search result or multiple search results. In a broader implementation the tiles can be related to entities as derived from the search results. A web document is received, and feature processing is performed on it to obtain features for each (page, image) tuple. These features are input, along with other features, to representative image classification, which outputs image classification data. Representative image classification calculates representative scores for every (page, image) pair and (page, image set) pair, and the images are ranked for presentation and viewing in the tile-based user interface. User interaction can be via a touch-based user interface to return and view search results related to a selected tile. | 12-18-2014 |
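
As an illustration of the retrieval idea summarized in the CONTENT-BASED IMAGE SEARCH entry (20110106782), the following is a minimal sketch and not the method claimed in the application. It assumes each image has already been reduced to a set of integer descriptor identifiers, builds an inverted index from identifier to image, and keeps candidates that share at least a predetermined number of identifiers with the query. The function names, the `min_matches` parameter, and ranking by match count are all assumptions made for illustration; the abstract does not specify how candidates are ranked.

```python
from collections import Counter, defaultdict


def build_inverted_index(images):
    """Map each descriptor identifier to the set of images containing it.

    `images` is a dict of image id -> iterable of descriptor identifiers
    (hypothetical input format, assumed for this sketch).
    """
    index = defaultdict(set)
    for image_id, descriptor_ids in images.items():
        for descriptor_id in set(descriptor_ids):
            index[descriptor_id].add(image_id)
    return index


def search(index, query_descriptor_ids, min_matches=2):
    """Return (image id, match count) pairs for candidate images.

    A candidate must share at least `min_matches` descriptor identifiers
    with the query. Ranking by match count (ties broken by image id) is an
    assumption of this sketch, not taken from the abstract.
    """
    counts = Counter()
    for descriptor_id in set(query_descriptor_ids):
        for image_id in index.get(descriptor_id, ()):
            counts[image_id] += 1
    candidates = [(img, n) for img, n in counts.items() if n >= min_matches]
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return candidates
```

For example, with an index built from `{"a": [1, 2, 3], "b": [2, 3, 4], "c": [9]}`, a query of `[2, 3, 4]` yields image `b` (three shared identifiers) ahead of image `a` (two), while image `c` is never visited because none of its identifiers appear in the query.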