Patent application number | Description | Published |
20090116746 | SYSTEMS AND METHODS FOR PARALLEL PROCESSING OF DOCUMENT RECOGNITION AND CLASSIFICATION USING EXTRACTED IMAGE AND TEXT FEATURES - A method of parallel processing jobs received from a plurality of users by a document analysis system that automatically classifies documents to organize each job, automatically separates each job into its constituent electronic documents, and automatically separates each document into subsets of electronic pages. For each page of each subset, the method automatically extracts image features that are indicative of how the document is laid out or textually organized. For each subset, the method automatically compares the extracted features with feature sets associated with each document category to determine a comparison score for the subset. The method then classifies the electronic document as being one of the categories of documents using the comparison score for each of the subsets and organizes the job according to the categories of documents the job contains. | 05-07-2009 |
20110249905 | SYSTEMS AND METHODS FOR AUTOMATICALLY EXTRACTING DATA FROM ELECTRONIC DOCUMENTS INCLUDING TABLES - A method of automatically extracting data from an electronic document including tables is provided. The method includes: automatically identifying rows of the table using gaps in horizontal projections of the plurality of image sections, wherein at least some of the identified rows in close proximity are collected to form table formations; and automatically identifying columns of the table using at least some of the plurality of image sections that are vertically aligned, wherein the identified columns are grown in each of the table formations using gaps in vertical projections of the plurality of image sections until an obstruction is reached. The method further includes automatically identifying labels in the plurality of corresponding image sections to associate the identified labels with at least one of the identified columns and the identified rows; and automatically extracting data from cells of the table formed by the identified rows and columns. | 10-13-2011 |
20110255782 | SYSTEMS AND METHODS FOR AUTOMATICALLY PROCESSING ELECTRONIC DOCUMENTS USING MULTIPLE IMAGE TRANSFORMATION ALGORITHMS - In a document analysis system that receives and processes jobs from a plurality of users, in which each job may contain multiple electronic documents, to extract data from the electronic documents, a method of automatically pre-processing each received electronic document using a plurality of image transformation algorithms to improve subsequent data extraction from said document is provided. The method includes: electronically partitioning each received electronic document page into pieces; automatically processing each piece of the received electronic document page using each of a plurality of image pre-processing algorithms to produce a plurality of image variations of each piece; and analyzing the outputs of subsequent processing and data extraction, on each of the image variations of the pieces to determine which output is best, from the plurality of outputs for each piece. | 10-20-2011 |
20110255784 | SYSTEMS AND METHODS FOR AUTOMATICALLY EXTRACTING DATA FROM ELETRONIC DOCUMENTS USING MULTIPLE CHARACTER RECOGNITION ENGINES - In a document analysis system that receives and processes jobs from a plurality of users, in which each job may contain multiple electronic documents, to extract data from the electronic documents, a method of automatically extracting data from each received electronic document using a plurality of character recognition engines is provided. The method includes: automatically processing each received electronic document page using each of a plurality of recognition engines to extract data; comparing quality of data extracted from each of the recognition engines to assign a confidence score to the extracted data; and selecting extracted data having highest confidence score as the correct extracted data. | 10-20-2011 |
20110255788 | SYSTEMS AND METHODS FOR AUTOMATICALLY EXTRACTING DATA FROM ELECTRONIC DOCUMENTS USING EXTERNAL DATA - In a document analysis system that receives and processes jobs from a plurality of users, in which each job may contain multiple electronic documents, to extract data from the electronic documents, a method of automatically extracting data from each received electronic document at least in part using data external to the electronic document but associated with the job containing the document is provided. The method includes: analyzing each electronic document in a job to automatically extract images and text features; and, if any of the images and text features extracted from the electronic document is not recognized, using data external to said document but associated with said job to identify the unrecognized feature, wherein the external source may be one of at least one other document in the job and a database having known values associated with the job. | 10-20-2011 |
20110255789 | SYSTEMS AND METHODS FOR AUTOMATICALLY EXTRACTING DATA FROM ELECTRONIC DOCUMENTS CONTAINING MULTIPLE LAYOUT FEATURES - A method of automatically extracting data from an electronic document containing a plurality of layout features through progressive refinement is provided. The method includes: analyzing each document to automatically extract images and text features wherein each document includes at least two features that are related to each other, and wherein said analyzing compares extracted features with a first search space of candidate features to try and recognize the extracted features; if one of the at least two related features is not recognized and at least one feature is recognized, selecting a second search space of candidate features in response thereto and in response to predefined rules about the relationship between the two features; and comparing the unrecognized feature with said selected second search space. | 10-20-2011 |
20110255790 | SYSTEMS AND METHODS FOR AUTOMATICALLY GROUPING ELECTRONIC DOCUMENT PAGES - A method of grouping electronic document pages of a job that belong together is provided. The method includes: automatically analyzing images and text features extracted from each received electronic document page to associate the electronic document page with a corresponding document category; automatically identifying features extracted from the electronic document page that potentially indicate to which document group the electronic document page belongs; comparing the identified features with a set of group identifying features associated with corresponding document group, in which the set of group identifying features includes at least a set of page numbers and account numbers; and, if the identified features are found to include a set of a page number and an account number belonging to the set of group identifying features associated with the corresponding document group, grouping the electronic document page into the corresponding document group. | 10-20-2011 |
20110255794 | SYSTEMS AND METHODS FOR AUTOMATICALLY EXTRACTING DATA BY NARROWING DATA SEARCH SCOPE USING CONTOUR MATCHING - A method of extracting data by narrowing a scope of data search using contour matching of select elements in a document is provided. The method includes: analyzing each document to automatically extract images and text features wherein said analyzing compares extracted features with a first search space of candidate features to try and recognize the extracted features; automatically processing each unrecognized feature using a contour recognition engine to generate a contour of the unrecognized feature; automatically selecting a second search space of candidate features through contour matching using the contour of the unrecognized feature, wherein the second search space of candidate features is narrower than the first search space of candidate features; and comparing the unrecognized feature with said second search space to identify the previously unrecognized feature. | 10-20-2011 |
20110258150 | SYSTEMS AND METHODS FOR TRAINING DOCUMENT ANALYSIS SYSTEM FOR AUTOMATICALLY EXTRACTING DATA FROM DOCUMENTS - A method of training a document analysis system to extract data from documents is provided. The method includes: automatically analyzing images and text features extracted from a document to associate the document with a corresponding document category; comparing the extracted text features with a set of text features associated with corresponding category of the document, in which the set of text features includes a set of characters, words, and phrases; if the extracted features are found to consist of the characters, words, and phrases belonging to the set of text features associated with the corresponding document category, storing the extracted text features as the data contained in the corresponding document; and, if the extracted text features are found to include at least one text feature that does not belong to the set of text features associated with the corresponding document category, submitting the unrecognized text features to a training phase. | 10-20-2011 |
20110258170 | Systems and methods for automatically correcting data extracted from electronic documents using known constraints for semantics of extracted data elements - In a document analysis system that receives and processes jobs from a plurality of users, in which each job may contain multiple electronic documents, to extract data from the electronic documents, a method of automatically correcting the extracted data using known constraints amongst semantics of extracted data elements is provided. The method includes: analyzing each electronic document in a job to automatically extract data; automatically analyzing the extracted data to identify incorrectly extracted data elements using rules defining constraints amongst semantics of extracted data elements; and automatically attempting to correct the incorrectly extracted data elements using the rules. | 10-20-2011 |
20110258182 | SYSTEMS AND METHODS FOR AUTOMATICALLY EXTRACTING DATA FROM ELECTRONIC DOCUMENT PAGE INCLUDING MULTIPLE COPIES OF A FORM - In a document analysis system that receives and processes jobs from a plurality of users, in which each job may contain multiple electronic documents, to extract data from the electronic documents, a method of extracting data from a received electronic document page that includes multiple copies of a form is provided. The method includes: automatically processing a received electronic document page that includes multiple copies of a form to group the multiple copies into a corresponding number of records; automatically extracting data from each of the multiple copies of the form and saving the extracted data into the corresponding record; automatically comparing the extracted data in the records to determine which copy of the extracted data to select; if all extracted data instances are identical, assigning a high confidence score to the extracted data; and, if the extracted data instances are not all identical, flagging the extracted data for further processing. | 10-20-2011 |
20110258195 | Systems and methods for automatically reducing data search space and improving data extraction accuracy using known constraints in a layout of extracted data elements - A method of automatically narrowing data search space and improving accuracy of data extraction using known constraints in a layout of extracted data elements for classified documents is provided. The method includes: analyzing each document to classify it within a document category, each category having a corresponding set of expected layouts; analyzing each electronic document to automatically extract images and text features; automatically constructing a data structure including a layout of the extracted features and layout relationships amongst the extracted features, wherein each of the extracted features in the layout maintains a reference to neighboring features and wherein closely related features are merged to form a combined feature; automatically narrowing data search space by detecting and removing parts of the layout that are not associated with any data elements using the data structure; and automatically detecting data using the extracted feature layout and the layout relationships amongst the extracted features. | 10-20-2011 |
20120271676 | SYSTEM AND METHOD FOR AN INTELLIGENT PERSONAL TIMELINE ASSISTANT - The present disclosure provides methods and systems for assisting a user in managing a timeline of appointments in which at least one appointment is associated with an event, and the appointment has associated appointment information that describes aspects of the appointment and/or event, including receiving free-form scheduling information from an electronic notification, inferring that at least a portion of the free-form scheduling information relates to an existing appointment and/or associated event, the existing appointment having presently associated appointment information that describes aspects of the appointment and/or associated event, selecting an appointment for modification, and modifying the selected appointment based on (a) the portion of the free-form scheduling information inferred to relate to the existing appointment and/or associated event, and at least one of the appointment information presently associated with the existing appointment, and a user preference signature representing prior actions performed by and/or content preferences learned about the user. | 10-25-2012 |
20140025705 | Method of and System for Inferring User Intent in Search Input in a Conversational Interaction System - A method of inferring user intent in search input in a conversational interaction system is disclosed. A method of inferring user intent in a search input includes providing a user preference signature that describes preferences of the user, receiving search input from the user intended by the user to identify at least one desired item, and determining that a portion of the search input contains an ambiguous identifier. The ambiguous identifier is intended by the user to identify, at least in part, a desired item. The method further includes inferring a meaning for the ambiguous identifier based on matching portions of the search input to the preferences of the user described by the user preference signature and selecting items from a set of content items based on comparing the search input and the inferred meaning of the ambiguous identifier with metadata associated with the content items. | 01-23-2014 |
20140025706 | METHOD OF AND SYSTEM FOR INFERRING USER INTENT IN SEARCH INPUT IN A CONVERSATIONAL INTERACTION SYSTEM - A method of inferring user intent in search input in a conversational interaction system is disclosed. A method of inferring user intent in a search input includes providing a user preference signature that describes preferences of the user, receiving search input from the user intended by the user to identify at least one desired item, and determining that a portion of the search input contains an ambiguous identifier. The ambiguous identifier is intended by the user to identify, at least in part, a desired item. The method further includes inferring a meaning for the ambiguous identifier based on matching portions of the search input to the preferences of the user described by the user preference signature and selecting items from a set of content items based on comparing the search input and the inferred meaning of the ambiguous identifier with metadata associated with the content items. | 01-23-2014 |
20140058724 | Method of and System for Using Conversation State Information in a Conversational Interaction System - A method of using conversation state information in a conversational interaction system is disclosed. A method of inferring a change of a conversation session during continuous user interaction with an interactive content providing system includes receiving input from the user including linguistic elements intended by the user to identify an item, associating a linguistic element of the input with a first conversation session, and providing a response based on the input. The method also includes receiving additional input from the user and inferring whether or not the additional input from the user is related to the linguistic element associated with the conversation session. If related, the method provides a response based on the additional input and the linguistic element associated with the first conversation session. Otherwise, the method provides a response based on the second input without regard for the linguistic element associated with the first conversation session. | 02-27-2014 |
20140163965 | Method of and System for Using Conversation State Information in a Conversational Interaction System - A method of using conversation state information in a conversational interaction system is disclosed. A method of inferring a change of a conversation session during continuous user interaction with an interactive content providing system includes receiving input from the user including linguistic elements intended by the user to identify an item, associating a linguistic element of the input with a first conversation session, and providing a response based on the input. The method also includes receiving additional input from the user and inferring whether or not the additional input from the user is related to the linguistic element associated with the conversation session. If related, the method provides a response based on the additional input and the linguistic element associated with the first conversation session. Otherwise, the method provides a response based on the second input without regard for the linguistic element associated with the first conversation session. | 06-12-2014 |
20140337370 | METHOD OF AND SYSTEM FOR REAL TIME FEEDBACK IN AN INCREMENTAL SPEECH INPUT INTERFACE - The present disclosure provides systems and methods for selecting and presenting content items based on user input. The method includes receiving first input intended to identify a desired content item among content items associated with metadata, determining that an input portion has an importance measure exceeding a threshold, and providing feedback identifying the input portion. The method further includes receiving second input, and inferring user intent to alter or supplement the first input with the second input. The method further includes, upon inferring intent to alter the first input, determining an alternative query by modifying the first input based on the second input, and, upon inferring intent to supplement the first input, determining an alternative query by combining the first input and the second input. The method further includes selecting and presenting a subset of content items based on comparing the alternative query and metadata associated with the subset. | 11-13-2014 |
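
Application 20110249905 in the listing above describes identifying table rows "using gaps in horizontal projections" of image sections. The sketch below is one possible reading of that single step, not the applicants' implementation. It assumes a binarized page given as nested lists (1 for ink, 0 for background); the function name `find_row_bands` and the `min_gap` parameter are hypothetical and do not come from the application.

```python
from typing import List, Tuple


def find_row_bands(image: List[List[int]], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) pixel-row spans that contain ink.

    A band ends once at least `min_gap` consecutive blank pixel rows follow it.
    Both `min_gap` and the nested-list image format are illustrative assumptions.
    """
    # Horizontal projection: amount of ink in each pixel row.
    profile = [sum(row) for row in image]

    bands: List[Tuple[int, int]] = []
    start = None  # top of the band currently being tracked, if any
    gap = 0       # number of consecutive blank rows seen inside that band

    for y, ink in enumerate(profile):
        if ink > 0:
            if start is None:
                start = y
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The gap is wide enough: close the band at its last inked row.
                bands.append((start, y - gap))
                start = None
                gap = 0

    if start is not None:
        # The image ended while a band was still open.
        bands.append((start, len(profile) - 1 - gap))
    return bands


page = [
    [0, 0],
    [1, 1],
    [1, 0],
    [0, 0],
    [0, 1],
    [0, 0],
]
print(find_row_bands(page))             # two bands separated by a one-row gap
print(find_row_bands(page, min_gap=2))  # the one-row gap is ignored, giving one band
```

Raising `min_gap` merges text lines that sit close together, which loosely mirrors the abstract's idea of collecting nearby rows into table formations. Column growing and label association are not covered by this sketch.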