ABBYY SOFTWARE LTD Patent applications |
Patent application number | Title | Published |
20140141836 | Entering Information Through an OCR-Enabled Viewfinder - An improved method for entering text or objects into fields is provided. Instead of a keyboard, a viewfinder provides text segmenting, text selecting and text recognizing (optical character recognition—OCR) functionalities. Text at a marker (e.g., a cursor or crosshairs) associated with the viewfinder is recognized and insertion of the recognized text is performed. The current frame is generally not captured by a user. As the user moves the camera to position a new word at the marker, the view finder is updated to provide results of recognition associated with the new word. A user is able to identify an area of interest, select text or other object of interest, and insert the same into one or more fields. The viewfinder may operate in conjunction with a camera of the electronic device on which the viewfinder is operating. Other mechanisms and variations are described. | 05-22-2014 |
20140129483 | System and Method of Determining Access to a Cloud Service - The present invention provides a method of predicting usage of and determining access to a cloud service according to an embodiment of the invention. The method includes the step of monitoring the usage of a service by end users of developers for a predetermined test period. Responsive to statistics associated with the monitoring during the predetermined test period, the method includes determining a future usage payment cost for the developers. Responsive to verification of a unique ID, the method includes performing the service by a service provider. | 05-08-2014 |
20140122479 | Automated file name generation - Described herein are methods for determining a type and semi-unique features of electronic files. The methods generally include generating at least one document hypothesis corresponding to the type of the document. For each document hypothesis, the document type is verified. A document type hypothesis is selected. A document name is formed based on the selected document type hypothesis and one or more features of the document. Such steps generally include automatically or programmatically naming of electronic files. A unique or semi-unique name is given, one that reproduces some of the document's contents, attributes and/or characteristics. Each document is provided with a name that can be easily understood and that is related to the content of the document. | 05-01-2014 |
20140118796 | Using a scanning implemented software for time economy without rescanning (S.I.S.T.E.R.) - Methods and devices are described for detecting boundaries of documents on flatbed and multi-function scanners on a first pass of a carriage assembly, and then performing a high resolution scan on a second pass. High resolution images of documents can then be obtained with little or no interaction normally necessary to identify areas of interest on the scanner bed. Patterns on the scanner cover or lid facilitate not only edge determination, but orientation of text and other objects, and straightening of images in preparation for OCR and related functions. Electronic images and files derived from paper documents may be automatically cropped, deskewed, subjected to OCR, and named consistent with content or other information derived from them. | 05-01-2014 |
20140111638 | Capturing Images after Sufficient Stabilization of a Device - Described are techniques that guarantee the best opportunity for a camera to capture an image with an acceptable level of quality. Too frequently, images captured with mobile devices such as smartphones and tablet computers are not of sufficient quality for optical character recognition, for example. Image capture is allowed only after successfully completing a check of sufficient stabilization and focusing of the camera. A variety of sensors may be used to check stability including gyrometers, proximity sensors, accelerometers, and light sensors. | 04-24-2014 |
20140081620 | Swiping Action for Displaying a Translation of a Textual Image - Disclosed is a method that involves acquiring an image with text and displaying all or a portion of the image on an electronic device. In response to detecting a swiping action or gesture, a result of translation is displayed on a screen of the device. A first screen or display becomes a second one. Original text in a first language or source language may be easily and quickly compared to translated text shown on a second screen through a swiping gesture. Electronic dictionaries and machine translation may be used. These services may be independently stored and operated from different locations including on the device performing the translation, on a server or across a network (LAN, WAN, etc.). Optional manual correction of the translated text is also possible. | 03-20-2014 |
20140081619 | Photography Recognition Translation - Methods are described for efficient and substantially instant recognition and translation of text in photographs. A user is able to select an area of interest for subsequent processing. Optical character recognition (OCR) may be performed on a wider area than that selected for determining the subject domain of the text. Translation to one or more target languages is performed. Manual corrections may be made at various stages of processing. Variations of translation are presented and made available for substitution of a word or expression in the target language. Translated text is made available for further uses or for immediate access. | 03-20-2014 |
20130223743 | MODEL-BASED METHODS OF DOCUMENT LOGICAL STRUCTURE RECOGNITION IN OCR SYSTEMS - The invention relates to methods for determining a logical structure of a document. The system stores a collection of models, each of which describes one or more possible logical structures. At least one document hypothesis is generated for the whole document. For each document hypothesis, the system verifies the document hypothesis on each page, for example, by generating at least one block hypothesis for each block in the document based on the document hypothesis, selecting a best block hypothesis for each block, selecting the model that corresponds to a best document hypothesis, that is, the document hypothesis that has a best degree of correspondence with the selected best block hypotheses for the document, and forming a representation of the document based on the best document hypothesis described. | 08-29-2013 |
20130198615 | Creating Flexible Structure Descriptions - In one embodiment, the invention provides a method, comprising detecting data fields on a scanned document image; generating a flexible document description based on the detected data fields, including creating a set of search elements for each data field, each search element having associated search criteria; and training or modifying the flexible document description using, for example, a search algorithm to detect the data fields on additional training images based on the set of search elements. | 08-01-2013 |
20130191109 | Translating Sentences Between Languages - A method and computer system for translating sentences between languages from an intermediate language-independent semantic representation is provided. On the basis of comprehensive understanding about languages and semantics, exhaustive linguistic descriptions are used to analyze sentences, to build syntactic structures and language independent semantic structures and representations, and to synthesize one or more sentences in a natural or artificial language. A computer system is also provided to analyze and synthesize various linguistic structures and to perform translation of a wide spectrum of various sentence types. As a result, a generalized data structure, such as a semantic structure, is generated from a sentence of an input language and can be transformed into a natural sentence expressing its meaning correctly in an output language. The method and computer system can be applied in automated abstracting, machine translation, natural language processing, control systems, Internet information retrieval, etc. | 07-25-2013 |
20130191108 | Translation of a Selected Text Fragment of a Screen - Disclosed is a method for translating text fragments displayed on a screen from an input language into an output language and displaying the result. Translation may use electronic dictionaries, machine translation, natural language processing, control systems, information searches (e.g., search engine via an Internet protocol), semantic searches, computer-aided learning, and expert systems. For a word combination, appropriate local or network accessible dictionaries are consulted. The disclosed method provides a translation in grammatical agreement in accordance with grammatical rules of the output language in consideration of the context of the text. | 07-25-2013 |
20130054612 | Universal Document Similarity - Described herein are methods for finding substantially similar/different sources (files and documents), and estimating similarity or difference between given sources. Similarity and difference may be found across a variety of formats. Sources may be in one or more languages such that similarity and difference may be found across any number and types of languages. A variety of characteristics may be used to arrive at an overall measure of similarity or difference including determining or identifying syntactic roles, semantic roles and semantic classes in reference to sources. | 02-28-2013 |
20130054595 | Automated File Name Generation - Described herein are methods for determining a type and unique features of a document. The methods generally include generating at least one document hypothesis corresponding to the type of the document. For each document hypothesis, the document type is verified. A best type hypothesis is selected. A document name is formed based on the best type hypothesis and one or more unique features of the document. Such steps are generally included in automatically or programmatically naming of documents. A unique or semi-unique name is given, one that reproduces some of the document's contents, attributes and/or characteristics. Each document is provided with a name that can be easily understood and that is related to the content of the document. | 02-28-2013 |
20130044943 | Classifier Combination for Optical Character Recognition Systems - Techniques and methods are disclosed herein for combining and weighting of values from and associated with classifiers. Classifiers are used to recognize characters as part of an optical character recognition (OCR) system. Various methods of normalization facilitate combining of results of classifiers. For example, weight values may be entered into a weight table having two columns, one that includes weights from comparing patterns with images of correct characters, the other column includes weights from comparing patterns with images of incorrect characters. | 02-21-2013 |
20130024186 | Deep Model Statistics Method for Machine Translation - In one embodiment, the invention provides a method for machine translation of a source document in an input language to a target document in an output language, comprising generating translation options corresponding to at least portions of each sentence in the input language; and selecting a translation option for the sentence based on statistics associated with the translation options. | 01-24-2013 |
20130024180 | Deep Model Statistics Method for Machine Translation - In one embodiment, the invention provides a method for machine translation of a source document in an input language to a target document in an output language, comprising generating translation options corresponding to at least portions of each sentence in the input language; and selecting a translation option for the sentence based on statistics associated with the translation options. | 01-24-2013 |
20120321216 | Straightening Out Distorted Perspective on Images - Methods for correcting distortions in an image including text, or an image of a page that includes text, are disclosed. The methods include identifying reliable and substantially straight lines from elements in the image. Vanishing points are determined from the lines. Parameters associated with a rectangle are determined. A coordinate conversion is performed. | 12-20-2012 |
20120294524 | Enhanced Multilayer Compression of Image Files Using OCR Systems - Described herein is a method for segmenting a document image into a picture component, a special or significant picture component, and a non-picture component. The non-picture component is compressed and may include character blocks. Separately, picture components are compressed with a lossy algorithm or with a preliminary defined compression ratio. Subsequently, the compressed picture component, significant picture component and the compressed non-picture component are saved in memory or in a storage location so that the document image may be recomposed based on the compressed picture component or compressed significant picture component and the compressed non-picture component. | 11-22-2012 |
20120271627 | CROSS-LANGUAGE TEXT CLASSIFICATION - Methods are described for performing classification (categorization) of text documents written in various languages. Language-independent semantic structures are constructed before classifying documents. These structures reflect lexical, morphological, syntactic, and semantic properties of documents. The methods suggested are able to perform cross-language text classification which is based on document properties reflecting their meaning. The methods are applicable to genre classification, topic detection, news analysis, authorship analysis, etc. | 10-25-2012 |
20120010872 | Method and System for Semantic Searching - In one embodiment, there is provided a computer-implemented method and system for implementing the method. The method comprises: preliminarily analyzing at least one corpus of natural language text comprising for each sentence of each natural language text of the corpus, performing syntactic analysis using linguistic descriptions to generate at least one syntactic structure for the sentence; building a semantic structure for the sentence; associating each generated syntactic and semantic structure with the sentence; and saving each generated syntactic and semantic structure; for each corpus of natural language text that was preliminarily analyzed, performing an indexing operation to index lexical meanings and values of linguistic parameters of each syntactic structure and each semantic structure associated with sentences in the corpus; and searching in at least one preliminarily analyzed corpora for sentences comprising searched values for the linguistic parameters. | 01-12-2012 |
20110274345 | ACCURACY OF RECOGNITION BY MEANS OF A COMBINATION OF CLASSIFIERS - In one embodiment, there is provided a method for an Optical Character Recognition (OCR) system. The method comprises: recognizing an input character based on a plurality of classifiers, wherein each classifier generates an output by comparing the input character with a plurality of trained patterns; grouping the plurality of classifiers based on a classifier grouping criterion; and combining the output of each of the plurality of classifiers based on the grouping. | 11-10-2011 |
20110091109 | METHOD OF PRE-ANALYSIS OF A MACHINE-READABLE FORM IMAGE - In one embodiment, the invention provides a method for a machine to perform machine-readable form pre-recognition analysis. The method comprises preliminarily assigning at least one graphic image in a form for identification of form type, preliminarily creating at least one model of the said graphic image for identification of the form type, parsing a form image into regions, determining an image form type for the form image, comprising: (a) detecting on the form image at least one of said graphic images for identification of the form type, (b) performing a primary identification of the form image type based on a comparison of the detected graphic image with the said model, and (c) performing a profound analysis using supplementary data if said primary identification results in multiple possibilities for the form image type. | 04-21-2011 |
20110014944 | TEXT PROCESSING METHOD FOR A DIGITAL CAMERA - Embodiments disclose a technique to recognize text in a current frame of an image in a view finder of a digital camera. In accordance with the technique, text at a marker (e.g. a cursor or cross hairs) associated with the view finder is recognized and a lookup is performed based on the recognized text. Advantageously, the lookup yields useful information e.g. a translation of a recognized word that is displayed in the viewfinder adjacent to the text. The current frame is not captured by a user. As the user moves the camera to position a new word at the marker, the view finder is updated to provide lookup results associated with the new word. Lookups may be performed of a bilingual dictionary, a monolingual dictionary, a reference book, a travel guide, etc. Embodiments of the invention also cover digital cameras or mobile devices that implement the aforementioned technique. | 01-20-2011 |
20110013847 | IDENTIFYING PICTURE AREAS BASED ON GRADIENT IMAGE ANALYSIS - In one embodiment, a method for identifying areas in a document image is provided. The method comprises generating binarized and gradient images based on the document image; and performing a classification operation to classify areas in the document image into one of a noise area and a picture area based on attributes computed on the binarized and gradient images. | 01-20-2011 |
20110013806 | Methods of object search and recognition - Embodiments of the invention disclose techniques for processing of machine-readable forms of unfixed or flexible format. An auxiliary brief description may be optionally specified to determine the spatial orientation of the image. A method of searching for elements of a document comprises the following main operations in addition to the operations of preliminary image processing: selecting the varieties of structural description from several available variants, determining the orientation of the image, selecting the text objects, where the text must be recognized, and determining the minimal required volume of recognition, recognizing the text objects, searching for elements of the form. Searching for elements of the form comprises the following actions: selecting a searched element in the structural description, gaining the algorithm of search constraints from the structural description, searching for the element, testing the obtained variants. | 01-20-2011 |
20100254606 | METHOD OF RECOGNIZING TEXT INFORMATION FROM A VECTOR/RASTER IMAGE - A method is claimed for processing a vector-raster image file which contains a text image. The method comprises the steps of: fragmenting the image to obtain regions containing non-separable, logically connected fragments of text of the maximum possible size; processing text, vector, and raster objects; discarding excessive information; analyzing each object with the help of all available information. The step of processing text objects includes the steps of: dividing into separate characters and character groups according to supposed locations of blank spaces or other non-indicated symbols, and analyzing and assembling character groups into words and verifying and correcting character encoding based on recognition of assembled words as raster objects. The step of processing vector objects includes the step of identifying separators, background, and substrates of blocks. The step of processing raster objects includes the steps of: analyzing non-text objects in order to detect text images within them, and/or detecting vector objects other than separators. | 10-07-2010 |
20090252439 | METHOD AND SYSTEM FOR STRAIGHTENING OUT DISTORTED TEXT-LINES ON IMAGES - In one embodiment, a method for correcting distortions in a scanned image of a page is disclosed. The method comprises identifying at least one set of collinear elements in the scanned image; and generating a corrected image based on the scanned image including for at least some of the collinear elements in each set applying a spatial location correction to position all collinear elements in the set on a common horizontal rectilinear base line in the corrected image. | 10-08-2009 |
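
Illustrative note on application 20140111638 (Capturing Images after Sufficient Stabilization of a Device): the abstract says capture is allowed only after a check of stabilization and focusing, possibly using accelerometers among other sensors. The Python sketch below is one possible reading of that gating idea, not the method claimed in the application. The function names, the use of standard deviation as the stability measure, and the threshold value are all assumptions made for illustration; the abstract specifies none of them.

```python
from statistics import pstdev
from typing import Sequence

# Hypothetical threshold: the abstract gives no numeric values.
MAX_ACCEL_STDDEV = 0.05  # assumed units: m/s^2


def is_stable(accel_magnitudes: Sequence[float],
              max_stddev: float = MAX_ACCEL_STDDEV) -> bool:
    """Return True when recent accelerometer magnitudes vary little.

    Assumption: low spread in recent readings means the device is steady.
    """
    if len(accel_magnitudes) < 2:
        return False  # too few samples to judge stability
    return pstdev(accel_magnitudes) <= max_stddev


def may_capture(accel_magnitudes: Sequence[float], focus_locked: bool) -> bool:
    """Allow capture only when both the focus and stabilization checks pass."""
    return focus_locked and is_stable(accel_magnitudes)
```

A caller would feed in a short window of recent sensor readings together with the camera's focus state, and enable the shutter only while `may_capture` returns True.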