Patent application number | Description | Published |
20120092359 | Extraction Of A Color Palette Model From An Image Of A Document - A system and method are provided for determining a color palette model from an image of a document. Pixel values of the image of the document are clustered to provide image clusters. Color layers of the image are determined, each color layer corresponding to an image cluster. Aspects of the color palette model can be determined using the color layers. Aspects of the color palette model include a foreground-background color pair for a content block in the document and a background-area color of the document. | 04-19-2012 |
20120120436 | REMOTE PRINTING - A remote printing method includes extracting content of a device view caused to be displayed by a first device. The extracted content is communicated to a second device remote from the first device so that the second device can format the extracted content for printing. | 05-17-2012 |
20120212772 | METHOD AND SYSTEM FOR PROVIDING PRINT CONTENT TO A CLIENT - A request for print content is received at a network server system. The request includes variable user input. Webpage content is obtained based at least in part on the variable user input. A subset of the webpage content is identified as print content. A print-ready layout of the print content is formed and the print content in the print-ready layout is provided, via network connection, to a client in response to the request. | 08-23-2012 |
20120246552 | PROVIDING A PARTICULAR TYPE OF UNIFORM RESOURCE LOCATOR - Examples disclosed herein are example systems and methods to provide a particular type of uniform resource locator. In one example, a processor identifies webpage source code associated with a list of text associated with the type of uniform resource locator. The processor may identify a uniform resource locator within the identified webpage source code and provide the uniform resource locator. | 09-27-2012 |
20120303636 | System and Method for Web Content Extraction - A method and system for extracting Web content is disclosed. In one embodiment, Web content in a Webpage is extracted by identifying paragraphs in the Web content based on line-break node determination. A range of text-body associated with the identified paragraphs is then identified using a maximum scoring subsequence. Further, the identified text-body is refined using a heuristic rule of substantially horizontal alignment. Furthermore, one or more titles and one or more images associated with the Web content are extracted. Moreover, the Web content including the identified paragraphs, the one or more titles and the one or more images are outputted. | 11-29-2012 |
20130061132 | SYSTEM AND METHOD FOR WEB PAGE SEGMENTATION USING ADAPTIVE THRESHOLD COMPUTATION - A system and method for an adaptive threshold Web Page segmenting is disclosed. In one embodiment, a method performed by a physical computing system having one or more processors for segmenting a Web page including a plurality of nodes includes parsing content in the Web page into the plurality of nodes using the physical computing system, obtaining feature values between each pair of nodes using the physical computing system, estimating an adaptive threshold value using the obtained feature values using the physical computing system, and segmenting the Web page by comparing the feature values associated with each pair of nodes with the estimated adaptive threshold value. | 03-07-2013 |
20130091150 | DETERMINING SIMILARITY BETWEEN ELEMENTS OF AN ELECTRONIC DOCUMENT - Disclosed is a computer-implemented method of determining similarity between first and second elements of an electronic document. The method uses a computer to calculate a plurality of measures of similarity between the first and second elements in at least two representations of the electronic document. A computer program product and system implementing this method are also disclosed. | 04-11-2013 |
20130110818 | PROFILE DRIVEN EXTRACTION | 05-02-2013 |
20130114105 | Semantically Ranking Content in a Website - Semantically ranking content in a website ( | 05-09-2013 |
20130159889 | Obtaining Rendering Co-ordinates Of Visible Text Elements - A computer-implemented method for obtaining the rendering co-ordinates of visible text elements on a web page is disclosed. The web page is represented by an input data structure comprising a plurality of text nodes, each of which represents a text element on the web page. The method comprises the following steps: | 06-20-2013 |
20140236968 | Discrete Wavelet Transform Method for Document Structure Similarity - Examples of the present disclosure may include methods, systems, and computer readable media with executable instructions. An example method for determining document structure similarity can include segmenting path sequences ( | 08-21-2014 |