Electronic to Digital Convertor: software for digitizing electronic media
Abstract
“Digitizing Transmuter”, or “Electronic to Digital Convertor” as we have called it in our patent filing, converts electronic data to digital data. Unlike current converters, which convert data from one electronic form to another, this invention converts electronic data into a digital format that is machine processable. In other words, the data is made available in a processable, or what is commonly called data-minable, manner. “Electronic to Digital Convertor” is designed to parse electronic media, retrieve selected types of relevant and meaningful data, and populate a structured file such as a database, which makes the data useful. Electronic data such as Excel and PDF files is fetched and populated in a database such as a SQL database, from where it is further used, as convenience or requirements dictate, for processing into meaningful analysis and reports.
Specifications
This invention provides an improved process for culling relevant information from electronic media into digitized form and accumulating it in an information bank, more commonly called a database.
The product comprises the following components:
Settings: This step initializes the process. The type details of the electronic media to be output are identified and stored.
Parser: This step parses the electronic file and recovers the type of structured data as identified in the settings file.
Population of information bank: This step populates the relevant parsed data into an information bank.
Reports: This step allows the user to generate reports based on various input parameters.
The Settings component decides where the input comes from and where the output is required.
Based on these parameters, the input file is parsed and the information in the file is converted from its electronic form to an intermediate form.
These intermediate outputs can be viewed using the Report component or utility.
The intermediate output is then populated in the information bank.
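The four components described above can be sketched as a small pipeline. This is only an illustrative sketch, not the invention's actual implementation: the `Settings` class, the function names, the "Label: value" input layout, the sample deal sheet, and the use of an in-memory SQLite database as the information bank are all assumptions made for the example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Settings:
    # Hypothetical settings: which fields to pull out and which table receives them.
    fields: tuple
    table: str


def parse(text, settings):
    """Return one intermediate record holding only the fields named in settings."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in settings.fields:
            record[key.strip()] = value.strip()
    return record


def populate(conn, settings, record):
    """Insert the intermediate record into the information bank."""
    # Table and column names come from trusted settings, not from user input.
    cols = ", ".join(settings.fields)
    marks = ", ".join("?" for _ in settings.fields)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {settings.table} ({cols})")
    conn.execute(
        f"INSERT INTO {settings.table} ({cols}) VALUES ({marks})",
        [record.get(f) for f in settings.fields],
    )


def report(conn, settings, field, value):
    """Return every stored row whose `field` equals `value`."""
    cur = conn.execute(
        f"SELECT * FROM {settings.table} WHERE {field} = ?", (value,)
    )
    return cur.fetchall()


settings = Settings(fields=("Buyer", "Seller", "Quantity"), table="deals")
sample = "Deal sheet\nBuyer: Acme\nSeller: Globex\nQuantity: 500\nPage 1 of 1"

conn = sqlite3.connect(":memory:")
record = parse(sample, settings)      # intermediate form
populate(conn, settings, record)      # information bank
rows = report(conn, settings, "Buyer", "Acme")
print(rows)
```

Lines such as "Deal sheet" and "Page 1 of 1" are ignored because they do not match any configured field, which is the sense in which only the relevant data reaches the information bank.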
A set of required data can be viewed in a structured manner by searching the data on any required parameter or criterion. The invention can also produce an intermediate or alternate output in another electronic form such as .csv, .xml or .xls (Excel). In addition, the invention can include a search engine to retrieve data in batches as required.
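As a rough illustration of the alternate output and batched search described here, the sketch below filters parsed records by arbitrary criteria, yields them in batches, and writes them out as CSV. The record layout and the helper names `search`, `batches` and `to_csv` are hypothetical and chosen only for this example.

```python
import csv
import io


def search(records, **criteria):
    """Keep only the records that match every given field=value criterion."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


def batches(records, size):
    """Yield the records in consecutive batches of at most `size` items."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def to_csv(records, fields):
    """Serialize the records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


records = [
    {"Buyer": "Acme", "Quantity": "500"},
    {"Buyer": "Initech", "Quantity": "200"},
    {"Buyer": "Acme", "Quantity": "75"},
]

acme = search(records, Buyer="Acme")
print(to_csv(acme, ["Buyer", "Quantity"]))
for batch in batches(records, 2):
    print(len(batch), "record(s) in this batch")
```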
The invention has been proven mainly in the parsing and transmuting of PDFs. It has also been proven with other electronic media such as .xls (Excel) files and .lst files, commonly called list files.
Field of our invention
This invention pertains generally to extracting, interpreting and storing data from electronic media in a digital format. Transmuters are well known and widely used in the industry. The idea behind the commonly used transmuters is to let you access the content in its current form, without adding value. These transmuters transfer electronic data from one format to another electronic format, which fails to focus on the valuable, significant data. Examples are the various PDF parsers, which fetch the complete data and present it in another electronic format such as HTML or Excel.
Background
Many existing converters work by converting the PDF text image to an alternate text image. The idea behind the existing transmuters is that the complete text image is transferred or copied into an alternate electronic format that looks identical to the original. This is useful when an existing file is required in an alternate format, such as a PDF file required in HTML format.
Since the existing technology only converts the text, it does not categorize or index it. The available transmuters are designed only to change data from one electronic format to another.
Uniqueness in the invention
The uniqueness of the invention is the conversion of electronic data to digital data. The invention takes data in a format that only a human can interpret and converts it to a format that a machine can interpret. The significance is that once data is available in a machine-interpretable manner, or in a digital format as it is commonly called, it becomes usable in analytical programs, which can create reports by interpreting the data as required.
The uniqueness also stems from understanding and analyzing the three-layer electronic architecture. Electronic to Digital Convertor parses electronic files using the concept of “indexing”. Indexing refers to finding the key data in the file format and locating the related data based on that index. This invention uses a unique algorithm that translates the electronic file format into digital data. The invention uses mapping files to locate the required data in the file format. The file format is an input to the invention, which, with the help of the mapping structure, parses the required data and produces the list of data to be populated into the back office. The mapping structure holds the key to locating the data in the file format. A change in the mapping structure changes the data fetched from the file format.
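The idea of a mapping structure that locates keys and fetches the related data can be illustrated with a minimal sketch. It is not the invention's algorithm, which the text does not disclose; the mapping layout (output field to key label), the "key: value" or "key = value" text pattern, and the sample statement are assumptions made for illustration.

```python
import re

# Hypothetical mapping structure: output field name -> key label to find in the file.
MAPPING = {"deal_id": "Deal No", "buyer": "Buyer Name", "amount": "Deal Value"}


def fetch_by_mapping(text, mapping):
    """Locate each mapped key in the text and fetch the value that follows it."""
    found = {}
    for field, key in mapping.items():
        # The key acts as the index; the rest of its line is the related data.
        match = re.search(re.escape(key) + r"\s*[:=]\s*(.+)", text)
        if match:
            found[field] = match.group(1).strip()
    return found


text = (
    "Monthly statement\n"
    "Deal No: D-1042\n"
    "Buyer Name = Acme Ltd\n"
    "Deal Value: 25000\n"
    "Remarks: none"
)

print(fetch_by_mapping(text, MAPPING))
# Changing the mapping changes what is fetched from the same input.
print(fetch_by_mapping(text, {"remarks": "Remarks"}))
```

Keeping the keys in a separate mapping, rather than in the parsing code, means a new file layout can be supported by editing the mapping alone.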
The invention parses and validates the file at several levels. Initially, all the file formats and the relevant mapping structures are read from a specified path. The mapping fields are matched against the keys in the file format, and the matched data is fetched for further validation. The data is then verified against a predefined set of rules before it is presented in a digital format.
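Verification against a predefined set of rules might look like the sketch below. The rule set, the field names and the error message format are hypothetical; the source does not say which rules the invention actually applies.

```python
import re

# Hypothetical rule set: field name -> predicate that the fetched value must satisfy.
RULES = {
    "deal_id": lambda v: re.fullmatch(r"D-\d+", v) is not None,
    "amount": lambda v: v.isdigit() and int(v) > 0,
}


def validate(record, rules):
    """Return a list of rule violations; an empty list means the record passed."""
    errors = []
    for field, rule in rules.items():
        value = record.get(field)
        if value is None:
            errors.append(f"{field}: missing")
        elif not rule(value):
            errors.append(f"{field}: invalid value {value!r}")
    return errors


good = {"deal_id": "D-1042", "amount": "25000"}
bad = {"deal_id": "1042"}

print(validate(good, RULES))
print(validate(bad, RULES))
```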
The Electronic to Digital Converter stands out because it understands that some formats, such as PDF, are simply a collection of objects, or a dictionary with the object number as key, and yet it fetches only the correct, related data, which is otherwise just glyphs to a file format like PDF. In a typical file format, the encoding, the table structure, and the number of fields and records in a table vary from file to file. The Electronic to Digital Converter is designed to handle such complexities.
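The view of a PDF as a dictionary keyed by object number can be shown with a toy sketch. It relies on the standard PDF syntax in which an indirect object is written as "N G obj ... endobj", and it is deliberately simplified: real PDFs also contain compressed object streams, binary stream data and cross-reference tables, so a production parser needs far more than this. The function name `index_objects` and the sample bytes are made up for the example.

```python
import re

# Matches "<object number> <generation> obj ... endobj" in raw PDF bytes.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)


def index_objects(data):
    """Build a dictionary that maps each object number to its raw body."""
    return {int(m.group(1)): m.group(3).strip() for m in OBJ_RE.finditer(data)}


# A tiny hand-written fragment, not a complete or valid PDF file.
sample = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Count 0 /Kids [] >>\nendobj\n"
)

objects = index_objects(sample)
for number, body in objects.items():
    print(number, body)
```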
The converter has been proven in the creation of monthly deal data for analytical purposes. The result is available for review at www.datspan.in. When hundreds of deals are downloaded, the data required by dealers, brokers and other stakeholders is fetched by the converter and made available online. This data proves very useful in helping determine trading action.