Information Markets: Data Transmission & Requirements for Flawless Delivery

Dr. Hossein Eslambolchi
March 2012

Electronic data transformation technology enables the construction of information-related applications and services, including information markets. The data transformation techniques that make these markets possible range from simple reformatting for report generation to complex algorithms for compression and encryption.

KEY POINTS

• Despite its potential, the information market is still in its infancy. Today it consists mainly of companies producing information packages for sale to consumers and other businesses.

• The information market is beginning to open up for commerce between individual customers.

• Data transformation in the form of compression and encryption is fundamental to information markets.

 

THE TECHNOLOGY

As the Internet has evolved, so have data and information. Once used primarily in the back office to support enterprise operations, they are becoming commodities that can be bought and sold online. This “information market” is already huge, even though it is still in its infancy.

Today’s information market consists mainly of companies that produce information packages to sell to other businesses and individuals. At one end of the spectrum are consulting firms that regularly produce and sell expensive, in-depth reports on various aspects of industry and the economy. At the other end, companies such as Yahoo and Google provide far more interactive services, delivering comparatively shallow information packages through their search engines.

Right now individual customers rarely trade in information. Peer-to-peer networks are starting to address this need by enabling groups of users to share storage and to exchange certain types of data with one another. However, the infrastructure of such peer-to-peer networks has not matured. Individual users still cannot create, sell or buy information directly from each other.

When information commerce comes into its own, every user will be capable of opportunistically creating and selling information. As a consequence, the amount of data and information created and stored will grow explosively.

What is a good example of opportunistic data marketing? Imagine a user creating a video recording of a rare event, such as an earthquake or a tsunami. This video can be made available to the rest of the world via wireless connectivity, for a price. Data transformation techniques – compression and encryption – along with secure and scalable micro-payment technologies will be essential to guarantee reliable and smooth transactions.

“Community vaults” consisting of many small storage areas offered by distributed participants would be one step toward creating the necessary infrastructure.

The challenges and goals of Exalab include:

• Creating a generalized file system over an unreliable public network

• Ensuring reliable storage and access performance

• Ensuring data security and privacy

• Creating an e-commerce model for information commercialization based on micro-payments.

These goals will take some time to realize. Exalab would have to deal with a multitude of data types and sizes. To remain reliable and secure, data must be compressed, encrypted and replicated over many areas in the community vault; traditional methods for compression and encryption will no longer be adequate.
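The replication requirement can be illustrated with a minimal sketch. It is not Exalab's design: the function names `store` and `fetch`, the in-memory dictionaries standing in for storage areas, and the choice of three replicas are all assumptions made for illustration. Encryption is deliberately left out because the Python standard library does not ship a cipher; only compression, replication and integrity checking are shown.

```python
import hashlib
import zlib


def store(data: bytes, vault: list[dict], copies: int = 3) -> str:
    """Compress data and place replicas in `copies` distinct storage areas."""
    if copies > len(vault):
        raise ValueError("not enough storage areas for the requested replicas")
    key = hashlib.sha256(data).hexdigest()  # content-derived identifier
    blob = zlib.compress(data)
    # An encryption step would be applied to `blob` here (omitted in this sketch).
    start = int(key, 16) % len(vault)
    for i in range(copies):
        vault[(start + i) % len(vault)][key] = blob
    return key


def fetch(key: str, vault: list[dict]) -> bytes:
    """Return the first replica that decompresses and matches its digest."""
    for area in vault:
        blob = area.get(key)
        if blob is None:
            continue
        try:
            data = zlib.decompress(blob)
        except zlib.error:
            continue  # corrupted replica, try the next area
        if hashlib.sha256(data).hexdigest() == key:
            return data
    raise KeyError(key)
```

Because the key is derived from the content, a reader can detect a damaged or tampered replica and fall back to another area without trusting any single participant.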

In response, researchers are designing a general software platform for data transformation called Vcodex. The goals of Vcodex are to provide:

• A robust data architecture for portable, self-describing encoded data

• A general software framework enabling everyone to build and use data transforms

• A repository of reusable and efficient algorithms to ease the construction of new transforms.
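To make the first two goals concrete, here is a small sketch of a self-describing container whose header records which transforms were applied, so that a decoder can invert them without outside knowledge. This is not the Vcodex data format or API. The `TRANSFORMS` registry, the JSON header and the 4-byte length prefix are assumptions chosen for illustration, and standard-library codecs stand in for real transforms.

```python
import base64
import bz2
import json
import zlib

# Hypothetical registry: transform name -> (encode function, decode function).
TRANSFORMS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "base64": (base64.b64encode, base64.b64decode),
}


def encode(data: bytes, chain: list[str]) -> bytes:
    """Apply the named transforms in order and prepend a header naming them."""
    for name in chain:
        data = TRANSFORMS[name][0](data)
    header = json.dumps({"chain": chain}).encode("utf-8")
    return len(header).to_bytes(4, "big") + header + data


def decode(container: bytes) -> bytes:
    """Read the header and undo the recorded transforms in reverse order."""
    header_len = int.from_bytes(container[:4], "big")
    header = json.loads(container[4:4 + header_len])
    data = container[4 + header_len:]
    for name in reversed(header["chain"]):
        data = TRANSFORMS[name][1](data)
    return data
```

For example, `decode(encode(payload, ["zlib", "base64"]))` returns `payload` unchanged, and the receiver never needs to be told which chain was used.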

Substantial progress has been made in compression. Vcodex currently provides a comprehensive set of standard and new compression transforms that can be combined to yield dramatic compression of common data types.

For example, the Vcdiff delta transform encodes a dataset in relation to another dataset, including earlier versions of the same data.

Consider datasets that are frequently updated, like financial records, source or binary program code, web pages of news organizations, and so on. In these cases, one can use the delta transform together with other standard compression transforms to achieve up to a 500 to 1 compression factor.
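The idea behind delta compression can be sketched with Python's standard `zlib` module, which accepts a preset dictionary. This is not the Vcdiff algorithm or its data format; it is a stand-in technique that lets the compressor refer back to an earlier version of the data. The function names are hypothetical, and because DEFLATE only looks back 32 KB, the sketch suits small sources only.

```python
import zlib


def delta_encode(target: bytes, source: bytes) -> bytes:
    """Compress `target`, using `source` (e.g. the previous version) as a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=source)
    return comp.compress(target) + comp.flush()


def delta_decode(delta: bytes, source: bytes) -> bytes:
    """Rebuild the target from the delta; the same source must be supplied."""
    decomp = zlib.decompressobj(zdict=source)
    return decomp.decompress(delta) + decomp.flush()
```

When the new version differs from the old one in only a few places, most of the output consists of short references into the old version, which is why frequently updated datasets compress so well this way.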

The power of Vcdiff is demonstrated by its wide use around the world. Its data format has been standardized by the Internet Engineering Task Force (IETF).

Tabular data – Excel spreadsheets, relational tables from mainframes or Oracle databases – can be managed by the Vctable transform. Vctable employs a simple and efficient machine-learning technique, allowing the system to automatically compute certain dependencies in the table. When used together with other standard compression transforms, Vctable can reach a 100 to 1 compression factor for commonly available tabular data such as telephone billing, stock transactions, etc.
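One reason tables compress well is that values within a column tend to resemble each other. The sketch below shows only that basic idea: it transposes a table so each column is stored contiguously before handing it to a general-purpose compressor. It does not reproduce Vctable's dependency computation. The function names and separator characters are assumptions, and the table is assumed to be non-empty, rectangular and free of the separator characters.

```python
import zlib

FIELD_SEP = "\x1f"   # separates values within one column
COLUMN_SEP = "\x1e"  # separates columns


def table_encode(rows: list[list[str]]) -> bytes:
    """Serialize a table column by column, then compress."""
    columns = zip(*rows)  # transpose rows into columns
    text = COLUMN_SEP.join(FIELD_SEP.join(col) for col in columns)
    return zlib.compress(text.encode("utf-8"), 9)


def table_decode(blob: bytes) -> list[list[str]]:
    """Decompress and transpose back into rows."""
    text = zlib.decompress(blob).decode("utf-8")
    columns = [col.split(FIELD_SEP) for col in text.split(COLUMN_SEP)]
    return [list(row) for row in zip(*columns)]
```

A fuller transform would also group or reorder columns whose values predict one another, which is the kind of dependency the text says Vctable computes automatically.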

The Vcodex platform is used to dramatically reduce the hard disk space needed to store telephone transaction and billing data. The next step in its development is to integrate encryption technology so that compression and encryption can be seamlessly composed and customized.

THE PLAYERS

• Peer-to-peer networks address some of the challenges in exchanging data over the Internet. One example is Freenet, free software that lets users publish and obtain information without fear of censorship. Freenet is entirely decentralized, and publishers and consumers alike remain anonymous. However, Freenet is only a file-sharing system – there is no guarantee of reliability or privacy.

• Many of the early advances in data compression and encryption were made at university laboratories in occasional collaboration with industry. These technologies were often commercialized by small, niche companies or simply offered for free as open source software. Due to the explosive growth of data in recent years, major players like Google, IBM and Microsoft have begun to focus on this area.

Google is an interesting emerging player in data transformation; its business model, of course, deals exclusively with data and information. On a daily basis Google collects, analyzes and stores many gigabytes of data, and makes this information available to web customers in real time. It should be no surprise that compression research and development is a major component of its daily operations.

IBM and Microsoft have applied compression technologies to enhance their operating systems and services. IBM developed a proprietary delta compression technology similar to the delta transform in Vcodex for use in the TiVo recorder. Microsoft provides technology to integrate compression into its operating systems. However, with the advent of cheap PC storage, this technology is now largely unused by individual PC users.

Vcodex is currently the most comprehensive platform available for data transformation. It provides all standard compression algorithms, along with newly invented models, and these can be composed to optimize compression of specific data types. Compression has enabled the construction of one of the world’s largest data warehouses, capable of storing terabytes of information online. There are plans to integrate robust encryption techniques into the platform.

 

POTENTIAL IMPACTS

Global service providers are already using the best-known compression technologies in-house to store massive amounts of telephone data. This data, in turn, is made available in real time for customer care, billing, marketing, etc.

One promising approach for providers is to leverage their network technology and brand presence to enter the market and trade information as a commodity. One can imagine Google as the Wal-Mart of the information market: it provides information of reasonably decent quality but little depth.

Google could someday become the eBay of information by developing the infrastructure to allow customers to find each other and conduct micro-transactions. For such trade to be efficient, information must be wholly owned by individual customers, yet stored and accessible on the Internet or some other publicly accessible network. In addition, users must be able to trust the mediator of these transactions.

Global service providers are in the right place to provide such a service.