About this Blog

As enterprise supply chains and consumer demand chains have become globalized, they continue to inefficiently share information “one-up/one-down”. Profound "bullwhip effects" in the chains leave managers scrambling with inventory shortages and consumers struggling to understand product recalls, especially food safety recalls. Add to this the increasing usage of personal mobile devices by managers and consumers seeking real-time information about products, materials and ingredient sources. The popularity of mobile devices with consumers is inexorably tugging enterprise IT departments toward apps and services. But both consumer and enterprise data are proprietary assets that must be selectively shared to be efficiently shared.

About Steve Holcombe

Unless otherwise noted, all content on this company blog site is authored by Steve Holcombe as President & CEO of Pardalis, Inc. More profile information: View Steve Holcombe's profile on LinkedIn

Entries in SaaS (6)

Wednesday
Jul 11, 2012

The Tipping Point Has Arrived: Market Incentives for Selective Sharing in Web Communications

By Steve Holcombe (@steve_holcombe) and Clive Boulton (@iC)

A Glimmer of Market Validation for Selective Sharing

In late 2005 Pardalis deployed a multi-tenant, enterprise-class SaaS to a Texas livestock market. The web-connected service provided for the selective sharing of data assets in the U.S. beef livestock supply chain.  Promising revenues were generated from a backdrop of industry incentives being provided for sourced livestock. The industry incentives themselves were driven by the specter of mandatory livestock identification promised by the USDA in the wake of the 2003 "mad cow" case.

At the livestock market thousands of calves were processed over several sessions. Small livestock producers brought their calves into the auction for weekly sales where they were RFID tagged. An affordable fee per calf was charged to the producers, which included the cost of an RFID tag. The tag identifiers were automatically captured, a seller code was entered, and affidavit information was also entered as to the country of origin (USA) of each calf. Buyers paid premium prices for the tagged calves over and above untagged calves. The buyers made money over and above the affordable fee per calf. After each sale, and at the speed of commerce, all seller, buyer and sales information was uploaded into an information tenancy in the SaaS that was controlled by the livestock market. For the first time ever in the industry, the livestock auction selectively authorized access to this information to the buyers via their own individual tenancies in the SaaS.
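For readers who think in code, the flow just described can be sketched roughly as follows. This is only an illustration under our own assumptions, not the Pardalis implementation: the `SaleRecord` and `Tenancy` names, the field names and the sample tag numbers are all hypothetical. The market's tenancy holds the sale records, and each buyer sees only the fields the market has authorized, so a field such as the seller code can be withheld.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SaleRecord:
    tag_id: str       # RFID tag identifier captured at the auction
    seller_code: str  # identifies the calf seller
    origin: str       # country-of-origin affidavit
    buyer: str        # tenant name of the buyer


@dataclass
class Tenancy:
    """An information tenancy controlled by one owner (hypothetical model)."""
    owner: str
    records: list = field(default_factory=list)
    # Maps a tenant name to the set of field names that tenant may read.
    grants: dict = field(default_factory=dict)

    def authorize(self, tenant: str, fields: set) -> None:
        self.grants[tenant] = set(fields)

    def view(self, tenant: str) -> list:
        """Return only the authorized fields of the tenant's own purchases."""
        allowed = self.grants.get(tenant)
        if not allowed:
            return []  # no authorization means nothing is shared
        return [
            {name: getattr(rec, name) for name in sorted(allowed)}
            for rec in self.records
            if rec.buyer == tenant
        ]


market = Tenancy(owner="livestock-market")
market.records.append(SaleRecord("840-0001", "S-17", "USA", "buyer-a"))
market.records.append(SaleRecord("840-0002", "S-22", "USA", "buyer-b"))

# The market shares tag and origin data but withholds the seller code.
market.authorize("buyer-a", {"tag_id", "origin"})

print(market.view("buyer-a"))  # [{'origin': 'USA', 'tag_id': '840-0001'}]
print(market.view("buyer-b"))  # [] because buyer-b has not been authorized
```

The design choice worth noticing is that the data stays in one place, in the market's tenancy, and only the authorizations change.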

Processing any calves at all would not have been possible without directly addressing the fear of information sharing that was held by both the calf sellers and the livestock market. The calf sellers liked that their respective identities were selectively withheld from the calf buyers. And they liked that a commercial entity they trusted – the livestock market – could stand as a kind of trustee between them and governmental regulators in case an auctioned calf later turned out to be the next ‘mad cow’. In turn the livestock market liked the selectiveness in information sharing because it did not have to share its confidential client list in an “all or nothing” manner with potential competitors on down the supply chain. At that moment in time, the immediate future of selective sharing with the SaaS looked very bright. The selective sharing design deployed by Pardalis in its SaaS fixed data elements at a single location with authorizations controlled by the tenants. Unfortunately, the model could not be continued and scaled at that time to other livestock markets. In 2006 the USDA bowed to political realities and terminated its efforts to introduce national mandatory livestock identification.

And so, too, went the regulatory-driven industry incentives. But … hold that thought.

Talking in Circles: Selective Sharing in Google+

Google+ is now 1 year old. In conjunction with Google, researchers Sanjay Kairam, Michael J. Brzozowski, David Huffaker, and Ed H. Chi have published Talking in Circles: Selective Sharing in Google+, the first empirical study of behavior in a network designed to facilitate selective sharing:

"Online social networks have become indispensable tools for information sharing, but existing ‘all-or-nothing’ models for sharing have made it difficult for users to target information to specific parts of their networks. In this paper, we study Google+, which enables users to selectively share content with specific ‘Circles’ of people. Through a combination of log analysis with surveys and interviews, we investigate how active users organize and select audiences for shared content. We find that these users frequently engaged in selective sharing, creating circles to manage content across particular life facets, ties of varying strength, and interest-based groups. Motivations to share spanned personal and informational reasons, and users frequently weighed ‘limiting’ factors (e.g. privacy, relevance, and social norms) against the desire to reach a large audience. Our work identifies implications for the design of selective sharing mechanisms in social networks."

While selective sharing may be characterized as being available on other networks (e.g. ‘Lists’ on Facebook), Google is sending signals that making the design of selective sharing controls central to the sharing model offers a great opportunity to help users manage their self-presentations to multiple audiences in the multi-tenancies we call online social networks. Or, put more simply, selective sharing multiplies opportunities for online engagement.

For the purposes of this blog post, we adopt Google’s definition of "selective sharing" to mean providing information producers with controls for overcoming both over-sharing and fear of sharing. Furthermore, we agree with Google that the design of tools for such selective sharing controls must allow users to balance sender and receiver needs, and to adapt these controls to different types of content. So defined, we believe that almost seven years since the Texas livestock market project, a tipping point has been reached that militates in favor of selective sharing from within supply chains and on to consumers. Now, a lot has happened over the last seven years to bring us to this point (e.g., the rise of social media, CRM in the Cloud, the explosion of mobile technologies, etc.). But the tipping point we are referencing "follows the money", as they say. We believe that the tipping point toward selective sharing is to be found in the incentives provided by affiliate networks like the Google Affiliate Network.

Google Affiliate Network

The Google Affiliate Network provides a means for affiliates to monetize websites. Here’s a recent video presentation by Google, Automating the Use of Google Affiliate Links to Monetize Your Web Site:


Presented by Ali Pasha & Shaun Cox | Published 2 July 2012 | 47m 11s

The Google Affiliate Network provides incentives for affiliates to monetize their websites based upon actual sales conversions instead of indirectly based upon the number of ad clicks. These are web sites (e.g., http://www.savings.com/) where ads are the raison d'etre of the web site. High value consumers are increasingly scouring promotional, comparison, and customer loyalty sites like savings.com for deals and generally more information about products. Compare that with websites where ads are peripheral to other content (e.g., http://www.nytimes.com/) and where ad clicks are measured using Web 2.0 identity and privacy sharing models.

In our opinion the incentives of affiliate networks have huge potential for matching up with an unmet need in the Cloud for all participants - large and small - of enterprise supply chains to selectively monetize their data assets. For example, data assets pertaining to product traceability, source, sustainability, identity, authenticity, process verification and even compliance with human rights laws, among others, are there to be monetized.

Want to avoid buying blood diamonds? Go to a website that promotes human rights and click on a diamond product link that has been approved by that site. Want to purchase only “Made in USA” products? There’s not a chamber of commerce in the U.S. that won’t want to provide a link to their members’ websites who are also affiliates of an incentive network. Etc.

Unfortunately, these data assets are commonly not shared because of the complete lack of tools for selective sharing, and the fear of sharing (or understandable apathy) engendered under “all or nothing” sharing models. As published back in 1993 by the MIT Sloan School in Why Not One Big Database? Ownership Principles for Database Design: "When it is impossible to provide an explicit contract that rewards those who create and maintain data, ‘ownership’ will be the best way to provide incentives." Data ownership matters. And selective sharing – appropriately designed for enterprises – will match data ownership up with available incentives.

Remember that thought we asked you to hold?

In our opinion the Google Affiliate Network is already providing incentives that are a sustainable, market-driven substitute for what turned out to be unsustainable, USDA-driven incentives. We presume that Google is well aware of potential synergies between Google+ and the Google Affiliate Network. We also presume that Google is well aware that "[w]hile business-critical information is often already gathered in integrated information systems, such as ERP, CRM and SCM systems, the integration of these systems itself (as well as the integration with the abundance of other information sources) is still a major challenge."

We know this is a "big idea" but in our opinion the dynamic blending of Google+ and the Google Affiliate Network could over time bring within reach a holy grail in web communications – the cracking of the data silos of enterprise class supply chains for increased sharing with consumers of what to-date has been "off limits" proprietary product information.

A glimpse of the future may be found for example in the adoption of Google+ by Cadbury UK, but the design for selective sharing of Google+ is currently far from what it needs to attract broad enterprise usage. Sharing in Circles brings to mind Eve Maler’s blog post, Venn and the Art of Data Sharing.  That’s really cool for personal sharing (or empowering consumers as is the intent of VRM) but for enterprises Google+ will need to evolve its selective sharing functionalities. Sure, data silos of commercial supply chains are holding personal identities close to their chest (e.g., CRM customer lists) but they’re also walling off product identities with every bit as much zeal, if not more. That creates a different dynamic that, again, typical Web 2.0 "all or nothing" sharing (designed, by the way, around personal identities) does not address.

It should be specially noted, however, that Eve Maler and the User-Managed Access (UMA) group at the Kantara Initiative are providing selective sharing web protocols that place "the emphasis on user visibility into and control over access by others".  And Eve in her capacity at Forrester has more recently provided a wonderful update of her earlier blog post, this one entitled A New Venn of Access Control for the API Economy.

But in our opinion before Google+, UMA or any other companies or groups working on selective sharing can have any reasonable chance of addressing "data ownership" in enterprises and their supply chains, they will need to take a careful look at incorporating fixed data elements at a single location with authorizations. It is in regard to this point that we seek to augment the current status of selective sharing. More about that line of thinking (and activities within the WikiData Project) in our earlier “tipping point” blog post, The Tipping Point Has Arrived: Trust and Provenance in Web Communications.

What do you think? Share your conclusions and opinions by joining us at @WholeChainCom on LinkedIn at http://tinyurl.com/WholeChainCom.

Tuesday
Dec 30, 2008

Microsoft Office Applications and Data Ownership

Part I of a two-part series ....

Microsoft Office Applications as Seamless Supply Chain Tools

There is a systemic supply chain problem for small businesses (defined here as 1 to 10 employees in size) that reverberates throughout our global economies. It may be seen in any product or service supply chain comprised of small businesses.

  • In other words, in the 'last mile' of any and every supply chain.

Of all the product supply chains in the world the U.S. beef livestock and meat products' industry is arguably the most challenging. There are approximately 110 million cattle in the U.S. and Canadian beef supply chain. Each year, about 44 million animals are slaughtered. In the U.S. there are approximately 1 million beef cattle operations the vast majority of which are small farms and family-owned operations commonly using Microsoft Office Excel for electronically storing and managing their livestock data.

Practically none of that data is shared, and even when it is shared it takes the form of paperwork, faxes, e-mails and phone calls that are difficult to trace and authenticate.

One reason is that there has heretofore been no 'chain of custody' SaaS designed for small businesses. Not only that, but neither Microsoft Excel nor any other component of the Microsoft Office Applications (like Outlook or Word) has yet been designed as a supply chain traceability and authentication solution for small businesses.

Other reasons have to do with common fear factors. Farmers and ranchers constantly wrestle with convergent 'data ownership' issues related to genetics, pharmaceuticals, food safety, traceability, authentication, government regulation, product marketability, health records, and information producer confidentiality.

  • Why provide ammunition to a competitor?
  • Why let the government (i.e., the USDA, FDA, IRS, etc.) know how many cattle you - as a farmer - really own?
  • And why do so especially if you - as a farmer - don't see an increase in your return on investment (ROI)?

So, the small businesses of de-centralized U.S. agri-food supply chains are not providing customers or regulators with traceable, pedigree data about their crops and livestock.

  • The result? Continuing U.S. food safety crises. Mad cow prions, tainted spinach, hamburger recalls, etc.

And you don't have to be guilty, either, to be ensnared. The 2008 tomato recalls found the U.S. Food and Drug Administration (FDA) wrongly fingering the tomato industry for salmonella poisonings. That went on for weeks.

  • What if small business users of Microsoft Office Applications could be seamlessly linked to the large and mid-sized enterprises already using ERP, CRM, SCM and other federated supply chain solutions?
  • For example, what if a metadata service layer could transform Microsoft Excel into a supply chain solution for increasing ROI for small farmers who could then be paid for both their cash crop and the pedigree data identifying the history of their cash crop?
  • And what if that metadata service layer also directly addressed the ‘data ownership’ fears prevalent among U.S. farmers that their data will be wrongly used by regulators or unfairly exploited by competitors?

Pardalis’ Metadata Service Platform

Pardalis’ metadata service platform helps draw small businesses into the emerging ‘Cloud’. With Microsoft technology (Windows server, SQL server, .Net, Excel-like UI), Pardalis has engineered a metadata SaaS platform for small business end-users to granularly author, register and control immutable data objects. Pardalis' business rules advance the capabilities of a relational database (i.e., SQL) toward an emerging, object-oriented Cloud. But the end-users merely see it as an affordable service for ‘banking’, porting and controlling access to their data products using a SaaS-anized Excel-like user-interface.

Early Market Validation

Pardalis’ platform is being deployed by CalfAID, a USDA process verified RFID cattle tracking program using ISO 9000 series standards for documented quality management systems. CalfAID is owned by the small farmers comprising the North Dakota Beef Cattle Improvement Association, and administered by North Dakota State University for:

  • Linking small beef producers, feedlots, processors and restaurants with consumers,
  • Bringing ultra-high frequency, RFID tags to commercial viability,
  • Protecting livestock producers, food system industries, veterinary health, and consumer health from accidental or intentional disease outbreaks, and
  • Overcoming the ‘scary picture’ of RFID tracking by empowering small farmers with direct, granular, data portability control over their identities and pedigree data.

The Value of Microsoft Office Applications As Seamless Supply Chain Tools

The vertical value of pedigree data gathered from agri-food supply chains, using Microsoft Office Applications communicating through a Pardalis metadata service layer, can now be monetized:

  • Consumers retrieving deep search results (permission being granted by a data owner)  to determine food history, quality and safety,
  • Retailers promoting consumer loyalty with pedigree-driven purchase orders directly communicated back through the metadata service layer to small business farmers,
  • Farmers discovering a new profit center - pedigree data about their cash crops,
  • RFID product vendors selling outside of federated supply chains and into the ’last mile’, and
  • Regulators receiving more and better data for rapidly responding to food health crises.

Horizontally Monetizing SaaS in the Cloud with Data Ownership

Challenges related to data chain of custody are not limited to agri-food. There are approximately 500 million world-wide end-users of Microsoft Office Applications. So, what would be the definition of 'data ownership' that might horizontally pull these end-users into SaaS-anized versions of their Office Applications residing in the Cloud?

Empower the end-users with SaaS tools for tracing access to their data objects one-step, two steps, three-steps, etc. after the initial share. They'll know what data ownership is when they see it. The result? The Cloud becomes inflated sooner rather than later with traceable, trustworthy, authenticated data that would otherwise go missing from the invisible hand of informational capitalism.

  • That is, sail past the siren-songs of abstract privacy laws that small businesses don't trust anyway, and capacitate those small businesses with real, hands-on functionalities that they viscerally recognize as data ownership.

And then watch those small businesses grease the wheels for monetizing SaaS in the Cloud.

Go to Part II ...

Saturday
Nov 1, 2008

The Economist: Creating the cumulus

The following is an excerpt from an article published in the October 25th edition of The Economist:

The importance of this shift from a monolithic [software] product to [software as a service] is hard to overstate. In a sense, it has seeded the cloud, allowing the droplets - the services that make up the electronic vapour - to form. It will allow computing to expand in all directions and serve ever more users. The new architecture also helps the less technically minded to shape their own clouds ....

Just as for the industrialisation of data centres, there is a historic precedent for this shift in architecture: the invention of movable type in the 15th century. At the time, printing itself was not a new idea. But it was Gutenberg and his collaborators who thought up the technologies needed to make printing available on a mass scale, creating letters made of metal that could be quickly assembled and re-used.

For the complete article, go to Creating the cumulus: Software will be transformed into a combination of services.

Tuesday
Jul 15, 2008

Cloud Computing: Billowing Toward Data Ownership - Part II

[Return to Part I]

Cloud Computing's Achilles Heel

[Image: Death of Achilles, Peter Paul Rubens, 1630-1635]

The boom in the data center industry is building the Cloud where the conventional wisdom is that the software services of the Semantic Web will thrive. The expansion of the Cloud is believed to augur well that distributed data within the Cloud will come to substitute to some extent - perhaps substantially so - for data currently distributed outside of the Cloud. But the boom is being built upon a privacy paradigm employed by online companies that allows them to use Web cookies for collecting a wide variety of information about individual usage of the Internet. This assumption is the Cloud’s Achilles’ heel. It is an assumption that threatens to keep the Cloud from fully inflating beyond publicly available information sources.

I'm mulling over a more in-depth discussion of Web cookies for a final Part III to this multi-part series. In the meantime the focus of today's blog is that a more likely consequence of the Cloud is that as people and businesses consider moving their computer storage and services into the Cloud, their direct technological control of information becomes more and more of a competitive driver. As blogged in Part I, the online company that figures out ways of building privacy mechanisms into its compliance systems will be putting itself at a tremendous competitive advantage for attracting the services to operate in the Cloud. But puzzlement reigns as to how to connect the Cloud with new pools of data (mostly non-artistic) that are private, confidential and classified.

Semantic Web: What’s Right About the Vision

What’s right about the Semantic Web is that its most highly funded visionaries have envisioned beyond a Web of documents to a ‘Data Web’. Here are two examples. A Web of scalably integrated data employing object registries envisioned by Metaweb Technologies’ Danny Hillis. A Web of granularly linked, ontologically defined, data objects envisioned by Radar Networks’ Nova Spivack.

Click on the thumbnail image to the left and you will see in more detail what Hillis envisions. That is, a database represented as a labeled graph, where data objects are connected by labeled links to each other and to concept nodes. For example, a concept node for a particular category contains two subcategories that are linked via labeled links "belongs-to" and "related-to" with text and picture. An entity comprises another concept that is linked via labeled links "refers-to," "picture-of," "associated-with," and "describes" with Web page, picture, audio clip, and data. For further information, see the blogged entry US Patent App 20050086188: Knowledge Web (Hillis, Daniel W. et al).
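To make the idea of a labeled graph a little more concrete, here is a minimal sketch. It is not taken from the Hillis patent application: the `LabeledGraph` class, the node names and the direction of each link are assumptions made purely for illustration. Only the link labels ("refers-to", "picture-of", "belongs-to", "related-to") come from the description above.

```python
from collections import defaultdict


class LabeledGraph:
    """Nodes joined by labeled, directed links (illustrative only)."""

    def __init__(self):
        # source node -> list of (label, target node)
        self._links = defaultdict(list)

    def link(self, source: str, label: str, target: str) -> None:
        self._links[source].append((label, target))

    def targets(self, source: str, label: str) -> list:
        """Return every node linked from source by a link with this label."""
        return [t for (lbl, t) in self._links.get(source, []) if lbl == label]


graph = LabeledGraph()
# Data objects (a web page, a picture) point at an entity node.
graph.link("page:angus-breed", "refers-to", "entity:angus")
graph.link("img:angus-bull", "picture-of", "entity:angus")
# The entity belongs to a category, which relates to another category.
graph.link("entity:angus", "belongs-to", "category:cattle-breeds")
graph.link("category:cattle-breeds", "related-to", "category:livestock")

print(graph.targets("entity:angus", "belongs-to"))  # ['category:cattle-breeds']
```

Because the meaning sits in the link labels rather than in table columns, new kinds of relationships can be added without redesigning a schema.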

Click on the thumbnail image to the right and you will see in more detail what Spivack envisions. That is, a picture of a Data Record with an ID and fields connected in one direction to ontological definitions and in another direction to other similarly constructed data records with their own fields connected in one direction to ontological definitions, etc. These data records - or semantic web data - are nothing less than self-describing, structured data objects that are atomically (i.e., granularly) connected by URIs. For more information, see the blogged entry, US Patent App 20040158455: Methods and systems for managing entities in a computing device using semantic objects (Radar Networks).

Furthermore, Hillis and Spivack have studied the weaknesses of relational database architecture when applied to globally diverse users who are authoring, storing and sharing massive amounts of data, and they have correctly staked the future of their companies on object-oriented architecture. See, e.g., the blogged entries, Efficient monitoring of objects in object-oriented database system which interacts cooperatively with client programs and Advantages of object oriented databases over relational databases. They both define the Semantic Web as empowering people across the globe to collaborate toward the building of bigger, and more statistically reliable, observations about things, concepts and relationships.

Click on the thumbnail to the left for a screen shot of a visualization and interaction experiment produced by Moritz Stefaner for his 2007 master's thesis, Visual tools for the socio–semantic web. See the blogged entry, Elastic Tag Mapping and Data Ownership. Stefaner posits what Hillis and Spivack would no doubt agree with - that the explosive growth of possibilities for information access and publishing fundamentally changes our way of interacting with data, information and knowledge. There is a recognized acceleration of information diffusion, and an increasing process of granularizing information into micro–content. There is a shift towards larger and larger populations of people producing and sharing information, along with an increasing specialization of topics, interests and the corresponding social niches. All of this appears to be leading to a massive growth of space within the Cloud for action, expression and attention available to every single individual.

Semantic Web: What’s Missing from the Vision

Continuing to use Hillis and Spivack as proxies, these two visionaries of the Semantic Web assume that data - all data - will be made available as an open source. Neither of them has a ready answer for the very simple question that Steve Inskeep asks above (in Part I of this two-part blog entry).

Inskeep: "Is somebody who runs a business, who used to have a filing cabinet in a filing room, and then had computer files and computer databases, really going to be able or want to take the risk of shipping all their files out to some random computer they don't even know where it is and paying to rent storage that way?"

Sir Tim Berners-Lee, the widely recognized inventor of the Web, and Director of the W3C, is every bit as perplexed about data ownership. In Data Portability, Traceability and Data Ownership - Part IV I referenced a recent interview excerpt from March, 2008, initiated by interviewer Paul Miller of ZDNet, in which Berners-Lee does acknowledge data ownership fear factors.

Miller: “You talked a little bit about people's concerns … with loss of control or loss of credibility, or loss of visibility. Are those concerns justified or is it simply an outmoded way of looking at how you appear on the Web?”

Berners-Lee: “I think that both are true. In a way it is reasonable to worry in an organization … You own that data, you are worried that if it is exposed, people will start criticizing [you] ….

So, there are some organizations where if you do just sort of naively expose data, society doesn't work very well and you have to be careful to watch your backside. But, on the other hand, if that is the case, there is a problem. [T]he Semantic Web is about integration, it is like getting power when you use the data, it is giving people in the company the ability to do queries across the huge amounts of data the company has.

And if a company doesn't do that, then, it will be seriously disadvantaged competitively. If a company has got this feeling where people don't want other people in the company to know what is going on, then, it has already got a problem ….”

(emphasis added)

In other words, 'do the right thing', collegially share your data and everything will be OK. If only the real world worked that way, then Berners-Lee would be spot on. In the meantime, there is a ready answer.

Ownership Web

The ready answer is an Ownership Web concurrently rising alongside, and complementary to, the emerging Semantic Web.

For the Semantic Web to reach its full potential in the Cloud, it must have access to more than just publicly available data sources. It must find a gateway into the closely-held, confidential and classified information that people consider to be their identity, that participants in complex supply chains consider to be confidential, and that governments classify as secret. Only with the empowerment of technological ‘data ownership’ in the hands of people, businesses, and governments will the Semantic Cloud make contact with a horizon of new, ‘blue ocean’ data.

The Ownership Web would be separate from the Semantic Web, though semantically connected as a layer of distributed, enterprise-class web platforms residing in the Cloud.

The Ownership Web would contain diverse registries of uniquely identified data elements for the direct authoring, and further registration, of uniquely identified data objects. Using these platforms people, businesses and governments would directly host the authoring, publication, sharing, control and tracking of the movement of their data objects.

The technological construct best suited for the dynamic of networked efficiency, scalability, granularity and trustworthy ownership is the data object in the form of an immutable, granularly identified, ‘informational’ object.

A marketing construct well suited to relying upon the trustworthiness of immutable, informational objects would be the 'data bank'.

Data Banking

Traditional monetary banks meet the expectations of real people and real businesses in the real world.

As blogged in Part I ... 

People are comfortable and familiar with monetary banks. That’s a good thing because without people willingly depositing their money into banks, there would be no banking system as we know it. By comparison, we live in a world that is at once awash in on-demand information courtesy of the Internet, and at the same time the Internet is strangely impotent when it comes to information ownership.

In many respects the Internet is like the Wild West because there is no information web similar to our monetary banking system. No similar integrated system exists for precisely and efficiently delivering our medical records to a new physician, or for providing access to a health history of the specific animal slaughtered for that purchased steak. Nothing out there compares with how the banking system facilitates gasoline purchases.

If an analogy to the Wild West is apropos, then it is interesting to reflect upon the history of a bank like Wells Fargo, formed in 1852 in response to the California gold rush. Wells Fargo wasn’t just a monetary bank, it was also an express delivery company of its time for transporting gold, mail and valuables across the Wild West. While we are now accustomed to next morning, overnight delivery between the coasts, Wells Fargo captured the imagination of the nation by connecting San Francisco and the East coast with its Pony Express. As further described in Banking on Granular Information Ownership, today’s Web needs data banks that do for the on-going gold rush on information what Wells Fargo did for the Forty-niners.

Banks meet the expectations of their customers by providing them with security, yes, but also credibility, compensation, control, convenience, integration and verification. It is the dynamic, transactional combination of these that instills in customers the confidence that they continue to own their money even while it is in the hands of a third-party bank.

A data bank must do no less.

Ownership Web: What's Philosophically Needed

Where exactly is the sweet spot of data ownership?

In truth, it will probably vary depending upon what kind of data bank we are talking about. Data ownership will be one thing for personal health records, another for product supply chains, and yet another for government classified information. And that's just for starters because there will no doubt be niches within niches, each with their own interpretation of data ownership. But the philosophical essence of the Ownership Web that will cut across all of these data banks will be this:

  • That information must be treated as a tangible, commercial product, as banked, traceable money, or as both.

The trustworthiness of information is crucial. Users will not be drawn to data banks if the information they author, store, publish and access can be modified. That means that even the authors themselves must be proscribed from modifying their information once registered with the data bank. Their information must take on the immutable characteristic of tangible, traceable property. While the Semantic Web is about the statistical reliability of data, the Ownership Web is about the reliability of data, period.
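To make the immutability requirement concrete, here is a minimal Python sketch of what "registered means unchangeable" could look like. It is an illustration only, not a description of any actual data bank: the names `RegisteredInfo` and `Registry` are hypothetical, and treating a correction as a brand new registration is an assumption made for the example.

```python
from dataclasses import dataclass, FrozenInstanceError
import hashlib


@dataclass(frozen=True)
class RegisteredInfo:
    """Information as registered; frozen, so its fields cannot be reassigned."""
    author: str
    content: str

    @property
    def fingerprint(self) -> str:
        # Hash of author + content, so any silent change would be detectable.
        raw = f"{self.author}\n{self.content}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()


class Registry:
    """Hypothetical append-only registry: entries are added, never replaced."""

    def __init__(self) -> None:
        self._entries: dict[str, RegisteredInfo] = {}

    def register(self, info: RegisteredInfo) -> str:
        key = info.fingerprint
        # Registering identical information twice maps to the same key.
        self._entries.setdefault(key, info)
        return key

    def get(self, key: str) -> RegisteredInfo:
        return self._entries[key]


registry = Registry()
key = registry.register(
    RegisteredInfo(author="seller-17", content="Country of origin: USA")
)

try:
    # Even the author cannot edit the registered entry in place.
    registry.get(key).content = "Country of origin: unknown"
except FrozenInstanceError:
    print("modification rejected")

# Assumption for this sketch: a correction is a new registration with its
# own key, and the original entry stays intact.
corrected_key = registry.register(
    RegisteredInfo(author="seller-17", content="Country of origin: USA (affidavit on file)")
)
```

The design choice worth noticing is that nothing is ever overwritten. The original entry and the corrected one sit side by side, which is what lets a reader trust that what was registered is what is still there.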

Ownership Web: What's Technologically Needed

What is technologically required is a flexible, integrated architectural framework for information object authoring and distribution, one that easily adjusts to the definition of data ownership as it is variously defined by the data banks serving each social network, information supply chain, and product supply chain. Users will interface with one or more ‘data banks’ employing this architectural framework. But the lowest common denominator will be the trusted, immutable informational objects that are authored and, where the definition of data ownership permits, controllable and traceable by each data owner one-step, two-steps, three-steps, etc. after the initial share.

Click on the thumbnail to the left for the key architectural features for such a data bank. They include a common registry of standardized data elements, a registry of immutable informational objects, a tracking/billing database and, of course, a membership database. This is the architecture for what may be called a Common Point Authoring™ system. Again, where the definition of data ownership permits, users will host their own 'accounts' within a data bank, and serve as their own 'network administrators'. What is made possible by this architectural design is a distributed Cloud of systems (i.e., data banks). The overall implementation would be based upon a massive number of user interfaces (via APIs, web browsers, etc.) interacting via the Internet with a large number of data banks overseeing their respective enterprise-class, object-oriented database systems.
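For readers who think in code, the four stores named above can be sketched as a single Python class. This is a rough sketch under stated assumptions, not the patented design: `DataBank`, `Event`, and every method name are hypothetical, the stores are plain in-memory collections, and billing is left out so that the tracking log only records who registered and who shared what with whom.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One row of the (hypothetical) tracking log."""
    action: str                      # "register" or "share"
    actor: str
    object_id: str
    recipient: Optional[str] = None


@dataclass
class DataBank:
    element_registry: dict = field(default_factory=dict)  # standardized data elements
    object_registry: dict = field(default_factory=dict)   # immutable informational objects
    tracking_log: list = field(default_factory=list)      # tracking (billing omitted)
    members: set = field(default_factory=set)              # membership database

    def register_object(self, owner: str, object_id: str, element_names: list) -> None:
        if owner not in self.members:
            raise PermissionError(f"{owner} is not a member")
        if object_id in self.object_registry:
            raise ValueError(f"{object_id} is already registered and cannot change")
        unknown = [n for n in element_names if n not in self.element_registry]
        if unknown:
            raise KeyError(f"not in the common registry: {unknown}")
        self.object_registry[object_id] = (owner, tuple(element_names))
        self.tracking_log.append(Event("register", owner, object_id))

    def holders(self, object_id: str) -> set:
        # The owner plus everyone who has received the object through a share.
        owner, _ = self.object_registry[object_id]
        received = {
            e.recipient for e in self.tracking_log
            if e.action == "share" and e.object_id == object_id
        }
        return {owner} | received

    def share(self, sender: str, recipient: str, object_id: str) -> None:
        if recipient not in self.members:
            raise PermissionError(f"{recipient} is not a member")
        if sender not in self.holders(object_id):
            raise PermissionError(f"{sender} does not hold {object_id}")
        self.tracking_log.append(Event("share", sender, object_id, recipient))

    def trace(self, object_id: str) -> list:
        # Every hop after the initial registration, in order.
        return [
            (e.actor, e.recipient) for e in self.tracking_log
            if e.action == "share" and e.object_id == object_id
        ]


bank = DataBank()
bank.members.update({"market", "buyer", "feedlot"})
bank.element_registry["country_of_origin"] = "Country where the calf was born"

bank.register_object("market", "sale-001", ["country_of_origin"])
bank.share("market", "buyer", "sale-001")    # one step after the initial share
bank.share("buyer", "feedlot", "sale-001")   # two steps
print(bank.trace("sale-001"))
```

The point of the sketch is the tracking log: because every share is recorded against the object's identifier, the original owner can still see where the object has travelled two or three steps downstream.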

Click on the thumbnail to the right for an example of an informational object and its contents as authored, registered, distributed and maintained with data bank services. Each informational object comprises a unique identifier that designates the informational object, as well as one or more data elements (including personal identification), each of which itself is identified by a corresponding unique identifier. The informational object will also contain other data, such as ontological formatting data, permissions data, and metadata. The actual data elements that are associated with a registered (and therefore immutable) informational object would typically be stored in the Registered Data Element Database (look back at 124 in the preceding thumbnail). That is, the informational object and its actual data elements are linked via pointers, which comprise the data element unique identifiers or URIs. Granular portability is built in. For more information see the blogged entry US Patent 6,671,696: Informational object authoring and distribution system (Pardalis Inc.).
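The structure just described, an object that holds pointers rather than the data itself, can be sketched in a few lines of Python. Everything here is illustrative: `ELEMENT_DB` stands in for the Registered Data Element Database, the `urn:uuid:` identifiers and the `view_for` method are assumptions of this sketch, and the sample health data is invented.

```python
from dataclasses import dataclass
from types import MappingProxyType
import uuid

# Stand-in for the Registered Data Element Database: element ID -> (name, value).
ELEMENT_DB: dict[str, tuple[str, str]] = {}


def register_element(name: str, value: str) -> str:
    """Store a data element and return its unique identifier (the pointer)."""
    element_id = f"urn:uuid:{uuid.uuid4()}"
    ELEMENT_DB[element_id] = (name, value)
    return element_id


@dataclass(frozen=True)
class InformationalObject:
    object_id: str
    element_ids: tuple[str, ...]      # pointers to data elements, not copies
    permissions: MappingProxyType     # viewer -> frozenset of element IDs they may read
    metadata: MappingProxyType

    def view_for(self, viewer: str) -> dict[str, str]:
        # Resolve only the pointers this viewer has been granted.
        allowed = self.permissions.get(viewer, frozenset())
        return {
            ELEMENT_DB[eid][0]: ELEMENT_DB[eid][1]
            for eid in self.element_ids
            if eid in allowed
        }


blood_type = register_element("blood_type", "O+")
allergy = register_element("allergy", "penicillin")
claim = register_element("claim_code", "A100")  # invented sample value

record = InformationalObject(
    object_id=f"urn:uuid:{uuid.uuid4()}",
    element_ids=(blood_type, allergy, claim),
    permissions=MappingProxyType({
        "physician": frozenset({blood_type, allergy, claim}),
        "insurer": frozenset({claim}),
    }),
    metadata=MappingProxyType({"author": "patient-42"}),
)

print(record.view_for("physician"))  # the whole record
print(record.view_for("insurer"))    # only the granular part that was shared
```

Because permissions attach to individual element identifiers, sharing the whole record with one party and a single element with another is the same operation with a different set of pointers. That is one way to read "granular portability is built in".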

Ownership Web: Where Will It Begin?

Metaweb Technologies is a pre-revenue, San Francisco start-up developing and patenting technology for a semantic ‘Knowledge Web’ marketed as Freebase™. Philosophically, Freebase is a substitute for a great tutor, like Aristotle was for Alexander. Using Freebase, users do not modify existing web documents but instead annotate them. The annotations of Amazon.com are the closest example but Freebase further links the annotations so that the documents are more understandable and more findable. Annotations are also modifiable by their authors as better information becomes available to them. Metaweb characterizes its service as an open, collaboratively-edited database (like Wikipedia, the free encyclopedia) of cross-linked data but it is really very much a next generation competitor to Google.

Not that Metaweb co-founder Danny Hillis hasn't thought about data ownership. He has. You can see it in an interview conducted by his patent attorney and filed on December 21, 2001 in the provisional USPTO Patent Application 60/343,273:

Danny Hillis: "Here's another idea that's super simple. I've never seen it done. Maybe it's too simple. Let's go back to the terrorist version [of Knowledge Web]. There's a particular problem in the terrorist version that the information is, of course, highly classified .... Different people have some different needs to know about it and so on. What would be nice is if you ... asked for a piece of information. That you [want access to an] annotation that you know exists .... Let's say I've got a summary [of the annotation] that said,  'Osama bin Laden is traveling to Italy.' I'd like to know how do you know that. That's classified. Maybe I really have legitimate reasons for that. So what I'd like to do, is if I follow a link that I know exists to a classified thing, I'd like the same system that does that to automatically help me with the process of getting the clearance to access that material." [emphasis added]

What Hillis was tapping into just a few months after 9/11 is just as relevant to today's information sharing needs.

In the War on Terror the world is still wrestling with classified information exchange between governments, between agencies within governments, and even between the individuals making up the agencies themselves. Fear factors revolving around data ownership – not legal ownership, but technological ownership – create significant frictions to information sharing throughout these Byzantine information supply chains.

Something similar is happening within the global healthcare system. It's a complex supply chain in which the essential product is the health of the patients themselves. People want to share their entire personal health records with a personal physician but only share granular parts of it with an impersonal insurance company. ‘Fear factors’ are keeping people from becoming comfortable with posting their personal health information into online accounts despite the advent of Microsoft HealthVault and Google Health.

And then, in this era of both de facto and de jure deregulation, there are the international product supply chains providing dangerous toys and potential ‘mad cow’ meat products to unsuspecting consumers. Unscrupulous supply chain participants will always hide in the ‘fog’ of their supply chains. The manufacturers of safe products want to differentiate themselves from the manufacturers of unsafe products. But, again, fear factors keep the good manufacturers from posting information online that may put them at a competitive disadvantage to downstream competitors.

I'm painting a large picture here, but what Hillis is talking about is not limited to the bureaucratic ownership of data. It extends to matching up his Knowledge Web with another system - like the Ownership Web - for automatically working out the data ownership issues.

But bouncing around ideas about how we need data ownership is not the same as developing methods or designs to achieve it. What Hillis non-provisionally filed, subsequent to his provisional application, was the Knowledge Web (aka Freebase) application. Because of its emphasis upon the statistical reliability of annotations, Knowledge Web's IP is tailor-made for the Semantic Web. See the blogged entry US Patent App 20050086188: Knowledge Web (Hillis, Daniel W. et al). And because the conventional wisdom within Silicon Valley is that the Semantic Web is about to emerge, Metaweb is being funded like it is “the next big thing”. Metaweb’s Series B raised $42.4M more in January, 2008. What Hillis well recognizes is that as Freebase strives to become the premier knowledge source for the Web, it will need access to new, blue oceans of data residing within the Ownership Web.

Radar Networks may be the “next, next big thing”. Also a pre-revenue San Francisco start-up, its bankable founder, Nova Spivack, has gone out of his way to state that his product Twine™ is more like a semantic Facebook while Metaweb’s Freebase is more like a semantic Wikipedia. Twine employs W3C standards in a community-driven, bottom-up process, from which mappings are created to infer a higher resolution (see thumbnail to the right) of semantic equivalences or connections among and between the data inputted by social networkers. Again, this data is modifiable by the authors as better information becomes available to them. Twine holds four pending U.S. patent applications though none of these applications. See the blogged entry US Patent App 20040158455: Methods and systems for managing entities in a computing device using semantic objects (Radar Networks). Twine’s Series B raised $15M-$20M in February, 2008 following on the heels of Metaweb's latest round. Twine’s approach in its systems and its IP is to emphasize perhaps a higher resolution Web than that of Metaweb. Twine and the Ownership Web should be especially complementary to each other in regard to object granularity. You can see this, back above, in the resemblance between the thumbnail image of Spivack's Data Record ID object and the thumbnail image of Pardalis' Informational Object. Nonetheless, the IP supportive of Twine, like that of Hillis' Knowledge Web, places a strong emphasis upon the statistical reliability of information. Twine's IP is tailor-made for the Semantic Web.

Dossia is a private consortium pursuing the development of a national, personally controlled health record (PCHR) system. Dossia is also governed by very large organizations like AT&T, BP America, Cardinal Health, Intel, Pitney Bowes and Wal-Mart. In September, 2007, Dossia outsourced development to the IndivoHealth™ PCHR system. IndivoHealth, funded from public and private health grants, shares Pardalis' philosophy that "consumers are managing bank accounts, investments, and purchases online, and … they will expect this level of control to be extended to online medical portfolios." IndivoHealth empowers patients with direct access to their centralized electronic medical records via the Web.

But given the current industry needs for a generic storage model, the IndivoHealth medical records, though wrapped in an XML structure (see the next paragraph), are essentially still just paper documents in electronic format. IndivoHealth falls far short of empowering patients with the kind of control that people intuitively recognize as ‘ownership’. See US Patent Application 20040199765 entitled System and method for providing personal control of access to confidential records over a public network in which access privileges include "reading, creating, modifying, annotating, and deleting." And it reasonably follows that this is one reason why personal health record initiatives like those of not just Dossia, but also Microsoft’s HealthVault™ and Google Health™, are not tipping the balance. For Microsoft and Google another reason is that they so far have not been able to think themselves out of the silos of the current privacy paradigm. The Ownership Web is highly disruptive of the prevailing privacy paradigm because it empowers individuals with direct control over their radically standardized, immutable data.

World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. W3C is headed by Sir Tim Berners-Lee, creator of the first web browser and the primary originator of the Web specifications for URL, HTTP and HTML. These are the principal technologies that form the basis of the World Wide Web. W3C has further developed standards for XML, SPARQL, RDF, URI, OWL, and GRDDL with the intention of facilitating the Semantic Web. While Berners-Lee has described in his own words (above) his perplexity about data ownership, nonetheless, the data object standards created by the W3C should be more than friendly to an Ownership Web employing object-oriented architecture. Surely, in Common Point Authoring™ will be found many of the ‘best of breed’ standards for an Ownership Web that is most complementary to the emerging Semantic Web.

EPCglobal is a private, standards-setting consortium governed by very large organizations like Cisco Systems, Wal-Mart, Hewlett-Packard, DHL, Dow Chemical Company, Lockheed Martin, Novartis Pharma AG, Johnson & Johnson, Sony Corporation and Procter & Gamble. EPCglobal is architecting essential, core services (see EPCglobal's Architectural Framework in the thumbnail to the right) for tracking physical products identified by unique electronic product codes (including RFID tags) across and within enterprise-scale, relational database systems controlled by large organizations.

Though it would be a natural extension to do so, EPCglobal has yet to envision providing its large organizations (and small businesses, individual supply chain participants and even consumers) with the ability to independently author, track, control and discover granularly identified informational products. See the blogged entry EPCglobal & Prescription Drug Tracking. It is not difficult to imagine that the Semantic Web, without a complementary Ownership Web, would frankly be abhorrent to EPCglobal and its member organizations. For the Semantic Web to have any reasonable chance of connecting itself into global product and service supply chains, it must work through the Ownership Web.

Ownership Web: Where It Will Begin

The Ownership Web will begin along complex product and service supply chains where information must be trustworthy, period. Statistical reliability is not enough. And, in fact, the Ownership Web is beginning to form along the most dysfunctional of information supply chains. But that's for discussion in later blogs, as the planks of the Ownership Web are nailed into place, one by one.


[This concludes Part II of a three part series. On to Part III.]

Tuesday
Jun102008

Cloud Computing: Billowing Toward Data Ownership - Part I

Let's begin with the definition of Cloud Computing as currently found in Wikipedia, the free encyclopedia -

"The term ... derives from the common depiction in most technology architecture diagrams, of the Internet ... using an illustration of a cloud. The computing resources being accessed are typically owned and operated by a third-party provider on a consolidated basis in data center locations. Target consumers are not concerned with the underlying technologies used to achieve the increase in server capability, [the availability of which] is sold simply as a service available on demand."

As reported in Data Center Knowledge, Microsoft Chairman Bill Gates recently spoke about the boom in data center growth -

"The shift of services to the Cloud is getting us to think about data centers on a scale we never have before .... When you think about design, you can be very radical and come up with some huge improvements as you design for this kind of scale ...."

As reported in The Economist there is indeed a boom in the data center industry -

"Data centers are essential to nearly every industry and have become as vital to the functioning of society as power stations are .... America alone has more than 7,000 data centers .... And each is housing ever more servers, the powerful computers that crunch and dish up data .... Google is said to operate a global network of about three dozen data centers with ... more than 1 million servers. To catch up, Microsoft is investing billions of dollars and adding up to 20,000 servers a month .... In America the number of servers is expected to grow to 15.8 million by 2010 - three times as many as a decade earlier."

This boom is building the Cloud where software as a service (SaaS) will find the 'oxygen' it needs to survive and thrive. The expansion of the Cloud augurs well that distributed data within the Cloud will come to substitute to some extent - perhaps substantially so - for data distributed outside of the Cloud.

One resulting consequence will likely be that mobile technology becomes promoted not as a storage device but as a utilitarian tool for taking a sip of data as needed, when needed, from the moisture of the Cloud. Imagine the Cloud holding the data for each user's personal mobile technology. Imagine users traveling as 'digital nomads' without laptop computers, because they will stay connected to the Cloud through their internet-accessible mobile phones.

Jonathan Schwartz, CEO of Sun Microsystems, spent a week recording his life as one such digital nomad for The Economist.



Here's the takeaway quote (paraphrased) from this podcast - "In my travels I keep a pen and a BlackBerry. My assumption is that the network has become ubiquitous across the world. The network is more a utility for me than a destination." 

    Again, data distribution becomes more about distribution of data stored within the Cloud and less about distribution of data stored on mobile technology. Lost your internet-accessible mobile phone in Paris, France? Purchase another one when you land in San Francisco and re-connect to the Cloud (and your address book, scheduling calendar, etc.) without missing a beat.

    Another likely consequence of the Cloud is that as people and businesses consider moving their computer storage and services into the Cloud, their direct technological control of information becomes more and more of a competitive driver. The buzz created by Dataportability.org is the early evidence of this driver. See Portability, Traceability and Data Ownership (Part I).

    But mere portability is not enough. It only whets the appetite of people and business owners for more technological control - not just legislated or contractual privacy protections. See, e.g., Personal Health Records, Data Portability and the Continuing Privacy Paradigm. That is, the Cloud significantly increases the opportunities for privacy friendly technologies, including data ownership technologies.

    "The [online] company that ... figures out ways of ... [technologically] building into [its] compliance systems ... [privacy] compliance mechanisms ... will be putting itself at a tremendous competitive advantage for attracting the services to operate in [the cloud computing environment]." Quoting Reidenberg in Computing in the Cloud: Possession and ownership of data.

    Steve Inskeep of NPR's Morning Edition recently talked with Craig Balding, an information technology security expert for a Fortune 500 company, about Cloud Computing. Here's the takeaway exchange from the 3m 30s audio clip 'Cloud Computing' Puts Computer Resources on Tap -

    Inskeep (2m25s): "Is somebody who runs a business, who used to have a filing cabinet in a filing room, and then had computer files and computer databases, really going to be able or want to take the risk of shipping all their files out to some random computer they don't even know where it is and paying to rent storage that way?"

    Balding: "Yes, that's really a key question, even though these are reputable companies ... there's going to be a whole ecosystem that builds up around [Cloud Computing] ... of smaller companies that will offer additional services on top of [Cloud Computing's] basic services .... so what I've done is [I've] actually started up a blog [called] cloudsecurity.org and what I'm trying to do is to get the various cloud providers to come and have a discussion about what security they are doing ..."

    If you are an IT security expert, I would encourage you to take a moment to familiarize yourself with Mr. Balding's Cloud Security blog. Security is an absolute essential for the Cloud, as it has been for databases of any size since they were first engineered.

    But security ... is not enough.

    As Bruce Schneier has written in Secrets And Lies - Digital Security In A Networked World -

    "The average person can not tell good security from bad security... the world is filled with specialties that are critical to public safety and security, and yet are beyond the comprehension of the general population... Commerce works the same way. When was the last time you personally checked the accuracy of a gas station's pumps, or a taxicab's meter, or the weight and volume information on packaged foods?" [emphasis added]

    Schneier's language parallels the central theme of Banking on Granular Information Ownership -

    "People are comfortable and familiar with monetary banks. That’s a good thing because without people willingly depositing their money into banks, there would be no banking system as we know it .... [By comparison, we] live in a world that is at once awash in on-demand information courtesy of the Internet, and at the same time the Internet is strangely impotent when it comes to information ownership ....

    In many respects the Internet is like the Wild West because there is no information web similar to our monetary banking system. No similar integrated system exists for precisely and efficiently delivering our medical records to a new physician, or for providing access to a health history of the specific animal slaughtered for that purchased steak. Nothing out there compares with how the banking system facilitates gasoline purchases."

    Banks meet the expectations of their customers by providing them with security, yes, but also credibility, compensation, control, convenience, integration and verification. It is the dynamic combination of these that instills in customers the confidence that they continue to own their money, even while it is in the hands of a third-party bank.


    No, security is not sufficient by itself to compel the hypothetical business owner, whom Inskeep was referencing, to take the risk of putting his or her information into the Cloud.

    As blogged in Portability, Traceability and Data Ownership (Part IV), nobody has done a better job of describing why data ownership matters to the use and effectiveness of big databases than Marshall Van Alstyne. And I continue to be charmed with the 1994 publication he co-authored entitled, Why Not One Big Database? Ownership Principles for Database Design.

    You might have seen this before, but here’s my favorite quote from Van Alstyne's paper -

    “The fundamental point of this research is that ownership matters. Any group that provides data to other parts of an organization requires compensation for being the source of that data. When it is impossible to provide an explicit contract that rewards those who create and maintain data, "ownership" will be the best way to provide incentives. Otherwise, and despite the best available technology, an organization has not chosen its best incentives and the subtle intangible costs of low effort will appear as distorted, missing, or unusable data.” (emphasis added)

    I know I am in effect bootstrapping Van Alstyne's research results.

    I am taking liberties by stretching his research from big organizational databases to cover the Cloud. I recognize that the Cloud is already of a scale that is astronomically larger than anything Van Alstyne, in his mid-1990s research, could have imagined it would become today.

    But when I read Van Alstyne's paper there is an insistent voice inside of me that says "data ownership matters to the Cloud for the same reasons it matters to big, organizational databases." 



    [This concludes Part I of a three part series. On to Part II.]