About this Blog

As enterprise supply chains and consumer demand chains have become globalized, they continue to inefficiently share information “one-up/one-down”. Profound "bullwhip effects" in the chains leave managers scrambling with inventory shortages and consumers struggling to understand product recalls, especially food safety recalls. Add to this the increasing usage of personal mobile devices by managers and consumers seeking real-time information about products, materials and ingredient sources. The popularity of mobile devices with consumers is inexorably tugging at enterprise IT departments to shift to apps and services. But both consumer and enterprise data is a proprietary asset that must be selectively shared to be efficiently shared.

About Steve Holcombe

Unless otherwise noted, all content on this company blog site is authored by Steve Holcombe as President & CEO of Pardalis, Inc. More profile information is available on Steve Holcombe's LinkedIn profile.

Entries in Object-orientation (5)

Thursday, April 26, 2012

The Tipping Point has Arrived: Trust and Provenance in Web Communications

By Steve Holcombe (@steve_holcombe) and Clive Boulton (@iC)

"The Web was originally conceived as a tool for researchers who trusted one another implicitly. We have been living with the consequences ever since." Sir Tim Berners-Lee

"One of the issues of social networking silos is that they have the data and I don't … There are no programmes that I can run on my computer which allow me to use all the data in each of the social networking systems that I use plus all the data in my calendar plus in my running map site, plus the data in my little fitness gadget and so on to really provide an excellent support to me." Sir Tim Berners-Lee.

The tipping point has arrived for trust and provenance in web communications. And it is not just because Tim Berners-Lee thinks it is a good idea. The control of immutable data in the Cloud by content providers is on the verge of moving out of research projects and into commercial platforms. The most visible, first-mover example known to us is provided by the Wikidata Project.

The rapidly emerging Wikidata Project, the next iteration of Wikipedia, will in its first phase (to be finished within the next 6 months) implement the deposit by content providers of data elements (e.g., someone's birth date) at a single, fixed location. Phase 2 (targeted to be completed by the end of 2012) will build on those deposits to support the semantic relationships (i.e., ontologies) that Wikipedia users are seeking. Paul Allen's Institute for Artificial Intelligence and Google are two of the three primary benefactors of the Wikidata Project. And it is no surprise that the base of operations for this ground-breaking work is in Germany. The European Commission proposed in January 2012 a comprehensive reform of data protection rules to strengthen online privacy rights and boost Europe's digital economy.

This blog site exists to discuss whole chain communications between enterprises and consumers. Along that line the Wikipedia folks aren't really thinking about the Wikidata Project in terms of supply chains. But that is what they are backing into. Daniel Matuschek (@matuschd) would seem to agree in his blog post, Wikidata - some expectations. Here's an excerpt:

"Some ideas for open databases that could make our live easier or better [include] Product data: Almost every product has an EAN code. There are some companies building and selling databases for specific products (e.g. food, DVDs), sometimes generated with community support .... The Wikidata project is currently not addressing [this kind of database], but if a platform is available, there’s a good chance that users start creating databases like this."

And granular permissions (in the hands of content providers) over individual data elements are on Wikipedia's wish list to be introduced later this year during Phase 2:

  • O2.5. Add a more fine granular approach towards protecting single facts instead of merely the whole entity.
  • O2.6. Export trust and provenance information about the facts in Wikidata. Since the relevant standards are not defined yet, this should be done by closely monitoring the W3C Provenance WG.

We suspect that as the Wikidata Project begins to provide "trust and provenance" in its form of web communications, they will not just be granularizing single facts but also immutabilizing the data elements to which those facts are linked so that even the content providers of those data elements cannot change them. This is critical for trust and provenance in whole chain communications between supply chain participants who have never directly interacted.

What are the other signs of the "tipping point"?

Another sign is the shift to forecasting demand certainty directly from a consumer interest graph. Walmart purchased Kosmix in 2011 to push into social commerce and to integrate products with social identity. This is an important new way to give shoppers information, and get information from them. Analysts at the research firm Booz and Company said in a 2010 report:

“Social media, or places where people congregate to share information and mutual understanding, are replacing broadcast media as the primary way many people learn about products and services.”

"Doc" Searls, co-author of The Cluetrain Manifesto, and a former Fellow of the Berkman Center for Internet & Society at Harvard University, calls this a shift to the Intention Economy, Where Consumers Take Charge. Here is an excerpt from his May, 2012 publication:

Today, Walmart, Tesco and other global grocers have to wait for the checkout register to record a sale and pass the product sale information through a network of EDI processing to reforecast demand. Imagine the improvements when Walmart can see supply chain intent before the sale. By contrast, the FT calls Tesco tired.

Indeed, Keith Teare on TechCrunch posits that Facebook's purchase of Instagram (and Google's falling earnings) signals the end of the Web 2.0 era. In the Web 2.0 era we consumed services on a web browser monetized by display ads. Now we are moving to a mobile app-centric world without desktop display ads. This is fertile ground for a shift into sharing at the identity and granular detail level via trust and provenance.

Does the Instagram purchase signal that Facebook will become a "trusted site" for granular information saved and shared in immutable objects? Facebook has to aggregate more and more data to build better services and make its post-IPO numbers. Will Facebook services come to provide W3C-type trust and provenance? We will see. But it is interesting to imagine that the Wikidata Project will be a "tipping point" for Facebook and other Web 2.0 providers toward granular trust and provenance in the Cloud.

 

What do you think? Share your conclusions and opinions by joining us at @WholeChainCom on LinkedIn at http://tinyurl.com/WholeChainCom.

Friday, July 31, 2009

Ars Technica: Inside the AP's plan to "wrap" its content

The following was published by Nate Anderson for Ars Technica on 28 July 2009:

The Associated Press last week rolled out its brave new plan to "apply protective format to news." The AP's news registry will "tag and track all AP content online to assure compliance with terms of use," and it will provide a "platform for protect, point, and pay." That's a lot of "p"-prefaced jargon, but it boils down to a sort of DRM for news—"enforcement," in AP-speak.

For the complete article go to DRM for news? Inside the AP's plan to "wrap" its content. And see in particular the technical specifications for hNews, an extension of hAtom.

For a comparable publication, see Author-Level Digital Rights Management and the Common Point Authoring System: Protecting Information Exchange.

Thursday, September 25, 2008

Laying the First Plank of a Supply Chain Ownership Web in North Dakota

My brother, Scot, and I traveled to North Dakota last week for meetings in Dickinson, N.D. (in southwest North Dakota) and Fargo, N.D. (southeastern North Dakota on the border with sister city, Moorhead, MN). Leaving Oklahoma, we traveled north through Kansas and spent the night in North Platte, Nebraska. The next day took us by the Black Hills of South Dakota and Mt. Rushmore. If you have never been there, it's definitely worth the stop. I hadn't realized that the four Presidents gaze out of the Black Hills toward the east and a vast sea of prairie. We also drove on some beautiful, shoulderless 'blue' highways like Highway 85 between Belle Fourche, S.D. and Belfield, N.D. If you love the movie Dances with Wolves, you'll love this stretch of scenery. Lots and lots of pronghorn antelope, too. And Redig, S.D. really is one of those 'towns' with one house sitting on a rail-straight road stretching endlessly into the distance. No kidding.

But, I digress.

For those of you who think you are unfamiliar with complex supply chains, allow me to jog your memory because you actually know more than you think you do. Spinach. Lead-painted toys. Mad cows. Tomatoes. Jalapeno peppers. Hamburger. What do they all have in common? They are products that originate at the frayed ends of lengthy (even international) supply chains beset by many, many fears related to information sharing. And they are products that have been deemed poisonous (lead paint) or unhealthy (contaminated with E. coli, the prions that apparently cause BSE, or salmonella). Actually, the tomato industry got hammered this summer and they weren't even at fault. But take a look at some of the wonderful, free advertising the tomato industry received before the FDA called off the dogs.

What is the value of immediately accessible, credible, supply chain information? If it incontrovertibly points to you and your business as the culprit in a food disease crisis, for instance, then, yes, you are limiting your options. But if your company uses best practices in its crop management and limits its risks in advance, the value of credible information at your fingertips in a disease crisis is to immediately distinguish your company from (a) the actual culprits, and (b) all other companies who perform best crop management practices just like you but can't provide credible information for months. In fact the damage is not measured in months but in hours. Unfortunately, within hours the damage to reputation has been seeded into the minds of wholesale buyers and retail consumers. And without your ability to immediately provide exonerating information, the government regulators are going to 'play it safe' and cast a broad net that unfortunately ropes in a lot of innocent parties.

Information is like a sword. Unfortunately, if you don't firmly grab the sword and make it cut for you, in a crisis the sword will be out of your hands and you will potentially be sliced to death in the name of 'public health'.

The Dickinson Research Extension Center of North Dakota State University (NDSU DREC) is the first land grant extension center, and perhaps still the only one, to operate a beef livestock age and source verification program sanctioned by the USDA. It's called the CalfAID USDA PVP (i.e., process verified program employing RFID ear tags) and it's managed by NDSU DREC for the real cattlemen of the North Dakota Beef Cattle Improvement Association. The Agro-Security Resource Center at Dickinson State University also makes a significant contribution. It's partly research driven (with Congressional funding) and partly market-driven. It's market-driven in that those real livestock producers pay a fee per animal out of their pockets with the expectation that they will receive greater dollars (i.e., premiums) later on that the market pays for credible information about those calves. The CalfAID PVP exists to keep the calves connected with their age (i.e., birthdate) and source (i.e., origin) as each calf winds its way along an otherwise 'information dysfunctional' supply chain.

How dysfunctional is the information sharing? The U.S. has a national herd of about 100 million cattle. There are about a million cattle operations of one sort or the other. The vast majority of calf producers don't know where their calves eventually end up being slaughtered. Most packers don't know on what ranch or farm the animals they slaughter originated. It's pretty much the same as it was in the 19th century. Most products (i.e., the livestock) are pushed one step at a time as 'as is' commodities along a supply chain in which each segment only sees one step back, and one step forward. It's kind of like standing in a bucket line helping to pass along that bucket of water to put out a fire. You know who is passing you the bucket, and you know to whom you are passing it on. But in the beef industry, chances are you don't know where the fire is or even where the water is coming from.

In order to set the stage for scaling out from tracking thousands of cattle to tracking potentially hundreds of thousands or even millions of cattle, NDSU DREC has adopted a web service for their supply chain that empowers livestock producers to do with their cattle data what the following jazzy video envisions for social networks.


DataPortability - Connect, Control, Share, Remix from Smashcut Media on Vimeo

The web service, patented and engineered by Pardalis, is called a 'data bank' and it's coded in .NET with SQL Server architecture on the back-end (though I would be very interested to see an open source, adjacent Linux system similarly funded and architected from Pardalis' IP as a data bank for the social networking space).


Common Point Authoring Model
So how exactly does the 'data bank' work? To the right is the information model for the Common Point Authoring system (CPA) - that's the name that Pardalis has used in its most recent patents. You can also compare this image with other views, images and information about the CPA system to be found elsewhere in this blog site. Within the CPA system, data cannot be changed once set (i.e., registered) so that the data can be used for verification and certification. Or, put another way, Pardalis has transformed the traditional application of immutable objects beyond run-time efficiencies, and empowered end-users with tools for granularly authoring, registering, controlling, and sharing these immutable objects.

The end-users 'own' and directly control sharing rights over what they author and register (or automatically collect and register), they just can't change it once it's authored. It becomes a part of a permanent, trustworthy record of the bank albeit controlled by the author. Other data bank account holders who receive any information from another data bank account holder know that. And they can remix it with their own data, and further share it, permission being granted to do so by the original author. This all helps build confidence, data credibility and, especially, trusted communication where it did not exist before. It provides a means for supply chain participants to reap benefits not just from their traditional products, but now also from their informational products. And, yes, there's no free lunch. The government might very well be able to subpoena those electronic records in their quest to protect the public's health. But they do the same with traditional monetary banks, too, don't they?
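For readers who think in code, here is a minimal Python sketch of the register-once, share-by-permission behavior described above. It is not Pardalis' implementation (which, as noted, is built on .NET and SQL Server). The class names `RegisteredObject` and `DataBank`, the `may_reshare` flag, and the sample calf record are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisteredObject:
    """An authored record that cannot be changed after registration."""
    object_id: str
    author: str
    data: tuple  # (name, value) pairs; a tuple so the contents stay fixed


class DataBank:
    """Hypothetical data bank: register once, then share by permission."""

    def __init__(self):
        self._objects = {}  # object_id -> RegisteredObject
        self._grants = {}   # object_id -> {account holder: may_reshare}

    def register(self, object_id, author, data):
        if object_id in self._objects:
            raise ValueError(f"{object_id} is already registered")
        obj = RegisteredObject(object_id, author, tuple(sorted(data.items())))
        self._objects[object_id] = obj
        self._grants[object_id] = {}
        return obj

    def share(self, object_id, sender, recipient, may_reshare=False):
        obj = self._objects[object_id]
        grants = self._grants[object_id]
        # Only the author, or a holder who was granted reshare rights, may share.
        if sender != obj.author and not grants.get(sender, False):
            raise PermissionError(f"{sender} may not share {object_id}")
        grants[recipient] = may_reshare

    def read(self, object_id, reader):
        obj = self._objects[object_id]
        if reader != obj.author and reader not in self._grants[object_id]:
            raise PermissionError(f"{reader} has no access to {object_id}")
        return dict(obj.data)


bank = DataBank()
calf = bank.register("calf-001", "rancher",
                     {"birthdate": "2008-03-14", "source": "Dickinson, N.D."})
bank.share("calf-001", "rancher", "feedlot")  # feedlot may read, not reshare
print(bank.read("calf-001", "feedlot"))
```

In this sketch the feedlot can read the record but cannot pass it on, and not even the rancher can alter `calf` after registration, because the dataclass is frozen. A real data bank would of course add authentication, persistence and auditing on top of this.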

Now there are a number of technological ways to accomplish the same thing. It's just that Pardalis' object-oriented approach provides certain long term advantages in terms of scalability, efficiency and granularity in 'the Cloud' that match up extremely well to an emerging Semantic Web. And you don't have to take my word for it. See, for example, the blogged entries, Efficient monitoring of objects in object-oriented database system which interacts cooperatively with client programs and Advantages of object oriented databases over relational databases. And Pardalis’ granular information banking system provides a substantial head-start in the race toward the standardization of a metadata platform for what I call an Ownership Web.

Online encyclopedias like Metaweb's Freebase Parallax are beginning to roll out tools for semantic search and semantic visualizations of publicly accessible information. See the nifty video clip in Freebase Parallax and the Ownership Web. Others like Google, Yahoo!, and Wikipedia will follow. The intrinsic value for connecting these search engines and encyclopedias with the Ownership Web will be the opportunity to likewise empower their authenticated end-users with the same semantic tools for accessing information that people consider to be their identity, that participants to complex supply chains consider to be confidential, and that governments classify as secret.

But, again, I digress.

Here's a film clip demonstrating the authoring and portability of immutable data objects along the beef livestock supply chain. The interface is neither sexy nor jazzy. But it is effective. This type of look and feel makes sense for the beef livestock supply chain as Microsoft Excel is familiar to a large percentage of cattle producers (at least the ones who have moved on from pencil and paper). Currently, there's no audio because, frankly, I've provided the audio 'live' when called upon to do so. If you, too, would like a verbal walk-through, drop me an e-mail. Or, in the alternative, I've scripted a written walk-through that you can download, print and follow as the clip runs its course. If you want to see a full screen version, click on the hyperlinked text below the graphic to take you to the Vimeo website.


NDSU CalfAID Data Bank Demo from Steve Holcombe on Vimeo.

In the coming months the data bank will be used not just to track the data uploaded and ported by CalfAID members, but also for helping to keep data connected with the animals from other age and source programs, and probably even for COOL compliance, too.

Once again, there's way more to the data bank than its application to the beef industry. As Dr. Kris Ringwall, Director of NDSU DREC, said in Fargo to a large vegetable growing company during a live demonstration of the data bank, "whether it's an animal or a vegetable, it's a product with a pedigree".

Well, that may be more of a paraphrase than a quote, but I know that Kris in this Presidential campaign season would nonetheless 'approve this message'.

Thursday, September 4, 2008

Freebase Parallax and the Ownership Web

What's Right About the Semantic Web

What’s right about the Semantic Web is that its most highly funded visionaries have envisioned beyond a Web of documents to a ‘Data Web’. Here's an example: a Web of scalably integrated data employing object registries envisioned by Metaweb Technologies’ Danny Hillis and manifested in Freebase Parallax™, a competitive platform and application to both Google and Wikipedia.

[Image: Aristotle]
Metaweb Technologies is a San Francisco start-up developing and patenting technology for a semantic ‘Knowledge Web’ marketed as Freebase Parallax. Philosophically, Freebase Parallax is a substitute for a great tutor, like Aristotle was for Alexander. Using Freebase Parallax, users do not modify existing web documents but instead annotate them. The annotations of Amazon.com are the closest example, but Freebase Parallax further links the annotations so that the documents are more understandable and more findable. Annotations are also modifiable by their authors as better information becomes available to them. Metaweb characterizes its service as an open, collaboratively-edited database (like Wikipedia, the free encyclopedia) of cross-linked data but, as you will see in the video below, it is really very much a next generation competitor to both Google and Wikipedia.

The Intellectual Property Behind Freebase Parallax

Click on the thumbnail image to the left and you will see in more detail what Hillis envisions. That is, a database represented as a labeled graph, where data objects are connected by labeled links to each other and to concept nodes. For example, a concept node for a particular category contains two subcategories that are linked via labeled links "belongs-to" and "related-to" with text and picture. An entity comprises another concept that is linked via labeled links "refers-to," "picture-of," "associated-with," and "describes" with Web page, picture, audio clip, and data. For further information about this intellectual property - entitled Knowledge Web - see the blogged entry US Patent App 20050086188: Knowledge Web (Hillis, Daniel W. et al).
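To make the idea of a labeled graph concrete, here is a small Python sketch. It is only an illustration of the general data structure, not the design claimed in the Knowledge Web application. The link labels come from the description above, while the `LabeledGraph` class, the node names and the direction of each link are my own assumptions.

```python
class LabeledGraph:
    """Directed graph whose edges carry a text label."""

    def __init__(self):
        self._edges = []  # (source, label, target) triples

    def link(self, source, label, target):
        self._edges.append((source, label, target))

    def targets(self, source, label):
        """Nodes reached from `source` over edges with this label."""
        return [t for s, lbl, t in self._edges if s == source and lbl == label]

    def sources(self, target, label):
        """Nodes that point at `target` over edges with this label."""
        return [s for s, lbl, t in self._edges if t == target and lbl == label]


graph = LabeledGraph()
# Node names and edge directions below are assumptions for illustration.
graph.link("Aristotle", "belongs-to", "Philosophers")
graph.link("Aristotle", "related-to", "Alexander the Great")
graph.link("web page", "refers-to", "Aristotle")
graph.link("portrait.jpg", "picture-of", "Aristotle")
graph.link("lecture.mp3", "associated-with", "Aristotle")
graph.link("biography data", "describes", "Aristotle")

print(graph.sources("Aristotle", "picture-of"))
print(graph.targets("Aristotle", "belongs-to"))
```

Storing plain triples keeps the sketch short. A production system would index the edges by node and label rather than scanning a list.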

Freebase Parallax Incarnate

In the following video let's look at how this intellectual property for Knowledge Web is actually being engineered and applied by Metaweb Technologies in the form of Freebase Parallax.


Freebase Parallax: A new way to browse and explore data from David Huynh on Vimeo.

The Semantic Web's Achilles Heel

You can hear it in the video. What Hillis and Metaweb Technologies well recognize is that as Freebase Parallax strives to become the premier knowledge source for the Web, it will need access to new, blue oceans of data. It must find a gateway into the closely-held, confidential and classified information that people consider to be their identity, that participants to complex supply chains consider to be confidential, and that governments classify as secret. That means that data ownership must be entered into the equation for the success of Freebase Parallax and the emerging Semantic Web in general.

Not that Hillis hasn't thought about data ownership. He has. You can see it in an interview conducted by his patent attorney and filed on December 21, 2001 in the provisional USPTO Patent Application 60/343,273:

Danny Hillis: "Here's another idea that's super simple. I've never seen it done. Maybe it's too simple. Let's go back to the terrorist version [of Knowledge Web]. There's a particular problem in the terrorist version that the information is, of course, highly classified .... Different people have some different needs to know about it and so on. What would be nice is if you ... asked for a piece of information. That you [want access to an] annotation that you know exists .... Let's say I've got a summary [of the annotation] that said,  'Osama bin Laden is traveling to Italy.' I'd like to know how do you know that. That's classified. Maybe I really have legitimate reasons for that. So what I'd like to do, is if I follow a link that I know exists to a classified thing, I'd like the same system that does that to automatically help me with the process of getting the clearance to access that material." [emphasis added]

What Hillis was tapping into just a few months after 9/11 is just as relevant to today's information sharing needs.

But bouncing around ideas about how we need data ownership is not the same as developing methods or designs to solve it. What Hillis non-provisionally filed, subsequent to his provisional application, was the Knowledge Web application. Because of its emphasis upon the statistical reliability of annotations, Knowledge Web's IP is tailor-made for the Semantic Web. But it is not designed for data ownership.

The Ownership Web

For the Semantic Web to reach its full potential, it must have access to more than just publicly available data sources. Only with the empowerment of technological data ownership in the hands of people, businesses, and governments will the Semantic Web make contact with a horizon of new, ‘blue ocean’ data.

Conceptually, the Ownership Web would be separate from the Semantic Web, though semantically connected as a layer of distributed, enterprise-class web platforms residing in the Cloud.

[Figure: Ownership Web]

The Ownership Web would contain diverse registries of uniquely identified data elements for the direct authoring, and further registration, of uniquely identified data objects. Using these platforms people, businesses and governments would directly host the authoring, publication, sharing, control and tracking of the movement of their data objects.

The technological construct best suited for the dynamic of networked efficiency, scalability, granularity and trustworthy ownership is the data object in the form of an immutable, granularly identified, ‘informational’ object.

A marketing construct well suited to relying upon the trustworthiness of immutable, informational objects would be the 'data bank'.

Data Banking

Traditional monetary banks meet the expectations of real people and real businesses in the real world.

People are comfortable and familiar with monetary banks. That’s a good thing because without people willingly depositing their money into banks, there would be no banking system as we know it. By comparison, we live in a world that is at once awash in on-demand information courtesy of the Internet, and at the same time the Internet is strangely impotent when it comes to information ownership.

In many respects the Internet is like the Wild West because there is no information web similar to our monetary banking system. No similar integrated system exists for precisely and efficiently delivering our medical records to a new physician, or for providing access to a health history of the specific animal slaughtered for that purchased steak. Nothing out there compares with how the banking system facilitates gasoline purchases.

If an analogy to the Wild West is apropos, then it is interesting to reflect upon the history of a bank like Wells Fargo, formed in 1852 in response to the California gold rush. Wells Fargo wasn’t just a monetary bank, it was also an express delivery company of its time for transporting gold, mail and valuables across the Wild West. While we are now accustomed to next morning, overnight delivery between the coasts, Wells Fargo captured the imagination of the nation by connecting San Francisco and the East coast with its Pony Express. As further described in Banking on Granular Information Ownership, today’s Web needs data banks that do for the on-going gold rush on information what Wells Fargo did for the Forty-niners.

Banks meet the expectations of their customers by providing them with security, yes, but also credibility, compensation, control, convenience, integration and verification. It is the dynamic, transactional combination of these that instills in customers the confidence that they continue to own their money even while it is in the hands of a third-party bank.

A data bank must do no less.

Ownership Web: What's Philosophically Needed

Where exactly is the sweet spot of data ownership?

In truth, it will probably vary depending upon what kind of data bank we are talking about. Data ownership will be one thing for personal health records, another for product supply chains, and yet another for government classified information. And that's just for starters because there will no doubt be niches within niches, each with their own interpretation of data ownership. But the philosophical essence of the Ownership Web that will cut across all of these data banks will be this:

  • That information must be treated as a tangible, commercial product, as banked, traceable money, or as both.

The trustworthiness of information is crucial. Users will not be drawn to data banks if the information they author, store, publish and access can be modified. That means that even the authors themselves must be proscribed from modifying their information once registered with the data bank. Their information must take on the immutable characteristic of tangible, traceable property. While the Semantic Web is about the statistical reliability of data, the Ownership Web is about the reliability of data, period.

Ownership Web: What's Technologically Needed

What is technologically required is a flexible, integrated architectural framework for information object authoring and distribution. One that easily adjusts to the definition of data ownership as it is variously defined by the data banks serving each social network, information supply chain, and product supply chain. Users will interface with one or more ‘data banks’ employing this architectural framework. But the lowest common denominator will be the trusted, immutable informational objects that are authored and, where the definition of data ownership permits, controllable and traceable by each data owner one-step, two-steps, three-steps, etc. after the initial share.
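As a rough sketch of what "traceable one-step, two-steps, three-steps" after the initial share could look like, the Python below keeps a log of share events and computes how many shares away each holder is from the original owner. The `ShareLog` class, its method names and the participants are hypothetical. The patented system's actual tracking database is not described here in enough detail to reproduce.

```python
class ShareLog:
    """Append-only log of who shared which object with whom."""

    def __init__(self):
        self._events = []  # (object_id, sender, recipient)

    def record(self, object_id, sender, recipient):
        self._events.append((object_id, sender, recipient))

    def hops(self, object_id, owner):
        """Map each holder to its distance, in shares, from the owner."""
        distance = {owner: 0}
        frontier = [owner]
        # Breadth-first walk outward from the owner along recorded shares.
        while frontier:
            next_frontier = []
            for holder in frontier:
                for oid, sender, recipient in self._events:
                    if oid == object_id and sender == holder and recipient not in distance:
                        distance[recipient] = distance[holder] + 1
                        next_frontier.append(recipient)
            frontier = next_frontier
        del distance[owner]
        return distance


log = ShareLog()
log.record("calf-001", "rancher", "feedlot")
log.record("calf-001", "feedlot", "packer")
log.record("calf-001", "packer", "retailer")
log.record("calf-002", "rancher", "veterinarian")

print(log.hops("calf-001", "rancher"))
```

With a log like this the rancher can see that the record for calf-001 reached the packer two shares out and the retailer three shares out, which is exactly the visibility that a one-up/one-down chain lacks.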

Click on the thumbnail to the left for the key architectural features for such a data bank. They include a common registry of standardized data elements, a registry of immutable informational objects, a tracking/billing database and, of course, a membership database. This is the architecture for what may be called a Common Point Authoring™ system. Again, where the definition of data ownership permits, users will host their own 'accounts' within a data bank, and serve as their own 'network administrators'. What is made possible by this architectural design is a distributed Cloud of systems (i.e., data banks). The overall implementation would be based upon a massive number of user interfaces (via APIs, web browsers, etc.) interacting via the Internet between a large number of data banks overseeing their respective enterprise-class, object-oriented database systems.

Click on the thumbnail to the right for an example of an informational object and its contents as authored, registered, distributed and maintained with data bank services. Each comprises a unique identifier that designates the informational object, as well as one or more data elements (including personal identification), each of which itself is identified by a corresponding unique identifier. The informational object will also contain other data, such as ontological formatting data, permissions data, and metadata. The actual data elements that are associated with a registered (and therefore immutable) informational object would be typically stored in the Registered Data Element Database (look back at 124 in the preceding thumbnail). That is, the informational object and the actual data elements are linked via the use of pointers, which comprise the data element unique identifiers or URIs. Granular portability is built in. For more information see the blogged entry US Patent 6,671,696: Informational object authoring and distribution system (Pardalis Inc.).
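The pointer arrangement can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not the structure set out in the patent: `ElementRegistry` stands in for the registered data element store, `InformationalObject` holds only identifiers, and the `urn:uuid:` identifier scheme, the function names and the sample calf data are all hypothetical.

```python
import uuid
from dataclasses import dataclass


def new_id():
    """Mint a unique identifier in URN form (the scheme is an assumption)."""
    return f"urn:uuid:{uuid.uuid4()}"


class ElementRegistry:
    """Stand-in for a registered data element store."""

    def __init__(self):
        self._elements = {}  # element_id -> (name, value)

    def register(self, name, value):
        element_id = new_id()
        self._elements[element_id] = (name, value)
        return element_id

    def resolve(self, element_id):
        return self._elements[element_id]


@dataclass(frozen=True)
class InformationalObject:
    object_id: str
    element_ids: tuple  # pointers into the element registry, not the values
    permissions: tuple = ()
    metadata: tuple = ()


def author_object(registry, elements, permissions=(), metadata=()):
    """Register each data element, then build an object that points at them."""
    pointers = tuple(registry.register(n, v) for n, v in elements.items())
    return InformationalObject(new_id(), pointers,
                               tuple(permissions), tuple(metadata))


def materialize(obj, registry, wanted=None):
    """Follow the pointers; `wanted` optionally limits which elements travel."""
    resolved = dict(registry.resolve(eid) for eid in obj.element_ids)
    if wanted is None:
        return resolved
    return {name: value for name, value in resolved.items() if name in wanted}


registry = ElementRegistry()
calf = author_object(registry, {"birthdate": "2008-03-14",
                                "source": "Dickinson, N.D.",
                                "tag": "RFID-0001"})
print(materialize(calf, registry, wanted={"birthdate"}))
```

Because the object carries identifiers rather than values, a single element (here the birthdate) can be released without exposing the others, which is one way to read "granular portability".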

The Beginning of the Ownership Web

Common Point Authoring is going live this fall in the form of a data bank for cattle producers in the upper plains. Why the livestock industry? Because well-followed commentators like Dr. Kris Ringwall, Director of the Dickinson Research Extension Center for North Dakota State University, recognize that there are now two distinct products being produced along our nation's complex agricultural supply chains: (1) a traditional product, and (2) an informational product describing the pedigree of the traditional product.

The following excerpt is from a BeefTalk article, Do We Exist Only If Someone Else Knows We Exist?, recently authored by Dr. Ringwall.

"The concept of data collection is knocking on the door of the beef industry, but the concept is not registering. In fact, there actually is a fairly large disconnect.

This is ironic because most, if not all, beef producers pride themselves on their understanding of the skills needed to master the production of beef. Today, there is another player simply called “data.”

The information associated with individual cattle is critical. Producers need to understand how livestock production is viewed ....

That distinction is not being made and the ramifications are lost revenue in the actual value of the calf and lost future opportunity. This is critical for the future of the beef business ...."

Ownership Web: Where It Will Begin

The Ownership Web will begin along complex product and service supply chains where information must be trustworthy, period. Statistical reliability is not enough. And, as I mentioned above, the Ownership Web will begin this fall along an agricultural supply chain which is among the most challenging of supply chains when it comes to information ownership. Stay tuned as the planks of the Ownership Web are nailed into place, one by one.

Tuesday
Jul152008

Cloud Computing: Billowing Toward Data Ownership - Part II

[Return to Part I]

Cloud Computing's Achilles Heel

Death of Achilles, Peter Paul Rubens, 1630-1635
The boom in the data center industry is building the Cloud where the conventional wisdom is that the software services of the Semantic Web will thrive. The expansion of the Cloud is believed to augur well for distributed data within the Cloud coming to substitute to some extent - perhaps substantially so - for data currently distributed outside of the Cloud. But the boom is being built upon a privacy paradigm employed by online companies that allows them to use Web cookies for collecting a wide variety of information about individual usage of the Internet. The assumption that this paradigm will hold is the Cloud’s Achilles’ heel. It is an assumption that threatens to keep the Cloud from fully inflating beyond publicly available information sources.

I'm mulling over a more in-depth discussion of Web cookies for a final Part III to this multi-part series. In the meantime, the focus of today's blog is a more likely consequence of the Cloud: as people and businesses consider moving their computer storage and services into the Cloud, direct technological control of their information becomes more and more of a competitive driver. As blogged in Part I, the online company that figures out ways of building privacy mechanisms into its compliance systems will be putting itself at a tremendous competitive advantage for attracting services to operate in the Cloud. But puzzlement reigns as to how to connect the Cloud with new pools of data (mostly non-artistic) that are private, confidential and classified.

Semantic Web: What’s Right About the Vision

What’s right about the Semantic Web is that its most highly funded visionaries have envisioned beyond a Web of documents to a ‘Data Web’. Here are two examples: a Web of scalably integrated data employing object registries, envisioned by Metaweb Technologies’ Danny Hillis, and a Web of granularly linked, ontologically defined data objects, envisioned by Radar Networks’ Nova Spivack.

Click on the thumbnail image to the left and you will see in more detail what Hillis envisions. That is, a database represented as a labeled graph, where data objects are connected by labeled links to each other and to concept nodes. For example, a concept node for a particular category contains two subcategories that are linked via labeled links "belongs-to" and "related-to" with text and picture. An entity comprises another concept that is linked via labeled links "refers-to," "picture-of," "associated-with," and "describes" with Web page, picture, audio clip, and data. For further information, see the blogged entry US Patent App 20050086188: Knowledge Web (Hillis, Daniel W. et al).
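To make the labeled-graph idea a little more concrete, here is a minimal Python sketch. It is only an illustration of the general data structure described above, not Metaweb's implementation; the class name `LabeledGraph`, its methods, and every node name are invented for this example.

```python
from collections import defaultdict


class LabeledGraph:
    """Toy labeled graph: nodes connected by directed, labeled links."""

    def __init__(self):
        # source node -> list of (label, target node) pairs
        self._links = defaultdict(list)

    def link(self, source, label, target):
        """Record a labeled link from source to target."""
        self._links[source].append((label, target))

    def targets(self, source, label):
        """Nodes reachable from source over links carrying this label."""
        return [t for (lbl, t) in self._links.get(source, []) if lbl == label]

    def sources(self, target, label):
        """Nodes that point at target over links carrying this label."""
        return [s for s, links in self._links.items()
                for (lbl, t) in links if lbl == label and t == target]


graph = LabeledGraph()
# Data objects linked to an entity concept node (all names are made up).
graph.link("webpage:rubens-bio", "refers-to", "entity:Rubens")
graph.link("picture:death-of-achilles", "associated-with", "entity:Rubens")
graph.link("data:rubens-dates", "describes", "entity:Rubens")
# A concept node linked to a category node.
graph.link("entity:Rubens", "belongs-to", "category:Painters")

print(graph.sources("entity:Rubens", "refers-to"))
print(graph.targets("entity:Rubens", "belongs-to"))
```

The point of the sketch is simply that the meaning lives in the link labels: the same pair of nodes can be related in several different ways, and queries select by label.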

Click on the thumbnail image to the right and you will see in more detail what Spivack envisions. That is, a picture of a Data Record with an ID and fields connected in one direction to ontological definitions and in another direction to other similarly constructed data records with their own fields connected in one direction to ontological definitions, etc. These data records - or semantic web data - are nothing less than self-describing, structured data objects that are atomically (i.e., granularly) connected by URIs. For more information, see the blogged entry, US Patent App 20040158455: Methods and systems for managing entities in a computing device using semantic objects (Radar Networks).
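A rough Python sketch of such a self-describing record follows. It is a simplification for illustration only and does not reflect Radar Networks' actual design; the `DataRecord` class, the `follow` helper, and all `urn:example:` URIs are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class DataRecord:
    uri: str          # the record's own identifier
    definition: str   # URI of the ontological definition the record follows
    fields: dict = field(default_factory=dict)  # field name -> literal or URI


# A tiny in-memory store keyed by URI (every URI here is invented).
store = {}


def add(record):
    """Place a record in the store under its own URI."""
    store[record.uri] = record
    return record


add(DataRecord("urn:example:person/1", "urn:example:ontology/Person",
               {"name": "Ada", "employer": "urn:example:org/7"}))
add(DataRecord("urn:example:org/7", "urn:example:ontology/Organization",
               {"name": "Example Co."}))


def follow(record, field_name):
    """Return the linked record if the field holds a known URI, else the raw value."""
    value = record.fields[field_name]
    return store.get(value, value)


person = store["urn:example:person/1"]
employer = follow(person, "employer")
print(employer.definition)
print(follow(person, "name"))
```

Each record points one way to its definition and another way to other records, so a program can walk from record to record using nothing but URIs.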

Furthermore, Hillis and Spivack have studied the weaknesses of relational database architecture when applied to globally diverse users who are authoring, storing and sharing massive amounts of data, and they have correctly staked the future of their companies on object-oriented architecture. See, e.g., the blogged entries, Efficient monitoring of objects in object-oriented database system which interacts cooperatively with client programs and Advantages of object oriented databases over relational databases. They both define the Semantic Web as empowering people across the globe to collaborate toward the building of bigger, and more statistically reliable, observations about things, concepts and relationships.

Click on the thumbnail to the left for a screen shot of a visualization and interaction experiment produced by Moritz Stefaner for his 2007 master's thesis, Visual tools for the socio–semantic web. See the blogged entry, Elastic Tag Mapping and Data Ownership. Stefaner posits what Hillis and Spivack would no doubt agree with - that the explosive growth of possibilities for information access and publishing fundamentally changes our way of interaction with data, information and knowledge. There is a recognized acceleration of information diffusion, and an increasing process of granularizing information into micro–content. There is a shift towards larger and larger populations of people producing and sharing information, along with an increasing specialization of topics, interests and the according social niches. All of this appears to be leading to a massive growth of space within the Cloud for action, expression and attention available to every single individual.

Semantic Web: What’s Missing from the Vision

Continuing to use Hillis and Spivack as proxies, these two visionaries of the Semantic Web assume that data - all data - will be made available as an open source. Neither of them has a ready answer for the very simple question that Steve Inskeep asks above (in Part I of this multi-part blog entry).

Inskeep: "Is somebody who runs a business, who used to have a filing cabinet in a filing room, and then had computer files and computer databases, really going to be able or want to take the risk of shipping all their files out to some random computer they don't even know where it is and paying to rent storage that way?"

Sir Tim Berners-Lee, the widely recognized inventor of the Web, and Director of the W3C, is every bit as perplexed about data ownership. In Data Portability, Traceability and Data Ownership - Part IV I referenced a recent interview excerpt from March, 2008, initiated by interviewer Paul Miller of ZDNet, in which Berners-Lee does acknowledge data ownership fear factors.

Miller: “You talked a little bit about people's concerns … with loss of control or loss of credibility, or loss of visibility. Are those concerns justified or is it simply an outmoded way of looking at how you appear on the Web?”

Berners-Lee: “I think that both are true. In a way it is reasonable to worry in an organization … You own that data, you are worried that if it is exposed, people will start criticizing [you] ….

So, there are some organizations where if you do just sort of naively expose data, society doesn't work very well and you have to be careful to watch your backside. But, on the other hand, if that is the case, there is a problem. [T]he Semantic Web is about integration, it is like getting power when you use the data, it is giving people in the company the ability to do queries across the huge amounts of data the company has.

And if a company doesn't do that, then, it will be seriously disadvantaged competitively. If a company has got this feeling where people don't want other people in the company to know what is going on, then, it has already got a problem ….”

(emphasis added)

In other words, 'do the right thing', collegially share your data and everything will be OK. If only the real world worked that way, then Berners-Lee would be spot on. In the meantime, there is a ready answer.

Ownership Web

The ready answer is an Ownership Web concurrently rising alongside, and complementary to, the emerging Semantic Web.

For the Semantic Web to reach its full potential in the Cloud, it must have access to more than just publicly available data sources. It must find a gateway into the closely-held, confidential and classified information that people consider to be their identity, that participants to complex supply chains consider to be confidential, and that governments classify as secret. Only with the empowerment of technological ‘data ownership’ in the hands of people, businesses, and governments will the Semantic Cloud make contact with a horizon of new, ‘blue ocean’ data.

The Ownership Web would be separate from the Semantic Web, though semantically connected as a layer of distributed, enterprise-class web platforms residing in the Cloud.


The Ownership Web would contain diverse registries of uniquely identified data elements for the direct authoring, and further registration, of uniquely identified data objects. Using these platforms people, businesses and governments would directly host the authoring, publication, sharing, control and tracking of the movement of their data objects.

The technological construct best suited for the dynamic of networked efficiency, scalability, granularity and trustworthy ownership is the data object in the form of an immutable, granularly identified, ‘informational’ object.

A marketing construct well suited to relying upon the trustworthiness of immutable, informational objects would be the 'data bank'.

Data Banking

Traditional monetary banks meet the expectations of real people and real businesses in the real world.

As blogged in Part I ... 

People are comfortable and familiar with monetary banks. That’s a good thing because without people willingly depositing their money into banks, there would be no banking system as we know it. By comparison, we live in a world that is at once awash in on-demand information courtesy of the Internet, and at the same time the Internet is strangely impotent when it comes to information ownership.

In many respects the Internet is like the Wild West because there is no information web similar to our monetary banking system. No similar integrated system exists for precisely and efficiently delivering our medical records to a new physician, or for providing access to a health history of the specific animal slaughtered for that purchased steak. Nothing out there compares with how the banking system facilitates gasoline purchases.

If an analogy to the Wild West is apropos, then it is interesting to reflect upon the history of a bank like Wells Fargo, formed in 1852 in response to the California gold rush. Wells Fargo wasn’t just a monetary bank, it was also an express delivery company of its time for transporting gold, mail and valuables across the Wild West. While we are now accustomed to next morning, overnight delivery between the coasts, Wells Fargo captured the imagination of the nation by connecting San Francisco and the East coast with its Pony Express. As further described in Banking on Granular Information Ownership, today’s Web needs data banks that do for the on-going gold rush on information what Wells Fargo did for the Forty-niners.

Banks meet the expectations of their customers by providing them with security, yes, but also credibility, compensation, control, convenience, integration and verification. It is the dynamic, transactional combination of these that instills in customers the confidence that they continue to own their money even while it is in the hands of a third-party bank.

A data bank must do no less.

Ownership Web: What's Philosophically Needed

Where exactly is the sweet spot of data ownership?

In truth, it will probably vary depending upon what kind of data bank we are talking about. Data ownership will be one thing for personal health records, another for product supply chains, and yet another for government classified information. And that's just for starters because there will no doubt be niches within niches, each with their own interpretation of data ownership. But the philosophical essence of the Ownership Web that will cut across all of these data banks will be this:

  • That information must be treated as a tangible, commercial product, as banked, traceable money, or as both.

The trustworthiness of information is crucial. Users will not be drawn to data banks if the information they author, store, publish and access can be modified. That means that even the authors themselves must be proscribed from modifying their information once registered with the data bank. Their information must take on the immutable characteristic of tangible, traceable property. While the Semantic Web is about the statistical reliability of data, the Ownership Web is about the reliability of data, period.

Ownership Web: What's Technologically Needed

What is technologically required is a flexible, integrated architectural framework for informational object authoring and distribution, one that easily adjusts to the definition of data ownership as it is variously defined by the data banks serving each social network, information supply chain, and product supply chain. Users will interface with one or more ‘data banks’ employing this architectural framework. But the lowest common denominator will be the trusted, immutable informational objects that are authored and, where the definition of data ownership permits, controllable and traceable by each data owner one step, two steps, three steps, etc. after the initial share.

Click on the thumbnail to the left for the key architectural features for such a data bank. They include a common registry of standardized data elements, a registry of immutable informational objects, a tracking/billing database and, of course, a membership database. This is the architecture for what may be called a Common Point Authoring™ system. Again, where the definition of data ownership permits, users will host their own 'accounts' within a data bank, and serve as their own 'network administrators'. What is made possible by this architectural design is a distributed Cloud of systems (i.e., data banks). The overall implementation would be based upon a massive number of user interfaces (via APIs, web browsers, etc.) interacting via the Internet with a large number of data banks overseeing their respective enterprise-class, object-oriented database systems.
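As a loose illustration of how those four stores might fit together, here is a toy Python sketch. It is not the patented Common Point Authoring™ design; the `DataBank` class, its method names, the `holders` bookkeeping, and the cattle data are all assumptions invented for this example, and billing is left out entirely.

```python
import uuid


class DataBank:
    """Toy data bank holding the four stores named in the text."""

    def __init__(self):
        self.members = set()   # membership database
        self.elements = {}     # registry of data elements: id -> value
        self.objects = {}      # registry of objects: id -> (owner, element ids)
        self.holders = {}      # object id -> members who have received it
        self.tracking = []     # tracking log: (object id, sender, recipient)

    def enroll(self, member):
        self.members.add(member)

    def register_element(self, value):
        element_id = str(uuid.uuid4())
        self.elements[element_id] = value
        return element_id

    def register_object(self, owner, element_ids):
        if owner not in self.members:
            raise PermissionError("only members may register objects")
        object_id = str(uuid.uuid4())
        # A tuple keeps the registered list of pointers fixed.
        self.objects[object_id] = (owner, tuple(element_ids))
        self.holders[object_id] = {owner}
        return object_id

    def share(self, object_id, sender, recipient):
        if sender not in self.holders[object_id]:
            raise PermissionError("sender does not hold this object")
        if recipient not in self.members:
            raise PermissionError("recipient is not a member")
        self.holders[object_id].add(recipient)
        self.tracking.append((object_id, sender, recipient))

    def trail(self, object_id):
        """Every hop the object has taken, in the order it happened."""
        return [(s, r) for (oid, s, r) in self.tracking if oid == object_id]


bank = DataBank()
for name in ("rancher", "feedlot", "packer"):
    bank.enroll(name)

tag = bank.register_element("ear tag 0042")        # made-up sample data
born = bank.register_element("born 2008-03-01")    # made-up sample data
calf = bank.register_object("rancher", [tag, born])

bank.share(calf, "rancher", "feedlot")   # one step after authoring
bank.share(calf, "feedlot", "packer")    # two steps after authoring
print(bank.trail(calf))
```

Because every share is logged against the object's identifier, the original author can still see where the object went after it left the first recipient's hands.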

Click on the thumbnail to the right for an example of an informational object and its contents as authored, registered, distributed and maintained with data bank services. Each informational object comprises a unique identifier that designates the object, as well as one or more data elements (including personal identification), each of which is itself identified by a corresponding unique identifier. The informational object will also contain other data, such as ontological formatting data, permissions data, and metadata. The actual data elements associated with a registered (and therefore immutable) informational object would typically be stored in the Registered Data Element Database (look back at 124 in the preceding thumbnail). That is, the informational object and its actual data elements are linked via pointers, which comprise the data element unique identifiers or URIs. Granular portability is built in. For more information see the blogged entry US Patent 6,671,696: Informational object authoring and distribution system (Pardalis Inc.).
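The following Python sketch shows one way to picture an immutable object that carries pointers and permissions rather than the data itself. It is a simplified assumption for illustration, not the structure claimed in the patent; the `InformationalObject` class, the `view` function, the `urn:example:` identifiers, and the health data are all invented.

```python
from dataclasses import dataclass, FrozenInstanceError

# Stand-in for the Registered Data Element Database: id -> value.
REGISTERED_ELEMENTS = {
    "urn:example:element/1": "blood type O+",
    "urn:example:element/2": "allergy: penicillin",
    "urn:example:element/3": "policy number 12345",
}


@dataclass(frozen=True)
class InformationalObject:
    object_id: str
    element_ids: tuple       # pointers to data elements, not the data itself
    permissions: frozenset   # (viewer, element id) pairs
    metadata: tuple = ()     # (key, value) pairs


def view(obj, viewer):
    """Resolve only the pointers this viewer is permitted to follow."""
    return {eid: REGISTERED_ELEMENTS[eid]
            for eid in obj.element_ids
            if (viewer, eid) in obj.permissions}


all_ids = tuple(REGISTERED_ELEMENTS)
record = InformationalObject(
    object_id="urn:example:object/42",
    element_ids=all_ids,
    permissions=frozenset(
        [("physician", eid) for eid in all_ids]
        + [("insurer", "urn:example:element/3")]
    ),
    metadata=(("author", "patient"),),
)

print(view(record, "insurer"))
try:
    record.element_ids = ()
except FrozenInstanceError:
    print("registered objects cannot be modified")
```

In this sketch the physician can resolve every pointer while the insurer resolves only one, which is the kind of granular sharing the text has in mind. Whether permissions belong inside the immutable object or alongside it is a design question the sketch does not try to settle.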

Ownership Web: Where Will It Begin?

Aristotle
Metaweb Technologies is a pre-revenue, San Francisco start-up developing and patenting technology for a semantic ‘Knowledge Web’ marketed as Freebase™. Philosophically, Freebase is a substitute for a great tutor, like Aristotle was for Alexander. Using Freebase, users do not modify existing web documents but instead annotate them. The annotations of Amazon.com are the closest example but Freebase further links the annotations so that the documents are more understandable and more findable. Annotations are also modifiable by their authors as better information becomes available to them. Metaweb characterizes its service as an open, collaboratively-edited database (like Wikipedia, the free encyclopedia) of cross-linked data but it is really very much a next generation competitor to Google.

Not that Hillis hasn't thought about data ownership. He has. You can see it in an interview conducted by his patent attorney and filed on December 21, 2001 in the provisional USPTO Patent Application 60/343,273:

Danny Hillis: "Here's another idea that's super simple. I've never seen it done. Maybe it's too simple. Let's go back to the terrorist version [of Knowledge Web]. There's a particular problem in the terrorist version that the information is, of course, highly classified .... Different people have some different needs to know about it and so on. What would be nice is if you ... asked for a piece of information. That you [want access to an] annotation that you know exists .... Let's say I've got a summary [of the annotation] that said,  'Osama bin Laden is traveling to Italy.' I'd like to know how do you know that. That's classified. Maybe I really have legitimate reasons for that. So what I'd like to do, is if I follow a link that I know exists to a classified thing, I'd like the same system that does that to automatically help me with the process of getting the clearance to access that material." [emphasis added]

What Hillis was tapping into just a few months after 9/11 is just as relevant to today's information sharing needs.

In the War on Terror the world is still wrestling with classified information exchange between governments, between agencies within governments, and even between the individuals making up the agencies themselves. Fear factors revolving around data ownership – not legal ownership, but technological ownership – create significant frictions to information sharing throughout these Byzantine information supply chains.

Something similar is happening within the global healthcare system. It's a complex supply chain in which the essential product is the health of the patients themselves. People want to share their entire personal health records with a personal physician but only share granular parts of it with an impersonal insurance company. ‘Fear factors’ are keeping people from becoming comfortable with posting their personal health information into online accounts despite the advent of Microsoft HealthVault and Google Health.

And then, in this era of both de facto and de jure deregulation, there are the international product supply chains providing dangerous toys and potential ‘mad cow’ meat products to unsuspecting consumers. Unscrupulous supply chain participants will always hide in the ‘fog’ of their supply chains. The manufacturers of safe products want to differentiate themselves from the manufacturers of unsafe products. But, again, fear factors keep the good manufacturers from posting information online that may put them at a competitive disadvantage to downstream competitors.

I'm painting a large picture here but what Hillis is talking about is not limited to the bureaucratic ownership of data; it extends to matching up his Knowledge Web with another system - like the Ownership Web - for automatically working out the data ownership issues.

But bouncing around ideas about how we need data ownership is not the same as developing methods or designs to solve it. What Hillis non-provisionally filed, subsequent to his provisional application, was the Knowledge Web (aka Freebase) application. Because of its emphasis upon the statistical reliability of annotations, Knowledge Web's IP is tailor-made for the Semantic Web. See the blogged entry US Patent App 20050086188: Knowledge Web (Hillis, Daniel W. et al). And because the conventional wisdom within Silicon Valley is that the Semantic Web is about to emerge, Metaweb is being funded like it is “the next big thing”. Metaweb’s Series B raised $42.4M more in January, 2008. What Hillis well recognizes is that as Freebase strives to become the premier knowledge source for the Web, it will need access to new, blue oceans of data residing within the Ownership Web.

Radar Networks may be the “next, next big thing”. It is also a pre-revenue San Francisco start-up, and its bankable founder, Nova Spivack, has gone out of his way to state that his product Twine™ is more like a semantic Facebook while Metaweb’s Freebase is more like a semantic Wikipedia. Twine employs W3C standards in a community-driven, bottom-up process, from which mappings are created to infer a higher resolution (see thumbnail to the right) of semantic equivalences or connections among and between the data inputted by social networkers. Again, this data is modifiable by the authors as better information becomes available to them. Twine holds four pending U.S. patent applications though none of these applications. See the blogged entry US Patent App 20040158455: Methods and systems for managing entities in a computing device using semantic objects (Radar Networks). Twine’s Series B raised $15M-$20M in February, 2008 following on the heels of Metaweb's latest round. Twine’s approach in its systems and its IP is to emphasize perhaps a higher resolution Web than that of Metaweb. Twine and the Ownership Web should be especially complementary to each other in regard to object granularity. You can see this, back above, in the resemblance between the thumbnail image of Spivack's Data Record ID object and the thumbnail image of Pardalis' Informational Object. Nonetheless, the IP supportive of Twine, like that of Hillis' Knowledge Web, places a strong emphasis upon the statistical reliability of information. Twine's IP is tailor-made for the Semantic Web.

Dossia is a private consortium pursuing the development of a national, personally controlled health record (PCHR) system. Dossia is also governed by very large organizations like AT&T, BP America, Cardinal Health, Intel, Pitney Bowes and Wal-Mart. In September, 2007, Dossia outsourced development to the IndivoHealth™ PCHR system. IndivoHealth, funded from public and private health grants, shares Pardalis' philosophy that "consumers are managing bank accounts, investments, and purchases online, and … they will expect this level of control to be extended to online medical portfolios." IndivoHealth empowers patients with direct access to their centralized electronic medical records via the Web.

But given the current industry needs for a generic storage model, the IndivoHealth medical records, though wrapped in an XML structure (see the next paragraph), are essentially still just paper documents in electronic format. IndivoHealth falls far short of empowering patients with the kind of control that people intuitively recognize as ‘ownership’. See US Patent Application 20040199765 entitled System and method for providing personal control of access to confidential records over a public network in which access privileges include "reading, creating, modifying, annotating, and deleting." And it reasonably follows that this is one reason why personal health record initiatives like those of not just Dossia, but also Microsoft’s HealthVault™ and Google Health™, are not tipping the balance. For Microsoft and Google another reason is that they so far have not been able to think themselves out of the silos of the current privacy paradigm. The Ownership Web is highly disruptive of the prevailing privacy paradigm because it empowers individuals with direct control over their radically standardized, immutable data.

World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. W3C is headed by Sir Tim Berners-Lee, creator of the first web browser and the primary originator of the Web specifications for URL, HTTP and HTML. These are the principal technologies that form the basis of the World Wide Web. W3C has further developed standards for XML, SPARQL, RDF, URI, OWL, and GRDDL with the intention of facilitating the Semantic Web. While Berners-Lee has described in his own words (above) his perplexity about data ownership, nonetheless, the data object standards created by the W3C should be more than friendly to an Ownership Web employing object-oriented architecture. Surely, in Common Point Authoring™ will be found many of the ‘best of breed’ standards for an Ownership Web that is most complementary to the emerging Semantic Web.

EPCglobal is a private, standards setting consortium governed by very large organizations like Cisco Systems, Wal-Mart, Hewlett-Packard, DHL, Dow Chemical Company, Lockheed Martin, Novartis Pharma AG, Johnson & Johnson, Sony Corporation and Procter & Gamble. EPCglobal is architecting essential, core services (see EPCglobal's Architectural Framework in the thumbnail to the right) for tracking physical products identified by unique electronic product codes (including RFID tags) across and within enterprise-scale, relational database systems controlled by large organizations.

Though it would be a natural extension to do so, EPCglobal has yet to envision providing its large organizations (and small businesses, individual supply chain participants and even consumers) with the ability to independently author, track, control and discover granularly identified informational products. See the blogged entry EPCglobal & Prescription Drug Tracking. It is not difficult to imagine that the Semantic Web, without a complementary Ownership Web, would frankly be abhorrent to EPCglobal and its member organizations. For the Semantic Web to have any reasonable chance of connecting itself into global product and service supply chains, it must work through the Ownership Web.

Ownership Web: Where It Will Begin

The Ownership Web will begin along complex product and service supply chains where information must be trustworthy, period. Statistical reliability is not enough. And, in fact, the Ownership Web is beginning to form along the most dysfunctional of information supply chains. But that's for discussion in later blogs, as the planks of the Ownership Web are nailed into place, one by one.


[This concludes Part II of a three part series. On to Part III.]