Ashok Malhotra's Blog: 2013

Monday, August 19, 2013

CAMP Goes to Public Review

Disclaimer: The opinions expressed in this post are, for better or for worse, my own and not intended to reflect the policies or positions of my employer, Oracle, or those of OASIS.

Introduction

The CAMP (Cloud Application Management for Platforms) specification has been released for Public Review. Comments are due by September 19, 2013 and should be sent according to the instructions in How to Submit Public Review Comments. Public comments submitted on this work and for other work of this TC are publicly archived and can be viewed at: Public Comments List.

CAMP is an emerging standard for Platform as a Service (PaaS). If you have not been following the developments in Cloud services you may be wondering what PaaS is, and how it differs from other kinds of Cloud services, so a brief introduction may be helpful.

A good place to start is the NIST Cloud Definition paper, which defines three cloud architectures: IaaS (Infrastructure as a Service), PaaS (Platform as a Service) and SaaS (Software as a Service). These have recently been followed by a plethora of *aaS acronyms: NaaS (Network as a Service), DBaaS (Database as a Service), STaaS (Storage as a Service) and even DTaaS (Desktop as a Service), but the original three categories are fundamental.

IaaS was the first to arrive. It allowed the user to rent computing, storage and networking resources in a remote data center. This started as an economic proposition: even factoring in the communication costs, it was less expensive per cycle to run in a very large data center than on your own server. But there were other benefits that were more significant. You did not have to staff and maintain a data center and keep the software updated. If you were a little startup you could start immediately, and renting provided flexibility: it was easy to ride out seasonal bumps and, if your business did well, it was easy to add resources.

Initiating and managing interdependent remote resources is, however, not straightforward. And how do you know when to scale up or scale down? Wouldn't it be simpler to just upload your application to a platform and, as the platform may advertise, leave the driving to it? This is what PaaS provides. You upload your application to a platform and the platform runs it for you. In fact, there is another, bigger benefit. The platform can offer services such as database and messaging and the application can be written to use the services. This can significantly simplify application construction and provide great value but, of course, if you use exotic platform services that will lock you in to the platform.

CAMP

To run an application on a platform you need to be able to encapsulate it and stage it on the platform. CAMP calls this a Platform Deployment Package (PDP) and it is a fairly general format that lets you package an application along with its components and some metadata. The components have dependencies on one another and, more important, have dependencies on platform services provided by platform components. Platform components exhibit capabilities, and during the deployment process requirements from the application components are matched with platform component capabilities.
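The matching of requirements to capabilities can be pictured with a small sketch. This is only an illustration and not the CAMP data model: the function `match_requirements`, the dictionary layout and the component and capability names are all assumptions made up for the example.

```python
# Hypothetical sketch: match application component requirements against
# platform component capabilities. None of these names come from the CAMP spec.

def match_requirements(app_components, platform_components):
    """Return (bindings, unmet): which platform component satisfies each
    requirement, and which requirements nothing on the platform satisfies."""
    bindings = {}
    unmet = []
    for component, requirements in app_components.items():
        for requirement in requirements:
            # Pick the first platform component that advertises the capability.
            provider = next(
                (name for name, capabilities in platform_components.items()
                 if requirement in capabilities),
                None,
            )
            if provider is None:
                unmet.append((component, requirement))
            else:
                bindings[(component, requirement)] = provider
    return bindings, unmet


# Made-up application and platform descriptions.
app_components = {
    "web-frontend": ["servlet-container"],
    "order-service": ["relational-database", "message-queue"],
}
platform_components = {
    "tomcat": {"servlet-container"},
    "mysql": {"relational-database"},
}

bindings, unmet = match_requirements(app_components, platform_components)
```

In this made-up case the platform offers no message queue, so that requirement ends up in `unmet`; a real platform would presumably refuse the deployment or ask for it to be customized.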

CAMP identifies resources by URIs and defines REST operations to upload a PDP onto a platform, register it and deploy it. During the deployment process the application can be customized. If successfully deployed, the application can be instantiated (run). The figure below, from an earlier version of the spec, depicts the application lifecycle.

An important aspect of Cloud Servicing is scaling. CAMP does not define specific scaling policies but, instead, provides a framework consisting of sensors that gather metrics which can be used with other data such as static or dynamic thresholds to trigger actions. Users can use this framework to provide their own scaling policies.
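As a rough illustration of how a user-supplied policy might sit on top of such a framework, here is a minimal sketch of a static-threshold rule. The class `ScalingPolicy`, its fields and the threshold values are assumptions for the example; CAMP itself does not define them.

```python
# Hypothetical sketch of a threshold-based scaling policy driven by a sensor
# reading. The class and field names are illustrative, not taken from CAMP.
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    scale_up_above: float     # e.g. average CPU utilisation, 0.0 to 1.0
    scale_down_below: float
    min_instances: int = 1
    max_instances: int = 10

    def decide(self, metric_value, current_instances):
        """Return the instance count the policy asks for."""
        if metric_value > self.scale_up_above:
            return min(current_instances + 1, self.max_instances)
        if metric_value < self.scale_down_below:
            return max(current_instances - 1, self.min_instances)
        return current_instances


policy = ScalingPolicy(scale_up_above=0.80, scale_down_below=0.20)
```

Here the sensor supplies `metric_value`, the thresholds are the static data, and the returned instance count stands in for the triggered action.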

Summary

CAMP is an important first step towards the standardization of PaaS. It is a simple specification that provides basic operations to upload an application to a platform and manage its lifecycle. It is designed to be extensible so users can customize it to their needs. Please review the spec and comment.

Friday, August 9, 2013

Linked Data Goes to Last Call

Disclaimer: The opinions expressed in this post are, for better or for worse, my own and not intended to reflect the policies or positions of my employer, Oracle, or those of the W3C.

The Linked Data spec that I blogged about earlier in this space has gone to Last Call. This is an important step because it indicates that the spec has reached a certain level of maturity.

Please review and send comments to: public-ldp-comments@w3.org (subscribe, archives). The comment period ends September 2, 2013.

Sunday, July 28, 2013

Do Not Track

Disclaimer: The opinions expressed in this post are, for better or for worse, my own and are not intended to reflect the policies or positions of my employer, Oracle, or those of the W3C.

If you have Do Not Track on your radar you must have seen a number of news items and blogs, most of them reporting that the Do Not Track initiative is deadlocked, at an impasse. See, for example, from Bloomberg: Web’s Mad Men Fight Browser Makers Over Online Tracking, which starts off by saying: "Yahoo!, AOL and other companies dependent on Internet ad revenue are fighting Web-browser makers, including Microsoft, over how to let consumers avoid being tracked online." See also the New York Times blog: Wrangling over 'Do Not Track' and review Don't Track Us and Dan Appelquist's blog.

If you have not been following, Do Not Track is a W3C working group that is attempting to standardize an HTTP header that indicates that the user does not want his visits to websites to be tracked and his personal data collected and shared with advertising networks. Other aspects of the proposed standard include a well-known location (URI) for providing a machine-readable tracking status resource that describes a service's DNT compliance and an HTTP response header field for resources to communicate their compliance or non-compliance with the user's expressed preference. The Working Group has been meeting for about two years and browser makers have enabled a Do Not Track option, some of them turning it on by default, but compliance from advertisers has yet to come.
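For readers who want to see what honoring the request header might look like on the server side, here is a minimal sketch. It assumes the header is named DNT and carries "1" (do not track) or "0" (tracking permitted), as in the working group's drafts; the function name and the returned labels are made up for illustration.

```python
# Minimal sketch: interpret the DNT request header on the server side.
# `headers` is assumed to be a plain dict of request header names to values;
# the function name and return labels are hypothetical.

def tracking_preference(headers):
    """Return 'no-tracking', 'tracking-allowed', or 'unset'."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    value = normalised.get("dnt")
    if value == "1":
        return "no-tracking"
    if value == "0":
        return "tracking-allowed"
    # Header absent or unrecognised: the user has expressed no preference.
    return "unset"
```

Reading the header is the easy part; what a site does once it sees "no-tracking" is exactly what the working group is arguing about.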

At the 2011 Web Tracking Workshop one of the arguments advanced for starting the Do Not Track WG was that if the industry did not agree on a standard it would be imposed on them by legislation.

In May 2011 the EU published an e-Privacy directive that requires websites to indicate on the page whether cookies are being used, where to go for more information and how to give or withhold consent. If you visit, for example, the Guardian website there is a banner right at the top that says cookies are being used and points you to a link that tells you more about how the Guardian uses cookies. There is no such legislation in sight for the U.S.

Another option is, of course, self-regulation or voluntary compliance. Have you seen the AdChoice icon?

This is brought to us by the Digital Advertising Alliance (DAA), a coalition of advertisers, publishers, and marketers that has been working to increase transparency on the Web and create controls for online advertising. This clickable icon floats near ads and is meant to give users information about targeted ads and the data collected by ads. It also gives users a Do Not Track option. Now, the AdChoice icon is coming to mobile browsers.

The DAA says that the AdChoice icon is used in 30 countries. I have not seen a lot of it on the websites I frequent, but that may be just me and where I walk.

On July 26, 2013 the New York Times reported agreement by a variety of groups, including app developers and consumer advocates, to test a voluntary code of conduct that would require participating app developers to offer notices about whether their apps collect certain personal data from users or share user-specific data with entities like advertising networks or consumer data resellers.

So, perhaps, we will end up with self-regulation; better than nothing but not really enough. Self-regulation may stave off legislation but it is unenforceable and it depends upon the cooperation and goodwill of advertisers :-( .

Saturday, July 20, 2013

Linking and the Law

Disclaimer: The opinions expressed in this post are, for better or for worse, our own and not intended to reflect the policies or positions of our employers or those of the W3C.

Ashok Malhotra (Oracle), Larry Masinter (Adobe) --some thoughts, based on "Publishing and Linking on the Web" co-authored with Jeni Tennison and Dan Appelquist for the W3C TAG.

Also published on Larry Masinter's blog.

If you type a Web address into your browser you will most likely be taken to a Web page consisting of text and images. This is less true now than it used to be. Today, you may be taken to a game where you can pretend to be a race car driver or throw stones at pigs but still, in most cases, you will get a Web page. From the information on the page you may be able to access related material by simply clicking. This capability is what makes the Web the Web.

If you are creating a Web page you can use material from other sources in different ways. You can provide a link to the material or you can embed it -- include or transclude (a wonderful word coined by the Hypertext visionary Ted Nelson) -- within your material. To include material, you copy it into your Web page. To transclude it, you provide a reference to the material and it is included as part of your Web page when it is rendered. Inclusion and transclusion, as opposed to linking, are quite different and are treated differently by courts.

Here is a page from Wikipedia that includes the picture of a whale from another web site:

The above page is from "http://en.wikipedia.org/wiki/Blue_whale" and if you click on the image in Wikipedia it tells you where the image came from and that it is in the public domain "because it contains materials that originally came from the U.S. National Oceanic and Atmospheric Administration, taken or made as part of an employee's official duties."

With embedding you see the embedded content on the page. Linking, on the other hand, requires a user action. The link, often rendered in another color, requires the user to click on it and when she does it takes her to another Web page. But there are advantages to inclusion vs. linking or transclusion. If you include material, that material is not going to change out from under you, whereas material at the end of a link or material you transclude may change. In the worst case, it could be replaced by malware or a virus.

Legal cases

In recent years there has been a rash of legal cases relating to linking and embedding. There was, for example, the case of Richard O'Dwyer -- a student who resides in the UK and was facing possible extradition to the US for posting links on a Web site, which itself is not US-based and is not primarily intended for US users, to material that the US considers to be copyrighted. (The situation also raises the question of jurisdiction, but more on that later.)

A broad general principle seems to be the notion of agency. If you link to something, you're less responsible for it being available than if you transclude it; if you transclude something, you're less responsible than if you include it (transclude a copy you made). Most of the questions are whether you're responsible for making information available that people don't want shared (bomb making, pornography, copyright infringement). If you do decide to embed, the material should be attributed and, unless it is a brief quote, requires permission; otherwise, you may be held responsible for copyright violation.

Linking, sometimes called hyperlinking, is generally allowed -- the argument has been made in several places that restricting linking is like interfering with free speech. Tim Berners-Lee argued in an early design document that a standard hyperlink is nothing more than a reference or footnote, and that the ability to refer to a document is a fundamental right of free speech. Others have argued similarly that a link is like telling you where you can find a particular book in a library or where you can go and watch a particular movie. When you go to the library you may, in fact, steal the book but that is not instigated by the link.

Jennifer Kyrnin, in The Legalities of Linking -- Web Links and the Law, says “There have been one or two cases in the United States that imply that the act of linking without permission is legally actionable, but these have been overturned every time they come up.”

But, still, you need to be careful.

The words accompanying a link can express an opinion. For example, the HTML code:

<p> <a href="http://www.joe's.bar/menu.html">Joe's Bar</a> has great food! </p>

which renders as:

Joe's Bar has great food!

links the name of the bar to its menu and adds an opinion about the food. Some opinions, however, may be construed as defamatory or libellous.

Consider: "Why pay for the new Daft Punk song when you can download it for free at http://copiedsongs.example.org/daftp/getlucky?"

In other words, not only can the text around a link result in libel, the use of a link does not in general make otherwise illegal text legal. And then again, the material you link to may be so inflammatory that even minor responsibility might be risky; it's best not to link to Nazi propaganda, child pornography or “How to Make a Bomb”. Web media has been very effective in political campaigns, but if you link to political material it may be judged to be seditious by some governments, and you may be held responsible.

Restricting Linking

Even though linking, in general, does not violate copyright, some sites may want to restrict linking to all or part of their content.

The Digital Reader article Irish Newspaper Collective Wants to Charge License Fees for Links ridicules the attempt to charge for merely giving directions on where to find information. But the request for payment is understandable. If you are a newspaper that invests in creating original content you would like to monetize your investment. The New York Times now allows a certain number of links per month. The Wall Street Journal requires you to subscribe. Other news media have similar policies. So, a link may tell you where to find a book but the library may charge a fee or be accessible only via membership.

Incidentally, the original links to The Digital Reader article ceased to work. While a link may not violate copyright, publishers have the right to restrict linking and may impose a number of conditions such as pay barriers or age verification that must be satisfied before a link is followed.

Restricting Deep Linking

Many web sites restrict deep linking, i.e. links to pages other than the top page, because this allows links to bypass advertising or the legal Terms and Conditions or because a deep link may leave the source of the material unclear. Often, legal Terms and Conditions are used to restrict deep linking, but such terms are difficult to enforce and there are simple technical mechanisms that are more effective. See ¹

Jurisdiction

The World Wide Web is truly an international phenomenon and as we have discussed, linking has been compared to freedom of speech. But there are limits to freedom of speech and, as we discuss above, some uses of external material may lead to legal action. If I live in the US and host a web site in a Scandinavian country that has links to offensive material, where could I be prosecuted? If I host a website in a country that does not have a bilateral copyright agreement with the US and the website includes swaths of US copyrighted material, can I be prosecuted? If so, where? In the case of certain kinds of international disputes, there are agreements that such disputes will be settled by mediation or arbitration. Perhaps, we need to formalize a similar capability for the Web.

Summary

Linking to material that did not originate with you is an essential feature of the Web and one that gives it much of its power. In general, linking to other material, as opposed to inclusion or transclusion, is safe and carries little risk but, as we explain above, you still need to be careful.

----------------------
¹ It is straightforward to prevent linking to pages by not giving them URLs or by making the URLs undiscoverable. This can also be accomplished by using the HTTP Referer header, which indicates the page from which the request was made. If it was not a page on your own site, then you can redirect to your site's home page, for example. It is also possible to do this check in JavaScript, which can then be used to bring up an interactive dialog window to check whether the contractual terms have been read, to confirm that the user is over 18, or to ask for a password.

You can also use a cookie to, for example, start a session only when a page is accessed through a given gateway page and reject or provide an alternative path for requests that don't have the cookie set.

The User-Agent HTTP header, which indicates the identity of the software making the request, is particularly useful in preventing access from web crawlers and search engines. A robots.txt file on the web site can be used to prevent deep linking by crawlers and search engines.

The domain name or IP address of the client making the connection can also be used to prevent specific users from accessing material.
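As a rough sketch of the Referer check described in this footnote, the function below sends requests that did not come from a page on the same site to the home page. The host name, the home page URL and the function name `resolve_request` are made up for the example, and `headers` is assumed to be a plain dict of request headers.

```python
# Hypothetical sketch of a server-side deep-link check using the Referer
# header. SITE_HOST and HOME_PAGE are made-up values for illustration.
from urllib.parse import urlparse

SITE_HOST = "www.example.org"
HOME_PAGE = "http://www.example.org/"


def resolve_request(path, headers):
    """Return the URL to serve: the requested page when the visitor arrived
    from a page on this site, otherwise the home page."""
    referer = headers.get("Referer", "")
    if path == "/" or urlparse(referer).hostname == SITE_HOST:
        return "http://" + SITE_HOST + path
    # Missing or off-site Referer: treat the request as a deep link.
    return HOME_PAGE
```

Keep in mind that the Referer header is optional and is set by the client, so a check like this discourages deep linking rather than preventing it outright.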

Friday, June 28, 2013

Linked Data

Disclaimer: The opinions expressed in this post are, for better or for worse, my own and not intended to reflect the policies or positions of my employer, Oracle, or those of the W3C.

If you are looking for the next big thing in middleware and integration, you should look at Linked Data.

In December of 2011, the W3C organized a well-attended workshop on Linked Enterprise Data Patterns. (There is a link to the position papers at the bottom of the page.) The thinking behind this workshop was that although RDF and the use of URIs to identify and link to items of data and REST were gaining popularity, they had not yet been applied to solve enterprise-level problems. Would extensions be required to scale up to the enterprise level? The title of the workshop implied that we needed to develop patterns to apply to particular situations.

Martin Nally of IBM Rational led off the workshop by discussing the need to integrate tools flexibly and efficiently. An example of what he needs in his business is to integrate data about software bugs: bug reports, provenance, screen shots, responsibilities and so on. He said that they had tried different technologies for integration (Database integration, Enterprise Service Bus) but found them rigid and hard to adapt. With RDF and URI integration, you don't need to update the schema to add another related piece of information or access another tool. Here is a picture from Martin's presentation that shows links between different kinds of data.

The same argument can be made for integrating applications. If you have an ordering system and want to add a FedEx tracking link to the customer's online receipt, all you have to do is to tweak the UI and add a button which links you to the FedEx system.

If you use RDF and Linked Data to integrate applications and tools, you need to be able to manipulate the data and add to it, not just read it. This changes the paradigm and enables new kinds of applications. The data needs to be in stable storage with read/write capability. This is a new face for the Web - it makes the Web writable and this means we need to think about the usual "database" capabilities such as access control and transactions.
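To make the "no schema update" point concrete, here is a toy sketch that models linked data as a set of (subject, predicate, object) triples in plain Python. It is not an RDF library, and every URI other than the Dublin Core title property is made up for the example.

```python
# Illustrative sketch: linked data as a set of (subject, predicate, object)
# triples. The bug, vocabulary and file URIs below are made up.
BUG = "http://bugs.example.org/bug/1234"

triples = {
    (BUG, "http://purl.org/dc/terms/title", "Crash on save"),
    (BUG, "http://example.org/vocab#reportedBy", "http://people.example.org/alice"),
}

# Adding a new kind of related information is just another triple;
# no table definition or schema has to change first.
triples.add((BUG, "http://example.org/vocab#screenshot",
             "http://files.example.org/shots/1234.png"))


def describe(subject, graph):
    """Collect every predicate/object pair recorded for a subject."""
    return {predicate: obj for s, predicate, obj in graph if s == subject}
```

Because the object of a triple can itself be a URI, following it leads to data held by another tool, which is the kind of integration discussed at the workshop.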

Based on the success of the LEDP workshop the W3C chartered a Working Group called the Linked Data Platform (LDP) WG. Two of the areas the WG is working on are how to deal with collections and the need for pagination in case the dataset to be presented is very large, questions that REST does not address. Here is the current working draft.

The WG is making good progress and should have a working draft of the specification ready for public review in a few weeks. I will keep you posted.