That would be a semantic matter
When preparing my slides for my talk at the Open Group meeting in Brussels last month, I noticed that I was speaking after Ivan Herman of the W3C, who would be updating us on Semantic Web progress. So, I added a section to my talk to discuss the relevance of the Semantic Web for Web Services security.
The conference was kicked off by Allan Brown (President of the Open Group) who made a great point: The field of IT has focussed much more on the "T" than on the "I". Security is a great example of where the "T" in "IT" has moved much faster than the "I". Work on the syntax of security technologies has overtaken work on the semantics of the data which we are securing. Let's put this into context with three examples:
1) Confidentiality and Privacy. It's easy to confuse these two concepts and say something like "we use encryption to guarantee the privacy of all the data on the wire", when in fact you really mean "confidentiality of the data". So what is the difference? The answer is that while confidentiality refers just to the business of keeping data safe from prying eyes, without any reference to what that data actually means, privacy is linked to the meaning of data. For example, under HIPAA, an individual can control who can view their medical records. If you are designing a HIPAA compliant system to support privacy, you must add safeguards to allow a customer to control who can access their personal data, even if that data is part of other documents which are being passed between healthcare providers. When a document is being stored or transmitted, the system must effectively say "I can see that this piece of data here is somebody's medical record, so that means I must perform the following set of security rules for privacy reasons".
When part of a document must be kept confidential for privacy reasons, and the document is an XML document, then XML Encryption is the obvious choice. It turns out that the syntactic aspects of this are well defined - XML Encryption has been a W3C recommendation for a few years now. WS-Security defines how XML Encryption relates to SOAP messages, allowing portions of them to be encrypted. WS-Policy allows a Web Service to communicate the fact that it encrypts response data. For example, the following WS-Policy block (extracted from a sample in the Microsoft WSE 2.0 toolkit) means that the Web Service is effectively saying "Dear Web Service client, I am going to use the public key in this particular X.509 certificate to encrypt the SOAP bodies of all the response messages which I send back to you".
<wsp:Policy wsu:Id="Encrypt-X.509">
  <wssp:Confidentiality wsp:Usage="wsp:Required">
    <wssp:KeyInfo>
      <wssp:SecurityToken wse:IdentityToken="true">
        <wssp:TokenType>http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-x509-token-profile-1.0#X509v3</wssp:TokenType>
        <wssp:TokenIssuer>CN=Root Agency</wssp:TokenIssuer>
        <wssp:Claims>
          <wssp:SubjectName MatchType="wssp:Exact">CN=WSE2QuickStartClient</wssp:SubjectName>
          <wssp:X509Extension OID="2.5.29.14" MatchType="wssp:Exact">gBfo0147lM6cKnTbbMSuMVvmFY4=</wssp:X509Extension>
        </wssp:Claims>
      </wssp:SecurityToken>
    </wssp:KeyInfo>
    <wssp:MessageParts Dialect="http://schemas.xmlsoap.org/2002/12/wsse#part">wsp:Body()</wssp:MessageParts>
  </wssp:Confidentiality>
</wsp:Policy>
So, we know how to encrypt some XML data (XML Encryption), how to package it into a SOAP message (WS-Security), and how a Web Service can tell its clients that this is its confidentiality policy (WS-Policy). That's the syntactic aspect taken care of. What about the semantic aspect? That would mean a policy which says something like "I am going to encrypt any medical records which I send to you in responses, and I expect you to encrypt all medical records which you send me". That would be a semantic rule - it concerns what the data means.
Work to solve this problem is underway in the Semantic Web world. A 2004 paper, "Authorization and Privacy for Semantic Web Services", proposes attaching semantic-aware security information to OWL-S input and output parameters. For example, the foaf:Person definition from the FOAF (Friend of a Friend) ontology could be used to specify that personal information is always transmitted as encrypted data, never appears as output of a Web Service, and is never forwarded on to a third party such as a direct marketer. The Web Service would then know what personal information looks like, and so could recognise it in order to apply the privacy rules.
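To make the idea of a semantic rule more concrete, here is a minimal Python sketch. It assumes a hypothetical message format in which every field carries a label saying what the data means. The names `Field`, `PRIVACY_RULES`, `apply_privacy_rules` and the `ex:` labels are invented for illustration and do not come from OWL-S, FOAF, or the paper, and the "encryption" is only a placeholder for XML Encryption.

```python
from dataclasses import dataclass

# Hypothetical semantic labels; a real system would take these from an ontology.
MEDICAL_RECORD = "ex:MedicalRecord"
PERSONAL_INFO = "foaf:Person"


@dataclass(frozen=True)
class Field:
    name: str
    semantic_type: str  # what the data means, not where it sits in the message
    value: str


# Rules are keyed by meaning, so they follow the data wherever it appears.
PRIVACY_RULES = {
    MEDICAL_RECORD: {"encrypt": True, "forward_to_third_party": False},
    PERSONAL_INFO: {"encrypt": True, "forward_to_third_party": False},
}


def placeholder_encrypt(value: str) -> str:
    # Stand-in for XML Encryption. This is NOT real cryptography.
    return "<EncryptedData/>"


def apply_privacy_rules(fields, recipient_is_third_party=False):
    """Return the fields that may be sent, with private ones encrypted."""
    outgoing = []
    for field in fields:
        rule = PRIVACY_RULES.get(field.semantic_type)
        if rule is None:
            outgoing.append(field)  # no privacy rule applies to this meaning
            continue
        if recipient_is_third_party and not rule["forward_to_third_party"]:
            continue  # never forwarded on to a third party
        if rule["encrypt"]:
            field = Field(field.name, field.semantic_type,
                          placeholder_encrypt(field.value))
        outgoing.append(field)
    return outgoing
```

The point of the sketch is that the rule table never mentions SOAP bodies or element positions. It only works if the system can recognise a medical record when it sees one, which is exactly the semantic problem.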
2) Authentication and Authorization. Again, like Confidentiality and Privacy, these concepts are often confused. One often hears statements like "we only let authenticated people access our Intranet" - which, if you think about it, doesn't make sense [what if I successfully authenticate, but I am the CEO of your competitor? You now know who I am, so am I still allowed to access your Intranet?]. Authentication refers to who you are, whereas authorization refers to what you can do. Remember that most of the 9/11 hijackers flew under their own passports and so were successfully authenticated, meaning that the failure was at the authorization level.
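The difference can be shown in a few lines of Python. This is only a sketch under stated assumptions: the credential store, the user names, and the functions `authenticate` and `authorize` are all hypothetical, and a real system would store salted password hashes rather than plaintext.

```python
import hmac

# Hypothetical credential store (plaintext only to keep the sketch short).
CREDENTIALS = {"alice": "correct-horse", "rival_ceo": "hunter2"}

# Hypothetical list of identities allowed onto the Intranet.
INTRANET_USERS = {"alice"}


def authenticate(username: str, password: str):
    """Who are you? Return the identity if the credentials match, else None."""
    expected = CREDENTIALS.get(username)
    if expected is None:
        return None
    if hmac.compare_digest(expected.encode(), password.encode()):
        return username
    return None


def authorize(identity, resource: str) -> bool:
    """What can you do? A separate decision, made after authentication."""
    return resource == "intranet" and identity in INTRANET_USERS
```

Here `authenticate("rival_ceo", "hunter2")` succeeds, because the competitor's CEO really is who they claim to be, while `authorize("rival_ceo", "intranet")` returns `False`.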
In the Web Service security world, SAML can be used to assert that the subject was authenticated at a certain time using a certain method in a certain security domain - in a SAML Authentication Statement. SAML is also used to convey authorization information, in an Authorization Decision Statement. Finally, SAML can be used to communicate attributes of a subject. This third aspect of SAML, the attribute assertion, is where we get into semantics. If the user has an attribute of "Manager", how can we be sure that the recipient of the SAML assertion understands what "Manager" means? As Frank Cohen's excellent "Debunking SAML myths and misunderstandings" article states, SAML does not predefine any attribute meanings:
"SAML does not define attribute meanings for any industry. Instead, SAML defines a namespace mechanism that industry consortia may use to define attributes for their particular industry. For example, in the aerospace industry the SAML attribute role:mechanic defines a mechanic at an airline. The parties at both ends of a system need to agree separately on the namespaces used by SAML. " http://www-106.ibm.com/developerworks/xml/library/x-samlmyth.html?Open&ca=daw-se-news
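A rough Python sketch of what "agreeing separately" looks like from the recipient's side follows. The namespace URI `urn:example:aerospace`, the table `AGREED_ATTRIBUTES`, and the function `interpret_attribute` are assumptions made up for this illustration; they are not part of SAML or of any SAML library.

```python
# Hypothetical out-of-band agreement between the two parties:
# (namespace, attribute name) -> the meaning both sides have signed up to.
AGREED_ATTRIBUTES = {
    ("urn:example:aerospace", "role:mechanic"): "mechanic at an airline",
}


def interpret_attribute(namespace: str, name: str) -> str:
    """Return the agreed meaning of an attribute, or refuse to guess."""
    try:
        return AGREED_ATTRIBUTES[(namespace, name)]
    except KeyError:
        raise ValueError(
            f"no agreed meaning for {name!r} in namespace {namespace!r}"
        ) from None
```

The same attribute name in a namespace the recipient has not agreed on is rejected rather than guessed at, which is the safe default when the assertion feeds an authorization decision.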
3) Licenses and Permissions. XrML can be used, at a syntactic level, to assign a license to digital content. This license is linked to the content, and indicates rights to the data - such as "Play" or "Loan". But what about the semantics - what does "Play" mean for a screenplay, or what does "Loan" mean for an email message, or for a book which has to be printed onto paper before you loan it? The implementation of rights management technology, including XrML, comes up against these semantic issues. It is one thing to say "this is how a license is formatted", but quite another thing to say "this is what loaning a book means".
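The gap shows up as soon as software has to act on a right. In the Python sketch below, the table `RIGHT_SEMANTICS`, its entries, and the function `interpret_right` are hypothetical and are not taken from any rights language specification; they only illustrate that the meaning of a right depends on the kind of content it is attached to.

```python
# Hypothetical table: the same right means different things for different content.
RIGHT_SEMANTICS = {
    ("Play", "audio"): "render the track through the audio device",
    ("Play", "video"): "render frames and sound on screen",
    ("Loan", "ebook"): "transfer access to another reader for a fixed period",
}


class UndefinedRight(Exception):
    """The license parses correctly, but nobody has said what the right means here."""


def interpret_right(right: str, content_type: str) -> str:
    try:
        return RIGHT_SEMANTICS[(right, content_type)]
    except KeyError:
        raise UndefinedRight(
            f"{right!r} has no defined meaning for {content_type!r}"
        ) from None
```

A license granting "Play" on a screenplay is perfectly well formed, yet `interpret_right("Play", "screenplay")` raises `UndefinedRight`, because the table has nothing to say about what playing a screenplay involves.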
The "technology" part of IT security might seem like the hard part, but in fact a lot of headway has been made on it. We know how to make data confidential, how to authenticate someone or something, and how to attach a rights management license to digital content. That's the syntactic part. It may seem complicated, but it's the "information" part of IT security which is more difficult. If we want some data to be private, we need a way to know what the data is. If we need to authorize someone based on their attributes, we need to know what each attribute means. And if digital content has a certain access right, then the software accessing the content must know how to interpret that right. In short, semantics are the hard part of security.