Various starter questions

Jul 24, 2008 at 10:01 AM
Edited Jul 24, 2008 at 10:04 AM
Hi, a few questions come to mind. I hope you don't mind me asking, so that I know where this project is going:

In what way is this a sample app? Is it a personal learning exercise or a community project?
Why the Apache licence specifically?
What is the roadmap? You mention WordPress-like features. This is not a bad thing; WordPress is excellent even if it is in PHP. But surely that's a long-term target and there must be other, more modest goals first?
Have you looked at the other MVC blog projects here on CodePlex?

Coordinator
Jul 28, 2008 at 2:24 PM

It's a sample app in the sense that it is an ASP.NET MVC/LINQ to SQL application for developers who are new to ASP.NET MVC and LINQ to SQL and need an example to learn from.

I'll get a roadmap written and put up in the next few days.

Let me talk about the WordPress-like features. To clarify, MVC Press is a multi-tenant blogging system. It's not a blog engine with a single-instance install like the Subtext blog system, etc. MVC Press will allow more than one blog within the system, using a shared database and shared schema. WordPress itself is not even multi-tenant in the true sense, since each blog has its own codebase and database, so I am merely taking concepts and features of WordPress and rolling them into a multi-tenant architecture. My short-term goal is to build a multi-tenant architecture in ASP.NET MVC/LINQ. Beyond this, I am going to implement a "tenant data encryption" pattern, a multi-tenant security pattern that further protects tenant data by encrypting it within the database, so that data will remain secure even if it falls into another tenant's hands.
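
To make "shared database and shared schema" concrete, here is a minimal sketch in Python with sqlite3. It is not the MVC Press schema (none is shown in this thread); the table names `blogs` and `posts`, the `blog_id` column and the helper `posts_for_blog` are all hypothetical. The only point is that every tenant's rows sit in the same tables and each query is scoped by the tenant key.

```python
import sqlite3

def make_db():
    """Create an in-memory database with one schema shared by all blogs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE blogs (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE posts (
            id      INTEGER PRIMARY KEY,
            blog_id INTEGER NOT NULL REFERENCES blogs(id),  -- tenant key
            title   TEXT NOT NULL
        );
    """)
    return conn

def posts_for_blog(conn, blog_id):
    """Return post titles for a single tenant; the WHERE clause does the scoping."""
    rows = conn.execute(
        "SELECT title FROM posts WHERE blog_id = ? ORDER BY id", (blog_id,)
    )
    return [title for (title,) in rows]

conn = make_db()
conn.execute("INSERT INTO blogs (id, name) VALUES (1, 'Tenant A'), (2, 'Tenant B')")
conn.execute(
    "INSERT INTO posts (blog_id, title) "
    "VALUES (1, 'A: hello'), (2, 'B: hello'), (1, 'A: again')"
)
print(posts_for_blog(conn, 1))
print(posts_for_blog(conn, 2))
```

In a design like this, isolation depends on every query carrying the tenant filter, which is why a forgotten `WHERE blog_id = ?` is the kind of coding error the extra safeguards are meant to catch.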

I have looked at other MVC blog projects and found the following:

Blog MVC.NET - this project existed when I started MVC Press, but it had no source code or releases.
MvcBlog - their January 2008 release includes only a database schema and no source code.

Other new MVC blog projects have popped up after the creation of MVC Press, but they are single instance blog projects.

Is there something about the Apache license that should be considered a negative? I'm pretty open, so I'm curious why you wouldn't use this license.

Coordinator
Jul 28, 2008 at 2:37 PM
Sorry, I left out a clarification: it's a community project.
Jul 28, 2008 at 3:41 PM
Hi

Thanks for the reply. I support the idea of a blog engine allowing multiple logins, multiple authors and multiple blogs (and not necessarily a 1-1 mapping between these). I hadn't thought of encryption as necessary for that.

Is the "sample" part a main goal, a statement on code quality aims, or just an observation that it would make a good place to start?

For a roadmap, I'm a fan of agile approaches, i.e. get a feature set that is simple but does a few things well and can be achieved in a few months, and then get to a release. Then add features and repeat.

For the Apache licence, I'm just curious as to why that particular licence. The various MS open source licences are perhaps more common here on CodePlex, and the GPL and MPL are better known elsewhere. All licences have their pros and cons. The Apache licence allows others to take the code into commercial products, like the BSD licence. If that's what you want to allow, then fine.

For the other MVC blog projects, there seem to be a lot of false starts and other beginnings but nothing that has really taken off.

There is also BlogEngine.NET (http://www.dotnetblogengine.net/), but I don't find their development platform of ASP.NET 2.0 with no third-party assemblies allowed very satisfying.

I haven't had time to look into http://code.google.com/p/blogcontroller/ yet. There seems to be a lot of code there, but I haven't assessed quality or goals. The Google Code pages don't show prominently when the last check-in took place.

Coordinator
Jul 28, 2008 at 5:19 PM

Sorry, it's Monday and I'm kinda slow... can you clarify what you mean by "Is the 'sample' part a main goal, a statement on code quality aims, or just an observation that it would make a good place to start?"

Regarding the need for the "tenant data encryption" pattern, I'll word this as best I can, as I want to be careful not to offend anyone. I'll give an example to help illustrate my point. WordPress as it stands right now was developed so that there is no blurring of data between tenants: it uses the typical single install for each blog (tenant), so there is no bleeding of data between blogs. MySpace is employing some pattern to assure that I don't see someone else's data when I shouldn't, which is what I am trying to achieve with the "tenant data encryption" pattern: an extra level of assurance that I don't see someone else's blog data when I shouldn't.

There are other multi-tenant patterns that can be used, which can be found here: http://msdn.microsoft.com/en-us/library/aa479086.aspx




Jul 28, 2008 at 8:08 PM
I just haven't quite understood what the word "sample" means there. A sample to me is simple code for learning purposes.
What's the more important goal - to have an application that is usable as a blog, or to teach people how to use MVC?
These goals may be aligned - readable code is good code, but they may not: the "blog" goal will drive towards features, and the "sample" goal will drive towards simplicity.

Coordinator
Jul 29, 2008 at 1:13 AM
Yes, a sample in the sense of being for learning purposes.

They're on the same level; they are of equal importance.
The goal in my mind is to be both a usable, functional multi-tenant blogging system and documented well enough to let people learn MVC.


Jul 29, 2008 at 8:23 AM
What data needs to be hidden? On most blogs, posts and comments are viewable by the public; the only restriction is on who can make or edit posts.
Unless you are thinking of a model like LiveJournal or Facebook where some content is shown only to "friends". This is not necessarily a bad thing.
Coordinator
Jul 29, 2008 at 1:47 PM
Did you have a chance to read the page above about "Multi-Tenant Data Architecture" at http://msdn.microsoft.com/en-us/library/aa479086.aspx?

I like that you are challenging this concept as it allows me to put this out in the open to be discussed and validated as a requirement. I'll provide some use cases in the documentation page that further validate the need for this requirement.
Coordinator
Jul 29, 2008 at 2:10 PM

I should note that I would prefer to have the "View filter pattern" and I may incorporate this sometime in a future release, but for now the "Tenant Data Encryption" pattern will have to suffice.

To save time I pulled the part of that article that pertains to "Tenant Data Encryption":

Tenant Data Encryption

A way to further protect tenant data is by encrypting it within the database, so that data will remain secure even if it falls into the wrong hands.

Cryptographic methods are categorized as either symmetric or asymmetric. In symmetric cryptography, a key is generated that is used to encrypt and decrypt data. Data encrypted with a symmetric key can be decrypted with the same key. In asymmetric cryptography (also called public-key cryptography), two keys are used, designated the public key and the private key. Data that is encrypted with a given public key can only be decrypted with the corresponding private key, and vice versa. Generally, public keys are distributed to any and all parties interested in communicating with the key holder, while private keys are held secure. For example, if Alice wishes to send an encrypted message to Bob, she obtains Bob's public key through some agreed-upon means, and uses it to encrypt the message. The resulting encrypted message, or cyphertext, can only be decrypted by someone in possession of Bob's private key (in practice, this should only be Bob). This way, Bob never has to share his private key with Alice. To send a message to Bob using symmetric encryption, Alice would have to send the symmetric key separately—which runs the risk that the key might be intercepted by a third party during transmission.

Public-key cryptography requires significantly more computing power than symmetric cryptography; a strong key pair can take hundreds or even thousands of times as long to encrypt and decrypt data as a symmetric key of similar quality. For SaaS applications in which every piece of stored data is encrypted, the resulting processing overhead can render public-key cryptography infeasible as an overall solution. A better approach is to use a key wrapping system that combines the advantages of both systems.

With this approach, three keys are created for each tenant as part of the provisioning process: a symmetric key and an asymmetric key pair consisting of a public key and a private key. The more-efficient symmetric key is used to encrypt the tenant's critical data for storage. To add another layer of security, a public/private key pair is used to encrypt and decrypt the symmetric key, to keep it secure from any potential interlopers.

When an end user logs on, the application uses impersonation to access the database using the tenant's security context, which grants the application process access to the tenant's private key. The application (still impersonating the tenant, of course) can then use the tenant's private key to decrypt the tenant's symmetric key and use it to read and write data.

This is another example of the defense-in-depth principle in action. Accidental or malicious exposure of tenant data to other tenants—a nightmare scenario for the security-conscious SaaS provider—is prevented on multiple levels. The first line of defense, at the database level, prevents end users from accessing the private data of other tenants. If a bug or a virus in the database server were to cause an incorrect row to be delivered to the tenant, the encrypted contents of the row would be useless without access to the tenant's private key.

The importance of encryption increases the closer a SaaS application is to the "shared" end of the isolated/shared continuum. Encryption is especially important in situations involving high-value data or privacy concerns, or when multiple tenants share the same set of database tables.

Because you can't index encrypted columns, selecting which columns of which tables to encrypt involves making a tradeoff between data security and performance. Think about the uses and sensitivity of the various kinds of data in your data model when making decisions about encryption.
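
For anyone who wants to see the key wrapping idea from the excerpt above in code, here is a rough sketch in Python. Big caveat: to keep it runnable with only the standard library, both ciphers are toy stand-ins that I made up for illustration (textbook RSA over fixed Mersenne primes, and a SHA-256 counter-mode XOR stream). They are NOT secure; a real system would use a vetted crypto library and would keep the private key in a store reachable only under the tenant's security context. All function and field names here are hypothetical.

```python
import hashlib
import secrets

E = 65537  # common RSA public exponent

def toy_keypair(p, q):
    """Toy RSA key pair from two given primes (assumption: real code would
    generate large random primes and use padding such as OAEP)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+
    return {"n": n, "d": d}

def _keystream_xor(key, nonce, data):
    """Toy symmetric cipher: XOR data with a SHA-256 counter-mode keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)

def encrypt(key, plaintext):
    nonce = secrets.token_bytes(16)  # fresh nonce per message
    return nonce + _keystream_xor(key, nonce, plaintext)

def decrypt(key, blob):
    return _keystream_xor(key, blob[:16], blob[16:])

def provision_tenant(p, q):
    """Create the three keys for a tenant; the symmetric key is stored only wrapped."""
    pair = toy_keypair(p, q)
    sym_key = secrets.token_bytes(16)
    wrapped = pow(int.from_bytes(sym_key, "big"), E, pair["n"])  # wrap with public key
    return {"n": pair["n"], "private_d": pair["d"], "wrapped_key": wrapped}

def unwrap_key(tenant):
    """Recover the symmetric key using the tenant's private key."""
    m = pow(tenant["wrapped_key"], tenant["private_d"], tenant["n"])
    return m.to_bytes(16, "big")

# Fixed Mersenne primes stand in for per-tenant random primes.
tenant_a = provision_tenant(2**127 - 1, 2**89 - 1)
tenant_b = provision_tenant(2**107 - 1, 2**61 - 1)

stored = encrypt(unwrap_key(tenant_a), b"draft post for blog A")
print(decrypt(unwrap_key(tenant_a), stored))
print(decrypt(unwrap_key(tenant_b), stored) == b"draft post for blog A")
```

The shape is what matters: bulk data is encrypted with the cheap symmetric key, and only that small key goes through the expensive asymmetric step. A row that leaks to tenant B is unreadable with B's key.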





Coordinator
Jul 29, 2008 at 2:35 PM
Some initial use cases:

1. The multi-tenant blog system is hosted at a large shared hosting company, giving any DBA (of said hosting company) full access to freely write a SELECT statement and export all your data into a CSV file for later reuse.
2. The multi-tenant blog system is hosted on a Virtual Private Server at a hosting company where Blog Tenant A receives a large amount of traffic, causing locks in the database, and due to an error in MS SQL Server the wrong database record is returned in a SELECT statement to Blog Tenants B, C and D on the same Virtual Private Server.
3. An error in coding causes the wrong data (from Blog Tenant A) to be served up to Blog Tenant B.
Jul 29, 2008 at 4:30 PM
Edited Jul 29, 2008 at 4:37 PM
Hm ok, I don't know of any other blog engines that do this or have needed it.

I'd say for the data layer what's needed next is an API for creating, updating and possibly removing blogs (records in the blog table).
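
Something like the following is what I have in mind, sketched in Python with sqlite3 just to show the shape (the real thing would be LINQ to SQL). The class name `BlogRepository`, its methods and the `blogs` columns are all hypothetical, not anything that exists in the project.

```python
import sqlite3

class BlogRepository:
    """Hypothetical data-layer API for records in a `blogs` table."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS blogs ("
            "id INTEGER PRIMARY KEY, "
            "name TEXT NOT NULL UNIQUE, "
            "title TEXT NOT NULL)"
        )

    def create(self, name, title):
        """Insert a new blog and return its generated id."""
        cur = self.conn.execute(
            "INSERT INTO blogs (name, title) VALUES (?, ?)", (name, title)
        )
        return cur.lastrowid

    def get(self, blog_id):
        """Return (id, name, title), or None if the blog does not exist."""
        return self.conn.execute(
            "SELECT id, name, title FROM blogs WHERE id = ?", (blog_id,)
        ).fetchone()

    def update_title(self, blog_id, title):
        """Change a blog's title; True if exactly one row was updated."""
        cur = self.conn.execute(
            "UPDATE blogs SET title = ? WHERE id = ?", (title, blog_id)
        )
        return cur.rowcount == 1

    def remove(self, blog_id):
        """Delete a blog; True if a row was actually removed."""
        cur = self.conn.execute("DELETE FROM blogs WHERE id = ?", (blog_id,))
        return cur.rowcount == 1

repo = BlogRepository(sqlite3.connect(":memory:"))
blog_id = repo.create("alice", "Alice's Blog")
repo.update_title(blog_id, "Alice Writes")
print(repo.get(blog_id))
repo.remove(blog_id)
print(repo.get(blog_id))
```

Returning a boolean from the update and remove calls lets the caller tell "nothing matched" apart from success without a second query.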

Jul 29, 2008 at 8:26 PM
Edited Jul 29, 2008 at 8:26 PM
Ok, let me be straightforward.
If the SQL server or data layer are returning the wrong data, they have errors that need fixing. Encryption is not the fix to these.
If the posts and comments are encrypted so that a user needs to be logged in to decrypt them, how are they going to be shown on web pages and in RSS feeds to the anonymous public?

Coordinator
Jul 29, 2008 at 8:37 PM

Well, I agree to a degree that there probably aren't any existing blog engines that do this, but I believe it is a needed feature in order to assure data isolation. The only other blog engine I am aware of that is multi-tenant with a shared database and shared schema is Community Server, and I would need to see how they do data isolation. Even if it doesn't have data encryption, that doesn't mean it's wrong to include data encryption. There are other multi-tenant blogging engines, I am sure, but I can't think of them offhand. I guess the larger question is whether multi-tenant applications in general require measures such as "Tenant Data Encryption", and I am going to say yes, as it's been done in practice.

Just because other blog engines don't do this doesn't mean it's not okay to go to that extra length in order to assure data isolation.

What I am going to do is put in the ability for the Admin to enable/disable the "Tenant Data Encryption" feature when first installing the system.

My intention is that someday this application may be used at the enterprise level, but it may fall flat on its face and become vaporware. If patterns such as "Tenant Data Encryption" or "View filter pattern" are used, then this application can serve as another example/sample of how to do this with ASP.NET MVC and LINQ.



Coordinator
Jul 29, 2008 at 8:57 PM

Do you agree that any of these use cases are possible in the real world?

1. The multi-tenant blog system is hosted at a large shared hosting company, giving any DBA (of said hosting company) full access to freely write a SELECT statement and export all your data into a CSV file for later reuse.
2. The multi-tenant blog system is hosted on a Virtual Private Server at a hosting company where Blog Tenant A receives a large amount of traffic, causing locks in the database, and due to an error in MS SQL Server the wrong database record is returned in a SELECT statement to Blog Tenants B, C and D on the same Virtual Private Server.
3. An error in coding causes the wrong data (from Blog Tenant A) to be served up to Blog Tenant B.