Update – 06-Aug-2010
This post was inaccurate, in that it failed to acknowledge that Microsoft.Data supports parameters in its text querying. I have made changes to address this inaccuracy.
What is it?
It’s a very lightweight data access framework that I wouldn’t dream of calling an ORM. It uses the dynamic features of Microsoft .NET 4 to provide very expressive ways of retrieving data. It provides non-SQL ways of doing things, inspired by the Ruby ActiveRecord library. It also provides SQL ways of doing things, but with features to protect against SQL injection.
Why?
Because Microsoft have identified a gap in the data access toolkit ecosystem which they think scares casual web developers who are used to environments like PHP. These developers neither want nor need a full-blown ORM like NHibernate or Entity Framework, and they don’t want to deal with the complexity of ADO.NET connections, commands and readers.
Fair enough.
Microsoft have recently released a preview of a new library, Microsoft.Data.dll, which aims to serve these developers. It provides various simple ways of opening up a database, and then lets you run SQL against it, which is how casual developers are used to working with MySQL in the PHP world. They build up their SQL strings and then they execute them.
The problem is, that’s wrong. And I believe that attempting to attract people to your stack by giving them a really easy way to carry on doing things wrong is also wrong. The right thing to do is give them a way to do things right that is as easy as, if not easier than, what they have been using.
Simple.Data is my attempt at that.
Example?
Using Microsoft.Data, you would should query a Users table like this:
var db = Database.Open(); // Connection specified in config. string sql = "select * from users where name = @0 and password = @1"; var user = db.Query(sql, name, password).FirstOrDefault();
But you could query it like this:
var db = Database.Open(); // Connection specified in config. string sql = "select * from users where name = '" + name + "' and password = '" password + "'"; var user = db.Query(sql).FirstOrDefault();
A lot of people are quite cross about this. One of the main problems they have is that building query strings in this way opens your application up to SQL injection attacks. This is a common pattern in PHP applications written by new developers.
To achieve the same task with Simple.Data, you can do this:
var db = Database.Open(); // Connection specified in config. var user = db.Users.FindByNameAndPassword(name, password);
That’s pretty neat, right? So, did we have to generate the Database class and a bunch of table classes to make this work?
No.
In this example, the type returned by Database.Open() is dynamic. It doesn’t have a Users property, but when that property is referenced on it, it returns a new instance of a DynamicTable type, again as dynamic. That instance doesn’t actually have a method called FindByNameAndPassword, but when it’s called, it sees “FindBy” at the start of the method, so it pulls apart the rest of the method name, combines it with the arguments, and builds an ADO.NET command which safely encapsulates the name and password values inside parameters. The FindBy* methods will only return one record; there are FindAllBy* methods which return result sets. This approach is used by the Ruby/Rails ActiveRecord library; Ruby’s metaprogramming nature encourages stuff like this.
[A section which wrongly implied that Microsoft.Data did not support parameterized queries has been removed here.]
More examples…
Inserting using named parameters:
db.Users.Insert(Name: "Steve", Password: "Secret", Age: 21);
Executing an update:
db.Execute("update Stock set Current = Current - 1 where ProductId = ?", productId);
(Planned) inserting/updating using an object:
db.Stock.Insert(newStockItem); db.Stock.UpdateByProductId(stockItem);
Roadmap
This project is more of a constructive criticism of Microsoft’s preview than anything else. I’m going to develop it a bit further (for example, the “planned” syntax support above), mainly as an exercise. One thing I’d maybe like to explore further is whether it can be used as a layer over NoSQL stores as well as RDBMS.
The project is hosted at http://github.com/markrendle/Simple.Data for anybody who wants to download and play with it.
I’d really appreciate feedback, so please do use the comments if you’ve got any criticisms, suggestions or encouragement to express.
Y'know what - yeah. This is how Microsoft.Data *should* work, assuming it's needed (to attract the PHP crowd it probably is). Nice work.
ReplyDeleteLooks interesting, but I think the biggest issue with the Microsoft.Data stuff was the sample code advocated string concatenation over parameters (as has sample Microsoft code for a long long time). Microsoft.Data can use parameters in a similar manner to you do in your example (using @0, @1 rather than just ?), and on the flip side, a "novice" could still use string concatenation with your db.Execute statement and shun parameters all together. Unfortunately there's no saving some people from themselves, but we *should* make sure samples are free from crap like string concatenation for SQL statements.
ReplyDeleteNot sure if it already supports it but it would be nice to have the same syntax for update as you do in your insert example (with the first parameter(s) being the where clause), and for both insert and update to be able to handle named params, anonymous type abuse, real types and dictionaries to cover all eventualities :-)
Simple, clean explanation. Would be nice to see an example JOIN using object. Keep the info coming.
ReplyDeleteI have read your post, downloaded the code, and walked through some of the code for Database and DynamicTable.
ReplyDeleteI am now equal parts 'fascinated', 'in awe' (of both yourself and DynamicObject) and 'scared' (Of the mess that is potentially possible using DynamicObject)
Your idea and your code both rock. I have however decided that there are some Devs I know, who will never be able to handle the responsibility that comes with 'DynamicObject'.
And these same Devs are the ones I'll have to clean up after.
Let's try to keep this Dynamic thing between just you and me eh ? :D
@Rory
ReplyDeleteI'm not sure if you're being ironic here?
@Grumpydev: Maybe db.Stock.UpdateById(Id: 1, Current: 42, ...)?
ReplyDeleteI'm considering adding support for reflecting over actual classes, whether anonymous types or concrete, so that developers could gradually move toward more OO-style code.
Ideas: db.Users.FindByNameAndPassword(name, password);
And the implementation uses Dictionary internally, so exposing it as a public option wouldn't be too hard.
@Rory If it was a toss up between dynamic or sql strings, I know what I would pick...
ReplyDeleteBeer and a shotgun...
In all seriousness, I think this is a far better idea, it's a shame no one will listen...
In the MS ecosystem, no one can hear you scream
@Grumpydev:
ReplyDeleteSorry, that idea syntax should have been
db.Users.FindByNameAndPassword<User>(name, password);
My reaction to Microsoft.Data.dll included, "You know, based on the target audience, this thing should *mandate* parameters, not just *allow* them." FxCop/Code Analysis already does this for standard SqlCommands, though I suspect most people don't turn it on. Anything targeted at non-tech users should make this on by default.
ReplyDelete@Mark Rendle in your comment to @GrumpyDev you use the syntax:
ReplyDeletedb.Users.FindByNameAndPassword(name, password);
Why would the generic be needed? Isn't it automatically implied by the fact that you're calling the method from db.Users? Wouldn't the generic automatically be inferred?
Thus it would make more sense to have:
db.Users.FindByNameAndPassword(name, password)
Just my 2c.
I'm assuming that I could use:
Database["ThisApplicationDb"].Open()
To get the correct database connection and open it?
I have to agree with Rory's statement, however ironic it was (or wasn't). This is wonderful - in the right hands. I always get nervous when I start having to coding by convention rather than by explicit definition.
The .NET implementation of MVC gives me the same sense of amazement, awe and anxiety. I love it, but it scares me that I don't really understand what's going on any more :P
@BenAlabaster
ReplyDeleteThe default behaviour for the non-generic FindBy* method is to return a dynamic instance, which is in keeping with the entry-level approach and matches what Microsoft's offering does.
In theory you would use Database.OpenConnection or Database.OpenFile to get specific database connections.
@Craig
ReplyDeleteI considered trying to reject SQL with embedded values, but it would require fully parsing the statement to determine if they're doing that. The other option is to try and provide full functionality without any SQL at all, which would be pretty complicated and a potential barrier to adoption.
My hope would be that by making as much stuff as possible very easy to do without string-building, people won't be tempted to use that technique.
One of my own comments on Microsoft.Data called it "a magic wand without a safety catch". I think Simple.Data is a magic wand WITH a safety catch, but you can still turn it off if you really want to. :)
I took a quick browse through the code, and it seems like your dynamic tables and queries doesn't support databases with case sensitive table and field names, nor does it handle table and field names with spaces in them.
ReplyDeleteThis is definitely an interesting step in the right direction. I think you should throw it up on Codeplex, eh? I'd love to help.
ReplyDelete@Ants: Good points. The relationship between object names in the database and method and property names in the code is something I am going to explore.
ReplyDeleteI'm 100% behind the crusade against feeding the uneducated masses SQL Injection vulnerabilities, but even though your approach is innovative, this isn't quite it imho.
ReplyDeleteThe people Microsoft.Data is targeting can just about figure out writing the sql logic required - moving them away from sql by putting dynamics on top of the whole thing will probably make them use your Execute() all the time. And I can see tons of larger sql constructs that will be impossible or generate a train wreck when using methods.
SQL and parameters seems like the only viable option. The solution should be so close to SQL as possible - it's a very sensible design goal - yet prevent injection. If you took your parameter approach and added features like:
- Naming of parameters - instead of ? or {0} to make it easy for the noobs
- No values in the template sql string – never ever
- Reflection of data layer to determine types - on strings, handle string delimiters
- Validation the inputs match determined types
You could then do your queries like:
var result1 = db.Query("select * from products where UnitsInStock < {MaxUnits}",
MaxUnits: 20 );
var result2 = db.Query("select * from products where Title like {KeywordMatch}",
KeywordMatch: "%"+keyword+"%");
var result3 = db.Query("select * from users where password = {Password} and user = {User}",
Password: "secret",
User: "root");
The queries could be arbitrarily complex.
Examples of queries to reject could be:
var result1 = db.Query("select * from products where UnitsInStock < 20"); // no values allowed!
var result2 = db.Query("select * from products where Id like {Keyword}",
Keyword: keyword); // wrong type – could be lax and allow casts
var result3 = db.Query("select * from users where password = {Password} and user = {User}",
Password: "secret"); // User not defined
A little extra benefit would be that we would be freed from the ugly string concatenations dynamic sql always seems to end up with :)
Marcus Wendt
http://twitter.com/CompositeC1
Nice work. How much can I do with this library before I have to drop back to raw SQL? Where does it break down? Does it handle joins (etc)?
ReplyDeleteThis is neat, it's good to see people coming up with improvements and alternatives. This is how we all move forward ;-)
ReplyDeleteOne comment though, I think it's important to understand and realize the importance SQL have had on our industry for quite a while and the massive amount of SQL-domain knowledge and experience that Database Administrators and others have. By wrapping everything up in C#-syntax, it will please those of us who are experienced developers, but it reduces the usefulness of existing SQL-competency and experience.
How about combining this with the dynamic mapping of AutoMapper? That would be nice.
ReplyDelete