Py's Sty: 2 very good reasons why you should never pass entities to the client.

So you are building a web application with Entity Framework on the back end. You've got the code to load the entities now you need to get that data to the client. Then the question enters your head, "I've already loaded this data, why not simply return the entity?" Why bother creating a view model when it's just going to contain many of the fields you've already got here? Well, here's 2 very good reasons why you shouldn't, and if you have, you should seriously consider changing it.

1. Security

If you are passing an entity to the client, and critically, if you are accepting that entity back from the client, attaching it to a context, and saving it, you have opened a massive, gaping security hole in your application.

As an example lets say I have a system that a user logs into and displays a list of debts that user currently has, and allows them to do something innocent like updating a status or record a new note. The debt entity might look something like:

We have a model which contains a list of debts and a view that lets a user tick off a debt and mark it as accepted. We might only display the debt amount and maybe a couple extra details on the screen, however when you look at the debugger where the model might be exposed such as our action to accept the debt:

Since we passed the entity, and our serializer diligently serialized all properties and children we expose all of that data to the client. Hence I might have notes that aren't meant to be seen by the client, yet here they are.

The action handler looks something like:

Now this is a very basic example. In reality we could have a view that allows for a more meaningful edit against an entity. When we click the Accept button the record gets updated as accepted, all seems good. Except...

I don't want to accept a debt of $5000. I want to make that $1.

Using the debug tools, with a break-point before that call to UpdateDebt, I can edit the debt payload to change the Amount from "5000" to "1" then resume the execution.

Suddenly my data shows that the debt was accepted, but now has an amount of $1?!

Heck, I don't even want to pay $1. Let's make it someone else's problem. Using the same break-point we can see that it's assigned to a Customer with ID = 1. Let's change those Customer ID references to 2, maybe update that customer name to "Fred" and anything else that might identify me. Presto, that debt is now someone else's problem. What happened?

Let's have a look at the UpdateDebt method:

Part of the attraction I fear developers have with Entity Framework and reattaching entities is that it will save an extra read to the database. Just attach it and save. It's also only 3 lines of code! Win. There are libraries such as GraphDiff that even help dive through related entities to attach them and mark them as modified. Doing this violates the cardinal rule of client-server software development. "Trust nothing that the client sends you!"

Now you might argue that it doesn't make sense to have a client update a single field like that. That an "accept" action would simply take a debt ID and do that behind the scenes. True, but any update action in a system that takes an entity and attaches it is vulnerable to this tampering, and importantly, any entity associated with that entity via it's references.

Even using a view model is not immune to this form of tampering. If you use a tool like Automapper to map entities to view models, but then use Automapper to map view model's back to entities you can expose yourself to the same unintended changes, though to a lesser degree provided those view models limit the data visible.

In any case, *never* trust the data coming back from the client. Even if it's an action that just accepts a debt ID, verify that the debt ID actually belongs to the currently authenticated user. *always*. If it isn't, log the user out immediately, and log the attempt to notify administrators. You either have a bug that should be fixed, or someone actively trying to circumvent or compromise the system.

2. Performance

Now point #1 should be enough, but if you've chosen to pass entities anyways, and are validating everything coming back from the client, the second reason you should reconsider using entities is that sending entities incur a performance cost, and are easily susceptible to serious performance issues. One issue that I see commonly tripping up users around entity framework is lazy loading where they pass entities to the client. When MVC / Web API go to send entities back to the client, the serializer touches each property on the entity. Where that property is a lazy-loadable proxy method, that touch triggers off a lazy-load call. Each of those loaded entities then have each of their properties touched to be serialized, triggering more lazy loads. What is worse is that where you might have loaded 20 rows for search results, the lazy loads for each of those 20 entities and their children will be performed 1 by 1, not using the criteria you used to load the 20 parent entities as if would have been the case if you used .Include() to eager load them. The simple work-around is to eager load everything, but that means eager loading absolutely everything associated with that entity, whether you intend to display it or not.

Even in this case we are now faced with a situation where we are loading dozens of columns of data where we might only want to display a handful. That is extra work and memory for your database to provide those unnecessary columns, little to no optimization the database can do with indexes, extra data across the wire to your app server, then extra data across the wire to the client, and extra memory needed on both the app server and client. How much faster would it be responding if it just queried and returned that handful of columns? It's noticeable.

Addressing these issues

The simple answer to avoid these issues? Use view-models. A view model is a POCO (Plain 'ol C# Object) that contains just the data you want your view to see, and the IDs you will need to perform any action from that view. Automapper now provides support for the IQueryable interface and Linq integration via the .ProjectTo() method which can pass through the Select criteria that the database will need to pass back just what is needed for the view-model. The client doesn't see anything it shouldn't, and the data is optimal for performance by not sending anything that isn't needed.

When it comes to getting those view models back: Trust nothing. Every operation and update absolutely must be authenticated and have all applicable IDs resolved back to the user. Any deviation should be logged and generate an alert to administrators.

Conclusion

Hopefully this helps convince developers to reconsider ideas to pass entities between their controllers and views / API consumers. If you are working on a system that does pass entities around, at least at a minimum be sure to test your client with the debugger tools thoroughly to see exactly what data is being exposed, and ensure that every screen and action is properly guarding against JSON data tampering.

Py's Sty

Friday, December 14, 2018

2 very good reasons why you should never pass entities to the client.

1. Security

2. Performance

Addressing these issues

Conclusion

No comments:

Post a Comment