escaping variables design

Hi all,

I think this subject has already been handled in this forum, but I've got a few doubt about what's the better way to handle escaping of values when creating SQL commands.

As reported in this treads:

http://forum.java.sun.com/thread.jspa?forumID=48&threadID=586328

http://forum.java.sun.com/thread.jspa?forumID=48&threadID=543270

the use of PreparedStatement against Statement is better for string escaping, but of course you should use the setString method. Now, as I guess many of us, I wrote a database wrapper class that includes a method

public ResultSet query(String sqlString);

that provides me results for query in plain SQL strings, such as:

query("SELECT * FROM TABLE1");

Of course the problem arises when I've got to pass values in the query, where I should use PreparedStatement. I choose to implement the above code to let the other code do not get dirty with all the same call to create a statement, prepare it and then execute getting the result set. In my head it's cleaner to submit the sql string than to use all around the code the preparedstatement. Moreover I often embed into some objects (e.g., a tree node) the sql query that has generated it, so that the element is able to auto-update itself.

Now it seem I have to rewrite code handling preparedStatement, since my input can include not rightly escaped strings, and I've seen there are other developers that use custom methods to escape fields. So my doubts are: do you think it is better to use preparedStatement everywhere a parametric query happens (most in general everywhere, since code can change one day!) or to prepare a custom method to escape strings and then use the plain sql string? This method could be maybe more efficient for a one shot query, and allows me to store in the object the query that has generated it, while I dind't find a way to convert a preparedStatement (already executed) into its correspondant sql string.

Any clue or documentation to read regard this choice?

Thanks,

Luca

[2149 byte] By [cat4hirea] at [2007-11-26 14:18:50]
# 1

Either pass in a PreparedStatement, use a framework that does this for you, or (if absolutely necessary), allow for passing of a query string and parameters thus:

public ResultSet query(final String query, final Object[] params, final Class[] types) {

...

// prepare query

// Assign parameters by type

}

I don't recommend the above, but it's better than what you're currently doing.

dcmintera at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 2

> public ResultSet query(String

> sqlString);

> that provides me results for query in plain SQL

> strings, such as:

> query("SELECT * FROM TABLE1");

I don't know if I like a design like this much, unless this class is used only within a persistence package. By allowing users to pass in SQL, you're forcing them to know about SQL and relational databases. If I want an Order object, I should be able to ask for one without having to know all about the schema and relational details. That's what I want my persistence layer to hide from me. That's what you're not doing.

> Of course the problem arises when I've got to pass

> values in the query, where I should use

> PreparedStatement. I choose to implement the above

> code to let the other code do not get dirty with all

> the same call to create a statement, prepare it and

> then execute getting the result set. In my head it's

> cleaner to submit the sql string than to use all

> around the code the preparedstatement. Moreover I

> often embed into some objects (e.g., a tree node) the

Why a tree node? Something that specific ties you into using a tree. Where's the Order class? You're coupling every layer from front to back this way.

> Now it seem I have to rewrite code handling

> preparedStatement, since my input can include not

> rightly escaped strings, and I've seen there are

> other developers that use custom methods to escape

> fields. So my doubts are: do you think it is better

> to use preparedStatement everywhere a parametric

> query happens (most in general everywhere, since code

> can change one day!) or to prepare a custom method to

> escape strings and then use the plain sql string?

There's little object-orientation about any of this.

If you're trying to design a general purpose JDBC framework, you'd do well to look at a good one like that in Spring: http://www.springframework.org

> This method could be maybe more efficient for a one

> shot query, and allows me to store in the object the

> query that has generated it, while I dind't find a

> way to convert a preparedStatement (already executed)

> into its correspondant sql string.

There isn't such a thing, because you bind variables each time.

> Any clue or documentation to read regard this choice?

Have a look at Spring JDBC. It'll give you some ideas.

I'd recommend that you start thinking about how to layer your application so SQL and relational ideas are restricted to one package.

%

duffymoa at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 3

> query("SELECT * FROM TABLE1");

And my response would be - huh?

1. That expession is not and can not use escaping.

2. It is generally considered poor practice to use "*" in queries. Doing so means that it becomes hard to touch the database itself without the real possibility that other code is impacted.

3. Escaping is not just with SetString().

> ResultSet query(String sqlString);

That scares me. Returning result sets often means that it is escaping the scope of the statement/connection.

jschella at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 4

> 2. It is generally considered poor practice to use

> "*" in queries.

uh, no it's not. people do it all the time...when they want all records

> Doing so means that it becomes hard

> to touch the database itself without the real

> possibility that other code is impacted.

huh?! the only thing impacted is the database

SoulTech2012a at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 5

> uh, no it's not. people do it all the time...when

> they want all records

That doesn't make it a good idea.

> > Doing so means that it becomes hard

> > to touch the database itself without the real

> > possibility that other code is impacted.

>

> huh?! the only thing impacted is the database

Nope. Code that uses the wildcard query is likely to rely upon column ordering. Alterations to the table in the database will then break that code if it changes the column ordering.

D.

dcmintera at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 6

> > uh, no it's not. people do it all the time...when

> > they want all records

>

> That doesn't make it a good idea.

So you're saying if I have 72 columns in my table that I should do this:

select col1, col2, col3, col4, etc?! That's just silly

> > > Doing so means that it becomes hard

> > > to touch the database itself without the real

> > > possibility that other code is impacted.

> >

> > huh?! the only thing impacted is the database

>

> Nope. Code that uses the wildcard query is likely to

> rely upon column ordering. Alterations to the table

> in the database will then break that code if it

> changes the column ordering.

you're right! but not necessarily going to be an issue, since JDK 1.4 (i would hope) most people are accessing columns via name, not to mention that almost ANY time you change the db structure you're going to have to test and possibly rework some of the code

SoulTech2012a at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 7

> > 2. It is generally considered poor practice to use

> > "*" in queries.

>

> uh, no it's not. people do it all the time...when

> they want all records

DBA Rule #1:

Never, ever use 搒elect *?within your program. Any question, please see rule #1.

I'm not sure what the original question was about, but I can say without a doubt that using "select *?is not done all the time in PROGRAMS. In fact, it is almost universally disallowed through programming standards in all languages. I have never worked in a shop where this was allowed and if they did I would immediately work to change that practice. It's a bad if not horrible practice to use "select *" in a program.

Since the first simple example of where this breaks down didn't help (and it was an extremely good example), I don't know if a 2nd will help either but:

Let抯 say you have a table with 72 columns and as a DBA I must add and populate 10 new columns in that table each column holding 4000 bytes in preparation for new functionality that will be added in 3 months. It is common practice for a DBA to add or reorder columns within a table without worrying about the application. As a DBA I wouldn抰 give this practice a second thought.

Because of "select *" in your production programs, you will be (suddenly) implicitly pulling across and additional 40,000 bytes for each row you get. This will probably destroy performance due to the additional network load and could cause out of memory condition within your programs all within a production environment. It would also put added strain on the database server while providing you no additional functionality. You can avoid this costly mistake by simply NOT using "select *" in your code.

WorkForFooda at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 8
WFF, you're awesome. I love seeing your answers here. What terrific replies. Very educational. I like getting the DBA view.Sincerely, %
duffymoa at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 9

> > > uh, no it's not. people do it all the time...when

> > > they want all records

> >

> > That doesn't make it a good idea.

>

> So you're saying if I have 72 columns in my table

> that I should do this:

> select col1, col2, col3, col4, etc?! That's just

> silly

>

Say I have an ongoing production system.

This is what version 1 of the database looks like

Balance: customerId, Amt

This is what version 2 of the database looks like for NEW systems.

Balance: customerId, Category, Amt

This is what version 2 of the database looks like for EXISTING systems.

Balance: customerId, Amt, Category

The above is due to the way that the database applies alters versus the way that you create a database from scratch.

Using "*" guarantees that in the above two cases that the field order will not be the same.

> > > > Doing so means that it becomes hard

> > > > to touch the database itself without the real

> > > > possibility that other code is impacted.

> > >

> > > huh?! the only thing impacted is the database

> >

> > Nope. Code that uses the wildcard query is likely

> to

> > rely upon column ordering. Alterations to the

> table

> > in the database will then break that code if it

> > changes the column ordering.

>

> you're right! but not necessarily going to be an

> issue, since JDK 1.4 (i would hope) most people are

> accessing columns via name, not to mention that

> almost ANY time you change the db structure you're

> going to have to test and possibly rework some of the

> code

I don't use names.

1. They are probably slower.

2. There is no guarantee that a driver can extract named fields from a row if they are not in order. Sometimes fields don't even have names.

As for reworking I am not sure how that is relevant.

jschella at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 10

I guess the problem of using '*' can be solved using a view over the table. While the table can change its structure, due to DBA commands, the view (that will perform a column-name based select) will keep the columns in the right place and the right order.

I always do this on my database, allowing my clients to perform a select * on the view (never on the table).

Luca

cat4hirea at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 11

I guess I've misexplained how my code works: I've got a group of business objects, let's say an Order, a OrderRow, etc. I place into such objects the sql query, and that object into the TreeNode (so the tree node includes indirectly the sql query) and only the business object requires services such as the query or update using SQL.

Just another design question: suppose you've got three kinds of tables (let's say order, orderrow, people). Do you think it's better to build three business objects that can wrap each record/table or (if possible) a generic business object with a type within it that specifies if it refers to which table?

Thanks,

Luca

cat4hirea at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 12

> Just another design question: suppose you've got

> three kinds of tables (let's say order, orderrow,

> people). Do you think it's better to build three

> business objects that can wrap each record/table or

> (if possible) a generic business object with a type

> within it that specifies if it refers to which

> table?

I don't necessarily map the table schema and object design 1:1. I think objects can be finer grained than a given table structure.

In the case of the three tables you cited I would have Order, OrderItem, and Person objects.

%

duffymoa at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 13
there might be 1:1, 1:m relationships between those objects. (e.g., An Order might have many OrderItem objects and a Person), but there would still be three separate objects in the OO model. %
duffymoa at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...
# 14

> I guess the problem of using '*' can be solved using

> a view over the table. While the table can change its

> structure, due to DBA commands, the view (that will

> perform a column-name based select) will keep the

> columns in the right place and the right order.

> I always do this on my database, allowing my clients

> to perform a select * on the view (never on the

> table).

This does not solve problems related to select *, it only increases them. In SQL Server for example, if you were to

CREATE VIEW TBL1V AS SELECT * FROM TBL1

// and then later

ALTER TABLE TBL1 DROP COL1

ALTER TABLE TBL1 ADD COL2 INT

View TBL1V will now return the second column data under the DROPPED column name without errors or warnings. Very bad and very difficult to debug when it occurs. The issue is that it is not intuitively obvious that rebuilding of this view is required.

There may be times when select * is appropriate in a view, but it is a bad practice in general. Some database will alert you to these types of view inconsistencies but only if you know how to look. In addition views have limited and confusing join capabilities and performance can be compromised due to excessive temp space usage if not implemented with care.

Views have a place and when used correctly are wonderful. However, I wouldn't recommend this practice of enabling views 1:1 on every table with "Select *", and without going into detail, I wouldn抰 recommend enabling 1:1 views on every table for any reason.

WorkForFooda at 2007-7-8 2:09:39 > top of Java-index,Database Connectivity,Java Database Connectivity (JDBC)...